

Brief introduction

This solution does the following:

  • Detects faces in the image.

  • Tracks faces and labels them frame by frame with a unique identifier.

  • Detects subtle colour changes in the faces caused by blood flow in order to measure heart rate.

  • Reports the heart rate in beats per minute (BPM) and a heart rate diagram derived from the changes in face colour.

  • Limitation: the user must stay still during the measurement.


Business case

Monitoring heart rate (HR) and heart rate variability (HRV) on a daily basis is very valuable. HRV is an indicator of the balance between mental and physical condition, and a marker of health and stress. When the sympathetic nervous system is predominant, as when people feel stressed, HRV decreases. When the parasympathetic nervous system is predominant, as when people are relaxed, HRV increases.

HRV has been receiving growing attention as an indicator of cardiovascular disease and of internal states such as depression and work stress. Measuring heart rate or heart rate variability doesn't require complicated tools; in fact, all we need is a web camera. The face in the captured webcam image is detected and modeled in order to determine facial landmarks and head orientation. The region of interest is approximately the top two-thirds of the face, where most of the blood vessels are concentrated. By analyzing this area with AI and deep learning algorithms, including computer vision and signal processing, heart rate, heart rate variability or even mental stress can be determined.

In addition, continuous measurement is possible this way, without discomfort for the user. The measured data can then be used for analysis, for example to search for indicators of chronic diseases. What's more, this method can be very useful for people who experience anxiety at the doctor's office, which is known as white coat syndrome or the white coat effect.

This technology is also suitable for spoofing detection, because a photo, a display or a 3D face replica has no pulse. Moreover, if someone wears a mask to look like someone else, the mask also obscures the skin surface, and in these cases the measured pulse will be 0.
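The spoofing check described above can be sketched as a simple plausibility test on the measured pulse. This is a minimal illustration only: the function name `looks_live` and the 40 to 200 BPM range are assumptions made for this example, not values taken from the system.

```python
from typing import Optional


def looks_live(bpm: Optional[float],
               min_bpm: float = 40.0,
               max_bpm: float = 200.0) -> bool:
    """Return True if the measured pulse is in a plausible human range.

    The thresholds are illustrative assumptions. A photo, display or
    mask is expected to yield no pulse (None or 0), which fails the check.
    """
    return bpm is not None and min_bpm <= bpm <= max_bpm
```

A measured pulse of 0, as in the photo or mask cases, is rejected, while a typical resting value such as 72 BPM passes.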


The approach to remote detection is the following:

  • [Skin pixel selection] The face in the captured webcam image is detected and modeled in order to determine facial landmarks and head orientation. Subsequently, approximately the top two-thirds of the face, where most of the blood vessels are concentrated, is selected as the region of interest.

  • [Signal extraction] The average of each colour channel (red, green, blue) over the region is measured over time (covering both specular and diffuse reflections).

  • [Signal filtering] Noise caused by head motion is detected by fitting the facial model, and a noise-free heart rate signal is then produced.

  • [Output calculations] By detecting peaks, inter-beat intervals are measured, and heart rate and heart rate variability are then estimated.
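The steps above can be illustrated with a small, simplified sketch. It is not the system's implementation (which is written in C++): Python is used here for brevity, all function names are hypothetical, the motion-noise filtering step is omitted, and HRV is summarised with RMSSD (root mean square of successive inter-beat differences), which is one common HRV measure and an assumption on our part.

```python
import math
from statistics import mean


def upper_face_roi(box):
    """Top two-thirds of a face bounding box given as (x, y, width, height)."""
    x, y, w, h = box
    return (x, y, w, (2 * h) // 3)


def mean_rgb(frame, roi):
    """Average (R, G, B) over the ROI.

    `frame` is a list of rows, each row a list of (r, g, b) tuples.
    """
    x, y, w, h = roi
    pixels = [frame[row][col]
              for row in range(y, y + h)
              for col in range(x, x + w)]
    return tuple(mean([p[c] for p in pixels]) for c in range(3))


def detect_peaks(signal, min_distance):
    """Indices of local maxima above the signal mean,
    at least `min_distance` samples apart."""
    threshold = mean(signal)
    peaks = []
    for i in range(1, len(signal) - 1):
        is_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if is_max and signal[i] > threshold:
            if not peaks or i - peaks[-1] >= min_distance:
                peaks.append(i)
    return peaks


def heart_metrics(signal, fps, max_bpm=240):
    """Return (bpm, rmssd_ms) for an already filtered pulse signal
    sampled at `fps`, or None if too few beats were found."""
    min_distance = max(1, int(fps * 60 / max_bpm))
    peaks = detect_peaks(signal, min_distance)
    if len(peaks) < 3:
        return None
    # Inter-beat intervals in seconds.
    ibis = [(b - a) / fps for a, b in zip(peaks, peaks[1:])]
    bpm = 60.0 / mean(ibis)
    # Successive differences of the intervals, in milliseconds.
    diffs = [(b - a) * 1000.0 for a, b in zip(ibis, ibis[1:])]
    rmssd = math.sqrt(mean([d * d for d in diffs]))
    return bpm, rmssd
```

In this sketch, `mean_rgb` would be called once per frame on the region returned by `upper_face_roi`, and one of the resulting channel series, after filtering, would be passed to `heart_metrics` together with the camera frame rate.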


Technical details

Input (supported video sources):

  • mjpeg stream

  • rtsp stream

  • USB camera devices

  • video files (avi, mp4, mkv formats supported)
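As a rough illustration of how these four input types could be told apart from a single source string, here is a hedged sketch. The function `classify_source`, the returned labels, and the conventions it relies on (an `http`/`https` URL is treated as an MJPEG stream, a bare number or a `/dev/video*` path as a USB camera) are assumptions for this example and do not describe the system's actual interface.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# File formats listed as supported above.
VIDEO_EXTENSIONS = {".avi", ".mp4", ".mkv"}


def classify_source(source: str) -> str:
    """Map a source string to 'rtsp', 'mjpeg', 'usb' or 'file'.

    Hypothetical convention, for illustration only.
    """
    scheme = urlparse(source).scheme.lower()
    if scheme == "rtsp":
        return "rtsp"
    if scheme in ("http", "https"):
        # Assumption: HTTP(S) URLs point at an MJPEG stream.
        return "mjpeg"
    if source.isdigit() or source.startswith("/dev/video"):
        # Assumption: a camera index or a Linux video device node.
        return "usb"
    if PurePosixPath(source).suffix.lower() in VIDEO_EXTENSIONS:
        return "file"
    raise ValueError(f"unsupported video source: {source!r}")
```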


Output:

  • Processed video frame

  • The faces in the frame (bounding boxes)

  • For each face:

    • Unique Tracking ID (when processing a video file, the same ID on each frame belongs to the same person)

    • 5 facial landmark points

    • Heart rate in beats per minute (BPM)

    • Heart rate diagram

  • The system can also write the processed video to a video file.
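One way to picture the per-frame output listed above is as a small record per face. The sketch below is illustrative only: the class and field names (`FaceResult`, `FrameResult`, `heart_rate_trace`, and so on) are hypothetical and do not reflect the system's actual C++ types.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class FaceResult:
    """Output for one tracked face (field names are assumptions)."""
    tracking_id: int                        # same ID across frames = same person
    bounding_box: Tuple[int, int, int, int]  # (x, y, width, height)
    landmarks: List[Tuple[float, float]]     # 5 facial landmark points
    bpm: Optional[float] = None              # None until a pulse is measured
    heart_rate_trace: List[float] = field(default_factory=list)  # diagram data

    def __post_init__(self) -> None:
        if len(self.landmarks) != 5:
            raise ValueError("expected exactly 5 facial landmark points")


@dataclass
class FrameResult:
    """Output for one processed video frame."""
    frame_index: int
    faces: List[FaceResult] = field(default_factory=list)
```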

The demo video was recorded on an HP Laptop 15-DA0042NH (processor: Intel(R) Core(TM) i7-8550U CPU, RAM: 8 GB).

The system used 500 MB of RAM, and CPU usage was 65% during the recording.

The input video was captured using a Xiaomi CMSXJ22A web camera. The input resolution was 1080p.

During recording, the processing speed stayed above 40 FPS. The system can maintain this speed on this hardware when processing a single face; with multiple faces it may be slower. The visualization was added to the video afterwards. It can also be rendered live, but this may slow down processing.

The system is written entirely in C++ and uses the following libraries/technologies:

