Face tracking with AVFoundation
Face tracking is an interesting feature that has been available on iOS since iOS 5. In this tutorial, I would like to show you how to implement it in Swift 3.0.

The biggest issue with implementing this feature is that you have to add camera support using AVFoundation. It is an alternative to UIImagePicker that is much more customizable and lets you do almost anything with the cameras, but it also requires a bit more time…

OK, so let’s code something.

First of all, we create an instance of CIDetector, which will be used later for our face-detecting features. We create it by setting the detector type to CIDetectorTypeFace (in the same way we could detect rectangles, QR codes or text) and also specify its accuracy (which can be low or high).
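A minimal sketch of that setup in Swift 3 could look like this (the variable names are my own):

    import CoreImage

    // Create a face detector with high accuracy (CIDetectorAccuracyLow is the cheaper option).
    let detectorOptions: [String: Any] = [CIDetectorAccuracy: CIDetectorAccuracyHigh]
    let faceDetector = CIDetector(ofType: CIDetectorTypeFace, context: nil, options: detectorOptions)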

Now it’s time to focus on the camera support implementation. We have to create an AVCaptureSession instance and set its sessionPreset, which defines the quality of captured images.
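Something along these lines, assuming the high-quality preset is good enough for your use case:

    import AVFoundation

    // The capture session coordinates the flow of data from camera inputs to outputs.
    let session = AVCaptureSession()
    session.sessionPreset = AVCaptureSessionPresetHigh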

The next step is getting access to an AVCaptureDevice. For face tracking we will use the front camera. We can get it by filtering the list of all available devices (cameras) on the real device.
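For example, by filtering the video devices by position (captureDevice is my own name here):

    // Pick the front camera from the list of available video devices.
    let devices = AVCaptureDevice.devices(withMediaType: AVMediaTypeVideo) as? [AVCaptureDevice]
    let captureDevice = devices?.first { $0.position == .front }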

Once we have an instance of captureDevice (the front camera), we have to create an AVCaptureDeviceInput which will be added to our AVCaptureSession.

We should start a configuration block with the beginConfiguration method, so our changes are applied atomically when we commit them.

A good and safe way to add a new input to our session is to check whether it can be added (canAddInput) before calling the addInput method.
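A sketch of those three steps together, assuming captureDevice is the front camera found above:

    session.beginConfiguration()

    // Wrap the device in an input and add it only if the session accepts it.
    if let captureDevice = captureDevice,
       let deviceInput = try? AVCaptureDeviceInput(device: captureDevice),
       session.canAddInput(deviceInput) {
        session.addInput(deviceInput)
    }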

Once the input is added, we have to create an output to capture data from the camera.

We can use an AVCaptureVideoDataOutput instance for that. We should also set some video settings, like the pixel format type, and add the output to the session in the same way as the input.
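For example (I use the 32BGRA pixel format here, but other supported formats will work too):

    // Configure the output that will deliver raw video frames.
    let videoOutput = AVCaptureVideoDataOutput()
    videoOutput.videoSettings = [kCVPixelBufferPixelFormatTypeKey as String: Int(kCVPixelFormatType_32BGRA)]

    if session.canAddOutput(videoOutput) {
        session.addOutput(videoOutput)
    }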

Once we are finished with the configuration, we have to call commitConfiguration to let our session instance know that everything is set.
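That is a single call; I also start the session here so frames begin to flow:

    // Apply all of the configuration changes at once, then start capturing.
    session.commitConfiguration()
    session.startRunning()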

Because the output will be collecting data the whole time the session is running, we have to create a dedicated dispatch queue for it.
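For instance (the queue label is arbitrary, and self is assumed to conform to the delegate protocol described next):

    // A serial queue on which the output will deliver sample buffers.
    let outputQueue = DispatchQueue(label: "videoOutputQueue")
    videoOutput.setSampleBufferDelegate(self, queue: outputQueue)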

Next, we have to implement AVCaptureVideoDataOutputSampleBufferDelegate to get access to the raw data gathered by the camera.
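A skeleton of that delegate in Swift 3 might look like this (ViewController is just my example class name):

    extension ViewController: AVCaptureVideoDataOutputSampleBufferDelegate {

        // Called for every frame captured by the camera.
        func captureOutput(_ captureOutput: AVCaptureOutput!,
                           didOutputSampleBuffer sampleBuffer: CMSampleBuffer!,
                           from connection: AVCaptureConnection!) {
            // Face detection will happen here.
        }
    }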

Now the magic begins…

Our face detector is able to look for features in a CIImage instance, so we have to convert the sampleBuffer from the delegate method into one.

By features I mean mouths, eyes and heads (yes, we can detect more than one person at once).

To get the CIImage on which the faceDetector will look for features, we have to use CMSampleBufferGetImageBuffer.
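Roughly like this, inside the delegate method shown above:

    // Convert the sample buffer into a CIImage the detector can work with.
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
    let ciImage = CIImage(cvPixelBuffer: pixelBuffer)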

We also have to create an options dictionary which defines what exactly we want to look for on the faces.

In the example app, which is available on our GitHub (link at the bottom of the page), I focused on detecting smiles and eye blinks.
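In that case the options could look like this:

    // Ask the detector to also report smiles and eye blinks.
    let featureOptions: [String: Any] = [CIDetectorSmile: true,
                                         CIDetectorEyeBlink: true]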

Once we have our ciImage and options set up, we can start the real tracking. The CIDetector object has a features(in:options:) function which returns an array of all found features.
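For example:

    // Run the detector on the current frame.
    let features = faceDetector?.features(in: ciImage, options: featureOptions) ?? []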

Now we can loop through the array to examine the bounds of each face and each feature in it. I focus only on displaying details about one person.

To get access to properties like mouthPosition, hasSmile or leftEyeClosed/rightEyeClosed, we first need to cast each feature to CIFaceFeature.
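A simplified version of that loop (I just print the values here instead of updating labels):

    for case let face as CIFaceFeature in features {
        // Bounds of the detected face in image coordinates.
        print("Face bounds: \(face.bounds)")
        print("Smiling: \(face.hasSmile)")
        print("Left eye closed: \(face.leftEyeClosed), right eye closed: \(face.rightEyeClosed)")

        if face.hasMouthPosition {
            print("Mouth position: \(face.mouthPosition)")
        }
    }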

Inside the for loop I also have a helper function which calculates the proper face rect and updates the label.

So as you can see, the implementation of face feature tracking is really easy, but only once the AVFoundation camera support is in place.