What is semantic segmentation? The smartphone camera tech explained
Qualcomm recently announced its latest flagship platform for smartphones, the Snapdragon 8 Gen 2.
One of the biggest updates here was the new Cognitive ISP, a smart image signal processor that supports photo capture at up to 200 megapixels, more intelligent face detection from the Always-Sensing camera and real-time semantic segmentation.
But what is semantic segmentation, and what does it mean for your phone’s camera?
What is semantic segmentation?
Semantic segmentation is an imaging technology that enables a camera to recognise individual elements within a frame, such as faces, hair, clothes and backgrounds, by assigning each pixel in the image to a category. Those elements can then be optimised separately, meaning the camera can adjust the colour, tone, sharpness and amount of noise for each one on a case-by-case basis.
Think of semantic segmentation like Photoshop layers within your phone camera, allowing different parts of an image to be adjusted individually rather than the camera editing the image as a whole.
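To make the layers comparison a little more concrete, here is a minimal Python sketch of per-region editing. It assumes a segmentation step has already produced a label for every pixel. The label names, the two adjustment functions and the tiny 2x2 image are all made up for illustration and say nothing about how the Cognitive ISP itself works.

```python
# Hypothetical class labels; a real segmentation model defines its own set.
SKY, SKIN, OTHER = 0, 1, 2

def boost_blue(pixel):
    """Push the blue channel up, clamped to the 8-bit maximum."""
    r, g, b = pixel
    return (r, g, min(255, b + 40))

def soften(pixel):
    """Pull each channel halfway toward mid-grey as a crude stand-in for smoothing."""
    return tuple((c + 128) // 2 for c in pixel)

# Each label gets its own adjustment; labels not listed are left untouched.
ADJUSTMENTS = {SKY: boost_blue, SKIN: soften}

def adjust_by_region(image, labels):
    """Apply a different adjustment to each pixel depending on its label."""
    out = []
    for row_pixels, row_labels in zip(image, labels):
        out.append([
            ADJUSTMENTS.get(label, lambda p: p)(pixel)
            for pixel, label in zip(row_pixels, row_labels)
        ])
    return out

# A toy 2x2 RGB image and its (assumed) per-pixel labels.
image = [
    [(100, 150, 200), (100, 150, 230)],
    [(220, 180, 160), (30, 30, 30)],
]
labels = [
    [SKY, SKY],
    [SKIN, OTHER],
]

result = adjust_by_region(image, labels)
```

Only the sky pixels get bluer and only the skin pixel is softened, while the pixel labelled as anything else passes through unchanged. That is the difference between editing by region and editing the image as a whole.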
Real-time semantic segmentation is exactly what it sounds like – the same process, but it happens as you use the camera rather than after you snap the pic or finish recording the video.
There are lots of instances in which real-time semantic segmentation can improve your mobile photography.
The tech can be used to smooth skin, remove reflections from glasses, sharpen hair and fur, bring more blue to the sky and improve the readability of text on screens and in books, to name a few real-world uses.
You can see semantic segmentation in action in the video above.
How accurate is it?
We had the opportunity to test real-time semantic segmentation at Qualcomm’s Snapdragon Summit in Hawaii this year and found it did an okay job of separating skin from hair, clothes and the background behind us, but it wasn’t perfect.
In the demo, the camera was set to turn everything it assumed to be skin green to give us an idea of how accurate it was.
The camera did a good job of highlighting all the skin in the image, aside from a small section peeking out from behind the glasses. However, it struggled a lot with similarly coloured elements, mistakenly highlighting the beige dress and orange iPhone as skin because their tones matched too closely with the face and arms in the image.
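The demo described here can be mimicked with a short Python sketch: paint every pixel the model labels as skin green, then compare that mask against a hand-labelled reference to count the misses and false alarms. The masks and pixel values below are invented to mirror the two errors we saw (a missed patch of skin and a beige dress wrongly flagged). The intersection-over-union score is a common way to grade segmentation masks, not something Qualcomm quoted.

```python
GREEN = (0, 255, 0)

def highlight(image, mask, colour=GREEN):
    """Return a copy of the image with every masked pixel replaced by colour."""
    return [
        [colour if m else pixel for pixel, m in zip(row_px, row_m)]
        for row_px, row_m in zip(image, mask)
    ]

def compare_masks(predicted, reference):
    """Count agreements and errors between a predicted and a reference mask."""
    tp = fp = fn = 0
    for row_p, row_r in zip(predicted, reference):
        for p, r in zip(row_p, row_r):
            if p and r:
                tp += 1      # skin correctly highlighted
            elif p and not r:
                fp += 1      # non-skin wrongly highlighted
            elif r and not p:
                fn += 1      # skin that was missed
    union = tp + fp + fn
    iou = tp / union if union else 1.0
    return {"true_positive": tp, "false_positive": fp,
            "false_negative": fn, "iou": iou}

# Toy 2x3 frame: three skin pixels, one beige dress pixel, two unrelated pixels.
image = [
    [(224, 172, 150), (224, 172, 150), (210, 180, 140)],
    [(224, 172, 150), (20, 20, 20), (40, 90, 200)],
]
# Hypothetical model output: flags the beige dress, misses the lower-left skin.
predicted = [
    [True, True, True],
    [False, False, False],
]
# Hand-labelled reference mask for the same frame.
reference = [
    [True, True, False],
    [True, False, False],
]

highlighted = highlight(image, predicted)
scores = compare_masks(predicted, reference)
```

In this toy frame the dress pixel turns green while one genuine skin pixel does not, giving one false positive, one false negative and an intersection-over-union of 0.5.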
In another demo, we tested the skin smoothing feature and found it to be very subtle and natural, resulting in something a bit more convincing than some of the more overzealous smoothing effects found on TikTok, Instagram and some OEMs’ Android camera software.
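As a rough illustration of why a skin mask helps with subtle smoothing, the Python sketch below blurs only the pixels marked as skin and uses a strength setting to control how heavy the effect is. The 3x3 averaging filter, the `strength` parameter and the greyscale values are assumptions chosen for simplicity, not a description of the filter used in the demo.

```python
def smooth_skin(gray, mask, strength=0.3):
    """Blend each masked pixel with the mean of its 3x3 neighbourhood.

    strength=0 leaves the image unchanged; strength=1 replaces masked
    pixels with the full local average. Unmasked pixels are never modified,
    although they do contribute to the neighbourhood mean.
    """
    height, width = len(gray), len(gray[0])
    out = [row[:] for row in gray]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            neighbours = [
                gray[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            mean = sum(neighbours) / len(neighbours)
            out[y][x] = round((1 - strength) * gray[y][x] + strength * mean)
    return out

# Toy greyscale patch with one bright blemish in the middle.
gray = [
    [100, 100, 100],
    [100, 190, 100],
    [100, 100, 100],
]
# Only the centre pixel is treated as skin in this example.
mask = [
    [False, False, False],
    [False, True, False],
    [False, False, False],
]

subtle = smooth_skin(gray, mask, strength=0.3)
heavy = smooth_skin(gray, mask, strength=1.0)
```

With the lower strength the blemish is toned down but still visible, while the full-strength version flattens it almost entirely. This is the sort of tuning choice that separates a natural result from an overzealous one.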
Ultimately, it’s down to each OEM to tune the features enabled by the Cognitive ISP the way they like, Qualcomm’s VP of product management for camera, Judd Heape, told us at the Summit. That means we’ll need to wait for the first batch of 8 Gen 2 phones to arrive to see how real-time semantic segmentation will improve images in real-world use.