Tuesday, July 26, 2011

Biometrics, object recognition and search

One of the limitations of the internet search we all know and love is that it is bound to text. If you have a picture of something, you can't find out what it is with a search engine. If you've recorded a bird's song, there's no way to get that into the search field to identify the species of bird that created it.

Google clearly recognizes this limitation of its technology and the benefits of extending its capabilities.

Two main types of technology have been developed to help get real-world (non-text) inputs into a form that computers are good at using: Optical Character Recognition and Biometrics.

Optical Character Recognition (OCR) involves taking an image of text and converting it to textual data. Banks use OCR to scan the numbers at the bottom of checks, document scanners use it to make scanned documents searchable and editable, and more recently, it has been used to read vehicle license plates in law enforcement. OCR was the logical first step along the path of using pictures as computer inputs because text is fairly easy to break into its constituent parts (characters) and there simply aren't that many different characters to identify. So the tech is comparatively easy and the ROI (Return on Investment) of using computers instead of humans to transcribe text is very straightforward.
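To make the "break it into characters, then pick from a small set" idea concrete, here is a minimal sketch in Python. It is a toy, not how any production OCR engine works: the three-glyph font in TEMPLATES, the '#'/'.' bitmap format, and the function names are all made up for illustration.

```python
# Toy OCR sketch: segment a bitmap into glyphs, then match each glyph
# against a small set of known character templates.
# The font below is hypothetical: 3 columns wide, 5 rows tall, '#' = ink.
TEMPLATES = {
    "0": ["###", "#.#", "#.#", "#.#", "###"],
    "1": [".#.", "##.", ".#.", ".#.", "###"],
    "7": ["###", "..#", ".#.", ".#.", ".#."],
}


def render(text):
    """Build a test bitmap by placing templates side by side with one blank column between them."""
    return [".".join(TEMPLATES[ch][row] for ch in text) for row in range(5)]


def segment(rows):
    """Split a bitmap into glyphs at columns that contain no ink."""
    width = len(rows[0])
    glyphs, start = [], None
    for x in range(width + 1):
        has_ink = x < width and any(row[x] == "#" for row in rows)
        if has_ink and start is None:
            start = x  # a glyph begins at the first inked column
        elif not has_ink and start is not None:
            glyphs.append([row[start:x] for row in rows])
            start = None
    return glyphs


def classify(glyph):
    """Return the template character with the fewest mismatched pixels."""
    def mismatches(template):
        return sum(
            a != b
            for glyph_row, template_row in zip(glyph, template)
            for a, b in zip(glyph_row, template_row)
        )
    return min(TEMPLATES, key=lambda ch: mismatches(TEMPLATES[ch]))


def read_text(rows):
    """Segment the bitmap and classify each glyph in order."""
    return "".join(classify(glyph) for glyph in segment(rows))
```

Calling read_text(render("107")) recovers the original string. Because the set of possible answers is tiny, even a glyph with a stray pixel still lands on the right character, which is a large part of why OCR was tractable so early.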

Biometrics are the second generation of automatic object recognition. Biometrics are far more complicated than OCR but far less complicated than a theoretical no-holds-barred image search engine. With biometrics the tech is more difficult but the ROI of using computers to help manage identity can be very substantial indeed.

Techniques developed for biometric identity management are also being applied to recognition tasks that have nothing to do with human identity. Leafsnap is an app that uses visual recognition software to help identify tree species from photographs of their leaves. StripeSpotter is a free, open-source system with an algorithm that can identify individual animals in the wild and build biometric databases from photos of them. Here, the tech developed for high-ROI applications is being applied to new, though lower-value, challenges.

Which brings us to Google's acquisition of PittPatt, a Pittsburgh pattern recognition company.

Google Buys PittPatt Facial Recognition Tech (PCMag.com)
In late March, Google denied plans for a dedicated facial-recognition app, although the company has said it could do so as far back as the launch of Google Goggles, which used object recognition to identify real-world objects.
With this acquisition, I suspect Google has object recognition in mind more than facial recognition for identity search. First, Google has been wary of face recognition in public search. Whether this is due to the technical challenges of an unbounded face recognition application or a respect for the privacy of their users, I'll leave it to the reader to judge. It is also a much different challenge to return the result "This is a human face" than it is to say "This is a human face and that face belongs to Guy Herbert." In most object search the first type of result will be the most desirable anyway. If you submit a photo of an insect to a search, you aren't asking about the insect's individual identity; you probably just want to know what type of insect it is.
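The difference between the two kinds of result can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the two-number "feature vectors," the category centroids, the enrolled names, and the threshold are all invented, and nothing here reflects how PittPatt or Google actually do it.

```python
import math

# Hypothetical 2-D feature vectors. Real systems extract far larger
# feature vectors from the image; these numbers are made up.
CATEGORY_CENTROIDS = {"human face": (0.9, 0.1), "insect": (0.1, 0.9)}
ENROLLED_FACES = {"Person A": (0.95, 0.12), "Person B": (0.85, 0.05)}


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize_category(features):
    """Object recognition: which kind of thing is this? Always returns a category."""
    return min(
        CATEGORY_CENTROIDS,
        key=lambda name: distance(features, CATEGORY_CENTROIDS[name]),
    )


def identify_face(features, threshold=0.03):
    """Identity search: whose face is this? Returns None when no enrolled face is close enough."""
    best = min(ENROLLED_FACES, key=lambda name: distance(features, ENROLLED_FACES[name]))
    return best if distance(features, ENROLLED_FACES[best]) <= threshold else None
```

With these made-up numbers, a probe of (0.94, 0.11) is recognized as a "human face" and identified as "Person A", while (0.9, 0.2) is still a "human face" but matches nobody enrolled. The category question has a handful of well-separated answers; the identity question needs a gallery of known individuals, much finer distinctions, and a policy for saying "no match," which is where both the technical difficulty and the privacy questions come in.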

I'll bet Google's more interested in the object recognition capabilities of PittPatt than they are in facial recognition.