Local keypoint descriptors are well known in computer vision applications. They encode, in a vector, some of the visual information found around a selected keypoint in the image. We stress “some” because many kinds of information can be extracted from an image patch and described compactly, each under different assumptions (e.g. color histograms, edge directions, gradient histograms).
Local descriptors have been around for at least a decade, and many have been proposed over time. The best known is SIFT (Scale-Invariant Feature Transform) [PDF] [PDF], which attempts to capture the dominant gradients in a grid of cells around a keypoint. A more recent one is SURF (Speeded Up Robust Features) [PDF], which improves on both the keypoint detection and descriptor stages by building a distribution-based descriptor that is more compact and much faster to extract.
There are many more keypoint detectors and descriptors. In this article we focus on a novel descriptor named FREAK (Fast Retina Keypoint) [PDF]. It is a binary descriptor that encodes simple image intensity comparisons over a specific sampling pattern, at both large and small scales. It captures visual information in a human-like manner: coarsely in the peripheral regions of the retina and finely in the central (fovea) region. Its main practical appeal is extremely fast extraction and matching, which has long been awaited for embedded devices and large-scale applications. We refer the reader to the original paper for a more exhaustive discussion, comparisons, and technical details.
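To make the idea of a binary descriptor built from intensity comparisons concrete, here is a minimal sketch in pure Python. It is not FREAK itself: the pairs of offsets are hand-picked for illustration and raw pixel values are compared, whereas the real sampling pattern and pair selection are described in the paper and are not reproduced here. The names `binary_descriptor` and `pairs` are hypothetical.

```python
def binary_descriptor(image, x, y, pairs):
    """Pack one bit per pair of offsets around the keypoint (x, y).

    Bit i is set when the intensity at the first offset of pair i is
    strictly greater than the intensity at the second offset.
    """
    bits = 0
    for i, ((ax, ay), (bx, by)) in enumerate(pairs):
        if image[y + ay][x + ax] > image[y + by][x + bx]:
            bits |= 1 << i
    return bits


# Toy 9x9 image: intensity grows from left to right, constant along columns.
image = [[col * 10 for col in range(9)] for _row in range(9)]

# Hand-picked (dx, dy) offset pairs; an assumption made for illustration only.
pairs = [
    ((1, 0), (-1, 0)),   # right neighbor vs. left neighbor
    ((-2, 0), (2, 0)),   # far left vs. far right
    ((0, 1), (0, -1)),   # below vs. above
]

descriptor = binary_descriptor(image, 4, 4, pairs)
```

Because the result is just a bit string, two such descriptors can be compared with a single XOR followed by a bit count, which is where the speed of binary descriptors comes from.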
As the FREAK descriptor is very recent and, as of this writing, was only recently integrated into OpenCV, I decided to write a MEX interface for MATLAB. For descriptor extraction there are two MEX files: a detected-feature version and a dense-feature version. The first uses the SURF detector to find the keypoints, while the second samples keypoints densely over the whole image.
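Dense sampling can be pictured as placing keypoints on a regular grid. The sketch below is only an illustration of that idea in pure Python; the `step` and `margin` parameters are assumptions and do not necessarily correspond to how the dense MEX file chooses its keypoints.

```python
def dense_keypoints(width, height, step, margin):
    """Return (x, y) keypoint centers on a regular grid.

    `margin` keeps the centers away from the image border so that a
    descriptor patch around each point stays inside the image.
    """
    return [
        (x, y)
        for y in range(margin, height - margin, step)
        for x in range(margin, width - margin, step)
    ]


# A 64x48 image sampled every 16 pixels, staying 8 pixels from each border.
points = dense_keypoints(width=64, height=48, step=16, margin=8)
```

A smaller `step` gives more descriptors per image, at a proportional cost in extraction and matching time.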
Additionally, I would like to put forward three functions for working with these binary descriptor vectors:
- Hamming distance [WWW] (highly efficient with SSE3 enabled!)
- k-Nearest Neighbor with Hamming distance
- Bi-directional FREAK descriptor matching for two sets using the Hamming distance
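The three operations above can be sketched in a few lines of pure Python. This is an illustrative reference version and not the MEX code: descriptors are stored as plain integers, the search is brute force, and none of the SSE3 optimizations are used. The function names `hamming`, `knn` and `cross_match` are hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two descriptors stored as ints."""
    return bin(a ^ b).count("1")


def knn(query, database, k):
    """Indices of the k descriptors in `database` closest to `query`.

    Brute-force search; ties keep the lower index first because
    Python's sort is stable.
    """
    order = sorted(range(len(database)),
                   key=lambda i: hamming(query, database[i]))
    return order[:k]


def cross_match(set_a, set_b):
    """Bi-directional matching between two descriptor sets.

    A pair (i, j) is kept only when set_b[j] is the nearest neighbor of
    set_a[i] and set_a[i] is in turn the nearest neighbor of set_b[j].
    """
    matches = []
    for i, a in enumerate(set_a):
        j = knn(a, set_b, 1)[0]
        if knn(set_b[j], set_a, 1)[0] == i:
            matches.append((i, j))
    return matches


# Toy 8-bit descriptors; real FREAK descriptors are much longer.
set_a = [0b11110000, 0b00001111, 0b10101010]
set_b = [0b00001110, 0b11110001, 0b01010101]

matches = cross_match(set_a, set_b)
```

Here the third descriptor of `set_a` is left unmatched: its nearest neighbor in `set_b` prefers a different descriptor in `set_a`, which is exactly the kind of ambiguous match the bi-directional check filters out.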
The code can be downloaded [HERE]. It is distributed under the BSD license [WWW].
To compile, you will need at least OpenCV 2.4.2 [LINK]! To get the efficient Hamming distance computation compiled and working, you will need a recent compiler and an Intel processor that supports SSE3 instructions!
Feel free to write to me if you find any bugs that I have not noticed yet. Porting the whole thing to a CMake project would be a nice addition, among other things still to do.
EPILOG: I would like to ask everyone a question: has anybody worked with large amounts of binary data and performed clustering with the Hamming distance (or any other metric suited to binary data)? I could not find any implementation apart from MATLAB’s in the Statistics Toolbox, which does not seem suited to large amounts of data.