Feature descriptor comparison report

Introduction

For this test I have written a special test framework, which allows me to easily add new kinds of descriptors and test cases, and to generate report data in a CSV-like format. Then I upload it to Google Docs and create these charts. Five quality tests and one performance test were done for each kind of descriptor.

Test cases

  • Rotation test – shows how the feature descriptor depends on feature orientation.
  • Scaling test – shows how the feature descriptor depends on feature size.
  • Blur test – shows how robust the feature descriptor is against blur.
  • Lighting test – shows how robust the feature descriptor is against lighting changes.
  • Pattern detection test – performs detection of a planar object (an image) in a real video. In contrast to the synthetic tests, this test gives a realistic picture of the overall stability of a particular descriptor.
  • Performance test – measures the descriptor extraction time.

All quality tests work in a similar way: from a given source image we generate synthetic test data, i.e. transformed images and the corresponding feature points. The transformation algorithm depends on the particular test. For the rotation test it is rotation of the source image around its center through 360 degrees; for scaling it is resizing of the image from 0.25x to 2x of the original size. The blur test uses Gaussian blur with several steps, and the lighting test changes the overall picture brightness.
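These transformations map directly onto standard OpenCV calls. Below is a minimal sketch of what the generation step could look like; the helper names are mine for illustration, not the framework's actual code, and the corresponding keypoint coordinates have to be mapped through the same transform.

```cpp
// Illustrative sketch of the synthetic test-image generation (helper names
// are hypothetical). Reference keypoints must be transformed with the same
// parameters so that ground-truth correspondences stay valid.
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>

// Rotation test: rotate the source image around its center by a given angle.
cv::Mat rotateImage(const cv::Mat& src, double angleDeg)
{
    cv::Point2f center(src.cols * 0.5f, src.rows * 0.5f);
    cv::Mat rot = cv::getRotationMatrix2D(center, angleDeg, 1.0);
    cv::Mat dst;
    cv::warpAffine(src, dst, rot, src.size());
    return dst;
}

// Scaling test: resize the image; the test sweeps scale from 0.25 to 2.0.
cv::Mat scaleImage(const cv::Mat& src, double scale)
{
    cv::Mat dst;
    cv::resize(src, dst, cv::Size(), scale, scale, cv::INTER_LINEAR);
    return dst;
}

// Blur test: Gaussian blur with an increasing (odd) kernel size.
cv::Mat blurImage(const cv::Mat& src, int kernelSize)
{
    cv::Mat dst;
    cv::GaussianBlur(src, dst, cv::Size(kernelSize, kernelSize), 0);
    return dst;
}

// Lighting test: shift the overall brightness by a constant offset.
cv::Mat changeBrightness(const cv::Mat& src, double offset)
{
    cv::Mat dst;
    src.convertTo(dst, -1, 1.0, offset);
    return dst;
}
```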

The pattern detection test deserves special attention. This test is done on a very complex and noisy video sequence, so it is a challenging task for any feature descriptor algorithm to demonstrate good results here.

The metric for all quality tests is the percentage of correct matches between the source image and the transformed one. Since we use a planar object, we can easily select the inliers from all matches using homography estimation. I use OpenCV's cvFindHomography function for this. This metric gives very good and stable results. I do no outlier filtering of the matches before the homography estimation, because this would affect the results in unexpected ways. The matching of descriptors is done via brute-force matching from OpenCV.
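Roughly, the metric can be sketched as follows. This is my own illustrative code using OpenCV's C++ API (cv::findHomography wraps the same estimator as cvFindHomography); the RANSAC reprojection threshold and the L2 norm, which assumes float descriptors such as SURF or SIFT, are assumptions, not values from the article.

```cpp
// Sketch of the quality metric: the percentage of brute-force matches that
// are inliers under a RANSAC-estimated homography.
#include <opencv2/core/core.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/calib3d/calib3d.hpp>
#include <vector>

double matchQuality(const cv::Mat& queryDescriptors,
                    const cv::Mat& trainDescriptors,
                    const std::vector<cv::KeyPoint>& queryKp,
                    const std::vector<cv::KeyPoint>& trainKp)
{
    // Brute-force matching; no outlier pre-filtering, as in the article.
    cv::BFMatcher matcher(cv::NORM_L2); // use NORM_HAMMING for binary descriptors
    std::vector<cv::DMatch> matches;
    matcher.match(queryDescriptors, trainDescriptors, matches);

    if (matches.size() < 4)
        return 0.0; // a homography needs at least 4 correspondences

    std::vector<cv::Point2f> srcPts, dstPts;
    for (size_t i = 0; i < matches.size(); i++)
    {
        srcPts.push_back(queryKp[matches[i].queryIdx].pt);
        dstPts.push_back(trainKp[matches[i].trainIdx].pt);
    }

    // RANSAC homography estimation; the mask marks the inlier matches.
    std::vector<unsigned char> inlierMask;
    cv::findHomography(srcPts, dstPts, cv::RANSAC, 3.0, inlierMask);

    int inliers = 0;
    for (size_t i = 0; i < inlierMask.size(); i++)
        if (inlierMask[i])
            inliers++;

    return 100.0 * inliers / matches.size();
}
```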

Rotation test

In this test I obtained pretty predictable results, because all descriptors are rotation invariant except BRIEF. The slight differences in stability can be explained by the feature orientation calculation algorithm and the nature of each descriptor. A detailed study of why each descriptor behaves exactly as it does takes time and effort; it's a topic for another article. Maybe later on…

Scaling test

SURF and SIFT descriptors demonstrate very good stability in this test because they perform an expensive keypoint size calculation. The other descriptors use a fixed descriptor size, and you can see what that leads to. Currently I do not have a separate feature detector for the LAZY descriptor (I use the SURF detector for the tests), but I'm thinking about a lightweight feature detector with feature size calculation, because it's a must-have feature. Actually, scale invariance is much more important than precise orientation calculation.

Blur test

In this test I tried to simulate the motion blur that can occur when the camera moves suddenly. All descriptors demonstrate good results in this test. By “good” I mean that the larger the applied blur size, the lower the percentage of correct matches, which is the expected behavior.

Lighting test

In the lighting test the transformed images differ only in overall image brightness. All kinds of descriptors work well in this case. The major reason is that all descriptors are extracted normalized, i.e. the L2 norm of the descriptor vector equals 1. This normalization makes the descriptor invariant to brightness changes.
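For illustration, this is what such a normalization amounts to; the helper below is my own sketch, not code from the test framework.

```cpp
// Sketch: L2-normalize each descriptor (one per row) so that its L2 norm
// equals 1. A global brightness/contrast change scales the raw responses
// together, so the normalized descriptor stays (almost) unchanged.
#include <opencv2/core/core.hpp>

void normalizeDescriptors(cv::Mat& descriptors) // CV_32F, one row per feature
{
    for (int i = 0; i < descriptors.rows; i++)
    {
        cv::Mat row = descriptors.row(i);
        double n = cv::norm(row, cv::NORM_L2);
        if (n > 1e-12) // avoid division by zero for degenerate descriptors
            row /= n;
    }
}
```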

Pattern detection on real video

Detection of the object in real video is the most complex task, since the ground truth contains rotation, scaling and motion blur; other objects are present in the scene as well; and finally, the video is not HD quality. These conditions are dictated by the actual conditions in which computer vision applications are used.

As you can see in the diagram, the SIFT and SURF descriptors give the best results. Although they are far from ideal, this is quite good for such a challenging video. Unfortunately, the scale-covariant descriptors show very bad results in this test, because the pattern image appears at 1:1 scale only at the beginning of the video (the “spike” near frame 20). In the rest of the video sequence the target object moves away from the camera, and the scale-covariant descriptors can't handle this situation.

Performance summary

This chart shows the extraction time for N features. I made the Y-axis logarithmic to make the chart more readable. For all descriptor extraction algorithms, the extraction time depends linearly on the number of features. The local spikes are probably caused by vector resizing or L2 cache misses. This performance test was done on a MacBook Pro with a 2.13 GHz Core 2 Duo.
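For reference, such a measurement can be taken with OpenCV's tick counter; the sketch below is my own and the helper name is hypothetical, since the actual benchmark code is not shown here.

```cpp
// Sketch: wall-clock time (in milliseconds) to extract descriptors for the
// given keypoints with any cv::DescriptorExtractor implementation.
#include <opencv2/core/core.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <vector>

double measureExtractionMs(cv::DescriptorExtractor& extractor,
                           const cv::Mat& image,
                           std::vector<cv::KeyPoint>& keypoints)
{
    cv::Mat descriptors;
    int64 start = cv::getTickCount();
    extractor.compute(image, keypoints, descriptors);
    return 1000.0 * (cv::getTickCount() - start) / cv::getTickFrequency();
}
```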

Further work

  • Add new quality test cases. One additional test I know of for sure is affine transformations. Your ideas for other tests are welcome!
  • Add new kinds of descriptors. I will definitely add an A-SIFT implementation.
  • Create a LAZY detector with feature size and orientation estimation.
  • Improve the LAZY descriptor extraction procedure. I expect at least a 20% performance gain.
  • Generate a matching video for each test to demonstrate the behavior of each descriptor algorithm.