Private:progress-khodabakhshi

From NMSL

Revision as of 07:34, 18 May 2011

Spring 2011 (RA)

  • Courses:
    • CMPT 771: Internet Architecture and Protocols

May 17

  • Reading this paper "". Motivation: to see how distorted a synthesized view would be if its texture and depth are distorted to some degree.
  • Working on normalizing the scores and defining a threshold, so that if the score between two videos is greater than this threshold, they are considered copies of each other. Once this threshold is defined, the precision and recall of the system can be determined.
  • I will probably try a sigmoid function to normalize the scores to [0, 1], and boost the score of the best matching video if it has a much higher relevance score than the second-best matching video: x1 = x1 * (x1/x2). (See the sketch after this list.)
  • What should be done next:
    • Right now the system uses all the frames of the videos; the next step would be to use a boundary detection algorithm to extract keyframes.
    • So far I have evaluated the performance of the system against the view interpolation attack using actual videos taken from different viewpoints by cameras. The next step would be to synthesize views, determine their distortion, and use the synthesized views for evaluation.
    • Right now the implementation and evaluation consider 3D videos consisting of one video plus its depth map. This can be extended to multiview plus depth; if so, depth extraction is needed.
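
A minimal Matlab sketch of the normalization, boosting, and decision steps above; the steepness a, midpoint c, and threshold values are placeholders to be tuned, not numbers from the actual system:

    % hypothetical raw relevance scores of the best and second-best matching videos
    x1 = 8.4;   % best match
    x2 = 3.1;   % second-best match

    % boost the best score when it clearly dominates the runner-up
    x1 = x1 * (x1 / x2);

    % sigmoid normalization to [0, 1]; a controls the steepness, c the midpoint
    a = 1.0;
    c = 5.0;
    s = 1 / (1 + exp(-a * (x1 - c)));

    % copy decision against a threshold chosen from the precision/recall trade-off
    threshold = 0.5;
    is_copy = (s >= threshold);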

May 9

  • The evaluation part of my report has been updated.

April 26

  • Implementing: combining the SIFT and depth results to make a better decision (see the sketch after this list).
  • The next step would be to evaluate it: different transformations must be applied to the query videos, and the robustness of the system against them should be measured.
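
One simple way to combine the two cues is a weighted sum of the per-video SIFT and depth-signature scores. The sketch below is only an illustration with a hypothetical weight w, not necessarily the fusion rule used in the implementation:

    % hypothetical matching scores of one query video against one reference video
    sift_score  = 0.72;   % score from SIFT (texture) matching
    depth_score = 0.55;   % score from the depth signature

    % weighted late fusion; w would be tuned on a validation set
    w = 0.6;
    combined_score = w * sift_score + (1 - w) * depth_score;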

April 18

  • Course work (final exam)
  • The "D. Depth map creation for multiview video" section of my report is updated.

April 8

  • Course work (problem set + presentation)
  • Exploring depth extraction details

March 29

  • Implementing the depth signature part of the system. The updated report is here.

March 8

  • Exploring multiview geometry and depth extraction methods. My report is here.

March 1

  • Continuing implementation

I am implementing the algorithm proposed in my talk using the libraries mentioned in the Feb 22 entry below. The implementation is complete up to and including the frame-level matching phase.

Feb 22

  • Continuing implementation

There are several reliable implementations of the SIFT algorithm. The first is by the author of SIFT, David Lowe. However, his implementation is not open source; he distributed only a binary. In other words, it is not flexible: it is not possible to change its parameters, which is necessary for our task, since we have to adjust the parameters to reduce the number of SIFT features by accepting only the most informative ones. The implementation I decided to use is the one by Andrea Vedaldi, which is written in C and also has a Matlab interface. This open-source SIFT library won first place at the ACM Multimedia 2010 Open-Source Software Competition (the competition was stiff). It is flexible enough for our task: the two parameters we need to change, the peak threshold and the edge threshold, can both be adjusted through this library.
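
For illustration, extracting SIFT features through the library's Matlab interface with these two thresholds could look like the following (the frame is synthetic and the threshold values are placeholders, not the ones used in the system):

    % assumes VLFeat is on the Matlab path (run vl_setup first)
    % vl_sift expects a single-precision grayscale image; a real frame would be
    % converted with single(rgb2gray(frame))
    I = single(rand(480, 640));

    % raising the peak threshold and tightening the edge threshold reduces the
    % number of accepted SIFT features
    [frames, descriptors] = vl_sift(I, 'PeakThresh', 0.02, 'EdgeThresh', 8);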

For the approximate nearest neighbor task, I have decided to use the open-source FLANN library, which is one of the fastest approximate nearest-neighbor libraries, as explained in more detail in my report. This library is written in C++ and also has a Matlab interface.
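
As a rough sketch (assuming the flann_build_index / flann_search entry points and parameter names of the FLANN Matlab bindings), approximate nearest-neighbor matching of SIFT descriptors could look like this:

    % hypothetical descriptor matrices: one 128-D SIFT descriptor per column
    ref_descriptors   = single(rand(128, 1000));
    query_descriptors = single(rand(128, 50));

    % build a randomized kd-tree index and search it for the 2 nearest neighbors
    params = struct('algorithm', 'kdtree', 'trees', 8, 'checks', 64);
    index = flann_build_index(ref_descriptors, params);
    [nn_ids, nn_dists] = flann_search(index, query_descriptors, 2, params);
    flann_free_index(index);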

I have learned to use both libraries through their Matlab interfaces. The reason is that this task involves a fair amount of matrix manipulation, and Matlab is efficient at that.

Feb 15

  • Starting implementation

Feb 8

  • Exploring different nearest neighbor techniques in high dimensional spaces

Feb 1

  • Exploring depth maps features

Jan 24

  • Adapting 2D CBVCD to 3D
  • Preparing the group meeting talk: Slides.

Jan 17

  • Exploring image registration methods
  • Exploring feature-based image registration methods
  • Exploring feature-based image registration methods that use SIFT-based methods
  • A more detailed report is available at: report, but it is still changing...

Jan 10

  • Exploring 3D videos.
  • Exploring literature for 3D video copy detection

Jan 3

  • having fun, not really working :)

Fall 2010 (TA)

  • Courses:
    • CMPT 701: Design/Analysis Algorithms
    • CMPT 820: Multimedia Systems


  • Worked on 2D video copy detection.