Revision as of 20:49, 28 March 2011

Spring 2011 (GF)

Courses: None

working on: Video Copy Detection using Optical Flow

March 15 - March 29

I have analyzed the results from the transformations using both cosine and Euclidean distance metrics. I then built motion histograms for the 399 videos in the TrecVid 2009 database: I used ffmpeg to cut a 3-minute clip from each video and created the motion histogram for each clip. I computed the pairwise distances between all pairs of clips and compared them to the distances obtained for a single clip under transformations. I discovered that many of the distances between different clips were smaller than the distance between a clip and its mildly transformed copy. This presents a problem for using motion vectors as a signature.
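A minimal sketch of the comparison described above, assuming each clip's motion histogram is a plain list of bin values. The function names and the threshold parameter are illustrative and not taken from the actual code.

```python
import math
from itertools import combinations


def euclidean_distance(h1, h2):
    """Euclidean distance between two histograms of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))


def cosine_distance(h1, h2):
    """1 - cosine similarity; returns 1.0 if either histogram is all zeros."""
    dot = sum(a * b for a, b in zip(h1, h2))
    norm1 = math.sqrt(sum(a * a for a in h1))
    norm2 = math.sqrt(sum(b * b for b in h2))
    if norm1 == 0 or norm2 == 0:
        return 1.0
    return 1.0 - dot / (norm1 * norm2)


def pairwise_distances(histograms, metric):
    """histograms: dict mapping clip name -> list of bin values.

    Returns a dict mapping each unordered pair of clip names to its distance.
    """
    names = sorted(histograms)
    return {(a, b): metric(histograms[a], histograms[b])
            for a, b in combinations(names, 2)}


def confusable_pairs(histograms, metric, threshold):
    """Pairs of different clips that are closer than the given threshold."""
    distances = pairwise_distances(histograms, metric)
    return [pair for pair, d in distances.items() if d < threshold]
```

With `threshold` set to the distance between a clip and one of its transformed copies, `confusable_pairs` lists the pairs of different clips that would be mistaken for copies at that threshold.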

Most of the motion is small. On average, 40% of the motion vectors are (0,0) and 23% are of magnitude 1, so 63% of the motion falls in the first few bins. I am experimenting with using the motion outside this region as the fingerprint.
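One way to look at motion outside the small-motion region is sketched below. It assumes motion vectors are (vx, vy) pairs and that "small" means magnitude at most 1; the threshold, bin width and bin count are illustrative values, not the ones used in the experiments.

```python
import math


def magnitude(vector):
    """Length of a (vx, vy) motion vector."""
    return math.hypot(vector[0], vector[1])


def small_motion_fraction(vectors, threshold=1.0):
    """Fraction of vectors whose magnitude is at most the threshold."""
    if not vectors:
        return 0.0
    small = sum(1 for v in vectors if magnitude(v) <= threshold)
    return small / len(vectors)


def outer_motion_histogram(vectors, threshold=1.0, bin_width=2.0, num_bins=8):
    """Normalized magnitude histogram over vectors larger than the threshold.

    Vectors at or below the threshold are ignored, so the dominant
    near-zero motion does not swamp the remaining bins. Magnitudes past
    the last bin are counted in the last bin.
    """
    counts = [0] * num_bins
    for v in vectors:
        m = magnitude(v)
        if m <= threshold:
            continue
        index = min(int((m - threshold) / bin_width), num_bins - 1)
        counts[index] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * num_bins
    return [c / total for c in counts]
```

Because the histogram is renormalized over the remaining vectors only, two clips with very different amounts of static background can still be compared on the shape of their larger motion.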

March 8 - March 14

I have made copies of a clip with the following transformations applied:

Resizing to 90%, 80%, 70%, 60%, 50% of the original.
Cropping of 10, 20, 30, 40, 50 pixels from border around clip
Addition of a logo to the clip
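The transformations above could be generated with ffmpeg along the following lines. This is only a sketch: the exact commands used for these clips are not recorded here, the file names and logo position are hypothetical, and the `scale`, `crop` and `overlay` filters are assumed to be available in the installed ffmpeg build. The functions only build argument lists; they do not run anything.

```python
def resize_command(src, dst, percent):
    """ffmpeg arguments to scale a clip to the given percentage of its size.

    Dimensions are rounded down to even numbers, which many encoders require.
    """
    scale = percent / 100.0
    vf = f"scale=trunc(iw*{scale}/2)*2:trunc(ih*{scale}/2)*2"
    return ["ffmpeg", "-i", src, "-vf", vf, dst]


def crop_command(src, dst, border):
    """ffmpeg arguments to remove `border` pixels from every side of a clip."""
    vf = f"crop=iw-{2 * border}:ih-{2 * border}:{border}:{border}"
    return ["ffmpeg", "-i", src, "-vf", vf, dst]


def logo_command(src, logo, dst, x=10, y=10):
    """ffmpeg arguments to overlay a logo image at position (x, y)."""
    return ["ffmpeg", "-i", src, "-i", logo,
            "-filter_complex", f"overlay={x}:{y}", dst]


def all_commands(src, logo="logo.png"):
    """One command per transformation in the list above (11 in total)."""
    commands = []
    for percent in (90, 80, 70, 60, 50):
        commands.append(resize_command(src, f"resize_{percent}.mpg", percent))
    for border in (10, 20, 30, 40, 50):
        commands.append(crop_command(src, f"crop_{border}.mpg", border))
    commands.append(logo_command(src, logo, "logo.mpg"))
    return commands
```

Each argument list can be passed to `subprocess.run` once the file names have been checked.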

I am now trying to analyse the histograms to see how well the motion signature matches between the original clip and its transformed copies. I expect to have results by the end of Tuesday.

March 2 - March 7

I worked on writing up the algorithm - I had some trouble envisioning how to combine the two independent elements (motion vectors and SURF features) into one signature. As per discussion with Dr. Hefeeda, I am looking just at the motion vectors for now.

Several papers use motion vectors.

Hampapur et al. describe a signature based on the motion of the video across frames. Each frame is partitioned into N = Nx × Ny blocks. They examine a block in frame t centered at (xt, yt) and look in frame t+1, within a specified search region, for the block which has the minimum Sum of Average Pixel Differences (SAPD). The difference between the block location in frame t and the location of the best match in frame t+1 produces a motion vector.

Tasdemir et al. is the only paper I found which used the motion vectors directly. This paper claims motion vectors are a good parameter, but that "a complete CBCD system should be capable of fusing the information coming from motion, color and SIFT feature sets in an intelligent manner to reach a decision".

No paper that I saw attempted to aggregate motion vectors across frames for the signature.

I am working on testing the motion vectors under a few transformations to see how the histograms compare.
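The block search described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: frames are 2D lists of grey-level values, blocks are addressed by their top-left corner rather than their centre, and the cost is taken to be the mean absolute pixel difference over the block as a stand-in for SAPD.

```python
def block_cost(frame_a, frame_b, ax, ay, bx, by, size):
    """Mean absolute pixel difference between two size x size blocks.

    (ax, ay) and (bx, by) are the top-left corners of the blocks in
    frame_a and frame_b. This cost is an assumed stand-in for SAPD.
    """
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(frame_a[ay + dy][ax + dx] - frame_b[by + dy][bx + dx])
    return total / (size * size)


def motion_vector(frame_t, frame_t1, x, y, size, search):
    """Exhaustive search for the best match of one block.

    Looks at every offset within +/- `search` pixels of (x, y) in
    frame_t1 and returns the (dx, dy) offset with the lowest cost.
    Candidate blocks that fall outside the frame are skipped.
    """
    height = len(frame_t1)
    width = len(frame_t1[0])
    best = (0, 0)
    best_cost = block_cost(frame_t, frame_t1, x, y, x, y, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            nx, ny = x + dx, y + dy
            if nx < 0 or ny < 0 or nx + size > width or ny + size > height:
                continue
            cost = block_cost(frame_t, frame_t1, x, y, nx, ny, size)
            if cost < best_cost:
                best_cost = cost
                best = (dx, dy)
    return best
```

Calling `motion_vector` once per block of the Nx × Ny partition gives one vector per block for each pair of consecutive frames.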

Feb 23 - March 1

I have created several test cases to examine and validate the motion vectors I extract.
Experimenting with the extracted motion vectors gave results which did not seem correct. I have been digging into the source code to try to figure out what is going on. I have been able to fix one major problem relating to whether the prediction was based on the previous frame or the next frame. I am still looking into some others.

Feb 16 - Feb 22

Using ffmpeg APIs I am now able to do the following:
- Extract I-frames and save in jpg format for later analysis
- Extract Motion vectors for B and P frames which I will use to build my histogram

Note: There does not seem to be a way to do this with ffmpeg in the compressed domain. I have to make a call to avcodec_decode_frame() in order to get the motion information and the frame type. There is a workaround for the I-frames: I can use a utility called ffprobe (use the R92 branch) with the -show_frames option, from which a list of I-frames can be constructed. But since I need to decode the I-frames to decode subsequent B/P frames, I will just loop through and decode all frames. Theoretically, this could be run in the compressed domain if one knew enough about the encoding protocol and had lots of time to write their own parser. I think that my approach is enough to show proof of concept for now.
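As a rough illustration of the ffprobe workaround, the sketch below builds a list of I-frame indices from saved `-show_frames` output. It assumes the text layout printed by current ffprobe builds (key=value lines between `[FRAME]` and `[/FRAME]`, with `pict_type=I` marking I-frames); the output of the R92 branch may differ, and the function name is hypothetical.

```python
def iframe_indices(ffprobe_text):
    """Return the 0-based indices of I-frames among the video frames.

    ffprobe_text is the saved output of `ffprobe -show_frames`, assumed
    to consist of [FRAME] ... [/FRAME] sections of key=value lines.
    Sections with a media_type other than "video" are skipped.
    """
    indices = []
    frame_index = -1
    fields = None  # key/value pairs of the section being read, if any
    for raw in ffprobe_text.splitlines():
        line = raw.strip()
        if line == "[FRAME]":
            fields = {}
        elif line == "[/FRAME]" and fields is not None:
            if fields.get("media_type", "video") == "video":
                frame_index += 1
                if fields.get("pict_type") == "I":
                    indices.append(frame_index)
            fields = None
        elif fields is not None and "=" in line:
            key, value = line.split("=", 1)
            fields[key] = value
    return indices
```

The resulting indices only say where the I-frames are; the frames still have to be decoded to obtain the pictures and the motion vectors of the B/P frames that follow.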

Feb 9 - Feb 15

Worked on motion vector extraction using ffmpeg APIs

Feb 2 - Feb 8

Downloaded and compiled ffprobe
Started coding using ffmpeg libraries to extract I-frames and motion vectors. I have almost got the I-frame portion worked out, and I will look into the motion vectors next week.

Jan 19 - Feb 1

Clustered SURF pts with both k-means and x-means
Looking into how to extract I-frames and motion information from video sequences
Downloaded and compiled source files for ffmpeg
- I think I can write something using this which will parse the mpegs
- I am having problems with permissions when I try to install - working with Ahmed and Jason
Found a Matlab m-file for extracting motion vectors - I am not sure this will be all that useful: Matlab m files
Also found a reworking of mplayer: modified mplayer (modified for flow)
- Janez Pers: The modifications are relatively minor, but ugly. The code that draws motion vectors is changed to dump the arrow directions and length into the text file. I cannot offer any support for compiling Mplayer though. The binary-only (windows) version is available here, it has added example video (it is better to start with this): windows binary. If you like the code and will use it in your scientific work, you can check my paper, which uses the same code for the second batch of experiments: Pers's paper.
- Short instructions:
  - if you have MPEG4 already (I used the mpeg4 encoding as a fast way to get vectors as well), then skip the first step: mencoder original.avi -ovc lavc -lavcopts vcodec=mpeg4 -o mpeg4encoded.avi
  - now extract the motion vectors, without displaying the video (you can display the video as well, if you like, it was just more convenient for me) mplayer mpeg4encoded.avi -benchmark -lavdopts vismv=1
  - Now, the file opticalflow.dat will appear. Do not forget option vismv=1, the extraction is part of the visualisation.
  - The file opticalflow.dat has the following format: framenum,x,y,vx,vy (vx vy being the vectors, x y being the position of the block).
- Be aware that the data for the I frames will be missing (no flow there). And, in my experience, lower bitrates give better flow than high ones - with high ones the encoder does not need to bother with the motion vectors, since it has enough bandwidth already...
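A small sketch of reading the opticalflow.dat format described in the instructions above (framenum,x,y,vx,vy, one vector per line). Whether the values are written as integers or decimals is not stated, so positions and vectors are parsed as floats here; the function names are illustrative.

```python
from collections import defaultdict


def parse_flow_lines(lines):
    """Group motion vectors by frame number.

    Each non-empty line is expected to read framenum,x,y,vx,vy.
    Returns a dict: frame number -> list of (x, y, vx, vy) tuples.
    """
    frames = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        framenum, x, y, vx, vy = line.split(",")
        frames[int(framenum)].append((float(x), float(y), float(vx), float(vy)))
    return dict(frames)


def frames_without_flow(frames):
    """Frame numbers between the first and last seen that have no vectors.

    Given the note above, these are likely to be the I frames.
    """
    if not frames:
        return []
    first, last = min(frames), max(frames)
    return [n for n in range(first, last + 1) if n not in frames]
```

To read the file itself, pass an open file object, for example `parse_flow_lines(open("opticalflow.dat"))`.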

Jan 19-25

TrecVid 2008 final transformation document with examples: Transformation Document
TrecVid 2008 explanation of how transformations are generated: Transformation Explanation
2010 TrecVid Requirements
Downloaded the TrecVid 2007 and 2009 databases with test cases and testcases for 2010
Downloaded and investigating x-means experimental software (Licensed to me for research purposes only)
Experimenting with SURF interest points and x-means clustering

Jan 12-18

Finished survey of Video Copy Detection methods: Survey
Prepared presentation on Optical Flow: My Presentation

Jan 11

Survey of State of the Art Techniques

Fall 2010 (RA)

Courses:
- CMPT-820: Multimedia Systems
worked on:
- Video Copy Detection

Summer 2010 (TA)

Courses:
- None
worked on:
- Energy-Efficient Gaming on Mobile Devices using Dead Reckoning-based Power Management
submitted
- NetGames 2010: Energy-Efficient Gaming on Mobile Devices using Dead Reckoning-based Power Management (accepted)

Spring 2010 (RA)

Courses:
- CMPT-822: Special topics in Database Systems
- CMPT 884: Computational Vision
worked on:
- Energy-Efficient Gaming on Mobile Devices
submitted
- NOSSDAV 2010: Energy-Efficient Gaming on Mobile Devices (not accepted)

Fall 2009 (TA)

Courses:
- CMPT-705: Algorithm
- CMPT-771: Internet Architecture and Protocols

To Do List

Full write-up of the proposed algorithm. Start with a centralized approach and then show how it can be distributed.
- Claim: This algorithm is novel, better, more efficient, and can be easily distributed
- State how it can be implemented distributively. Naively - 1 node/video clip - is there a better way?
- State evaluation criteria. Prioritize which transformations are the most important and how our algorithm is expected to deal with each.

Consider implementing a non-evenly distributed histogram for magnitudes. Consider the behaviour of the tail of the magnitude data. Also look into the search window size in ffmpeg and set the maximum magnitude to this value.

Watch for the number of motion vectors received from 8x8, 16x8 and 8x16 blocks and decide if it is worth considering these.
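A possible shape for the non-evenly distributed histogram is sketched below. The bin edges are hypothetical: narrow bins where most of the magnitudes fall and wider bins towards the tail, with the last edge standing in for the ffmpeg search window size, which still has to be looked up.

```python
import bisect


def nonuniform_histogram(magnitudes, edges):
    """Count magnitudes into bins with unevenly spaced upper edges.

    edges must be sorted in ascending order. Bin i holds values m with
    edges[i-1] < m <= edges[i]; bin 0 holds m <= edges[0]. One extra bin
    at the end catches anything above the last edge (the tail).
    """
    counts = [0] * (len(edges) + 1)
    for m in magnitudes:
        counts[bisect.bisect_left(edges, m)] += 1
    return counts


# Hypothetical edges: fine resolution near zero, doubling widths after that.
EXAMPLE_EDGES = [0, 1, 2, 4, 8, 16]
```

If the last edge is set to the maximum magnitude the encoder can produce, the final tail bin should stay empty, which gives a quick sanity check on the extracted vectors.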

Want to get something together for conference in mid April

Difference between revisions of "Private:progress-harvey"