== Video Copy Detection ==
  
'''Early Idea (discussed with Dr. Wael Abd-Almageed)'''

* Collect a large set of videos (possibly from TREC).

* Extract local features (e.g., SIFT) from the I-frames.

* Cluster these features into K clusters, for example with the K-means method (different values of K need to be tried). A sketch of this vocabulary-building step is given below.
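
A minimal Python sketch of the feature-extraction and clustering steps above, assuming the I-frames have already been dumped to image files (e.g., with ffmpeg) and using OpenCV's SIFT together with scikit-learn's MiniBatchKMeans; the path <code>iframes/train/*.png</code>, the value of K, and the helper names are illustrative only, not a fixed design.

<syntaxhighlight lang="python">
# Vocabulary building (sketch, not a prescribed implementation).
# Assumes the I-frames of the training videos were already extracted to
# image files, e.g. with ffmpeg; "iframes/train/*.png" is a made-up path.
import glob

import cv2
import numpy as np
from sklearn.cluster import MiniBatchKMeans


def sift_descriptors(image_path):
    """Return the 128-D SIFT descriptors of one I-frame."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    _, desc = sift.detectAndCompute(gray, None)
    return desc if desc is not None else np.empty((0, 128), dtype=np.float32)


def build_vocabulary(iframe_paths, k=256):
    """Cluster all training descriptors into k visual words (try several k)."""
    all_desc = np.vstack([sift_descriptors(p) for p in iframe_paths])
    return MiniBatchKMeans(n_clusters=k, random_state=0).fit(all_desc)


# Hypothetical usage on a dumped training set of I-frames.
vocab = build_vocabulary(glob.glob("iframes/train/*.png"), k=256)
</syntaxhighlight>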

* To create signatures:
** Divide the video into groups of frames (e.g., GOPs).
** Extract local features from the I-frames of each group.
** Map these features to the K clusters (a probability value for each cluster).
** Normalize the probabilities so that they sum to 1.
** Use these K probabilities as the signature.
** In addition, extract motion vectors from the non-I-frames in the GOP.
** Quantize these motion vectors into a fixed number of bins, say B.
** Build a histogram over these bins.
** Normalize it to obtain probabilities (a vector of size B).
** Now, use the combined signature: the local-feature K-vector together with the motion-info B-vector (see the sketch after this list).
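
A sketch of the per-GOP signature, continuing the assumptions above: the K-vector is obtained by assigning each descriptor to its nearest cluster and normalizing the counts, and the motion vectors (assumed here to be (dx, dy) pairs already parsed from the compressed stream) are binned by direction. Binning by direction is only one possible quantization; the notes just require a fixed number of bins B.

<syntaxhighlight lang="python">
# Per-GOP signature (sketch). Assumes `vocab` from the previous sketch,
# the SIFT descriptors of the GOP's I-frame(s), and the GOP's motion
# vectors as an (N, 2) array of (dx, dy) values parsed from the stream.
import numpy as np

B = 16  # number of motion-vector bins ("B" in the notes); value is arbitrary


def feature_signature(descriptors, vocab):
    """K-vector: how the GOP's I-frame descriptors distribute over the K clusters."""
    k = vocab.n_clusters
    if descriptors is None or len(descriptors) == 0:
        return np.full(k, 1.0 / k)  # fallback: uniform distribution
    counts = np.bincount(vocab.predict(descriptors), minlength=k).astype(float)
    return counts / counts.sum()


def motion_signature(motion_vectors, b=B):
    """B-vector: motion vectors quantized by direction (one possible binning)."""
    if len(motion_vectors) == 0:
        return np.full(b, 1.0 / b)
    angles = np.arctan2(motion_vectors[:, 1], motion_vectors[:, 0])
    hist, _ = np.histogram(angles, bins=b, range=(-np.pi, np.pi))
    hist = hist.astype(float)
    return hist / hist.sum()


def gop_signature(descriptors, motion_vectors, vocab):
    """Combined signature: K-vector concatenated with B-vector."""
    return np.concatenate([feature_signature(descriptors, vocab),
                           motion_signature(motion_vectors)])
</syntaxhighlight>

Soft assignment (a probability derived from the distances to all K centres) would match the "probability value for each cluster" wording more literally; the hard-count version above is just the simplest variant.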

* For comparison:
** Create signatures for each GOP in the target video.
** Compare signatures by comparing their vectors; standard distance measures for probability vectors exist for this (check with Hamed). A sketch follows this list.
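
One possible way to compare the signature vectors; chi-square distance and cosine similarity are shown as examples of standard measures for probability vectors, and the actual choice of measure is still open (check with Hamed).

<syntaxhighlight lang="python">
# Comparing signatures (sketch): two standard measures and a brute-force
# search for the closest GOP signature in a target video.
import numpy as np


def chi_square_distance(p, q, eps=1e-10):
    """Smaller means more similar; p and q are signature vectors."""
    return 0.5 * float(np.sum((p - q) ** 2 / (p + q + eps)))


def cosine_similarity(p, q):
    """1.0 means identical direction; useful as a similarity score."""
    return float(np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q)))


def best_match(query_sig, target_sigs):
    """Return (index, distance) of the closest GOP signature in the target video."""
    dists = np.array([chi_square_distance(query_sig, s) for s in target_sigs])
    return int(np.argmin(dists)), float(dists.min())
</syntaxhighlight>

If the K-part and the B-part should not contribute equally, the two halves of the signature can be weighted before comparing.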
 
* Notes:
** Signature creation can be done on a moving window, i.e., shifting the window by one frame at a time (computationally expensive, though).
** Later, we can add another level of abstraction to improve performance: take the K-vector (local features) and build a topic model on top of it, for example with LDA. Each K-vector is used as a word, and the topic model identifies the collections of words that commonly occur together (the topics). A sketch of this idea follows the list.
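
A sketch of the topic-model idea. The notes do not say how a K-vector becomes a discrete word, so this assumes a second-level K-means over the GOP K-vectors for that step, and it uses scikit-learn's LatentDirichletAllocation; <code>N_WORDS</code> and <code>N_TOPICS</code> are arbitrary placeholders.

<syntaxhighlight lang="python">
# Topic-model abstraction (sketch). Each GOP's K-vector is discretized into a
# "word" via a second-level K-means (an assumption); each video is then a bag
# of such words, and LDA gives a per-video topic mixture.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import LatentDirichletAllocation

N_WORDS = 512   # size of the second-level vocabulary (assumed)
N_TOPICS = 32   # number of LDA topics (assumed)


def topic_signatures(per_video_kvectors):
    """per_video_kvectors: one (num_GOPs x K) array per video.
    Returns one topic distribution of length N_TOPICS per video."""
    word_model = KMeans(n_clusters=N_WORDS, random_state=0).fit(
        np.vstack(per_video_kvectors))

    # Each video becomes a bag of words over the quantized GOP vectors.
    docs = np.zeros((len(per_video_kvectors), N_WORDS))
    for i, vecs in enumerate(per_video_kvectors):
        docs[i] = np.bincount(word_model.predict(vecs), minlength=N_WORDS)

    lda = LatentDirichletAllocation(n_components=N_TOPICS, random_state=0)
    return lda.fit_transform(docs)  # rows are per-video topic mixtures
</syntaxhighlight>

The resulting topic mixtures (per video, or per moving window) can then be compared with the same distance measures as the GOP signatures.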
  
* Any previous work?

== 3D Video Copy Detection ==

* Ideas, previous works?