Difference between revisions of "Private:copyDetection: Notes"

Latest revision as of 13:30, 29 December 2010

Early Idea (discussed with Dr. Wael Abd-Almageed)

Cluster these features into K clusters using, for example, the K-means method (need to try different values of K)

For comparing:
- Create signatures for each GOP in the target video.
- Compare signatures by comparing their vectors (some formal methods exist for this, check with Hamed).
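Since each signature is a vector of probabilities, comparing two signatures comes down to measuring how far apart two distributions are. The notes leave the exact method open, so the following is only a sketch under assumptions: the L1 distance and the Jensen-Shannon divergence are two common choices, not necessarily the ones intended here, and the function names are hypothetical.

```python
import math

def l1_distance(p, q):
    """Sum of absolute differences; 0 for identical vectors, 2 for disjoint distributions."""
    return sum(abs(a - b) for a, b in zip(p, q))

def jensen_shannon(p, q):
    """Jensen-Shannon divergence with log base 2, so the result lies in [0, 1]."""
    def kl(a, b):
        # Terms with a zero probability in `a` contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def best_match(query, candidates, distance=jensen_shannon):
    """Return (index, score) of the candidate signature closest to `query`."""
    scores = [distance(query, c) for c in candidates]
    i = min(range(len(scores)), key=scores.__getitem__)
    return i, scores[i]
```

A query GOP signature would be passed to `best_match` together with the list of signatures from the target video; a small score suggests a possible copy, with the threshold left to experiment.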

Notes:
- Signatures can be computed over a moving window, i.e., one that shifts by one frame at a time (computationally expensive, though).
- Later, we can create another level of abstraction to improve performance: use the K-vector (local features) and build a topic model on top of it using, for example, LDA (latent Dirichlet allocation). That is, each K-vector will be used as a word. The topic model will identify collections of words that commonly occur together (each such collection is called a topic).
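One way to reduce the cost of the moving window is to update the histogram incrementally rather than rebuilding it at every shift. The sketch below assumes that per-frame cluster counts are already available (one length-K list per frame that has local features); the function name and the input layout are hypothetical, not part of the notes.

```python
from collections import deque

def sliding_signatures(frame_counts, window):
    """Yield one normalized K-bin signature per window position.

    frame_counts: iterable of length-K lists; entry k is the number of
    local features in that frame assigned to cluster k.
    window: number of frames covered by each signature.
    """
    buf = deque()   # counts of the frames currently inside the window
    total = None    # running sum of those counts
    for counts in frame_counts:
        if total is None:
            total = [0] * len(counts)
        buf.append(counts)
        for k, c in enumerate(counts):
            total[k] += c
        if len(buf) > window:
            # Subtract the frame that just left the window.
            old = buf.popleft()
            for k, c in enumerate(old):
                total[k] -= c
        if len(buf) == window:
            s = sum(total)
            yield [t / s for t in total] if s else [0.0] * len(total)
```

Each shift then costs O(K) instead of recounting every feature in the window, although feature extraction itself remains the dominant cost.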

@@ Line 7: / Line 7: @@
 * Extract local features (e.g., SIFT) from I-frames
-* Cluster these features into K clusters using for example K-means method (need to try for different values of K)
+* Cluster these features into K clusters using for example K-means method (need to try different values of K)
 * To create signatures
 ** Divide a video into groups (possibly GOPs)
 **Extract local features from I-frames
@@ Line 16: / Line 15: @@
 **Normalize the probabilities so that they sum to 1
 **Use these (k) probabilities as a signature.
---
 **In addition, we extract motion vectors from non-I-frames in the GOP.
 **Quantize these motion vectors into a fixed number of bins, say B
 **Build a histogram on these bins
 **Normalize the histogram to obtain probabilities (a vector of size B).
 **Now, use a combined signature from local features (K vector) and motion info (B vector).
 *For comparing:
@@ Line 32: / Line 28: @@
 *Notes:
 **Signatures can be computed over a moving window, i.e., one that shifts by one frame at a time (computationally expensive, though).
 ** Later, we can create another level of abstraction to improve performance: use the K-vector (local features) and build a topic model on top of it using, for example, LDA (latent Dirichlet allocation). That is, each K-vector will be used as a word. The topic model will identify collections of words that commonly occur together (each such collection is called a topic).
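The signature-creation steps in the diff above can be sketched roughly as follows. This is a minimal illustration under several assumptions that the notes do not fix: the K cluster centroids are taken as already computed, the K probabilities are obtained by counting nearest-centroid assignments, and motion vectors are quantized by direction into B equal angular bins. All function names are hypothetical.

```python
import math

def normalize(counts):
    """Scale counts so they sum to 1 (all zeros if there is nothing to count)."""
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)

def nearest(centroids, x):
    """Index of the centroid closest to descriptor x (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(centroids[k], x)))

def feature_signature(descriptors, centroids):
    """K-vector: fraction of local features assigned to each cluster."""
    counts = [0] * len(centroids)
    for d in descriptors:
        counts[nearest(centroids, d)] += 1
    return normalize(counts)

def motion_signature(motion_vectors, bins):
    """B-vector: normalized histogram of motion directions (assumed quantization)."""
    counts = [0] * bins
    for dx, dy in motion_vectors:
        angle = math.atan2(dy, dx) % (2 * math.pi)   # direction in [0, 2*pi)
        counts[min(int(angle / (2 * math.pi) * bins), bins - 1)] += 1
    return normalize(counts)

def gop_signature(descriptors, centroids, motion_vectors, bins):
    """Combined signature: the K-vector followed by the B-vector."""
    return (feature_signature(descriptors, centroids)
            + motion_signature(motion_vectors, bins))
```

Note that each part sums to 1 on its own, so the concatenated vector sums to 2; if the comparison method expects a single distribution, the two parts may need to be compared separately or the combined vector renormalized.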