Private:copyDetection: Notes
Latest revision as of 12:30, 29 December 2010
Video Copy Detection
Early Idea (discussed with Dr. Wael Abd-Almageed)
- Collect a large set of videos (possibly from TREC)
- Extract local features (e.g., SIFT) from I-frames
- Cluster these features into K clusters using, for example, K-means (need to try different values of K)
- To create signatures (see the sketch after this list):
  - Divide the video into groups of frames (e.g., GOPs).
  - Extract local features from the I-frames.
  - Map these features to the K clusters (a probability value for each cluster).
  - Normalize the probabilities so that they sum to 1.
  - Use these K probabilities as the signature.
  - In addition, extract motion vectors from the non-I-frames in the GOP.
  - Quantize these motion vectors into a fixed number of bins, say B.
  - Build a histogram over these bins.
  - Normalize it to obtain probabilities (a vector of size B).
  - Use the combined signature: local features (the K-vector) plus motion information (the B-vector).
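A minimal sketch of the signature-creation step above, assuming the I-frame images and per-GOP motion vectors have already been decoded (e.g., with OpenCV/ffmpeg; bitstream-level extraction is outside this sketch). The names build_vocabulary and gop_signature, the soft-assignment weighting, and the direction-only motion bins are illustrative choices, not a fixed design:

```python
# Hypothetical sketch: K-means visual vocabulary over SIFT descriptors plus a
# B-bin motion-vector histogram per GOP. Frames are grayscale uint8 arrays;
# motion vectors are assumed to be given as (dx, dy) pairs per GOP.
import numpy as np
import cv2
from sklearn.cluster import KMeans

def build_vocabulary(sample_i_frames, K=500):
    """Fit a K-means vocabulary on SIFT descriptors pooled from sample I-frames."""
    sift = cv2.SIFT_create()
    descriptors = []
    for frame in sample_i_frames:
        _, desc = sift.detectAndCompute(frame, None)
        if desc is not None:
            descriptors.append(desc)
    all_desc = np.vstack(descriptors).astype(np.float32)
    return KMeans(n_clusters=K, n_init=10, random_state=0).fit(all_desc)

def gop_signature(gop_i_frames, gop_motion_vectors, kmeans, B=16):
    """Signature = normalized K-vector (visual words) + normalized B-vector (motion)."""
    sift = cv2.SIFT_create()
    K = kmeans.n_clusters

    # K-vector: soft-assign each SIFT descriptor to the K clusters.
    k_vec = np.zeros(K)
    for frame in gop_i_frames:
        _, desc = sift.detectAndCompute(frame, None)
        if desc is None:
            continue
        dists = kmeans.transform(desc.astype(np.float32))   # (num_desc, K) distances
        weights = np.exp(-dists / (dists.mean() + 1e-9))     # soft "probability" per cluster
        weights /= weights.sum(axis=1, keepdims=True)
        k_vec += weights.sum(axis=0)
    k_vec /= max(k_vec.sum(), 1e-9)                          # normalize to sum to 1

    # B-vector: histogram of motion-vector directions from the non-I-frames.
    mv = np.asarray(gop_motion_vectors, dtype=float)         # shape (num_vectors, 2)
    if len(mv) > 0:
        angles = np.arctan2(mv[:, 1], mv[:, 0])
        b_vec, _ = np.histogram(angles, bins=B, range=(-np.pi, np.pi))
        b_vec = b_vec / max(b_vec.sum(), 1e-9)               # normalize to sum to 1
    else:
        b_vec = np.zeros(B)

    return np.concatenate([k_vec, b_vec])                    # combined (K + B) signature
```

K and B here are the parameters from the list above; the returned vector is the combined signature for one GOP.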
- For comparison (a sketch of possible distance measures follows below):
  - Create signatures for each GOP in the target video.
  - Compare signatures by comparing their vectors (some formal methods exist for this; check with Hamed).
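Since both parts of the signature are normalized histograms, standard histogram distances are one natural option for the formal comparison mentioned above. The chi-square and histogram-intersection measures below are placeholders until a method is settled with Hamed; best_match is an illustrative helper, not part of the original notes:

```python
# Hypothetical comparison sketch for the (K + B) probability signatures.
import numpy as np

def chi_square_distance(sig_a, sig_b, eps=1e-9):
    """Smaller = more similar; suited to comparing probability histograms."""
    return 0.5 * np.sum((sig_a - sig_b) ** 2 / (sig_a + sig_b + eps))

def histogram_intersection(sig_a, sig_b):
    """Larger = more similar; equals the total shared mass of the two histograms."""
    return np.sum(np.minimum(sig_a, sig_b))

def best_match(query_sigs, target_sigs):
    """For each query GOP signature, find the closest GOP signature in the target video."""
    return [min(range(len(target_sigs)),
                key=lambda j: chi_square_distance(q, target_sigs[j]))
            for q in query_sigs]
```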
- Notes:
  - Signature creation can also be done on a moving window, i.e., shifting by one frame at a time (though this is computationally expensive).
  - Later, we can add another level of abstraction to improve performance: take the K-vector (local features) and build a topic model on top of it using, for example, LDA. That is, each K-vector would be used as a word, and the topic model would identify collections of words that commonly occur together (topics). A sketch follows below.
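A sketch of the topic-model abstraction from the last note, using scikit-learn's LatentDirichletAllocation as one possible LDA implementation. It assumes the per-GOP visual-word counts (the hard assignments behind the K-vector) serve as the "documents", which is one way to read "each k-vector will be used as a word"; the function name and n_topics value are illustrative:

```python
# Hypothetical sketch: describe each GOP by an LDA topic mixture instead of
# (or alongside) its raw K-vector. gop_word_counts is a (num_gops x K) count
# matrix of visual-word assignments, assumed to come from signature creation.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

def topic_signatures(gop_word_counts, n_topics=50):
    """Reduce per-GOP visual-word counts to per-GOP topic mixtures."""
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
    topic_mix = lda.fit_transform(gop_word_counts)            # (num_gops, n_topics)
    return topic_mix / topic_mix.sum(axis=1, keepdims=True)   # topic proportions per GOP
```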
3D Video Copy Detection
- Ideas, previous work?