Here are some results from running the new code. SigThresh is hard-coded at 0.25, which seems to work well across various types of video. The frame-distance plots are given in discussion_18apr08.pdf.
| Video | #Frames | Summary Image |
| --- | --- | --- |
| city | 299 | city |
| ice | 239 | ice |
| foreman | 299 | foreman |
| soccer | 299 | soccer |
| doc_reality (from CBC) | 2000 | doc_reality |
| car surveillance video 1 | 490 | car surveillance 1 |
| car surveillance video 2 (night) | 540 | car surveillance 2 |
| car surveillance video 3 | 420 | car surveillance 3 |
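A minimal sketch of the thresholded histogram-distance idea described above. The function names (`hsv_histogram`, `frame_distance`, `detect_shots`), the per-channel binning, the L1 distance, and the running-mean update rate `alpha` are all illustrative assumptions, not the actual implementation; the real code may use a different histogram layout and distance measure. The last function sketches the average-histogram subtraction proposed below for surveillance video.

```python
import numpy as np

SIG_THRESH = 0.25  # hard-coded significance threshold used in the experiments

def hsv_histogram(frame_hsv, bins=8):
    """Concatenated per-channel histogram over H, S, V, normalized to sum to 1.
    Assumes channel values are scaled to [0, 1]."""
    hists = []
    for c in range(3):
        h, _ = np.histogram(frame_hsv[..., c], bins=bins, range=(0.0, 1.0))
        hists.append(h)
    hist = np.concatenate(hists).astype(float)
    return hist / hist.sum()

def frame_distance(h1, h2):
    """L1 distance between normalized histograms (assumed metric, in [0, 1])."""
    return 0.5 * np.abs(h1 - h2).sum()

def detect_shots(frames_hsv, thresh=SIG_THRESH):
    """Mark a shot boundary wherever the consecutive-frame
    histogram distance exceeds the threshold."""
    boundaries = []
    prev = hsv_histogram(frames_hsv[0])
    for i, frame in enumerate(frames_hsv[1:], start=1):
        cur = hsv_histogram(frame)
        if frame_distance(prev, cur) > thresh:
            boundaries.append(i)
        prev = cur
    return boundaries

def background_subtracted_distance(hist, running_mean, alpha=0.05):
    """Hypothetical surveillance tweak: maintain a progressively updated mean
    histogram and measure each frame's residual against it, so small
    foreground changes stand out against a static background."""
    new_mean = (1.0 - alpha) * running_mean + alpha * hist
    residual = np.abs(hist - new_mean).sum()
    return residual, new_mean
```

With two synthetic "shots" of constant-valued frames, `detect_shots` flags the single boundary between them; within a shot the distance is zero, so no spurious boundaries appear.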
As can be seen from the extracted summaries, the algorithm succeeds in detecting shots. A good example is the doc_reality sequence provided by CBC, which contains many shots. In addition, since we use all three color channels (HSV), changes in brightness are also detected, e.g. in the first few key frames of doc_reality. For videos that contain only one shot without much camera motion, such as the city and ice sequences, only one key frame is extracted. The algorithm fails to extract meaningful key frames for the surveillance videos, especially the last two, which suggests it needs fine-tuning for that application. Moreover, since the background in surveillance video is usually static, we may want to subtract an average histogram (computed progressively) from all video frames, so that small changes can be detected more easily.

Hierarchical Summarization