Since the first full sequencing of a bacteriophage genome, done in 1977 completely by hand, the DNA sequences of hundreds of different species have been decoded. Compared to the tiny phage, the analysis of the human genome, completed three decades later, required completely new computational methodologies. The growing amount of genetic data has driven the emergence of bioinformatics, a field of computer science that uses string matching and search algorithms for the automatic analysis of DNA sequences. A similar explosive growth has been witnessed in the video data available in the public domain, driven by the fast evolution of Internet technologies, which has created unprecedented challenges in the analysis, organization, management, and control of such content. The problems encountered in video analysis, such as identifying a video in a large database (e.g. detecting pirated content on YouTube), putting together video fragments, and finding similarities and common ancestry between different versions of a video, have analogous counterparts in genetic research and in the analysis of DNA and protein sequences.
Analogy between insertion/deletion and substitution mutations in biological DNA sequences (left) and video sequences (right).
We exploit the analogy between genetic sequences and video: our approach to video analysis is motivated by genomic research. Representing video information as video DNA sequences and applying bioinformatic algorithms makes it possible to search, match, and compare videos at Internet scale.
One can conceptually think of a video as a sequence of visual information units, which can be represented over some potentially very large alphabet, resulting in a sequence of "letters" (or visual nucleotides) which we call video DNA by analogy to genetic sequences. Video DNA sequencing is performed by first dividing the video into time intervals, in which prominent and stable local visual features are detected. The statistics of these visual features are used as a descriptor of the visual content of each video interval.
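The sequencing step described above can be sketched as follows. This is only an illustrative sketch, not the project's implementation: it assumes that local features have already been detected and quantized into integer "visual word" indices per frame, and it uses a normalized histogram of those indices as the per-interval statistic. The function name `video_dna` and the parameters `frames_per_interval` and `vocab_size` are hypothetical.

```python
from collections import Counter


def video_dna(frame_words, frames_per_interval, vocab_size):
    """Build a sequence of visual nucleotides from per-frame visual words.

    frame_words: list of frames, each a list of integer visual-word
        indices in range(vocab_size) (assumed to come from some feature
        detector and quantizer that is not shown here).
    Returns one normalized histogram (list of floats) per time interval.
    """
    sequence = []
    for start in range(0, len(frame_words), frames_per_interval):
        interval = frame_words[start:start + frames_per_interval]
        # Pool all feature occurrences that fall inside this interval.
        counts = Counter(word for frame in interval for word in frame)
        total = sum(counts.values())
        # Normalize so intervals with different feature counts are comparable.
        histogram = [
            (counts.get(k, 0) / total) if total else 0.0
            for k in range(vocab_size)
        ]
        sequence.append(histogram)
    return sequence


# Four frames, two frames per interval, a toy vocabulary of three words.
frames = [[0, 1], [1, 1], [2], [2, 2, 0]]
dna = video_dna(frames, frames_per_interval=2, vocab_size=3)
```

Each element of `dna` plays the role of one "letter" of the sequence; a real system would use a far larger vocabulary and a robust feature detector.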
Fundamental problems in video analysis are reduced to finding similarity and correspondence between video DNA sequences. Content-based retrieval and copy detection are similar to genetic database search, and the BLAST algorithm is used to find regions of local similarity between video DNA sequences in the same way it is used in genomic research. Local alignment using dynamic programming algorithms such as Smith-Waterman makes it possible to establish timeline correspondence between similar videos. This correspondence is used to transfer metadata associated with one video's timeline (e.g. subtitle text or keywords describing contextual information) to another, similar video. Given a sufficiently large annotated database of video DNA sequences, a new video can be understood by using the annotations of similar videos from the database.
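As a rough illustration of the alignment and metadata transfer described above, the sketch below implements a textbook Smith-Waterman local alignment with a linear gap penalty and then copies annotations along the aligned intervals. It is a minimal sketch under stated assumptions: the `score` function, the `gap` value, and the helper `transfer_annotations` are hypothetical, and the toy example uses single characters as stand-ins for visual nucleotides. For real video DNA, `score` would compare two interval descriptors and return a positive value for similar intervals and a negative value otherwise.

```python
def smith_waterman(a, b, score, gap=-1.0):
    """Local alignment of sequences a and b.

    score(x, y) returns the similarity of two sequence elements;
    gap is the (negative) penalty for an insertion or deletion.
    Returns (best_score, pairs), where pairs lists aligned (i, j) indices.
    """
    n, m = len(a), len(b)
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0.0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            H[i][j] = max(
                0.0,                                        # start a new local alignment
                H[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # match or substitution
                H[i - 1][j] + gap,                          # element of a left unmatched
                H[i][j - 1] + gap,                          # element of b left unmatched
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the local alignment score drops to zero.
    pairs = []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        if H[i][j] == H[i - 1][j - 1] + score(a[i - 1], b[j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best, pairs


def transfer_annotations(pairs, annotations):
    """Copy annotations keyed by interval index in video A onto video B."""
    return {j: annotations[i] for i, j in pairs if i in annotations}


# Toy example: characters stand in for visual nucleotides.
match_score = lambda x, y: 2.0 if x == y else -1.0
best, pairs = smith_waterman("XXABCDYY", "ZABCDW", match_score)
transferred = transfer_annotations(pairs, {3: "subtitle for interval 3"})
```

Here the shared fragment "ABCD" is aligned even though both sequences have unrelated material before and after it, which is the property that makes local alignment suitable for edited or excerpted videos.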
A. M. Bronstein, M. M. Bronstein, R. Kimmel, "The Video Genome", arXiv:1003.5320v1, 27 March 2010.
Video Genome project stops movie pirates with DNA science, Switched, 5 April 2010.
Sequencing the Video Genome, Wired Science, 5 April 2010.
The Video Genome Project