
Automatic semantic video annotation in wide domain videos based on similarity and commonsense knowledgebases

By Amjad Altadmri and Amr Ahmed


In this paper, we introduce a novel framework for automatic semantic video annotation. Because the framework detects possible events occurring in video clips, it can form the annotation base of a video search engine. To achieve this, the system has to be able to operate on uncontrolled wide-domain videos; thus, all layers have to be based on generic features.

The framework aims to bridge the "semantic gap", the difference between low-level visual features and human perception. It does so by finding videos with similar visual events, analyzing their free-text annotations to find common ground, and then deciding on the best description for the new video using commonsense knowledgebases.

Experiments were performed on wide-domain video clips from the TRECVID 2005 BBC rush standard database. The results show promising integration between the two layers, visual similarity and commonsense knowledge, in finding expressive annotations for the input video. The results were evaluated based on retrieval performance.
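The two-step idea in the abstract (retrieve visually similar annotated clips, then look for what their free-text annotations have in common) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature vectors, the Euclidean distance, the function name `suggest_annotation`, and the parameters `k` and `top_terms` are all assumptions made for illustration, and a simple shared-word count stands in for the commonsense-knowledgebase reasoning that the paper actually uses.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def suggest_annotation(query_features, annotated_clips, k=3, top_terms=2):
    """Suggest annotation words for a new clip (hypothetical helper).

    annotated_clips is a list of (feature_vector, free_text_annotation) pairs.
    """
    # Step 1: retrieve the k clips whose features are closest to the query.
    nearest = sorted(
        annotated_clips,
        key=lambda clip: euclidean(query_features, clip[0]),
    )[:k]

    # Step 2: count how many neighbour annotations contain each word.
    # This word overlap is a stand-in for commonsense reasoning, not the
    # paper's method.
    counts = Counter()
    for _, text in nearest:
        counts.update(set(text.lower().split()))

    # Keep only words shared by at least two neighbours, most frequent first
    # (ties broken alphabetically so the result is deterministic).
    shared = [(word, n) for word, n in counts.items() if n >= 2]
    shared.sort(key=lambda item: (-item[1], item[0]))
    return [word for word, _ in shared[:top_terms]]


# Toy data: made-up 2-D features and annotations, for illustration only.
clips = [
    ([0.10, 0.20], "man walking street"),
    ([0.15, 0.25], "woman walking park"),
    ([0.90, 0.80], "car driving road"),
    ([0.12, 0.22], "people walking street"),
]
result = suggest_annotation([0.11, 0.21], clips)
print(result)  # ['walking', 'street']
```

In this toy run the three nearest clips all mention "walking" and two mention "street", so those words are proposed for the new clip. A real system would replace the word count with a knowledgebase lookup that can relate different words describing the same event.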

Topics: G700 Artificial Intelligence, G710 Speech and Natural Language Processing, G400 Computer Science, G720 Knowledge Representation, G450 Multi-media Computing Science, G740 Computer Vision, G540 Databases
Year: 2009

