2,218 research outputs found

    VisualNet: Commonsense knowledgebase for video and image indexing and retrieval application

    Get PDF
    The rapidly increasing amount of video collections, available on the web or via broadcasting, motivated research towards building intelligent tools for searching, rating, indexing and retrieval purposes. Establishing a semantic representation of visual data, mainly in textual form, is one of the important tasks. The time needed for building and maintaining Ontologies and knowledge, especially for wide domain, and the efforts for integrating several approaches emphasize the need for unified generic commonsense knowledgebase for visual applications. In this paper, we propose a novel commonsense knowledgebase that forms the link between the visual world and its semantic textual representation. We refer to it as "VisualNet". VisualNet is obtained by our fully automated engine that constructs a new unified structure concluding the knowledge from two commonsense knowledgebases, namely WordNet and ConceptNet. This knowledge is extracted by performing analysis operations on WordNet and ConceptNet contents, and then only useful knowledge in visual domain applications is considered. Moreover, this automatic engine enables this knowledgebase to be developed, updated and maintained automatically, synchronized with any future enhancement on WordNet or ConceptNet. Statistical properties of the proposed knowledgebase, in addition to an evaluation of a sample application results, show coherency and effectiveness of the proposed knowledgebase and its automatic engine

    Video databases annotation enhancing using commonsense knowledgebases for indexing and retrieval

    Get PDF
    The rapidly increasing amount of video collections, especially on the web, motivated the need for intelligent automated annotation tools for searching, rating, indexing and retrieval purposes. These videos collections contain all types of manually annotated videos. As this annotation is usually incomplete and uncertain and contains misspelling words, search using some keywords almost do retrieve only a portion of videos which actually contains the desired meaning. Hence, this annotation needs filtering, expanding and validating for better indexing and retrieval. In this paper, we present a novel framework for video annotation enhancement, based on merging two widely known commonsense knowledgebases, namely WordNet and ConceptNet. In addition to that, a comparison between these knowledgebases in video annotation domain is presented. Experiments were performed on random wide-domain video clips, from the \emph{vimeo.com} website. Results show that searching for a video over enhanced tags, based on our proposed framework, outperforms searching using the original tags. In addition to that, the annotation enhanced by our framework outperforms both those enhanced by WordNet and ConceptNet individually, in terms of tags enrichment ability, concept diversity and most importantly retrieval performance

    Semi-automated Ontology Generation for Biocuration and Semantic Search

    Get PDF
    Background: In the life sciences, the amount of literature and experimental data grows at a tremendous rate. In order to effectively access and integrate these data, biomedical ontologies – controlled, hierarchical vocabularies – are being developed. Creating and maintaining such ontologies is a difficult, labour-intensive, manual process. Many computational methods which can support ontology construction have been proposed in the past. However, good, validated systems are largely missing. Motivation: The biocuration community plays a central role in the development of ontologies. Any method that can support their efforts has the potential to have a huge impact in the life sciences. Recently, a number of semantic search engines were created that make use of biomedical ontologies for document retrieval. To transfer the technology to other knowledge domains, suitable ontologies need to be created. One area where ontologies may prove particularly useful is the search for alternative methods to animal testing, an area where comprehensive search is of special interest to determine the availability or unavailability of alternative methods. Results: The Dresden Ontology Generator for Directed Acyclic Graphs (DOG4DAG) developed in this thesis is a system which supports the creation and extension of ontologies by semi-automatically generating terms, definitions, and parent-child relations from text in PubMed, the web, and PDF repositories. The system is seamlessly integrated into OBO-Edit and ProtĂ©gĂ©, two widely used ontology editors in the life sciences. DOG4DAG generates terms by identifying statistically significant noun-phrases in text. For definitions and parent-child relations it employs pattern-based web searches. Each generation step has been systematically evaluated using manually validated benchmarks. The term generation leads to high quality terms also found in manually created ontologies. Definitions can be retrieved for up to 78% of terms, child ancestor relations for up to 54%. No other validated system exists that achieves comparable results. To improve the search for information on alternative methods to animal testing an ontology has been developed that contains 17,151 terms of which 10% were newly created and 90% were re-used from existing resources. This ontology is the core of Go3R, the first semantic search engine in this field. When a user performs a search query with Go3R, the search engine expands this request using the structure and terminology of the ontology. The machine classification employed in Go3R is capable of distinguishing documents related to alternative methods from those which are not with an F-measure of 90% on a manual benchmark. Approximately 200,000 of the 19 million documents listed in PubMed were identified as relevant, either because a specific term was contained or due to the automatic classification. The Go3R search engine is available on-line under www.Go3R.org

    Text Mining the History of Medicine

    Get PDF
    Historical text archives constitute a rich and diverse source of information, which is becoming increasingly readily accessible, due to large-scale digitisation efforts. However, it can be difficult for researchers to explore and search such large volumes of data in an efficient manner. Text mining (TM) methods can help, through their ability to recognise various types of semantic information automatically, e.g., instances of concepts (places, medical conditions, drugs, etc.), synonyms/variant forms of concepts, and relationships holding between concepts (which drugs are used to treat which medical conditions, etc.). TM analysis allows search systems to incorporate functionality such as automatic suggestions of synonyms of user-entered query terms, exploration of different concepts mentioned within search results or isolation of documents in which concepts are related in specific ways. However, applying TM methods to historical text can be challenging, according to differences and evolutions in vocabulary, terminology, language structure and style, compared to more modern text. In this article, we present our efforts to overcome the various challenges faced in the semantic analysis of published historical medical text dating back to the mid 19th century. Firstly, we used evidence from diverse historical medical documents from different periods to develop new resources that provide accounts of the multiple, evolving ways in which concepts, their variants and relationships amongst them may be expressed. These resources were employed to support the development of a modular processing pipeline of TM tools for the robust detection of semantic information in historical medical documents with varying characteristics. We applied the pipeline to two large-scale medical document archives covering wide temporal ranges as the basis for the development of a publicly accessible semantically-oriented search system. The novel resources are available for research purposes, while the processing pipeline and its modules may be used and configured within the Argo TM platform

    Automated illustration of multimedia stories

    Get PDF
    Submitted in part fulfillment of the requirements for the degree of Master in Computer ScienceWe all had the problem of forgetting about what we just read a few sentences before. This comes from the problem of attention and is more common with children and the elderly. People feel either bored or distracted by something more interesting. The challenge is how can multimedia systems assist users in reading and remembering stories? One solution is to use pictures to illustrate stories as a mean to captivate ones interest as it either tells a story or makes the viewer imagine one. This thesis researches the problem of automated story illustration as a method to increase the readers’ interest and attention. We formulate the hypothesis that an automated multimedia system can help users in reading a story by stimulating their reading memory with adequate visual illustrations. We propose a framework that tells a story and attempts to capture the readers’ attention by providing illustrations that spark the readers’ imagination. The framework automatically creates a multimedia presentation of the news story by (1) rendering news text in a sentence by-sentence fashion, (2) providing mechanisms to select the best illustration for each sentence and (3) select the set of illustrations that guarantees the best sequence. These mechanisms are rooted in image and text retrieval techniques. To further improve users’ attention, users may also activate a text-to-speech functionality according to their preference or reading difficulties. First experiments show how Flickr images can illustrate BBC news articles and provide a better experience to news readers. On top of the illustration methods, a user feedback feature was implemented to perfect the illustrations selection. With this feature users can aid the framework in selecting more accurate results. Finally, empirical evaluations were performed in order to test the user interface,image/sentence association algorithms and users’ feedback functionalities. The respective results are discussed
    • 

    corecore