220 research outputs found

    Enhancing the performance of multi-modality ontology semantic image retrieval using object properties filter

    Get PDF
    Semantic technology such as ontology provides a possible approach to narrowing the semantic gap in image retrieval between low-level visual features and high-level human semantics. The semantic gap occurs when there is a disagreement between the information extracted from visual data and its text description. In this paper, we applied an ontology to bridge the semantic gap by developing a prototype multi-modality ontology image retrieval system whose retrieval mechanism is enhanced with an object properties filter. The results demonstrated that, based on precision measurement, our proposed approach delivered better results than the approach without the object properties filter.
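    As a rough illustration of the precision measurement mentioned above, here is a minimal Python sketch (not taken from the paper; the relevance judgments and result lists are hypothetical) that compares precision@k for a run with and without such a filter.

```python
# Minimal sketch of precision@k for comparing two retrieval runs.
# The relevance judgments and result lists below are hypothetical.

def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved images that are relevant."""
    top_k = retrieved[:k]
    hits = sum(1 for image_id in top_k if image_id in relevant)
    return hits / k

# Hypothetical query results with and without the object properties filter.
relevant = {"img_03", "img_07", "img_11", "img_19"}
baseline_run = ["img_01", "img_03", "img_05", "img_07", "img_09"]
filtered_run = ["img_03", "img_07", "img_11", "img_02", "img_19"]

print("precision@5 (no filter):  ", precision_at_k(baseline_run, relevant, 5))
print("precision@5 (with filter):", precision_at_k(filtered_run, relevant, 5))
```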

    A Unified Model for Video Understanding and Knowledge Embedding with Heterogeneous Knowledge Graph Dataset

    Full text link
    Video understanding is an important task on short-video business platforms, with wide applications in video recommendation and classification. Most existing video understanding work focuses only on information that appears within the video content itself, including the video frames, audio, and text. However, introducing common-sense knowledge from an external knowledge graph (KG) dataset is essential for video understanding when the content being referred to is less directly relevant to the video itself. Owing to the lack of video knowledge graph datasets, work that integrates video understanding with KGs is rare. In this paper, we propose a heterogeneous dataset that contains multi-modal video entities and rich common-sense relations. The dataset also provides several novel video inference tasks, such as the Video-Relation-Tag (VRT) and Video-Relation-Video (VRV) tasks. Furthermore, based on this dataset, we propose an end-to-end model that jointly optimizes the video understanding objective with knowledge graph embedding, which not only injects factual knowledge into video understanding more effectively but also generates effective multi-modal entity embeddings for the KG. Comprehensive experiments indicate that combining video understanding embeddings with factual knowledge benefits content-based video retrieval performance. Moreover, it also helps the model generate better knowledge graph embeddings, which outperform traditional KGE-based methods on the VRT and VRV tasks with at least 42.36% and 17.73% improvement in HITS@10, respectively.
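    Since the reported gains are measured in HITS@10, a minimal sketch of that metric may be useful; the following Python snippet is illustrative only, and the rank values are hypothetical rather than taken from the paper.

```python
# Minimal sketch of the HITS@10 metric commonly used to score link-prediction
# style tasks (e.g., Video-Relation-Tag); the ranks below are hypothetical.

def hits_at_k(ranks, k=10):
    """Fraction of test queries whose ground-truth answer is ranked in the top k.

    `ranks` holds the 1-based rank of the correct entity for each query.
    """
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Hypothetical ranks of the correct tag/video for ten test queries.
predicted_ranks = [1, 3, 12, 7, 2, 25, 9, 4, 15, 6]
print(f"HITS@10 = {hits_at_k(predicted_ranks, k=10):.2f}")
```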

    An object properties filter for multi-modality ontology semantic image retrieval

    Get PDF
    Ontology is a semantic technology that provides a possible approach to bridging the semantic gap in image retrieval between low-level visual features and high-level human semantics. The semantic gap occurs when there is a discrepancy between the information extracted from visual data and its text description; in other words, there is a difference between the computational representation in the machine and human natural language. In this paper, an ontology has been utilized to reduce the semantic gap by developing a multi-modality ontology image retrieval system with a retrieval mechanism enhanced by an object properties filter. To achieve this, a multi-modality ontology semantic image framework was proposed, comprising four main components: resource identification, information extraction, knowledge-base construction, and retrieval mechanism. A new approach, namely the object properties filter, is proposed by customizing the semantic image retrieval algorithm and the graphical user interface to help the user engage with the machine, i.e. the computer, in order to enhance retrieval performance. The experimental results showed that, based on precision measurement, the proposed approach delivered better results compared to the approach that did not use the object properties filter.
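    To make the idea of an object properties filter concrete, the following is a minimal sketch using rdflib and a SPARQL query; the namespace, class, and property names (e.g. ex:Image, ex:depicts) are hypothetical and are not taken from the ontology proposed in the paper.

```python
# Minimal sketch of filtering image individuals by an ontology object
# property with rdflib; all names in the ex: namespace are hypothetical.
from rdflib import Graph, Namespace, RDF

EX = Namespace("http://example.org/image-ontology#")

g = Graph()
g.bind("ex", EX)

# Hypothetical annotations: each image is linked to the objects it depicts.
g.add((EX.img_01, RDF.type, EX.Image))
g.add((EX.img_01, EX.depicts, EX.Cat))
g.add((EX.img_02, RDF.type, EX.Image))
g.add((EX.img_02, EX.depicts, EX.Dog))
g.add((EX.img_02, EX.locatedIn, EX.Park))

# Retrieve only images whose object property ex:depicts matches the query
# concept, instead of ranking on text or visual features alone.
query = """
    PREFIX ex: <http://example.org/image-ontology#>
    SELECT ?image WHERE {
        ?image a ex:Image ;
               ex:depicts ex:Dog .
    }
"""
for row in g.query(query):
    print(row.image)
```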

    Visual Question Answering: A Survey

    Get PDF
    Visual Question Answering (VQA) is an emerging field at the intersection of computer vision and natural language processing that aims to enable machines to understand the content of images and answer natural language questions about them. Recently, there has been increasing interest in integrating Semantic Web technologies into VQA systems to enhance their performance and scalability. In this context, knowledge graphs, which represent structured knowledge in the form of entities and their relationships, have shown great potential in providing rich semantic information for VQA. This paper provides a high-level overview of state-of-the-art research on VQA using Semantic Web technologies, including knowledge graph-based VQA, medical VQA with semantic segmentation, and multi-modal fusion with recurrent neural networks. The paper also highlights the challenges and future directions in this area, such as improving the accuracy of knowledge graph-based VQA, addressing the semantic gap between image content and natural language, and designing more effective multi-modal fusion strategies. Overall, this paper emphasizes the importance and potential of Semantic Web technologies in VQA and encourages further research in this exciting area.

    OEKG: The Open Event Knowledge Graph

    Get PDF
    Accessing and understanding contemporary and historical events of global impact, such as the US elections and the Olympic Games, is a major prerequisite for cross-lingual event analytics that investigate event causes, perception, and consequences across country borders. In this paper, we present the Open Event Knowledge Graph (OEKG), a multilingual, event-centric, temporal knowledge graph composed of seven different data sets from multiple application domains, including question answering, entity recommendation, and named entity recognition. These data sets are all integrated through an easy-to-use and robust pipeline and by linking to the event-centric knowledge graph EventKG. We describe their common schema and demonstrate the use of the OEKG through three use cases: type-specific image retrieval, hybrid question answering over knowledge graphs and news articles, and language-specific event recommendation. The OEKG and its query endpoint are publicly available.
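    Since the OEKG exposes a public query endpoint, a generic sketch of querying such an event-centric knowledge graph over SPARQL is shown below; the endpoint URL and the schema terms are placeholders, not the actual OEKG endpoint or vocabulary, so consult the OEKG documentation for both.

```python
# Minimal sketch of querying an event-centric knowledge graph via SPARQL.
# The endpoint URL below is a hypothetical placeholder, and the query uses
# only generic rdfs:label terms rather than the real OEKG schema.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://example.org/oekg/sparql"  # placeholder, not the real endpoint

sparql = SPARQLWrapper(ENDPOINT)
sparql.setReturnFormat(JSON)
sparql.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?event ?label WHERE {
        ?event rdfs:label ?label .
        FILTER(CONTAINS(LCASE(STR(?label)), "olympic"))
    } LIMIT 10
""")

results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["event"]["value"], "-", binding["label"]["value"])
```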

    Knowledge will Propel Machine Understanding of Content: Extrapolating from Current Examples

    Full text link
    Machine Learning has been a big success story during the AI resurgence. One particular standout success relates to learning from massive amounts of data. In spite of early assertions of the unreasonable effectiveness of data, there is increasing recognition of the value of utilizing knowledge whenever it is available or can be created purposefully. In this paper, we discuss the indispensable role of knowledge for deeper understanding of content where (i) large amounts of training data are unavailable, (ii) the objects to be recognized are complex (e.g., implicit entities and highly subjective content), and (iii) applications need to use complementary or related data in multiple modalities/media. What brings us to the cusp of rapid progress is our ability to (a) create relevant and reliable knowledge and (b) carefully exploit that knowledge to enhance ML/NLP techniques. Using diverse examples, we seek to foretell unprecedented progress in our ability to deeply understand and exploit multimodal data and to continue incorporating knowledge into learning techniques.
    Comment: Pre-print of the paper accepted at the 2017 IEEE/WIC/ACM International Conference on Web Intelligence (WI). arXiv admin note: substantial text overlap with arXiv:1610.0770