197,212 research outputs found
Knowledge will Propel Machine Understanding of Content: Extrapolating from Current Examples
Machine Learning has been a big success story during the AI resurgence. One
particular stand out success relates to learning from a massive amount of data.
In spite of early assertions of the unreasonable effectiveness of data, there
is increasing recognition for utilizing knowledge whenever it is available or
can be created purposefully. In this paper, we discuss the indispensable role
of knowledge for deeper understanding of content where (i) large amounts of
training data are unavailable, (ii) the objects to be recognized are complex,
(e.g., implicit entities and highly subjective content), and (iii) applications
need to use complementary or related data in multiple modalities/media. What
brings us to the cusp of rapid progress is our ability to (a) create relevant
and reliable knowledge and (b) carefully exploit knowledge to enhance ML/NLP
techniques. Using diverse examples, we seek to foretell unprecedented progress
in our ability for deeper understanding and exploitation of multimodal data and
continued incorporation of knowledge in learning techniques.Comment: Pre-print of the paper accepted at 2017 IEEE/WIC/ACM International
Conference on Web Intelligence (WI). arXiv admin note: substantial text
overlap with arXiv:1610.0770
Multimodal Grounding for Language Processing
This survey discusses how recent developments in multimodal processing
facilitate conceptual grounding of language. We categorize the information flow
in multimodal processing with respect to cognitive models of human information
processing and analyze different methods for combining multimodal
representations. Based on this methodological inventory, we discuss the benefit
of multimodal grounding for a variety of language processing tasks and the
challenges that arise. We particularly focus on multimodal grounding of verbs
which play a crucial role for the compositional power of language.Comment: The paper has been published in the Proceedings of the 27 Conference
of Computational Linguistics. Please refer to this version for citations:
https://www.aclweb.org/anthology/papers/C/C18/C18-1197
Sharing Human-Generated Observations by Integrating HMI and the Semantic Sensor Web
Current âInternet of Thingsâ concepts point to a future where connected objects gather meaningful information about their environment and share it with other objects and people. In particular, objects embedding Human Machine Interaction (HMI), such as mobile devices and, increasingly, connected vehicles, home appliances, urban interactive infrastructures, etc., may not only be conceived as sources of sensor information, but, through interaction with their users, they can also produce highly valuable context-aware human-generated observations. We believe that the great promise offered by combining and sharing all of the different sources of information available can be realized through the integration of HMI and Semantic Sensor Web technologies. This paper presents a technological framework that harmonizes two of the most influential HMI and Sensor Web initiatives: the W3Câs Multimodal Architecture and Interfaces (MMI) and the Open Geospatial Consortium (OGC) Sensor Web Enablement (SWE) with its semantic extension, respectively. Although the proposed framework is general enough to be applied in a variety of connected objects integrating HMI, a particular development is presented for a connected car scenario where driversâ observations about the traffic or their environment are shared across the Semantic Sensor Web. For implementation and evaluation purposes an on-board OSGi (Open Services Gateway Initiative) architecture was built, integrating several available HMI, Sensor Web and Semantic Web technologies. A technical performance test and a conceptual validation of the scenario with potential users are reported, with results suggesting the approach is soun
Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues
- âŠ