
    FVQA: Fact-based Visual Question Answering

    Visual Question Answering (VQA) has attracted a lot of attention in both the Computer Vision and Natural Language Processing communities, not least because it offers insight into the relationships between two important sources of information. Current datasets, and the models built upon them, have focused on questions which are answerable by direct analysis of the question and image alone. The set of such questions that require no external information to answer is interesting, but very limited. It excludes questions which require common sense or basic factual knowledge to answer, for example. Here we introduce FVQA, a VQA dataset which requires, and supports, much deeper reasoning. FVQA only contains questions which require external information to answer. We thus extend a conventional visual question answering dataset, which contains image-question-answer triplets, through additional image-question-answer-supporting fact tuples. The supporting fact is represented as a structural triplet, such as <Cat, CapableOf, ClimbingTrees>. We evaluate several baseline models on the FVQA dataset, and describe a novel model which is capable of reasoning about an image on the basis of supporting facts.
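    As an illustration of how such a record might be organised in code, here is a minimal, hypothetical sketch: an image-question-answer triplet extended with a supporting fact stored as a (subject, relation, object) structural triplet. The field names and example values are assumptions for illustration, not the dataset's actual schema.

```python
# Hypothetical layout of an FVQA-style record: an image-question-answer triplet
# plus the structural supporting-fact triplet needed to answer the question.
# Field names are illustrative, not the dataset's real schema.
from dataclasses import dataclass


@dataclass
class SupportingFact:
    subject: str   # visual concept grounded in the image, e.g. "Cat"
    relation: str  # knowledge-base relation, e.g. "CapableOf"
    obj: str       # related concept, e.g. "ClimbingTrees"


@dataclass
class FVQARecord:
    image_id: str
    question: str
    answer: str
    fact: SupportingFact  # the external knowledge required by the question


record = FVQARecord(
    image_id="img_0001.jpg",
    question="Which animal in this image can climb trees?",
    answer="cat",
    fact=SupportingFact("Cat", "CapableOf", "ClimbingTrees"),
)
print(record.fact)
```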

    Towards Machine-Assisted Meta Studies of Astrophysical Data From the Scientific Literature

    We develop a new model for automatic extraction of reported measurements from the astrophysical literature, utilising modern Natural Language Processing techniques. We begin with a rules-based model for keyword-search-based extraction, and then proceed to develop artificial neural network models for full entity and relation extraction from free text. This process also requires the creation of hand-annotated datasets selected from the available astrophysical literature for training and validation purposes. We use a set of cosmological parameters to examine the model's ability to identify information relating to a specific parameter and to illustrate its capabilities, using the Hubble constant as a primary case study due to the well-documented history of that parameter. Our results correctly highlight the current tension present in measurements of the Hubble constant and recover the 3.5σ discrepancy – demonstrating that the models are useful for meta-studies of astrophysical measurements from a large number of publications. From the other cosmological parameter results we can clearly observe the historical trends in the reported values of these quantities over the past two decades, and see the impacts of landmark publications on our understanding of cosmology. The outputs of these models, when applied to the article abstracts present in the arXiv repository, constitute a database of over 231,000 astrophysical numerical measurements, relating to over 61,000 different symbolic parameter representations – here a measurement refers to the combination of a numerical value and an identifier (i.e. a name or symbol) to give it physical meaning. We present an online interface (Numerical Atlas) that allows users to query and explore this database by parameter names and symbolic representations, and to download the resulting datasets for their own research use.
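    As a toy illustration of the rules-based, keyword-search starting point described above, the sketch below scans free text for a parameter keyword followed by a value-uncertainty-unit pattern. The keyword list and regular expression are illustrative assumptions, not the authors' actual extraction rules or their neural models.

```python
# Minimal keyword-plus-regex extraction of reported Hubble constant values.
# Keywords, regex and the 80-character context window are illustrative choices.
import re

KEYWORDS = ["Hubble constant", "H_0", "H0"]

# Matches patterns like "67.4 +/- 0.5 km/s/Mpc".
MEASUREMENT = re.compile(
    r"(?P<value>\d+(?:\.\d+)?)\s*(?:\+/-|±)\s*(?P<error>\d+(?:\.\d+)?)\s*"
    r"(?P<unit>km\s*/?\s*s(?:\^-1)?\s*/?\s*Mpc(?:\^-1)?)"
)


def extract_measurements(text: str):
    """Return (value, error, unit) tuples that appear near a known keyword."""
    results = []
    for match in MEASUREMENT.finditer(text):
        window = text[max(0, match.start() - 80):match.start()]
        if any(kw.lower() in window.lower() for kw in KEYWORDS):
            results.append((float(match["value"]), float(match["error"]), match["unit"]))
    return results


print(extract_measurements(
    "We measure the Hubble constant to be 67.4 +/- 0.5 km/s/Mpc from the CMB."
))
```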

    Exploiting Latent Features of Text and Graphs

    As the size and scope of online data continue to grow, new machine learning techniques become necessary to best capitalize on the wealth of available information. However, the models that help convert data into knowledge require nontrivial processes to make sense of large collections of text and massive online graphs. In both scenarios, modern machine learning pipelines produce embeddings --- semantically rich vectors of latent features --- to convert human constructs into a form suitable for machine understanding. In this dissertation we focus on information available within biomedical science, including human-written abstracts of scientific papers, as well as machine-generated graphs of biomedical entity relationships. We present the Moliere system and our method for identifying new discoveries through the use of natural language processing and graph mining algorithms. We propose heuristically-based ranking criteria to augment Moliere, and leverage this ranking to identify a new gene-treatment target for HIV-associated Neurodegenerative Disorders. We additionally focus on the latent features of graphs and propose a new bipartite graph embedding technique; using this graph embedding, we advance the state of the art in hypergraph partitioning quality. Building on this intuition about graph embeddings, we present Agatha, a deep-learning approach to hypothesis generation. This system learns a data-driven ranking criterion derived from the embeddings of our proposed large biomedical semantic graph. To produce human-readable results, we additionally propose CBAG, a technique for conditional biomedical abstract generation.
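    To make the idea of an embedding-derived ranking criterion concrete, the sketch below scores candidate terms by cosine similarity of node embeddings. The embeddings and term names are made-up placeholders, and the similarity score stands in for (and greatly simplifies) the learned criteria used by the systems described above.

```python
# Toy embedding-based ranking of hypothesis candidates: higher cosine similarity
# between node embeddings suggests a stronger candidate connection.
# All vectors and term names below are placeholders, not real model outputs.
import numpy as np

embeddings = {
    "gene_X": np.array([0.9, 0.1, 0.2]),
    "HAND": np.array([0.8, 0.2, 0.3]),     # HIV-associated Neurodegenerative Disorders
    "aspirin": np.array([0.1, 0.9, 0.1]),
}


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def rank_hypotheses(target: str, candidates: list) -> list:
    """Rank candidate terms by embedding similarity to the target term."""
    scores = [(c, cosine(embeddings[target], embeddings[c])) for c in candidates]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


print(rank_hypotheses("HAND", ["gene_X", "aspirin"]))  # gene_X ranks first
```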

    Streaming Infrastructure and Natural Language Modeling with Application to Streaming Big Data

    Streaming data are produced at great velocity and in diverse variety. The vision of this research is to build an end-to-end system that handles the collection, curation and analysis of streaming data. The streaming data used in this thesis contain both numeric data and text data. First, in the area of data collection, we design and evaluate a data delivery framework that handles the real-time nature of streaming data; in this component we use streaming data from the automotive domain, since it is suitable for testing and evaluating our data delivery system. Secondly, in the area of data curation, we use a language model to analyze two online automotive forums as an example of streaming text data curation. Last but not least, we present our approach to automated query expansion on Twitter data as an example of streaming social media data analysis. This thesis provides a holistic view of the end-to-end system we have designed, built and analyzed.

    To study streaming data in the automotive domain, a complex and massive amount of data is collected from on-board sensors of operational connected vehicles (CVs), infrastructure data sources such as roadway sensors and traffic signals, mobile data sources such as cell phones, social media sources such as Twitter, and news and weather data services. Unfortunately, these data create a bottleneck at data centers for processing and retrieval of the collected data, and require the deployment of additional message transfer infrastructure between data producers and consumers to support diverse CV applications. In the first part of this dissertation, we present a strategy for creating an efficient and low-latency distributed message delivery system for CV systems using a distributed message delivery platform. This strategy enables large-scale ingestion, curation, and transformation of unstructured data (roadway traffic-related and roadway non-traffic-related data) into labeled and customized topics for a large number of subscribers or consumers, such as CVs, mobile devices, and data centers. We evaluate the performance of this strategy by developing a prototype infrastructure using Apache Kafka, an open-source message delivery system, and comparing its performance with the latency requirements of CV applications. We present experimental results of the message delivery infrastructure on two different distributed computing testbeds at Clemson University. Experiments were performed to measure the latency of the message delivery system for a variety of testing scenarios. These experiments reveal that the measured latencies are less than the U.S. Department of Transportation's recommended latency requirements for CV applications, which provides evidence that the system is capable of managing CV-related data distribution tasks.

    Human-generated streaming data are large in volume and noisy in content, and direct acquisition of the full scope of human-generated data is often ineffective. In our research, we look for an alternative resource for studying such data. Common Crawl is a massive multi-petabyte dataset hosted by Amazon. It contains archived HTML web page data from 2008 to date and has been widely used for text mining purposes. Using data extracted from Common Crawl has several advantages over a direct crawl of web data, among which is avoiding the risk of a user's home IP address being blacklisted for accessing a given web site too frequently. However, Common Crawl is a data sample, so questions arise about the quality of Common Crawl as a representative sample of the original data. We perform systematic tests on the similarity of topics estimated from Common Crawl compared to topics estimated from the full data of online forums. Our target is online discussions from a user forum for car enthusiasts, but our research strategy can be applied to other domains and samples to evaluate the representativeness of topic models. We show that topic proportions estimated from Common Crawl are not significantly different from those estimated on the full data. We also show that the topics are similar in terms of their word compositions, and no worse than the topic similarity estimated under true random sampling, which we simulate through a series of experiments. Our research will be of interest to analysts who wish to use Common Crawl to study topics of interest in user forum data, and to analysts applying topic models to other data samples.

    Twitter data is another example of high-velocity streaming data, and we use it to study query expansion in streaming social media data analysis. Query expansion is the problem of gathering more relevant documents, covering a certain topic, from a given set. In this thesis we outline a number of tools for a query expansion system that allows its user to gather more relevant documents (in this case, tweets from the Twitter social media system) while filtering out irrelevant documents. These tools include a method for triggering a given query expansion using a Jaccard similarity threshold between keywords, and a query expansion method that uses archived news reports to create a vector space of novel keywords. Owing to the nature of streaming data, the Twitter stream contains emerging events that are constantly changing and therefore not predictable using static queries, and keywords used in static queries often mismatch the words used in topics around those emerging events. To solve this problem, our proposed approach to automated query expansion first detects the emerging events, and then combines local analysis and global analysis methods to generate queries that capture the emerging topics. Experimental results show that by combining the global analysis and local analysis methods, our approach can capture the semantic information in the emerging events with high efficiency.
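    The Jaccard-threshold trigger lends itself to a compact illustration. The sketch below assumes expansion fires when the keyword overlap between the static query and recently collected tweets drops below a chosen threshold; the threshold value, keyword sets and trigger direction are assumptions for illustration, not the settings used in this thesis.

```python
# Toy query-expansion trigger: compare the static query's keywords with the
# keywords seen in recent tweets and expand when overlap is low.
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two keyword sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def should_expand(query_keywords: set, tweet_keywords: set,
                  threshold: float = 0.2) -> bool:
    """Trigger expansion when the static query no longer matches the stream."""
    return jaccard(query_keywords, tweet_keywords) < threshold


query = {"hurricane", "florida", "evacuation"}
recent_tweets = {"storm", "surge", "florida", "shelter", "power", "outage"}
print(should_expand(query, recent_tweets))  # True: 1/8 overlap, below threshold
```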

    Knowledge and Reasoning for Image Understanding

    Image Understanding is a long-established discipline in computer vision, which encompasses a body of advanced image processing techniques that are used to locate (“where”), characterize and recognize (“what”) objects, regions, and their attributes in the image. However, the notion of “understanding” (and the goal of artificially intelligent machines) goes beyond factual recall of the recognized components, and includes reasoning and thinking beyond what can be seen (or perceived). Understanding is often evaluated by asking questions of increasing difficulty; thus, the expected functionalities of an intelligent Image Understanding system can be expressed in terms of the functionalities that are required to answer questions about an image. Answering questions about images requires primarily three components: image understanding, question (natural language) understanding, and reasoning based on knowledge. Any question asking beyond what can be directly seen requires modeling of commonsense (or background/ontological/factual) knowledge and reasoning. Knowledge and reasoning have seen scarce use in image understanding applications. In this thesis, we demonstrate the utility of incorporating background knowledge and using explicit reasoning in image understanding applications. We first present a comprehensive survey of previous work that utilized background knowledge and reasoning in understanding images; this survey outlines the limited use of commonsense knowledge in high-level applications. We then present a set of vision- and reasoning-based methods for several applications and show that these approaches benefit, in terms of accuracy and interpretability, from the explicit use of knowledge and reasoning. We propose novel knowledge representations of images, knowledge acquisition methods, and a new implementation of an efficient probabilistic logical reasoning engine that can utilize publicly available commonsense knowledge to solve applications such as visual question answering and image puzzles. Additionally, we identify the need for new datasets that explicitly require external commonsense knowledge to solve. We propose the new task of Image Riddles, which requires a combination of vision and reasoning based on ontological knowledge, and we collect a sufficiently large dataset to serve as an ideal testbed for vision and reasoning research. Lastly, we propose end-to-end deep architectures that combine vision, knowledge and reasoning modules and achieve large performance boosts over state-of-the-art methods.

    Cluster analysis on radio product integration testing faults

    As software systems keep getting larger and more complex, integration testing is necessary to ensure that the different components of a system work together correctly. In large and complex systems the analysis of test faults can be difficult, because there are so many components that can cause a failure. With the increased use of automated tests, faults are also often caused by test environment or test automation issues. Test data and logs collected during test execution are usually the main source of information used for test fault analysis. As multiple studies have shown in recent years, the fault analysis process can be automated with text mining, natural language processing and machine learning methods applied to the data and logs collected from the tests. In this thesis, an exploratory study is performed on data collected from radio product integration tests at Nokia. Cluster analysis is used to find the fault types present in each of the collected file types, and different feature extraction methods are used and evaluated in terms of how well they separate the data for fault analysis. The study in this thesis paves the way for automated fault analysis in the future: the introduced methods can be applied to classify faults, and the results and findings can be used to determine the next steps toward future automated fault analysis applications.
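    As a concrete sketch of the kind of pipeline explored here, the example below extracts TF-IDF features from a few made-up log lines and clusters them with k-means. The log snippets, feature settings and cluster count are illustrative assumptions; the thesis evaluates several feature extraction methods on real test data, none of which appear below.

```python
# Toy fault clustering: TF-IDF features over log text, grouped with k-means.
# Logs, vectorizer settings and n_clusters are illustrative placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

logs = [
    "ERROR timeout waiting for radio unit to respond",
    "ERROR timeout waiting for baseband link",
    "FAIL test environment: license server unreachable",
    "FAIL test automation: fixture setup script crashed",
]

features = TfidfVectorizer(stop_words="english").fit_transform(logs)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)

for label, log in zip(labels, logs):
    print(label, log)  # similar fault types should share a cluster label
```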

    Entity-Oriented Search

    This open access book covers all facets of entity-oriented search—where “search” can be interpreted in the broadest sense of information access—from a unified point of view, and provides a coherent and comprehensive overview of the state of the art. It represents the first synthesis of research in this broad and rapidly developing area. Selected topics are discussed in-depth, the goal being to establish fundamental techniques and methods as a basis for future research and development. Additional topics are treated at a survey level only, containing numerous pointers to the relevant literature. A roadmap for future research, based on open issues and challenges identified along the way, rounds out the book. The book is divided into three main parts, sandwiched between introductory and concluding chapters. The first two chapters introduce readers to the basic concepts, provide an overview of entity-oriented search tasks, and present the various types and sources of data that will be used throughout the book. Part I deals with the core task of entity ranking: given a textual query, possibly enriched with additional elements or structural hints, return a ranked list of entities. This core task is examined in a number of different variants, using both structured and unstructured data collections, and numerous query formulations. In turn, Part II is devoted to the role of entities in bridging unstructured and structured data. Part III explores how entities can enable search engines to understand the concepts, meaning, and intent behind the query that the user enters into the search box, and how they can provide rich and focused responses (as opposed to merely a list of documents)—a process known as semantic search. The final chapter concludes the book by discussing the limitations of current approaches, and suggesting directions for future research. Researchers and graduate students are the primary target audience of this book. A general background in information retrieval is sufficient to follow the material, including an understanding of basic probability and statistics concepts as well as a basic knowledge of machine learning concepts and supervised learning algorithms.