1,876 research outputs found

    Knowledge will Propel Machine Understanding of Content: Extrapolating from Current Examples

    Full text link
    Machine Learning has been a big success story during the AI resurgence. One particular stand out success relates to learning from a massive amount of data. In spite of early assertions of the unreasonable effectiveness of data, there is increasing recognition for utilizing knowledge whenever it is available or can be created purposefully. In this paper, we discuss the indispensable role of knowledge for deeper understanding of content where (i) large amounts of training data are unavailable, (ii) the objects to be recognized are complex, (e.g., implicit entities and highly subjective content), and (iii) applications need to use complementary or related data in multiple modalities/media. What brings us to the cusp of rapid progress is our ability to (a) create relevant and reliable knowledge and (b) carefully exploit knowledge to enhance ML/NLP techniques. Using diverse examples, we seek to foretell unprecedented progress in our ability for deeper understanding and exploitation of multimodal data and continued incorporation of knowledge in learning techniques.Comment: Pre-print of the paper accepted at 2017 IEEE/WIC/ACM International Conference on Web Intelligence (WI). arXiv admin note: substantial text overlap with arXiv:1610.0770

    Introduction to semantic e-Science in biomedicine

    Get PDF
    The Semantic Web technologies provide enhanced capabilities that allow data and the meaning of the data to be shared and reused across application, enterprise, and community boundaries, better enabling integrative research and more effective knowledge discovery. This special issue is intended to give an introduction of the state-of-the-art of Semantic Web technologies and describe how such technologies would be used to build the e-Science infrastructure for biomedical communities. Six papers have been selected and included, featuring different approaches and experiences in a variety of biomedical domains

    Facilitating Design-by-Analogy: Development of a Complete Functional Vocabulary and Functional Vector Approach to Analogical Search

    Get PDF
    Design-by-analogy is an effective approach to innovative concept generation, but can be elusive at times due to the fact that few methods and tools exist to assist designers in systematically seeking and identifying analogies from general data sources, databases, or repositories, such as patent databases. A new method for extracting analogies from data sources has been developed to provide this capability. Building on past research, we utilize a functional vector space model to quantify analogous similarity between a design problem and the data source of potential analogies. We quantitatively evaluate the functional similarity between represented design problems and, in this case, patent descriptions of products. We develop a complete functional vocabulary to map the patent database to applicable functionally critical terms, using document parsing algorithms to reduce text descriptions of the data sources down to the key functions, and applying Zipf’s law on word count order reduction to reduce the words within the documents. The reduction of a document (in this case a patent) into functional analogous words enables the matching to novel ideas that are functionally similar, which can be customized in various ways. This approach thereby provides relevant sources of design-by-analogy inspiration. Although our implementation of the technique focuses on functional descriptions of patents and the mapping of these functions to those of the design problem, resulting in a set of analogies, we believe that this technique is applicable to other analogy data sources as well. As a verification of the approach, an original design problem for an automated window washer illustrates the distance range of analogical solutions that can be extracted, extending from very near-field, literal solutions to far-field cross-domain analogies. Finally, a comparison with a current patent search tool is performed to draw a contrast to the status quo and evaluate the effectiveness of this work.National Science Foundation (U.S.) (grant number CMMI-0855510)National Science Foundation (U.S.) (grant number CMMI-0855326)National Science Foundation (U.S.) (grant number CMMI-0855293)SUTD-MIT International Design Centre (IDC

    ISTRAŽIVANJE O POVEZIVANJU ENTITETA ZA SPECIFIČNE DOMENE S HETEROGENIM INFORMACIJSKIM MREŽAMA

    Get PDF
    Entity linking is a task of extracting information that links the mentioned entity in a collection of text with their similar knowledge base as well as it is the task of allocating unique identity to various entities such as locations, individuals and companies. Knowledgebase (KB) is used to optimize the information collection, organization and for retrieval of information. Heterogeneous information networks (HIN) comprises multiple-type interlinked objects with various types of relationship which are becoming increasingly most popular named bibliographic networks, social media networks as well including the typical relational database data. In HIN, there are various data objects are interconnected through various relations. The entity linkage determines the corresponding entities from unstructured web text, in the existing HIN. This work is the most important and it is the most challenge because of ambiguity and existing limited knowledge. Some HIN could be considered as a domain-specific KB. The current Entity Linking (EL) systems aimed towards corpora which contain heterogeneous as web information and it performs sub-optimally on the domain-specific corpora. The EL systems used one or more general or specific domains of linking such as DBpedia, Wikipedia, Freebase, IMDB, YAGO, Wordnet and MKB. This paper presents a survey on domain-specific entity linking with HIN. This survey describes with a deep understanding of HIN, which includes datasets,types and examples with related concepts.Povezivanje entiteta je zadatak izvlačenja podataka koji povezuju spomenuti entitet u zbirci teksta sa njihovom sličnom bazom znanja, kao i zadatak dodjeljivanja jedinstvenog identiteta različitim entitetima, kao što su lokacije, pojedinci i tvrtke. Baza znanja (BZ) koristi se za optimizaciju prikupljanja, organizacije i pronalaženja informacija. Heterogene mreže informacija (HMI) obuhvaćaju višestruke međusobno povezane objekte različitih vrsta odnosa koji postaju sve popularniji i nazivaju se bibliografskim mrežama, mrežama društvenih medija, uključujući tipične podatke relacijske baze podataka. U HMI-u postoje razni podaci koji su međusobno povezani kroz različite odnose. Povezanost entiteta određuje odgovarajuće entitete iz nestrukturiranog teksta na webu u postojećem HMI-u. Ovaj je rad najvažniji i najveći izazov zbog nejasnoće i postojećeg ograničenog znanja. Neki se HMI mogu smatrati BZ-om specifičnim za domenu. Trenutni sustav povezivanja entiteta (PE) usmjeren je prema korpusima koji sadrže heterogene informacije kao web informacije i oni djeluju suptimalno na korpusima specifičnim za domenu. PE sustavi koristili su jednu ili više općih ili specifičnih domena povezivanja, kao što su DBpedia, Wikipedia, Freebase, IMDB, YAGO, Wordnet i MKB. U ovom radu predstavljeno je istraživanje o povezivanju entiteta specifičnog za domenu sa HMI-om. Ovo istraživanje opisuje s dubokim razumijevanjem HMI-a, što uključuje skupove podataka, vrste i primjere s povezanim konceptima

    PubMed and beyond: a survey of web tools for searching biomedical literature

    Get PDF
    The past decade has witnessed the modern advances of high-throughput technology and rapid growth of research capacity in producing large-scale biological data, both of which were concomitant with an exponential growth of biomedical literature. This wealth of scholarly knowledge is of significant importance for researchers in making scientific discoveries and healthcare professionals in managing health-related matters. However, the acquisition of such information is becoming increasingly difficult due to its large volume and rapid growth. In response, the National Center for Biotechnology Information (NCBI) is continuously making changes to its PubMed Web service for improvement. Meanwhile, different entities have devoted themselves to developing Web tools for helping users quickly and efficiently search and retrieve relevant publications. These practices, together with maturity in the field of text mining, have led to an increase in the number and quality of various Web tools that provide comparable literature search service to PubMed. In this study, we review 28 such tools, highlight their respective innovations, compare them to the PubMed system and one another, and discuss directions for future development. Furthermore, we have built a website dedicated to tracking existing systems and future advances in the field of biomedical literature search. Taken together, our work serves information seekers in choosing tools for their needs and service providers and developers in keeping current in the field

    Information retrieval and text mining technologies for chemistry

    Get PDF
    Efficient access to chemical information contained in scientific literature, patents, technical reports, or the web is a pressing need shared by researchers and patent attorneys from different chemical disciplines. Retrieval of important chemical information in most cases starts with finding relevant documents for a particular chemical compound or family. Targeted retrieval of chemical documents is closely connected to the automatic recognition of chemical entities in the text, which commonly involves the extraction of the entire list of chemicals mentioned in a document, including any associated information. In this Review, we provide a comprehensive and in-depth description of fundamental concepts, technical implementations, and current technologies for meeting these information demands. A strong focus is placed on community challenges addressing systems performance, more particularly CHEMDNER and CHEMDNER patents tasks of BioCreative IV and V, respectively. Considering the growing interest in the construction of automatically annotated chemical knowledge bases that integrate chemical information and biological data, cheminformatics approaches for mapping the extracted chemical names into chemical structures and their subsequent annotation together with text mining applications for linking chemistry with biological information are also presented. Finally, future trends and current challenges are highlighted as a roadmap proposal for research in this emerging field.A.V. and M.K. acknowledge funding from the European Community’s Horizon 2020 Program (project reference: 654021 - OpenMinted). M.K. additionally acknowledges the Encomienda MINETAD-CNIO as part of the Plan for the Advancement of Language Technology. O.R. and J.O. thank the Foundation for Applied Medical Research (FIMA), University of Navarra (Pamplona, Spain). This work was partially funded by Consellería de Cultura, Educación e Ordenación Universitaria (Xunta de Galicia), and FEDER (European Union), and the Portuguese Foundation for Science and Technology (FCT) under the scope of the strategic funding of UID/BIO/04469/2013 unit and COMPETE 2020 (POCI-01-0145-FEDER-006684). We thank Iñigo Garciá -Yoldi for useful feedback and discussions during the preparation of the manuscript.info:eu-repo/semantics/publishedVersio

    Function Based Design-by-Analogy: A Functional Vector Approach to Analogical Search

    Get PDF
    Design-by-analogy is a powerful approach to augment traditional concept generation methods by expanding the set of generated ideas using similarity relationships from solutions to analogous problems. While the concept of design-by-analogy has been known for some time, few actual methods and tools exist to assist designers in systematically seeking and identifying analogies from general data sources, databases, or repositories, such as patent databases. A new method for extracting functional analogies from data sources has been developed to provide this capability, here based on a functional basis rather than form or conflict descriptions. Building on past research, we utilize a functional vector space model (VSM) to quantify analogous similarity of an idea's functionality. We quantitatively evaluate the functional similarity between represented design problems and, in this case, patent descriptions of products. We also develop document parsing algorithms to reduce text descriptions of the data sources down to the key functions, for use in the functional similarity analysis and functional vector space modeling. To do this, we apply Zipf's law on word count order reduction to reduce the words within the documents down to the applicable functionally critical terms, thus providing a mapping process for function based search. The reduction of a document into functional analogous words enables the matching to novel ideas that are functionally similar, which can be customized various ways. This approach thereby provides relevant sources of design-by-analogy inspiration. As a verification of the approach, two original design problem case studies illustrate the distance range of analogical solutions that can be extracted. This range extends from very near-field, literal solutions to far-field cross-domain analogies.National Science Foundation (U.S.) (Grant CMMI-0855326)National Science Foundation (U.S.) (Grant CMMI-0855510)National Science Foundation (U.S.) (Grant CMMI-0855293)SUTD-MIT International Design Centre (IDC

    Design-by-analogy: experimental evaluation of a functional analogy search methodology for concept generation improvement

    Get PDF
    Design-by-analogy is a growing field of study and practice, due to its power to augment and extend traditional concept generation methods by expanding the set of generated ideas using similarity relationships from solutions to analogous problems. This paper presents the results of experimentally testing a new method for extracting functional analogies from general data sources, such as patent databases, to assist designers in systematically seeking and identifying analogies. In summary, the approach produces significantly improved results on the novelty of solutions generated and no significant change in the total quantity of solutions generated. Computationally, this design-by-analogy facilitation methodology uses a novel functional vector space representation to quantify the functional similarity between represented design problems and, in this case, patent descriptions of products. The mapping of the patents into the functional analogous words enables the generation of functionally relevant novel ideas that can be customized in various ways. Overall, this approach provides functionally relevant novel sources of design-by-analogy inspiration to designers and design teams.SUTD-MIT International Design Centre (IDC)National Science Foundation (U.S.) (Grant Numbers CMMI-0855326, CMMI-0855510, and CMMI-08552930
    corecore