11,740 research outputs found

    Spartan Daily, September 6, 1984

    Volume 83, Issue 5

    The Semantic Web: Apotheosis of annotation, but what are its semantics?

    This article discusses what kind of entity the proposed Semantic Web (SW) is, principally by reference to the relationship of natural language structure to knowledge representation (KR). There are three distinct views on this issue. The first is that the SW is basically a renaming of the traditional AI KR task, with all its problems and challenges. The second view is that the SW will be, at a minimum, the World Wide Web with its constituent documents annotated so as to yield their content, or meaning structure, more directly. This view makes natural language processing central as the procedural bridge from texts to KR, usually via some form of automated information extraction. The third view is that the SW is about trusted databases as the foundation of a system of Web processes and services. There is also a fourth view, which is much more difficult to define and discuss: if the SW just keeps moving as an engineering development and is lucky, then real problems won't arise. This article is part of a special issue called Semantic Web Update.

    Inside UNLV


    Text mining of biomedical literature: discovering new knowledge

    Biomedical literature is growing daily, and the volume of literature on “coronavirus” has expanded at a high rate. In this study, text mining techniques are employed to discover new knowledge from the published literature. The main objectives are to show the growth of the literature (January to June 2020), extract document sections, identify latent topics, find the most frequent word, represent the bag of words, and perform hierarchical clustering. We collected 16500 documents from PubMed. The largest number of documents (11499) belongs to May and June. We identify “betacoronavirus” as the leading document section (3837) and “covid” (29890) as the most frequent word in the abstracts, and we report positive and negative weights of topics. Further, we measure the term frequency (TF) of a document title in the bag-of-words model. Then we compute a hierarchical clustering of document titles, which reveals that the lowest distance for the selected cluster (C133) is 0.30. We also discuss future prospects and note that this paper can be useful to researchers and library professionals for knowledge management.
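
    The term-frequency and hierarchical-clustering steps named in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the whitespace tokenizer, the cosine distance, the single-linkage rule, and the example titles are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
from collections import Counter

def term_frequency(title):
    """Bag-of-words term frequency: count of each lowercase token in a title."""
    return Counter(title.lower().split())

def cosine_distance(a, b):
    """1 - cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

def single_linkage(titles):
    """Naive agglomerative clustering (assumed single linkage).

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where each cluster is a tuple of title indices.
    """
    vectors = [term_frequency(t) for t in titles]
    clusters = [(i,) for i in range(len(titles))]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters whose closest members are nearest.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(cosine_distance(vectors[p], vectors[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Hypothetical titles, not taken from the PubMed collection in the study.
titles = [
    "coronavirus vaccine trial",
    "coronavirus vaccine study",
    "library knowledge management",
]
for left, right, distance in single_linkage(titles):
    print(left, right, round(distance, 2))
```

    The merge history plays the role of a dendrogram: each entry records which clusters were joined and at what distance, so a "lowest distance" for a cluster is the distance at which it was formed. For a collection the size of the one described, a library implementation would be a more practical choice than this quadratic loop.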

    Arabic and English News Coverage on aljazeera.net

    The controversial Al Jazeera network, with its Arabic and English news websites, is an interesting object for comparative study. This study compares the two language versions in terms of their layouts and structural features, regional and thematic coverage, and the ideological perspective reflected in the headlines of news reports. Content analysis and critical discourse analysis revealed differences between the two versions for all aspects except thematic coverage, indicating systematic biases in coverage, alongside efforts to present ideological balance.

    Large-Scale Knowledge Synthesis and Complex Information Retrieval from Biomedical Documents

    Recent advances in the healthcare industry have led to an abundance of unstructured data, making it challenging to perform tasks such as efficient and accurate information retrieval at scale. Our work offers an all-in-one scalable solution for extracting and exploring complex information from large-scale research documents, which would otherwise be tedious. First, we briefly explain our knowledge synthesis process to extract helpful information from the unstructured text of research documents. Then, on top of the knowledge extracted from the documents, we perform complex information retrieval using three major components: Paragraph Retrieval, Triplet Retrieval from Knowledge Graphs, and Complex Question Answering (QA). These components combine lexical and semantic methods to retrieve paragraphs and triplets and perform faceted refinement to filter the search results. The complexity of biomedical queries and documents necessitates a QA system capable of handling queries more complex than factoid queries, which we evaluate qualitatively on the COVID-19 Open Research Dataset (CORD-19) to demonstrate the effectiveness and value-add