828 research outputs found

    AGROVOC: The linked data concept hub for food and agriculture

    Get PDF
    Newly acquired, aggregated and shared data are essential for innovation in food and agriculture to improve the discoverability of research. Since the early 1980′s, the Food and Agriculture Organization of the United Nations (FAO) has coordinated AGROVOC, a valuable tool for data to be classified homogeneously, facilitating interoperability and reuse. AGROVOC is a multilingual and controlled vocabulary designed to cover concepts and terminology under FAO's areas of interest. It is the largest Linked Open Data set about agriculture available for public use and its highest impact is through facilitating the access and visibility of data across domains and languages. This chapter has the aim of describing the current status of one of the most popular thesaurus in all FAO’s areas of interest, and how it has become the Linked Data Concept Hub for food and agriculture, through new procedures put in plac

    Mining Meaning from Wikipedia

    Get PDF
    Wikipedia is a goldmine of information; not just for its many readers, but also for the growing community of researchers who recognize it as a resource of exceptional scale and utility. It represents a vast investment of manual effort and judgment: a huge, constantly evolving tapestry of concepts and relations that is being applied to a host of tasks. This article provides a comprehensive description of this work. It focuses on research that extracts and makes use of the concepts, relations, facts and descriptions found in Wikipedia, and organizes the work into four broad categories: applying Wikipedia to natural language processing; using it to facilitate information retrieval and information extraction; and as a resource for ontology building. The article addresses how Wikipedia is being used as is, how it is being improved and adapted, and how it is being combined with other structures to create entirely new resources. We identify the research groups and individuals involved, and how their work has developed in the last few years. We provide a comprehensive list of the open-source software they have produced.Comment: An extensive survey of re-using information in Wikipedia in natural language processing, information retrieval and extraction and ontology building. Accepted for publication in International Journal of Human-Computer Studie

    Classification management and use in a networked environment : the case of the Universal Decimal Classification

    Get PDF
    In the Internet information space, advanced information retrieval (IR) methods and automatic text processing are used in conjunction with traditional knowledge organization systems (KOS). New information technology provides a platform for better KOS publishing, exploitation and sharing both for human and machine use. Networked KOS services are now being planned and developed as powerful tools for resource discovery. They will enable automatic contextualisation, interpretation and query matching to different indexing languages. The Semantic Web promises to be an environment in which the quality of semantic relationships in bibliographic classification systems can be fully exploited. Their use in the networked environment is, however, limited by the fact that they are not prepared or made available for advanced machine processing. The UDC was chosen for this research because of its widespread use and its long-term presence in online information retrieval systems. It was also the first system to be used for the automatic classification of Internet resources, and the first to be made available as a classification tool on the Web. The objective of this research is to establish the advantages of using UDC for information retrieval in a networked environment, to highlight the problems of automation and classification exchange, and to offer possible solutions. The first research question was is there enough evidence of the use of classification on the Internet to justify further development with this particular environment in mind? The second question is what are the automation requirements for the full exploitation of UDC and its exchange? The third question is which areas are in need of improvement and what specific recommendations can be made for implementing the UDC in a networked environment? A summary of changes required in the management and development of the UDC to facilitate its full adaptation for future use is drawn from this analysis.EThOS - Electronic Theses Online ServiceGBUnited Kingdo

    Jewish Studies in the Digital Age

    Get PDF
    The digitisation boom of the last two decades, and the rapid advancement of digital tools to analyse data in myriad ways, have opened up new avenues for humanities research. This volume discusses how the so-called digital turn has affected the field of Jewish Studies, explores the current state of the art and probes how digital developments can be harnessed to address the specific questions, challenges and problems in the field

    The Future of Information Sciences : INFuture2009 : Digital Resources and Knowledge Sharing

    Get PDF

    Jewish Studies in the Digital Age

    Get PDF
    The digitisation boom of the last two decades, and the rapid advancement of digital tools to analyse data in myriad ways, have opened up new avenues for humanities research. This volume discusses how the so-called digital turn has affected the field of Jewish Studies, explores the current state of the art and probes how digital developments can be harnessed to address the specific questions, challenges and problems in the field

    Object-based knowledge representation of famale related issues from the Holy Quran

    Get PDF
    Focusing on the use of Semantic Network and Conceptual Graph (CG) representations, this paper presents an easy way in understanding concepts discussed in the Holy Quran.Quran is known as the main source of knowledge and has been a major source reference for all types of problems. However understanding the issues and the solution from the Quran is diffi cult due to lack of understanding of Quran literature.Meticulously, the Quran contains much important information related to female.However, this information are scattered and complexly linked. Technically, to extract and present the encapsulated knowledge on female matters in the Quran is a challenging task.Thus, this paper discusses on how to understand and represent the knowledge in an easy way.A total of 18 female terms are identifi ed. Through the terms, the name of surah, verses number and text from the verses are gathered. The texts are then analyzed and clustered into specific issues.Result of the analysis that consists of extracted knowledge on female issues is presented in a systematic structure using Semantic Network and CG.The strength and advantages of both approaches are compared, discussed and presented

    Hybrid fuzzy multi-objective particle swarm optimization for taxonomy extraction

    Get PDF
    Ontology learning refers to an automatic extraction of ontology to produce the ontology learning layer cake which consists of five kinds of output: terms, concepts, taxonomy relations, non-taxonomy relations and axioms. Term extraction is a prerequisite for all aspects of ontology learning. It is the automatic mining of complete terms from the input document. Another important part of ontology is taxonomy, or the hierarchy of concepts. It presents a tree view of the ontology and shows the inheritance between subconcepts and superconcepts. In this research, two methods were proposed for improving the performance of the extraction result. The first method uses particle swarm optimization in order to optimize the weights of features. The advantage of particle swarm optimization is that it can calculate and adjust the weight of each feature according to the appropriate value, and here it is used to improve the performance of term and taxonomy extraction. The second method uses a hybrid technique that uses multi-objective particle swarm optimization and fuzzy systems that ensures that the membership functions and fuzzy system rule sets are optimized. The advantage of using a fuzzy system is that the imprecise and uncertain values of feature weights can be tolerated during the extraction process. This method is used to improve the performance of taxonomy extraction. In the term extraction experiment, five extracted features were used for each term from the document. These features were represented by feature vectors consisting of domain relevance, domain consensus, term cohesion, first occurrence and length of noun phrase. For taxonomy extraction, matching Hearst lexico-syntactic patterns in documents and the web, and hypernym information form WordNet were used as the features that represent each pair of terms from the texts. These two proposed methods are evaluated using a dataset that contains documents about tourism. For term extraction, the proposed method is compared with benchmark algorithms such as Term Frequency Inverse Document Frequency, Weirdness, Glossary Extraction and Term Extractor, using the precision performance evaluation measurement. For taxonomy extraction, the proposed methods are compared with benchmark methods of Feature-based and weighting by Support Vector Machine using the f-measure, precision and recall performance evaluation measurements. For the first method, the experiment results concluded that implementing particle swarm optimization in order to optimize the feature weights in terms and taxonomy extraction leads to improved accuracy of extraction result compared to the benchmark algorithms. For the second method, the results concluded that the hybrid technique that uses multi-objective particle swarm optimization and fuzzy systems leads to improved performance of taxonomy extraction results when compared to the benchmark methods, while adjusting the fuzzy membership function and keeping the number of fuzzy rules to a minimum number with a high degree of accuracy
    corecore