9,586 research outputs found

    Web Data Extraction, Applications and Techniques: A Survey

    Full text link
    Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and operate in ad-hoc domains. Other approaches, instead, heavily reuse techniques and algorithms developed in the field of Information Extraction. This survey aims at providing a structured and comprehensive overview of the literature in the field of Web Data Extraction. We provided a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes, namely applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool to perform data analysis in Business and Competitive Intelligence systems as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques allow to gather a large amount of structured data continuously generated and disseminated by Web 2.0, Social Media and Online Social Network users and this offers unprecedented opportunities to analyze human behavior at a very large scale. We discuss also the potential of cross-fertilization, i.e., on the possibility of re-using Web Data Extraction techniques originally designed to work in a given domain, in other domains.Comment: Knowledge-based System

    empathi: An ontology for Emergency Managing and Planning about Hazard Crisis

    Full text link
    In the domain of emergency management during hazard crises, having sufficient situational awareness information is critical. It requires capturing and integrating information from sources such as satellite images, local sensors and social media content generated by local people. A bold obstacle to capturing, representing and integrating such heterogeneous and diverse information is lack of a proper ontology which properly conceptualizes this domain, aggregates and unifies datasets. Thus, in this paper, we introduce empathi ontology which conceptualizes the core concepts concerning with the domain of emergency managing and planning of hazard crises. Although empathi has a coarse-grained view, it considers the necessary concepts and relations being essential in this domain. This ontology is available at https://w3id.org/empathi/

    Expert Finding by Capturing Organisational Knowledge from Legacy Documents

    No full text
    Organisations capitalise on their best knowledge through the improvement of shared expertise which leads to a higher level of productivity and competency. The recognition of the need to foster the sharing of expertise has led to the development of expert finder systems that hold pointers to experts who posses specific knowledge in organisations. This paper discusses an approach to locating an expert through the application of information retrieval and analysis processes to an organization’s existing information resources, with specific reference to the engineering design domain. The approach taken was realised through an expert finder system framework. It enables the relationships of heterogeneous information sources with experts to be factored in modelling individuals’ expertise. These valuable relationships are typically ignored by existing expert finder systems, which only focus on how documents relate to their content. The developed framework also provides an architecture that can be easily adapted to different organisational environments. In addition, it also allows users to access the expertise recognition logic, giving them greater trust in the systems implemented using this framework. The framework were applied to real world application and evaluated within a major engineering company

    Mapping Big Data into Knowledge Space with Cognitive Cyber-Infrastructure

    Full text link
    Big data research has attracted great attention in science, technology, industry and society. It is developing with the evolving scientific paradigm, the fourth industrial revolution, and the transformational innovation of technologies. However, its nature and fundamental challenge have not been recognized, and its own methodology has not been formed. This paper explores and answers the following questions: What is big data? What are the basic methods for representing, managing and analyzing big data? What is the relationship between big data and knowledge? Can we find a mapping from big data into knowledge space? What kind of infrastructure is required to support not only big data management and analysis but also knowledge discovery, sharing and management? What is the relationship between big data and science paradigm? What is the nature and fundamental challenge of big data computing? A multi-dimensional perspective is presented toward a methodology of big data computing.Comment: 59 page
    • …
    corecore