
    Web Data Extraction, Applications and Techniques: A Survey

    Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and operate in ad-hoc domains. Other approaches, instead, heavily reuse techniques and algorithms developed in the field of Information Extraction. This survey aims to provide a structured and comprehensive overview of the literature in the field of Web Data Extraction. We provide a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes, namely applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool for data analysis in Business and Competitive Intelligence systems as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques make it possible to gather large amounts of structured data continuously generated and disseminated by Web 2.0, Social Media and Online Social Network users, which offers unprecedented opportunities to analyze human behavior at a very large scale. We also discuss the potential of cross-fertilization, i.e., the possibility of reusing Web Data Extraction techniques originally designed for a given domain in other domains.
    Comment: Knowledge-based System

    Effects of Random Errors on Graph Convolutional Networks

    The use of Graph Convolutional Networks (GCN) has been an emerging trend in the network science research community. While GCN achieves excellent performance in several tasks, an open issue remains in applying GCN to real-world applications: the effect of network errors on GCN. Since real-world network data contain several types of noise and errors, it is desirable for GCN to be only weakly affected by such errors. However, these effects have not been sufficiently evaluated so far. In this paper, we analyze the effects of random errors on GCN through extensive experiments. The results show that the node classification accuracy of GCN decreases by only 5% even when the number of edges is randomly increased or decreased by 50%. Moreover, in terms of false labels, the node classification accuracy decreases by only 10% even when 20% of the labels are changed.
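
    The kind of random error described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `perturb_edges` and `flip_labels`, the undirected edge-list representation, and the choice to flip each selected label to a uniformly chosen different class are all assumptions made for illustration. The abstract does not specify how edges or labels were actually perturbed.

```python
import random


def perturb_edges(edges, num_nodes, fraction, mode, seed=0):
    """Randomly remove or add round(fraction * |E|) undirected edges.

    Hypothetical helper: edges are (u, v) pairs over nodes 0..num_nodes-1.
    """
    rng = random.Random(seed)
    edge_set = {(min(u, v), max(u, v)) for u, v in edges}
    k = int(round(fraction * len(edge_set)))

    if mode == "remove":
        removed = rng.sample(sorted(edge_set), k)
        return sorted(edge_set - set(removed))

    if mode == "add":
        max_edges = num_nodes * (num_nodes - 1) // 2
        if len(edge_set) + k > max_edges:
            raise ValueError("not enough free node pairs to add edges")
        added = set()
        while len(added) < k:
            u, v = rng.sample(range(num_nodes), 2)  # two distinct nodes
            candidate = (min(u, v), max(u, v))
            if candidate not in edge_set:
                added.add(candidate)
        return sorted(edge_set | added)

    raise ValueError("mode must be 'remove' or 'add'")


def flip_labels(labels, num_classes, fraction, seed=0):
    """Replace round(fraction * n) labels with a different random class."""
    rng = random.Random(seed)
    noisy = list(labels)
    k = int(round(fraction * len(noisy)))
    for i in rng.sample(range(len(noisy)), k):
        # Assumption: a "changed" label always becomes a different class.
        other = [c for c in range(num_classes) if c != noisy[i]]
        noisy[i] = rng.choice(other)
    return noisy


edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
labels = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0]
fewer_edges = perturb_edges(edges, num_nodes=5, fraction=0.5, mode="remove")
more_edges = perturb_edges(edges, num_nodes=5, fraction=0.5, mode="add")
noisy_labels = flip_labels(labels, num_classes=3, fraction=0.2)
```

    In an experiment of this kind, a model would then be trained on the perturbed graph or labels and its node classification accuracy compared against the unperturbed baseline.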

    How alternative food networks work in a metropolitan area? An analysis of Solidarity Purchase Groups in Northern Italy

    Our paper focuses on Solidarity Purchase Group (SPG) participants located in a highly urbanized area, with the aim of investigating the main motivations underlying their participation in an SPG and characterizing them. To this end, we carried out a survey of 795 participants involved in 125 SPGs in the metropolitan area of Milan (Italy). Using a questionnaire with 39 questions, we ran a factor analysis and a two-step cluster analysis to identify different profiles of SPG participants. Our results show that the system of values animating metropolitan SPG practitioners does not fully conform to that traditionally attributed to an alternative food network (AFN). In fact, considerations linked to food safety and healthiness prevail over altruistic motives such as environmental sustainability and solidarity toward small producers. Furthermore, metropolitan SPGs do not consider periurban and local food products particularly desirable. From this perspective, it emerges that such initiatives can also flourish in places where the lack of connection with the surrounding territory is counterbalanced by a strong motivation to buy from trusted suppliers who can guarantee genuine and safe products and who are not necessarily located nearby.