Web Data Extraction, Applications and Techniques: A Survey
Web Data Extraction is an important problem that has been studied by means of
different scientific tools and in a broad range of applications. Many
approaches to extracting data from the Web have been designed to solve specific
problems and operate in ad-hoc domains. Other approaches, instead, heavily
reuse techniques and algorithms developed in the field of Information
Extraction.
This survey aims at providing a structured and comprehensive overview of the
literature in the field of Web Data Extraction. We provide a simple
classification framework in which existing Web Data Extraction applications are
grouped into two main classes, namely applications at the Enterprise level and
at the Social Web level. At the Enterprise level, Web Data Extraction
techniques emerge as a key tool to perform data analysis in Business and
Competitive Intelligence systems as well as for business process
re-engineering. At the Social Web level, Web Data Extraction techniques make it
possible to gather the large amount of structured data continuously generated and
disseminated by Web 2.0, Social Media and Online Social Network users, which
offers unprecedented opportunities to analyze human behavior at a very large
scale. We also discuss the potential of cross-fertilization, i.e., the
possibility of re-using Web Data Extraction techniques originally designed to
work in a given domain in other domains.
Comment: Knowledge-based System
The global economic crisis and international migration: An uncertain outlook
This article investigates the impact of the global economic crisis on international migration. Empirical evidence is scarce and mainly captures short-term consequences. The study covers (1) international migration theory, (2) the impact of past financial crises on international migration, and (3) published expert opinions, studies and discussions. The impact varies by reason for migration and by migrants’ employment status. Labour migration is affected most, in particular migration of low-skilled persons. Political and environmental refugees, marriage migration and family reunion will not be much affected. Remittances are affected less than predicted. The management of migration during periods of economic downturns should be guided by short- and long-term perspectives on the role of migration in development.
Benefits of a Product's Industry 4.0 Compliancy
The latest industrial revolution, the fourth, labeled Industry 4.0, has been ongoing for over a decade. Still, the topic remains surrounded by ambiguity: the details that define what it really means, and what the purpose of Industry 4.0 is, are often lacking. Inspired by a cloud integration project implemented by Cryotech Nordic for an Italian company in autumn 2021, this thesis aims to answer some of the questions about Industry 4.0 that arose during the project by looking at the issue from the perspective of product features, while discussing the benefits that different stakeholders seek and obtain from the implementation of Industry 4.0 technologies and systems. The thesis utilises two methods for its two research questions: one studies national Industry 4.0 initiatives and known Industry 4.0 products, and the other is a literature review that seeks answers from a group of articles. The outcome of the thesis is a Minimum Viable Product model and a categorisation of the benefits, and of the stakeholders receiving them, from Industry 4.0 implementations, with a statistical distribution of the benefits found across these categories.
ICT Update 66: A future for telecentres
ICT Update is a bimonthly printed and on-line magazine (http://ictupdate.cta.int) and an accompanying e-mail newsletter published by CTA. This issue focuses on the future of telecentres
Copyright Infringement Liability of Online Content Sharing Platforms in the US and in the EU after the Digital Single Market Directive: A Case Study
The EU copyright liability regime for internet service providers has changed significantly since the enactment of article 17 of the Digital Single Market Directive. Where two fairly similar systems once existed in the US and in the EU, there are now significant differences between the regimes with which service providers must comply in each region. This paper seeks to offer a practical view of the differences between the two systems through a comparative analysis of the result that applying each legal framework would have on an identical factual case. Specifically, this paper contrasts the decision reached by US courts in Capitol Records v. Vimeo with the hypothetical result that an EU court would reach on those same facts under the Digital Single Market Directive.
Complexity Theory, Adaptation, and Administrative Law
Recently, commentators have applied insights from complexity theory to legal analysis generally and to administrative law in particular. This Article focuses on one of the central problems that complexity theory addresses, the importance and mechanisms of adaptation within complex systems. In Part I, the Article uses three features of complex adaptive systems (emergence from self-assembly, nonlinearity, and sensitivity to initial conditions) and explores the extent to which they may add value as a matter of positive analysis to the understanding of change within legal systems. In Part II, the Article focuses on three normative claims in public law scholarship that depend explicitly or implicitly on notions of adaptation: that states offer advantages over the federal government because experimentation can make them more adaptive, that federal agencies should themselves become more experimentalist using the tool of adaptive management, and that administrative agencies should adopt collaborative mechanisms in policymaking. Using two analytic tools found in the complexity literature, the genetic algorithm and evolutionary game theory, the Article tests the extent to which these three normative claims are borne out.
You, the Web and Your Device: Longitudinal Characterization of Browsing Habits
Understanding how people interact with the web is key for a variety of
applications, e.g., from the design of effective web pages to the definition of
successful online marketing campaigns. Browsing behavior has been traditionally
represented and studied by means of clickstreams, i.e., graphs whose vertices
are web pages, and edges are the paths followed by users. Obtaining large and
representative data to extract clickstreams is however challenging. The
evolution of the web questions whether browsing behavior is changing and, by
consequence, whether properties of clickstreams are changing. This paper
presents a longitudinal study of clickstreams from 2013 to 2016. We evaluate
an anonymized dataset of HTTP traces captured in a large ISP, where thousands
of households are connected. We first propose a methodology to identify actual
URLs requested by users from the massive set of requests automatically fired by
browsers when rendering web pages. Then, we characterize web usage patterns and
clickstreams, taking into account both the temporal evolution and the impact of
the device used to explore the web. Our analyses precisely quantify various
aspects of clickstreams and uncover interesting patterns, such as the typical
short paths followed by people while navigating the web, the fast increasing
trend in browsing from mobile devices and the different roles of search engines
and social networks in promoting content. Finally, we contribute a dataset of
anonymized clickstreams to the community to foster new studies (anonymized
clickstreams are available to the public at
http://bigdata.polito.it/clickstream).
Comment: 30 page
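The abstract above describes a clickstream as a graph whose vertices are web pages and whose edges are the paths followed by users. As a minimal sketch of that data structure only, and not of the paper's own methodology, the Python below assumes each visit is already available as a hypothetical `(referrer, url)` pair; the function names and the example URLs are illustrative assumptions.

```python
from collections import defaultdict


def build_clickstream(visits):
    """Build a directed graph from (referrer, url) pairs.

    Vertices are pages; an edge referrer -> url means a user moved
    from the referring page to the visited page. The input format is
    an assumption made for this sketch.
    """
    graph = defaultdict(set)
    for referrer, url in visits:
        graph[url]  # touching the key registers the page as a vertex
        if referrer is not None:
            graph[referrer].add(url)
    return {page: sorted(targets) for page, targets in graph.items()}


def path_depth(graph, start):
    """Length, in clicks, of the longest simple path starting at `start`."""
    def walk(page, seen):
        depths = [
            1 + walk(nxt, seen | {nxt})
            for nxt in graph.get(page, [])
            if nxt not in seen  # skip pages already on this path
        ]
        return max(depths, default=0)

    return walk(start, {start})


# Hypothetical visits: one search page leading to two sites.
visits = [
    (None, "search.example/q"),
    ("search.example/q", "news.example/home"),
    ("news.example/home", "news.example/article"),
    ("search.example/q", "shop.example/item"),
]
graph = build_clickstream(visits)
print(graph)
print(path_depth(graph, "search.example/q"))
```

Storing each page's successors as a set keeps repeated transitions from producing duplicate edges. The depth function explores every simple path, so it is only suitable for small per-user graphs.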
Regulating Search Engines: Taking Stock And Looking Ahead
Since the creation of the first pre-Web Internet search engines in the early 1990s, search engines have become almost as important as email as a primary online activity. Arguably, search engines are among the most important gatekeepers in today's digitally networked environment. Thus, it does not come as a surprise that the evolution of search technology and the diffusion of search engines have been accompanied by a series of conflicts among stakeholders such as search operators, content creators, consumers/users, activists, and governments. While few tussles existed in the initial phase of innovation where Internet search engines were mainly used by 'techies' and academics, substantial conflicts emerged once the technology got out of the universities and entered the commercial space. When search technology advanced and search services gained commercial significance, these conflicts became more severe and made their way into the legal arena. At the core of most of these disputes were controversies over intellectual property, particularly trademark and copyright issues