Intelligent Management and Efficient Operation of Big Data
This chapter details how Big Data can be used and implemented in networking
and computing infrastructures. Specifically, it addresses three main aspects:
the timely extraction of relevant knowledge from heterogeneous and very often
unstructured large data sources, the enhancement of the performance of
processing and networking (cloud) infrastructures that are the most important
foundational pillars of Big Data applications or services, and novel ways to
efficiently manage network infrastructures with high-level composed policies
for supporting the transmission of large amounts of data with distinct
requisites (video vs. non-video). A case study involving an intelligent
management solution to route data traffic with diverse requirements in a wide
area Internet Exchange Point is presented, discussed in the context of Big
Data, and evaluated.
Comment: In book Handbook of Research on Trends and Future Directions in Big
Data and Web Intelligence, IGI Global, 201
Information Extraction in Illicit Domains
Extracting useful entities and attribute values from illicit domains such as
human trafficking is a challenging problem with the potential for widespread
social impact. Such domains employ atypical language models, have 'long tails'
and suffer from the problem of concept drift. In this paper, we propose a
lightweight, feature-agnostic Information Extraction (IE) paradigm specifically
designed for such domains. Our approach uses raw, unlabeled text from an
initial corpus, and a few (12-120) seed annotations per domain-specific
attribute, to learn robust IE models for unobserved pages and websites.
Empirically, we demonstrate that our approach can outperform feature-centric
Conditional Random Field baselines by over 18% F-Measure on five annotated
sets of real-world human trafficking datasets in both low-supervision and
high-supervision settings. We also show that our approach is demonstrably
robust to concept drift, and can be efficiently bootstrapped even in a serial
computing environment.
Comment: 10 pages, ACM WWW 201
Big Data in the Oil and Gas Industry: A Promising Courtship
The energy industry remains one of the highest money-producing and investment industries in the world. The United States’ own economic stability depends greatly on the stability of oil and gas prices. Various factors affect the amount of money that will continue to be invested in producing oil. A main disadvantage to the oil and gas industry is its lack of technological adaptation. This weakens the industry because the surest measures are not currently being taken to produce oil in optimally efficient, safe, and cost-effective ways. Big data has gained global recognition as an opportunity to gather large volumes of information in real-time and translate data sets into actionable insights. In a low commodity price environment, saving time, reducing costs, and improving safety are crucial outcomes that can be realized using machine learning in oil and gas operations. Big data provides the opportunity to use unsupervised learning. For example, with this approach, engineers can predict oil wells’ optimal barrels of production given the completion data in a specific area. However, a caveat to utilizing big data in the oil and gas industry is that there simply is neither enough physical data nor data velocity in the industry to be properly referred to as “big data.” Big data, as it develops, will nonetheless significantly change the energy business in the future, as it already has in various other industries.
Petroleum and Geosystems Engineering
Categorical Data Integration for Computational Science
Categorical Query Language is an open-source query and data integration
scripting language that can be applied to common challenges in the field of
computational science. We discuss how the structure-preserving nature of CQL
data migrations protects those who publicly share data from the
misinterpretation of their data. Likewise, this feature of CQL migrations
allows those who draw from public data sources to be sure only data which meets
their specification will actually be transferred. We argue some open problems
in the field of data sharing in computational science are addressable by
working within this paradigm of functorial data migration. We demonstrate these
tools by integrating data from the Open Quantum Materials Database with some
alternative materials databases.
Comment: 10 pages, 5 figures
Engineering polymer informatics: Towards the computer-aided design of polymers
The computer-aided design of polymers is one of the holy grails of modern chemical
informatics and of significant interest for a number of communities in polymer
science. The paper outlines a vision for the in silico design of polymers and presents
an information model for polymers based on modern semantic web technologies, thus
laying the foundations for achieving the vision.
Estimating the relation of big data on business model innovation: a qualitative research
Gaining interdisciplinary attention across academia, the concept of Big Data also finds application in the business world. Realizing the potential of the trend, this research considers the impact of Big Data with a strategic perspective and by focusing on the following research question: How can data and data-driven decisions lead to business model innovation? Challenging the assumption that Big Data even has the potential to impact business models, this research firstly elaborates on the construct of business models and business model patterns. Subsequently, the Big Data concept is defined, by focusing on its unstructured and fast-moving nature. Considering the broad influence Big Data might have on business models, a qualitative research design is deemed appropriate to answer the research question: the analyses of semi-structured interviews with experts give insights about complex relations in the field of Big Data. For this research, 13 participants contributed their opinions on Big Data; among other things, they identify current methods and illustrate data visions for the future. One of the main findings of this research is that Big Data still imposes problems on managers, most of which are of an analytical, technical or cultural nature. At the same time, the agents that suffer from insufficient data analytics are invested in generating a data strategy that will facilitate data management. This research finds that data objects must be prioritized according to their utility, by means of data valuation. Associating a monetary value with data objects helps managers to commit to their decisions in data management. Furthermore, this research reveals that Big Data integration improves operations at various levels.
In an incremental instance, businesses can reduce costs or differentiate their product and service portfolio through Big Data integration. Furthermore, Big Data finds applications on a strategic level: this research detects that Big Data possesses the proficiency to facilitate all business model dimensions and even to create innovation. In conclusion, this master thesis contributes to the research field of Strategy & Innovation as it increases the theoretical understanding of Big Data and its integration in strategic decision making. It considers several related topics to assess the capability of data, by including the notions of data monetization and experience data. Furthermore, this thesis discloses novel case studies, which give evidence of the status quo of data integration across industries. By deriving propositions, this study serves as a valuable guideline for further research on data management and business model innovation.
Data Motility: The Materiality of Big Social Data
In this article, the author uses Foucault's largely overlooked but vital concept, the dispositif, in relation to the recent rise of mobility, explosion of data and proliferation of platforms and apps. With a focus on how data an individual generates increasingly moves autonomously of their control, he presents the dispositif of ‘data motility’ to develop a new materialist analysis of the digital human as a discursive and non-discursive assemblage.