7,055 research outputs found

Intelligent Management and Efficient Operation of Big Data

Author: Batista Fernando
Cardoso Elsa
Moura Jose
Nunes Luis
Publication date: 01/01/2015

This chapter details how Big Data can be used and implemented in networking and computing infrastructures. Specifically, it addresses three main aspects: the timely extraction of relevant knowledge from heterogeneous and very often unstructured large data sources; the enhancement of the performance of the processing and networking (cloud) infrastructures that are the most important foundational pillars of Big Data applications or services; and novel ways to efficiently manage network infrastructures with high-level composed policies for supporting the transmission of large amounts of data with distinct requisites (video vs. non-video). A case study involving an intelligent management solution to route data traffic with diverse requirements in a wide area Internet Exchange Point is presented, discussed in the context of Big Data, and evaluated. Comment: In book Handbook of Research on Trends and Future Directions in Big Data and Web Intelligence, IGI Global, 201

arXiv.org e-Print Archive

Repositório Institucional do ISCTE-IUL

Information Extraction in Illicit Domains

Author: Banko M.
Bauer F.
Chakrabarti S.
Kushmerick N.
Mikolov T.
Sahlgren M.
Wick M.
Zouaq A.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 08/03/2017

Extracting useful entities and attribute values from illicit domains such as human trafficking is a challenging problem with the potential for widespread social impact. Such domains employ atypical language models, have 'long tails' and suffer from the problem of concept drift. In this paper, we propose a lightweight, feature-agnostic Information Extraction (IE) paradigm specifically designed for such domains. Our approach uses raw, unlabeled text from an initial corpus, and a few (12-120) seed annotations per domain-specific attribute, to learn robust IE models for unobserved pages and websites. Empirically, we demonstrate that our approach can outperform feature-centric Conditional Random Field baselines by over 18% F-Measure on five annotated sets of real-world human trafficking datasets in both low-supervision and high-supervision settings. We also show that our approach is demonstrably robust to concept drift, and can be efficiently bootstrapped even in a serial computing environment. Comment: 10 pages, ACM WWW 201

arXiv.org e-Print Archive

Crossref

Big Data in the Oil and Gas Industry: A Promising Courtship

Author: Tankimovich Michelle R.
Publication date: 01/05/2018

The energy industry remains one of the highest money-producing and investment industries in the world. The United States’ own economic stability depends greatly on the stability of oil and gas prices. Various factors affect the amount of money that will continue to be invested in producing oil. A main disadvantage to the oil and gas industry is its lack of technological adaptation. This weakens the industry because the surest measures are not currently being taken to produce oil in optimally efficient, safe, and cost-effective ways. Big data has gained global recognition as an opportunity to gather large volumes of information in real-time and translate data sets into actionable insights. In a low commodity price environment, saving time, reducing costs, and improving safety are crucial outcomes that can be realized using machine learning in oil and gas operations. Big data provides the opportunity to use unsupervised learning. For example, with this approach, engineers can predict oil wells’ optimal barrels of production given the completion data in a specific area. However, a caveat to utilizing big data in the oil and gas industry is that there simply is neither enough physical data nor data velocity in the industry to be properly referred to as “big data.” Big data, as it develops, will nonetheless significantly change the energy business in the future, as it already has in various other industries. Petroleum and Geosystems Engineerin

Texas ScholarWorks

Categorical Data Integration for Computational Science

Author: Brown Kristopher
Spivak David I.
Wisnesky Ryan
Publication date: 17/10/1911

Categorical Query Language (CQL) is an open-source query and data integration scripting language that can be applied to common challenges in the field of computational science. We discuss how the structure-preserving nature of CQL data migrations protects those who publicly share data from the misinterpretation of their data. Likewise, this feature of CQL migrations allows those who draw from public data sources to be sure that only data which meets their specification will actually be transferred. We argue that some open problems in the field of data sharing in computational science are addressable by working within this paradigm of functorial data migration. We demonstrate these tools by integrating data from the Open Quantum Materials Database with some alternative materials databases. Comment: 10 pages, 5 figure

arXiv.org e-Print Archive

Trinity College

Engineering polymer informatics: Towards the computer-aided design of polymers

Author: Adams
Ai
Berners-Lee
Bicerano
Blower
Brooksbank
Carrell
Chowdhury
Corbett
Cuchelkar
Davies
Degtyarenko
Elias
Feldman
Fleischmann
Frenkel
Frey
Gkoutos
Gordon
Hamoudeh
Herz
Holliday
Hoogenboom
Jenkins
Kanehisa
Kang
Kataoka
Keener
Lamport
Ma
Malmsten
Meier
Metanomski
Murray-Rust
Putnam
Rieder
Sankar
Schmaljohann
Service
Studer
Taylor
van der Vet
van Krevelen
Wagner
Weininger
Wiesbrock
Wilks
Wu
Zamora
Zhang
Publication venue: MACROMOL RAPID COMM
Publication date: 27/03/2008

The computer-aided design of polymers is one of the holy grails of modern chemical informatics and of significant interest for a number of communities in polymer science. The paper outlines a vision for the in silico design of polymers and presents an information model for polymers based on modern semantic web technologies, thus laying the foundations for achieving the vision

Crossref

Apollo (Cambridge)

Estimating the relation of big data on business model innovation: a qualitative research

Author: Montermann Anna Luisa
Publication date: 16/12/2019

Gaining interdisciplinary attention across academia, the concept of Big Data also finds application in the business world. Realizing the potential of the trend, this research considers the impact of Big Data from a strategic perspective, focusing on the following research question: How can data and data-driven decisions lead to business model innovation? Challenging the assumption that Big Data even has the potential to impact business models, this research first elaborates on the construct of business models and business model patterns. Subsequently, the Big Data concept is defined, focusing on its unstructured and fast-moving nature. Considering the broad influence Big Data might have on business models, a qualitative research design is deemed appropriate to answer the research question: the analyses of semi-structured interviews with experts give insights into complex relations in the field of Big Data. For this research, 13 participants contributed their opinions on Big Data; among other things, they identify current methods and illustrate data visions for the future. One of the main findings of this research is that Big Data still imposes problems on managers, most of which are of an analytical, technical or cultural nature. At the same time, the agents that suffer from insufficient data analytics are invested in generating a data strategy that will facilitate data management. This research defines that data objects must be prioritized according to their utility, by means of data valuation. Associating a monetary value with data objects helps managers to commit to their decisions in data management. Furthermore, this research reveals that Big Data integration improves operations at various levels.
In an incremental instance, businesses can reduce costs or differentiate their product and service portfolio through Big Data integration. Furthermore, Big Data finds applications on a strategic level: this research detects that Big Data possesses the proficiency to facilitate all business model dimensions and even to create innovation. In conclusion, this master thesis contributes to the research field of Strategy & Innovation, as it increases the theoretical understanding of Big Data and its integration in strategic decision making. It considers several related topics to assess the capability of data, by including the notions of data monetization and experience data. Furthermore, this thesis discloses novel case studies, which give evidence of the status quo of data integration across industries. By deriving propositions, this study serves as a valuable guideline for further research on data management and business model innovation

Repositório da Universidade Nova de Lisboa

Data Motility: The Materiality of Big Social Data

Author: Coté Mark
Publication venue: 'University of Technology, Sydney (UTS)'
Publication date: 01/10/2013

In this article, the author uses Foucault's largely overlooked but vital concept, the dispositif, in relation to the recent rise of mobility, explosion of data and proliferation of platforms and apps. With a focus on how data an individual generates increasingly moves autonomously of their control, he presents the dispositif of ‘data motility’ to develop a new materialist analysis of the digital human as a discursive and non-discursive assemblage

Directory of Open Access Journals

King's Research Portal

UTS ePress