Statistical structures for internet-scale data management
Efficient query processing in traditional database management systems relies on statistics on base data. For centralized systems, there is a rich body of research results on such statistics, from simple aggregates to more elaborate synopses such as sketches and histograms. For Internet-scale distributed systems, on the other hand, statistics management still poses major challenges. With the work in this paper we aim to endow peer-to-peer data management over structured overlays with the power associated with such statistical information, with emphasis on meeting the scalability challenge. To this end, we first contribute efficient, accurate, and decentralized algorithms that can compute key aggregates such as Count, CountDistinct, Sum, and Average. We show how to construct several types of histograms, such as simple Equi-Width, Average-Shifted Equi-Width, and Equi-Depth histograms. We present a full-fledged open-source implementation of these tools for distributed statistical synopses, and report on a comprehensive experimental evaluation of our contributions in terms of efficiency, accuracy, and scalability.
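The decentralized Average aggregate mentioned above can be illustrated with a generic gossip scheme. The sketch below uses push-sum averaging over simulated peers; it is a minimal illustration of decentralized aggregation under the assumption of uniform random peer sampling, not the paper's DHT-based algorithms, and all names are hypothetical.

```python
import random

def push_sum(values, rounds=60, seed=0):
    """Simulate push-sum gossip: each peer halves its (sum, weight) pair,
    keeps one half, and sends the other to a random peer; the ratio
    sum/weight at every peer converges to the global average."""
    rng = random.Random(seed)
    n = len(values)
    s = list(map(float, values))  # running sums, one per peer
    w = [1.0] * n                 # running weights, one per peer
    for _ in range(rounds):
        inbox = [[] for _ in range(n)]
        for i in range(n):
            half_s, half_w = s[i] / 2, w[i] / 2
            inbox[i].append((half_s, half_w))        # keep half locally
            j = rng.randrange(n)                     # push half to a random peer
            inbox[j].append((half_s, half_w))
        for i in range(n):
            s[i] = sum(p[0] for p in inbox[i])
            w[i] = sum(p[1] for p in inbox[i])
    return [s[i] / w[i] for i in range(n)]

estimates = push_sum([10, 20, 30, 40])  # every peer's estimate approaches 25
```

Because the total sum and total weight are conserved each round, every local ratio converges to the true average without any coordinator.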
BiDAl: Big Data Analyzer for Cluster Traces
Modern data centers that provide Internet-scale services are stadium-size structures housing tens of thousands of heterogeneous devices (server clusters, networking equipment, power and cooling infrastructures) that must operate continuously and reliably. As part of their operation, these devices produce large amounts of data in the form of event and error logs that are essential not only for identifying problems but also for improving data center efficiency and management. These activities employ data analytics and often exploit hidden statistical patterns and correlations among different factors present in the data. Uncovering these patterns and correlations is challenging due to the sheer volume of data to be analyzed. This paper presents BiDAl, a prototype “log-data analysis framework” that incorporates various Big Data technologies to simplify the analysis of data traces from large clusters. BiDAl is written in Java with a modular and extensible architecture so that different storage backends (currently, HDFS and SQLite are supported), as well as different analysis languages (the current implementation supports SQL, R, and Hadoop MapReduce), can be easily selected as appropriate. We present the design of BiDAl and describe our experience using it to analyze several public traces of Google data clusters for building a simulation model capable of reproducing observed behavior.
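The pluggable-backend architecture described for BiDAl can be sketched in miniature. The code below is a hypothetical illustration of the design idea (an abstract storage interface with an SQLite implementation behind it), not BiDAl's actual Java interfaces; all class and method names are invented, and Python stands in for Java.

```python
import sqlite3
from abc import ABC, abstractmethod

class StorageBackend(ABC):
    """Minimal interface an analyzer could program against, so that
    backends (e.g. SQLite, HDFS) are interchangeable."""
    @abstractmethod
    def load(self, table, rows): ...
    @abstractmethod
    def query(self, sql): ...

class SQLiteBackend(StorageBackend):
    def __init__(self):
        self.db = sqlite3.connect(":memory:")

    def load(self, table, rows):
        width = len(rows[0])
        cols = ", ".join(f"c{i}" for i in range(width))
        self.db.execute(f"CREATE TABLE {table} ({cols})")
        placeholders = ", ".join("?" * width)
        self.db.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)

    def query(self, sql):
        return self.db.execute(sql).fetchall()

# A cluster-trace style task: mean task runtime per job, expressed as SQL.
backend = SQLiteBackend()
backend.load("tasks", [("job1", 10.0), ("job1", 30.0), ("job2", 5.0)])
result = backend.query("SELECT c0, AVG(c1) FROM tasks GROUP BY c0 ORDER BY c0")
```

Swapping in a different `StorageBackend` subclass would leave the analysis query untouched, which is the extensibility property the abstract highlights.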
Organic Design of Massively Distributed Systems: A Complex Networks Perspective
The vision of Organic Computing addresses challenges that arise in the design
of future information systems that are comprised of numerous, heterogeneous,
resource-constrained and error-prone components or devices. Here, the notion
organic particularly highlights the idea that, in order to be manageable, such
systems should exhibit self-organization, self-adaptation and self-healing
characteristics similar to those of biological systems. In recent years, the
principles underlying many of the interesting characteristics of natural
systems have been investigated from the perspective of complex systems science,
particularly using the conceptual framework of statistical physics and
statistical mechanics. In this article, we review some of the interesting
relations between statistical physics and networked systems and discuss
applications in the engineering of organic networked computing systems with
predictable, quantifiable and controllable self-* properties. Comment: 17
pages, 14 figures, preprint of submission to Informatik-Spektrum, published by
Springer
Updates in metabolomics tools and resources: 2014-2015
Data processing and interpretation represent the most challenging and time-consuming steps in high-throughput metabolomic experiments, regardless of the analytical platforms (MS or NMR spectroscopy based) used for data acquisition. Improved machinery in metabolomics generates increasingly complex datasets that create the need for more and better processing and analysis software and in silico approaches to understand the resulting data. However, a comprehensive source of information describing the utility of the most recently developed and released metabolomics resources—in the form of tools, software, and databases—is currently lacking. Thus, here we provide an overview of freely available and open-source tools, algorithms, and frameworks to make both upcoming and established metabolomics researchers aware of the recent developments in an attempt to advance and facilitate data processing workflows in their metabolomics research. The major topics include tools and resources for data processing, data annotation, and data visualization in MS and NMR-based metabolomics. Most of the tools described in this review are dedicated to untargeted metabolomics workflows; however, some more specialist tools are described as well. All tools and resources described, including their analytical and computational platform dependencies, are summarized in an overview table.
Models of everywhere revisited: a technological perspective
The concept ‘models of everywhere’ was first introduced in the mid 2000s as a means of reasoning about the
environmental science of a place, changing the nature of the underlying modelling process, from one in which
general model structures are used to one in which modelling becomes a learning process about specific places, in
particular capturing the idiosyncrasies of that place. At one level, this is a straightforward concept, but at another
it is a rich multi-dimensional conceptual framework involving the following key dimensions: models of everywhere,
models of everything and models at all times, being constantly re-evaluated against the most current
evidence. This is a compelling approach with the potential to deal with epistemic uncertainties and nonlinearities.
However, the approach has, as yet, not been fully utilised or explored. This paper examines the
concept of models of everywhere in the light of recent advances in technology. The paper argues that, when first
proposed, technology was a limiting factor but now, with advances in areas such as Internet of Things, cloud
computing and data analytics, many of the barriers have been alleviated. Consequently, it is timely to look again
at the concept of models of everywhere in practical conditions as part of a trans-disciplinary effort to tackle the
remaining research questions. The paper concludes by identifying the key elements of a research agenda that
should underpin such experimentation and deployment.
Measurement Invariance of the Internet Addiction Test Among Hong Kong, Japanese, and Malaysian Adolescents
There has been increased research examining the psychometric properties of the Internet Addiction Test (IAT) across different ages and populations. This population-based study examined the psychometric properties, using Confirmatory Factor Analysis, and the measurement invariance, using Item Response Theory (IRT), of the IAT in adolescents from three Asian countries. In the Asian Adolescent Risk Behavior Survey (AARBS), 2,535 secondary school students (55.91% girls) in Grade 7 to Grade 13 (mean age = 15.61 years; SD = 1.56) from Hong Kong (n=844), Japan (n=744), and Malaysia (n=947) completed a survey on their Internet use that incorporated the IAT scale. A nested hierarchy of hypotheses concerning IAT cross-country invariance was tested using multi-group confirmatory factor analysis. Replicating past findings in Hong Kong adolescents, the IAT construct was best represented by a second-order three-factor structure in Malaysian and Japanese adolescents. Configural, metric, scalar, and partial strict factorial invariance was established across the three samples. No cross-country differences in Internet addiction were detected at the latent mean level. This study provides empirical support for the IAT as a reliable, factorially stable instrument that is valid for use across Asian adolescent populations.
Understanding User Behavioral Intention to Adopt a Search Engine that Promotes Sustainable Water Management
An increase in users’ online searches, the social concern for an efficient management of resources such as water, and the appearance of more and more digital platforms for sustainable purposes to conduct online searches lead us to reflect more on the users’ behavioral intention with respect to search engines that support sustainable projects like water management projects. Another issue to consider is the factors that determine the adoption of such search engines. In the present study, we aim to identify the factors that determine the intention to adopt a search engine, such as Lilo, that favors sustainable water management. To this end, a model based on the Theory of Planned Behavior (TPB) is proposed. The methodology used is Structural Equation Modeling (SEM) analysis with the Analysis of Moment Structures (AMOS). The results demonstrate that individuals who intend to use a search engine are influenced by hedonic motivations, which drive their feeling of contentment with the search. Similarly, the success of search engines is found to be closely related to the ability a search engine grants to its users to generate a social or environmental impact, rather than users’ trust in what they do or in their results. However, according to our results, habit is also an important factor that has both a direct and an indirect impact on users’ behavioral intention to adopt different search engines.
Mathematics and the Internet: A Source of Enormous Confusion and Great Potential
Graph theory models the Internet mathematically, and a number of plausible mathematically interesting network models for the Internet have been developed and studied. Simultaneously, Internet researchers have developed methodology to use real data to validate, or invalidate, proposed Internet models. The authors look at these parallel developments, particularly as they apply to scale-free network models of the preferential attachment type.
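The preferential-attachment mechanism named above can be sketched directly. The following is a minimal implementation of the standard Barabási–Albert growth process (each arriving node links to existing nodes with probability proportional to their degree), offered as generic background rather than as any model analyzed by the authors.

```python
import random

def barabasi_albert(n, m, seed=0):
    """Grow a graph to n nodes; each arriving node attaches m edges to
    existing nodes chosen with probability proportional to degree."""
    rng = random.Random(seed)
    edges = set()
    # repeated-nodes list: a node appears once per unit of degree, so a
    # uniform draw from this list is a degree-proportional draw
    repeated = []
    targets = list(range(m))          # the first arrival links to a seed set
    for new in range(m, n):
        for t in targets:
            edges.add((t, new))
        repeated.extend(targets)
        repeated.extend([new] * m)
        # choose m distinct, degree-biased targets for the next arrival
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(repeated))
        targets = list(chosen)
    return edges

g = barabasi_albert(200, 2)  # (n - m) * m = 396 edges
```

The degree-biased sampling produces the heavy-tailed degree distributions that make such models "scale-free", which is precisely the property the validation methodology in the article puts to the test against real Internet data.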
Challenges in Complex Systems Science
FuturICT foundations are social science, complex systems science, and ICT.
The main concerns and challenges in the science of complex systems in the
context of FuturICT are laid out in this paper with special emphasis on the
Complex Systems route to Social Sciences. These include complex systems having:
many heterogeneous interacting parts; multiple scales; complicated transition
laws; unexpected or unpredicted emergence; sensitive dependence on initial
conditions; path-dependent dynamics; networked hierarchical connectivities;
interaction of autonomous agents; self-organisation; non-equilibrium dynamics;
combinatorial explosion; adaptivity to changing environments; co-evolving
subsystems; ill-defined boundaries; and multilevel dynamics. In this context,
science is seen as the process of abstracting the dynamics of systems from
data. This presents many challenges including: data gathering by large-scale
experiment, participatory sensing and social computation, managing huge
distributed dynamic and heterogeneous databases; moving from data to dynamical
models, going beyond correlations to cause-effect relationships, understanding
the relationship between simple and comprehensive models with appropriate
choices of variables, ensemble modeling and data assimilation, modeling systems
of systems of systems with many levels between micro and macro; and formulating
new approaches to prediction, forecasting, and risk, especially in systems that
can reflect on and change their behaviour in response to predictions, and
systems whose apparently predictable behaviour is disrupted by apparently
unpredictable rare or extreme events. These challenges are part of the FuturICT
agenda.
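One of the characteristics listed above, sensitive dependence on initial conditions, is easy to demonstrate concretely. The sketch below uses the chaotic logistic map as a standard textbook example (it is not drawn from the FuturICT paper): two trajectories that start a billionth apart diverge to macroscopic separation within a few dozen steps.

```python
def logistic_trajectory(x0, r=4.0, steps=50):
    """Iterate the logistic map x -> r*x*(1-x), chaotic at r = 4."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

a = logistic_trajectory(0.2)
b = logistic_trajectory(0.2 + 1e-9)   # perturbed by one part in a billion
divergence = [abs(x - y) for x, y in zip(a, b)]
# divergence starts at 1e-9 and grows to order 1 within ~30 steps
```

The initial gap roughly doubles per iteration (the map's Lyapunov exponent is ln 2), which is why long-range prediction in such systems fails even with near-perfect initial data.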
Context Aware Computing for The Internet of Things: A Survey
As we are moving towards the Internet of Things (IoT), the number of sensors
deployed around the world is growing at a rapid pace. Market research has shown
a significant growth of sensor deployments over the past decade and has
predicted a significant increment of the growth rate in the future. These
sensors continuously generate enormous amounts of data. However, in order to
add value to raw sensor data, we need to understand it. The collection, modelling,
reasoning, and distribution of context in relation to sensor data play a
critical role in this challenge. Context-aware computing has proven to be
successful in understanding sensor data. In this paper, we survey context
awareness from an IoT perspective. We present the necessary background by
introducing the IoT paradigm and context-aware fundamentals at the beginning.
Then we provide an in-depth analysis of context life cycle. We evaluate a
subset of projects (50) which represent the majority of research and commercial
solutions proposed in the field of context-aware computing conducted over the
last decade (2001-2011) based on our own taxonomy. Finally, based on our
evaluation, we highlight the lessons to be learnt from the past and some
possible directions for future research. The survey addresses a broad range of
techniques, methods, models, functionalities, systems, applications, and
middleware solutions related to context awareness and IoT. Our goal is not only
to analyse, compare and consolidate past research work but also to appreciate
their findings and discuss their applicability towards the IoT. Comment: IEEE Communications Surveys & Tutorials Journal, 201
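The context life cycle the survey analyzes (acquisition, modelling, reasoning, distribution) can be pictured as a toy pipeline. Everything below is a hypothetical sketch under the assumption of a simple key-value context model and rule-based reasoning; none of the names come from the survey or any real middleware.

```python
from dataclasses import dataclass

@dataclass
class ContextItem:
    """Modelling phase: a raw reading wrapped with provenance metadata."""
    source: str
    kind: str
    value: float

def acquire(raw_readings):
    """Acquisition phase: turn raw (source, kind, value) tuples into context."""
    return [ContextItem(source=s, kind=k, value=v) for s, k, v in raw_readings]

def reason(items):
    """Reasoning phase: derive higher-level context from low-level readings."""
    temps = [i.value for i in items if i.kind == "temperature"]
    avg = sum(temps) / len(temps)
    return {"avg_temperature": avg, "overheating": avg > 30.0}

def distribute(context, subscribers):
    """Distribution phase: push derived context to interested consumers."""
    return {name: callback(context) for name, callback in subscribers.items()}

items = acquire([("s1", "temperature", 28.0), ("s2", "temperature", 36.0)])
ctx = reason(items)
alerts = distribute(ctx, {"hvac": lambda c: "cool" if c["overheating"] else "idle"})
```

Even this toy version shows why the survey treats the phases separately: each can be swapped independently (richer ontology-based modelling, probabilistic reasoning, publish/subscribe distribution) without disturbing the others.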