IoT Data Imputation with Incremental Multiple Linear Regression
In this paper, we address the problem of missing data imputation in the IoT domain. More specifically, we propose an Incremental Space-Time-based Model (ISTM) for repairing missing values in real-time IoT data streams. ISTM is based on Incremental Multiple Linear Regression, which processes data as follows: upon data arrival, ISTM updates the model by re-reading the intermediary data matrix instead of accessing all historical information. If a missing value is detected, ISTM provides an estimate for it based on recent historical data and the observations of the sensors neighboring the affected one. Experiments conducted with real traffic data show the performance of ISTM in comparison with known techniques.
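The incremental update described in the abstract above can be illustrated with a generic sketch. It is not the authors' ISTM implementation: it assumes the "intermediary data matrix" can be represented by the running sums X^T X and X^T y of ordinary least squares, and the class name `IncrementalRegressor`, the tiny `ridge` term, and the feature layout (previous reading of the target sensor plus current readings of two neighbors) are hypothetical choices made for illustration only.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's matrices are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


class IncrementalRegressor:
    """Multiple linear regression that keeps only running sums, not the history."""

    def __init__(self, n_features, ridge=1e-8):
        d = n_features + 1  # one extra column for the intercept
        # Assumed stand-in for the intermediary matrices: X^T X and X^T y.
        # The small ridge term keeps the system solvable before enough data arrives.
        self.xtx = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
        self.xty = [0.0] * d

    def update(self, features, target):
        """Fold one complete observation into the running sums."""
        x = [1.0] + list(features)
        for i, xi in enumerate(x):
            self.xty[i] += xi * target
            for j, xj in enumerate(x):
                self.xtx[i][j] += xi * xj

    def predict(self, features):
        """Estimate a missing target from the current coefficients."""
        beta = solve(self.xtx, self.xty)
        x = [1.0] + list(features)
        return sum(b * xi for b, xi in zip(beta, x))


# Synthetic, noise-free readings: (previous value, neighbor A, neighbor B).
samples = [(10, 12, 9), (11, 13, 8), (9, 14, 10), (12, 11, 11), (13, 15, 7), (8, 10, 12)]
model = IncrementalRegressor(n_features=3)
for prev, a, b in samples:
    # Hypothetical ground-truth relation used only to generate the demo data.
    model.update((prev, a, b), 2.0 + 0.5 * prev + 0.3 * a + 0.2 * b)

# The target sensor's reading is missing; estimate it from the other inputs.
estimate = model.predict((10.5, 12.5, 9.5))
```

Each arriving observation costs a fixed amount of work that depends only on the number of features, which is the property the abstract emphasizes. A real deployment would also need a policy for discounting old data, which this sketch omits.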
Simultaneous Measurement Imputation and Outcome Prediction for Achilles Tendon Rupture Rehabilitation
Achilles Tendon Rupture (ATR) is one of the typical soft tissue injuries.
Rehabilitation after such a musculoskeletal injury remains a prolonged process
with a very variable outcome. Accurately predicting rehabilitation outcome is
crucial for treatment decision support. However, it is challenging to train an
automatic method for predicting the ATR rehabilitation outcome from treatment
data, due to a massive amount of missing entries in the data recorded from ATR
patients, as well as complex nonlinear relations between measurements and
outcomes. In this work, we design an end-to-end probabilistic framework to
impute missing data entries and predict rehabilitation outcomes simultaneously.
We evaluate our model on a real-life ATR clinical cohort, comparing with
various baselines. The proposed method demonstrates its clear superiority over
traditional methods which typically perform imputation and prediction in two
separate stages.
PicShark: mitigating metadata scarcity through large-scale P2P collaboration
With the commoditization of digital devices, personal information and media sharing is becoming a key application on the pervasive Web. In such a context, data annotation rather than data production is the main bottleneck. Metadata scarcity represents a major obstacle preventing efficient information processing in large and heterogeneous communities. However, social communities also open the door to new possibilities for addressing local metadata scarcity by taking advantage of global collections of resources. We propose to tackle the lack of metadata in large-scale distributed systems through a collaborative process leveraging both content and metadata. We develop a community-based and self-organizing system called PicShark in which information entropy (in terms of missing metadata) is gradually alleviated through decentralized instance and schema matching. Our approach focuses on semi-structured metadata and confines computationally expensive operations to the edge of the network, while keeping distributed operations as simple as possible to ensure scalability. PicShark builds on structured Peer-to-Peer networks for distributed look-up operations, but extends the application of self-organization principles to the propagation of metadata and the creation of schema mappings. We demonstrate the practical applicability of our method in an image sharing scenario and provide experimental evidence illustrating the validity of our approach.
The Computational Diet: A Review of Computational Methods Across Diet, Microbiome, and Health.
Food and human health are inextricably linked. As such, revolutionary impacts on health have been derived from advances in the production and distribution of food relating to food safety and fortification with micronutrients. During the past two decades, it has become apparent that the human microbiome has the potential to modulate health, including in ways that may be related to diet and the composition of specific foods. Despite the excitement and potential surrounding this area, fully understanding the complexity of the gut microbiome, the chemical composition of food, and their interplay in situ remains a daunting task. However, recent advances in high-throughput sequencing, metabolomics profiling, compositional analysis of food, and the emergence of electronic health records provide new sources of data that can contribute to addressing this challenge. Computational science will play an essential role in this effort as it will provide the foundation to integrate these data layers and derive insights capable of revealing and understanding the complex interactions between diet, gut microbiome, and health. Here, we review the current knowledge on diet-health-gut microbiota, relevant data sources, bioinformatics tools, machine learning capabilities, as well as the intellectual property and legislative regulatory landscape. We provide guidance on employing machine learning and data analytics, identify gaps in current methods, and describe new scenarios to be unlocked in the next few years in the context of current knowledge.
A Comprehensive Survey on Generative Diffusion Models for Structured Data
In recent years, generative diffusion models have achieved a rapid paradigm
shift in deep generative models by showing groundbreaking performance across
various applications. Meanwhile, structured data, encompassing tabular and time
series data, has received comparatively limited attention from the deep
learning research community, despite its omnipresence and extensive
applications. Thus, there is still a lack of literature, and of reviews of it, on
structured data modelling via diffusion models, compared to other data
modalities such as visual and textual data. To address this gap, we present a
comprehensive review of recently proposed diffusion models in the field of
structured data. First, this survey provides a concise overview of the
score-based diffusion model theory, subsequently proceeding to the technical
descriptions of the majority of pioneering works that used structured data in
both data-driven general tasks and domain-specific applications. Thereafter, we
analyse and discuss the limitations and challenges shown in existing works and
suggest potential research directions. We hope this review serves as a catalyst
for the research community, promoting developments in generative diffusion
models for structured data.
Comment: 20 pages, 1 figure, 2 tables
Towards Mobility Data Science (Vision Paper)
Mobility data captures the locations of moving objects such as humans,
animals, and cars. With the availability of GPS-equipped mobile devices and
other inexpensive location-tracking technologies, mobility data is collected
ubiquitously. In recent years, the use of mobility data has demonstrated
significant impact in various domains including traffic management, urban
planning, and health sciences. In this paper, we present the emerging domain of
mobility data science. Towards a unified approach to mobility data science, we
envision a pipeline having the following components: mobility data collection,
cleaning, analysis, management, and privacy. For each of these components, we
explain how mobility data science differs from general data science, we survey
the current state of the art and describe open challenges for the research
community in the coming years.