1,714 research outputs found
Adversarial Unsupervised Representation Learning for Activity Time-Series
Sufficient physical activity and restful sleep play a major role in the
prevention and cure of many chronic conditions. Being able to proactively
screen and monitor such chronic conditions would be a big step forward for
overall health. The rapid increase in the popularity of wearable devices
provides a significant new source of data, making it possible to track the user's
lifestyle in real time. In this paper, we propose a novel unsupervised
representation learning technique called activity2vec that learns and
"summarizes" the discrete-valued activity time-series. It learns the
representations with three components: (i) the co-occurrence and magnitude of
the activity levels in a time-segment, (ii) neighboring context of the
time-segment, and (iii) promoting subject-invariance with adversarial training.
We evaluate our method on four disorder prediction tasks using linear
classifiers. Empirical evaluation demonstrates that our proposed method scales
and performs better than many strong baselines. The adversarial regime helps
improve the generalizability of our representations by promoting subject
invariant features. We also show that using the representations at the level of
a day works the best since human activity is structured in terms of daily
routines.
Comment: Accepted at AAAI'19. arXiv admin note: text overlap with
arXiv:1712.0952
A Transformer-based Framework For Multi-variate Time Series: A Remaining Useful Life Prediction Use Case
In recent times, Large Language Models (LLMs) have captured a global
spotlight and revolutionized the field of Natural Language Processing. One of
the factors attributed to the effectiveness of LLMs is the model architecture
used for training, the transformer. Transformer models excel at capturing
contextual features in sequential data. Since time series data are sequential,
transformer models can be leveraged for more efficient time series data
prediction. The field of prognostics is vital to system health management and
proper maintenance planning. A reliable estimation of the remaining useful life
(RUL) of machines holds the potential for substantial cost savings. This
includes avoiding abrupt machine failures, maximizing equipment usage, and
serving as a decision support system (DSS). This work proposed an
encoder-transformer architecture-based framework for multivariate time series
prediction for a prognostics use case. We validated the effectiveness of the
proposed framework on all four sets of the C-MAPSS benchmark dataset for the
remaining useful life prediction task. To effectively transfer the knowledge
and application of transformers from the natural language domain to time
series, three model-specific experiments were conducted. Also, to make the
model aware of the initial stages of the machine life and its degradation
path, a novel expanding window method was proposed in this work. Compared
with the sliding window method, it led to a large
improvement in the performance of the encoder transformer model. Finally, the
performance of the proposed encoder-transformer model was evaluated on the test
dataset and compared with the results from 13 other state-of-the-art (SOTA)
models in the literature. It outperformed them all, with an average
performance increase of 137.65% over the next best model across all the
datasets.
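The abstract contrasts a sliding window with an expanding window but does not give the construction details. A minimal sketch of the general idea, assuming a univariate sensor series and hypothetical helper names (`sliding_windows`, `expanding_windows`) that are not taken from the paper:

```python
from typing import List, Sequence


def sliding_windows(series: Sequence[float], width: int) -> List[List[float]]:
    """Fixed-width windows: each sample sees only the last `width` time steps."""
    return [list(series[end - width:end]) for end in range(width, len(series) + 1)]


def expanding_windows(series: Sequence[float], min_width: int) -> List[List[float]]:
    """Growing windows: each sample starts at t=0, so early life stays visible."""
    return [list(series[:end]) for end in range(min_width, len(series) + 1)]


# Toy degradation signal (illustrative values only, not from C-MAPSS).
readings = [0.9, 0.8, 0.8, 0.6, 0.3]
print(sliding_windows(readings, 3))
print(expanding_windows(readings, 3))
```

Expanding windows have variable length, so a model consuming them would typically need padding and masking; how the paper handles this is not stated in the abstract.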
Temporal decision making using unsupervised learning
With the explosion of ubiquitous continuous sensing, on-line streaming clustering continues to attract attention. The requirements are that the streaming clustering algorithm recognize and adapt clusters as the data evolves, that anomalies are detected, and that new clusters are automatically formed as incoming data dictate. In this dissertation, we develop a streaming clustering algorithm, MU Streaming Clustering (MUSC), that is based on coupling a Gaussian mixture model (GMM) with possibilistic clustering to build an adaptive system for analyzing streaming multi-dimensional activity feature vectors. To this end, the possibilistic C-Means (PCM) and Automatic Merging Possibilistic Clustering Method (AMPCM) are combined to cluster the initial data points, detect anomalies and initialize the GMM. MUSC achieves our goals when tested on synthetic and real-life datasets. We also compare MUSC's performance with Sequential k-means (sk-means), Basic Sequential Clustering Algorithm (BSAS), and Modified BSAS (MBSAS), where MUSC shows superior performance and accuracy. The performance of a streaming clustering algorithm needs to be monitored over time to understand the behavior of the streaming data in terms of new emerging clusters and number of outlier data points. Incremental internal cluster validity indices (iCVIs) are used to monitor the performance of an on-line clustering algorithm. We study the internal incremental Davies-Bouldin (DB), Xie-Beni (XB), and Dunn internal cluster validity indices in the context of streaming data analysis. We extend the original incremental DB (iDB) to a more general version parameterized by the exponent of membership weights. Then we illustrate how the iDB can be used to analyze and understand the performance of the MUSC algorithm.
We give examples that illustrate the appearance of a new cluster, the effect of different cluster sizes, handling of outlier data samples, and the effect of the input order on the resultant cluster history. In addition, we investigate the internal incremental Davies-Bouldin (iDB) cluster validity index in the context of big streaming data analysis. We analyze the effect of large numbers of samples on the values of the iCVI (iDB). We also develop online versions of two modified generalized Dunn's indices that can be used for dynamic evaluation of evolving (cluster) structure in streaming data. We argue that this method is a good way to monitor the ongoing performance of online clustering algorithms and we illustrate several types of inferences that can be drawn from such indices. We compare the two new indices to the incremental Xie-Beni and Davies-Bouldin indices, which to our knowledge offer the only comparable approach, with numerical examples on a variety of synthetic and real data sets. We also study the performance of MUSC and iCVIs with big streaming data applications. We show the advantage of iCVIs in monitoring large streaming datasets and in providing useful information about the data stream in terms of emergence of a new structure, amount of outlier data, size of the clusters, and order of data samples in each cluster. We also propose a way to project streaming data into a lower space for cases where the distance measure does not perform as expected in the high dimensional space. Another example of streaming data is the activity data coming from TigerPlace and other elderly residents' apartments in and around Columbia, MO. TigerPlace is an eldercare facility that promotes aging-in-place in Columbia, Missouri. Eldercare monitoring using non-wearable sensors is a candidate solution for improving care and reducing costs. Abnormal sensor patterns produced by certain resident behaviors could be linked to early signs of illness.
We propose an unsupervised method for detecting abnormal behavior patterns based on a new context preserving representation of daily activities. A preliminary analysis of the method was conducted on data collected in TigerPlace. Sensor firings of each day are converted into sequences of daily activities. Then, building a histogram from the daily sequences of a resident, we generate a single data vector representing that day. Using the proposed method, a day with hundreds of sequences is converted into a single data point representing that day and preserving the context of the daily routine at the same time. We obtained an average Area Under the Curve (AUC) of 0.9 in detecting days where older adults need to be assessed. Our approach outperforms other approaches on the same dataset. Using the context preserving representation, we developed a multi-dimensional alert system to improve the existing single-dimensional alert system in TigerPlace. Also, this representation is used to develop a framework that utilizes sensor sequence similarity and medical concepts extracted from the EHR to automatically inform the nursing staff when health problems are detected. Our context preserving representation of daily activities is used to measure the similarity between the sensor sequences of different days. The medical concepts are extracted from the nursing notes using MetaMapLite, an NLP tool included in the Unified Medical Language System (UMLS). The proposed idea is validated on two pilot datasets from twelve TigerPlace residents, with a total of 5810 sensor days out of which 1966 had nursing notes.
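The day-level representation described above (a histogram built from a resident's daily activity sequences) can be illustrated with a small sketch. This is an assumption-laden reading of the abstract, not the dissertation's implementation: the activity labels, the choice to treat each distinct sequence as one histogram bin, and the helper names `build_vocabulary` and `day_vector` are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

ActivitySeq = Tuple[str, ...]  # one sequence of activity labels within a day


def build_vocabulary(days: List[List[ActivitySeq]]) -> Dict[ActivitySeq, int]:
    """Assign a histogram bin to every distinct sequence seen across all days."""
    distinct = sorted({seq for day in days for seq in day})
    return {seq: i for i, seq in enumerate(distinct)}


def day_vector(day: List[ActivitySeq], vocab: Dict[ActivitySeq, int]) -> List[float]:
    """Normalized histogram of a day's sequences: one fixed-length vector per day."""
    counts = Counter(seq for seq in day if seq in vocab)
    total = sum(counts.values())
    vec = [0.0] * len(vocab)
    if total == 0:
        return vec  # no known sequences that day
    for seq, n in counts.items():
        vec[vocab[seq]] = n / total
    return vec


# Toy data with made-up activity labels (hypothetical, not TigerPlace data).
days = [
    [("bed", "bath"), ("kitchen", "living"), ("bed", "bath")],
    [("kitchen", "living"), ("kitchen", "living"),
     ("bath", "bed", "bath"), ("kitchen", "living")],
]
vocab = build_vocabulary(days)
vectors = [day_vector(day, vocab) for day in days]
print(vectors)
```

Because each bin is a whole sequence rather than a single sensor firing, the order of activities inside a sequence is kept, which is one plausible sense of "context preserving"; days can then be compared with any vector distance.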
State-of-the-art on research and applications of machine learning in the building life cycle
Fueled by big data, powerful and affordable computing resources, and advanced algorithms, machine learning has been explored and applied to buildings research for the past decades and has demonstrated its potential to enhance building performance. This study systematically surveyed how machine learning has been applied at different stages of building life cycle. By conducting a literature search on the Web of Knowledge platform, we found 9579 papers in this field and selected 153 papers for an in-depth review. The number of published papers is increasing year by year, with a focus on building design, operation, and control. However, no study was found using machine learning in building commissioning. There are successful pilot studies on fault detection and diagnosis of HVAC equipment and systems, load prediction, energy baseline estimate, load shape clustering, occupancy prediction, and learning occupant behaviors and energy use patterns. None of the existing studies were adopted broadly by the building industry, due to common challenges including (1) lack of large scale labeled data to train and validate the model, (2) lack of model transferability, which limits a model trained with one data-rich building to be used in another building with limited data, (3) lack of strong justification of costs and benefits of deploying machine learning, and (4) the performance might not be reliable and robust for the stated goals, as the method might work for some buildings but could not be generalized to others. Findings from the study can inform future machine learning research to improve occupant comfort, energy efficiency, demand flexibility, and resilience of buildings, as well as to inspire young researchers in the field to explore multidisciplinary approaches that integrate building science, computing science, data science, and social science
Unveiling the frontiers of deep learning: innovations shaping diverse domains
Deep learning (DL) enables the development of computer models that are
capable of learning, visualizing, optimizing, refining, and predicting data. In
recent years, DL has been applied in a range of fields, including audio-visual
data processing, agriculture, transportation prediction, natural language,
biomedicine, disaster management, bioinformatics, drug design, genomics, face
recognition, and ecology. To explore the current state of deep learning, it is
necessary to investigate the latest developments and applications of deep
learning in these disciplines. However, the literature is lacking in exploring
the applications of deep learning in all potential sectors. This paper thus
extensively investigates the potential applications of deep learning across all
major fields of study as well as the associated benefits and challenges. As
evidenced in the literature, DL exhibits accuracy in prediction and analysis,
which makes it a powerful computational tool, and has the ability to articulate
itself and optimize, making it effective in processing data with no prior
training. Given its independence from training data, deep learning necessitates
massive amounts of data for effective analysis and processing, much like data
volume. To handle the challenge of compiling huge amounts of medical,
scientific, healthcare, and environmental data for use in deep learning, gated
architectures like LSTMs and GRUs can be utilized. For multimodal learning,
shared neurons in the neural network for all activities and specialized neurons
for particular tasks are necessary.
Comment: 64 pages, 3 figures, 3 tables
Machine learning with limited label availability: algorithms and applications
The abstract is in the attachment.
Embedding-based real-time change point detection with application to activity segmentation in smart home time series data
[EN] Human activity recognition systems are essential to enable many assistive applications. Those systems can be sensor-based or vision-based. When sensor-based systems are deployed in real environments, they must segment sensor data streams on the fly in order to extract features and recognize the ongoing activities. This segmentation can be done with different approaches. One effective approach is to employ change point detection (CPD) algorithms to detect activity transitions (i.e., determine when activities start and end). In this paper, we present a novel real-time CPD method to perform activity segmentation, where neural embeddings (vectors of continuous numbers) are used to represent sensor events. Through empirical evaluation with 3 publicly available benchmark datasets, we conclude that our method is useful for segmenting sensor data, offering significantly better performance than state-of-the-art algorithms on two of them. In addition, we propose the use of retrofitting, a graph-based technique, to adjust the embeddings and introduce expert knowledge in the activity segmentation task, showing empirically that it can improve the performance of our method using three graphs generated from two sources of information. Finally, we discuss the advantages of our approach regarding computational cost, manual effort reduction (no need for hand-crafted features) and cross-environment possibilities (transfer learning) in comparison to others. This work was carried out with the financial support of FuturAALEgo (RTI2018-101045-A-C22) granted by the Spanish Ministry of Science, Innovation and Universities.
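The abstract does not say how embeddings are turned into change point scores. One simple way to picture embedding-based CPD, offered only as an assumed sketch and not as the paper's algorithm, is to compare the mean embedding of the events just before a position with the mean embedding of the events just after it. The sensor names, the 2-d vectors, the `window` and `threshold` values, and the function names below are all hypothetical.

```python
import math
from typing import Dict, List, Sequence


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of equally sized vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def change_points(events: List[str], embeddings: Dict[str, List[float]],
                  window: int = 2, threshold: float = 0.5) -> List[int]:
    """Flag index i when the windows before and after i differ strongly."""
    vecs = [embeddings[e] for e in events]
    flagged = []
    for i in range(window, len(vecs) - window + 1):
        before = mean_vector(vecs[i - window:i])
        after = mean_vector(vecs[i:i + window])
        if cosine_distance(before, after) > threshold:
            flagged.append(i)
    return flagged


# Toy embeddings: kitchen sensors point one way, bedroom sensors another.
embeddings = {
    "kitchen_motion": [1.0, 0.0], "fridge_door": [0.9, 0.1],
    "bed_pressure": [0.0, 1.0], "bedroom_motion": [0.1, 0.9],
}
events = ["kitchen_motion", "fridge_door", "kitchen_motion",
          "bed_pressure", "bedroom_motion", "bed_pressure"]
print(change_points(events, embeddings))
```

In this toy stream the only flagged position is the first bedroom event. A scheme like this needs only `window` future events before deciding, which is why it can run close to real time.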