14 research outputs found
Learning Behavioral Representations of Routines From Large-scale Unlabeled Wearable Time-series Data Streams using Hawkes Point Process
Continuously-worn wearable sensors enable researchers to collect copious
amounts of rich bio-behavioral time series recordings of real-life activities
of daily living, offering unprecedented opportunities to infer novel human
behavior patterns during daily routines. Existing approaches to routine
discovery through bio-behavioral data rely either on pre-defined notions of
activities or use additional non-behavioral measurements as contexts, such as
GPS location or localization within the home, presenting risks to user privacy.
In this work, we propose a novel wearable time-series mining framework, Hawkes
point process On Time series clusters for ROutine Discovery (HOT-ROD), for
uncovering behavioral routines from completely unlabeled wearable recordings.
We utilize a covariance-based method to generate time-series clusters and
discover routines via the Hawkes point process learning algorithm. We
empirically validate our approach for extracting routine behaviors using
completely unlabeled time series collected continuously from over 100
individuals both in and outside of the workplace during a period of ten weeks.
Furthermore, we demonstrate that this approach intuitively captures daily
transitional relationships between physical activity states without using prior
knowledge. We also show that the learned behavioral patterns can assist in
illuminating an individual's personality and affect.
Comment: 2023 9th ACM SIGKDD International Workshop on Mining and Learning From Time Series (MiLeTS 2023).
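The core modeling device here, the Hawkes point process, can be sketched in a few lines: each past event temporarily raises the probability of future events through a self-exciting intensity. A minimal illustrative version with an exponential kernel follows; the parameter values and event times are hypothetical, not those learned by HOT-ROD.

```python
import math

def hawkes_intensity(t, events, mu=0.2, alpha=0.8, beta=1.0):
    """Conditional intensity of a univariate Hawkes process with an exponential
    kernel: lambda(t) = mu + sum over past events of alpha * exp(-beta * (t - t_i)).
    Each past event temporarily raises the rate of new events (self-excitation)."""
    return mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events if ti < t)

# Hypothetical times (in hours) at which a person enters a "walking" cluster
events = [1.0, 2.5, 2.7]
quiet = hawkes_intensity(0.5, events)  # before any event: just the baseline mu
burst = hawkes_intensity(2.8, events)  # shortly after a burst of transitions
```

Fitting mu, alpha, and beta by maximum likelihood over the cluster-transition times is what recovers routine structure; before any event the intensity equals the baseline, and it spikes right after clustered transitions.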
Anomaly Detection in Medical Time Series with Generative Adversarial Networks: A Selective Review
Anomaly detection in medical data is often of critical importance, from diagnosing and potentially localizing disease processes such as epilepsy to detecting and preventing fatal events such as cardiac arrhythmias. Generative adversarial networks (GANs) have shown promise in various applications since their inception, proving effective in cybersecurity, data denoising, and data augmentation, and have more recently found a potentially important place in the detection of anomalies in medical time series. This chapter provides a selective review of this novel use of GANs, in the process highlighting the nature of anomalies in time series, special challenges related to medical time series, and some general issues in approaching time series anomaly detection with deep learning. We cover the most frequently applied GAN models and briefly detail the current landscape of applying GANs to anomaly detection in two commonly used medical time series, electrocardiography (ECG) and electroencephalography (EEG).
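One common way GANs are used for time-series anomaly detection (e.g. in AnoGAN-style methods) is to score a window by how poorly the generator can reconstruct it, optionally combined with a discriminator feature residual. The sketch below shows only the scoring step, with hand-picked stand-in values instead of trained networks; it is an illustration of the idea, not any specific reviewed method.

```python
def anomaly_score(x, x_rec, d_feat_x, d_feat_rec, lam=0.1):
    """AnoGAN-style anomaly score: a weighted sum of the reconstruction residual
    (how far the generator's best reconstruction is from the input) and the
    discriminator feature residual, both measured as L1 distances."""
    residual = sum(abs(a - b) for a, b in zip(x, x_rec))
    feature = sum(abs(a - b) for a, b in zip(d_feat_x, d_feat_rec))
    return (1 - lam) * residual + lam * feature

# Hand-picked stand-in values: a well-reconstructed (normal) ECG beat versus a
# poorly reconstructed (anomalous) one; no trained networks involved.
normal_beat = anomaly_score([0.1, 0.9, 0.1], [0.1, 0.85, 0.1], [0.5], [0.52])
ectopic_beat = anomaly_score([0.1, 0.9, 0.1], [0.4, 0.2, 0.6], [0.5], [0.9])
```

Anomalies are then flagged by thresholding this score: signals the generator cannot reproduce from its learned manifold of normal data score high.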
Computational Intelligence in Healthcare
This book is a printed edition of the Special Issue Computational Intelligence in Healthcare that was published in Electronics.
Computational Intelligence in Healthcare
The volume of patient health data is estimated to have reached 2,314 exabytes by 2020. Traditional data analysis techniques are unsuitable for extracting useful information from such a vast quantity of data. Thus, intelligent data analysis methods combining human expertise and computational models for accurate and in-depth data analysis are necessary. The technological revolution and medical advances made by combining vast quantities of available data, cloud computing services, and AI-based solutions can provide expert insight and analysis on a mass scale and at a relatively low cost. Computational intelligence (CI) methods, such as fuzzy models, artificial neural networks, evolutionary algorithms, and probabilistic methods, have recently emerged as promising tools for the development and application of intelligent systems in healthcare practice. CI-based systems can learn from data and evolve according to changes in their environments by taking into account the uncertainty characterizing health data, including omics, clinical, sensor, and imaging data. The use of CI in healthcare can improve the processing of such data to develop intelligent solutions for prevention, diagnosis, treatment, and follow-up, as well as for the analysis of administrative processes. The present Special Issue on computational intelligence for healthcare is intended to show the potential and the practical impacts of CI techniques in challenging healthcare applications.
How can humans leverage machine learning? From Medical Data Wrangling to Learning to Defer to Multiple Experts
International Mention in the doctoral degree.
The irruption of the smartphone into everyone's life and the ease with which we digitise or record
any data have led to an explosion in the quantity of data. Smartphones, equipped with advanced
cameras and sensors, have empowered individuals to capture moments and contribute to the
growing pool of data. This data-rich landscape holds great promise for research, decision-making,
and personalized applications. By carefully analyzing and interpreting this wealth of information,
valuable insights, patterns, and trends can be uncovered.
However, big data is worthless in a vacuum. Its potential value is unlocked only when leveraged
to drive decision-making. In recent times we have witnessed the outburst of artificial
intelligence: the development of computer systems and algorithms capable of perceiving, reasoning,
learning, and problem-solving, emulating certain aspects of human cognitive abilities. Nevertheless,
our focus tends to be limited, merely skimming the surface of the problem, while in reality
the application of machine learning models to data is usually fraught with difficulties. More
specifically, there are two crucial pitfalls frequently neglected in the field of machine learning:
the quality of the data and the erroneous assumption that machine learning models operate
autonomously. These two issues have established the foundation for the motivation driving this
thesis, which strives to offer solutions to two major associated challenges: 1) dealing with irregular
observations and 2) learning when and whom we should trust.
The first challenge originates from our observation that the majority of machine learning
research primarily concentrates on handling regular observations, neglecting a crucial technological
obstacle encountered in practical big-data scenarios: the aggregation and curation of heterogeneous
streams of information. Before applying machine learning algorithms, it is crucial to establish
robust techniques for handling big data, as this specific aspect presents a notable bottleneck in
the creation of robust algorithms. Data wrangling, which encompasses the extraction, integration,
and cleaning processes necessary for data analysis, plays a crucial role in this regard. Therefore,
the first objective of this thesis is to tackle the frequently disregarded challenge of addressing
irregularities within the context of medical data. We will focus on three specific aspects. Firstly,
we will tackle the issue of missing data by developing a framework that facilitates the imputation
of missing data points using relevant information derived from alternative data sources or past
observations. Secondly, we will move beyond the assumption of homogeneous observations,
where only one statistical data type (such as Gaussian) is considered, and instead, work with
heterogeneous observations. This means that different data sources can be represented by various
statistical likelihoods, such as Gaussian, Bernoulli, categorical, etc. Lastly, considering the
temporal enrichment of today's collected data and our focus on medical data, we will develop a novel algorithm capable of capturing and propagating correlations among different data streams
over time. All these three problems are addressed in our first contribution which involves the
development of a novel method based on Deep Generative Models (DGM) using Variational
Autoencoders (VAE). The proposed model, the Sequential Heterogeneous Incomplete VAE (Shi-
VAE), enables the aggregation of multiple heterogeneous data streams in a modular manner,
taking into consideration the presence of potential missing data. To demonstrate the feasibility
of our approach, we present proof-of-concept results obtained from a real database generated
through continuous passive monitoring of psychiatric patients.
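The key ingredients of such a heterogeneous, missing-aware likelihood can be illustrated independently of the VAE machinery: each feature contributes a log-likelihood term under its own statistical type (Gaussian, Bernoulli, ...), and missing entries are simply masked out of the sum. This is a simplified sketch of the principle, not the actual Shi-VAE code; the feature names and values are invented.

```python
import math

def masked_loglik(x, mask, params, liks):
    """Per-feature log-likelihood with missing-data masking: each observed
    feature adds a term under its own likelihood (Gaussian or Bernoulli here);
    missing entries (mask == 0) contribute nothing."""
    total = 0.0
    for xi, mi, p, lik in zip(x, mask, params, liks):
        if not mi:
            continue  # missing observation: skipped entirely
        if lik == "gaussian":
            mean, var = p
            total += -0.5 * (math.log(2 * math.pi * var) + (xi - mean) ** 2 / var)
        elif lik == "bernoulli":
            total += xi * math.log(p) + (1 - xi) * math.log(1 - p)
    return total

# Hypothetical patient record: a real-valued signal, a binary event, and a
# missing third measurement.
x, mask = [1.3, 1.0, 0.0], [1, 1, 0]
ll = masked_loglik(x, mask, [(1.0, 0.5), 0.8, (0.0, 1.0)],
                   ["gaussian", "bernoulli", "gaussian"])
```

Because masked entries never enter the objective, the model is trained only on what was actually observed, and the decoder's per-feature distributions can later be used to impute the gaps.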
Our second challenge relates to the misbelief that machine learning algorithms can perform
independently. However, this notion that AI systems can solely account for automated decision-making,
especially in critical domains such as healthcare, is far from reality. Our focus now shifts
towards a specific scenario where the algorithm has the ability to make predictions independently
or alternatively defer the responsibility to a human expert. The purpose of including the human
is not just to obtain better performance, but also to produce more reliable and trustworthy
predictions. In reality, however, important decisions are not made by one person but are usually
reached by an ensemble of human experts. With this in mind, two important questions arise:
1) when should the human or the machine bear responsibility, and 2) among the experts, whom
should we trust? To answer the first question, we will employ a recent theory known as Learning
to defer (L2D). In L2D we are not only interested in abstaining from prediction but also in
understanding the human's confidence in making such a prediction, thus deferring only when the
human is more likely to be correct. The second question, to whom to defer among a pool of
experts, has not yet been answered in the L2D literature, and this is what our contributions
aim to provide. First, we extend the two consistent surrogate losses proposed so far in the L2D
literature to the multiple-expert setting. Second, we study the framework's ability to estimate
the probability that a given expert correctly predicts and assess whether the two surrogate losses
are confidence calibrated. Finally, we propose a conformal inference technique that chooses a
subset of experts to query when the system defers. Ensembling experts based on confidence
levels is vital to optimize human-machine collaboration.
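The deferral logic described above can be reduced to a simple decision rule for intuition: predict with the model unless some expert's estimated probability of being correct is higher, and, in the conformal variant, query the subset of experts whose confidence clears a threshold. This is an illustrative sketch under those assumptions, not the thesis implementation; the confidence values and threshold are hypothetical.

```python
def defer_decision(model_conf, expert_confs, threshold=None):
    """Learning-to-defer sketch: keep the model's prediction unless some expert
    is estimated to be more likely correct; with a threshold, return the
    conformal-style subset of experts confident enough to be queried."""
    best = max(expert_confs)
    if best <= model_conf:
        return ("model", None)
    if threshold is None:
        return ("defer", [expert_confs.index(best)])
    subset = [i for i, c in enumerate(expert_confs) if c >= threshold]
    return ("defer", subset)

confident_model = defer_decision(0.9, [0.6, 0.7])        # model keeps the case
single_expert = defer_decision(0.5, [0.6, 0.85])         # defer to the best expert
expert_subset = defer_decision(0.5, [0.6, 0.85], 0.6)    # query all experts >= 0.6
```

In the actual framework, the expert-correctness probabilities come from the learned surrogate losses, which is why their confidence calibration matters.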
In conclusion, this doctoral thesis has investigated two cases where humans can leverage the
power of machine learning: first, as a tool to assist in data wrangling and data understanding
problems and second, as a collaborative tool where decision-making can be automated by the
machine or delegated to human experts, fostering more transparent and trustworthy solutions.
Doctoral Programme in Multimedia and Communications, Universidad Carlos III de Madrid and Universidad Rey Juan Carlos. Chair: Joaquín Míguez Arenas. Secretary: Juan José Murillo Fuentes. Panel member: Mélanie Natividad Fernández Pradie
IoT in smart communities, technologies and applications.
The Internet of Things is a system that integrates different devices and technologies, removing the necessity of human intervention. This enables the capacity of having smart (or smarter) cities around the world. By hosting different technologies and allowing interactions between them, the Internet of Things has spearheaded the development of smart city systems for sustainable living, increased comfort and productivity for citizens. The Internet of Things (IoT) for Smart Cities has many different domains and draws upon various underlying systems for its operation. In this work, we provide a holistic coverage of the Internet of Things in Smart Cities by discussing the fundamental components that make up the IoT Smart City landscape, the technologies that enable these domains to exist, the most prevalent practices and techniques which are used in these domains, as well as the challenges that deployment of IoT systems for smart cities encounter and which need to be addressed for ubiquitous use of smart city applications. It also presents a coverage of optimization methods and applications from a smart city perspective enabled by the Internet of Things. Towards this end, a mapping is provided for the most encountered applications of computational optimization within IoT smart cities for five popular optimization methods: ant colony optimization, genetic algorithm, particle swarm optimization, artificial bee colony optimization and differential evolution. For each application identified, the algorithms used, objectives considered, the nature of the formulation and the constraints taken into account have been specified and discussed. Lastly, the data setup used by each covered work is also mentioned and directions for future work have been identified. Within the smart health domain of IoT smart cities, human activity recognition has been a key study topic in the development of cyber physical systems and assisted living applications.
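Of the five optimization methods surveyed, particle swarm optimization is perhaps the quickest to sketch: each particle moves under inertia plus pulls toward its own best position and the swarm's best position. The minimal implementation below uses typical textbook hyperparameters and a toy objective; it is not drawn from any of the surveyed works.

```python
import random

def pso(f, dim=2, n=20, iters=60, seed=0):
    """Minimal particle swarm optimization: velocities blend inertia (0.7) with
    cognitive and social pulls (1.5 each); returns the best position found."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n)]
    vel = [[0.0] * dim for _ in range(n)]
    pbest = [p[:] for p in pos]
    gbest = min(pbest, key=f)[:]
    for _ in range(iters):
        for i in range(n):
            for d in range(dim):
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (pbest[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            if f(pos[i]) < f(pbest[i]):
                pbest[i] = pos[i][:]
                if f(pos[i]) < f(gbest):
                    gbest = pos[i][:]
    return gbest

sphere = lambda p: sum(x * x for x in p)  # toy objective with optimum at origin
best = pso(sphere)
```

In the smart-city applications mapped by the survey, the objective f would encode, e.g., routing cost or sensor placement quality, with constraints handled by penalties or repair steps.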
In particular, inertial sensor based systems have become increasingly popular because they do not restrict users' movement and are also relatively simple to implement compared to other approaches. Fall detection is one of the most important tasks in human activity recognition. With an increasingly aging world population and an inclination by the elderly to live alone, the need to incorporate dependable fall detection schemes in smart devices such as phones and watches has gained momentum. Therefore, differentiating between falls and activities of daily living (ADLs) has been the focus of researchers in recent years, with very good results. However, one aspect within fall detection that has not been investigated much is direction- and severity-aware fall detection. Since a fall detection system aims to detect falls in people and notify medical personnel, it could be of added value to health professionals tending to a patient suffering from a fall to know the nature of the accident. In this regard, as a case study for smart health, four different experiments have been conducted for the task of fall detection with direction and severity consideration on two publicly available datasets. These experiments not only tackle the problem at an increasingly complicated level (the first considers a fall-only scenario and the others a combined activity of daily living and fall scenario) but also present methodologies which outperform state-of-the-art techniques, as discussed. Lastly, future recommendations have also been provided for researchers.
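A classic baseline for inertial fall detection, useful for building intuition about the direction-aware variant, looks for a free-fall dip in acceleration magnitude followed shortly by an impact spike, and reads a crude fall direction from the dominant axis at impact. The thresholds, window size, and sample values below are illustrative only, not taken from the datasets or methods discussed.

```python
import math

def detect_fall(samples, free_fall_g=0.4, impact_g=2.5):
    """Toy inertial fall detector on (x, y, z) accelerometer samples in g:
    flag a free-fall dip (|a| well below 1 g) followed within a short window
    by an impact spike, and report the dominant axis at impact."""
    mags = [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]
    for i, m in enumerate(mags):
        if m < free_fall_g:
            for j in range(i + 1, min(i + 10, len(mags))):
                if mags[j] > impact_g:
                    axis = max(range(3), key=lambda k: abs(samples[j][k]))
                    return True, "xyz"[axis]
    return False, None

walking = [(0.1, 0.1, 1.0)] * 20                            # steady ~1 g
fall = walking + [(0.0, 0.1, 0.2), (2.8, 0.5, 0.4)] + walking  # dip then spike
```

The learned approaches in the abstract replace these fixed thresholds with classifiers, which is what allows severity and direction to be estimated jointly rather than read off a single axis.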
Pacific Symposium on Biocomputing 2023
The Pacific Symposium on Biocomputing (PSB) 2023 is an international, multidisciplinary conference for the presentation and discussion of current research in the theory and application of computational methods in problems of biological significance. Presentations are rigorously peer reviewed and are published in an archival proceedings volume. PSB 2023 will be held on January 3-7, 2023 in Kohala Coast, Hawaii. Tutorials and workshops will be offered prior to the start of the conference. PSB 2023 will bring together top researchers from the US, the Asian Pacific nations, and around the world to exchange research results and address open issues in all aspects of computational biology. It is a forum for the presentation of work in databases, algorithms, interfaces, visualization, modeling, and other computational methods, as applied to biological problems, with emphasis on applications in data-rich areas of molecular biology. The PSB has been designed to be responsive to the need for critical mass in sub-disciplines within biocomputing. For that reason, it is the only meeting whose sessions are defined dynamically each year in response to specific proposals. PSB sessions are organized by leaders of research in biocomputing's 'hot topics.' In this way, the meeting provides an early forum for serious examination of emerging methods and approaches in this rapidly changing field.
Untangling hotel industry’s inefficiency: An SFA approach applied to a renowned Portuguese hotel chain
The present paper explores the technical efficiency of four hotels from the Teixeira Duarte Group, a renowned Portuguese hotel chain. An efficiency ranking is established for these four hotel units, located in Portugal, using Stochastic Frontier Analysis. This methodology makes it possible to discriminate between measurement error and systematic inefficiencies in the estimation process, enabling investigation of the main causes of inefficiency. Several suggestions concerning efficiency improvement are made for each hotel studied.
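The decomposition underlying Stochastic Frontier Analysis can be shown in miniature: observed log-output equals the frontier minus a one-sided inefficiency term u plus symmetric noise v, and technical efficiency is exp(-u). In practice u and v must be disentangled by maximum likelihood estimation; the sketch below assumes they are known, purely for illustration, and all numbers are hypothetical.

```python
import math

def technical_efficiency(log_output, log_frontier, noise):
    """SFA identity sketch: y = frontier - u + v implies u = frontier + v - y,
    and technical efficiency is exp(-u), equal to 1 when the unit sits on the
    frontier and below 1 otherwise."""
    u = log_frontier + noise - log_output
    return math.exp(-u)

# Two hypothetical hotels facing the same frontier, with noise set to zero
eff_a = technical_efficiency(4.1, 4.3, 0.0)  # u = 0.2, below the frontier
eff_b = technical_efficiency(4.3, 4.3, 0.0)  # u = 0.0, fully efficient
```

It is precisely because v is symmetric and u is one-sided that the estimation can separate measurement error from systematic inefficiency, as the abstract notes.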
Intelligent Transportation Related Complex Systems and Sensors
Building around innovative services related to different modes of transport and traffic management, intelligent transport systems (ITS) are being widely adopted worldwide to improve the efficiency and safety of the transportation system. They enable users to be better informed and make safer, more coordinated, and smarter decisions on the use of transport networks. Current ITSs are complex systems, made up of several components/sub-systems characterized by time-dependent interactions among themselves. Some examples of these transportation-related complex systems include: road traffic sensors, autonomous/automated cars, smart cities, smart sensors, virtual sensors, traffic control systems, smart roads, logistics systems, smart mobility systems, and many others that are emerging from niche areas. The efficient operation of these complex systems requires: i) efficient solutions to the issues of sensors/actuators used to capture and control the physical parameters of these systems, as well as the quality of data collected from these systems; ii) tackling complexities using simulations and analytical modelling techniques; and iii) applying optimization techniques to improve the performance of these systems. It includes twenty-four papers, which cover scientific concepts, frameworks, architectures and various other ideas on analytics, trends and applications of transportation-related data
Front-Line Physicians' Satisfaction with Information Systems in Hospitals
Day-to-day operations management in hospital units is difficult due to continuously varying situations, the several actors involved and the vast number of information systems in use. The aim of this study was to describe front-line physicians' satisfaction with existing information systems needed to support day-to-day operations management in hospitals. A cross-sectional survey was used, and data chosen with stratified random sampling were collected in nine hospitals. Data were analyzed with descriptive and inferential statistical methods. The response rate was 65% (n = 111). The physicians reported that information systems support their decision making to some extent, but they do not improve access to information nor are they tailored for physicians. The respondents also reported that they need to use several information systems to support decision making and that they would prefer one information system to access important information. Improved information access would better support physicians' decision making and has the potential to improve the quality of decisions and speed up the decision-making process.