47 research outputs found
Applications
Volume 3 describes how resource-aware machine learning methods and techniques are used to successfully solve real-world problems. The book provides numerous specific application examples: in health and medicine for risk modelling, diagnosis, and treatment selection for diseases; in electronics, steel production and milling for quality control during manufacturing processes; in traffic and logistics for smart cities; and for mobile communications.
Analyzing intentions from big data traces of human activities
The rapid growth of big data formed by human activities makes research on intention analysis both challenging and rewarding. We study multifaceted problems in analyzing intentions from big data traces of human activities; these problems span machine learning, optimization, and security and privacy.
We show that analyzing intentions from industry-scale human activity big data can effectively improve the accuracy of computational models. Specifically, we take query auto-completion as a case study. We identify two hitherto-undiscovered problems: adaptive query auto-completion and mobile query auto-completion. We develop two computational models by analyzing intentions from big data traces of human activities on search interface interactions and on mobile application usage respectively.
Solving the large-scale optimization problems in the proposed query auto-completion models drives deeper studies of the solvers. Hence, we consider generalized machine learning problem settings and focus on developing lightweight stochastic algorithms, with theoretical guarantees, as solvers for large-scale convex optimization problems. For optimizing strongly convex objectives, we design an accelerated stochastic block coordinate descent method with optimal sampling; for optimizing non-strongly convex objectives, we design a stochastic variance-reduced alternating direction method of multipliers with the doubling trick.
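To make the block coordinate descent idea concrete, here is a minimal sketch. It is not the accelerated method with optimal sampling described in the abstract: it is the plain variant with uniform block sampling, applied to a small strongly convex quadratic. The function name, the test matrix, the block partition and the Gershgorin-based step size are all illustrative assumptions.

```python
import random

def block_coordinate_descent(A, b, blocks, n_iters=2000, seed=0):
    """Minimise f(x) = 0.5 * x^T A x - b^T x for a symmetric positive definite A.

    Plain stochastic block coordinate descent with uniform sampling
    (an illustrative assumption, not the accelerated variant).
    """
    rng = random.Random(seed)
    n = len(b)
    x = [0.0] * n
    # Step size per block: 1 / L_B, where L_B upper-bounds the largest
    # eigenvalue of the sub-matrix A[B][B] (Gershgorin: max absolute row sum).
    steps = [
        1.0 / max(sum(abs(A[i][j]) for j in block) for i in block)
        for block in blocks
    ]
    for _ in range(n_iters):
        k = rng.randrange(len(blocks))  # pick one block uniformly at random
        block = blocks[k]
        # Partial gradient restricted to the chosen block: (A x - b)_i, i in block.
        grad = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in block]
        for i, g in zip(block, grad):
            x[i] -= steps[k] * g
    return x

# Hypothetical, strictly diagonally dominant (hence positive definite) system.
A = [
    [4.0, 1.0, 0.0, 0.0],
    [1.0, 3.0, 1.0, 0.0],
    [0.0, 1.0, 5.0, 1.0],
    [0.0, 0.0, 1.0, 2.0],
]
b = [1.0, 2.0, 3.0, 4.0]
x = block_coordinate_descent(A, b, blocks=[[0, 1], [2, 3]])

# The minimiser of f solves A x = b, so the residual should be close to zero.
residual = max(abs(sum(A[i][j] * x[j] for j in range(4)) - b[i]) for i in range(4))
print(residual)
```

Each iteration touches only the coordinates of one block, which is what keeps the per-step cost low on large problems.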
Inevitably, human activities are human-centric, so research on them can inform security and privacy. On the one hand, intention analysis of human activities can be motivated from the security perspective. For instance, to reduce false alarms about medical service providers' suspicious accesses to electronic health records, we discover potential de facto diagnosis specialties that reflect such providers' genuine and permissible intentions in accessing records with certain diagnoses. On the other hand, we examine the privacy risk in anonymized heterogeneous information networks representing large-scale human activities, such as in social networking. Such data are released for external researchers to improve the prediction accuracy for users' online social networking intentions on the publishers' microblogging site. We show a negative result that makes a compelling argument: privacy must be a central goal for publishers of sensitive human activity data.
Intelligent Devices for IoT Applications
Internet of Things (IoT) devices form a vast network of physical devices that are connected to the internet and can communicate with each other through sensors and software. These devices range from simple household appliances, like smart thermostats and security cameras, to more complex industrial equipment, such as sensors used in manufacturing and logistics. In particular, IoT-enabled wireless gas sensing systems that can withstand harsh environments without compromising performance are becoming increasingly popular, which calls for further development in this field. As the essential components of a wireless gas sensing system, both the sensor and the communication elements should be agile and resilient in unfavorable scenarios. Moreover, gas sensors are prone to drift, which can lead to inaccurate readings and decreased reliability over time. In addition, recent advances in antenna design, such as fractal antennas and metamaterial structures, have shown promise in improving the bandwidth and gain of antennas built on high-temperature-tolerant substrates. This research targets three fundamental areas: recent advances in data-driven techniques for gas sensing system optimization, the design of antennas for different applications, and device design and fabrication. The Dimatix DMP-2831 inkjet printer has been optimized to operate with six different inks and two different substrates: PET and a 3 mol% yttria-stabilized zirconia (3YSZ) based ceramic substrate. Next, a feature-oriented analysis of gas sensor data investigates correlations among stability, selectivity and long-term drift; it shows significant relations among those parameters that can be considered when designing intelligent data-driven models to compensate for drift.
Moreover, a subspace-transfer-based approach is proposed to classify drifted gas sensor responses and detect particular gases with higher accuracy. The model achieved an average accuracy greater than 87% while using only 40% of the total dataset for training. In the field of antenna technology, a co-planar waveguide (CPW) fed super-wideband antenna is proposed which, according to the simulated performance, can cover the C, X, Ku, K, Ka, Q, V, and W bands with high gain and radiation efficiency. In addition, a high-temperature-tolerant antenna based on a 3YSZ substrate is proposed, which achieved good agreement between the simulated and fabricated device performance.
Deep Learning in Medical Image Analysis
The accelerating power of deep learning in diagnosing diseases will empower physicians and speed up decision making in clinical environments. Applications of modern medical instruments and digitalization of medical care have generated enormous amounts of medical images in recent years. In this big data arena, new deep learning methods and computational models for efficient data processing, analysis, and modeling of the generated data are crucially important for clinical applications and understanding the underlying biological process. This book presents and highlights novel algorithms, architectures, techniques, and applications of deep learning for medical image analysis.
A Photoplethysmography System Optimised for Pervasive Cardiac Monitoring
Photoplethysmography is a non-invasive sensing technique which infers instantaneous cardiac function from an optical measurement of blood vessels. This thesis presents a photoplethysmography-based sensor system that has been developed specifically for the requirements of a pervasive healthcare monitoring system. Continuous monitoring of patients requires both the size and power consumption of the chosen sensor solution to be minimised to ensure the patients will be willing to use the device. Pervasive sensing also requires that the device be scalable for manufacturing in high volume at a build cost that healthcare providers are willing to accept. System-level choices of both electronic circuits and signal processing techniques are based on their sensitivity to cardiac biosignals, robustness against noise-inducing artefacts and simplicity of implementation. Numerical analysis is used to justify the implementation of a technique in hardware. Circuit prototyping and experimental data collection are used to validate a technique's application. The entire signal chain operates in the discrete-time domain, which allows all of the signal processing to be implemented in firmware on an embedded processor; this minimises the number of discrete components while optimising the trade-off between power and bandwidth in the analogue front-end. Synchronisation of the optical illumination and detection modules enables high dynamic range rejection of both AC and DC independent light sources without compromising the biosignal. Signal delineation is used to reduce the required communication bandwidth as it preserves both amplitude and temporal resolution of the non-stationary photoplethysmography signals, allowing more complicated analytical techniques to be performed at the other end of the communication channel. The complete sensing system is implemented on a single PCB using only commercial-off-the-shelf components and consumes less than 7.5 mW of power. The sensor platform is validated by the successful capture of physiological data in a harsh optical sensing environment.
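One common way to reject independent light sources when illumination and detection are synchronised is to take a dark sample with the LED off, then a lit sample immediately afterwards, and subtract the two. The sketch below simulates that idea. It is an assumption about how such rejection can work in general, not the thesis's actual circuit or firmware, and the detector model, signal amplitudes, flicker frequency and timing values are all hypothetical.

```python
import math

def detector_reading(t, led_on):
    """Hypothetical photodetector model: ambient light plus the LED-driven PPG signal."""
    # Independent light: a DC level plus 100 Hz flicker (assumed values).
    ambient = 5.0 + 1.0 * math.sin(2 * math.pi * 100.0 * t)
    # Pulsatile PPG component at 1.2 Hz (about 72 beats per minute, assumed).
    ppg = 0.2 + 0.02 * math.sin(2 * math.pi * 1.2 * t)
    return ambient + (ppg if led_on else 0.0)

def synchronous_samples(read, n_frames, frame_period_s, led_delay_s):
    """For each frame, sample with the LED off, then with it on, and subtract."""
    samples = []
    for k in range(n_frames):
        t_dark = k * frame_period_s
        dark = read(t_dark, False)               # ambient light only
        lit = read(t_dark + led_delay_s, True)   # ambient light + PPG
        samples.append(lit - dark)               # ambient term largely cancels
    return samples

samples = synchronous_samples(
    detector_reading, n_frames=500, frame_period_s=0.004, led_delay_s=10e-6
)
print(min(samples), max(samples))
```

The cancellation is only as good as the assumption that ambient light barely changes between the dark and lit samples, which is why the delay between them is kept short relative to the flicker period.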
Text Similarity Between Concepts Extracted from Source Code and Documentation
Context: Constant evolution in software systems often results in their documentation losing sync with the content of the source code. The traceability research field has often helped in the past by recovering links between code and documentation when the two fell out of sync. Objective: The aim of this paper is to compare the concepts contained within the source code of a system with those extracted from its documentation, in order to detect how similar these two sets are. If they are vastly different, the difference between the two sets might indicate a considerable ageing of the documentation, and a need to update it. Methods: In this paper we reduce the source code of 50 software systems to sets of key terms, each containing the concepts of one of the systems sampled. At the same time, we reduce the documentation of each system to another set of key terms. We then use four different approaches for set comparison to detect how similar the sets are. Results: Using the well-known Jaccard index as the benchmark for the comparisons, we have discovered that the cosine distance has excellent comparative powers, depending on the pre-training of the machine learning model. In particular, the SpaCy and the FastText embeddings offer up to 80% and 90% similarity scores, respectively. Conclusion: For most of the sampled systems, the source code and the documentation tend to contain very similar concepts. Given the accuracy for one pre-trained model (e.g., FastText), it also becomes evident that a few systems show a measurable drift between the concepts contained in the documentation and in the source code.
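As a rough illustration of the two kinds of comparison named in the abstract, the sketch below computes the Jaccard index and a cosine similarity for two small sets of key terms. It is a simplification under stated assumptions: the cosine here is taken over plain term-count vectors, whereas the paper compares pre-trained embeddings (SpaCy, FastText). The term lists are hypothetical examples, not data from the study.

```python
import math
from collections import Counter

def jaccard_index(terms_a, terms_b):
    """Size of the intersection divided by size of the union of the two term sets."""
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(a & b) / len(a | b)

def cosine_similarity(terms_a, terms_b):
    """Cosine of the angle between the term-count vectors of the two term lists."""
    ca, cb = Counter(terms_a), Counter(terms_b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical key terms extracted from one system's code and documentation.
code_terms = ["parser", "token", "cache", "index"]
doc_terms = ["parser", "token", "index", "tutorial"]

print(jaccard_index(code_terms, doc_terms))      # 0.6
print(cosine_similarity(code_terms, doc_terms))  # 0.75
```

Both scores only reward exact term matches. Embedding-based comparison can also credit near-synonyms such as "docs" and "documentation", which is one reason the choice of pre-trained model matters.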