6,666 research outputs found
Cancer Care in Pandemic Times: Building Inclusive Local Health Security in Africa and India
This is a book about improving cancer care in Africa and India that is a child of its pandemic times. It has been collaboratively researched and written by colleagues in Kenya, Tanzania, India and the UK, working within a cross-country, multidisciplinary research project, Innovation for Cancer Care in Africa (ICCA). Since this was a health-focused research project, ICCA researchers during the pandemic not only continued to work on the cancer research project but were also called upon by their governments to respond to immediate pandemic needs. In combining these two concerns, improving cancer care and responding to pandemic needs, our original project aims have been challenged, deepened and reworked. ICCA's initial collaborative research focus included, against the grain of most global health literature, the potential role of enhanced local production of essential healthcare supplies for improving cancer care in African countries. The pandemic experience has strikingly validated these earlier findings on the importance of industrial development for health care. The pandemic crystallised for researchers and policymakers an often overlooked phenomenon: global health security is built on the foundations of strong local health security. We argue in this book that new analytical thinking from social scientists and others is required on how to build local health security. We use the "lens" of original research on cancer care in East Africa and India to build up an understanding of the scope for developing stronger synergies between local health industries and health care, in order to strengthen local health security and develop tools for policy making. The rethinking and reimagining presented here is required for different African countries, for India and the wider world, and this research on cancer care has taught us that this imperative goes much wider than infectious diseases.
More-than-words: Reconceptualising Two-year-old Children's Onto-epistemologies Through Improvisation and the Temporal Arts
This thesis project takes place at a time of increasing focus upon two-year-old children and the words they speak. On the one hand, there is mounting pressure, driven by the school readiness agenda, to make children talk as early as possible. On the other hand, there is increased interest in understanding children's communication in order to create effective pedagogies. More-than-words (MTW) examines an improvised art-education practice that combines heterogeneous elements: sound, movement and materials (such as silk, string, light) to create encounters for young children, educators and practitioners from diverse backgrounds. During these encounters, adults adopt a practice of stripping back their words in order to tune into the polyphonic ways that children are becoming-with the world.
For this research-creation, two MTW sessions for two-year-old children and their carers took place in a specially created installation. These sessions were filmed on a 360° camera, a nursery school iPad and a specially made child-friendly Toddler-cam (Tcam) that rolled around in the installation-event with the children. Through using the frameless technology of 360° film, I hoped to make tangible the relation and movement of an emergent and improvised happening and the way in which young children operate fluidly through multiple modes.
Travelling with posthuman, Deleuzio-Guattarian and feminist vital material philosophy, I wander and wonder speculatively through practice, memory, and film data as a bag lady, a Haraway-ian writer/artist/researcher-creator who resists the story of the wordless child as lacking and tragic; the story that positions the word as heroic. Instead, through returning to the uncertainty of improvisation, I attempt to tune into the savage, untamed and wild music of young children's animistic onto-epistemologies.
Low- and high-resource opinion summarization
Customer reviews play a vital role in the online purchasing decisions we make. The reviews
express user opinions that are useful for setting realistic expectations and uncovering important
details about products. However, some products receive hundreds or even thousands of
reviews, making them time-consuming to read. Moreover, many reviews contain uninformative
content, such as irrelevant personal experiences. Automatic summarization offers an
alternative: short text summaries capturing the essential information expressed in reviews.
Automatically produced summaries can reflect overall or particular opinions and be tailored to
user preferences. Besides being presented on major e-commerce platforms, summaries
can also be vocalized by home assistants. This approach can improve user satisfaction
by helping users make faster and better decisions.
Modern summarization approaches are based on neural networks, often requiring thousands of
annotated samples for training. However, human-written summaries for products are expensive
to produce because annotators need to read many reviews. This has led to annotated data
scarcity where only a few datasets are available. Data scarcity is the central theme of our
works, and we propose a number of approaches to alleviate the problem. The thesis consists
of two parts where we discuss low- and high-resource data settings.
In the first part, we propose self-supervised learning methods applied to customer reviews
and few-shot methods for learning from small annotated datasets. Customer reviews without
summaries are available in large quantities, contain a breadth of in-domain specifics, and
provide a powerful training signal. We show that reviews can be used for learning summarizers
via a self-supervised objective. Further, we address two main challenges associated with
learning from small annotated datasets. First, large models rapidly overfit on small datasets
leading to poor generalization. Second, it is not possible to learn a wide range of in-domain
specifics (e.g., product aspects and usage) from a handful of gold samples. This leads to
subtle semantic mistakes in generated summaries, such as "great dead on arrival battery." We
address the first challenge by explicitly modeling summary properties (e.g., content coverage
and sentiment alignment). Furthermore, we leverage small modules (adapters) that are
more robust to overfitting. As we show, despite their size, these modules can be used to
store in-domain knowledge to reduce semantic mistakes. Lastly, we propose a simple method
for learning personalized summarizers based on aspects, such as "price," "battery life," and
"resolution." This task is harder to learn, and we present a few-shot method for training a
query-based summarizer on small annotated datasets.
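The adapters mentioned above are small bottleneck modules trained while the large backbone stays frozen. A minimal sketch of the idea (dimensions, weight names and the ReLU bottleneck are illustrative assumptions, not the thesis's actual architecture):

```python
import numpy as np

def adapter(x, W_down, W_up):
    """Bottleneck adapter: down-project, non-linearity, up-project,
    then add a residual connection. Only W_down and W_up are trained;
    freezing the backbone keeps the trainable parameter count small,
    which reduces overfitting on small annotated datasets."""
    h = np.maximum(0.0, x @ W_down)   # ReLU bottleneck
    return x + h @ W_up               # residual add keeps backbone features

rng = np.random.default_rng(0)
d_model, d_bottleneck = 16, 4         # illustrative sizes
x = rng.standard_normal((2, d_model))
W_down = rng.standard_normal((d_model, d_bottleneck)) * 0.1
W_up = rng.standard_normal((d_bottleneck, d_model)) * 0.1

y = adapter(x, W_down, W_up)
print(y.shape)  # (2, 16)
```

Note how the adapter adds only 2 * 16 * 4 = 128 parameters per layer here, versus 256 for a full 16x16 dense layer; at realistic model widths the saving is far larger.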
In the second part, we focus on the high-resource setting and present a large dataset with
summaries collected from various online resources. The dataset has more than 33,000 human-written
summaries, each linked to up to thousands of reviews. This, however, makes it
challenging to apply an "expensive" deep encoder due to memory and computational costs. To
address this problem, we propose selecting small subsets of informative reviews. Only these
subsets are encoded by the deep encoder and subsequently summarized. We show that the
selector and summarizer can be trained end-to-end via amortized inference and policy gradient
methods.
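The selection step above can be illustrated with a toy version: score each review cheaply, keep only the top-k, and pass just that subset to the deep encoder. The scorer here is a hand-made stand-in for the learned selector (which the thesis trains end-to-end via amortized inference and policy gradients); all names and values are illustrative.

```python
import numpy as np

def select_reviews(reviews, scores, k):
    """Keep the k highest-scoring reviews; only these would be
    encoded by the expensive deep encoder and then summarized."""
    top = np.argsort(scores)[::-1][:k]      # indices of the k best scores
    return [reviews[i] for i in sorted(top)]  # preserve original order

reviews = ["battery lasts long", "arrived monday", "great screen", "ok"]
scores = np.array([0.9, 0.1, 0.8, 0.3])     # stand-in informativeness scores
subset = select_reviews(reviews, scores, k=2)
print(subset)  # ['battery lasts long', 'great screen']
```

Encoding 2 of 4 reviews instead of all of them is the whole point: with thousands of reviews per summary, the encoder's memory and compute cost scales with k, not with the full review count.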
Multimodal MRI analysis using deep learning methods
Magnetic resonance imaging (MRI) has been widely used in scientific and clinical research. It is a non-invasive medical imaging technique that reveals anatomical structures and provides useful information for investigators to explore aging and pathological processes. Different MR modalities offer different useful properties. Automatic MRI analysis algorithms have been developed to address problems in many applications such as classification, segmentation, and disease diagnosis. Segmentation and labeling algorithms applied to brain MRIs enable evaluations of the volumetric changes of specific structures in neurodegenerative diseases. Reconstruction of fiber orientations using diffusion MRI is beneficial for obtaining a better understanding of the underlying structures.
In this thesis, we focused on the development of deep learning methods for MRI analysis using different image modalities. Specifically, we applied deep learning techniques to different applications, including segmentation of brain structures and reconstruction of tongue muscle fiber orientations. For segmentation of brain structures, we developed an end-to-end deep learning algorithm for ventricle parcellation of brains with ventriculomegaly using T1-w MR images. The deep network provides robust and accurate segmentation results in subjects with high variability in ventricle shapes and sizes. We developed another deep learning method to automatically parcellate the thalamus into a set of thalamic nuclei using T1-w MRI and features from diffusion MRI. The algorithm incorporates a harmonization step to make the network adapt to input images with different contrasts.
We also studied the strains associated with tongue muscles during speech production using multiple MRI modalities. To enable this study, we first developed a deep network to reconstruct crossing tongue muscle fiber orientations using diffusion MRI. The network was specifically designed for the human tongue and accounted for the orthogonality property of the tongue muscles. Next, we proposed a comprehensive pipeline to analyze the strains associated with tongue muscle fiber orientations during speech using diffusion, tagged, and cine MRI. The proposed pipeline provides a solution to analyze the cooperation between muscle groups during speech production.
Enabling Deep Neural Network Inferences on Resource-constrained Devices
While deep neural networks (DNNs) are widely used on various devices, including resource-constrained devices such as IoT, AR/VR, and mobile devices, running DNNs on resource-constrained devices remains challenging. There exist three approaches for DNN inference on resource-constrained devices: 1) lightweight DNNs for on-device computing, 2) offloading DNN inferences to a cloud server, and 3) split computing to utilize computation and network resources efficiently.
Designing a lightweight DNN without compromising accuracy is challenging due to the trade-off between latency and accuracy: more computation is required to achieve higher accuracy. One solution to overcome this challenge is pre-processing that extracts and transfers helpful information so the DNN can achieve high accuracy. We design a pre-processing pipeline that consists of three steps. The first step is identifying the best input source. The second is input-processing, which extracts the information most important for DNN inference from everything captured by the input source. The last is choosing or designing a suitable lightweight DNN for the processed input. As an instance of how to apply this pre-processing, in Sec 2 we present a new transportation mode recognition system for smartphones called DeepVehicleSense, which aims at achieving three performance objectives at once: high accuracy, low latency, and low power consumption, by exploiting sound characteristics captured from the built-in microphone while riding candidate transportation modes. To achieve high accuracy and low latency, DeepVehicleSense makes use of non-linear filters that can best extract the transportation sound samples. For the recognition of five different transportation modes, we design a deep learning-based sound classifier using a novel deep neural network architecture with multiple branches. Our staged inference technique can significantly reduce runtime and energy consumption while maintaining high accuracy for the majority of samples.
Offloading DNN inferences to a server is another solution for resource-constrained devices, but it raises a concern about latency caused by data transmission. To reduce transmission latency, recent studies have tried to make offloading more efficient by compressing the data to be offloaded. However, conventional compression techniques are designed for human perception: they compress data so that it can be restored to look like the original to human eyes. As a result, the compressed data contains redundancy beyond the information necessary for DNN inference.
In other words, the most fundamental question, how to extract and offload the minimal amount of necessary information without degrading inference accuracy, has remained unanswered. To answer it, in Sec 3 we call such ideal offloading semantic offloading and propose N-epitomizer, a new offloading framework that enables semantic offloading, thus achieving more reliable and timely inferences over highly fluctuating or even low-bandwidth wireless networks. To realize N-epitomizer, we design an autoencoder-based scalable encoder trained to extract the most informative data and scale its output size to meet the latency and accuracy requirements of inferences over a network.
Even though our proposed lightweight DNN and offloading framework with the essential information extractor achieve low latency while preserving DNN performance, they alone cannot realize latency-guaranteed DNN inferences. To guarantee latency, the computational complexity of the lightweight DNN and the compression performance of the encoder should be selected adaptively according to current computation resources and network conditions, exploiting the DNN's trade-off between computational complexity and performance and the encoder's trade-off between compression and performance. To this end, we propose a new framework for latency-guaranteed DNN inferences called LG-DI, which predicts DNN performance degradation given a latency budget in advance and utilizes the better method between the lightweight DNN and offloading with compression. As a result, our proposed framework can guarantee latency regardless of changes in computation and network resources while maintaining DNN performance as much as possible.
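The selection logic described above can be sketched as a simple budget check: estimate each path's latency, discard paths that miss the budget, and among the survivors pick the one with the best predicted accuracy. This is a minimal sketch of the idea behind LG-DI; the function name, parameters and numbers are all illustrative assumptions, not the thesis's actual interface.

```python
def choose_inference_path(latency_budget_ms, device_ms, upload_ms, server_ms,
                          local_acc, offload_acc):
    """Pick the inference strategy that fits the latency budget with the
    least predicted accuracy loss: the core idea behind LG-DI
    (illustrative values, not measurements from the thesis)."""
    options = []
    if device_ms <= latency_budget_ms:                 # on-device path fits?
        options.append(("on-device", local_acc))
    if upload_ms + server_ms <= latency_budget_ms:     # offload path fits?
        options.append(("offload", offload_acc))
    if not options:
        return ("on-device", local_acc)                # best-effort fallback
    return max(options, key=lambda o: o[1])            # highest predicted accuracy

# Generous budget: both paths fit, so the more accurate offload path wins.
print(choose_inference_path(100, device_ms=80, upload_ms=40, server_ms=30,
                            local_acc=0.88, offload_acc=0.93))
```

In a real system the latency and accuracy inputs would come from the predictor the thesis describes, refreshed as computation and network conditions change.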
Anchorage: Visual Analysis of Satisfaction in Customer Service Videos via Anchor Events
Delivering customer services through video communications has brought new
opportunities to analyze customer satisfaction for quality management. However,
due to the lack of reliable self-reported responses, service providers are
troubled by the inadequate estimation of customer services and the tedious
investigation into multimodal video recordings. We introduce Anchorage, a
visual analytics system to evaluate customer satisfaction by summarizing
multimodal behavioral features in customer service videos and revealing
abnormal operations in the service process. We leverage semantically
meaningful operations to introduce structured event understanding into videos,
which helps service providers quickly navigate to events of their interest.
Anchorage supports a comprehensive evaluation of customer satisfaction from the
service and operation levels and efficient analysis of customer behavioral
dynamics via multifaceted visualization views. We extensively evaluate
Anchorage through a case study and a carefully-designed user study. The results
demonstrate its effectiveness and usability in assessing customer satisfaction
using customer service videos. We found that introducing event contexts in
assessing customer satisfaction can enhance its performance without
compromising annotation precision. Our approach can be adapted in situations
where unlabelled and unstructured videos are collected along with sequential
records.
Comment: 13 pages. A preprint version of a publication at IEEE Transactions on
Visualization and Computer Graphics (TVCG), 202
A Comprehensive Survey on Applications of Transformers for Deep Learning Tasks
The transformer is a deep neural network architecture that employs a self-attention mechanism
to comprehend the contextual relationships within sequential data. Unlike
conventional neural networks or updated versions of Recurrent Neural Networks
(RNNs) such as Long Short-Term Memory (LSTM), transformer models excel in
handling long dependencies between input sequence elements and enable parallel
processing. As a result, transformer-based models have attracted substantial
interest among researchers in the field of artificial intelligence. This can be
attributed to their immense potential and remarkable achievements, not only in
Natural Language Processing (NLP) tasks but also in a wide range of domains,
including computer vision, audio and speech processing, healthcare, and the
Internet of Things (IoT). Although several survey papers have been published
highlighting the transformer's contributions in specific fields, architectural
differences, or performance evaluations, there is still a significant absence
of a comprehensive survey paper encompassing its major applications across
various domains. Therefore, we undertook the task of filling this gap by
conducting an extensive survey of proposed transformer models from 2017 to
2022. Our survey encompasses the identification of the top five application
domains for transformer-based models, namely: NLP, Computer Vision,
Multi-Modality, Audio and Speech Processing, and Signal Processing. We analyze
the impact of highly influential transformer-based models in these domains and
subsequently classify them based on their respective tasks using a proposed
taxonomy. Our aim is to shed light on the existing potential and future
possibilities of transformers for enthusiastic researchers, thus contributing
to the broader understanding of this groundbreaking technology.
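The self-attention mechanism the survey centres on can be shown in a few lines: each position forms queries, keys and values, and attends to every other position in parallel, which is what allows long-range dependencies without recurrence. A minimal single-head sketch with illustrative dimensions (no masking, no multi-head split):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a token sequence X.
    Every position attends to every other position at once, giving
    parallel processing and direct long-range interactions."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ V                                 # weighted mix of values

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 8))                        # 5 tokens, model dim 8
Wq, Wk, Wv = (rng.standard_normal((8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (5, 8)
```

Full transformers stack this block with multiple heads, feed-forward layers, residual connections and layer normalization, but the attention above is the component shared by every model family the survey covers.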
Posthuman Creative Styling: Can a Creative Writer's Style of Writing Be Described as Procedural?
This thesis is about creative styling: the styling a creative writer might use to make their writing
unique. It addresses the question as to whether such styling can be described as procedural. Creative
styling is part of the technique a creative writer uses when writing. It is how they make the text more
"lively" by use of tips and tricks they have either learned or discovered. In essence these are rules, ones
the writer accrues over time by their practice. The thesis argues that the use and invention of these
rules can be set as procedures, and so describes creative styling as procedural.
The thesis follows from questioning why it is that machines or algorithms have, so far, been
incapable of producing creative writing which has value. Machine-written novels do not abound on
the bookshelves and writing styled by computers is, on the whole, dull in comparison to human-crafted
literature. It came about by thinking how it would be possible to reach a point where writing by people
and procedural writing are considered to have equal value. For this reason the thesis is set in a
posthuman context, where the differences between machines and people are erased.
The thesis uses practice to inform an original conceptual space model, based on quality dimensions
and dynamic-inter operation of spaces. This model gives an example of the procedures which a
posthuman creative writer uses when engaged in creative styling. It suggests an original formulation
for the conceptual blending of conceptual spaces, based on the casting of qualities from one space to
another. In support of and informing its arguments are ninety-nine examples of creative writing
practice which show the procedures by which style has been applied, created and assessed. It provides
a route forward for further joint research into both computational and human-coded creative writing.
Improving diagnostic procedures for epilepsy through automated recording and analysis of patients' history
Transient loss of consciousness (TLOC) is a time-limited state of profound cognitive impairment characterised by amnesia, abnormal motor control, loss of responsiveness, a short duration and complete recovery. Most instances of TLOC are caused by one of three health conditions: epilepsy, functional (dissociative) seizures (FDS), or syncope. There is often a delay before the correct diagnosis is made and 10-20% of individuals initially receive an incorrect diagnosis. Clinical decision tools based on the endorsement of TLOC symptom lists have been limited to distinguishing between two causes of TLOC. The Initial Paroxysmal Event Profile (iPEP) has shown promise but was demonstrated to have greater accuracy in distinguishing between syncope and epilepsy or FDS than between epilepsy and FDS. The objective of this thesis was to investigate whether interactional, linguistic, and communicative differences in how people with epilepsy and people with FDS describe their experiences of TLOC can improve the predictive performance of the iPEP. An online web application was designed that collected information about TLOC symptoms and medical history from patients and witnesses using a binary questionnaire and verbal interaction with a virtual agent (VA). We explored potential methods of automatically detecting these communicative differences, whether the differences were present during an interaction with the VA, to what extent these automatically detectable communicative differences improve the performance of the iPEP, and the acceptability of the application from the perspective of patients and witnesses. The two feature sets that were applied to previous doctor-patient interactions, designed to measure formulation effort or to detect semantic differences between the two groups, were able to predict the diagnosis with an accuracy of 71% and 81%, respectively.
Individuals with epilepsy or FDS provided descriptions of TLOC to the VA that were qualitatively similar to those observed in previous research. Both feature sets were effective predictors of the diagnosis when applied to the web application recordings (85.7% and 85.7%). Overall, the accuracy of machine learning models trained for the three-way classification between epilepsy, FDS, and syncope using the iPEP responses collected through the web application was worse than the performance observed in previous research (65.8% vs 78.3%), but performance was increased by the inclusion of features extracted from the spoken descriptions of TLOC (85.5%). Finally, most participants who provided feedback reported that the online application was acceptable. These findings suggest that it is feasible to differentiate between people with epilepsy and people with FDS using an automated analysis of spoken seizure descriptions. Furthermore, incorporating these features into a clinical decision tool for TLOC can improve predictive performance by improving the differential diagnosis between these two health conditions. Future research should use the feedback to improve the design of the application and increase the perceived acceptability of the approach.
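The reported accuracy gain comes from feeding the classifier both the binary iPEP responses and features derived from the spoken descriptions. A toy sketch of that combination step (feature names and values are entirely illustrative; the thesis does not specify this interface):

```python
import numpy as np

def combine_features(ipep_answers, speech_features):
    """Concatenate binary iPEP questionnaire responses with features
    extracted from spoken seizure descriptions (e.g. formulation-effort
    or semantic scores); the combined vector feeds the downstream
    three-way classifier."""
    return np.concatenate([np.asarray(ipep_answers, dtype=float),
                           np.asarray(speech_features, dtype=float)])

ipep = [1, 0, 1, 1]          # yes/no symptom endorsements (illustrative)
speech = [0.42, 3.1]         # spoken-description features (illustrative)
x = combine_features(ipep, speech)
print(x.shape)  # (6,)
```

The thesis's finding is that models trained on such combined vectors outperformed models trained on the questionnaire responses alone.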