1,085 research outputs found

    Low- and high-resource opinion summarization

    Get PDF
    Customer reviews play a vital role in the online purchasing decisions we make. The reviews express user opinions that are useful for setting realistic expectations and uncovering important details about products. However, some products receive hundreds or even thousands of reviews, making them time-consuming to read. Moreover, many reviews contain uninformative content, such as irrelevant personal experiences. Automatic summarization offers an alternative: short text summaries capturing the essential information expressed in reviews. Automatically produced summaries can reflect overall or particular opinions and be tailored to user preferences. Besides being presented on major e-commerce platforms, they can also be vocalized by home assistants. This approach can improve user satisfaction by assisting in making faster and better decisions. Modern summarization approaches are based on neural networks, often requiring thousands of annotated samples for training. However, human-written summaries for products are expensive to produce because annotators need to read many reviews. This has led to a scarcity of annotated data, with only a few datasets available. Data scarcity is the central theme of this work, and we propose a number of approaches to alleviate the problem. The thesis consists of two parts, covering low- and high-resource data settings. In the first part, we propose self-supervised learning methods applied to customer reviews and few-shot methods for learning from small annotated datasets. Customer reviews without summaries are available in large quantities, contain a breadth of in-domain specifics, and provide a powerful training signal. We show that reviews can be used for learning summarizers via a self-supervised objective. Further, we address two main challenges associated with learning from small annotated datasets. First, large models rapidly overfit on small datasets, leading to poor generalization. Second, it is not possible to learn a wide range of in-domain specifics (e.g., product aspects and usage) from a handful of gold samples. This leads to subtle semantic mistakes in generated summaries, such as ‘great dead on arrival battery.’ We address the first challenge by explicitly modeling summary properties (e.g., content coverage and sentiment alignment). Furthermore, we leverage small modules (adapters) that are more robust to overfitting. As we show, despite their size, these modules can be used to store in-domain knowledge and reduce semantic mistakes. Lastly, we propose a simple method for learning personalized summarizers based on aspects, such as ‘price,’ ‘battery life,’ and ‘resolution.’ This task is harder to learn, and we present a few-shot method for training a query-based summarizer on small annotated datasets. In the second part, we focus on the high-resource setting and present a large dataset with summaries collected from various online resources. The dataset has more than 33,000 human-written summaries, each linked to up to thousands of reviews. This, however, makes it challenging to apply an ‘expensive’ deep encoder due to memory and computational costs. To address this problem, we propose selecting small subsets of informative reviews. Only these subsets are encoded by the deep encoder and subsequently summarized. We show that the selector and summarizer can be trained end-to-end via amortized inference and policy gradient methods.
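
    To make the self-supervised idea concrete, here is a minimal sketch of one common instantiation: a leave-one-out setup in which each review of a product takes a turn as the pseudo-summary target while the remaining reviews serve as input. The setup and function names are illustrative assumptions, not necessarily the thesis' exact objective.

```python
# Illustrative leave-one-out pairing for self-supervised summarization
# (assumed setup): each review takes a turn as the pseudo-summary.
def leave_one_out_pairs(reviews):
    """Yield (input_reviews, pseudo_summary) pairs for one product."""
    for i, target in enumerate(reviews):
        sources = reviews[:i] + reviews[i + 1:]
        yield sources, target

product_reviews = [
    "Great battery life and a sharp screen.",
    "Battery lasts two days; the screen is bright.",
    "Solid phone overall, the display is excellent.",
]
for sources, target in leave_one_out_pairs(product_reviews):
    # A seq2seq summarizer would be trained with cross-entropy to generate
    # `target` conditioned on the concatenation of `sources`.
    print(f"{len(sources)} sources -> {target}")
```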

    On the Utility of Representation Learning Algorithms for Myoelectric Interfacing

    Get PDF
    Electrical activity produced by muscles during voluntary movement is a reflection of the firing patterns of relevant motor neurons and, by extension, the latent motor intent driving the movement. Once transduced via electromyography (EMG) and converted into digital form, this activity can be processed to provide an estimate of the original motor intent and is as such a feasible basis for non-invasive efferent neural interfacing. EMG-based motor intent decoding has so far received the most attention in the field of upper-limb prosthetics, where alternative means of interfacing are scarce and the utility of better control is apparent. Whereas myoelectric prostheses have been available since the 1960s, available EMG control interfaces still lag behind the mechanical capabilities of the artificial limbs they are intended to steer, a gap at least partially due to limitations in current methods for translating EMG into appropriate motion commands. As the relationship between EMG signals and concurrent effector kinematics is highly non-linear and apparently stochastic, finding ways to accurately extract and combine relevant information from across electrode sites is still an active area of inquiry. This dissertation comprises an introduction and eight papers that explore issues afflicting the status quo of myoelectric decoding and possible solutions, all related through their use of learning algorithms and deep Artificial Neural Network (ANN) models. Paper I presents a Convolutional Neural Network (CNN) for multi-label movement decoding of high-density surface EMG (HD-sEMG) signals. Inspired by the successful use of CNNs in Paper I and the work of others, Paper II presents a method for automatic design of CNN architectures for use in myocontrol. Paper III introduces an ANN architecture with an appertaining training framework from which simultaneous and proportional control emerges. Paper IV introduces a dataset of HD-sEMG signals for use with learning algorithms. Paper V applies a Recurrent Neural Network (RNN) model to decode finger forces from intramuscular EMG. Paper VI introduces a Transformer model for myoelectric interfacing that does not need additional training data to function with previously unseen users. Paper VII compares the performance of a Long Short-Term Memory (LSTM) network to that of classical pattern recognition algorithms. Lastly, Paper VIII describes a framework for synthesizing EMG signals of multi-articulate gestures, intended to reduce the training burden.
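
    For a sense of what a multi-label HD-sEMG decoder like the one in Paper I involves, here is a minimal PyTorch sketch; the layer sizes, 64 electrodes, and 16 movement labels are illustrative assumptions, not the paper's actual architecture.

```python
# Minimal multi-label EMG decoder sketch (assumed architecture).
import torch
import torch.nn as nn

class EMGConvNet(nn.Module):
    def __init__(self, n_channels=64, n_labels=16):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 128, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),   # pool over the time axis
        )
        self.head = nn.Linear(128, n_labels)

    def forward(self, x):              # x: (batch, channels, time)
        z = self.features(x).squeeze(-1)
        return self.head(z)            # one logit per movement label

model = EMGConvNet()
x = torch.randn(8, 64, 200)            # 8 windows of 200 samples each
labels = torch.randint(0, 2, (8, 16)).float()
# Multi-label decoding: an independent sigmoid/BCE term per movement.
loss = nn.BCEWithLogitsLoss()(model(x), labels)
```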

    Continuous Estimation of Smoking Lapse Risk from Noisy Wrist Sensor Data Using Sparse and Positive-Only Labels

    Get PDF
    Estimating the imminent risk of adverse health behaviors provides opportunities for developing effective behavioral intervention mechanisms to prevent the occurrence of the target behavior. One of the key goals is to find opportune moments for intervention by passively detecting the rising risk of an imminent adverse behavior. Significant progress in mobile health research and the ability to continuously sense internal and external states of individual health and behavior have paved the way for detecting diverse risk factors from mobile sensor data. The next frontier in this research is to account for the combined effects of these risk factors to produce a composite risk score of adverse behaviors using wearable sensors convenient for daily use. Developing a machine learning-based model for assessing the risk of smoking lapse in the natural environment faces significant outstanding challenges, each requiring novel methodology. The first challenge is arriving at an accurate representation of noisy and incomplete sensor data that encodes the present and historical influence of behavioral cues, mental states, and the interactions of individuals with their ever-changing environment. The next noteworthy challenge is the absence of confirmed negative labels of low-risk states and of adequate, precise annotations of high-risk states. Finally, the model should work on convenient wearable devices to facilitate widespread adoption in research and practice. In this dissertation, we develop methods that account for the multi-faceted nature of smoking lapse behavior to train and evaluate a machine learning model capable of estimating composite risk scores in the natural environment. We first develop mRisk, which combines the effects of various mHealth biomarkers, such as stress, physical activity, and location history, in producing the risk of smoking lapse using sequential deep neural networks. We propose an event-based encoding of sensor data to reduce the effect of noise and then present an approach to efficiently model the historical influence of recent and past sensor-derived contexts on the likelihood of smoking lapse. To circumvent the lack of confirmed negative labels (i.e., annotated low-risk moments) and the scarcity of positive labels (i.e., sensor-based detections of smoking lapse corroborated by self-reports), we propose a new loss function to accurately optimize the models. We build the mRisk models using biomarker (stress, physical activity) streams derived from chest-worn sensors. Adapting the models to work with less invasive and more convenient wrist-based sensors requires adapting the biomarker detection models to work with wrist-worn sensor data. To that end, we develop robust stress and activity inference methodologies from noisy wrist-sensor data. We first propose CQP, which quantifies the quality of photoplethysmography (PPG) data collected by wrist sensors. Next, we show that integrating CQP within the inference pipeline improves the accuracy-yield trade-offs associated with stress detection from wrist-worn PPG sensors in the natural environment. mRisk also requires sensor-based precise detection of smoking events, confirmed through self-reports, to extract positive labels. Hence, we develop rSmoke, an orientation-invariant smoking detection model that is robust to the variations in sensor data resulting from orientation switches in the field. We train the proposed mRisk risk estimation models using the wrist-based inferences of lapse risk factors. To evaluate the utility of the risk models, we simulate the delivery of intelligent smoking interventions to at-risk participants as informed by the composite risk scores. Our results demonstrate the envisaged impact of machine learning-based models operating on wrist-worn wearable sensor data to output continuous smoking lapse risk scores. The novel methodologies we propose throughout this dissertation help open a new frontier in smoking research that can potentially improve the smoking abstinence rate in participants willing to quit.
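
    As an illustration of learning from positive-only labels, the following sketch implements the standard non-negative positive-unlabeled (nnPU) risk estimator of Kiryo et al. (2017); this is an analogue for context, not the dissertation's actual loss function.

```python
# Non-negative PU risk estimator (Kiryo et al., 2017), shown as a standard
# analogue for positive-only learning; not the dissertation's proposed loss.
import torch
import torch.nn.functional as F

def nn_pu_loss(logits_pos, logits_unl, prior=0.1):
    """logits_pos: model outputs on positive samples; logits_unl: outputs
    on unlabeled samples; prior: assumed class prior of positives."""
    risk_pos = F.binary_cross_entropy_with_logits(
        logits_pos, torch.ones_like(logits_pos))
    # Negative-class risk on unlabeled data, corrected for the positives
    # hidden inside it; clamped at zero to prevent overfitting.
    risk_neg = (F.binary_cross_entropy_with_logits(
                    logits_unl, torch.zeros_like(logits_unl))
                - prior * F.binary_cross_entropy_with_logits(
                    logits_pos, torch.zeros_like(logits_pos)))
    return prior * risk_pos + torch.clamp(risk_neg, min=0.0)

loss = nn_pu_loss(torch.randn(4), torch.randn(64), prior=0.05)
```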

    Data-proximal complementary ℓ¹-TV reconstruction for limited data CT

    Full text link
    In a number of tomographic applications, data cannot be fully acquired, resulting in a severely underdetermined image reconstruction. In such cases, conventional methods lead to reconstructions with significant artifacts. To overcome these artifacts, regularization methods are applied that incorporate additional information. An important example is TV reconstruction, which is known to be efficient at compensating for missing data and reducing reconstruction artifacts. At the same time, however, tomographic data is also contaminated by noise, which poses an additional challenge. The use of a single regularizer must therefore account for both the missing data and the noise. However, a particular regularizer may not be ideal for both tasks. For example, the TV regularizer is a poor choice for noise reduction across multiple scales, in which case ℓ¹ curvelet regularization methods are well suited. To address this issue, in this paper we introduce a novel variational regularization framework that combines the advantages of different regularizers. The basic idea of our framework is to perform reconstruction in two stages, where the first stage mainly aims at accurate reconstruction in the presence of noise, and the second stage aims at artifact reduction. Both reconstruction stages are connected by a data proximity condition. The proposed method is implemented and tested for limited-view CT using a combined curvelet-TV approach. We define and implement a curvelet transform adapted to the limited-view problem and illustrate the advantages of our approach in numerical experiments.
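
    Schematically, the two-stage scheme can be written as follows; the operator A, data y, curvelet transform Φ, and parameters α, δ are assumed notation, not taken verbatim from the paper.

```latex
% Schematic two-stage formulation; A is the (limited-view) forward operator,
% y the noisy data, \Phi a curvelet transform, \alpha, \delta parameters.
\begin{align*}
  x_1 &\in \operatorname*{arg\,min}_{x}\;
        \tfrac{1}{2}\|Ax - y\|_2^2 + \alpha \|\Phi x\|_1
        && \text{(stage 1: noise reduction via $\ell^1$ curvelets)}\\
  x_2 &\in \operatorname*{arg\,min}_{x}\;
        \mathrm{TV}(x)
        \;\;\text{s.t.}\;\; \|Ax - Ax_1\|_2 \le \delta
        && \text{(stage 2: artifact reduction, tied by data proximity)}
\end{align*}
```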

    Beam scanning by liquid-crystal biasing in a modified SIW structure

    Get PDF
    A fixed-frequency beam-scanning 1D antenna based on Liquid Crystals (LCs) is designed for application in 2D scanning with lateral alignment. The 2D array environment imposes full decoupling of adjacent 1D antennas, which often conflicts with the LC requirement of DC biasing: the proposed design accommodates both. The LC medium is placed inside a Substrate Integrated Waveguide (SIW), modified to work as a Groove Gap Waveguide, with radiating slots etched on the upper broad wall so that it radiates as a Leaky-Wave Antenna (LWA). This allows effective application of the DC bias voltage needed for tuning the LCs. At the same time, the RF field remains laterally confined, making it possible to lay several antennas in parallel and achieve 2D beam scanning. The design is validated by simulation employing the actual properties of a commercial LC medium.
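
    For context, fixed-frequency scanning in a leaky-wave antenna follows from the textbook relation between the main-beam angle and the guided phase constant, which the LC bias tunes; this standard formula is not specific to the proposed design.

```latex
% Textbook leaky-wave relation: the main beam points at angle \theta from
% broadside, \beta is the guided phase constant (tuned here via the LC
% permittivity \varepsilon_{\mathrm{LC}}), k_0 the free-space wavenumber.
\[
  \sin\theta \;\approx\; \frac{\beta(\varepsilon_{\mathrm{LC}})}{k_0}
\]
```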

    Learning Weakly Convex Regularizers for Convergent Image-Reconstruction Algorithms

    Full text link
    We propose to learn non-convex regularizers with a prescribed upper bound on their weak-convexity modulus. Such regularizers give rise to variational denoisers that minimize a convex energy. They rely on few parameters (fewer than 15,000) and offer a signal-processing interpretation, as they mimic handcrafted sparsity-promoting regularizers. Through numerical experiments, we show that such denoisers outperform convex-regularization methods as well as the popular BM3D denoiser. Additionally, the learned regularizer can be deployed to solve inverse problems with iterative schemes that provably converge. For both CT and MRI reconstruction, the regularizer generalizes well and offers an excellent tradeoff between performance, number of parameters, guarantees, and interpretability when compared to other data-driven approaches.
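
    Concretely, the setting can be summarized as follows, using assumed notation for the learned regularizer and its modulus.

```latex
% Assumed notation: R_\theta is the learned regularizer with weak-convexity
% modulus \rho, i.e. R_\theta(x) + \tfrac{\rho}{2}\|x\|_2^2 is convex.
% For \rho \le 1, the denoising energy below is convex even though
% R_\theta itself need not be:
\[
  \hat{x} \;=\; \operatorname*{arg\,min}_{x}\;
  \tfrac{1}{2}\|x - y\|_2^2 + R_\theta(x)
\]
```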

    Curvature corrected tangent space-based approximation of manifold-valued data

    Full text link
    When generalizing schemes for real-valued data approximation or decomposition to data living in Riemannian manifolds, tangent space-based schemes are very attractive for the simple reason that these spaces are linear. An open challenge is to do this in such a way that the generalized scheme is applicable to general Riemannian manifolds, is global-geometry aware, and is computationally feasible. Existing schemes have been unable to account for all three of these key factors at the same time. In this work, we take a systematic approach to developing a framework that is able to account for all three factors. First, we restrict ourselves to the (still general) class of symmetric Riemannian manifolds and show how curvature affects general manifold-valued tensor approximation schemes. Next, we show how the latter observations can be used in a general strategy for developing approximation schemes that are also global-geometry aware. Finally, with general applicability and global-geometry awareness taken into account, we restrict ourselves once more in a case study on low-rank approximation. Here we show how computational feasibility can be achieved and propose the curvature-corrected truncated higher-order singular value decomposition (CC-tHOSVD), whose performance is subsequently tested in numerical experiments with both synthetic and real data living in symmetric Riemannian manifolds with both positive and negative curvature.
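
    To make the tangent space idea concrete, here is a minimal NumPy sketch on the unit sphere: data are lifted to a tangent space, approximated with a linear low-rank step (PCA here; tHOSVD for tensor-valued data), and mapped back. The curvature corrections that distinguish CC-tHOSVD are deliberately omitted.

```python
# Generic log-approximate-exp scheme on the unit sphere (illustrative only;
# CC-tHOSVD adds curvature corrections that are not shown here).
import numpy as np

def log_map(mu, q):
    """Riemannian log on the sphere: tangent vector at mu pointing to q."""
    d = np.arccos(np.clip(mu @ q, -1.0, 1.0))
    v = q - (mu @ q) * mu
    n = np.linalg.norm(v)
    return d * v / n if n > 1e-12 else np.zeros_like(mu)

def exp_map(mu, v):
    """Riemannian exp on the sphere."""
    n = np.linalg.norm(v)
    return mu if n < 1e-12 else np.cos(n) * mu + np.sin(n) * (v / n)

rng = np.random.default_rng(0)
mu = np.array([0.0, 0.0, 1.0])                  # base point
tangents = [0.3 * rng.standard_normal(3) * np.array([1.0, 1.0, 0.0])
            for _ in range(20)]                 # tangent vectors at mu
data = [exp_map(mu, v) for v in tangents]       # points on the sphere

V = np.array([log_map(mu, q) for q in data])    # lift data to T_mu
mean = V.mean(axis=0)
U, S, Vt = np.linalg.svd(V - mean, full_matrices=False)
V_lowrank = mean + (U[:, :1] * S[:1]) @ Vt[:1]  # rank-1 approximation
approx = [exp_map(mu, v) for v in V_lowrank]    # map back to the sphere
```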

    Geometric Data Analysis: Advancements of the Statistical Methodology and Applications

    Get PDF
    Data analysis has become fundamental to our society and comes in multiple facets and approaches. Nevertheless, in research and applications, the focus has primarily been on data from Euclidean vector spaces. Consequently, the majority of methods that are applied today are not suited for more general data types. Driven by needs from fields like image processing, (medical) shape analysis, and network analysis, more and more attention has recently been given to data from non-Euclidean spaces, particularly (curved) manifolds. This has led to the field of geometric data analysis, whose methods explicitly take the structure (for example, the topology and geometry) of the underlying space into account. This thesis contributes to the methodology of geometric data analysis by generalizing several fundamental notions from multivariate statistics to manifolds. We thereby focus on two different viewpoints. First, we use Riemannian structures to derive a novel regression scheme for general manifolds that relies on splines of generalized Bézier curves. It can accurately model non-geodesic relationships, for example, time-dependent trends with saturation effects or cyclic trends. Since Bézier curves can be evaluated with the constructive de Casteljau algorithm, working with data from manifolds of high dimensions (for example, a hundred thousand or more) is feasible. Relying on the regression, we further develop a hierarchical statistical model for an adequate analysis of longitudinal data in manifolds, and a method to control for confounding variables. Second, we focus on data that is not only manifold- but even Lie group-valued, which is frequently the case in applications. We can only achieve this by endowing the group with an affine connection structure that is generally not Riemannian. Utilizing it, we derive generalizations of several well-known dissimilarity measures between data distributions that can be used for various tasks, including hypothesis testing. Invariance under data translations is proven, and a connection to continuous distributions is given for one measure. A further central contribution of this thesis is that it shows use cases for all notions in real-world applications, particularly in problems from shape analysis in medical imaging and archaeology. We can replicate or further quantify several known findings for shape changes of the femur and the right hippocampus under osteoarthritis and Alzheimer's, respectively. Furthermore, in an archaeological application, we obtain new insights into the construction principles of ancient sundials. Last but not least, we use the geometric structure underlying human brain connectomes to predict cognitive scores. Utilizing a sample selection procedure, we obtain state-of-the-art results.
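
    The constructive evaluation mentioned above generalizes de Casteljau's algorithm by replacing straight-line interpolation with geodesic interpolation. Below is a minimal sketch on the unit sphere, where the geodesic between two points is the familiar slerp; the control points and function names are illustrative.

```python
# Generalized de Casteljau evaluation of a manifold Bézier curve,
# sketched on the unit sphere.
import numpy as np

def geodesic(p, q, t):
    """Point at parameter t on the sphere geodesic from p to q (slerp)."""
    omega = np.arccos(np.clip(p @ q, -1.0, 1.0))
    if omega < 1e-12:
        return p
    return (np.sin((1 - t) * omega) * p + np.sin(t * omega) * q) / np.sin(omega)

def bezier_point(control_points, t):
    """De Casteljau: repeatedly replace neighboring points by the point at
    parameter t on their connecting geodesic until one point remains."""
    pts = list(control_points)
    while len(pts) > 1:
        pts = [geodesic(p, q, t) for p, q in zip(pts, pts[1:])]
    return pts[0]

# A quadratic Bézier curve on S^2 from three control points.
cps = [np.array([1.0, 0.0, 0.0]),
       np.array([0.0, 1.0, 0.0]),
       np.array([0.0, 0.0, 1.0])]
curve = [bezier_point(cps, t) for t in np.linspace(0.0, 1.0, 5)]
```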

    What's in a Prior? Learned Proximal Networks for Inverse Problems

    Full text link
    Proximal operators are ubiquitous in inverse problems, commonly appearing as part of algorithmic strategies to regularize problems that are otherwise ill-posed. Modern deep learning models have been brought to bear for these tasks too, as in the framework of plug-and-play or deep unrolling, where they loosely resemble proximal operators. Yet, something essential is lost in employing these purely data-driven approaches: there is no guarantee that a general deep network represents the proximal operator of any function, nor is there any characterization of the function for which the network might provide some approximate proximal. This not only makes guaranteeing convergence of iterative schemes challenging but, more fundamentally, complicates the analysis of what has been learned by these networks about their training data. Herein we provide a framework to develop learned proximal networks (LPNs), prove that they provide exact proximal operators for a data-driven nonconvex regularizer, and show how a new training strategy, dubbed proximal matching, provably promotes the recovery of the log-prior of the true data distribution. Such LPNs provide general, unsupervised, expressive proximal operators that can be used for general inverse problems with convergence guarantees. We illustrate our results in a series of cases of increasing complexity, demonstrating that these models not only result in state-of-the-art performance, but provide a window into the resulting priors learned from data.
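
    For reference, the proximal operator that an LPN is trained to realize exactly is the standard one below; the network parameterization itself is not shown, and the symbols are assumed notation.

```latex
% Standard definition: the proximal operator of a regularizer R. An LPN is
% constrained so that the network equals prox_R for some (possibly
% nonconvex) R; proximal matching then steers R toward the negative
% log-prior -\log p(x) of the training data.
\[
  \operatorname{prox}_{R}(y)
  \;=\;
  \operatorname*{arg\,min}_{x}\; \tfrac{1}{2}\|x - y\|_2^2 + R(x)
\]
```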

    Insights on Learning Tractable Probabilistic Graphical Models

    Get PDF