Learning with Low-Quality Data: Multi-View Semi-Supervised Learning with Missing Views
The focus of this thesis is on learning approaches for what we call ``low-quality data'', and in particular data in which only small amounts of labeled target data are available. The first part provides background discussion of low-quality data issues, followed by a preliminary study in this area. The remainder of the thesis focuses on a particular scenario: multi-view semi-supervised learning. Multi-view learning generally refers to learning with data that has multiple natural views, or sets of features, associated with it. Multi-view semi-supervised learning methods try to exploit the combination of multiple views along with large amounts of unlabeled data in order to learn better predictive functions when limited labeled data is available. However, a lack of complete view data limits the applicability of multi-view semi-supervised learning to real-world data. Commonly, one data view is readily and cheaply available, but additional views may be costly or available only in some cases. This thesis aims to make multi-view semi-supervised learning approaches more applicable to real-world data, specifically by addressing the issue of missing views through both feature generation and active learning, and by addressing the issue of model selection for semi-supervised learning with limited labeled data.

The thesis introduces a unified approach for handling missing view data in multi-view semi-supervised learning tasks, which applies both to data with a completely missing additional view and to data missing views in only some instances. The idea is to learn a feature-generation function mapping one view to another, with the mapping biased to encourage the generated features to be useful for multi-view semi-supervised learning algorithms. The mapping is then used to fill in views as a pre-processing step. Unlike previously proposed single-view approaches to multi-view learning, the proposed approach is able to take advantage of additional view data when available, and for the case of partial view presence it is the first feature-generation approach specifically designed to account for the multi-view semi-supervised learning setting. The next component of the thesis is the analysis of an active view completion scenario: in some tasks, it is possible to obtain missing view data for a particular instance, but at some cost. Recent work has shown that an active selection strategy can be more effective than a random one. This thesis seeks a better understanding of active approaches, and demonstrates that the effectiveness of an active selection strategy over a random one can depend on the relationship between the views.

Finally, an important component of making multi-view semi-supervised learning applicable to real-world data is model selection, an open problem that is often avoided entirely in previous work. With very limited labeled training data, the commonly used cross-validation approach can become ineffective. This thesis introduces a re-training alternative to method-dependent approaches, similar in motivation to cross-validation, that generates new training and test data by sampling from the large amount of unlabeled data, with labels drawn from estimated conditional probabilities. The proposed approaches are evaluated on a variety of multi-view semi-supervised learning data sets, and the experimental results demonstrate their efficacy.
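A minimal sketch of the re-training model-selection idea described above, assuming a scikit-learn-style estimator with fit/predict_proba; the function name and sampling scheme are illustrative assumptions, not the thesis's exact procedure.

import numpy as np

def retraining_score(model_factory, X_labeled, y_labeled, X_unlabeled,
                     n_rounds=10, sample_size=200, rng=None):
    """Score a candidate model configuration without a labeled hold-out set.

    A reference model fit on the scarce labeled data estimates conditional
    label probabilities on the unlabeled pool; synthetic train/test splits
    are then sampled from that pool with labels drawn from the estimates.
    """
    rng = np.random.default_rng(rng)
    ref = model_factory().fit(X_labeled, y_labeled)
    proba = ref.predict_proba(X_unlabeled)   # estimated P(y | x) on the pool
    classes = ref.classes_
    scores = []
    for _ in range(n_rounds):
        # Sample a synthetic train/test split from the unlabeled pool.
        idx = rng.choice(len(X_unlabeled), size=2 * sample_size, replace=False)
        y_sampled = np.array([rng.choice(classes, p=p) for p in proba[idx]])
        tr, te = idx[:sample_size], idx[sample_size:]
        cand = model_factory().fit(X_unlabeled[tr], y_sampled[:sample_size])
        acc = np.mean(cand.predict(X_unlabeled[te]) == y_sampled[sample_size:])
        scores.append(acc)
    return float(np.mean(scores))

# Usage (hypothetical): pick the configuration with the highest score, e.g.
# best = max(configs, key=lambda c: retraining_score(lambda: make_model(c),
#                                                    X_l, y_l, X_u))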
GLSVM: Integrating Structured Feature Selection and Large Margin Classification
High dimensional data challenges current feature selection methods. For many real-world problems we often have prior knowledge about the relationships among features. For example, in microarray data analysis, genes from the same biological pathway are expected to have a similar relationship to the outcome we aim to predict. Recent regularization methods for the Support Vector Machine (SVM) have achieved great success in performing feature selection and model selection simultaneously for high dimensional data, but they neglect such relationships among features. To build interpretable SVM models, the structure information of features should be incorporated. In this paper, we propose an algorithm, GLSVM, that automatically performs model selection and feature selection in SVMs. To incorporate the prior knowledge of feature relationships, we extend the standard L2-norm SVM and use a penalty function that combines an L2-norm regularization term based on the normalized Laplacian of the feature graph with an L1 penalty. We demonstrate the effectiveness of our method and compare it to the state of the art on two real-world benchmarks.
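Read literally, the penalty described above suggests an objective of the following form; this is a hedged reconstruction, with \tilde{L} the normalized Laplacian of the feature-relationship graph and \lambda_1, \lambda_2 the trade-off weights (the paper's exact loss and weighting may differ):

\min_{w,\,b} \; \sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i (w^{\top} x_i + b)\bigr) \;+\; \lambda_1 \lVert w \rVert_1 \;+\; \lambda_2\, w^{\top} \tilde{L}\, w

The Laplacian term encourages connected features (e.g., genes in the same pathway) to receive similar weights, while the L1 term drives sparsity for feature selection.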
An Optimistic-Robust Approach for Dynamic Positioning of Omnichannel Inventories
We introduce a new class of data-driven and distribution-free
optimistic-robust bimodal inventory optimization (BIO) strategies to
effectively allocate inventory across a retail chain to meet time-varying,
uncertain omnichannel demand. While prior robust optimization (RO) methods
emphasize the downside, i.e., worst-case adversarial demand, BIO also
considers the upside, remaining resilient like RO while also reaping the
rewards of improved average-case performance by overcoming the presence of
endogenous outliers.
This bimodal strategy is particularly valuable for balancing the tradeoff
between lost sales at the store and the costs of cross-channel e-commerce
fulfillment, which is at the core of our inventory optimization model. These
factors are asymmetric due to the heterogeneous behavior of the channels, with a
bias towards the former in terms of lost-sales cost and a dependence on network
effects for the latter. We provide structural insights about the BIO solution
and how it can be tuned to achieve a preferred tradeoff between robustness and
the average-case. Our experiments show that significant benefits can be
achieved by rethinking traditional approaches to inventory management, which
are siloed by channel and location. Using a real-world dataset from a large
American omnichannel retail chain, a business value assessment during a peak
period indicates over a 15% profitability gain for BIO over RO and other
baselines while also preserving the (practical) worst-case performance.
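One way to make the bimodal idea concrete is as a convex combination of worst-case and average-case scenario costs; the weight rho, the cost parameters, and the pooled cross-channel fulfillment below are illustrative assumptions rather than the paper's exact model.

import numpy as np

def bio_objective(inventory, demand_scenarios, rho=0.3,
                  lost_sale_cost=5.0, fulfillment_cost=1.0):
    """Illustrative bimodal cost: rho weights the worst case (robust mode),
    1 - rho the average case (optimistic mode)."""
    per_scenario = []
    for demand in demand_scenarios:                      # per-store demand vector
        shortfall = np.maximum(demand - inventory, 0.0)  # unmet local demand
        surplus = np.maximum(inventory - demand, 0.0)    # stock available elsewhere
        shipped = np.minimum(shortfall.sum(), surplus.sum())  # cross-channel units
        lost = shortfall.sum() - shipped                 # demand no store can serve
        per_scenario.append(fulfillment_cost * shipped + lost_sale_cost * lost)
    per_scenario = np.asarray(per_scenario)
    return rho * per_scenario.max() + (1.0 - rho) * per_scenario.mean()

Setting rho = 1 recovers a purely worst-case (RO-style) criterion, while rho = 0 optimizes the average case alone; intermediate values trade off the two modes.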
Hierarchy-guided Model Selection for Time Series Forecasting
Generalizability of time series forecasting models depends on the quality of
model selection. Temporal cross validation (TCV) is a standard technique to
perform model selection in forecasting tasks. TCV sequentially partitions the
training time series into train and validation windows, and performs
hyperparameter optimization (HPO) of the forecast model to select the model with
the best validation performance. Model selection with TCV often leads to poor
test performance when the test data distribution differs from that of the
validation data. We propose a novel model selection method, H-Pro, that
exploits the data hierarchy often associated with a time series dataset.
Generally, the aggregated data at the higher levels of the hierarchy show
better predictability and more consistency than the bottom-level data, which
are sparser and (sometimes) intermittent. H-Pro performs the HPO of the
lowest-level student model based on the test proxy forecasts obtained from a
set of teacher models at higher levels in the hierarchy. The consistency of the
teachers' proxy forecasts helps select better student models at the
lowest level. We perform extensive empirical studies on multiple datasets to
validate the efficacy of the proposed method. H-Pro, combined with
off-the-shelf forecasting models, outperforms existing state-of-the-art
forecasting methods, including the winning models of the M5 point-forecasting
competition.
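A minimal sketch of the selection loop as described: each hyperparameter candidate's bottom-level forecasts are aggregated up the hierarchy and scored against proxy forecasts from a teacher model fit at a higher level. The interfaces (fit_student, teacher_forecast, agg_matrix) are assumed for illustration.

import numpy as np

def h_pro_select(candidates, fit_student, teacher_forecast, agg_matrix, horizon):
    """Pick the hyperparameter candidate whose aggregated bottom-level
    forecasts best match the teacher's higher-level proxy forecasts."""
    proxy = teacher_forecast(horizon)               # shape: (horizon, n_upper)
    best_params, best_err = None, np.inf
    for params in candidates:
        student = fit_student(params)               # fit on bottom-level series
        bottom = student.forecast(horizon)          # shape: (horizon, n_bottom)
        rolled_up = bottom @ agg_matrix.T           # aggregate to teacher level
        err = np.mean((rolled_up - proxy) ** 2)     # consistency with the proxy
        if err < best_err:
            best_params, best_err = params, err
    return best_params

Here agg_matrix is the usual hierarchy summation matrix mapping bottom-level series to their upper-level aggregates, so no labeled validation window from the test period is needed.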
Quantum Multiple Kernel Learning in Financial Classification Tasks
Financial services is a promising industry in which near-term quantum
utility could yield profitable applications; in particular, quantum machine
learning algorithms could benefit businesses by improving the quality of
predictive models. Quantum kernel methods have demonstrated success in
financial binary classification tasks, such as fraud detection, and avoid
issues found in variational quantum machine learning approaches. However,
choosing a suitable quantum kernel for a classical dataset remains a challenge.
We propose a hybrid, quantum multiple kernel learning (QMKL) methodology that
can improve classification quality over a single kernel approach. We test the
robustness of QMKL on several financially relevant datasets using both fidelity
and projected quantum kernel approaches. We further demonstrate QMKL on quantum
hardware using an error mitigation pipeline and show the benefits of QMKL in
the large qubit regime.
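A hedged sketch of the multiple-kernel step, assuming the quantum kernel Gram matrices (fidelity or projected) have already been estimated on a simulator or hardware; the convex weighting shown here is one simple instance of multiple kernel learning, not necessarily the paper's procedure.

import numpy as np
from sklearn.svm import SVC

def combine_kernels(kernel_mats, weights=None):
    """Convex combination K = sum_i mu_i K_i of precomputed quantum kernels."""
    k = len(kernel_mats)
    mu = np.full(k, 1.0 / k) if weights is None else np.asarray(weights, float)
    mu = mu / mu.sum()                       # keep the weights on the simplex
    return sum(m * K for m, K in zip(mu, kernel_mats))

# Usage with precomputed train/test Gram matrices (assumed available):
# K_train = combine_kernels([K1_train, K2_train])
# clf = SVC(kernel="precomputed").fit(K_train, y_train)
# K_test = combine_kernels([K1_test, K2_test])   # rows: test, cols: train
# y_pred = clf.predict(K_test)

In full MKL the weights mu would themselves be optimized (e.g., to maximize kernel alignment or the SVM margin) rather than fixed uniformly as in this sketch.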
The Importance of Thermal Emission Spectroscopy for Understanding Terrestrial Exoplanets
The primary objective of this white paper is to illustrate the importance of the thermal infrared in characterizing terrestrial planets, leveraging our experience in characterizing extra-solar jovian worlds.
A Realistic Roadmap to Formation Flying Space Interferometry
The ultimate astronomical observatory would be a formation flying space interferometer, combining sensitivity and stability with high angular resolution. The smallSat revolution offers a new and maturing prototyping platform for space interferometry, and we put forward a realistic plan for achieving first stellar fringes in space by 2030.