Learning with Low-Quality Data: Multi-View Semi-Supervised Learning with Missing Views
The focus of this thesis is on learning approaches for what we call ``low-quality data'', and in particular data in which only small amounts of labeled target data are available. The first part provides background discussion of low-quality data issues, followed by a preliminary study in this area. The remainder of the thesis focuses on a particular scenario: multi-view semi-supervised learning. Multi-view learning generally refers to learning with data that has multiple natural views, or sets of features, associated with it. Multi-view semi-supervised learning methods try to exploit the combination of multiple views along with large amounts of unlabeled data in order to learn better predictive functions when limited labeled data is available. However, a lack of complete view data limits the applicability of multi-view semi-supervised learning to real-world data. Commonly, one data view is readily and cheaply available, but additional views may be costly or available only in some cases. This thesis aims to make multi-view semi-supervised learning approaches more applicable to real-world data, specifically by addressing the issue of missing views through both feature generation and active learning, and by addressing the issue of model selection for semi-supervised learning with limited labeled data.

The thesis introduces a unified approach for handling missing view data in multi-view semi-supervised learning tasks, which applies both to data with a completely missing additional view and to data missing views in only some instances. The idea is to learn a feature-generation function mapping one view to another, with the mapping biased to encourage the generated features to be useful for multi-view semi-supervised learning algorithms. The mapping is then used to fill in views as a pre-processing step. Unlike previously proposed single-view approaches to multi-view learning, the proposed approach is able to take advantage of additional view data when available, and for the case of partial view presence it is the first feature-generation approach specifically designed to account for the multi-view semi-supervised learning setting. The next component of the thesis is the analysis of an active view completion scenario: in some tasks, it is possible to obtain missing view data for a particular instance, but at some cost. Recent work has shown that an active selection strategy can be more effective than a random one. This thesis seeks a better understanding of active approaches, and demonstrates that the effectiveness of an active selection strategy over a random one can depend on the relationship between the views.

Finally, an important component of making multi-view semi-supervised learning applicable to real-world data is model selection, an open problem that is often avoided entirely in previous work. With very limited labeled training data, the commonly used cross-validation approach can become ineffective. This thesis introduces a re-training alternative to method-dependent approaches, similar in motivation to cross-validation, that generates new training and test data by sampling from the large amount of unlabeled data, with labels drawn from estimated conditional probabilities. The proposed approaches are evaluated on a variety of multi-view semi-supervised learning data sets, and the experimental results demonstrate their efficacy.
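A minimal sketch of the re-training model-selection idea described above, assuming a scikit-learn-style estimator with fit/predict_proba; the function name and sampling scheme are illustrative assumptions, not the thesis's exact procedure.

import numpy as np

def retraining_score(model_factory, X_labeled, y_labeled, X_unlabeled,
                     n_rounds=10, sample_size=200, rng=None):
    """Score a candidate model configuration without a labeled hold-out set.

    A reference model fit on the scarce labeled data estimates conditional
    label probabilities on the unlabeled pool; synthetic train/test splits
    are then sampled from that pool with labels drawn from the estimates.
    """
    rng = np.random.default_rng(rng)
    ref = model_factory().fit(X_labeled, y_labeled)
    proba = ref.predict_proba(X_unlabeled)   # estimated P(y | x) on the pool
    classes = ref.classes_
    scores = []
    for _ in range(n_rounds):
        # Sample a synthetic train/test split from the unlabeled pool.
        idx = rng.choice(len(X_unlabeled), size=2 * sample_size, replace=False)
        y_sampled = np.array([rng.choice(classes, p=p) for p in proba[idx]])
        tr, te = idx[:sample_size], idx[sample_size:]
        cand = model_factory().fit(X_unlabeled[tr], y_sampled[:sample_size])
        acc = np.mean(cand.predict(X_unlabeled[te]) == y_sampled[sample_size:])
        scores.append(acc)
    return float(np.mean(scores))

# Usage (hypothetical): pick the configuration with the highest score, e.g.
# best = max(configs, key=lambda c: retraining_score(lambda: make_model(c),
#                                                    X_l, y_l, X_u))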
GLSVM: Integrating Structured Feature Selection and Large Margin Classification
High dimensional data challenges current feature selection methods. For many real-world problems we often have prior knowledge about the relationships among features. For example, in microarray data analysis, genes from the same biological pathway are expected to have a similar relationship to the outcome we aim to predict. Recent regularization methods for the Support Vector Machine (SVM) have achieved great success in performing feature selection and model selection simultaneously for high dimensional data, but they neglect such relationships among features. To build interpretable SVM models, the structure information of features should be incorporated. In this paper, we propose an algorithm, GLSVM, that automatically performs model selection and feature selection in SVMs. To incorporate the prior knowledge of feature relationships, we extend the standard L2-norm SVM and use a penalty function that combines an L2-norm regularization term based on the normalized Laplacian of the feature graph with an L1 penalty. We demonstrate the effectiveness of our method and compare it to the state of the art on two real-world benchmarks.
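Read literally, the penalty described above suggests an objective of the following form; this is a hedged reconstruction, with \tilde{L} the normalized Laplacian of the feature-relationship graph and \lambda_1, \lambda_2 the trade-off weights (the paper's exact loss and weighting may differ):

\min_{w,\,b} \; \sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i (w^{\top} x_i + b)\bigr) \;+\; \lambda_1 \lVert w \rVert_1 \;+\; \lambda_2\, w^{\top} \tilde{L}\, w

The Laplacian term encourages connected features (e.g., genes in the same pathway) to receive similar weights, while the L1 term drives sparsity for feature selection.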
An Optimistic-Robust Approach for Dynamic Positioning of Omnichannel Inventories
We introduce a new class of data-driven and distribution-free
optimistic-robust bimodal inventory optimization (BIO) strategies to
effectively allocate inventory across a retail chain to meet time-varying,
uncertain omnichannel demand. While prior robust optimization (RO) methods
emphasize the downside, i.e., worst-case adversarial demand, BIO also
considers the upside, remaining resilient like RO while also reaping the
rewards of improved average-case performance by overcoming the presence of
endogenous outliers.
This bimodal strategy is particularly valuable for balancing the tradeoff
between lost sales at the store and the costs of cross-channel e-commerce
fulfillment, which is at the core of our inventory optimization model. These
factors are asymmetric due to the heterogeneous behavior of the channels, with a
bias towards the former in terms of lost-sales cost and a dependence on network
effects for the latter. We provide structural insights about the BIO solution
and how it can be tuned to achieve a preferred tradeoff between robustness and
the average-case. Our experiments show that significant benefits can be
achieved by rethinking traditional approaches to inventory management, which
are siloed by channel and location. Using a real-world dataset from a large
American omnichannel retail chain, a business value assessment during a peak
period indicates over a 15% profitability gain for BIO over RO and other
baselines while also preserving the (practical) worst-case performance.
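One way to make the bimodal idea concrete is as a convex combination of worst-case and average-case scenario costs; the weight rho, the cost parameters, and the pooled cross-channel fulfillment below are illustrative assumptions rather than the paper's exact model.

import numpy as np

def bio_objective(inventory, demand_scenarios, rho=0.3,
                  lost_sale_cost=5.0, fulfillment_cost=1.0):
    """Illustrative bimodal cost: rho weights the worst case (robust mode),
    1 - rho the average case (optimistic mode)."""
    per_scenario = []
    for demand in demand_scenarios:                      # per-store demand vector
        shortfall = np.maximum(demand - inventory, 0.0)  # unmet local demand
        surplus = np.maximum(inventory - demand, 0.0)    # stock available elsewhere
        shipped = np.minimum(shortfall.sum(), surplus.sum())  # cross-channel units
        lost = shortfall.sum() - shipped                 # demand no store can serve
        per_scenario.append(fulfillment_cost * shipped + lost_sale_cost * lost)
    per_scenario = np.asarray(per_scenario)
    return rho * per_scenario.max() + (1.0 - rho) * per_scenario.mean()

Setting rho = 1 recovers a purely worst-case (RO-style) criterion, while rho = 0 optimizes the average case alone; intermediate values trade off the two modes.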
Hierarchy-guided Model Selection for Time Series Forecasting
Generalizability of time series forecasting models depends on the quality of
model selection. Temporal cross validation (TCV) is a standard technique to
perform model selection in forecasting tasks. TCV sequentially partitions the
training time series into train and validation windows, and performs
hyperparameter optimization (HPO) of the forecast model to select the model with
the best validation performance. Model selection with TCV often leads to poor
test performance when the test data distribution differs from that of the
validation data. We propose a novel model selection method, H-Pro, that
exploits the data hierarchy often associated with a time series dataset.
Generally, the aggregated data at the higher levels of the hierarchy show
better predictability and more consistency than the bottom-level data, which
are sparser and (sometimes) intermittent. H-Pro performs the HPO of the
lowest-level student model based on the test proxy forecasts obtained from a
set of teacher models at higher levels in the hierarchy. The consistency of the
teachers' proxy forecasts helps select better student models at the
lowest level. We perform extensive empirical studies on multiple datasets to
validate the efficacy of the proposed method. H-Pro, combined with
off-the-shelf forecasting models, outperforms existing state-of-the-art
forecasting methods, including the winning models of the M5 point-forecasting
competition.
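A minimal sketch of the selection loop as described: each hyperparameter candidate's bottom-level forecasts are aggregated up the hierarchy and scored against proxy forecasts from a teacher model fit at a higher level. The interfaces (fit_student, teacher_forecast, agg_matrix) are assumed for illustration.

import numpy as np

def h_pro_select(candidates, fit_student, teacher_forecast, agg_matrix, horizon):
    """Pick the hyperparameter candidate whose aggregated bottom-level
    forecasts best match the teacher's higher-level proxy forecasts."""
    proxy = teacher_forecast(horizon)               # shape: (horizon, n_upper)
    best_params, best_err = None, np.inf
    for params in candidates:
        student = fit_student(params)               # fit on bottom-level series
        bottom = student.forecast(horizon)          # shape: (horizon, n_bottom)
        rolled_up = bottom @ agg_matrix.T           # aggregate to teacher level
        err = np.mean((rolled_up - proxy) ** 2)     # consistency with the proxy
        if err < best_err:
            best_params, best_err = params, err
    return best_params

Here agg_matrix is the usual hierarchy summation matrix mapping bottom-level series to their upper-level aggregates, so no labeled validation window from the test period is needed.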
Quantum Multiple Kernel Learning in Financial Classification Tasks
Financial services is a promising industry in which near-term quantum
utility could yield profitable applications; in particular, quantum machine
learning algorithms could benefit businesses by improving the quality of
predictive models. Quantum kernel methods have demonstrated success in
financial binary classification tasks, such as fraud detection, and avoid
issues found in variational quantum machine learning approaches. However,
choosing a suitable quantum kernel for a classical dataset remains a challenge.
We propose a hybrid, quantum multiple kernel learning (QMKL) methodology that
can improve classification quality over a single kernel approach. We test the
robustness of QMKL on several financially relevant datasets using both fidelity
and projected quantum kernel approaches. We further demonstrate QMKL on quantum
hardware using an error mitigation pipeline and show the benefits of QMKL in
the large qubit regime.
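A hedged sketch of the multiple-kernel step, assuming the quantum kernel Gram matrices (fidelity or projected) have already been estimated on a simulator or hardware; the convex weighting shown here is one simple instance of multiple kernel learning, not necessarily the paper's procedure.

import numpy as np
from sklearn.svm import SVC

def combine_kernels(kernel_mats, weights=None):
    """Convex combination K = sum_i mu_i K_i of precomputed quantum kernels."""
    k = len(kernel_mats)
    mu = np.full(k, 1.0 / k) if weights is None else np.asarray(weights, float)
    mu = mu / mu.sum()                       # keep the weights on the simplex
    return sum(m * K for m, K in zip(mu, kernel_mats))

# Usage with precomputed train/test Gram matrices (assumed available):
# K_train = combine_kernels([K1_train, K2_train])
# clf = SVC(kernel="precomputed").fit(K_train, y_train)
# K_test = combine_kernels([K1_test, K2_test])   # rows: test, cols: train
# y_pred = clf.predict(K_test)

In full MKL the weights mu would themselves be optimized (e.g., to maximize kernel alignment or the SVM margin) rather than fixed uniformly as in this sketch.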
The Importance of Thermal Emission Spectroscopy for Understanding Terrestrial Exoplanets
The primary objective of this white paper is to illustrate the importance of the thermal infrared in characterizing terrestrial planets, leveraging our experience in characterizing extra-solar jovian worlds.
A Realistic Roadmap to Formation Flying Space Interferometry
The ultimate astronomical observatory would be a formation flying space interferometer, combining sensitivity and stability with high angular resolution. The smallSat revolution offers a new and maturing prototyping platform for space interferometry, and we put forward a realistic plan for achieving first stellar fringes in space by 2030.