Search CORE

384 research outputs found

Sparser Johnson-Lindenstrauss Transforms

Author: Kane Daniel M.
Nelson Jelani
Publication venue
Publication date: 05/02/2014
Field of study

We give two different and simple constructions for dimensionality reduction in

\ell_2

via linear mappings that are sparse: only an

O(\varepsilon)

-fraction of entries in each column of our embedding matrices are non-zero to achieve distortion

1+\varepsilon

with high probability, while still achieving the asymptotically optimal number of rows. These are the first constructions to provide subconstant sparsity for all values of parameters, improving upon previous works of Achlioptas (JCSS 2003) and Dasgupta, Kumar, and Sarl\'{o}s (STOC 2010). Such distributions can be used to speed up applications where

\ell_2

dimensionality reduction is used.Comment: v6: journal version, minor changes, added Remark 23; v5: modified abstract, fixed typos, added open problem section; v4: simplified section 4 by giving 1 analysis that covers both constructions; v3: proof of Theorem 25 in v2 was written incorrectly, now fixed; v2: Added another construction achieving same upper bound, and added proof of near-tight lower bound for DKS schem

arXiv.org e-Print Archive

Sequential projection pursuit for optimal transformation of autoregressive coefficients for damage detection in an experimental wind turbine blade

Author: Hoell Simon
Omenzetter Piotr
Publication venue: 'Elsevier BV'
Publication date: 12/09/2017
Field of study

Peer reviewedPublisher PD

Isometric sketching of any set via the Restricted Isometry Property

Author: Oymak Samet
Recht Benjamin
Soltanolkotabi Mahdi
Publication venue
Publication date: 06/10/2015
Field of study

In this paper we show that for the purposes of dimensionality reduction certain class of structured random matrices behave similarly to random Gaussian matrices. This class includes several matrices for which matrix-vector multiply can be computed in log-linear time, providing efficient dimensionality reduction of general sets. In particular, we show that using such matrices any set from high dimensions can be embedded into lower dimensions with near optimal distortion. We obtain our results by connecting dimensionality reduction of any set to dimensionality reduction of sparse vectors via a chaining argument.Comment: 17 page

arXiv.org e-Print Archive

CiteSeerX

Building Gene Expression Profile Classifiers with a Simple and Efficient Rejection Option in R

Author: Benso Alfredo
Di Carlo Stefano
Politano Gianfranco Michele Maria
Savino Alessandro
Ur Rehman Hafeez
Publication venue: BioMed Central
Publication date: 01/01/2011
Field of study

Background: The collection of gene expression profiles from DNA microarrays and their analysis with pattern recognition algorithms is a powerful technology applied to several biological problems. Common pattern recognition systems classify samples assigning them to a set of known classes. However, in a clinical diagnostics setup, novel and unknown classes (new pathologies) may appear and one must be able to reject those samples that do not fit the trained model. The problem of implementing a rejection option in a multi-class classifier has not been widely addressed in the statistical literature. Gene expression profiles represent a critical case study since they suffer from the curse of dimensionality problem that negatively reflects on the reliability of both traditional rejection models and also more recent approaches such as one-class classifiers. Results: This paper presents a set of empirical decision rules that can be used to implement a rejection option in a set of multi-class classifiers widely used for the analysis of gene expression profiles. In particular, we focus on the classifiers implemented in the R Language and Environment for Statistical Computing (R for short in the remaining of this paper). The main contribution of the proposed rules is their simplicity, which enables an easy integration with available data analysis environments. Since in the definition of a rejection model tuning of the involved parameters is often a complex and delicate task, in this paper we exploit an evolutionary strategy to automate this process. This allows the final user to maximize the rejection accuracy with minimum manual intervention. Conclusions: This paper shows how the use of simple decision rules can be used to help the use of complex machine learning algorithms in real experimental setups. The proposed approach is almost completely automated and therefore a good candidate for being integrated in data analysis flows in labs where the machine learning expertise required to tune traditional classifiers might not be availabl

Springer - Publisher Connector

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Deterministic parallel algorithms for bilinear objective functions

Author: Harris David G.
Publication venue
Publication date: 21/06/2018
Field of study

Many randomized algorithms can be derandomized efficiently using either the method of conditional expectations or probability spaces with low independence. A series of papers, beginning with work by Luby (1988), showed that in many cases these techniques can be combined to give deterministic parallel (NC) algorithms for a variety of combinatorial optimization problems, with low time- and processor-complexity. We extend and generalize a technique of Luby for efficiently handling bilinear objective functions. One noteworthy application is an NC algorithm for maximal independent set. On a graph

G

with

m

edges and

n

vertices, this takes

\tilde O(\log^2 n)

time and

(m + n) n^{o(1)}

processors, nearly matching the best randomized parallel algorithms. Other applications include reduced processor counts for algorithms of Berger (1997) for maximum acyclic subgraph and Gale-Berlekamp switching games. This bilinear factorization also gives better algorithms for problems involving discrepancy. An important application of this is to automata-fooling probability spaces, which are the basis of a notable derandomization technique of Sivakumar (2002). Our method leads to large reduction in processor complexity for a number of derandomization algorithms based on automata-fooling, including set discrepancy and the Johnson-Lindenstrauss Lemma

arXiv.org e-Print Archive

Increasing pattern recognition accuracy for chemical sensing by evolutionary based drift compensation

Author: A. Scionti
A. Tonda
Aliwell
Artursson
Chen
Di Carlo
Duda
E. Sanchez
Falasconi
G. Squillero
Gobbi
Gutierrez-Osuna
Hansen
Hansen
Hansen
Haugen
Hines
Hui
Ionescu
Kuhn
Llobet
M. Falasconi
Marco
Natale
Nelli
Owens
Padilla
Pardo
Pearce
Polster
S. Di Carlo
Sharma
Sisk
Tibshirani
Tomic
Vlachos
Zuppa
Zuppa
Publication venue: Elsevier
Publication date: 01/01/2011
Field of study

Artificial olfaction systems, which mimic human olfaction by using arrays of gas chemical sensors combined with pattern recognition methods, represent a potentially low-cost tool in many areas of industry such as perfumery, food and drink production, clinical diagnosis, health and safety, environmental monitoring and process control. However, successful applications of these systems are still largely limited to specialized laboratories. Sensor drift, i.e., the lack of a sensor's stability over time, still limits real in dustrial setups. This paper presents and discusses an evolutionary based adaptive drift-correction method designed to work with state-of-the-art classification systems. The proposed approach exploits a cutting-edge evolutionary strategy to iteratively tweak the coefficients of a linear transformation which can transparently correct raw sensors' measures thus mitigating the negative effects of the drift. The method learns the optimal correction strategy without the use of models or other hypotheses on the behavior of the physical chemical sensors

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Hal-Diderot

PORTO Publications Open Repository TOrino

High-dimensional Black-box Optimization via Divide and Approximate Conquer

Author: Tang Ke
Yang Peng
Yao Xin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 20/03/2016
Field of study

Divide and Conquer (DC) is conceptually well suited to high-dimensional optimization by decomposing a problem into multiple small-scale sub-problems. However, appealing performance can be seldom observed when the sub-problems are interdependent. This paper suggests that the major difficulty of tackling interdependent sub-problems lies in the precise evaluation of a partial solution (to a sub-problem), which can be overwhelmingly costly and thus makes sub-problems non-trivial to conquer. Thus, we propose an approximation approach, named Divide and Approximate Conquer (DAC), which reduces the cost of partial solution evaluation from exponential time to polynomial time. Meanwhile, the convergence to the global optimum (of the original problem) is still guaranteed. The effectiveness of DAC is demonstrated empirically on two sets of non-separable high-dimensional problems.Comment: 7 pages, 2 figures, conferenc

arXiv.org e-Print Archive