23,242 research outputs found

Stacked Generalization Approach to Improve Prediction of Molecular Atomization Energies

Author: Wang Ruobing
Publication date: 24/05/2018

Machine learning holds the promise of learning the energy functional via examples, bypassing the need to solve complicated quantum-chemical equations and realizing efficient computing of molecular electronic properties. Comment: 15 pages, 4 Figur

arXiv.org e-Print Archive

Inductive machine learning of optimal modular structures: Estimating solutions using support vector machines

Author: Hanna S.
Publication date: 19/09/2007

Structural optimization is usually handled by iterative methods requiring repeated samples of a physics-based model, but this process can be computationally demanding. Given a set of previously optimized structures of the same topology, this paper uses inductive learning to replace this optimization process entirely by deriving a function that directly maps any given load to an optimal geometry. A support vector machine is trained to determine the optimal geometry of individual modules of a space frame structure given a specified load condition. Structures produced by learning are compared against those found by a standard gradient descent optimization, both as individual modules and then as a composite structure. The primary motivation for this is speed, and results show the process is highly efficient for cases in which similar optimizations must be performed repeatedly. The function learned by the algorithm can approximate the result of optimization very closely after sufficient training, and has also been found effective at generalizing the underlying optima to produce structures that perform better than those found by standard iterative methods

High-Speed Tracking with Kernelized Correlation Filters

Authors: Batista Jorge; Caseiro Rui; Henriques João F.; Martins Pedro
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 04/11/2014

The core component of most modern trackers is a discriminative classifier, tasked with distinguishing between the target and the surrounding environment. To cope with natural image changes, this classifier is typically trained with translated and scaled sample patches. Such sets of samples are riddled with redundancies -- any overlapping pixels are constrained to be the same. Based on this simple observation, we propose an analytic model for datasets of thousands of translated patches. By showing that the resulting data matrix is circulant, we can diagonalize it with the Discrete Fourier Transform, reducing both storage and computation by several orders of magnitude. Interestingly, for linear regression our formulation is equivalent to a correlation filter, used by some of the fastest competitive trackers. For kernel regression, however, we derive a new Kernelized Correlation Filter (KCF), that unlike other kernel algorithms has the exact same complexity as its linear counterpart. Building on it, we also propose a fast multi-channel extension of linear correlation filters, via a linear kernel, which we call Dual Correlation Filter (DCF). Both KCF and DCF outperform top-ranking trackers such as Struck or TLD on a 50-video benchmark, despite running at hundreds of frames per second, and being implemented in a few lines of code (Algorithm 1). To encourage further developments, our tracking framework was made open-source.
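The circulant-matrix observation in this abstract can be illustrated with a small sketch. It is not the authors' code: the function names (`dft`, `circulant_rows`, `ridge_fourier`, `normal_equation_residual`) are hypothetical, it covers only the one-dimensional linear (ridge regression) case, and it uses a naive DFT for self-containment where a real tracker would use an FFT. It assumes a data matrix whose row i is the base sample shifted cyclically by i positions; where the complex conjugate falls in the closed form depends on that shift convention and on the sign convention of the DFT.

```python
import cmath

def dft(v, inverse=False):
    """Naive O(n^2) discrete Fourier transform (an FFT would be used in practice)."""
    n = len(v)
    sign = 1 if inverse else -1
    out = [sum(v[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [z / n for z in out] if inverse else out

def circulant_rows(x):
    """Data matrix whose row i is x cyclically shifted right by i positions."""
    n = len(x)
    return [[x[(j - i) % n] for j in range(n)] for i in range(n)]

def ridge_fourier(x, y, lam):
    """Solve (X^T X + lam*I) w = X^T y for X = circulant_rows(x),
    using only element-wise operations in the Fourier domain."""
    xf, yf = dft(x), dft(y)
    wf = [a * b / (abs(a) ** 2 + lam) for a, b in zip(xf, yf)]
    return [z.real for z in dft(wf, inverse=True)]

def normal_equation_residual(x, y, w, lam):
    """Largest absolute entry of X^T (X w - y) + lam*w, computed with the dense matrix."""
    X = circulant_rows(x)
    n = len(x)
    Xw = [sum(X[i][j] * w[j] for j in range(n)) for i in range(n)]
    return max(abs(sum(X[i][j] * (Xw[i] - y[i]) for i in range(n)) + lam * w[j])
               for j in range(n))

x = [1.0, 2.0, 0.5, -1.0, 3.0]   # base sample (illustrative values)
y = [0.0, 1.0, 0.0, -1.0, 2.0]   # regression targets, one per shift
w = ridge_fourier(x, y, lam=0.1)
print(normal_equation_residual(x, y, w, lam=0.1))  # close to zero
```

The dense matrix is built only to check the result; the Fourier-domain solve never forms it, which is where the storage and computation savings described in the abstract come from.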

arXiv.org e-Print Archive

Recent Advance in Content-based Image Retrieval: A Literature Survey

Authors: Li Houqiang; Tian Qi; Zhou Wengang
Publication date: 02/09/2017

The explosive increase and ubiquitous accessibility of visual data on the Web have led to a surge of research activity in image search and retrieval. Because they ignore visual content as a ranking cue, methods that apply text search techniques to visual retrieval may suffer from inconsistency between the text words and the visual content. Content-based image retrieval (CBIR), which uses representations of visual content to identify relevant images, has attracted sustained attention over the past two decades. The problem is challenging because of the intention gap and the semantic gap. Numerous techniques have been developed for content-based image retrieval in the last decade. The purpose of this paper is to categorize and evaluate the algorithms proposed from 2003 to 2016. We conclude with several promising directions for future research. Comment: 22 page

arXiv.org e-Print Archive

Deep Learning and its Application to LHC Physics

Authors: Cranmer Kyle; Guest Dan; Whiteson Daniel
Publication venue: 'Annual Reviews'
Publication date: 29/06/2018

Machine learning has played an important role in the analysis of high-energy physics data for decades. The emergence of deep learning in 2012 allowed for machine learning tools which could adeptly handle higher-dimensional and more complex problems than previously feasible. This review is aimed at the reader who is familiar with high energy physics but not machine learning. The connections between machine learning and high energy physics data analysis are explored, followed by an introduction to the core concepts of neural networks, examples of the key results demonstrating the power of deep learning for analysis of LHC data, and discussion of future prospects and concerns. Comment: Posted with permission from the Annual Review of Nuclear and Particle Science, Volume 68. (c) 2018 by Annual Reviews, http://www.annualreviews.or

arXiv.org e-Print Archive

3D Shape Estimation from 2D Landmarks: A Convex Relaxation Approach

Authors: Daniilidis Kostas; Hu Xiaoyan; Leonardos Spyridon; Zhou Xiaowei
Publication date: 01/06/2015

We investigate the problem of estimating the 3D shape of an object, given a set of 2D landmarks in a single image. To alleviate the reconstruction ambiguity, a widely used approach is to confine the unknown 3D shape within a shape space built upon existing shapes. While this approach has proven successful in various applications, a challenging issue remains: the joint estimation of shape parameters and camera-pose parameters requires solving a nonconvex optimization problem. Existing methods often adopt an alternating minimization scheme to locally update the parameters, and consequently the solution is sensitive to initialization. In this paper, we propose a convex formulation to address this problem and develop an efficient algorithm to solve the proposed convex program. We demonstrate the exact recovery property of the proposed method, its merits compared to alternative methods, and its applicability to human pose and car shape estimation. Comment: In Proceedings of CVPR 201

arXiv.org e-Print Archive

A Triangle Algorithm for Semidefinite Version of Convex Hull Membership Problem

Author: Kalantari Bahman
Publication date: 19/05/2019

Given a subset $\mathbf{S}=\{A_1, \dots, A_m\}$ of $\mathbb{S}^n$, the set of $n \times n$ real symmetric matrices, we define its spectrahull as the set $SH(\mathbf{S}) = \{p(X) \equiv (Tr(A_1 X), \dots, Tr(A_m X))^T : X \in \mathbf{\Delta}_n\}$, where $\mathbf{\Delta}_n$ is the spectraplex, $\{ X \in \mathbb{S}^n : Tr(X)=1, X \succeq 0 \}$. We let spectrahull membership (SHM) to be the problem of testing if a given $b \in \mathbb{R}^m$ lies in $SH(\mathbf{S})$. On the one hand when $A_i$'s are diagonal matrices, SHM reduces to the convex hull membership (CHM), a fundamental problem in LP. On the other hand, a bounded SDP feasibility is reducible to SHM. By building on the Triangle Algorithm (TA) \cite{kalchar,kalsep}, developed for CHM and its generalization, we design a TA for SHM, where given $\varepsilon$, in $O(1/\varepsilon^2)$ iterations it either computes a hyperplane separating $b$ from $SH(\mathbf{S})$, or $X_\varepsilon \in \mathbf{\Delta}_n$ such that $\Vert p(X_\varepsilon) - b \Vert \leq \varepsilon R$, where $R$ is the maximum error over $\mathbf{\Delta}_n$. Under certain conditions iteration complexity improves to $O(1/\varepsilon)$ or even $O(\ln 1/\varepsilon)$. The worst-case complexity of each iteration is $O(mn^2)$, plus testing the existence of a pivot, shown to be equivalent to estimating the least eigenvalue of a symmetric matrix. This together with a semidefinite version of Carathéodory theorem allow implementing TA as if solving a CHM, resorting to the power method only as needed, thereby improving the complexity of iterations. The proposed Triangle Algorithm for SHM is simple, practical and applicable to general SDP feasibility and optimization. Also, it extends to a spectral analogue of SVM for separation of two spectrahulls. Comment: 18 page
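For orientation, the ordinary convex hull membership (CHM) problem that SHM generalizes can be sketched as follows. This is only an illustration of the pivot idea for a finite point set, not the paper's SHM algorithm. The function name `triangle_algorithm` and its return labels are hypothetical, and the pivot here is chosen greedily (the input point maximizing the inner product with $b - x$), which is one possible pivot rule assumed for this sketch.

```python
import math

def triangle_algorithm(points, b, eps=1e-3, max_iter=100000):
    """Test whether b lies (approximately) in the convex hull of `points`.

    Returns ("inside", x) when an iterate x in the hull satisfies
    ||x - b|| <= eps * R, with R the largest distance from b to an input point,
    or ("outside", x) when no pivot exists at x, so x is a witness that b is
    not in the hull.
    """
    R = max(math.dist(b, v) for v in points)
    x = list(min(points, key=lambda v: math.dist(b, v)))  # start at the nearest input point
    for _ in range(max_iter):
        if math.dist(x, b) <= eps * R:
            return "inside", x
        # Candidate pivot: the input point furthest along the direction b - x.
        g = [bi - xi for bi, xi in zip(b, x)]
        v = max(points, key=lambda p: sum(gi * pi for gi, pi in zip(g, p)))
        # v is a pivot only if it is at least as far from x as it is from b.
        if math.dist(x, v) < math.dist(b, v):
            return "outside", x
        # Move x to the point of the segment [x, v] nearest to b.
        d = [vi - xi for vi, xi in zip(v, x)]
        t = sum(gi * di for gi, di in zip(g, d)) / sum(di * di for di in d)
        t = max(0.0, min(1.0, t))
        x = [xi + t * di for xi, di in zip(x, d)]
    return "undecided", x

square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(triangle_algorithm(square, (0.5, 0.5))[0])  # inside
print(triangle_algorithm(square, (2.0, 2.0))[0])  # outside
```

In the spectrahull setting described in the abstract, the search over a finite point set is replaced by a least-eigenvalue estimate; that step is not shown here.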

arXiv.org e-Print Archive

Identification of functionally related enzymes by learning-to-rank methods

Authors: Airola Antti; De Baets Bernard; Fober Thomas; Glinca Serghei; Hüllermeier Eyke; Klebe Gerhard; Pahikkala Tapio; Stock Michiel; Waegeman Willem
Publication date: 01/01/2014

Enzyme sequences and structures are routinely used in the biological sciences as queries to search for functionally related enzymes in online databases. To this end, one usually departs from some notion of similarity, comparing two enzymes by looking for correspondences in their sequences, structures or surfaces. For a given query, the search operation results in a ranking of the enzymes in the database, from very similar to dissimilar enzymes, while information about the biological function of annotated database enzymes is ignored. In this work we show that rankings of that kind can be substantially improved by applying kernel-based learning algorithms. This approach enables the detection of statistical dependencies between similarities of the active cleft and the biological function of annotated enzymes. This is in contrast to search-based approaches, which do not take annotated training data into account. Similarity measures based on the active cleft are known to outperform sequence-based or structure-based measures under certain conditions. We consider the Enzyme Commission (EC) classification hierarchy for obtaining annotated enzymes during the training phase. The results of a set of sizeable experiments indicate a consistent and significant improvement for a set of similarity measures that exploit information about small cavities in the surface of enzymes

arXiv.org e-Print Archive

Persistent-Homology-based Machine Learning and its Applications -- A Survey

Authors: Lee Si Xian; Pun Chi Seng; Xia Kelin
Publication date: 01/11/2018

A suitable feature representation that can both preserve the intrinsic information of the data and reduce data complexity and dimensionality is key to the performance of machine learning models. Deeply rooted in algebraic topology, persistent homology (PH) provides a delicate balance between data simplification and intrinsic structure characterization, and has been applied successfully in various areas. However, the combination of PH and machine learning has been hindered greatly by three challenges, namely topological representation of data, PH-based distance measurements or metrics, and PH-based feature representation. With the development of topological data analysis, progress has been made on all three problems, but it is widely scattered across the literature. In this paper, we provide a systematic review of PH and PH-based supervised and unsupervised models from a computational perspective. Our emphasis is on the recent development of mathematical models and tools, including PH software and PH-based functions, feature representations, kernels, and similarity models. Essentially, this paper can work as a roadmap for the practical application of PH-based machine learning tools. Further, we consider different topological feature representations in different machine learning models, and investigate their impact on protein secondary structure classification. Comment: 42 pages; 6 figures; 9 table

arXiv.org e-Print Archive

A Distributed Approach towards Discriminative Distance Metric Learning

Authors: Li Jun; Lin Xun; Rui Xiaoguang; Rui Yong; Tao Dacheng
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/05/2019

Distance metric learning is successful in discovering intrinsic relations in data. However, most algorithms are computationally demanding when the problem size becomes large. In this paper, we propose a discriminative metric learning algorithm, and develop a distributed scheme learning metrics on moderate-sized subsets of data, and aggregating the results into a global solution. The technique leverages the power of parallel computation. The algorithm of the aggregated distance metric learning (ADML) scales well with the data size and can be controlled by the partition. We theoretically analyse and provide bounds for the error induced by the distributed treatment. We have conducted experimental evaluation of ADML, both on specially designed tests and on practical image annotation tasks. Those tests have shown that ADML achieves the state-of-the-art performance at only a fraction of the cost incurred by most existing methods

arXiv.org e-Print Archive