Improved sign-based learning algorithm derived by the composite nonlinear Jacobi process
In this paper, a globally convergent first-order training algorithm is proposed that uses sign-based information of the batch error measure in the framework of the nonlinear Jacobi process. This approach allows us to equip the recently proposed Jacobi-Rprop method with the global convergence property, i.e. convergence to a local minimizer from any initial starting point. We also propose a strategy that ensures the search direction of the globally convergent Jacobi-Rprop is a descent one. The behaviour of the algorithm is empirically investigated in eight benchmark problems. Simulation results verify that there are indeed improvements in the convergence success of the algorithm.
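To make "sign-based information" concrete, here is a minimal sketch of a generic Rprop-style update. It is an illustration only: it is not the Jacobi-Rprop method or its globally convergent variant from the abstract, and the function name `rprop_minimize`, the parameter defaults, and the quadratic test function are assumptions chosen for the example. Each weight keeps its own step size, which grows while the partial derivative keeps its sign and shrinks when the sign flips; only the sign of the gradient, not its magnitude, determines the move.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def rprop_minimize(grad, w, steps=100, step0=0.1,
                   eta_plus=1.2, eta_minus=0.5,
                   step_min=1e-6, step_max=50.0):
    """Generic sign-based (Rprop-style) minimization sketch.

    grad: hypothetical callable returning the list of partial derivatives at w.
    """
    w = list(w)
    step = [step0] * len(w)      # one adaptive step size per weight
    prev_g = [0.0] * len(w)      # previous partial derivatives
    for _ in range(steps):
        g = list(grad(w))
        for i in range(len(w)):
            s = g[i] * prev_g[i]
            if s > 0:
                # Same sign as last time: accelerate.
                step[i] = min(step[i] * eta_plus, step_max)
            elif s < 0:
                # Sign flipped: a minimum was overshot, so shrink the step
                # and skip this weight's update for one iteration.
                step[i] = max(step[i] * eta_minus, step_min)
                g[i] = 0.0
            w[i] -= sign(g[i]) * step[i]
            prev_g[i] = g[i]
    return w


# Usage on an assumed toy objective f(w) = (w0 - 3)^2 + 10 * (w1 + 1)^2.
toy_grad = lambda w: [2 * (w[0] - 3), 20 * (w[1] + 1)]
print(rprop_minimize(toy_grad, [0.0, 0.0], steps=200))
```

Because the update ignores gradient magnitude, the badly scaled second coordinate converges about as easily as the first, which is the usual motivation for sign-based schemes.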
Supervised learning with quantum enhanced feature spaces
Machine learning and quantum computing are two technologies each with the
potential for altering how computation is performed to address previously
untenable problems. Kernel methods for machine learning are ubiquitous for
pattern recognition, with support vector machines (SVMs) being the most
well-known method for classification problems. However, there are limitations
to the successful solution to such problems when the feature space becomes
large, and the kernel functions become computationally expensive to estimate. A
core element to computational speed-ups afforded by quantum algorithms is the
exploitation of an exponentially large quantum state space through controllable
entanglement and interference. Here, we propose and experimentally implement
two novel methods on a superconducting processor. Both methods represent the
feature space of a classification problem by a quantum state, taking advantage
of the large dimensionality of quantum Hilbert space to obtain an enhanced
solution. One method, the quantum variational classifier, builds on [1,2] and
operates through using a variational quantum circuit to classify a training set
in direct analogy to conventional SVMs. In the second, a quantum kernel
estimator, we estimate the kernel function and optimize the classifier
directly. The two methods present a new class of tools for exploring the
applications of noisy intermediate scale quantum computers [3] to machine
learning.
Comment: Fixed typos, added figures and discussion about quantum error mitigation
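The kernel-estimator idea (first obtain the kernel values, then optimize a classifier on them classically) can be sketched with entirely classical stand-ins. In this sketch the function `rbf_kernel` is an assumed placeholder for kernel entries that the abstract says are estimated on a quantum processor, and a kernel perceptron is swapped in for the SVM to keep the example short; the XOR-style data set is invented for illustration.

```python
import math


def rbf_kernel(a, b, gamma=2.0):
    # Classical placeholder: in the paper's setting each kernel entry
    # would instead be estimated on quantum hardware.
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def gram_matrix(X, kernel):
    """Precompute the kernel value for every pair of training points."""
    return [[kernel(a, b) for b in X] for a in X]


def train_kernel_perceptron(K, y, epochs=20):
    """Kernel perceptron on a precomputed Gram matrix (stand-in for an SVM)."""
    alpha = [0] * len(y)
    for _ in range(epochs):
        mistakes = 0
        for i in range(len(y)):
            f = sum(alpha[j] * y[j] * K[j][i] for j in range(len(y)))
            if y[i] * f <= 0:      # misclassified (or undecided): reinforce
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:          # training set is separated
            break
    return alpha


def predict(alpha, X, y, kernel, x):
    """Classify x using kernel evaluations against the training points."""
    f = sum(a * yj * kernel(xj, x) for a, yj, xj in zip(alpha, y, X))
    return 1 if f > 0 else -1


# Toy XOR-like data, not linearly separable in the input space.
X = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
y = [-1, -1, 1, 1]
K = gram_matrix(X, rbf_kernel)
alpha = train_kernel_perceptron(K, y)
print([predict(alpha, X, y, rbf_kernel, x) for x in X])
```

The point of the design is that the learner only ever touches the data through kernel values, so the source of those values (classical formula or hardware estimate) can be changed without changing the training loop.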
Automatic categorization of diverse experimental information in the bioscience literature
Background:
Curation of information from bioscience literature into biological knowledge databases is a crucial way of capturing experimental information in a computable form. During the biocuration process, a critical first step is to identify, from all published literature, the papers that contain results for a specific data type the curator is interested in annotating. This step normally requires curators to manually examine many papers to ascertain which few contain information of interest, and is thus usually time-consuming. We developed an automatic method, based on the Support Vector Machine (SVM) machine learning method, for identifying papers containing these curation data types among a large pool of published scientific papers. This classification system is completely automatic and can be readily applied to diverse experimental data types. It has been in production use for automatic categorization of 10 different experimental data types in the biocuration process at WormBase for the past two years, and it is in the process of being adopted in the biocuration process at FlyBase and the Saccharomyces Genome Database (SGD). We anticipate that this method can be readily adopted by various databases in the biocuration community and thereby greatly reduce the time spent on an otherwise laborious and demanding task. We also developed a simple, readily automated procedure that uses training papers of similar data types from different bodies of literature, such as C. elegans and D. melanogaster, to identify papers with any of these data types for a single database. This approach is significant because for some data types, especially those of low occurrence, a single corpus often does not have enough training papers to achieve satisfactory performance.
Results:
We successfully tested the method on ten data types from WormBase, fifteen data types from FlyBase and three data types from Mouse Genome Informatics (MGI). It is being used in the curation workflow at WormBase for automatic association of newly published papers with ten data types including RNAi, antibody, phenotype, gene regulation, mutant allele sequence, gene expression, gene product interaction, overexpression phenotype, gene interaction, and gene structure correction.
Conclusions:
Our methods are applicable to a variety of data types with training sets containing several hundred to a few thousand documents. They are completely automatic and thus can be readily incorporated into different workflows at different literature-based databases. We believe that the work presented here can contribute greatly to the tremendous task of automating the important yet labor-intensive biocuration effort.
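As a rough illustration of SVM-based paper categorization, the sketch below trains a linear SVM (hinge loss with L2 regularization, optimized by plain subgradient descent) on bag-of-words features. Everything specific here is an assumption for the example: the abstract does not describe the feature representation or the SVM implementation used, the four "papers" are invented one-line stand-ins, the bias term is omitted for brevity, and the names `featurize`, `train_linear_svm`, and `score` are hypothetical.

```python
from collections import Counter


def featurize(text):
    """Bag-of-words counts (an assumed representation, not from the paper)."""
    return Counter(text.lower().split())


def train_linear_svm(docs, labels, epochs=50, lr=0.1, lam=0.01):
    """Linear SVM without bias, trained by subgradient descent on hinge loss."""
    w = {}
    data = [(featurize(d), y) for d, y in zip(docs, labels)]
    for _ in range(epochs):
        for x, y in data:
            margin = y * sum(w.get(t, 0.0) * c for t, c in x.items())
            # L2 regularization: shrink every weight slightly.
            for t in w:
                w[t] *= 1.0 - lr * lam
            # Hinge loss is active only when the margin is below 1.
            if margin < 1:
                for t, c in x.items():
                    w[t] = w.get(t, 0.0) + lr * y * c
    return w


def score(w, text):
    """Positive score: predicted to contain the data type; negative: not."""
    return sum(w.get(t, 0.0) * c for t, c in featurize(text).items())


# Invented toy corpus: +1 = paper contains RNAi data, -1 = it does not.
docs = [
    "rnai knockdown produced a visible phenotype",
    "genome wide rnai screen for phenotype defects",
    "antibody staining revealed protein localization",
    "polyclonal antibody raised against the protein",
]
labels = [1, 1, -1, -1]

w = train_linear_svm(docs, labels)
print(score(w, "rnai phenotype"), score(w, "antibody protein"))
```

In a setting like the one described, one such binary classifier would presumably be trained per data type, so that a new paper can be associated with several data types at once.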
Effective Classification using a small Training Set based on Discretization and Statistical Analysis
This work deals with the problem of producing a fast and accurate data classification, learning it from a possibly small set of records that are already classified. The proposed approach is based on the framework of the so-called Logical Analysis of Data (LAD), but enriched with information obtained from statistical considerations on the data. A number of discrete optimization problems are solved in the different steps of the procedure, but their computational demand can be controlled. The accuracy of the proposed approach is compared to that of the standard LAD algorithm, of Support Vector Machines, and of the Label Propagation algorithm on publicly available datasets of the UCI repository. Encouraging results are obtained and discussed.
Incremental learning of independent, overlapping, and graded concept descriptions with an instance-based process framework
Supervised learning algorithms make several simplifying assumptions concerning the characteristics of the concept descriptions to be learned. For example, concepts are often assumed to (1) be defined with respect to the same set of relevant attributes, (2) be disjoint in instance space, and (3) have uniform instance distributions. While these assumptions constrain the learning task, they unfortunately limit an algorithm's applicability. We believe that supervised learning algorithms should learn attribute relevancies independently for each concept, allow instances to be members of any subset of concepts, and represent graded concept descriptions. This paper introduces a process framework for instance-based learning algorithms that exploit only specific instance and performance feedback information to guide their concept learning processes. We also introduce Bloom, a specific instantiation of this framework. Bloom is a supervised, incremental, instance-based learning algorithm that learns relative attribute relevancies independently for each concept, allows instances to be members of any subset of concepts, and represents graded concept memberships. We describe empirical evidence to support our claims that Bloom can learn independent, overlapping, and graded concept descriptions.
Evolving stochastic learning algorithm based on Tsallis entropic index
In this paper, inspired by our previous algorithm, which was based on the theory of Tsallis statistical mechanics, we develop a new evolving stochastic learning algorithm for neural networks. The new algorithm combines deterministic and stochastic search steps by employing a different adaptive stepsize for each network weight, and applies a form of noise that is characterized by the nonextensive entropic index q, regulated by a weight decay term. The behavior of the learning algorithm can be made more stochastic or deterministic depending on the trade-off between the temperature T and the q values. This is achieved by introducing a formula that defines a time-dependent relationship between these two important learning parameters. Our experimental study verifies that there are indeed improvements in the convergence speed of this new evolving stochastic learning algorithm, which makes learning faster than using the original Hybrid Learning Scheme (HLS). In addition, experiments are conducted to explore the influence of the entropic index q and temperature T on the convergence speed and stability of the proposed method.