96,742 research outputs found
Decision table for classifying point sources based on FIRST and 2MASS databases
With the availability of multiwavelength, multiscale and multiepoch
astronomical catalogues, the number of features to describe astronomical
objects has increases. The better features we select to classify objects, the
higher the classification accuracy is. In this paper, we have used data sets of
stars and quasars from near infrared band and radio band. Then best-first
search method was applied to select features. For the data with selected
features, the algorithm of decision table was implemented. The classification
accuracy is more than 95.9%. As a result, the feature selection method improves
the effectiveness and efficiency of the classification method. Moreover the
result shows that decision table is robust and effective for discrimination of
celestial objects and used for preselecting quasar candidates for large survey
projects.Comment: 10 pages. accepted by Advances in Space Researc
Fast Selection of Spectral Variables with B-Spline Compression
The large number of spectral variables in most data sets encountered in
spectral chemometrics often renders the prediction of a dependent variable
uneasy. The number of variables hopefully can be reduced, by using either
projection techniques or selection methods; the latter allow for the
interpretation of the selected variables. Since the optimal approach of testing
all possible subsets of variables with the prediction model is intractable, an
incremental selection approach using a nonparametric statistics is a good
option, as it avoids the computationally intensive use of the model itself. It
has two drawbacks however: the number of groups of variables to test is still
huge, and colinearities can make the results unstable. To overcome these
limitations, this paper presents a method to select groups of spectral
variables. It consists in a forward-backward procedure applied to the
coefficients of a B-Spline representation of the spectra. The criterion used in
the forward-backward procedure is the mutual information, allowing to find
nonlinear dependencies between variables, on the contrary of the generally used
correlation. The spline representation is used to get interpretability of the
results, as groups of consecutive spectral variables will be selected. The
experiments conducted on NIR spectra from fescue grass and diesel fuels show
that the method provides clearly identified groups of selected variables,
making interpretation easy, while keeping a low computational load. The
prediction performances obtained using the selected coefficients are higher
than those obtained by the same method applied directly to the original
variables and similar to those obtained using traditional models, although
using significantly less spectral variables
Quadratic Projection Based Feature Extraction with Its Application to Biometric Recognition
This paper presents a novel quadratic projection based feature extraction
framework, where a set of quadratic matrices is learned to distinguish each
class from all other classes. We formulate quadratic matrix learning (QML) as a
standard semidefinite programming (SDP) problem. However, the con- ventional
interior-point SDP solvers do not scale well to the problem of QML for
high-dimensional data. To solve the scalability of QML, we develop an efficient
algorithm, termed DualQML, based on the Lagrange duality theory, to extract
nonlinear features. To evaluate the feasibility and effectiveness of the
proposed framework, we conduct extensive experiments on biometric recognition.
Experimental results on three representative biometric recogni- tion tasks,
including face, palmprint, and ear recognition, demonstrate the superiority of
the DualQML-based feature extraction algorithm compared to the current
state-of-the-art algorithm
A generic optimising feature extraction method using multiobjective genetic programming
In this paper, we present a generic, optimising feature extraction method using multiobjective genetic programming. We re-examine the feature extraction problem and show that effective feature extraction can significantly enhance the performance of pattern recognition systems with simple classifiers. A framework is presented to evolve optimised feature extractors that transform an input pattern space into a decision space in which maximal class separability is obtained. We have applied this method to real world datasets from the UCI Machine Learning and StatLog databases to verify our approach and compare our proposed method with other reported results. We conclude that our algorithm is able to produce classifiers of superior (or equivalent) performance to the conventional classifiers examined, suggesting removal of the need to exhaustively evaluate a large family of conventional classifiers on any new problem. (C) 2010 Elsevier B.V. All rights reserved
Repository-based plasmid design
There was an explosion in the amount of commercially available DNA in sequence repositories over the last decade. The number of such plasmids increased from 12,000 to over 300,000 among three of the largest repositories: iGEM, Addgene, and DNASU. A challenge in biodesign remains how to use these and other repository-based sequences effectively, correctly, and seamlessly. This work describes an approach to plasmid design where a plasmid is specified as simply a DNA sequence or list of features. The proposed software then finds the most cost-effective combination of synthetic and PCR-prepared repository fragments to build the plasmid via Gibson assemblyÂź. It finds existing DNA sequences in both user-specified and public DNA databases: iGEM, Addgene, and DNASU. Such a software application is introduced and characterized against all post-2005 iGEM composite parts and all Addgene vectors submitted in 2018 and found to reduce costs by 34% versus a purely synthetic plasmid design approach. The described software will improve current plasmid assembly workflows by shortening design times, improving build quality, and reducing costs.Accepted manuscrip
Interactive retrieval of video using pre-computed shot-shot similarities
A probabilistic framework for content-based interactive video retrieval is described. The developed indexing of video fragments originates from the probability of the user's positive judgment about key-frames of video shots. Initial estimates of the probabilities are obtained from low-level feature representation. Only statistically significant estimates are picked out, the rest are replaced by an appropriate constant allowing efficient access at search time without loss of search quality and leading to improvement in most experiments. With time, these probability estimates are updated from the relevance judgment of users performing searches, resulting in further substantial increases in mean average precision
- âŠ