SkILL - a Stochastic Inductive Logic Learner
Probabilistic Inductive Logic Programming (PILP) is a relatively unexplored
area of Statistical Relational Learning which extends classic Inductive Logic
Programming (ILP). This work introduces SkILL, a Stochastic Inductive Logic
Learner, which takes probabilistic annotated data and produces First Order
Logic theories. Data in several domains such as medicine and bioinformatics
have an inherent degree of uncertainty that can be used to produce models
closer to reality. SkILL can not only use this type of probabilistic data to
extract non-trivial knowledge from databases, but it also addresses
efficiency issues by introducing a novel, efficient and effective search
strategy to guide the search in PILP environments. The capabilities of SkILL
are demonstrated in three different datasets: (i) a synthetic toy example
used to validate the system, (ii) a probabilistic adaptation of a well-known
biological metabolism application, and (iii) a real world medical dataset in
the breast cancer domain. Results show that SkILL can perform as well as a
deterministic ILP learner, while also being able to incorporate probabilistic
knowledge that would otherwise not be considered.
Random Relational Rules
Exhaustive search in relational learning is generally infeasible; therefore some form of heuristic search is usually employed, such as in FOIL [1]. On the other hand, so-called stochastic discrimination provides a framework for combining arbitrary numbers of weak classifiers (in this case randomly generated relational rules) in such a way that accuracy improves with additional rules, even after maximal accuracy on the training data has been reached [2]. The weak classifiers must have a slightly higher probability of covering instances of their target class than of other classes. As the rules are also independent and identically distributed, the central limit theorem applies, and as the number of weak classifiers/rules grows, the coverages for different classes resemble well-separated normal distributions. Stochastic discrimination is closely related to other ensemble methods like Bagging, Boosting, or Random forests, all of which have been tried in relational learning [3, 4, 5].
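The combination step described in this abstract can be illustrated with a small sketch. It is not the authors' system: the toy data (points in the unit square), the use of random rectangles as stand-ins for random relational rules, the `margin` parameter, and all function names are assumptions made for illustration. Each retained rule covers the positive class slightly more often than the negative class, and the per-rule votes are normalized by the coverage rates before averaging, in the style usually given for stochastic discrimination.

```python
import random

def make_data(n, rng):
    """Toy instances (assumption): points in the unit square, positive when x + y > 1."""
    points = [(rng.random(), rng.random()) for _ in range(n)]
    labels = [x + y > 1.0 for x, y in points]
    return points, labels

def random_rule(rng):
    """A random axis-aligned rectangle, standing in for a random relational rule."""
    x0, x1 = sorted((rng.random(), rng.random()))
    y0, y1 = sorted((rng.random(), rng.random()))
    return (x0, x1, y0, y1)

def covers(rule, point):
    x0, x1, y0, y1 = rule
    x, y = point
    return x0 <= x <= x1 and y0 <= y <= y1

def coverage_rate(rule, points):
    """Fraction of the given points covered by the rule."""
    return sum(covers(rule, p) for p in points) / len(points)

def sample_weak_rules(points, labels, n_rules, rng, margin=0.05, max_tries=100_000):
    """Keep random rules that cover positives at least `margin` more often than negatives."""
    pos = [p for p, l in zip(points, labels) if l]
    neg = [p for p, l in zip(points, labels) if not l]
    kept = []
    for _ in range(max_tries):
        if len(kept) == n_rules:
            break
        rule = random_rule(rng)
        r_pos, r_neg = coverage_rate(rule, pos), coverage_rate(rule, neg)
        if r_pos - r_neg >= margin:
            kept.append((rule, r_pos, r_neg))
    return kept

def score(kept, point):
    """Average of normalized votes; near 1 suggests positive, near 0 negative."""
    if not kept:
        raise ValueError("no weak rules were retained")
    total = 0.0
    for rule, r_pos, r_neg in kept:
        total += (covers(rule, point) - r_neg) / (r_pos - r_neg)
    return total / len(kept)

def accuracy(kept, points, labels):
    """Training accuracy when thresholding the score at 0.5."""
    hits = sum((score(kept, p) > 0.5) == l for p, l in zip(points, labels))
    return hits / len(points)

if __name__ == "__main__":
    rng = random.Random(0)
    points, labels = make_data(300, rng)
    for n_rules in (5, 50, 500):
        kept = sample_weak_rules(points, labels, n_rules, rng)
        print(n_rules, round(accuracy(kept, points, labels), 3))
```

By construction, the normalized score averages to 1 over the positive training instances and to 0 over the negative ones; adding rules narrows the spread of individual scores around those two means, which is the effect the central limit argument describes.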
Improving music genre classification using automatically induced harmony rules
We present a new genre classification framework using both low-level signal-based features and high-level harmony features. A state-of-the-art statistical genre classifier based on timbral features is extended using a first-order random forest containing, for each genre, rules derived from harmony or chord sequences. This random forest has been automatically induced, using the first-order logic induction algorithm TILDE, from a dataset covering classical, jazz and pop genre classes, in which the degree and chord category are identified for each chord. The audio descriptor-based genre classifier contains 206 features, covering spectral, temporal, energy, and pitch characteristics of the audio signal. The fusion of the harmony-based classifier with the extracted feature vectors is tested on three-genre subsets of the GTZAN and ISMIR04 datasets, which contain 300 and 448 recordings, respectively. Machine learning classifiers were tested using 5 × 5-fold cross-validation and feature selection. Results indicate that the proposed harmony-based rules combined with the timbral descriptor-based genre classification system lead to improved genre classification rates.
Machine Learning with Abstention for Automated Liver Disease Diagnosis
This paper presents a novel approach for detection of liver abnormalities in
an automated manner using ultrasound images. For this purpose, we have
implemented a machine learning model that can not only generate labels (normal
and abnormal) for a given ultrasound image but it can also detect when its
prediction is likely to be incorrect. The proposed model abstains from
generating the label of a test example if it is not confident about its
prediction. Such behavior is commonly practiced by medical doctors who, when
given insufficient information or a difficult case, can choose to carry out
further clinical or diagnostic tests before generating a diagnosis. However,
existing machine learning models are designed in a way to always generate a
label for a given example even when the confidence of their prediction is low.
We have proposed a novel stochastic gradient based solver for the learning with
abstention paradigm and use it to make a practical, state of the art method for
liver disease classification. The proposed method has been benchmarked on a
data set of approximately 100 patients from MINAR, Multan, Pakistan and our
results show that the proposed scheme offers state of the art classification
performance.
Comment: Preprint version before submission for publication. Complete version
published in Proc. 15th International Conference on Frontiers of Information
Technology (FIT 2017), December 18-20, 2017, Islamabad, Pakistan.
http://ieeexplore.ieee.org/document/8261064
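The abstention behavior described in this abstract can be illustrated with a minimal sketch. This is not the paper's stochastic gradient based solver: it is a plain confidence-threshold reject rule applied to an already computed probability, and the function names, the `threshold` parameter, and the example values are all assumptions.

```python
from typing import List, Optional, Tuple

def predict_or_abstain(p_abnormal: float, threshold: float = 0.8) -> Optional[str]:
    """Return a label, or None (abstain) when the confidence is below the threshold."""
    if not 0.0 <= p_abnormal <= 1.0:
        raise ValueError("p_abnormal must be a probability in [0, 1]")
    confidence = max(p_abnormal, 1.0 - p_abnormal)
    if confidence < threshold:
        return None  # defer to further clinical or diagnostic tests
    return "abnormal" if p_abnormal >= 0.5 else "normal"

def selective_accuracy(
    probs: List[float], labels: List[str], threshold: float
) -> Tuple[float, float]:
    """Return (coverage, accuracy on the answered cases) for a given threshold."""
    answered = correct = 0
    for p, label in zip(probs, labels):
        decision = predict_or_abstain(p, threshold)
        if decision is None:
            continue
        answered += 1
        correct += decision == label
    coverage = answered / len(probs)
    accuracy = correct / answered if answered else 0.0
    return coverage, accuracy

if __name__ == "__main__":
    # Hypothetical model outputs and ground-truth labels, for illustration only.
    probs = [0.95, 0.9, 0.6, 0.4, 0.1]
    labels = ["abnormal", "abnormal", "normal", "abnormal", "normal"]
    for t in (0.5, 0.8):
        print(t, selective_accuracy(probs, labels, t))
```

Raising the threshold trades coverage for accuracy on the cases that are answered, which is the trade-off a learning-with-abstention method is meant to manage.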
Distributed classifier based on genetically engineered bacterial cell cultures
We describe a conceptual design of a distributed classifier formed by a
population of genetically engineered microbial cells. The central idea is to
create a complex classifier from a population of weak or simple classifiers. We
create a master population of cells with randomized synthetic biosensor
circuits that have a broad range of sensitivities towards chemical signals of
interest that form the input vectors subject to classification. The randomized
sensitivities are achieved by constructing a library of synthetic gene circuits
with randomized control sequences (e.g. ribosome-binding sites) in the front
element. The training procedure consists in re-shaping of the master population
in such a way that it collectively responds to the "positive" patterns of input
signals by producing above-threshold output (e.g. fluorescent signal), and
below-threshold output in case of the "negative" patterns. The population
re-shaping is achieved by presenting sequential examples and pruning the
population using either graded selection/counterselection or by
fluorescence-activated cell sorting (FACS). We demonstrate computationally the
feasibility of an experimental implementation of such a system, using a
realistic model of the synthetic sensing gene circuits.
Comment: 31 pages, 9 figures
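The train-by-pruning idea in this abstract can be sketched in a few lines. The sketch is a strong simplification and not the paper's gene-circuit model: each hypothetical cell is a linear threshold unit with random sensitivities, and selection is hard (a cell is removed the first time it responds incorrectly), whereas the abstract describes graded selection or FACS. All names and parameter ranges are assumptions.

```python
import random

def make_population(n_cells, n_inputs, rng):
    """Master population: each cell has random sensitivities and a random threshold."""
    return [
        ([rng.uniform(-1.0, 1.0) for _ in range(n_inputs)], rng.uniform(0.0, 1.0))
        for _ in range(n_cells)
    ]

def responds(cell, signal):
    """A cell gives above-threshold output when its weighted input reaches its threshold."""
    weights, threshold = cell
    return sum(w * s for w, s in zip(weights, signal)) >= threshold

def prune(population, signal, is_positive):
    """Hard selection: keep only the cells whose response matches the example's class."""
    return [cell for cell in population if responds(cell, signal) == is_positive]

def train(population, examples):
    """Present the examples sequentially, pruning the population after each one."""
    for signal, is_positive in examples:
        population = prune(population, signal, is_positive)
    return population

def population_output(population, signal):
    """Collective output: the fraction of cells that respond to the signal."""
    if not population:
        return 0.0
    return sum(responds(cell, signal) for cell in population) / len(population)

if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical two-chemical input patterns: (signal, is_positive).
    examples = [((0.9, 0.1), True), ((0.1, 0.9), False),
                ((0.8, 0.2), True), ((0.2, 0.8), False)]
    survivors = train(make_population(2000, 2, rng), examples)
    print(len(survivors), population_output(survivors, (0.85, 0.15)))
```

Because pruning only ever removes cells, every surviving cell remains consistent with all examples presented so far; a graded selection scheme would relax this and keep partially consistent cells at reduced abundance.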