2,327 research outputs found
Random Relational Rules
Exhaustive search in relational learning is generally infeasible, therefore some form of heuristic search is usually employed, such as in FOIL[1]. On the other hand, so-called stochastic discrimination provides a framework for combining arbitrary numbers of weak classifiers (in this case randomly generated relational rules) in a way where accuracy improves with additional rules, even after maximal accuracy on the training data has been reached. [2] The weak classifiers must have a slightly higher probability of covering instances of their target class than of other classes. As the rules are also independent and identically distributed, the Central Limit theorem applies and as the number of weak classifiers/rules grows, coverages for different classes resemble well-separated normal distributions. Stochastic discrimination is closely related to other ensemble methods like Bagging, Boosting, or Random forests, all of which have been tried in relational learning [3, 4, 5]
Large-Scale Online Semantic Indexing of Biomedical Articles via an Ensemble of Multi-Label Classification Models
Background: In this paper we present the approaches and methods employed in
order to deal with a large scale multi-label semantic indexing task of
biomedical papers. This work was mainly implemented within the context of the
BioASQ challenge of 2014. Methods: The main contribution of this work is a
multi-label ensemble method that incorporates a McNemar statistical
significance test in order to validate the combination of the constituent
machine learning algorithms. Some secondary contributions include a study on
the temporal aspects of the BioASQ corpus (observations apply also to the
BioASQ's super-set, the PubMed articles collection) and the proper adaptation
of the algorithms used to deal with this challenging classification task.
Results: The ensemble method we developed is compared to other approaches in
experimental scenarios with subsets of the BioASQ corpus giving positive
results. During the BioASQ 2014 challenge we obtained the first place during
the first batch and the third in the two following batches. Our success in the
BioASQ challenge proved that a fully automated machine-learning approach, which
does not implement any heuristics and rule-based approaches, can be highly
competitive and outperform other approaches in similar challenging contexts
Hierarchical Multi-resolution Mesh Networks for Brain Decoding
We propose a new framework, called Hierarchical Multi-resolution Mesh
Networks (HMMNs), which establishes a set of brain networks at multiple time
resolutions of fMRI signal to represent the underlying cognitive process. The
suggested framework, first, decomposes the fMRI signal into various frequency
subbands using wavelet transforms. Then, a brain network, called mesh network,
is formed at each subband by ensembling a set of local meshes. The locality
around each anatomic region is defined with respect to a neighborhood system
based on functional connectivity. The arc weights of a mesh are estimated by
ridge regression formed among the average region time series. In the final
step, the adjacency matrices of mesh networks obtained at different subbands
are ensembled for brain decoding under a hierarchical learning architecture,
called, fuzzy stacked generalization (FSG). Our results on Human Connectome
Project task-fMRI dataset reflect that the suggested HMMN model can
successfully discriminate tasks by extracting complementary information
obtained from mesh arc weights of multiple subbands. We study the topological
properties of the mesh networks at different resolutions using the network
measures, namely, node degree, node strength, betweenness centrality and global
efficiency; and investigate the connectivity of anatomic regions, during a
cognitive task. We observe significant variations among the network topologies
obtained for different subbands. We, also, analyze the diversity properties of
classifier ensemble, trained by the mesh networks in multiple subbands and
observe that the classifiers in the ensemble collaborate with each other to
fuse the complementary information freed at each subband. We conclude that the
fMRI data, recorded during a cognitive task, embed diverse information across
the anatomic regions at each resolution.Comment: 18 page
Semantics as a gateway to language
This paper presents an account of semantics as a system that integrates conceptual representations into language. I define the semantic system as an interface level of the conceptual system CS that translates conceptual representations into a format that is accessible by language. The analysis I put forward does not treat the make up of this level as idiosyncratic, but subsumes it under a unified notion of linguistic interfaces. This allows us to understand core aspects of the linguistic-conceptual interface as an instance of a general pattern underlying the correlation of linguistic and non-linguistic structures. By doing so, the model aims to provide a broader perspective onto the distinction between and interaction of conceptual and linguistic processes and the correlation of semantic and syntactic structures
EC3: Combining Clustering and Classification for Ensemble Learning
Classification and clustering algorithms have been proved to be successful
individually in different contexts. Both of them have their own advantages and
limitations. For instance, although classification algorithms are more powerful
than clustering methods in predicting class labels of objects, they do not
perform well when there is a lack of sufficient manually labeled reliable data.
On the other hand, although clustering algorithms do not produce label
information for objects, they provide supplementary constraints (e.g., if two
objects are clustered together, it is more likely that the same label is
assigned to both of them) that one can leverage for label prediction of a set
of unknown objects. Therefore, systematic utilization of both these types of
algorithms together can lead to better prediction performance. In this paper,
We propose a novel algorithm, called EC3 that merges classification and
clustering together in order to support both binary and multi-class
classification. EC3 is based on a principled combination of multiple
classification and multiple clustering methods using an optimization function.
We theoretically show the convexity and optimality of the problem and solve it
by block coordinate descent method. We additionally propose iEC3, a variant of
EC3 that handles imbalanced training data. We perform an extensive experimental
analysis by comparing EC3 and iEC3 with 14 baseline methods (7 well-known
standalone classifiers, 5 ensemble classifiers, and 2 existing methods that
merge classification and clustering) on 13 standard benchmark datasets. We show
that our methods outperform other baselines for every single dataset, achieving
at most 10% higher AUC. Moreover our methods are faster (1.21 times faster than
the best baseline), more resilient to noise and class imbalance than the best
baseline method.Comment: 14 pages, 7 figures, 11 table
- …