76,190 research outputs found
Named entity recognition using a new fuzzy support vector machine.
Recognizing and extracting exact name entities, like Persons, Locations, Organizations, Dates and Times are very useful to mining information from electronics resources and text. Learning to extract these types of data is called Named Entity Recognition(NER) task. Proper named entity recognition and extraction is important to solve most problems in hot research area such as Question Answering and Summarization Systems, Information Retrieval and Information Extraction, Machine Translation, Video Annotation, Semantic Web Search and Bioinformatics,
especially Gene identification, proteins and DNAs
names. Nowadays more researchers use three type of approaches namely, Rule-base NER, Machine Learning-base NER and Hybrid NER to identify names. Machine learning method is
more famous and applicable than others, because it’s more
portable and domain independent. Some of the Machine learning algorithms used in NER methods are, support vector machine(SVM), Hidden Markov Model, Maximum Entropy Model
(MEM) and Decision Tree. In this paper, we review these
methods and compare them based on precision in recognition and also portability using the Message Understanding Conference(MUC) named entity definition and its standard data set to find their strength and weakness of each these methods. We have improved the precision in NER from text using the new proposed method that calls FSVM for NER. In our method we have employed Support Vector Machine as one of the best machine learning algorithm for classification and we contribute a new fuzzy membership function thus removing the Support Vector Machine’s weakness points in NER precision and multi classification. The design of our method is a kind of One-Against-All multi classification technique to solve the traditional binary classifier in SVM
Building Gene Expression Profile Classifiers with a Simple and Efficient Rejection Option in R
Background: The collection of gene expression profiles from DNA microarrays and their analysis with pattern recognition algorithms is a powerful technology applied to several biological problems. Common pattern recognition systems classify samples assigning them to a set of known classes. However, in a clinical diagnostics setup, novel and unknown classes (new pathologies) may appear and one must be able to reject those samples that do not fit the trained model. The problem of implementing a rejection option in a multi-class classifier has not been widely addressed in the statistical literature. Gene expression profiles represent a critical case study since they suffer from the curse of dimensionality problem that negatively reflects on the reliability of both traditional rejection models and also more recent approaches such as one-class classifiers. Results: This paper presents a set of empirical decision rules that can be used to implement a rejection option in a set of multi-class classifiers widely used for the analysis of gene expression profiles. In particular, we focus on the classifiers implemented in the R Language and Environment for Statistical Computing (R for short in the remaining of this paper). The main contribution of the proposed rules is their simplicity, which enables an easy integration with available data analysis environments. Since in the definition of a rejection model tuning of the involved parameters is often a complex and delicate task, in this paper we exploit an evolutionary strategy to automate this process. This allows the final user to maximize the rejection accuracy with minimum manual intervention. Conclusions: This paper shows how the use of simple decision rules can be used to help the use of complex machine learning algorithms in real experimental setups. The proposed approach is almost completely automated and therefore a good candidate for being integrated in data analysis flows in labs where the machine learning expertise required to tune traditional classifiers might not be availabl
Assessing similarity of feature selection techniques in high-dimensional domains
Recent research efforts attempt to combine multiple feature selection techniques instead of using a single one. However, this combination is often made on an “ad hoc” basis, depending on the specific problem at hand, without considering the degree of diversity/similarity of the involved methods. Moreover, though it is recognized that different techniques may return quite dissimilar outputs, especially in high dimensional/small sample size domains, few direct comparisons exist that quantify these differences and their implications on classification performance. This paper aims to provide a contribution in this direction by proposing a general methodology for assessing the similarity between the outputs of different feature selection methods in high dimensional classification problems. Using as benchmark the genomics domain, an empirical study has been conducted to compare some of the most popular feature selection methods, and useful insight has been obtained about their pattern of agreement
EIGEN: Ecologically-Inspired GENetic Approach for Neural Network Structure Searching from Scratch
Designing the structure of neural networks is considered one of the most
challenging tasks in deep learning, especially when there is few prior
knowledge about the task domain. In this paper, we propose an
Ecologically-Inspired GENetic (EIGEN) approach that uses the concept of
succession, extinction, mimicry, and gene duplication to search neural network
structure from scratch with poorly initialized simple network and few
constraints forced during the evolution, as we assume no prior knowledge about
the task domain. Specifically, we first use primary succession to rapidly
evolve a population of poorly initialized neural network structures into a more
diverse population, followed by a secondary succession stage for fine-grained
searching based on the networks from the primary succession. Extinction is
applied in both stages to reduce computational cost. Mimicry is employed during
the entire evolution process to help the inferior networks imitate the behavior
of a superior network and gene duplication is utilized to duplicate the learned
blocks of novel structures, both of which help to find better network
structures. Experimental results show that our proposed approach can achieve
similar or better performance compared to the existing genetic approaches with
dramatically reduced computation cost. For example, the network discovered by
our approach on CIFAR-100 dataset achieves 78.1% test accuracy under 120 GPU
hours, compared to 77.0% test accuracy in more than 65, 536 GPU hours in [35].Comment: CVPR 201
Using Neural Networks for Relation Extraction from Biomedical Literature
Using different sources of information to support automated extracting of
relations between biomedical concepts contributes to the development of our
understanding of biological systems. The primary comprehensive source of these
relations is biomedical literature. Several relation extraction approaches have
been proposed to identify relations between concepts in biomedical literature,
namely, using neural networks algorithms. The use of multichannel architectures
composed of multiple data representations, as in deep neural networks, is
leading to state-of-the-art results. The right combination of data
representations can eventually lead us to even higher evaluation scores in
relation extraction tasks. Thus, biomedical ontologies play a fundamental role
by providing semantic and ancestry information about an entity. The
incorporation of biomedical ontologies has already been proved to enhance
previous state-of-the-art results.Comment: Artificial Neural Networks book (Springer) - Chapter 1
One Decade of Development and Evolution of MicroRNA Target Prediction Algorithms
Nearly two decades have passed since the publication of the first study reporting the discovery of microRNAs (miRNAs). The key role of miRNAs in post-transcriptional gene regulation led to the performance of an increasing number of studies focusing on origins, mechanisms of action and functionality of miRNAs. In order to associate each miRNA to a specific functionality it is essential to unveil the rules that govern miRNA action. Despite the fact that there has been significant improvement exposing structural characteristics of the miRNA-mRNA interaction, the entire physical mechanism is not yet fully understood. In this respect, the development of computational algorithms for miRNA target prediction becomes increasingly important. This manuscript summarizes the research done on miRNA target prediction. It describes the experimental data currently available and used in the field and presents three lines of computational approaches for target prediction. Finally, the authors put forward a number of considerations regarding current challenges and future direction
- …