DNALinux Virtual Desktop Edition

Author: Sebastian Bassi
Virginia V. C. Gonzalez
Publication date: 09/08/2007

The new version of DNALinux (VDE) is presented. DNALinux VDE is a departure from traditional distributions since it uses a virtual machine to bundle together the operating system and bioinformatics applications. The main advantage of this approach is that a virtualized environment doesn't affect an installed system. With a virtual machine, a Linux system can be run under a Windows system, provided that the virtual machine player is installed. The included programs are listed, and specifications for adding more programs are explained. We believe that DNALinux could be used as a standardized virtual machine for learning, using, developing and testing bioinformatics applications.

Crossref

Nature Precedings

Deep learning for supervised classification

Author: Di Ciaccio Agostino
Giorgi Giovanni Maria
Publication venue: CLEUP
Publication date: 01/01/2016

One of the most recent areas of Machine Learning research is Deep Learning. Deep Learning algorithms have been applied successfully to computer vision, automatic speech recognition, natural language processing, audio recognition and bioinformatics. The key idea of Deep Learning is to combine the best techniques from Machine Learning to build powerful general-purpose learning algorithms. It is a mistake to identify Deep Neural Networks with Deep Learning algorithms. Other approaches are possible, and in this paper we illustrate a generalization of Stacking which has very competitive performance. In particular, we show an application of this approach to a real classification problem, where a three-stage Stacking has proved to be very effective.

Archivio della ricerca- Università di Roma La Sapienza

An empirical comparison of supervised machine learning techniques in bioinformatics

Author: Gilbert D
Tan A C
Publication venue: Australian Computer Society
Publication date: 01/01/2003

Research in bioinformatics is driven by experimental data. Current biological databases are populated by vast amounts of experimental data. Machine learning has been widely applied to bioinformatics and has gained a lot of success in this research area. At present, with various learning algorithms available in the literature, researchers are facing difficulties in choosing the best method to apply to their data. We performed an empirical study on 7 individual learning systems and 9 different combined methods on 4 different biological data sets, and suggest some issues to be considered when answering the following questions: (i) How does one choose which algorithm is best suited to a given data set? (ii) Are combined methods better than a single approach? (iii) How does one compare the effectiveness of a particular algorithm to the others?

CiteSeerX

Brunel University Research Archive

Multi-class protein fold classification using a new ensemble machine learning approach.

Author: Deville Y
Gilbert D
Tan A
Publication venue: GIW
Publication date: 01/01/2003

Protein structure classification represents an important process in understanding the associations between sequence and structure as well as possible functional and evolutionary relationships. Recent structural genomics initiatives and other high-throughput experiments have populated the biological databases at a rapid pace. The amount of structural data has made traditional methods, such as manual inspection of protein structures, impossible. Machine learning has been widely applied to bioinformatics and has gained a lot of success in this research area. This work proposes a novel ensemble machine learning method that improves the coverage of the classifiers on multi-class imbalanced sample sets by integrating knowledge induced from different base classifiers, and we illustrate this idea by classifying multi-class SCOP protein fold data. We have compared our approach with PART and show that our method improves the sensitivity of the classifier in protein fold classification. Furthermore, we have extended this method to learning over multiple data types, preserving the independence of their corresponding data sources, and show that our new approach performs at least as well as the traditional technique over a single joined data source. These experimental results are encouraging, and the approach can be applied to other bioinformatics problems similarly characterised by multi-class imbalanced data sets held in multiple data sources.

Brunel University Research Archive

Deep Learning for Metagenomic Data: using 2D Embeddings and Convolutional Neural Networks

Author: Chevaleyre Yann
Nguyen Thanh Hai
Prifti Edi
Sokolovska Nataliya
Zucker Jean-Daniel
Publication date: 01/12/2017

Deep learning (DL) techniques have had unprecedented success when applied to images, waveforms, and text, to cite a few. In general, when the sample size (N) is much greater than the number of features (d), DL outperforms previous machine learning (ML) techniques, often through the use of convolutional neural networks (CNNs). However, in many bioinformatics ML tasks, we encounter the opposite situation, where d is greater than N. In these situations, applying DL techniques (such as feed-forward networks) would lead to severe overfitting. Thus, sparse ML techniques (such as LASSO) usually yield the best results on these tasks. In this paper, we show how to apply CNNs to data that do not originally have an image structure (in particular, metagenomic data). Our first contribution is to show how to map metagenomic data in a meaningful way to 1D or 2D images. Based on this representation, we then apply a CNN, with the aim of predicting various diseases. The proposed approach is applied to six different datasets comprising in total over 1000 samples from various diseases. This approach could be a promising one for prediction tasks in the bioinformatics field. Comment: Accepted at NIPS 2017 Workshop on Machine Learning for Health (https://ml4health.github.io/2017/); In Proceedings of the NIPS ML4H 2017 Workshop in Long Beach, CA, USA
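
As a rough illustration of what mapping a feature vector to a 2D image can look like, the sketch below lays one sample's features row by row into the smallest square grid that holds them and pads the unused cells. This is an assumed, simplified layout, not necessarily the embedding used in the paper; the function name `vector_to_square_image`, the padding value, and the example abundances are all hypothetical.

```python
import math


def vector_to_square_image(features, pad_value=0.0):
    """Arrange a 1D feature vector row by row into a square grid.

    Hypothetical helper for illustration only: the grid side is the
    smallest integer whose square is at least len(features), and any
    unused cells are filled with pad_value.
    """
    n = len(features)
    if n == 0:
        return []
    # Smallest side such that side * side >= n (integer ceil of sqrt(n)).
    side = math.isqrt(n - 1) + 1
    padded = list(features) + [pad_value] * (side * side - n)
    # Slice the padded vector into `side` rows of `side` cells each.
    return [padded[r * side:(r + 1) * side] for r in range(side)]


# Made-up relative abundances for one sample (five features).
sample = [0.5, 0.2, 0.1, 0.15, 0.05]
image = vector_to_square_image(sample)
for row in image:
    print(row)
```

A grid like this could then be passed to a CNN as a single-channel image. The ordering of the features decides which values end up adjacent, so a meaningful ordering matters more than the padding scheme.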

arXiv.org e-Print Archive

HAL-IRD

Bioinformatics: a knowledge engineering approach

Author: Kasabov N
Publication venue: IEEE
Publication date: 27/05/2009

The paper introduces the knowledge engineering (KE) approach for the modeling and the discovery of new knowledge in bioinformatics. This approach extends the machine learning approach with various rule extraction and other knowledge representation procedures. Examples are given of applying the KE approach, and especially one of the recently developed techniques, evolving connectionist systems (ECOS), to challenging problems in bioinformatics, including DNA sequence analysis, microarray gene expression profiling, protein structure prediction, finding gene regulatory networks, medical prognostic systems, and computational neurogenetic modeling.

AUT Scholarly Commons

Learning what to read: Focused machine reading

Author: Morrison Clayton T.
Noriega-Atala Enrique
Surdeanu Mihai
Valenzuela-Escarcega Marco A.
Publication date: 01/01/2017

Recent efforts in bioinformatics have achieved tremendous progress in the machine reading of biomedical literature, and the assembly of the extracted biochemical interactions into large-scale models such as protein signaling pathways. However, batch machine reading of literature at today's scale (PubMed alone indexes over 1 million papers per year) is unfeasible due to both cost and processing overhead. In this work, we introduce a focused reading approach to guide the machine reading of biomedical literature towards what literature should be read to answer a biomedical query as efficiently as possible. We introduce a family of algorithms for focused reading, including an intuitive, strong baseline, and a second approach which uses a reinforcement learning (RL) framework that learns when to explore (widen the search) or exploit (narrow it). We demonstrate that the RL approach is capable of answering more queries than the baseline, while being more efficient, i.e., reading fewer documents. Comment: 6 pages, 1 figure, 1 algorithm, 2 tables, accepted to EMNLP 201

arXiv.org e-Print Archive

Crossref

Analysis of Stock Market using Machine Learning

Author: G Skanda
Mandi Riteesh
Publication venue: IJEAP
Publication date: 25/05/2021

Machine Learning is a prominent area of research that emphasizes finding patterns in existing data. The field of Machine Learning can be concisely described as enabling computers to make productive predictions using previous experience. As a large amount of information is available everywhere, it is very important to analyze this data in order to extract useful information and to develop algorithms based on this analysis. This can be done through data mining and Machine Learning. In addition to many other fields, Machine Learning models have broad applications in the field of Bioinformatics. The complexity involved in biological analysis has led to the development of sophisticated Machine Learning methods. This research paper discusses the importance of a data-driven approach, compared to the formalization of traditional Artificial Intelligence, and focuses primarily on a key approach to forecasting a company's workflow using Machine Learning.

International Journal of Engineering and Applied Physics
