Extracting protein-protein interactions from text using rich feature vectors and feature selection

Author: De Baets Bernard
Saeys Yvan
Van de Peer Yves
Van Landeghem Sofie
Publication venue: Turku Centre for Computer Sciences (TUCS)
Publication date: 01/01/2008

Because of the intrinsic complexity of natural language, automatically extracting accurate information from text remains a challenge. We have applied rich feature vectors derived from dependency graphs to predict protein-protein interactions using machine learning techniques. We present the first extensive analysis of applying feature selection in this domain, and show that it can produce more cost-effective models. For the first time, our technique was also evaluated on several large-scale cross-dataset experiments, which offers a more realistic view on model performance. During benchmarking, we encountered several fundamental problems hindering comparability with other methods. We present a set of practical guidelines to set up a meaningful evaluation. Finally, we have analysed the feature sets from our experiments before and after feature selection, and evaluated the contribution of both lexical and syntactic information to our method. The gained insight will be useful to develop better performing methods in this domain.

Ghent University Academic Bibliography

Learning Dictionaries for Named Entity Recognition using Minimal Supervision

Author: Collins Michael
Neelakantan Arvind
Publication date: 01/01/2014

This paper describes an approach for automatic construction of dictionaries for Named Entity Recognition (NER) using large amounts of unlabeled data and a few seed examples. We use Canonical Correlation Analysis (CCA) to obtain lower dimensional embeddings (representations) for candidate phrases and classify these phrases using a small number of labeled examples. Our method achieves 16.5% and 11.3% F-1 score improvement over co-training on disease and virus NER respectively. We also show that adding candidate phrase embeddings as features in a sequence tagger gives better performance than using word embeddings.

Comment: In the 14th Conference of the European Chapter of the Association for Computational Linguistics.

arXiv.org e-Print Archive

CiteSeerX

Crossref

One Decade of Development and Evolution of MicroRNA Target Prediction Algorithms

Author: Alexiou
Altuvia
Baek
Bandyopadhyay
Barreau
Bartel
Bartel
Betel
Betel
Cai
Chandra
Chi
Didiano
Elisa Ficarra
Enright
Friedman
Friedman
Gaidatzis
Garcia
Griffiths-Jones
Grimson
Guo
Hafner
Hsu
Jacobsen
Ji
Jin
John
Kertesz
Khan
Kim
Kiriakidou
Kruger
Kumar
Lagos-Quintana
Lall
Lee
Lee
Lewis
Lim
Lund
Maragkakis
Mendes
Min
Miranda
Muckstein
Pandey
Papadopoulos
Paula H. Reyes-Herrera
Reyes-Herrera
Saetrom
Sandberg
Schmidt
Selbach
Stefani
Sturm
Tan
Thomas
Thomson
Vergoulis
Wang
Watanabe
Wen
Witkos
Xiao
Yan
Yang
Yang
Yousef
Zhao
Publication venue: Elsevier BV
Publication date: 01/01/2012

Nearly two decades have passed since the publication of the first study reporting the discovery of microRNAs (miRNAs). The key role of miRNAs in post-transcriptional gene regulation has led to a growing number of studies focusing on the origins, mechanisms of action and functionality of miRNAs. In order to associate each miRNA with a specific functionality, it is essential to unveil the rules that govern miRNA action. Although there has been significant progress in exposing structural characteristics of the miRNA-mRNA interaction, the entire physical mechanism is not yet fully understood. In this respect, the development of computational algorithms for miRNA target prediction becomes increasingly important. This manuscript summarizes the research done on miRNA target prediction. It describes the experimental data currently available and used in the field and presents three lines of computational approaches for target prediction. Finally, the authors put forward a number of considerations regarding current challenges and future directions.

Elsevier - Publisher Connector

Crossref

PubMed Central

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

PORTO Publications Open Repository TOrino

Efficient Correlated Topic Modeling with Topic Embedding

Author: Berg-Kirkpatrick Taylor
He Junxian
Hu Zhiting
Huang Ying
Xing Eric P.
Publication date: 01/07/2017

Correlated topic modeling has been limited to small model and problem sizes due to its high computational cost and poor scaling. In this paper, we propose a new model which learns compact topic embeddings and captures topic correlations through the closeness between the topic vectors. Our method enables efficient inference in the low-dimensional embedding space, reducing previous cubic or quadratic time complexity to linear w.r.t. the topic size. We further speed up variational inference with a fast sampler to exploit sparsity of topic occurrence. Extensive experiments show that our approach is capable of handling model and data scales which are several orders of magnitude larger than existing correlation results, without sacrificing modeling quality by providing competitive or superior performance in document classification and retrieval.

Comment: KDD 2017 oral. The first two authors contributed equally.

arXiv.org e-Print Archive

Crossref

Towards a Protein-Protein Interaction information extraction system: recognizing named entities

Author: Alfred
Antonio Molina
Aronson
Bader
Baeza-yates
Danger
Denny
Dingare
Ferran Pla
Giles
Habib
Hersh
Kerrien
Leaman
Lee
Levenshtein
Li
Lindberg
McCandless
Miller
Mishra
Nadkarni
Orchard
Pagel
Paolo Rosso
Phizicky
Rebholz-Schuhmann
Reguly
Ristad
Roxana Danger
Salwinski
Schneider
Smith
Song
Sun
Thomas
Tsai
Tsuruoka
Wang
Zanzoni
Publication venue: Elsevier BV
Publication date: 01/02/2014

The majority of biological functions of any living being are related to Protein-Protein Interactions (PPI). PPI discoveries are reported in the form of research publications whose volume grows day after day. Consequently, automatic PPI information extraction systems are a pressing need for biologists. In this paper we are mainly concerned with the named entity detection module of PPIES (the PPI information extraction system we are implementing), which recognizes twelve entity types relevant in the PPI context. It is composed of two sub-modules: a dictionary look-up with extensive normalization and acronym detection, and a Conditional Random Field classifier. The dictionary look-up module has been tested with the Interaction Method Task (IMT), and it improves on current solutions that do not use Machine Learning (ML) by approximately 10%. The second module has been used to create a classifier using the Joint Workshop on Natural Language Processing in Biomedicine and its Applications (JNLPBA 04) data set. It does not use any external resources, or complex or ad hoc post-processing, and obtains 77.25%, 75.04% and 76.13% for precision, recall, and F1-measure, respectively, improving all previous results obtained for this data set.

This work has been funded by MICINN, Spain, as part of the "Juan de la Cierva" Program and the Project DIANA-Applications (TIN2012-38603-C02-01), as well as by the European Commission as part of the WIQ-EI IRSES Project (Grant No. 269180) within the FP 7 Marie Curie People Framework.

Danger Mercaderes, RM.; Pla Santamaría, F.; Molina Marco, A.; Rosso, P. (2014). Towards a Protein-Protein Interaction information extraction system: recognizing named entities. Knowledge-Based Systems. 57:104-118. https://doi.org/10.1016/j.knosys.2013.12.010
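
As a quick consistency check on the figures reported for the Conditional Random Field classifier, the F1-measure is the harmonic mean of precision and recall. The sketch below simply recomputes it from the two reported percentages; the function name `f1_score` is a hypothetical helper for illustration and is not taken from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units in, same units out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the abstract, in percent.
reported_precision = 77.25
reported_recall = 75.04

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}%")
```

The recomputed value agrees with the reported F1-measure to two decimal places.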

Crossref

RiuNet

DeepConv-DTI: Prediction of drug-target interactions via deep learning with convolution on protein sequences

Author: Keum Jongsoo
Lee Ingoo
Nam Hojung
Publication venue: Public Library of Science (PLoS)
Publication date: 05/11/2018

Identification of drug-target interactions (DTIs) plays a key role in drug discovery. The high cost and labor-intensive nature of in vitro and in vivo experiments have highlighted the importance of in silico-based DTI prediction approaches. In several computational models, conventional protein descriptors are shown not to be informative enough to predict DTIs accurately. Thus, in this study, we employ a convolutional neural network (CNN) on raw protein sequences to capture local residue patterns participating in DTIs. With CNN on protein sequences, our model performs better than previous protein descriptor-based models. In addition, our model performs better than the previous deep learning model for massive prediction of DTIs. By examining the pooled convolution results, we found that our model can detect binding sites of proteins for DTIs. In conclusion, our prediction model for detecting local residue patterns of target proteins successfully enriches the protein features of a raw protein sequence, yielding better prediction results than previous approaches.

Comment: 26 pages, 7 figures
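
The idea of convolving over a raw protein sequence and inspecting the pooled result can be illustrated with a toy example. The sketch below is not the DeepConv-DTI architecture: it slides a single hand-written filter over a made-up sequence and applies global max pooling, so that the pooled value also points back to the window that produced it. The filter weights, the sequence, and the names `conv1d_scores` and `global_max_pool` are all assumptions for illustration; in a real model the weights would be learned.

```python
def conv1d_scores(sequence, motif_filter):
    """Slide a filter over the sequence and return one score per window.

    motif_filter is a list of dicts, one per filter position, mapping a
    residue letter to its weight (missing residues contribute 0.0). This is
    equivalent to a 1D convolution over a one-hot encoded sequence.
    """
    width = len(motif_filter)
    return [
        sum(motif_filter[k].get(sequence[i + k], 0.0) for k in range(width))
        for i in range(len(sequence) - width + 1)
    ]

def global_max_pool(scores):
    """Return the highest window score and the index of the window it came from."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return scores[best], best

# Hypothetical width-3 filter and toy sequence, chosen only for illustration.
toy_filter = [{"G": 1.0}, {"K": 1.0, "R": 0.5}, {"S": 1.0, "T": 1.0}]
sequence = "MAAGKSTLLNAL"

scores = conv1d_scores(sequence, toy_filter)
value, position = global_max_pool(scores)
print(value, position, sequence[position:position + len(toy_filter)])
```

Keeping the argmax alongside the pooled value is what makes it possible to map a strong filter response back to a region of the sequence.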

arXiv.org e-Print Archive

Directory of Open Access Journals

Non-linear dynamical analysis of resting tremor for demand-driven deep brain stimulation.

Author: Aziz Tipu
Camara Carmen
Parkkonen Lauri
Pereda Ernesto
Subramaniyam Narayan P.
Warwick Kevin
Publication venue: MDPI AG
Publication date: 01/01/2019

Parkinson's Disease (PD) is currently the second most common neurodegenerative disease. One of the most characteristic symptoms of PD is resting tremor. Local Field Potentials (LFPs) have been widely studied to investigate deviations from the typical patterns of healthy brain activity. However, the inherent dynamics of the Sub-Thalamic Nucleus (STN) LFPs and their spatiotemporal dynamics have not been well characterized. In this work, we study the non-linear dynamical behaviour of STN-LFPs of Parkinsonian patients using ε-recurrence networks (RNs). RNs are a non-linear analysis tool that encodes the geometric information of the underlying system, which can be characterised (for example, using graph theoretical measures) to extract information on the geometric properties of the attractor. Results show that the activity of the STN becomes more non-linear during the tremor episodes and that ε-recurrence network analysis is a suitable method to distinguish the transitions between movement conditions, anticipating the onset of the tremor, with the potential for application in a demand-driven deep brain stimulation system.
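
For readers unfamiliar with ε-recurrence networks, the following minimal sketch shows the general construction: a time series is turned into state vectors by time-delay embedding, and two states are linked whenever they lie closer than ε. The embedding dimension, delay, ε value, Euclidean norm, and the sine test signal are all assumptions chosen for illustration and are not the settings used in the paper; edge density stands in for the graph theoretical measures the abstract mentions.

```python
import math

def delay_embed(signal, dim, delay):
    """Time-delay embedding: map a scalar series to dim-dimensional state vectors."""
    n = len(signal) - (dim - 1) * delay
    return [tuple(signal[i + k * delay] for k in range(dim)) for i in range(n)]

def recurrence_network(points, eps):
    """Adjacency matrix linking distinct states closer than eps (no self-loops)."""
    n = len(points)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < eps:
                adj[i][j] = adj[j][i] = 1
    return adj

def edge_density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    if n < 2:
        return 0.0
    edges = sum(sum(row) for row in adj) / 2
    return edges / (n * (n - 1) / 2)

# Synthetic periodic signal standing in for a recorded time series.
signal = [math.sin(0.3 * t) for t in range(200)]
points = delay_embed(signal, dim=3, delay=5)
adj = recurrence_network(points, eps=0.2)
density = edge_density(adj)
print(len(points), round(density, 3))
```

The double loop is quadratic in the number of states, which is fine for a sketch but would need a spatial index or vectorised distance computation for long recordings.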

Multidisciplinary Digital Publishing Institute

Central Archive at the University of Reading

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Aaltodoc Publication Archive

Coventry University Pure Portal

Universidad Carlos III de Madrid e-Archivo

Multi-Class Classifier in Parkinson’s Disease Using an Evolutionary Multi-Objective Optimization Algorithm

Author: Rojas Ruiz Fernando José
Rojas Valenzuela Ignacio
Valenzuela Cansino Olga
Publication venue: MDPI AG
Publication date: 16/03/2022

In this contribution, a novel methodology for multi-class classification in the field of Parkinson’s disease is proposed. The methodology is structured in two phases. In the first phase, the most relevant volumes of interest (VOIs) of the brain are selected by means of an evolutionary multi-objective optimization (MOE) algorithm. Each of these VOIs is subjected to volumetric feature extraction using the Three-Dimensional Discrete Wavelet Transform (3D-DWT). Applying the 3D-DWT yields a large number of coefficients, requiring the use of feature selection/reduction algorithms to find the most relevant features. The method used in this contribution is based on Mutual Information (MI), minimum Redundancy Maximum Relevance (mRMR) and PCA. To optimize the VOI selection, a first group of 550 MRIs was used for the 5 classes: PD, SWEDD, Prodromal, GeneCohort and Normal. Once the Pareto Front of the solutions is obtained (with varying degrees of complexity, reflected in the number of selected VOIs), these solutions are tested in a second phase. To analyze the accuracy of the SVM classifier, a test set of 367 MRIs was used. The methodology obtains relevant results in multi-class classification, presenting several solutions with different levels of complexity and precision (Pareto Front solutions), and reaching 97% as the highest precision on the test data.

This work was funded by the Spanish Ministry of Sciences, Innovation and Universities under Project RTI-2018-101674-B-I00 and the projects from Junta de Andalucia B-TIC-414, A-TIC-530-UGR20 and P20-00163.

Repositorio Institucional Universidad de Granada