841 research outputs found

The potential of text mining in data integration and network biology for plant research : a case study on Arabidopsis

Author: De Bodt Stefanie, Drebert Zuzanna, Inzé Dirk, Van de Peer Yves, Van Landeghem Sofie
Publication venue: American Society of Plant Biologists (ASPB)
Publication date: 01/01/2013

Despite the availability of various data repositories for plant research, a wealth of information currently remains hidden within the biomolecular literature. Text mining provides the necessary means to retrieve these data through automated processing of texts. However, only recently has advanced text mining methodology been implemented with sufficient computational power to process texts at a large scale. In this study, we assess the potential of large-scale text mining for plant biology research in general and for network biology in particular using a state-of-the-art text mining system applied to all PubMed abstracts and PubMed Central full texts. We present extensive evaluation of the textual data for Arabidopsis thaliana, assessing the overall accuracy of this new resource for usage in plant network analyses. Furthermore, we combine text mining information with both protein-protein and regulatory interactions from experimental databases. Clusters of tightly connected genes are delineated from the resulting network, illustrating how such an integrative approach is essential to grasp the current knowledge available for Arabidopsis and to uncover gene information through guilt by association. All large-scale data sets, as well as the manually curated textual data, are made publicly available, thereby stimulating the application of text mining data in future plant biology studies.

Ghent University Academic Bibliography

PubMed Central

Non-coding yet non-trivial: a review on the computational genomics of lincRNAs

Publication venue: BioMed Central
Publication date: 22/12/2015

Springer - Publisher Connector

Classification of microarray gene expression cancer data by using artificial intelligence methods

Author: Mumbuçoğlu Mehmet Şükrü
Publication venue: Hasan Kalyoncu Üniversitesi
Publication date: 01/01/2019

Advances in computer technology have influenced work in many fields. Developments in molecular biology and computer technology gave rise to the science of bioinformatics. Rapid progress in bioinformatics has contributed greatly to solving many of the open problems in the field. The classification of DNA microarray gene expression data is one of these problems. DNA microarrays are a technology used in bioinformatics. DNA microarray data analysis plays a very effective role in diagnosing gene-related diseases such as cancer. By determining the gene expression associated with a disease type, it is possible to detect with a high success rate whether an individual carries a diseased gene. To determine whether an individual is healthy, it is very important to apply high-performance classification techniques to microarray gene expression data. Many methods exist for classifying DNA microarrays. Statistical methods such as Support Vector Machines, Naive Bayes, k-Nearest Neighbors and Decision Trees are widely used. However, when used alone, these methods do not always achieve high success rates in classifying microarray data. For this reason, studies have also used artificial-intelligence-based methods to obtain high success rates in microarray classification. This study aims to achieve higher success rates by using an artificial-intelligence-based method, ANFIS, in addition to these statistical methods. K-Nearest Neighbors, Naive Bayes and Support Vector Machines were used as the statistical classification methods. Experiments were carried out on two different cancer data sets: breast cancer and central nervous system cancer.
According to the results, the artificial-intelligence-based ANFIS technique was found to be generally more successful than the statistical methods.

DSpace@HKU

deepBase: a database for deeply annotating and mining deep sequencing data

Author: Chen Yue-Qin, Qu Liang-Hu, Shao Peng, Yang Jian-Hua, Zhou Hui
Publication venue: Oxford University Press
Publication date: 01/01/2010

Advances in high-throughput next-generation sequencing technology have reshaped the transcriptomic research landscape. However, exploration of these massive data remains a daunting challenge. In this study, we describe a novel database, deepBase, which we have developed to facilitate the comprehensive annotation and discovery of small RNAs from transcriptomic data. The current release of deepBase contains deep sequencing data from 185 small RNA libraries from diverse tissues and cell lines of seven organisms: human, mouse, chicken, Ciona intestinalis, Drosophila melanogaster, Caenorhabditis elegans and Arabidopsis thaliana. By analyzing ∼14.6 million unique reads that perfectly mapped to more than 284 million genomic loci, we annotated and identified ∼380 000 unique ncRNA-associated small RNAs (nasRNAs), ∼1.5 million unique promoter-associated small RNAs (pasRNAs), ∼4.0 million unique exon-associated small RNAs (easRNAs) and ∼6 million unique repeat-associated small RNAs (rasRNAs). Furthermore, 2038 miRNA and 1889 snoRNA candidates were predicted by miRDeep and snoSeeker. All of the mapped reads can be grouped into about 1.2 million RNA clusters. For the purpose of comparative analysis, deepBase provides an integrative, interactive and versatile display. A convenient search option, related publications and other useful information are also provided for further investigation. deepBase is available at: http://deepbase.sysu.edu.cn/

CiteSeerX

PubMed Central

Rough Set Soft Computing Cancer Classification and Network: One Stone, Two Birds

Author: Zhang Yue
Publication venue: Libertas Academica
Publication date: 01/07/2010

Gene expression profiling provides tremendous information to help unravel the complexity of cancer. The selection of the most informative genes from huge noise for cancer classification has taken centre stage, along with predicting the function of such identified genes and the construction of direct gene regulatory networks at different system levels with a tuneable parameter. A new study by Wang and Gotoh described a novel Variable Precision Rough Sets-rooted robust soft computing method to successfully address these problems and has yielded some new insights. The significance of this progress and its perspectives will be discussed in this article

Directory of Open Access Journals

PubMed Central

Computational analysis of noncoding RNAs

Author: Akutsu, Alkan, Amaral, Ambros, Anders, Andronescu, Aravin, Au, Bartel, Bateman, Bentwich, Berezikov, Bernhart, Bon, Bonnet, Breaker, Busch, Cabili, Chen, Clark, Coventry, Deigan, del Val, Ding, Dinger, Dirks, Do, Dowell, ENCODE Project Consortium, Findei, Flamm, Freyhult, Frhlich, Friedländer, Frith, Galperin, Gardner, Gautheret, Grad, Griffith, Grishok, Gruber, Guttman, Harmanci, Havgaard, He, Hendrix, Hertel, Hofacker, Jiang, Katz, Kertesz, Knudsen, Kozomara, Kruger, Lagesen, Lai, Laing, Laslett, Lau, Li, Lim, Lin, Lorenz, Lowe, Lu, Lucks, Lyngso, Macke, Markham, Mathelier, Mathews, Mattick, McCaskill, Menzel, Mituyama, Mohl, Mortazavi, Mourier, Mückstein, Nagel, Nam, Nawrocki, Noller, Nussinov, Ohler, Pang, Parisien, Pasquinelli, Pedersen, Pervouchine, Pfeffer, Reeder, Regalia, Ren, Reuter, Rivas, Roberts, Robertson, Robinson, Ruan, Ruby, Salari, Sankoff, Sato, Schattner, Schnall-Levin, Seemann, Seetin, Seitz, Sethupathy, Shi, Sperschneider, Stark, Tabaska, Tang, Torarinsson, Trapnell, Uemura, Underwood, van Bakel, Wang, Washietl, Weeks, Weinberg, Will, Wolfinger, Wu, Wuchty, Xayaphoummine, Xia, Xie, Xue, Yao, Zerbino, zu Siederdissen, Zuker
Publication venue: Wiley
Publication date: 01/11/2012

Noncoding RNAs have emerged as key players in the cell. Understanding their surprisingly diverse range of functions is challenging for experimental and computational biology. Here, we review computational methods to analyze noncoding RNAs. The topics covered include basic and advanced techniques to predict RNA structures, annotation of noncoding RNAs in genomic data, mining RNA-seq data for novel transcripts and prediction of transcript structures, computational aspects of microRNAs, and database resources. Funding: Austrian Science Fund (Schrodinger Fellowship J2966-B12); German Research Foundation (grant WI 3628/1-1 to SW); National Institutes of Health (U.S.) (NIH award 1RC1CA147187)

DSpace@MIT

Crossref

PubMed Central

Decoding function through comparative genomics: from animal evolution to human disease

Author: Maxwell Evan Kyle
Publication date: 12/03/2016

Deciphering the functionality encoded in the genome constitutes an essential first step to understanding the context through which mutations can cause human disease. In this dissertation, I present multiple studies based on the use or development of comparative genomics techniques to elucidate function (or lack of function) from the genomes of humans and other animal species. Collectively, these studies focus on two biological entities encoded in the human genome: genes related to human disease susceptibility and those that encode microRNAs - small RNAs that have important gene-regulatory roles in normal biological function and in human disease. Extending this work, I investigated the evolution of these biological entities within animals to shed light on how their underlying functions arose and how they can be modeled in non-human species. Additionally, I present a new tool that uses large-scale clinical genomic data to identify human mutations that may affect microRNA regulatory functions, thereby providing a method by which state-of-the-art genomic technologies can be fully utilized in the search for new disease mechanisms and potential drug targets. The scientific contributions made in this dissertation utilize current data sets generated using high-throughput sequencing technologies. For example, recent whole-genome sequencing studies of the most distant animal lineages have effectively restructured the animal tree of life as we understand it. The first two chapters utilize data from this new high-confidence animal phylogeny - in addition to data generated in the course of my work - to demonstrate that (1) certain classes of human disease have uncommonly large proportions of genes that evolved with the earliest animals and/or vertebrates, and (2) that canonical microRNA functionality - absent in at least two of the early branching animal lineages - likely evolved after the first animals. 
In the third chapter, I expand upon recent research in predicting microRNA target sites, describing a novel tool for predicting clinically significant microRNA target site variants and demonstrating its applicability to the analysis of clinical genomic data. Thus, the studies detailed in this dissertation represent significant advances in our understanding of the functions of disease genes and microRNAs from both an evolutionary and a clinical perspective

Boston University Institutional Repository (OpenBU)

A novel deep mining model for effective knowledge discovery from omics data

Author: Alzubaidi A, Lotfi A, Tepper J
Publication venue: Elsevier BV
Publication date: 01/04/2020

Knowledge discovery from omics data has become a common goal of current approaches to personalised cancer medicine and understanding cancer genotype and phenotype. However, high-throughput biomedical datasets are characterised by high dimensionality and relatively small sample sizes with small signal-to-noise ratios. Extracting and interpreting relevant knowledge from such complex datasets therefore remains a significant challenge for the fields of machine learning and data mining. In this paper, we exploit recent advances in deep learning to mitigate these limitations by automatically capturing enough of the meaningful abstractions latent within the available biological samples. Our deep feature learning model is based on a set of non-linear sparse Auto-Encoders that are deliberately constructed in an under-complete manner to detect a small proportion of molecules that can recover a large proportion of variations underlying the data. However, since multiple projections are applied to the input signals, it is hard to interpret which phenotypes were responsible for deriving such predictions. Therefore, we also introduce a novel weight interpretation technique that helps to deconstruct the internal state of such deep learning models to reveal key determinants underlying its latent representations. The outcomes of our experiment provide strong evidence that the proposed deep mining model is able to discover robust biomarkers that are positively and negatively associated with cancers of interest. Since our deep mining model is problem-independent and data-driven, it provides further potential for this research to extend beyond its cognate disciplines.

Nottingham Trent Institutional Repository (IRep)

Lower Expression of Genes near microRNA in C. elegans Germline

Author: Fukuoka Yutaka, Inaoka Hidenori, Kohane Isaac Samuel
Publication venue: Springer Science and Business Media LLC
Publication date: 05/02/2012

Background: MicroRNAs (miRNAs) are recently discovered short non-protein-coding RNA molecules. miRNAs are increasingly implicated in tissue-specific transcriptional control and particularly in development. Because there is mounting evidence for the localized component of transcriptional control, we investigated whether there is a distance-dependent effect of miRNA. Results: We analyzed gene expression levels around the 84 of 113 known miRNAs for which nearby genes were measured in two independent C. elegans expression data sets. The expression levels are lower for genes in the vicinity of 59 of 84 (71%) miRNAs as compared to genes far from such miRNAs. Analysis of the genes with lower expression in proximity to the miRNAs reveals an increased frequency of matches to the 7-nucleotide "seeds" of these miRNAs. Conclusion: We found decreased messenger RNA (mRNA) abundance, localized within 10 kb of chromosomal distance of some miRNAs, in the C. elegans germline. The increased frequency of seed matching near miRNAs can explain, in part, the localized effects.

Harvard University - DASH