22 research outputs found

Statistical methods to improve the analysis of biological data: Benchmarking phenotypes, protein function prediction, and spatial modelling of gene expression

Author: Zhou Naihui
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2020

Data collected in biological experiments come in all shapes and sizes, including DNA and protein sequences, mRNA counts, spatial interactions, protein annotations, phenotypic images and so on. To make sense of this myriad of data, novel statistical methods are needed not only to model the biological data but also to assess the accuracy of predictions. In this thesis, I present three research studies that perform statistical analysis in the benchmarking, assessment and modelling of genetic data, demonstrating the diversity of bioinformatics research. The approach taken here is to tailor statistical methods to specific data types. To provide quality benchmark data for phenotypic image processing and assessment, a generalized linear mixed-effects model was used to compare the performance of different groups of people (lay people recruited through Amazon Mechanical Turk versus experts) in highlighting key elements in phenotypic images collected from corn fields. The analyzed images were then used as ground truth for the training and testing of automated methods. We concluded that properly managed crowdsourcing can be used to establish large volumes of viable ground-truth data at low cost and high quality, especially in the context of high-throughput plant phenotyping. To assess the quality of computational protein function predictions, the third Critical Assessment of Functional Annotation (CAFA) was launched to evaluate predictions in the form of a community challenge. Each protein is associated with multiple functions represented by Gene Ontology terms (labels). These ontological terms form a hierarchical structure, and the frequency of each term is not distributed uniformly among different proteins. Precision-recall based assessment metrics were not enough to account for the non-uniform prior distribution of this multi-label problem, so semantic-distance based methods were developed for better model assessment.
We concluded that while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than expectations set by baseline methods, it leaves considerable room and need for improvement. The CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation databases, computational function prediction, and our ability to manage big data in the era of large experimental screens. To model the spatial dependency of gene expression on the 3D structure of the genome, a Poisson Hierarchical Markov Random Field model (PhiMRF) was developed for gene expression data that accounts for the pairwise spatial interactions from Hi-C experiments. The quantitative expression of genes on human chromosomes 1, 4, 5, 6, 8, 9, 12, 19, 20, 21 and X all showed meaningful positive intra-chromosomal spatial dependency. Moreover, the spatial dependency is much stronger than the dependency based on linear gene neighborhoods, suggesting that 3D chromosome structures such as chromatin loops and Topologically Associating Domains (TADs) are indeed strongly correlated with gene expression levels. The results both confirm and quantify the spatial correlation in gene expression. In addition, PhiMRF improves upon the stochastic modelling of gene expression that is currently widely used in differential expression analyses. PhiMRF is available at https://github.com/ashleyzhou972/PhiMRF as an R package.

Digital Repository @ Iowa State University (ISU)
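
The abstract of this record contrasts precision-recall metrics with semantic-distance metrics but gives no formulas. The following is a minimal sketch of that contrast, assuming the commonly used formulation in which each ontology term carries an information content (IC), "remaining uncertainty" sums the IC of true terms that were missed, "misinformation" sums the IC of wrongly predicted terms, and the semantic distance is their Euclidean norm. The term names and IC values are hypothetical and are not taken from the thesis.

```python
import math

def precision_recall(predicted, truth):
    """Set-based precision and recall; every term counts equally."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall

def semantic_distance(predicted, truth, ic):
    """Information-weighted errors (assumed formulation, see lead-in).

    ru: remaining uncertainty, IC of true terms that were not predicted.
    mi: misinformation, IC of predicted terms that are not true.
    """
    ru = sum(ic[term] for term in truth - predicted)
    mi = sum(ic[term] for term in predicted - truth)
    return ru, mi, math.hypot(ru, mi)

# Hypothetical terms and information-content values (rarer terms score higher).
ic = {"root": 0.0, "binding": 1.0, "kinase_activity": 4.0, "transport": 2.0}
truth = {"root", "binding", "kinase_activity"}
predicted = {"root", "binding", "transport"}

precision, recall = precision_recall(predicted, truth)
ru, mi, s = semantic_distance(predicted, truth, ic)
print(f"precision={precision:.3f} recall={recall:.3f}")
print(f"ru={ru:.1f} mi={mi:.1f} semantic distance={s:.3f}")
```

In this toy case precision and recall treat the missed and the spurious term identically, whereas the weighted measure penalizes missing the rarer, more informative term more heavily.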

Evaluation on Transfer Efficiency at Integrated Transport Terminals through Multilevel Grey Evaluation

Author: Chen Shaokuan
Kou Chunge
Leng Yan
Li Qing
Liang Yanjiao
Xu Zichuan
Zhou Naihui
Publication venue: Elsevier Ltd.
Publication date: 31/12/2012

Transfer efficiency in integrated transportation terminals is of great importance to both passengers and operating companies. In this paper, we propose various criteria and a hierarchical index system to evaluate the performance of transfer conditions inside Beijing South Railway Station. To make the assessment more scientific, we assign a weight to each criterion using an integrated weighting method. We then use Multi-level Grey Evaluation to calculate performance indexes for the different transfer modes in the station, and we compare the resulting rankings of transfer efficiency across modes.

Elsevier - Publisher Connector

The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens

Author: Alborzi Seyed Ziaeddin
Altenhoff Adrian
Amezola Miguel
Antczak Magdalena
Aridhi Sabeur
Asgari Ehsaneddin
Atalay Volkan
Babbitt Patricia C.
Barot Meet
Ben-Hur Asa
Benso Alfredo
Bergquist Timothy R.
Berselli Michele
Bhat Prajwal
Björne Jari
Black Gage S.
Boecker Florian
Bonneau Richard
Borukhov Itamar
Bosco Giovanni
Boudellioua Imane
Brackenridge Danielle A.
Brenner Steven E.
Cao Renzhi
Carraro Marco
Casadio Rita
Cetin-Atalay Rengul
Chandler Caleb
Chang Jia-Ming
Cheng Jianlin
Chi Po-Han
Cozzetto Domenico
Crocker Alex W.
Dai Suyang
Dalkiran Alperen
Das Sayoni
Davidović Radoslav S.
Davis Larry
Dayton Jonathan B.
Dessimoz Christophe
Devignes Marie-Dominique
Di Carlo Stefano
Dogan Tunca
Dzeroski Saso
Emily Koo Da Chen
Fa Rui
Fabris Fabio
Falda Marco
Fang Hai
Fernández José M.
Fontana Paolo
Frank Yotam
Frasca Marco
Freddolino Peter L.
Freitas Alex A.
Friedberg Iddo
Gemovic Branislava
Georghiou George
Ginter Filip
Gligorijević Vladimir
Goldberg Tatyana
Gough Julian
Greene Casey S.
Grossi Giuliano
Hakala Kai
Hamid Md Nafiz
Hoehndorf Robert
Hogan Deborah A.
Holm Liisa
Hou Jie
Hou Jie
Hurto Rebecca L.
Jain Aashish
Jeffery Constance J.
Jiang Yuxiang
Jo Dane
Johnson Devon
Jones David T.
Kacsoh Balint Z.
Kaewphan Suwisa
Kahanda Indika
Kihara Daisuke
Kulmanov Maxat
Larsen Dallas J.
Lavezzo Enrico
Lee Alexandra J.
Lees Jonathan Gill
Lewis Kimberley A.
Liao Wen-Hung
Lichtarge Olivier
Linial Michal
Liu Yi-Wei
Mao Qizhong
Martelli Pier Luigi
Martin Maria J.
McGuffin Liam
McHardy Alice C.
Medlar Alan J.
Mehryary Farrokh
Mesiti Marco
Moen Hans
Mofrad Mohammad R. K.
Mooney Sean D.
Nguyen Huy N.
Notaro Marco
Novikov Ilya
Omdahl Ashton R.
Orengo Christine A.
O’Donovan Claire
Paccanaro Alberto
Pascarelli Stefano
Perovic Vladimir R.
Petrini Alessandro
Piovesan Damiano
Politano Gianfranco
Profiti Giuseppe
Radivojac Predrag
Re Matteo
Reeb Jonas
Rehman Hafeez Ur
Renaux Alexandre
Rifaioglu Ahmet S.
Ritchie David W.
Roche Daniel B.
Rodriguez Jose Manuel
Romero Alfonso E.
Rose Peter W.
Rost Burkhard
Sagers Luke W.
Saidi Rabie
Salakoski Tapio
Savojardo Castrense
Sillitoe Ian
Suh Erica
Sumonja Neven
Supek Fran
Thurlby Natalie
Tian Weidong
Tolvanen Martti E. E.
Toppo Stefano
Torres Mateo
Tosatto Silvio C. E.
Tress Michael L.
Tseng Wei-Cheng
Törönen Petri
Valentini Giorgio
Veljkovic Nevena
Vesztrocy Alex Warwick
Vidulin Vedrana
Vucetic Slobodan
Wan Cen
Wang Zheng
Wass Mark N.
Wilkins Angela
Yang Haixuan
Yao Shuwei
You Ronghui
Yunes Jeffrey M.
Zhang Chengxin
Zhang Feng
Zhang Shanshan
Zhang Yang
Zhang Zihan
Zhao Chenguang
Zhou Naihui
Zhu Shanfeng
Zosa Elaine
Šmuc Tomislav
Publication date: 01/01/2019

Background: The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function. Results: Here, we report on the results of the third CAFA challenge, CAFA3, that featured an expanded analysis over the previous CAFA rounds, both in terms of volume of data analyzed and the types of analysis performed. In a novel and major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa genomes, which provided us with genome-wide experimental data for genes associated with biofilm formation and motility. We further performed targeted assays on selected genes in Drosophila melanogaster, which we suspected of being involved in long-term memory. Conclusion: We conclude that while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than the expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. Finally, we report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.

HAL-CentraleSupelec

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Edinburgh Research Explorer

REPISALUD

Archivio istituzionale della ricerca - Università di Padova

Helmholtz Zentrum für Infektionsforschung Repository

Central Archive at the University of Reading

AIR Universita degli studi di Milano

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Repository of the Vinča Nuclear Institute (VinaR)

OpenMETU (Middle East Technical University)

Explore Bristol Research

Deep Blue Documents

Archivio istituzionale della ricerca - Fondazione Edmund Mach

HAL Clermont Université

Serveur académique lausannois

HAL Descartes

University of Miami: Scholarship Miami

Helsingin yliopiston digitaalinen arkisto

Hal-Diderot

Hacettepe University Institutional Repository

Repository for Publications and Research Data

INRIA a CCSD electronic archive server

UCL Discovery

Kent Academic Repository

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens

Author: Aashish Jain
Adrian Altenhoff
Ahmet S. Rifaioglu
Alan J. Medlar
Alberto Paccanaro
Alessandro Petrini
Alex A. Freitas
Alex W. Crocker
Alex Warwick Vesztrocy
Alexandra J. Lee
Alexandre Renaux
Alfonso E. Romero
Alfredo Benso
Alice C. McHardy
Alperen Dalkıran
Angela Wilkins
Asa Ben-Hur
Ashton R. Omdahl
Balint Z. Kacsoh
Branislava Gemovic
Burkhard Rost
Caleb Chandler
Casey S. Greene
Castrense Savojardo
Cen Wan
Chenguang Zhao
Chengxin Zhang
Christine A. Orengo
Christophe Dessimoz
Claire O’Donovan
Constance J. Jeffery
Da Chen Emily Koo
Daisuke Kihara
Dallas J. Larsen
Damiano Piovesan
Dane Jo
Daniel B. Roche
Danielle A. Brackenridge
David T. Jones
David W. Ritchie
Deborah A. Hogan
Devon Johnson
Domenico Cozzetto
Ehsaneddin Asgari
Elaine Zosa
Enrico Lavezzo
Erica Suh
Fabio Fabris
Farrokh Mehryary
Feng Zhang
Filip Ginter
Florian Boecker
Fran Supek
Gage S. Black
George Georghiou
Gianfranco Politano
Giorgio Valentini
Giovanni Bosco
Giuliano Grossi
Giuseppe Profiti
Hafeez Ur Rehman
Hai Fang
Haixuan Yang
Hans Moen
Heiko Schoof
Huy N. Nguyen
Ian Sillitoe
Iddo Friedberg
Ilya Novikov
Imane Boudellioua
Indika Kahanda
Itamar Borukhov
Jari Björne
Jeffrey M. Yunes
Jia-Ming Chang
Jianlin Cheng
Jie Hou
Jonas Reeb
Jonathan B. Dayton
Jonathan Gill Lees
Jose Manuel Rodriguez
José M. Fernández
Julian Gough
Kai Hakala
Kimberley A. Lewis
Larry Davis
Liam J. McGuffin
Liisa Holm
Magdalena Antczak
Marco Carraro
Marco Falda
Marco Frasca
Marco Mesiti
Marco Notaro
Maria J. Martin
Marie-Dominique Devignes
Mark N. Wass
Martti E.E. Tolvanen
Mateo Torres
Matteo Re
Maxat Kulmanov
Md Nafiz Hamid
Meet Barot
Michael L. Tress
Michal Linial
Michele Berselli
Miguel Amezola
Mohammad R.K. Mofrad
Naihui Zhou
Natalie Thurlby
Neven Sumonja
Nevena Veljkovic
Olivier Lichtarge
Paolo Fontana
Patricia C. Babbitt
Peter L. Freddolino
Peter W. Rose
Petri Törönen
Pier Luigi Martelli
Po-Han Chi
Prajwal Bhat
Predrag Radivojac
Qizhong Mao
Rabie Saidi
Radoslav S. Davidović
Rebecca L. Hurto
Rengul Cetin Atalay
Renzhi Cao
Richard Bonneau
Rita Casadio
Robert Hoehndorf
Ronghui You
Rui Fa
Sabeur Aridhi
Saso Dzeroski
Sayoni Das
Sean D. Mooney
Seyed Ziaeddin Alborzi
Shanfeng Zhu
Shanshan Zhang
Shuwei Yao
Silvio C.E. Tosatto
Slobodan Vucetic
Stefano Di Carlo
Stefano Pascarelli
Stefano Toppo
Steven E. Brenner
Suwisa Kaewphan
Suyang Dai
Tapio Salakoski
Tatyana Goldberg
Timothy R. Bergquist
Tomislav Šmuc
Tunca Dogan
Vedrana Vidulin
Vladimir Gligorijević
Vladimir R. Perovic
Volkan Atalay
Wei-Cheng Tseng
Weidong Tian
Wen-Hung Liao
Yang Zhang
Yi-Wei Liu
Yotam Frank
Yuxiang Jiang
Zheng Wang
Zihan Zhang
Publication venue: Springer Science and Business Media LLC
Publication date: 27/10/2022

Background: The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function. Results: Here, we report on the results of the third CAFA challenge, CAFA3, that featured an expanded analysis over the previous CAFA rounds, both in terms of volume of data analyzed and the types of analysis performed. In a novel and major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa genomes, which provided us with genome-wide experimental data for genes associated with biofilm formation and motility. We further performed targeted assays on selected genes in Drosophila melanogaster, which we suspected of being involved in long-term memory. Conclusion: We conclude that while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than the expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. Finally, we report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.

UTUPub

The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens.

Author: Zhou Naihui
Publication date: 15/05/2020

Ezid

Statistical methods to improve the analysis of biological data: Benchmarking phenotypes, protein function prediction, and spatial modelling of gene expression

Author: Zhou Naihui
Publication date: 01/01/2020

Digital Repository @ Iowa State University (ISU)

A potential new treatment with upadacitinib for acquired reactive perforating collagenosis

Author: Linyi Song MD
Naihui Zhou MD
Wei Ding BS
Yuting Wang BS
Publication venue: Elsevier
Publication date: 01/06/2024

Directory of Open Access Journals

Intercropping with Potato-Onion Enhanced the Soil Microbial Diversity of Tomato

Author: Chunxia Li
Danmei Gao
Fengzhi Wu
Naihui Li
Shaocan Chen
Xingang Zhou
Publication venue: MDPI AG
Publication date: 02/06/2020

Intercropping can achieve sustainable agricultural development by increasing plant diversity. In this study, we investigated the effects of tomato monoculture and tomato/potato-onion intercropping systems on tomato seedling growth and on changes in soil microbial communities under greenhouse conditions. Results showed that intercropping with potato-onion increased tomato seedling biomass. Compared with the monoculture system, the alpha diversity of the soil bacterial and fungal communities, as well as the beta diversity and abundance of the bacterial community, increased in the intercropping system. However, the beta diversity and abundance of the fungal community did not differ between the intercropping and monoculture systems. The relative abundances of some taxa (i.e., Acidobacteria-Subgroup-6, Arthrobacter, Bacillus, Pseudomonas) and of several OTUs with the potential to promote plant growth increased, while the relative abundances of some potential plant pathogens (i.e., Cladosporium) decreased in the intercropping system. Redundancy analysis indicated that bacterial community structure was significantly influenced by soil organic carbon and pH, while fungal community structure was related to changes in soil organic carbon and available phosphorus. Overall, our results suggest that the tomato/potato-onion intercropping system altered soil microbial communities and improved the soil environment, which may be the main factor in promoting tomato growth.

Multidisciplinary Digital Publishing Institute
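
The abstract of this record reports changes in alpha diversity without naming the index used. As an illustration only, the sketch below computes the Shannon index, one widely used alpha-diversity measure, from per-taxon read counts. The choice of index and the count vectors are assumptions made for this example and are not data from the study.

```python
import math

def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)

# Hypothetical OTU counts for two soil samples.
monoculture = [90, 5, 5]       # dominated by a single taxon
intercropping = [40, 30, 30]   # more even community

print(f"monoculture H = {shannon_index(monoculture):.3f}")
print(f"intercropping H = {shannon_index(intercropping):.3f}")
```

A more even community yields a higher index, which is the sense in which a sample can be described as having greater alpha diversity.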

Development of an Adaptive Fuzzy Integral-Derivative Line-of-Sight Method for Bathymetric LiDAR Onboard Unmanned Surface Vessel

Author: Guoqing Zhou
Guoshuai Jia
Jiasheng Xu
Jinhuang Wu
Ke Gao
Naihui Song
Xia Wang
Xiang Zhou
Publication venue: MDPI AG
Publication date: 01/07/2024

Previous control methods developed by our research team cannot satisfy the high accuracy requirements of unmanned surface vessel (USV) path-tracking during bathymetric mapping because of excessive overshoot and slow convergence. For this reason, this study developed an adaptive fuzzy integral-derivative line-of-sight (AFIDLOS) method for USV path-tracking control. Integral and derivative terms were added to counteract the effect of the sideslip angle, so that the USV could be quickly guided to converge to the planned path for bathymetric mapping. To obtain an accurate look-ahead distance, a fuzzy control method was proposed. The proposed method was verified using simulations and outdoor experiments. The results demonstrate that, compared with the traditional guidance law, the AFIDLOS method can reduce the overshoot by 79.85% and shorten the settling time by 55.32% in simulation experiments, reduce the average cross-track error by 10.91%, and ensure a 30% overlap of neighboring strips in outdoor bathymetric LiDAR mapping.

Directory of Open Access Journals
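
For context on the "traditional guidance law" that the abstract of this record compares against, the sketch below shows the basic look-ahead line-of-sight (LOS) heading command for a straight path segment. It is not the AFIDLOS method: the integral, derivative, and fuzzy look-ahead components described in the abstract are omitted, and the function name, waypoints, and look-ahead value are hypothetical.

```python
import math

def los_heading(position, wp_start, wp_end, lookahead):
    """Basic look-ahead LOS guidance for a straight segment (illustrative only).

    Returns the signed cross-track error and the desired heading in radians.
    """
    # Direction of the path segment.
    path_angle = math.atan2(wp_end[1] - wp_start[1], wp_end[0] - wp_start[0])
    dx = position[0] - wp_start[0]
    dy = position[1] - wp_start[1]
    # Signed lateral distance from the path (positive to the left of the path).
    cross_track = -dx * math.sin(path_angle) + dy * math.cos(path_angle)
    # Steer toward a point `lookahead` metres ahead on the path.
    heading = path_angle + math.atan2(-cross_track, lookahead)
    return cross_track, heading

# Hypothetical case: path along the x-axis, vessel 1 m to the left of it.
error, heading = los_heading((0.0, 1.0), (0.0, 0.0), (10.0, 0.0), lookahead=1.0)
print(f"cross-track error = {error:.2f} m, heading = {math.degrees(heading):.1f} deg")
```

A smaller look-ahead distance steers more aggressively toward the path, which is why tuning it (here a fixed value, in the abstract a fuzzy-adapted one) affects overshoot and convergence speed.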