34 research outputs found

Table1_AnnotaPipeline: An integrated tool to annotate eukaryotic proteins using multi-omics data.XLSX

Author: Edmundo Carlos Grisard (10098655)
Eric Kazuo Kawagoe (14155002)
Glauber Wagner (173002)
Guilherme Augusto Maia (14154999)
Renato Simões Moreira (12531639)
Tatiany Aparecida Teixeira Soratto (9221146)
Vilmar Benetti Filho (12531642)
Publication venue: 'Frontiers Media SA'
Publication date: 22/11/2022

Assignment of gene function has been a crucial, laborious, and time-consuming step in genomics. Because a variety of sequencing platforms generate increasing amounts of data, manual annotation is no longer feasible. Thus, the need for an integrated, automated pipeline allowing the use of experimental data towards validation of in silico prediction of gene function is of utmost relevance. Here, we present a computational workflow named AnnotaPipeline that integrates distinct software and data types in a proteogenomic approach to annotate and validate predicted features in genomic sequences. Starting from FASTA (i) nucleotide or (ii) protein sequences or (iii) structural annotation files (GFF3), users can input FASTQ RNA-seq data and MS/MS data in mzXML or similar formats, as the pipeline uses both transcriptomic and proteomic information to corroborate annotations and validate gene prediction, providing transcription and expression evidence for functional annotation. Reannotation of the available Arabidopsis thaliana, Caenorhabditis elegans, Candida albicans, Trypanosoma cruzi, and Trypanosoma rangeli genomes was performed using AnnotaPipeline, resulting in a higher proportion of annotated proteins and a reduced proportion of hypothetical proteins when compared to the annotations publicly available for these organisms. AnnotaPipeline is a Unix-based pipeline developed using Python and is available at: https://github.com/bioinformatics-ufsc/AnnotaPipeline.
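
The comparison of "hypothetical" versus annotated proteins mentioned in this abstract can be illustrated with a minimal sketch. It is not part of AnnotaPipeline: the function name `hypothetical_fraction`, the example records, and the rule that a protein counts as hypothetical when its FASTA header contains the phrase "hypothetical protein" are all assumptions made for illustration.

```python
def hypothetical_fraction(fasta_text: str) -> float:
    """Return the fraction of FASTA records whose header mentions
    'hypothetical protein' (an assumed, simplistic criterion)."""
    # FASTA headers are the lines starting with '>'.
    headers = [line for line in fasta_text.splitlines() if line.startswith(">")]
    if not headers:
        return 0.0
    hypothetical = sum("hypothetical protein" in h.lower() for h in headers)
    return hypothetical / len(headers)


# Made-up records, used only to exercise the function.
example = """>p1 hypothetical protein
MKV
>p2 kinase
MAA
>p3 Hypothetical protein, conserved
MTT
>p4 heat shock protein 70
MSS
"""

print(hypothetical_fraction(example))
```

Running the same function on an older and a newer annotation of one proteome would give the two proportions being compared; the real pipeline may define "hypothetical" differently.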

The Francis Crick Institute

Table2_AnnotaPipeline: An integrated tool to annotate eukaryotic proteins using multi-omics data.docx

Author: Edmundo Carlos Grisard (10098655)
Eric Kazuo Kawagoe (14155002)
Glauber Wagner (173002)
Guilherme Augusto Maia (14154999)
Renato Simões Moreira (12531639)
Tatiany Aparecida Teixeira Soratto (9221146)
Vilmar Benetti Filho (12531642)
Publication venue: 'Frontiers Media SA'
Publication date: 22/11/2022

The Francis Crick Institute

Classification of ESTs in Gene Ontology category.

Author: Alberto M. R. Davila (173010)
Alexandre A. Peixoto (138715)
Camila J. Mazzoni (172989)
Denise B. S. Dias (172981)
Glauber Wagner (173002)
Jorge A. C. Bretãs (172984)
Nataly A. Souza (138702)
Renata V. D. M. Azevedo (172978)
Rodolpho M. Albano (172995)

The ESTs of L. longipalpis were submitted to a search against the three categories of Gene Ontology (NCBI). The e-value cutoff was 1.0e-5.
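
Applying an e-value cutoff such as the one stated above can be sketched as follows. The caption does not say which search tool or output format was used, so the 12-column tab-separated layout (e-value in the 11th column, as in BLAST tabular output), the function name `filter_hits`, the sample hits, and the choice to treat the cutoff as inclusive are all assumptions.

```python
EVALUE_CUTOFF = 1.0e-5  # cutoff stated in the caption


def filter_hits(lines, cutoff=EVALUE_CUTOFF):
    """Keep (query, subject, e-value) for hits at or below the cutoff.

    Assumes tab-separated rows with the e-value in column 11
    (index 10), as in BLAST tabular output.
    """
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        evalue = float(fields[10])
        if evalue <= cutoff:  # inclusive comparison is an assumption
            kept.append((fields[0], fields[1], evalue))
    return kept


# Made-up hits, used only to exercise the filter.
hits = [
    "EST001\tGO_A\t85.0\t120\t18\t0\t1\t120\t5\t124\t2e-30\t110",
    "EST002\tGO_B\t40.0\t50\t30\t2\t1\t50\t10\t59\t0.003\t30.1",
    "EST003\tGO_C\t60.0\t90\t36\t1\t1\t90\t3\t92\t1e-5\t55.0",
]

for query, subject, evalue in filter_hits(hits):
    print(query, subject, evalue)
```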

The Francis Crick Institute

β-defensin sequence analysis.

Author: Alberto M. R. Davila (173010)
Alexandre A. Peixoto (138715)
Camila J. Mazzoni (172989)
Denise B. S. Dias (172981)
Glauber Wagner (173002)
Jorge A. C. Bretãs (172984)
Nataly A. Souza (138702)
Renata V. D. M. Azevedo (172978)
Rodolpho M. Albano (172995)

(A) Neighbor-joining tree of putative β-defensins: L. longipalpis 1 (BAR005E03/AM091821, male reproductive organs and whole female cDNA libraries), L. longipalpis 2 (EU124626, midgut female library) and L. longipalpis 3 (EX211140, midgut female library), A. aegypti (AEL009861), A. gambiae (AGAP007049), D. melanogaster (CG10433), and B. mori (NP_001106745). Bootstrap percentage values indicated in nodes are based on 1000 replicates. (B) Multiple alignment of putative β-defensin of male reproductive tracts from L. longipalpis and its orthologues in Diptera. Conserved amino acids are indicated by (*).

The Francis Crick Institute

Cyclophilin sequence analysis.

Author: Alberto M. R. Davila (173010)
Alexandre A. Peixoto (138715)
Camila J. Mazzoni (172989)
Denise B. S. Dias (172981)
Glauber Wagner (173002)
Jorge A. C. Bretãs (172984)
Nataly A. Souza (138702)
Renata V. D. M. Azevedo (172978)
Rodolpho M. Albano (172995)

(A) Neighbor-joining tree of putative cyclophilin L. longipalpis (RAAPBAR022E08/AM092289, male reproductive organs and whole female cDNA libraries), A. gambiae (AGAP007088-PA), A. aegypti (AAEL013279), D. melanogaster (FBpp0071844/CG2852) and A. mellifera (NP_001229473). Bootstrap percentage values indicated in nodes are based on 1000 replicates. (B) Multiple alignment of putative cyclophilin of male reproductive tracts from L. longipalpis and its orthologues in Diptera. Conserved amino acids are indicated by (*).

The Francis Crick Institute

Putative L. longipalpis mRGPs.

Author: Alberto M. R. Davila (173010)
Alexandre A. Peixoto (138715)
Camila J. Mazzoni (172989)
Denise B. S. Dias (172981)
Glauber Wagner (173002)
Jorge A. C. Bretãs (172984)
Nataly A. Souza (138702)
Renata V. D. M. Azevedo (172978)
Rodolpho M. Albano (172995)

N: number of reads. DB: database. PTN: protein. COEBE4D: carboxylesterase, beta esterase. Crisp: cysteine-rich secreted proteins. *ESTs that have yielded best matches to mRGPs/Acps from protein databases (three against A. aegypti and two against A. gambiae). AGAP00 sequences come from AgamP3.6_vectorbase and AAEL0 sequences come from AaegL1.2_vectorbase.

The Francis Crick Institute

Protease inhibitor sequence analysis.

Author: Alberto M. R. Davila (173010)
Alexandre A. Peixoto (138715)
Camila J. Mazzoni (172989)
Denise B. S. Dias (172981)
Glauber Wagner (173002)
Jorge A. C. Bretãs (172984)
Nataly A. Souza (138702)
Renata V. D. M. Azevedo (172978)
Rodolpho M. Albano (172995)

(A) Neighbor-joining tree of putative protease inhibitor L. longipalpis (RAAPBAR023H02/EW989852 B male reproductive organs and midgut female cDNA library), A. aegypti (AAEL000551), A. gambiae (AGAP011319), and Apis mellifera (XP_003250953). Bootstrap percentage values indicated in nodes are based on 1000 replicates. (B) Multiple alignment of putative protease inhibitor of male reproductive tracts from L. longipalpis and its orthologues in Diptera. Conserved amino acids are indicated by (*).

The Francis Crick Institute

Astacin metalloprotease sequence analysis.

Author: Alberto M. R. Davila (173010)
Alexandre A. Peixoto (138715)
Camila J. Mazzoni (172989)
Denise B. S. Dias (172981)
Glauber Wagner (173002)
Jorge A. C. Bretãs (172984)
Nataly A. Souza (138702)
Renata V. D. M. Azevedo (172978)
Rodolpho M. Albano (172995)

(A) Neighbor-joining tree of putative astacin from L. longipalpis (RAAPBAR022F08 male reproductive organs cDNA libraries), L. longipalpis 2 (AM088883 whole female cDNA libraries) and L. longipalpis 3 (Lulo-Astacin A8CW49_LUTLO, midgut female library), A. aegypti (AAEL013449), A. gambiae (AGAP010764), D. melanogaster (FBpp0080341/CG15254) and Nasonia vitripennis (NV12552). Bootstrap percentage values indicated in nodes are based on 1000 replicates. (B) Multiple alignment of putative astacin of male reproductive tracts from L. longipalpis and its orthologues in Diptera. Conserved amino acids are indicated by (*).

The Francis Crick Institute

Thioredoxin sequence analysis.

Author: Alberto M. R. Davila (173010)
Alexandre A. Peixoto (138715)
Camila J. Mazzoni (172989)
Denise B. S. Dias (172981)
Glauber Wagner (173002)
Jorge A. C. Bretãs (172984)
Nataly A. Souza (138702)
Renata V. D. M. Azevedo (172978)
Rodolpho M. Albano (172995)

(A) Neighbor-joining tree of putative thioredoxin L. longipalpis (RAAPBAR020D12 male reproductive organs cDNA libraries), A. aegypti (AAEL010777), A. gambiae (AGAP009584-PA) and Tribolium castaneum (XM_962894.2). Bootstrap percentage values indicated in nodes are based on 1000 replicates. (B) Multiple alignment of putative thioredoxin of male reproductive tracts from L. longipalpis and its orthologues in Diptera. Conserved amino acids are indicated by (*).

The Francis Crick Institute

ESTs with other specific function.

Author: Alberto M. R. Davila (173010)
Alexandre A. Peixoto (138715)
Camila J. Mazzoni (172989)
Denise B. S. Dias (172981)
Glauber Wagner (173002)
Jorge A. C. Bretãs (172984)
Nataly A. Souza (138702)
Renata V. D. M. Azevedo (172978)
Rodolpho M. Albano (172995)

N: number of reads. OBP: odorant-binding protein.

The Francis Crick Institute