12 research outputs found

    Computational Tools for Analyzing Correlations Between Microbial Biological Diversity and Ecosystems

    Metagenomics has developed into a reliable means of analyzing the diversity of microbial communities in recent years. Using next-generation sequencing, metagenomic studies can generate billions of short sequencing reads that are processed by computational tools. However, the rapid adoption of metagenomics has produced a large amount of data, and handling it requires computational tools and pipelines that address scalability and performance. In this thesis, we developed several tools to aid the exploration of large volumes of DNA sequence data. We further developed a bioinformatic pipeline that makes these tools usable by researchers with minimal computational background and makes them widely available across the field of microbiology, so that the research community can contribute to their development and help overcome the growing computational challenges resulting from continued advances in high-throughput DNA sequencing.

    A comparative study on psychiatric disorders: Identification of shared pathways and common agents

    Distinct but closely related diseases generally present shared symptoms, which suggests possible overlaps among their pathogenic mechanisms. Identification of significantly impacted shared pathways and other common agents is expected to elucidate the etiology of these disorders and to help design better intervention strategies. In this research effort, we studied six psychiatric disorders: schizophrenia (SCZ), anorexia (AN), bipolar disorder (BD), depressive disorder (DD), autism (AU) and attention deficit hyperactivity disorder (ADHD). Our methodology has two parts: Part I used common susceptibility genes, and Part II used genome-wide association studies (GWAS) data, to find enriched pathways of psychiatric disorders. 59 KEGG pathways were identified in both parts, 31 of which are disease pathways. Pathways related to cancer and infectious diseases were predominant compared to others. Most of the identified pathways were in accordance with previous studies in the literature. A combination of susceptibility genes and GWAS data is an effective approach to identifying significantly impacted pathways in multifactorial diseases. In this respect, shared modules were determined by hierarchical clustering of the enriched pathways. These modules may indicate how psychiatric disorders are associated with the enriched pathways. Taken together, common pathways and shared modules are expected to highlight the causative factors and important mechanisms behind complex psychiatric diseases, leading to effective drug discovery. © 2022 IEEE
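The step of finding pathways "identified in both parts" can be illustrated as a set intersection. This is only a minimal sketch, not the authors' pipeline: the pathway names and the function `shared_pathways` are hypothetical placeholders, and the enrichment analysis that would produce each set is assumed to have run already.

```python
# Hypothetical enriched-pathway sets; the names are placeholders,
# not results reported in the abstract.
part1_pathways = {"pathway_A", "pathway_B", "pathway_C", "pathway_D"}  # from susceptibility genes
part2_pathways = {"pathway_B", "pathway_C", "pathway_E"}               # from GWAS data


def shared_pathways(first, second):
    """Return the pathways present in both collections, sorted for stable output."""
    return sorted(set(first) & set(second))


common = shared_pathways(part1_pathways, part2_pathways)
print(common)
```

Sorting the intersection keeps the output reproducible, since Python sets have no defined order.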

    Single Nucleotide Polymorphism Applications in Animals

    Polymorphism occurs between individuals of the same species. It can be distinguished phenotypically (in production, disease resistance and tolerance, coat color, height, weight, etc.) and genotypically using molecular markers, for instance Restriction Fragment Length Polymorphisms (RFLPs), minisatellites or VNTRs (Variable Numbers of Tandem Repeats), microsatellites or SSRs (Simple Sequence Repeats), large (copy number variants) and small segmental deletions/insertions/duplications, and Single Nucleotide Polymorphisms (SNPs). SNPs are changes of a single base to another base in a given region of the genome. They are a widely accepted molecular marker owing to high-throughput technology and statistically sound approaches. They are used to identify candidate genes, in gene mapping and QTL identification, and are generally applied in marker-assisted breeding. They have been used in disease diagnosis and in pharmacology to identify the correct medication for individual patients. They are also useful in food safety and quality assurance schemes: they help identify the species of meat on the market, which protects against the illegal substitution of high-priced meat with low-priced or unhealthy meat. Moreover, SNPs are better markers than others in their applications, and advanced technology supports their acceptance. This review article therefore illustrates the applications of SNP markers. Keywords: SNPs, Genomics, Breeding, Disease, Food safety

    NGS data analysis: a review of major tools and pipeline frameworks for variant discovery

    The analysis of genetic data has always been a problem due to the large amount of information available and the difficulty of isolating that which is relevant. However, over the years, progress in sequencing techniques has been accompanied by the development of computational techniques, up to the current application of artificial intelligence. The phases of sequence analysis can be summarized as follows: quality assessment, alignment, pre-variant processing, variant calling and variant annotation. In this article, we review and comment on the tools used in each phase of genetic sequencing and analyze the drawbacks and advantages of each.

    Molecular Mechanisms of Crop Domestication Revealed by Comparative Analysis of the Transcriptomes Between Cultivated and Wild Soybeans

    Soybean is one of the key crops needed to meet the food requirements of the increasing global population. To meet this need, however, the quality and quantity of soybean yield must be greatly enhanced. Advances in soybean yield depend on the presence of favorable genes in the gene pool that changed significantly during domestication. To make use of those domestication-related genes, this study involved seven cultivated (G. max) and four wild-type (G. soja) soybeans. Material from their developing pods was studied to decipher the molecular mechanisms underlying crop domestication. Specifically, their transcriptomes were analyzed and compared with previous related studies, with the intention of contributing further to the literature. To this end, several bioinformatics applications were used, including de novo transcriptome assembly, transcript abundance quantification, and discovery of differentially expressed genes (DEGs) together with their functional annotation and network visualization. The results revealed 1,247 DEGs, 916 of which were up-regulated in the cultivated soybean compared with the wild type. The findings mostly corresponded to the literature, especially regarding genes affecting the two domestication-related traits in focus, pod-shattering resistance and seed size. Genes associated with these traits were up-regulated in cultivated soybeans and down-regulated in the wild type. The opposite trend was seen in disease-related genes, which were down-regulated or not even present in the cultivated soybean. Further, 47 biochemical functions of the identified DEGs at the cellular level were revealed, providing some knowledge about the molecular mechanisms of genes related to the two traits in focus.
While our findings provide valuable insight into the molecular mechanisms of soybean domestication through the annotation of differentially expressed genes and transcripts, these results must be dissected further and/or reprocessed with a larger number of samples in order to advance the field.
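To make the notion of up- and down-regulated genes concrete, the sketch below labels a gene by its log2 fold change between cultivated and wild-type expression. It is a simplified illustration under stated assumptions, not the study's method: the function `classify_gene`, the threshold of 1.0 and the pseudocount are all hypothetical, and a real DEG analysis would also apply a statistical test across replicates.

```python
import math


def classify_gene(cultivated_mean, wild_mean, min_abs_log2fc=1.0, pseudocount=1.0):
    """Label a gene 'up', 'down' or 'unchanged' in cultivated relative to wild type.

    The threshold and pseudocount are illustrative assumptions; the pseudocount
    avoids division by zero for genes with no detected expression.
    """
    log2fc = math.log2((cultivated_mean + pseudocount) / (wild_mean + pseudocount))
    if log2fc >= min_abs_log2fc:
        return "up"
    if log2fc <= -min_abs_log2fc:
        return "down"
    return "unchanged"


# Hypothetical mean expression values (cultivated, wild type) for three genes.
examples = {"gene_1": (7.0, 1.0), "gene_2": (1.0, 7.0), "gene_3": (3.0, 3.0)}
labels = {name: classify_gene(c, w) for name, (c, w) in examples.items()}
print(labels)
```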

    Identifying disease associated genes by network propagation


    A Review of Recent Gene Expression-Based and DNA Methylation-Based Mathematical Cell Type Deconvolution Methods

    In recent years, many cell type deconvolution methods based on DNA methylation data and gene expression data have been developed. Each of the two approaches has its own advantages and disadvantages. For example, the data source for DNA methylation-based methods is usually more stable than gene expression, and DNA methylation is easier to measure in formalin-fixed paraffin-embedded (FFPE) tissues, while some gene expression data, such as scRNA-seq data, are usually costly and complex to obtain. On the other hand, many more gene expression-based deconvolution methods are currently available than DNA methylation-based ones, so DNA methylation-based methods can in many cases learn from existing gene expression-based methods; e.g., EMeth learns from ICeD-T, and MethylCIBERSORT learns from CIBERSORT. Both kinds of methods are powerful tools for cell type-specific deconvolution, may benefit each other's development, and are still developing rapidly, with more new methods likely to come. It is therefore worth looking back and comparing some recent gene expression-based and DNA methylation-based deconvolution methods to get a comprehensive sense of the field's development and directions across both kinds of methods.

    Computer Network Intrusion Detection System

    The explanatory note of the diploma project consists of four sections and contains 32 tables, 8 appendices and 57 sources, for a total of 102 pages. Object of study: intrusions into a computer network. Aim of the master's thesis: increasing the efficiency of intrusion detection by means of machine learning algorithms. The first section reviews existing system solutions, their relevance and general information about intrusion detection systems. The second section formulates the system requirements. The third section covers the development of the system, including a comparison of the technologies needed to build it and the selection of the most suitable ones. The fourth section covers the development of a startup project.

    Molecular characterization of the gene encoding Toll-like receptor 2 (TLR2) in alpacas (Vicugna pacos) and llamas (Lama glama)

    This study molecularly characterizes the complete ORF (open reading frame) of the gene encoding TLR2 in alpacas and llamas. Ten blood samples were collected from clinically healthy adult animals (5 alpacas and 5 llamas) to obtain peripheral blood mononuclear cells by centrifugation. Total RNA was extracted from these cells, and RNA integrity and quantity were assessed by fluorometry. Viable RNA was subjected to RT-PCR to amplify the complete TLR2 ORF. The best products were selected for cloning and sequencing. The nucleotide sequences were edited, assembled, aligned and converted to amino acid sequences using Chromas (Technelysium Pty Ltd) and SeqMan, EditSeq and MegAlign Pro from the DNASTAR Lasergene bioinformatics package, respectively. MEGA X was used for the phylogenetic analysis, and the structural domains of the TLR2 proteins were predicted with the online tools SMART and LRRsearch. The study yielded 3 clones, one from alpaca and 2 from llamas; the 7 sequences from these clones confirmed the existence of at least 3 isoforms of this gene in alpacas and llamas. The phylogenetic analysis shows that these isoforms are closely related, with similarity ranging from 99.1% to 99.5% between sequences of the same clone and from 97.6% to 99.5% between different clones. Compared with the predicted TLR2 genes of Old World camelids, the sequences showed a similarity of 96.1% to 96.7%. Our 7 sequences were annotated and deposited in GenBank (MT350705 - MT350711). We conclude that the alpaca and llama isoforms obtained in our study differ from those described in other camelids.
The methodology developed in this study is expected to be replicable in diseased animals in order to identify single nucleotide polymorphisms (SNPs) that genetically distinguish animals resistant and susceptible to infection in the field. Peru. Universidad Nacional Mayor de San Marcos. Vicerrectorado de Investigación y Posgrado of UNMSM and FONDECYT-CONCYTEC through PCONFIGI - 2019, Project No. A19080081 (Resolución Rectoral No. 035556-R-19), and the Proyecto de Ciencias Básicas - FONDECYT under contract No. 355-2019.
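The similarity percentages in this abstract can be read as percent identity between aligned sequences. The sketch below shows one common way to compute it. It is an assumption-laden illustration: the function `percent_identity` and the example sequences are hypothetical, gap columns are skipped by choice, and the tools named in the abstract may define similarity differently.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of identical positions between two pre-aligned sequences.

    Columns where either sequence has a gap are skipped; this is one of
    several conventions and is an assumption of this sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, not sequences from the study.
print(percent_identity("ATGCATGCAT", "ATGCATGCAA"))
```

Skipping gap columns means insertions and deletions do not lower the score; counting them as mismatches would be an equally defensible convention.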