Genotype investigator for Genome Wide Analysis (GIGwA) [P1111]
Comparing genetic variation with functional data is essential to understanding how organisms adapt to their ecosystems. However, the data deluge produced by Next Generation Sequencing (NGS) raises computational challenges in the storage, search, sharing, analysis, and visualization of data, and these challenges are redefining some data management practices. Traditional relational databases are widely used to store and query data in various forms, but their major drawback is the lack of flexibility in designing the field structure of the data. In addition, relational databases are not efficient at retrieving gigabytes of data. In this context, we used the emerging technology called NoSQL ("Not only SQL"), which refers to non-relational database management systems designed for large-scale data storage and massively parallel data processing. GIGwA was developed mainly to manage genomic, transcriptomic and genotyping data from NGS analyses, because biologists need to handle large VCF files to filter, query and extract data for their research. However, most existing tools are aimed at experienced users and provide a command-line API. The aim of GIGwA is to provide a Web user interface that makes the system accessible to users from the biological field.
Enabling knowledge management in the agronomic domain. W448
The drastic growth of data in the agronomic sciences in recent years has brought the concept of knowledge management to the forefront. Factors contributing to this change include: a) high-throughput experiments have become affordable, and the time spent generating data through them is minuscule compared with the time needed for its integration and analysis; b) publishing data on the web is fairly trivial; and c) multiple databases exist for each type of data (i.e. 'omics' data), with possible overlap or slight variation in coverage [1, 2]. In most cases these sources remain autonomous and disconnected. Hence, efficiently managed data and the underlying knowledge will, in principle, make data analysis straightforward and aid more efficient decision making. At the Institute of Computational Biology (IBC), we develop methods to aid data integration and knowledge management in the agronomic sciences, in order to improve information accessibility and interoperability. To this end, we pursue several complementary research directions towards the integration of distributed, heterogeneous data. This talk will focus mainly on two ongoing projects at IBC. a) Agronomic Linked Data (AgroLD) is a Semantic Web knowledge base designed to integrate data from various publicly available plant-centric data sources. These include Gramene, Oryzabase, TAIR and resources from the South Green platform, among many others. The aim of the AgroLD project is to provide a portal where bioinformaticians and domain experts can exploit the homogenized data in order to bridge the knowledge. b) GIGwA is a tool developed to manage large genomic, transcriptomic and genotyping data resulting from NGS analyses. Biologists often have to handle large VCF files to filter, query and extract data for their research. The existing tools are mainly targeted at experienced users and provide command-line APIs. With GIGwA, we aim to provide a web user interface to make the system accessible to users from the biological field.
The epistemic role of certain fundamental computer simulations in evolutionary theory
Thesis digitized by the Division de la gestion de documents et des archives of the Université de Montréal
Simultaneous identification of specifically interacting paralogs and inter-protein contacts by Direct-Coupling Analysis
Understanding protein-protein interactions is central to our understanding of
almost all complex biological processes. Computational tools exploiting rapidly
growing genomic databases to characterize protein-protein interactions are
urgently needed. Such methods should connect multiple scales from evolutionarily
conserved interactions between families of homologous proteins, over the
identification of specifically interacting proteins in the case of multiple
paralogs inside a species, down to the prediction of residues being in physical
contact across interaction interfaces. Statistical inference methods detecting
residue-residue coevolution have recently triggered considerable progress in
using sequence data for quaternary protein structure prediction; they require,
however, large joint alignments of homologous protein pairs known to interact.
The generation of such alignments is a complex computational task on its own;
application of coevolutionary modeling has in turn been restricted to proteins
without paralogs, or to bacterial systems with the corresponding coding genes
being co-localized in operons. Here we show that the Direct-Coupling Analysis
of residue coevolution can be extended to connect the different scales, and
simultaneously to match interacting paralogs, to identify inter-protein
residue-residue contacts and to discriminate interacting from noninteracting
families in a multiprotein system. Our results extend the potential
applications of coevolutionary analysis far beyond cases treatable so far.
Comment: Main Text 19 pages, Supp. Inf. 16 pages
Evaluation of terminology tools: issues, difficulties and proposals
A special case among natural language processing tasks, terminology acquisition has so far hardly been subject to systematic evaluation. The campaigns that have taken place are recent and limited. Evaluations are nevertheless needed to take stock of past research and to measure both the progress made and the blind spots. This article argues that comparative evaluation protocols can be defined even for complex tasks such as computational terminology. The proposed method relies on decomposing terminology analysis tools into elementary functionalities and on defining precision and recall measures adapted to terminological problems, namely the complexity of terminological products, dependence on applications, the role of user interaction, and the variability of reference terminologies.
Genome-wide diversity and gene expression profiling of Babesia microti isolates identify polymorphic genes that mediate host-pathogen interactions
Babesia microti, a tick-transmitted, intraerythrocytic protozoan parasite circulating mainly among small mammals, is the primary cause of human babesiosis. While most cases are transmitted by Ixodes ticks, the disease may also be transmitted through blood transfusion and perinatally. A comprehensive analysis of genome composition, genetic diversity, and gene expression profiling of seven B. microti isolates revealed that genetic variation in isolates from the Northeast United States is almost exclusively associated with genes encoding the surface proteome and secretome of the parasite. Furthermore, we found that polymorphism is restricted to a small number of genes, which are highly expressed during infection. In order to identify pathogen-encoded factors involved in host-parasite interactions, we screened a proteome array comprised of 174 B. microti proteins, including several predicted members of the parasite secretome. Using this immuno-proteomic approach we identified several novel antigens that trigger strong host immune responses during the onset of infection. The genomic and immunological data presented herein provide the first insights into the determinants of B. microti interaction with its mammalian hosts and their relevance for understanding the selective pressures acting on parasite evolution
COGNITIVE THEORY OF CULTURE (an evolutionary alternative to sociobiology and group selection)
Sociobiological, "materialist" and group-level selection theories posit norms as functional units of natural and cultural selection. These theories ignore how such norms are represented in the mind and how they cause behavior. The norms are often convenient reflections of the person in the street, or practical reports by solitary researchers summarizing the flow of experience in one "culture" or another. They are merely signposts of behavioral tendencies, not rules of behavior, and they make communication and consensus possible in new or uncertain situations. Lacking reliable content or boundaries, norms cannot replicate faithfully enough to satisfy Darwinian selection. The notion of a set of norms that governs a society, or its "worldview", and that confers Darwinian advantages on an entire culture therefore makes little sense. Indeed, cultures lack the predefined properties or boundaries required for heritability; instead they exist in a multitude of modalities that allow them to spread, transform, blend, die out and even re-emerge. A "garden experiment" conducted in the Maya Lowlands illustrates the advantages of adopting a different evolutionary approach, cultural epidemiology, for the causal analysis of how societies form and develop. This approach departs markedly from essentialist approaches based on norms and rules, which reify culture. The epidemiological perspective treats the distributions and variations of ideas and behaviors as an object of study in their own right, and regards disagreement between people as signal rather than as noise or deviance. Like the Darwinian species, a culture has no existence beyond the individuals and ecological contexts that constitute it.
Processing massive bioinformatics data (Big Data)
The volumes of bioinformatics data available on the Web are constantly increasing. Access to and joint exploitation of these data, which are highly distributed (i.e., available in distributed Web data sources) and highly heterogeneous (in text or tabulated files, including images, in different formats, described with different levels of detail and different levels of quality ...), is essential for biological knowledge to progress. The purpose of this short report is to present in a simple way the problems of the joint use of bioinformatics data.
Occupational and non-occupational exposure of non-smokers to environmental tobacco smoke in Switzerland: preliminary results of an original campaign
A passive sampling device called the Monitor of NICotine, or "MoNIC", was constructed and evaluated by the IST
laboratory for determining nicotine in Environmental Tobacco Smoke (ETS). Nicotine vapour was passively
collected on a glass fibre filter treated with potassium bisulfate as the collection medium. The amount of
nicotine on the treated filter was analysed by gas chromatography with a Thermionic-Specific Detector (GCTSD)
after liquid-liquid extraction with 1 mL of 5N NaOH : 1 mL of n-heptane saturated with NH3, using
quinoline as internal standard. Based on a reference nicotine amount of 0.2 mg/cigarette, the
Cigarette Equivalents (CE) inhaled by non-smokers can be calculated. By comparing the CE detected on the badges of
non-smokers with the nicotine and cotinine levels in the saliva of both smokers and exposed
non-smokers (N=49), we can confirm the use of the CE concept for estimating exposure to ETS.
The Valais CIPRET (Center of information and prevention of the addiction to smoking) is going to
organize a large campaign on passive smoking entitled "Smoked passive, we
suffer from it, we die from it". This campaign will take place in 2007 and aims to inform
the population of Valais clearly about the dangers of passive smoke. More than 1,500 MoNIC badges were
distributed free of charge to the Swiss population for self-monitoring of the population's exposure level to ETS,
expressed in terms of CE. Non-stimulated saliva was also collected to determine the levels of the ETS biomarkers
nicotine and cotinine in participating volunteers.
Preliminary results on the different CE levels in occupational and non-occupational situations in relation to
ETS are presented in this study
