261 research outputs found

Verification of high-level transformations with inductive refinement types

Author: Aiken Alexander
Albarghouthi Aws
Alexei
Andreescu Oana Fabiana
Benzaken Véronique
Bodin Martin
Cousot Patrick
Evan Chang Bor-Yuh
Freeman Timothy S.
Mitchell Neil
Perrelle Valentin
Pham Tuan-Hung
Reynolds Andrew
Rival Xavier
Sloane Anthony M.
Toubhans Antoine
Vazou Niki
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2018

High-level transformation languages like Rascal include expressive features for manipulating large abstract syntax trees: first-class traversals, expressive pattern matching, backtracking and generalized iterators. We present the design and implementation of an abstract interpretation tool, Rabit, for verifying inductive type and shape properties for transformations written in such languages. We describe how to perform abstract interpretation based on operational semantics, specifically focusing on the challenges arising when analyzing the expressive traversals and pattern matching. Finally, we evaluate Rabit on a series of transformations (normalization, desugaring, refactoring, code generators, type inference, etc.) showing that we can effectively verify stated properties. CCS Concepts: • Software and its engineering → General programming languages; • Social and professional topics → History of programming languages.

HAL-CentraleSupelec

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

Copenhagen University Research Information System

The IT University of Copenhagen's Repository

HAL-Rennes 1

A review of computational approaches detecting microRNAs involved in cancer

Author: Barillot E
Cantini Laura
Caselle Michele
Forget A
Martignetti Loredana
Zinovyev A
Publication date: 01/01/2017

Institutional Research Information System University of Turin

Isolation, identification, and complete genome sequence of a bovine adenovirus type 3 from cattle in China

Author: Cai Hong
Dong Xiu-Mei
Gao Yu-Ran
Li Zhao-Li
Lu Chuang
Meng Qing-Feng
Shi Hong-Fei
Xue Fei
Yu Zuo
Zhu Yuan-Mao
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Bovine adenovirus type 3 (BAV-3) belongs to the Mastadenovirus genus of the family Adenoviridae and is involved in respiratory and enteric infections of calves. The isolation of BAV-3 in China had not been reported prior to this study. In 2009, many cattle showed clinical signs similar to those of BAV-3 infection, and a virus strain showing cytopathic effect in Madin-Darby bovine kidney cells was isolated from a bovine nasal swab collected from feedlot cattle in Heilongjiang Province, China. The isolate was confirmed as a bovine adenovirus type 3 by PCR and immunofluorescence assay, and was named HLJ0955. So far, only the complete genome sequence of the prototype BAV-3 strain WBR-1 has been reported. To further characterize the Chinese isolate HLJ0955, its complete genome sequence was determined. Results: The genome of the Chinese isolate HLJ0955 is 34,132 nucleotides in length with a G+C content of 53.6%. The coding sequences for gene regions of the HLJ0955 isolate were similar to those of the prototype BAV-3 strain WBR-1, with 80.0-98.6% nucleotide and 87.5-98.8% amino acid identities. The genome of the HLJ0955 strain contains 16 regions and four deletions in inverted terminal repeats, E1B region and E4 region, respectively. Phylogenetic analyses based on the complete genome and on the DNA binding protein gene were performed with other adenoviruses, and the results showed that the HLJ0955 isolate belonged to BAV-3 and clustered within the Mastadenovirus genus of the family Adenoviridae. Conclusions: This is the first study to report the isolation and molecular characterization of BAV-3 from cattle in China. The phylogenetic analysis performed in this study supported the use of the DNA binding protein gene of adenovirus as an appropriate subgenomic target for the classification of different genera of the family Adenoviridae on a molecular basis. Meanwhile, large-scale pathogen and serological epidemiological investigations of BAV-3 infection might be carried out in cattle in China. This report will be a good beginning for further studies on BAV-3 in China.
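
The G+C content quoted in the abstract above is the percentage of guanine and cytosine bases in a genome sequence. As a minimal sketch of that arithmetic (the function name `gc_content` and the toy sequence are illustrative assumptions, not taken from the paper or from the HLJ0955 genome):

```python
def gc_content(sequence: str) -> float:
    """Return the G+C percentage (0-100) of a nucleotide sequence."""
    seq = sequence.upper()
    # Count only unambiguous bases so that N or gap characters do not skew the ratio.
    counted = [base for base in seq if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Hypothetical 10-base fragment, used only to illustrate the calculation.
toy = "ATGCGCGTTA"
print(f"G+C content: {gc_content(toy):.1f}%")
```

Applied to a full genome read from a FASTA file, the same ratio would yield a figure comparable to the 53.6% reported, assuming ambiguous bases are excluded in the same way.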

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Potential of the Julia programming language for high energy physics computing

Author: Acosta U. Hernandez
Briceño A. Moreno
Eschle J.
Gal T.
Giordano M.
Gras P.
Hegner B.
Heinrich L.
Kluth S.
Ling J.
Mato P.
Mikhasenko M.
Pivarski J.
Samaras-Tsakiris K.
Schulz O.
Stewart G. A.
Strube J.
Vassilev V.
Publication date: 01/01/2023

Research in high energy physics (HEP) requires huge amounts of computing and storage, putting strong constraints on the code speed and resource usage. To meet these requirements, a compiled high-performance language is typically used; for physicists, however, who focus on the application when developing the code, better research productivity calls for a high-level programming language. A popular approach consists of combining Python, used for the high-level interface, and C++, used for the computing-intensive part of the code. A more convenient and efficient approach would be to use a language that provides both high-level programming and high performance. The Julia programming language, developed at MIT especially to allow the use of a single language in research activities, has followed this path. In this paper the applicability of the Julia language to HEP research is explored, covering the different aspects that are important for HEP code development: runtime performance, handling of large projects, interface with legacy code, distributed computing, training, and ease of programming. The study shows that the HEP community would benefit from a large-scale adoption of this programming language. The HEP-specific foundation libraries that would need to be consolidated are identified. Comment: 32 pages, 5 figures, 4 tables

arXiv.org e-Print Archive

HAL-CEA

CERN Document Server

Clinical Characteristics and Neuroanatomical Predictors of Acute Antidepressant Outcome for Patients with Comorbid Depression and Mild Cognitive Impairment

Author: Motter Jeffrey N
Publication venue: CUNY Academic Works
Publication date: 01/09/2019

Background: Older adults presenting with both a depressive disorder (DEP) and cognitive impairment (CI) represent a unique, understudied population. The classification of cognitive impairment severity continues to be debated, though it has recently been subtyped into late (LMCI) versus early (EMCI) stages. Previous studies have found associations between treatment outcome and both cortical thickness and white matter hyperintensities (WMH), though they report inconsistent directionality and affected regions. In this study, we examined baseline clinical characteristics and neuroanatomical features as prognostic indicators for older adults with comorbid DEP and CI participating in an open antidepressant trial. EMCI is hypothesized to have greater cortical thickness and global cognition than LMCI. Antidepressant treatment remitters and responders are hypothesized to have greater cortical thickness and lower WMH burden than non-remitters and non-responders. Methods: Key inclusion criteria were diagnosis of major depression or dysthymic disorder with Hamilton Depression Rating Scale (HDRS) score >14, and cognitive impairment defined by MMSE score ≥21 and impaired performance on the WMS-R Logical Memory II test. Patients were classified as EMCI or LMCI based on the 1.5 SD cutoff on tests of verbal memory, and compared on baseline clinical, neuropsychological, and anatomical characteristics. All patients underwent a baseline MRI scan and received open antidepressant treatment for 8 weeks. Cortical thickness was extracted using an automated brain segmentation and reconstruction program (FreeSurfer). Vertex-wise analyses were conducted using general linear models to evaluate the relationships between cortical thickness and clinical variables. Results: 79 DEP-CI patients were recruited, of whom 39 met criteria for EMCI and 40 for LMCI. The mean age was 68.9 and mean HDRS was 23.0. LMCI patients had significantly worse global cognition and smaller right hippocampal volume compared to EMCI patients. EMCI patients had thicker right medial orbitofrontal cortex than LMCI. MRI indices of cerebrovascular disease did not differ between MCI subtypes. Remitters had greater deep WMH burden, left medial orbitofrontal gyrus thickness, and right superior frontal gyrus thickness than non-remitters. Greater HDRS depressive severity was positively correlated with right pars triangularis thickness. Stronger ADAS-Cog global cognitive performance was positively correlated with thickness in diffuse cortical areas. Conclusions: Cognitive and neuronal loss markers differed between EMCI and LMCI among patients with DEP-CI, with LMCI being more likely to have the clinical and neuronal loss markers known to be associated with Alzheimer’s disease. Samples of DEP-CI exhibit unique patterns of cortical thickness and WMHs compared to their non-CI peers. Cortical thickness may serve as a predictor of treatment remission and relates to both depressive severity and global cognition.

City University of New York

A Mathematical Methodology for Determining the Temporal Order of Pathway Alterations Arising during Gliomagenesis

Human cancer is caused by the accumulation of genetic alterations in cells. Of special importance are changes that occur early during malignant transformation because they may result in oncogene addiction and thus represent promising targets for therapeutic intervention. We have previously described a computational approach, called Retracing the Evolutionary Steps in Cancer (RESIC), to determine the temporal sequence of genetic alterations during tumorigenesis from cross-sectional genomic data of tumors at their fully transformed stage. Since alterations within a set of genes belonging to a particular signaling pathway may have similar or equivalent effects, we applied a pathway-based systems biology approach to the RESIC methodology. This method was used to determine whether alterations of specific pathways develop early or late during malignant transformation. When applied to primary glioblastoma (GBM) copy number data from The Cancer Genome Atlas (TCGA) project, RESIC identified a temporal order of pathway alterations consistent with the order of events in secondary GBMs. We then further subdivided the samples into the four main GBM subtypes and determined the relative contributions of each subtype to the overall results: we found that the overall ordering applied for the proneural subtype but differed for mesenchymal samples. The temporal sequence of events could not be identified for neural and classical subtypes, possibly due to a limited number of samples. Moreover, for samples of the proneural subtype, we detected two distinct temporal sequences of events: (i) RAS pathway activation was followed by TP53 inactivation and finally PI3K2 activation, and (ii) RAS activation preceded only AKT activation. This extension of the RESIC methodology provides an evolutionary mathematical approach to identify the temporal sequence of pathway changes driving tumorigenesis and may be useful in guiding the understanding of signaling rearrangements in cancer development.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Harvard University - DASH

Directory of Open Access Journals

PubMed Central

FigShare

A computational intelligence analysis of G protein-coupled receptor sequences for pharmacoproteomic applications

Author: Cárdenas Domínguez Martha Ivón
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2017

Arguably, drug research has contributed more to the progress of medicine during the past decades than any other scientific factor. One of the main areas of drug research is related to the analysis of proteins. The world of pharmacology is becoming increasingly dependent on the advances in the fields of genomics and proteomics. This dependency brings about the challenge of finding robust methods to analyze the complex data they generate. Such a challenge invites us to go one step further than traditional statistics and resort to approaches under the conceptual umbrella of artificial intelligence, including machine learning (ML), statistical pattern recognition and soft computing methods. Sound statistical principles are essential to trust the evidence base built through the use of such approaches. Statistical ML methods are thus at the core of the current thesis. More than 50% of drugs currently available target only four key protein families, of which almost 30% correspond to the G Protein-Coupled Receptors (GPCR) superfamily. This superfamily regulates the function of most cells in living organisms and is at the centre of the investigations reported in the current thesis. Not much is known about the 3D structure of these proteins. Fortunately, plenty of information regarding their amino acid sequences is readily available. The automatic grouping and classification of GPCRs into families, and of these into subtypes, based on sequence analysis may significantly contribute to ascertaining the pharmaceutically relevant properties of this protein superfamily. There is no biologically relevant manner of representing the symbolic sequences describing proteins using real-valued vectors. This does not preclude the possibility of analyzing them using principled methods. These may come, amongst others, from the field of statistical ML. In particular, kernel methods can be used for this purpose. Moreover, the visualization of high-dimensional protein sequence data can be a key exploratory tool for finding meaningful information that might be obscured by their intrinsic complexity. That is why the objective of the research described in this thesis is twofold: first, the design of adequate visualization-oriented artificial intelligence-based methods for the analysis of GPCR sequential data, and second, the application of the developed methods in relevant pharmacoproteomic problems such as GPCR subtyping and protein alignment-free analysis.

It could be said that pharmacological research has played a predominant role in the advance of medicine over recent decades. One of the main areas of pharmacological research concerns the study of proteins. Pharmacology depends increasingly on advances in genomics and proteomics, which brings the challenge of designing robust methods for analyzing the complex data they generate. This challenge prompts us to go beyond traditional statistics and turn to approaches within the field of artificial intelligence, including machine learning and statistical pattern recognition, among others. The use of sound principles of statistical theory is essential for trusting the evidence base obtained through these approaches. Statistical machine learning methods are one of the foundations of this thesis. More than 50% of the drugs in use today target just four key protein families, of which 30% correspond to the G-Protein Coupled Receptors (GPCR) superfamily. GPCRs regulate the function of most cells and are the central subject of the thesis. The 3D structure of most of these proteins is unknown but, in contrast, a great deal of information about their amino acid sequences is available. The automatic grouping and classification of GPCRs into families, and of these in turn into subtypes, based on their sequences can contribute significantly to elucidating those of their properties that are of pharmacological interest. There is no biologically relevant way of representing the symbolic sequences of proteins as real-valued vectors. This does not prevent them from being analyzed with suitable methods. These include techniques from statistical machine learning and, in particular, kernel methods. In addition, the visualization of high-dimensional protein sequences can be a key tool for exploring and analyzing them. For this reason, the central objective of the research described in this thesis can be split into two main lines: first, the design of visualization-centered, artificial intelligence-based methods for analyzing the sequential data corresponding to GPCRs and, second, the application of the developed methods to pharmacoproteomic problems such as GPCR subtyping and the analysis of unaligned proteins.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa

Deep Representation Learning of Electronic Health Records to Unlock Patient Stratification at Scale

Author: Cherng Sarah
Danieletto Matteo
Dudley Joel T.
Furlanello Cesare
Glicksberg Benjamin S.
Landi Giulia
Landi Isotta
Lee Hao-Chih
Miotto Riccardo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/07/2020

Deriving disease subtypes from electronic health records (EHRs) can guide next-generation personalized medicine. However, challenges in summarizing and representing patient data prevent widespread practice of scalable EHR-based stratification analysis. Here we present an unsupervised framework based on deep learning to process heterogeneous EHRs and derive patient representations that can efficiently and effectively enable patient stratification at scale. We considered EHRs of 1,608,741 patients from a diverse hospital cohort comprising a total of 57,464 clinical concepts. We introduce a representation learning model based on word embeddings, convolutional neural networks, and autoencoders (i.e., ConvAE) to transform patient trajectories into low-dimensional latent vectors. We evaluated these representations as broadly enabling patient stratification by applying hierarchical clustering to different multi-disease and disease-specific patient cohorts. ConvAE significantly outperformed several baselines in a clustering task to identify patients with different complex conditions, with 2.61 entropy and 0.31 purity average scores. When applied to stratify patients within a certain condition, ConvAE led to various clinically relevant subtypes for different disorders, including type 2 diabetes, Parkinson's disease and Alzheimer's disease, largely related to comorbidities, disease progression, and symptom severity. With these results, we demonstrate that ConvAE can generate patient representations that lead to clinically meaningful insights. This scalable framework can help better understand varying etiologies in heterogeneous sub-populations and unlock patterns for EHR-based research in the realm of personalized medicine. Comment: C.F. and R.M. share senior authorship
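
The abstract above reports clustering quality as entropy and purity scores. The sketch below shows one common way to compute the two measures from cluster assignments and reference labels. It assumes the usual textbook definitions (size-weighted Shannon entropy in bits, and the fraction of points that fall in their cluster's majority class), which may differ from the exact formulation used in the paper; the function name and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def purity_and_entropy(cluster_ids, true_labels):
    """Return (purity, entropy) of a clustering against reference labels."""
    if len(cluster_ids) != len(true_labels) or not true_labels:
        raise ValueError("inputs must be non-empty and of equal length")

    # Group the reference labels by the cluster each point was assigned to.
    clusters = defaultdict(list)
    for cluster, label in zip(cluster_ids, true_labels):
        clusters[cluster].append(label)

    n = len(true_labels)
    purity = 0.0
    entropy = 0.0
    for members in clusters.values():
        counts = Counter(members)
        size = len(members)
        # Purity: share of all points that belong to their cluster's majority label.
        purity += max(counts.values()) / n
        # Entropy: Shannon entropy of the label mix, weighted by cluster size.
        h = -sum((k / size) * math.log2(k / size) for k in counts.values())
        entropy += (size / n) * h
    return purity, entropy


# Hypothetical toy data: two clusters, one of them mixing two conditions.
clusters = [0, 0, 0, 1, 1, 1]
labels = ["T2D", "T2D", "PD", "AD", "AD", "AD"]
print(purity_and_entropy(clusters, labels))
```

Under these definitions higher purity and lower entropy both indicate clusters that align more closely with the reference conditions.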

arXiv.org e-Print Archive

Directory of Open Access Journals

Pathogenic Determinants of the Mycobacterium kansasii Complex: An Unsuspected Role for Distributive Conjugal Transfer.

Author: Bertelli C.
Greub G.
Jaton K.
Pillonel T.
Tagini F.
Publication venue: 'MDPI AG'
Publication date: 10/02/2021

The Mycobacterium kansasii species comprises six subtypes that were recently classified into six closely related species: Mycobacterium kansasii (formerly M. kansasii subtype 1), Mycobacterium persicum (subtype 2), Mycobacterium pseudokansasii (subtype 3), Mycobacterium ostraviense (subtype 4), Mycobacterium innocens (subtype 5) and Mycobacterium attenuatum (subtype 6). Together with Mycobacterium gastri, they form the M. kansasii complex. M. kansasii is the most frequent and most pathogenic species of the complex. M. persicum is classically associated with diseases in immunosuppressed patients, and the other species are mostly colonizers and are only very rarely reported in ill patients. Comparative genomics was used to assess the genetic determinants leading to the pathogenicity of members of the M. kansasii complex. The genomes of 51 isolates collected from patients with and without disease were sequenced and compared with 24 publicly available genomes. The pathogenicity of each isolate was determined based on the clinical records or public metadata. A comparative genomic analysis showed that all M. persicum, M. ostraviense, M. innocens and M. gastri isolates lacked the ESX-1-associated EspACD locus that is thought to play a crucial role in the pathogenicity of M. tuberculosis and other non-tuberculous mycobacteria. Furthermore, M. kansasii was the only species exhibiting a 25-kb genomic island encoding 17 type-VII secretion system-associated proteins. Finally, a genome-wide association analysis revealed that two consecutive genes encoding a hemerythrin-like protein and a nitroreductase-like protein were significantly associated with pathogenicity. These two genes may be involved in the resistance to reactive oxygen and nitrogen species, a required mechanism for the intracellular survival of bacteria. Three non-pathogenic M. kansasii lacked these genes, likely due to two distinct distributive conjugal transfers (DCTs) between M. attenuatum and M. kansasii, and one DCT between M. persicum and M. kansasii. To our knowledge, this is the first study linking DCT to reduced pathogenicity.

Multidisciplinary Digital Publishing Institute

Serveur académique lausannois

mBLAST: Keeping up with the sequencing explosion for (meta) genome analysis

Author: Abubucker Sahar
Baldhandapani Venkat
Becker Eric
Davis Curtis
Gong Wei
Hudson Matthew E
Khetani Radhika
Kota Karthik
Martin John
Mitreva Makedonka
Weinstock George M
Wylie Kristine M
Publication venue: Digital Commons@Becker
Publication date: 01/01/2013

Digital Commons@Becker