
Mining Heterogeneous Multivariate Time-Series for Learning Meaningful Patterns: Application to Home Health Telecare

Author: Duchene Florence
Garbay Catherine
Rialle Vincent
Publication date: 25/11/2004

In recent years, time-series mining has become a challenging issue for researchers. An important application lies in monitoring, which requires analyzing large sets of time-series to learn usual patterns. Any deviation from this learned profile is then considered an unexpected situation. Moreover, complex applications may involve the temporal study of several heterogeneous parameters. In this paper, we propose a method for mining heterogeneous multivariate time-series to learn meaningful patterns. The proposed approach allows for mixed time-series -- containing both pattern and non-pattern data -- as well as for imprecise matches, outliers, stretching and global translating of pattern instances in time. We present the early results of our approach in the context of monitoring the health status of a person at home. The purpose is to build a behavioral profile of a person by analyzing the time variations of several quantitative or qualitative parameters recorded through a set of sensors installed in the home.

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

Multiple instance learning for sequence data with across bag dependencies

Author: Aridhi Sabeur
Maddouri Mondher
Nguifo Engelbert Mephu
Zoghlami Manel
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2019

In the Multiple Instance Learning (MIL) problem for sequence data, the instances inside the bags are sequences. In some real-world applications such as bioinformatics, comparing a random pair of sequences makes no sense. In fact, each instance may have structural and/or functional relations with instances of other bags. Thus, the classification task should take this across-bag relation into account. In this work, we present two novel MIL approaches for sequence data classification named ABClass and ABSim. ABClass extracts motifs from related instances and uses them to encode sequences. A discriminative classifier is then applied to compute a partial classification result for each set of related sequences. ABSim uses a similarity measure to discriminate the related instances and to compute a score matrix. For both approaches, an aggregation method is applied in order to generate the final classification result. We applied both approaches to solve the problem of bacterial Ionizing Radiation Resistance prediction. The experimental results of the presented approaches are satisfactory.

arXiv.org e-Print Archive

HAL Clermont Université

INRIA a CCSD electronic archive server

The C-Terminal Domain of the Arabinosyltransferase Mycobacterium tuberculosis EmbC Is a Lectin-Like Carbohydrate Binding Module

The D-arabinan-containing polymers arabinogalactan (AG) and lipoarabinomannan (LAM) are essential components of the unique cell envelope of the pathogen Mycobacterium tuberculosis. Biosynthesis of AG and LAM involves a series of membrane-embedded arabinofuranosyl (Araf) transferases whose structures are largely uncharacterised, despite the fact that several of them are pharmacological targets of ethambutol, a frontline drug in tuberculosis therapy. Herein, we present the crystal structure of the C-terminal hydrophilic domain of the ethambutol-sensitive Araf transferase M. tuberculosis EmbC, which is essential for LAM synthesis. The structure of the C-terminal domain of EmbC (EmbCCT) encompasses two sub-domains of different folds, of which subdomain II shows distinct similarity to lectin-like carbohydrate-binding modules (CBM). Co-crystallisation with a cell wall-derived di-arabinoside acceptor analogue and structural comparison with ligand-bound CBMs suggest that EmbCCT contains two separate carbohydrate binding sites, associated with subdomains I and II, respectively. Single-residue substitution of conserved tryptophan residues (Trp868, Trp985) at these respective sites inhibited EmbC-catalysed extension of LAM. The same substitutions differentially abrogated binding of di- and penta-arabinofuranoside acceptor analogues to EmbCCT, linking the loss of activity to compromised acceptor substrate binding, indicating the presence of two separate carbohydrate binding sites, and demonstrating that subdomain II indeed functions as a carbohydrate-binding module. This work provides the first step towards unravelling the structure and function of a GT-C-type glycosyltransferase that is essential in M. tuberculosis.

Author Summary: Tuberculosis (TB), an infectious disease caused by the bacillus Mycobacterium tuberculosis, burdens large swaths of the world population. Treatment of active TB typically requires administration of an antibiotic cocktail over several months that includes the drug ethambutol. This front line compound inhibits a set of arabinosyltransferase enzymes, called EmbA, EmbB and EmbC, which are critical for the synthesis of arabinan, a vital polysaccharide in the pathogen's unique cell envelope. How precisely ethambutol inhibits arabinosyltransferase activity is not clear, in part because structural information of its pharmacological targets has been elusive. Here, we report the high-resolution structure of the C-terminal domain of the ethambutol-target EmbC, a 390-amino acid fragment responsible for acceptor substrate recognition. Combining the X-ray crystallographic analysis with structural comparisons, site-directed mutagenesis, activity and ligand binding assays, we identified two regions in the C-terminal domain of EmbC that are capable of binding acceptor substrate mimics and are critical for activity of the full-length enzyme. Our results begin to define structure-function relationships in a family of structurally uncharacterised membrane-embedded glycosyltransferases, which are an important target for tuberculosis therapy.

Public Library of Science (PLOS)

Crossref

University of Birmingham Research Portal

Directory of Open Access Journals

PubMed Central

Juelich Shared Electronic Resources

The regulatory and transcriptional landscape associated with carbon utilization in a filamentous fungus.

Author: Benz J Philipp
Blow Matthew J
Calhoun Sara
Dietschmann Axel
Glass N Louise
Grigoriev Igor V
Huberman Lori B
Kowbel David J
Lee Juna
Lipzen Anna
Monti Remo
O'Malley Ronan C
Singan Vasanth R
Thieme Nils
Wu Vincent W
Xiong Yi
Publication venue: eScholarship, University of California
Publication date: 01/03/2020

Filamentous fungi, such as Neurospora crassa, are very efficient in deconstructing plant biomass by the secretion of an arsenal of plant cell wall-degrading enzymes, by remodeling metabolism to accommodate production of secreted enzymes, and by enabling transport and intracellular utilization of plant biomass components. Although a number of enzymes and transcriptional regulators involved in plant biomass utilization have been identified, how filamentous fungi sense and integrate nutritional information encoded in the plant cell wall into a regulatory hierarchy for optimal utilization of complex carbon sources is not understood. Here, we performed transcriptional profiling of N. crassa on 40 different carbon sources, including plant biomass, to provide data on how fungi sense simple to complex carbohydrates. From these data, we identified regulatory factors in N. crassa and characterized one (PDR-2) associated with pectin utilization and one with pectin/hemicellulose utilization (ARA-1). Using in vitro DNA affinity purification sequencing (DAP-seq), we identified direct targets of transcription factors involved in regulating genes encoding plant cell wall-degrading enzymes. In particular, our data clarified the role of the transcription factor VIB-1 in the regulation of genes encoding plant cell wall-degrading enzymes and nutrient scavenging and revealed a major role of the carbon catabolite repressor CRE-1 in regulating the expression of major facilitator transporter genes. These data contribute to a more complete understanding of cross talk between transcription factors and their target genes, which are involved in regulating nutrient sensing and plant biomass utilization on a global level

eScholarship - University of California

Mitigation of off-target toxicity in CRISPR-Cas9 screens for essential non-coding elements.

Author: Aradhana
Bassik Michael C
Bintu Lacramioara
Ego Braeden K
Greenleaf William J
Greenside Peyton G
Hess Gaelen T
Kaplow Irene M
Kundaje Anshul
Li Amy
Marinov Georgi K
Morgens David W
Phanstiel Douglas H
Snyder Michael P
Spees Kaitlyn
Trevino Alexandro E
Truong Alisa
Tycko Josh
Ursu Oana
Wainberg Michael
Yao David
Publication venue: eScholarship, University of California
Publication date: 01/09/2019

Pooled CRISPR-Cas9 screens are a powerful method for functionally characterizing regulatory elements in the non-coding genome, but off-target effects in these experiments have not been systematically evaluated. Here, we investigate Cas9, dCas9, and CRISPRi/a off-target activity in screens for essential regulatory elements. The sgRNAs with the largest effects in genome-scale screens for essential CTCF loop anchors in K562 cells were not single guide RNAs (sgRNAs) that disrupted gene expression near the on-target CTCF anchor. Rather, these sgRNAs had high off-target activity that, while only weakly correlated with absolute off-target site number, could be predicted by the recently developed GuideScan specificity score. Screens conducted in parallel with CRISPRi/a, which do not induce double-stranded DNA breaks, revealed that a distinct set of off-targets also cause strong confounding fitness effects with these epigenome-editing tools. Promisingly, filtering of CRISPRi libraries using GuideScan specificity scores removed these confounded sgRNAs and enabled identification of essential regulatory elements

eScholarship - University of California

EXMOTIF: efficient structured motif extraction

Author: A Apostolico
A Brazma
A Carvalho
A Policriti
AM Carvalho
D Thakurta
E Eskin
G Benson
G Pavesi
J van Helden
J Zhu
L Marsan
M Friberg
M Zhang
MF Sagot
MJ Zaki
Mohammed J Zaki
N Pisanti
P Michailidis
S Sinha
TL Bailey
Yongqiang Zhang
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Extracting motifs from sequences is a mainstay of bioinformatics. We look at the problem of mining structured motifs, which allow variable length gaps between simple motif components. We propose an efficient algorithm, called EXMOTIF, that given some sequence(s), and a structured motif template, extracts all frequent structured motifs that have quorum q. Potential applications of our method include the extraction of single/composite regulatory binding sites in DNA sequences. RESULTS: EXMOTIF is efficient in terms of both time and space and is shown empirically to outperform RISO, a state-of-the-art algorithm. It is also successful in finding potential single/composite transcription factor binding sites. CONCLUSION: EXMOTIF is a useful and efficient tool in discovering structured motifs, especially in DNA sequences. The algorithm is available as open-source at:
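Illustration (not the EXMOTIF algorithm and not the authors' released code): the abstract above describes structured motifs as simple components separated by variable-length gaps, kept when they meet a quorum q. A minimal brute-force sketch of that notion might look as follows. It assumes the quorum is the number of sequences containing at least one occurrence, and the function names, gap bounds, and sample sequences are all hypothetical.

```python
import re
from typing import List, Sequence, Tuple


def structured_motif_regex(components: Sequence[str],
                           gaps: Sequence[Tuple[int, int]]) -> "re.Pattern[str]":
    """Build a regex for components separated by gaps of bounded length.

    components: k simple motif components, e.g. ["ACG", "GCA"]
    gaps: k-1 (min_len, max_len) pairs, one per gap between components
    """
    if len(gaps) != len(components) - 1:
        raise ValueError("need exactly one gap range between adjacent components")
    parts: List[str] = [re.escape(components[0])]
    for (lo, hi), comp in zip(gaps, components[1:]):
        parts.append(f".{{{lo},{hi}}}")  # any symbols, between lo and hi of them
        parts.append(re.escape(comp))
    return re.compile("".join(parts))


def support(sequences: Sequence[str],
            components: Sequence[str],
            gaps: Sequence[Tuple[int, int]]) -> int:
    """Count the sequences that contain at least one occurrence of the motif."""
    pattern = structured_motif_regex(components, gaps)
    return sum(1 for seq in sequences if pattern.search(seq))


def is_frequent(sequences: Sequence[str],
                components: Sequence[str],
                gaps: Sequence[Tuple[int, int]],
                quorum: int) -> bool:
    """A motif is frequent here if its support reaches the quorum (assumed definition)."""
    return support(sequences, components, gaps) >= quorum


if __name__ == "__main__":
    # Hypothetical toy sequences, not data from the paper.
    seqs = ["ACGTTTGCA", "ACGGCA", "TTTT", "ACGAAAAAAGCA"]
    print(support(seqs, ["ACG", "GCA"], [(0, 3)]))
    print(is_frequent(seqs, ["ACG", "GCA"], [(0, 3)], quorum=2))
```

This only checks one given motif against the sequences; enumerating all frequent motifs that fit a template efficiently is the harder problem the paper addresses.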

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central