Mining Heterogeneous Multivariate Time-Series for Learning Meaningful Patterns: Application to Home Health Telecare
In recent years, time-series mining has become a challenging issue for
researchers. An important application lies in monitoring, which
requires analyzing large sets of time-series to learn usual patterns. Any
deviation from this learned profile is then considered an unexpected
situation. Moreover, complex applications may involve the temporal study of
several heterogeneous parameters. In this paper, we propose a method for mining
heterogeneous multivariate time-series to learn meaningful patterns. The
proposed approach allows for mixed time-series -- containing both pattern and
non-pattern data -- as well as for imprecise matches, outliers, stretching and
global translation of pattern instances in time. We present the early results
of our approach in the context of monitoring the health status of a person at
home. The purpose is to build a behavioral profile of a person by analyzing the
time variations of several quantitative or qualitative parameters recorded
through sensors installed in the home.
Multiple instance learning for sequence data with across bag dependencies
In the Multiple Instance Learning (MIL) problem for sequence data, the instances
inside the bags are sequences. In some real-world applications such as
bioinformatics, comparing a random pair of sequences makes no sense. In fact,
each instance may have structural and/or functional relations with instances of
other bags. Thus, the classification task should take this across-bag
relation into account. In this work, we present two novel MIL approaches for sequence
data classification named ABClass and ABSim. ABClass extracts motifs from
related instances and uses them to encode sequences. A discriminative classifier
is then applied to compute a partial classification result for each set of
related sequences. ABSim uses a similarity measure to discriminate the related
instances and to compute a score matrix. For both approaches, an aggregation
method is applied in order to generate the final classification result. We
applied both approaches to the problem of predicting bacterial Ionizing Radiation
Resistance. The experimental results of the presented approaches are
satisfactory.
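The abstract does not say which aggregation method ABClass and ABSim use. As a minimal sketch of the general idea only, the snippet below combines the partial classification results of one bag by majority vote. The function name, the vote rule, and the tie-breaking rule are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter


def aggregate_majority(partial_labels):
    """Combine partial classification results into one final label.

    Assumption: simple majority vote; ties are broken by picking the
    smallest label so that the result is deterministic.
    """
    if not partial_labels:
        raise ValueError("at least one partial result is required")
    counts = Counter(partial_labels)
    best = max(counts.values())
    # Collect every label that reaches the top count, then break ties.
    winners = sorted(label for label, n in counts.items() if n == best)
    return winners[0]


# Hypothetical partial results, one per set of related sequences.
final_label = aggregate_majority([1, 1, 0])
```

Other aggregation rules (for example a weighted vote using classifier confidence) would fit the same interface.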
The C-Terminal Domain of the Arabinosyltransferase Mycobacterium tuberculosis EmbC Is a Lectin-Like Carbohydrate Binding Module
The D-arabinan-containing polymers arabinogalactan (AG) and lipoarabinomannan (LAM) are essential components of the unique cell envelope of the pathogen Mycobacterium tuberculosis. Biosynthesis of AG and LAM involves a series of membrane-embedded arabinofuranosyl (Araf) transferases whose structures are largely uncharacterised, despite the fact that several of them are pharmacological targets of ethambutol, a frontline drug in tuberculosis therapy. Herein, we present the crystal structure of the C-terminal hydrophilic domain of the ethambutol-sensitive Araf transferase M. tuberculosis EmbC, which is essential for LAM synthesis. The structure of the C-terminal domain of EmbC (EmbCCT) encompasses two sub-domains of different folds, of which subdomain II shows distinct similarity to lectin-like carbohydrate-binding modules (CBM). Co-crystallisation with a cell wall-derived di-arabinoside acceptor analogue and structural comparison with ligand-bound CBMs suggest that EmbCCT contains two separate carbohydrate binding sites, associated with subdomains I and II, respectively. Single-residue substitution of conserved tryptophan residues (Trp868, Trp985) at these respective sites inhibited EmbC-catalysed extension of LAM. The same substitutions differentially abrogated binding of di- and penta-arabinofuranoside acceptor analogues to EmbCCT, linking the loss of activity to compromised acceptor substrate binding, indicating the presence of two separate carbohydrate binding sites, and demonstrating that subdomain II indeed functions as a carbohydrate-binding module. This work provides the first step towards unravelling the structure and function of a GT-C-type glycosyltransferase that is essential in M. tuberculosis.
Author Summary
Tuberculosis (TB), an infectious disease caused by the bacillus Mycobacterium tuberculosis, burdens large swaths of the world population.
Treatment of active TB typically requires administration of an antibiotic cocktail over several months that includes the drug ethambutol. This frontline compound inhibits a set of arabinosyltransferase enzymes, called EmbA, EmbB and EmbC, which are critical for the synthesis of arabinan, a vital polysaccharide in the pathogen's unique cell envelope. How precisely ethambutol inhibits arabinosyltransferase activity is not clear, in part because structural information on its pharmacological targets has been elusive. Here, we report the high-resolution structure of the C-terminal domain of the ethambutol-target EmbC, a 390-amino acid fragment responsible for acceptor substrate recognition. Combining the X-ray crystallographic analysis with structural comparisons, site-directed mutagenesis, activity and ligand binding assays, we identified two regions in the C-terminal domain of EmbC that are capable of binding acceptor substrate mimics and are critical for activity of the full-length enzyme. Our results begin to define structure-function relationships in a family of structurally uncharacterised membrane-embedded glycosyltransferases, which are an important target for tuberculosis therapy.
The regulatory and transcriptional landscape associated with carbon utilization in a filamentous fungus.
Filamentous fungi, such as Neurospora crassa, are very efficient in deconstructing plant biomass by the secretion of an arsenal of plant cell wall-degrading enzymes, by remodeling metabolism to accommodate production of secreted enzymes, and by enabling transport and intracellular utilization of plant biomass components. Although a number of enzymes and transcriptional regulators involved in plant biomass utilization have been identified, how filamentous fungi sense and integrate nutritional information encoded in the plant cell wall into a regulatory hierarchy for optimal utilization of complex carbon sources is not understood. Here, we performed transcriptional profiling of N. crassa on 40 different carbon sources, including plant biomass, to provide data on how fungi sense simple to complex carbohydrates. From these data, we identified regulatory factors in N. crassa and characterized one (PDR-2) associated with pectin utilization and one with pectin/hemicellulose utilization (ARA-1). Using in vitro DNA affinity purification sequencing (DAP-seq), we identified direct targets of transcription factors involved in regulating genes encoding plant cell wall-degrading enzymes. In particular, our data clarified the role of the transcription factor VIB-1 in the regulation of genes encoding plant cell wall-degrading enzymes and nutrient scavenging and revealed a major role of the carbon catabolite repressor CRE-1 in regulating the expression of major facilitator transporter genes. These data contribute to a more complete understanding of cross talk between transcription factors and their target genes, which are involved in regulating nutrient sensing and plant biomass utilization on a global level.
Mitigation of off-target toxicity in CRISPR-Cas9 screens for essential non-coding elements.
Pooled CRISPR-Cas9 screens are a powerful method for functionally characterizing regulatory elements in the non-coding genome, but off-target effects in these experiments have not been systematically evaluated. Here, we investigate Cas9, dCas9, and CRISPRi/a off-target activity in screens for essential regulatory elements. The single guide RNAs (sgRNAs) with the largest effects in genome-scale screens for essential CTCF loop anchors in K562 cells were not sgRNAs that disrupted gene expression near the on-target CTCF anchor. Rather, these sgRNAs had high off-target activity that, while only weakly correlated with absolute off-target site number, could be predicted by the recently developed GuideScan specificity score. Screens conducted in parallel with CRISPRi/a, which do not induce double-stranded DNA breaks, revealed that a distinct set of off-targets also cause strong confounding fitness effects with these epigenome-editing tools. Promisingly, filtering of CRISPRi libraries using GuideScan specificity scores removed these confounded sgRNAs and enabled identification of essential regulatory elements.
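The filtering step described in this abstract amounts to dropping guides whose specificity score falls below a cutoff. The sketch below illustrates that idea on plain Python dictionaries. The record layout, the field name `specificity`, the example guide names, and the threshold value are hypothetical; the abstract does not state the cutoff the authors used, and this code does not call GuideScan itself.

```python
def filter_guides(guides, min_specificity):
    """Keep only sgRNA records whose specificity score meets the cutoff.

    Assumption: each record is a dict with a precomputed numeric
    'specificity' field (higher means fewer predicted off-targets).
    """
    return [g for g in guides if g["specificity"] >= min_specificity]


# Hypothetical library with precomputed scores.
library = [
    {"name": "sg1", "specificity": 0.95},
    {"name": "sg2", "specificity": 0.05},
    {"name": "sg3", "specificity": 0.40},
]

# The cutoff below is an arbitrary illustration, not a published value.
kept = filter_guides(library, min_specificity=0.2)
kept_names = [g["name"] for g in kept]
```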
EXMOTIF: efficient structured motif extraction
BACKGROUND: Extracting motifs from sequences is a mainstay of bioinformatics. We look at the problem of mining structured motifs, which allow variable length gaps between simple motif components. We propose an efficient algorithm, called EXMOTIF, that given some sequence(s), and a structured motif template, extracts all frequent structured motifs that have quorum q. Potential applications of our method include the extraction of single/composite regulatory binding sites in DNA sequences. RESULTS: EXMOTIF is efficient in terms of both time and space and is shown empirically to outperform RISO, a state-of-the-art algorithm. It is also successful in finding potential single/composite transcription factor binding sites. CONCLUSION: EXMOTIF is a useful and efficient tool in discovering structured motifs, especially in DNA sequences. The algorithm is available as open-source at:
- …
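To make the notion of a structured motif concrete, the sketch below enumerates two-component motifs separated by a variable-length gap and keeps those that meet a quorum. This is a brute-force illustration, not the EXMOTIF algorithm, and it makes no attempt at its efficiency. It assumes that quorum q means the minimum number of input sequences containing the motif; the function and parameter names are hypothetical.

```python
from collections import defaultdict


def structured_motifs(sequences, len1, len2, gap_min, gap_max, quorum):
    """Find motifs (A, B) where B starts gap_min..gap_max symbols after A ends.

    Returns a dict mapping each (A, B) pair to the number of sequences
    that contain it, restricted to pairs supported by >= quorum sequences.
    """
    support = defaultdict(set)  # (A, B) -> indices of supporting sequences
    for idx, seq in enumerate(sequences):
        for i in range(len(seq) - len1 + 1):
            first = seq[i:i + len1]
            for gap in range(gap_min, gap_max + 1):
                j = i + len1 + gap  # start of the second component
                if j + len2 > len(seq):
                    break  # larger gaps would also run past the end
                support[(first, seq[j:j + len2])].add(idx)
    return {m: len(s) for m, s in support.items() if len(s) >= quorum}


# Toy DNA-like sequences: "AC" and "GA" occur with gap 2 and gap 1.
motifs = structured_motifs(["ACTTGA", "ACTGA", "GGGGGG"],
                           len1=2, len2=2, gap_min=1, gap_max=3, quorum=2)
```

Here `motifs` contains only the pair `("AC", "GA")`, supported by the first two sequences; every other pair appears in a single sequence and is discarded by the quorum.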