
    Summarizing a set of time series by averaging: From Steiner sequence to compact multiple alignment

    Summarizing a set of sequences is an old topic that has been revived in the last decade, due to the increasing availability of sequential datasets. The definition of a consensus object is at the center of data analysis issues, since it crystallizes the underlying organization of the data. Dynamic Time Warping (DTW) is currently the most relevant similarity measure between sequences for a large panel of applications, since it makes it possible to capture temporal distortions. In this context, averaging a set of sequences is not a trivial task, since the average sequence has to be consistent with this similarity measure. The Steiner theory and several works in computational biology have pointed out the connection between multiple alignments and average sequences. Taking inspiration from these works, we introduce the notion of compact multiple alignment, which allows us to link these theories to the problem of summarizing under time warping. Having defined the link between the multiple alignment and the average sequence, the second part of this article focuses on scanning the space of compact multiple alignments in order to provide an average sequence of a set of sequences. We propose to use a genetic algorithm based on a specific representation of the genotype inspired by genes. This representation of the genotype makes it possible to consistently paint the fitness landscape. Experiments carried out on standard datasets show that the proposed approach outperforms existing methods.
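
The abstract relies on DTW as the similarity measure that any average sequence must respect. As a rough illustration only, the sketch below computes the classic dynamic-programming DTW cost between two numeric sequences. The function name `dtw_distance` and the absolute-difference local cost are assumptions made for this example; the abstract does not specify the article's cost function, and this is not the compact multiple alignment or the genetic algorithm the authors propose.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two numeric sequences.

    Assumption: the local cost is the absolute difference |a[i] - b[j]|.
    Runs in O(len(a) * len(b)) time and memory.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] matched again with an earlier b
                cost[i][j - 1],      # b[j-1] matched again with an earlier a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


if __name__ == "__main__":
    # A repeated sample is absorbed by warping, so the cost stays at zero.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because warping absorbs the repeated sample, the two sequences in the usage example are at DTW cost zero even though they have different lengths, which is why a pointwise arithmetic mean is not a suitable average under this measure.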

    Assessing the geometric diversity of cytochrome P450 ligand conformers by hierarchical clustering with a stop criterion.

    An algorithm is presented that outputs a computed number of rigid conformers of an input small molecule, covering the geometric diversity of the conformational space with minimal structural redundancy. The algorithm calls a conformer generator, then performs an agglomerative hierarchical clustering with the modified clustering gain as the stop criterion. The number of classes is computed without an arbitrary parameter. A representative conformer is selected in each class, and nonrepresentative conformers are discarded. For illustration, the algorithm has been applied to a database containing 70 ligands of the cytochrome CYP 3A4, showing that the structural flexibility of each ligand is indeed handled via a small number of its representative conformers. The method is valid for all small molecules.
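
The pipeline described above (agglomerative clustering, a parameter-free stop criterion, one representative per class) can be sketched roughly as follows. This is a simplified stand-in, not the authors' method. It assumes points in Euclidean space instead of conformers compared by a structural distance. It uses centroid linkage. As the criterion it uses one common formulation of the unmodified clustering gain, the sum over clusters of (n_k - 1) times the squared distance between the cluster centroid and the global centroid. The abstract does not define the "modified" variant, and all function names are hypothetical.

```python
from itertools import combinations


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def clustering_gain(clusters, global_centroid):
    # Assumed (unmodified) gain: sum_k (n_k - 1) * ||c_k - c_0||^2.
    # It is zero for all-singletons and for a single cluster.
    return sum(
        (len(c) - 1) * sq_dist(centroid(c), global_centroid) for c in clusters
    )


def cluster_with_gain_criterion(points):
    """Merge the two closest clusters repeatedly; keep the partition of maximal gain."""
    clusters = [[p] for p in points]
    g0 = centroid(points)
    best_gain = clustering_gain(clusters, g0)
    best = [list(c) for c in clusters]
    while len(clusters) > 1:
        # Centroid linkage: merge the pair of clusters with the closest centroids
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: sq_dist(
                centroid(clusters[ij[0]]), centroid(clusters[ij[1]])
            ),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        gain = clustering_gain(clusters, g0)
        if gain > best_gain:
            best_gain, best = gain, [list(c) for c in clusters]
    return best


def representative(cluster):
    """Pick the member closest to the cluster centroid."""
    c = centroid(cluster)
    return min(cluster, key=lambda p: sq_dist(p, c))


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    for cl in cluster_with_gain_criterion(pts):
        print(sorted(cl), "->", representative(cl))
```

For simplicity the sketch scans the whole merge hierarchy and keeps the best partition instead of halting at the first decrease in gain. Either way, the number of classes comes from the data and not from a user-chosen threshold, which is the property the abstract emphasizes.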

    Discovering significant evolution patterns in satellite image time series

    Satellite Image Time Series (SITS) are important sources of information on how land evolves. Studying these images makes it possible to understand changes in specific areas and also to discover large-scale evolution patterns. However, discovering these phenomena requires addressing several challenges tied to the characteristics of SITS and their constraints. First, each pixel of a satellite image is described by several values (the radiometric levels at different wavelengths). Second, these evolution patterns span very long periods and are not necessarily synchronous across regions. Third, regions not affected by significant changes are in the majority, and their dominance makes the extraction of evolution patterns difficult. In this article, we propose a method that addresses these difficulties and validate it on a series of satellite images acquired over a period of 20 years.

    RE2O3 dissolution kinetics and mechanisms in CAS silicate melt: Influence of the rare earth

    Fine particles of sand, dust or volcanic ash ingested by aircraft engines are well known to damage 8YPSZ Thermal Barrier Coatings (TBC). In service, these particles deposit on the hot TBC surface (≥ 1200°C) as molten silicate and infiltrate the porous microstructure of the coating. They mainly consist of CaO-MgO-Al2O3-SiO2 (CMAS) in variable proportions and also contain metallic oxides. Gd2Zr2O7 TBC has proven effective at mitigating synthetic CMAS infiltration due to its reactivity with CMAS [1]. Indeed, the dissolution reaction leads to the rapid formation of a sealing layer in the topcoat, mainly consisting of crystalline Ca2Gd8(SiO4)6O2 apatite. However, this phase is not always stable in contact with CMAS, and many rare-earth silicates may compete with apatite crystallization [2]. Several rare-earth oxides RE2O3 can be considered to replace yttria in ZrO2-based TBC, but little is known about the reaction kinetics and thermodynamics involving RE2O3 and the multi-component CMAS system.

    Mid- and Far-Infrared Marker Bands of the Metal Coordination Sites of the Histidine Side Chains in the Protein Cu,Zn-Superoxide Dismutase

    Vibrational spectroscopy gives important information on the properties of ligand and metal–ligand bonds in metalloenzymes. Infrared spectroscopy is appealing for the study of metal active sites that are not amenable to Raman spectroscopy. We present a combined experimental and theoretical approach to analyze the mid- and far-IR spectra of Cu,Zn-superoxide dismutase (Cu,Zn-SOD) as a probe of the histidine ligands. This metalloenzyme provides a unique model to identify specific IR signatures of metal–histidine coordination and to study their alterations as a function of the metal (copper/zinc), the copper valence state (+I/+II), the histidine coordination mode (Nτ and Nπ) and the histidine protonation state. DFT calculations combined with normal mode descriptions from potential energy distribution calculations were performed on two slightly different cluster models. Differences in the constraints at the side chain of one histidine Cu ligand appreciably modify the geometric parameters and vibrational properties. Electrochemically induced FTIR difference spectroscopy provided mid- and far-IR fingerprint spectra of the Cu protein in aqueous media that are sensitive to the redox state of the Cu centre at the active site. Comparisons of the DFT predictions with the experimental IR modes of the histidine ligands at the Cu,Zn-SOD active site showed that useful mid-IR markers of histidine Nτ and Nπ coordination were predicted with good accuracy. The DFT analysis further demonstrated a link between the ν(C4–C5) mode frequency of His46 and the specific properties of the His46–Cu bond in Cu,Zn-SOD. A combined theoretical and experimental approach on samples in H2O and 2H2O or 15N-labelled samples identified the contributions from the histidine side chain modes in the 669–629 cm⁻¹ region.

    Similarities and differences in the biochemical and enzymological properties of the four isomaltases from Saccharomyces cerevisiae

    The yeast Saccharomyces cerevisiae IMA multigene family encodes four isomaltases sharing high sequence identity, from 65% to 99%. Here, we explore their functional diversity, with exhaustive in-vitro characterization of their enzymological and biochemical properties. The four isoenzymes exhibited a preference for the α-(1,6) disaccharides isomaltose and palatinose, with Michaelis–Menten kinetics and inhibition at high substrate concentration. They were also able to hydrolyze trisaccharides bearing an α-(1,6) linkage, as well as α-(1,2), α-(1,3) and α-(1,5) disaccharides including sucrose, highlighting their substrate ambiguity. While Ima1p and Ima2p presented almost identical characteristics, our results nevertheless showed many singularities within this protein family. In particular, Ima3p presented lower activities and thermostability than Ima2p despite only three different amino acids between the sequences of these two isoforms. The Ima3p_R279Q variant recovered the activity levels of Ima2p, while the Leu-to-Pro substitution at position 240 significantly increased the stability of Ima3p and supported the role of prolines in thermostability. The most distant protein, Ima5p, presented the lowest optimal temperature and was also extremely sensitive to temperature. Isomaltose hydrolysis by Ima5p challenged previous conclusions about the requirement of specific amino acids for determining the specificity for α-(1,6) substrates. Finally, we found a mixed inhibition by maltose for Ima5p while, contrary to previous work, Ima1p inhibition by maltose was competitive at very low isomaltose concentrations and uncompetitive as the substrate concentration increased. Altogether, this work illustrates that a gene family encoding proteins with strong sequence similarities can lead to enzymes with notable differences in biochemical and enzymological properties.

    Line shift, line asymmetry, and the 6Li/7Li isotopic ratio determination

    Accepted for publication in A&A Letters. Context: Line asymmetries are generated by convective Doppler shifts in stellar atmospheres, especially in metal-poor stars, where convective motions penetrate to higher atmospheric levels. Such asymmetries are usually neglected in abundance analyses. The determination of the 6Li/7Li isotopic ratio is prone to suffering from such asymmetries, as the contribution of 6Li is a slight blending reinforcement of the red wing of each component of the corresponding 7Li line, with respect to its blue wing. Aims: The present paper studies the halo star HD 74000 and estimates the impact of convection-related asymmetries on the Li isotopic ratio determination. Method: Two methods are used to meet this aim. The first, which is purely empirical, consists in deriving a template profile from another element that can be assumed to originate in the same stellar atmospheric layers as Li I, producing absorption lines of approximately the same equivalent width as individual components of the 7Li I resonance line. The second method consists in conducting the abundance analysis based on NLTE line formation in a 3D hydrodynamical model atmosphere, taking into account the effects of photospheric convection. Results: The results of the first method show that the convective asymmetry generates an excess absorption in the red wing of the 7Li absorption feature that mimics the presence of 6Li at a level comparable to the hitherto published values. This opens the possibility that only an upper limit on 6Li/7Li has thus far been derived. The second method confirms these findings. Conclusions: From this work, it appears that a systematic reappraisal of former determinations of 6Li abundances in halo stars is warranted.

    ShapeDBA: Generating Effective Time Series Prototypes using ShapeDTW Barycenter Averaging

    Time series data can be found in almost every domain, ranging from the medical field to manufacturing and wireless communication. Generating realistic and useful exemplars and prototypes is a fundamental data analysis task. In this paper, we investigate a novel approach to generating realistic and useful exemplars and prototypes for time series data. Our approach uses a new form of time series average, the ShapeDTW Barycentric Average. We therefore turn our attention to accurately generating time series prototypes with a novel approach. The existing time series prototyping approaches, such as DTW Barycenter Averaging (DBA) and SoftDBA, rely on the Dynamic Time Warping (DTW) similarity measure. These approaches suffer from a common problem of generating out-of-distribution artifacts in their prototypes. This is mostly caused by the DTW variant used and its inability to detect neighborhood similarities; instead, it detects absolute similarities. Our proposed method, ShapeDBA, uses the ShapeDTW variant of DTW, which overcomes this issue. We chose time series clustering, a popular form of time series analysis, to evaluate the outcome of ShapeDBA compared to the other prototyping approaches. Coupled with the k-means clustering algorithm, and evaluated on a total of 123 datasets from the UCR archive, our proposed averaging approach is able to achieve new state-of-the-art results in terms of Adjusted Rand Index. Comment: Published in AALTD workshop at ECML/PKDD 202
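
The contrast drawn above between "absolute" and "neighborhood" similarity can be illustrated with a small sketch of a ShapeDTW-style alignment. Here each point is compared through a descriptor of its local neighborhood instead of its single value. The sketch assumes a raw-subsequence descriptor with edge padding and a Euclidean distance between descriptors; the abstract does not say which descriptor ShapeDBA uses. The names `descriptors` and `shape_dtw_path` are hypothetical, and the barycenter averaging step itself is not shown.

```python
def descriptors(series, half_window):
    """Assumed descriptor: the raw window around each point, edge-padded."""
    n = len(series)
    return [
        [series[min(max(k, 0), n - 1)]
         for k in range(i - half_window, i + half_window + 1)]
        for i in range(n)
    ]


def shape_dtw_path(a, b, half_window=1):
    """Align a and b by running DTW on neighborhood descriptors.

    Returns the warping path as a list of (index_in_a, index_in_b) pairs.
    """
    da, db = descriptors(a, half_window), descriptors(b, half_window)
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost: Euclidean distance between the two descriptors
            local = sum(
                (x - y) ** 2 for x, y in zip(da[i - 1], db[j - 1])
            ) ** 0.5
            cost[i][j] = local + min(
                cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1]
            )
    # Backtrack from the end of both series to recover the alignment
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return path


if __name__ == "__main__":
    print(shape_dtw_path([0, 0, 1, 0], [0, 1, 0, 0]))
```

With `half_window=0` each descriptor reduces to the single point value, so the alignment falls back to ordinary DTW with an absolute-difference cost. Widening the window is what makes the alignment sensitive to local shape.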