177 research outputs found

    Poisson approximation for search of rare words in DNA sequences

    Using recent results on the occurrence times of a string of symbols in a stochastic process with mixing properties, we present a new method for the search of rare words in biological sequences, which are generally modelled by a Markov chain. We obtain a bound on the error between the distribution of the number of occurrences of a word in a sequence (under a Markov model) and its Poisson approximation. A global bound is already given by the Chen-Stein method. Our approach, the psi-mixing method, gives local bounds. Since we only need the error in the tails of the distribution, the global uniform bound of Chen-Stein is too large, and local bounds are preferable. We search for two thresholds on the number of occurrences beyond which the studied word can be regarded as over-represented or under-represented. A biological role is suggested for these over- or under-represented words. Our method gives such thresholds for a much broader panel of words than the Chen-Stein method. Comparing the methods, we observe better accuracy for the psi-mixing method in bounding the tails of the distribution. We also present the software PANOW (available at http://stat.genopole.cnrs.fr/software/panowdir/), dedicated to computing the error term and the thresholds for a studied word. Comment: 29 pages, 0 figures
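
    The two thresholds mentioned in this abstract can be illustrated with a minimal sketch. It assumes the expected number of occurrences `lam` is already known (in the paper it would come from the Markov model) and it ignores the approximation error term entirely, which is the quantity the paper actually bounds. The function names and the significance level `alpha` are hypothetical and are not taken from PANOW.

```python
import math


def poisson_pmf(k, lam):
    """P(N = k) for N ~ Poisson(lam), computed in log space; requires lam > 0."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def rare_word_thresholds(lam, alpha=0.01):
    """Return (b, a) for a Poisson(lam) count N.

    a is the smallest integer with P(N >= a) <= alpha (over-representation).
    b is the largest integer with P(N <= b) <= alpha (under-representation),
    or None when even P(N = 0) exceeds alpha.
    Assumption: the Poisson approximation is treated as exact (no error term).
    """
    # Upper threshold: accumulate the CDF until the upper tail drops to alpha.
    a, cdf = 0, 0.0  # invariant: cdf == P(N <= a - 1)
    while 1.0 - cdf > alpha:
        cdf += poisson_pmf(a, lam)
        a += 1

    # Lower threshold: extend b while the lower tail stays within alpha.
    b, cdf = -1, 0.0  # invariant: cdf == P(N <= b)
    while cdf + poisson_pmf(b + 1, lam) <= alpha:
        cdf += poisson_pmf(b + 1, lam)
        b += 1

    return (b if b >= 0 else None), a


if __name__ == "__main__":
    print(rare_word_thresholds(10.0, alpha=0.01))
```

    With `lam = 10` and `alpha = 0.01`, a word would be flagged as under-represented at 2 or fewer occurrences and as over-represented at 19 or more. A rigorous threshold would also have to absorb the local error bound on each tail.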

    Sharp errors for point-wise Poisson approximations in mixing processes

    We describe the statistics of the number of occurrences of a string of symbols in a stochastic process: taking a string A of length n, we prove that the number of visits to A up to time t, denoted by N_t, has approximately a Poisson distribution. We provide a sharp error for this approximation. In contrast to previous works, which present uniform error terms based on the total variation distance, our error is point-wise. As a byproduct we obtain that all the moments of N_t are finite. Moreover, we obtain explicit approximations for all of them. Our result holds for processes that satisfy the φ-mixing condition. The error term is explicitly expressed as a function of the rate function φ and is easily computable.

    SMM: An R Package for Estimation and Simulation of Discrete-time semi-Markov Models

    Semi-Markov models, independently introduced by Lévy (1954), Smith (1955) and Takacs (1954), are a generalization of the well-known Markov models. For semi-Markov models, sojourn times can be arbitrarily distributed, while sojourn times of Markov models are constrained to be exponentially distributed (in continuous time) or geometrically distributed (in discrete time). The aim of this paper is to present the R package SMM, devoted to the simulation and estimation of discrete-time multi-state semi-Markov and Markov models. For the semi-Markov case we have considered: parametric and non-parametric estimation; with and without censoring at the beginning and/or at the end of sample paths; one or several independent sample paths. Several discrete-time distributions are considered for the parametric estimation of sojourn time distributions of semi-Markov chains: Uniform, Geometric, Poisson, Discrete Weibull and Negative Binomial.
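
    The contrast drawn here between geometric and arbitrary sojourn times can be sketched in a few lines. The snippet below is an illustration written in Python with the standard library only. It does not use or reproduce the API of the R package SMM, and every name in it (`simulate_smc`, `geometric`, `uniform_on`) is hypothetical.

```python
import random


def geometric(p):
    """Sojourn sampler on {1, 2, ...} with success probability p (the Markov case)."""
    def sample(rng):
        k = 1
        while rng.random() >= p:
            k += 1
        return k
    return sample


def uniform_on(n):
    """Sojourn sampler uniform on {1, ..., n} (a non-geometric, semi-Markov case)."""
    return lambda rng: rng.randint(1, n)


def simulate_smc(states, jump_probs, sojourn_samplers, length, seed=0):
    """Simulate `length` steps of a discrete-time semi-Markov chain.

    jump_probs[s][t] is the probability of jumping from s to t (t != s);
    sojourn_samplers[s] draws the number of steps spent in s before jumping.
    Assumption: the initial state is drawn uniformly at random.
    """
    rng = random.Random(seed)
    state = rng.choice(states)
    path = []
    while len(path) < length:
        stay = sojourn_samplers[state](rng)
        path.extend([state] * stay)
        targets = [s for s in states if s != state]
        weights = [jump_probs[state][s] for s in targets]
        state = rng.choices(targets, weights=weights)[0]
    return path[:length]  # the last sojourn may be cut short (right censoring)
```

    Passing `geometric(p)` for every state gives an ordinary Markov chain, while any other sampler, such as `uniform_on(3)`, gives a semi-Markov chain that is no longer Markov. Truncating the final sojourn mirrors the censoring at the end of sample paths that the abstract mentions.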

    A multi-physics and multi-phase model of lubricated contact

    Many assumptions are classically used to model fluid behaviour in a lubricated contact: continuous film, constant viscosity across the film thickness, a film that is very thin compared with the other contact dimensions, a Newtonian lubricant, and so on. However, some of them are not well founded for studying Elasto-HydroDynamic (EHD) contacts with high sliding or for estimating the lubricant distribution at the exit of the contact. An original numerical approach, based on the general fluid mechanics equations and taking into account the fluid/solid coupling and thermal effects, is developed here in order to bring more physical insight to the usual models. First, the influence of thermal effects on the evolution of friction in Thermo-EHD contacts is demonstrated. The friction minimum found in the pure sliding case is explained by analyzing the heat transfer between the lubricant and the solids. The origin of the observed local modifications of the film thickness, and the very existence of a lubricant film at zero entrainment velocity, are related to the presence of a high viscosity gradient across the film. A qualitative comparison with experimental data from the literature validates the trends obtained. Second, the free-surface flow of the lubricant around the contact is studied experimentally and then numerically with a diffuse interface method. The role of capillary effects on the position of the air/lubricant meniscus is analyzed, and the numerical results are compared with results from the literature; good agreement is found both qualitatively and quantitatively. Validated by this numerical two-phase (air/lubricant) study, a simplified analytical model is then developed, giving a law that predicts the lubricant distribution at the exit of the contact. The exit zone of EHD contacts is then treated with a vaporous cavitation model, highlighting the need to take into account the surrounding air and surface wettability. Finally, a first three-dimensional model of the free-surface flow of the lubricant around a point contact is presented, showing the influence of capillary effects and the feasibility of such an approach.

    Sharp error terms for return time statistics under mixing conditions

    We describe the statistics of repetition times of a string of symbols in a stochastic process. Denote by T(A) the time elapsed until the process spells the finite string A and by S(A) the number of consecutive repetitions of A. We prove that, if the length of the string grows unboundedly, (1) the distribution of T(A), when the process starts with A, is well approximated by a certain mixture of the point measure at the origin and an exponential law, and (2) S(A) is approximately geometrically distributed. We provide sharp error terms for each of these approximations. The errors we obtain are point-wise and also yield approximations for all the moments of T(A) and S(A). To obtain (1) we assume that the process is phi-mixing, while to obtain (2) we assume the convergence of certain conditional probabilities.

    Study of the entrapment of solid contaminants in EHD contacts

    This study analyzes the particle entrapment phenomena that occur in ball bearings operating with contaminated lubricant. Both numerical and experimental tools provided a better understanding of a recurrent problem with disastrous consequences. Numerical simulations as well as tests on a twin-disc machine revealed key parameters that directly influence the contaminant entrapment rate. Tests with different materials and different geometries made it possible to quantify the impact of the parameters of an elastohydrodynamic contact on particle entrapment.

    Mathematical modeling at the livestock-wildlife interface: scoping review of drivers of disease transmission between species

    Modeling of infectious diseases at the livestock-wildlife interface is a unique subset of mathematical modeling with many innate challenges. To ascertain the characteristics of the models used in these scenarios, a scoping review of the scientific literature was conducted. Fifty-six studies qualified for inclusion. Only 14 diseases at this interface have benefited from mathematical modeling, despite a far greater number of shared diseases. The most represented species combinations were cattle and badgers (for bovine tuberculosis, 14), and pigs and wild boar [for African (8) and classical (3) swine fever, and foot-and-mouth disease (1)]. Assessing control strategies was the overwhelming primary research objective (27), with most studies examining control strategies applied to wildlife hosts and the effect on domestic hosts (10) or on both wild and domestic hosts (5). In spatially-explicit models, while livestock species can often be represented through explicit and identifiable location data (such as farm, herd, or pasture locations), wildlife locations are often inferred using habitat suitability as a proxy. Though using habitat suitability to represent wildlife presence involves innate assumptions that may not be fully accurate, the parsimony principle plays a large role in modeling diseases at this interface, especially for wildlife, where parameters are difficult to document or require a high level of data for inference. Explaining observed transmission dynamics was another common model objective, though the relative contribution of involved species to epizootic propagation was only ascertained in a few models. More direct evidence of disease spill-over, as can be obtained through genomic approaches based on pathogen sequences, could be a useful complement to further inform such modeling. 
As computational and programmatic capabilities advance, the resolution of the models and data used in these models will likely be able to increase as well, with a potential goal being the linking of modern complex ecological models with the depth of dynamics responsible for pathogen transmission. Controlling diseases at this interface is a critical step toward improving both livestock and wildlife health, and mechanistic models are becoming increasingly used to explore the strategies needed to confront these diseases

    Map Style Formalization: Rendering Techniques Extension for Cartography

    Cartographic design requires controllable methods and tools to produce maps that are adapted to users' needs and preferences. The formalized rules and constraints for cartographic representation come mainly from the conceptual framework of graphic semiology. Most current Geographical Information Systems (GIS) rely on the Styled Layer Descriptor and Symbology Encoding (SLD/SE) specifications, which provide an XML schema describing the styling rules to be applied to geographic data to draw a map. Although this formalism is relevant for most usages in cartography, it fails to describe complex cartographic and artistic styles. In order to overcome these limitations, we propose an extension of the existing SLD/SE specifications to manage extended map stylizations by means of controllable expressive methods. Inspired by artistic and cartographic sources (Cassini maps, mountain maps, artistic movements, etc.), we propose to integrate into our system three main expressive methods: linear stylization, patch-based region filling and vector texture generation. We demonstrate in several examples how our pipeline allows map rendering to be personalized with expressive methods.