77 research outputs found

    Estimating the joint distribution of independent categorical variables via model selection

    Assume one observes independent categorical variables or, equivalently, the corresponding multinomial variables. Estimating the distribution of the observed sequence amounts to estimating the expectation of the multinomial sequence. A new estimator for this mean is proposed that is nonparametric, non-asymptotic and implementable even for large sequences. It is a penalized least-squares estimator based on wavelets, with a penalization term inspired by papers of Birgé and Massart. The estimator is proved to satisfy an oracle inequality and to be adaptive in the minimax sense over a class of Besov bodies. The method is embedded in a general framework which also allows us to recover an existing method for segmentation. Beyond theoretical results, a simulation study is reported and an application on real data is provided. Comment: Published in Bernoulli (http://isi.cbs.nl/bernoulli/) at http://dx.doi.org/10.3150/08-BEJ155 by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)

    Estimating the number of change-points in a two-dimensional segmentation model without penalization

    In computational biology, numerous recent studies have been dedicated to the analysis of the chromatin structure within the cell by two-dimensional segmentation methods. Motivated by this application, we consider the problem of retrieving the diagonal blocks in a matrix of observations. The theoretical properties of the least-squares estimators of both the boundaries and the number of blocks proposed by Lévy-Leduc et al. [2014] are investigated. More precisely, the contribution of the paper is to establish the consistency of these estimators. A surprising consequence of our results is that, contrary to the one-dimensional case, a penalty is not needed for retrieving the true number of diagonal blocks. Finally, the results are illustrated on synthetic data. Comment: 30 pages, 8 figures

    Information criteria for inhomogeneous spatial point processes

    The theoretical foundation for a number of model selection criteria is established in the context of inhomogeneous point processes and under various asymptotic settings: infill, increasing domain, and combinations of these. For inhomogeneous Poisson processes we consider the Akaike information criterion and the Bayesian information criterion, and in particular we identify the point process analogue of the sample size needed for the Bayesian information criterion. For general inhomogeneous point processes we derive new composite likelihood and composite Bayesian information criteria for selecting a regression model for the intensity function. The proposed model selection criteria are evaluated using simulations of Poisson processes and cluster point processes. Comment: 6 figures
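
For context, the two classical criteria named in this abstract have the standard textbook forms below. Here $\hat{L}$ is the maximized likelihood, $k$ the number of free parameters and $n$ the sample size. These are the usual definitions for independent observations, not the paper's point-process versions, for which (per the abstract) an analogue of $n$ has to be identified.

```latex
\begin{align*}
\mathrm{AIC} &= -2\log\hat{L} + 2k, \\
\mathrm{BIC} &= -2\log\hat{L} + k\log n.
\end{align*}
```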

    A segmentation/clustering model for the analysis of array CGH data

    Microarray-CGH (comparative genomic hybridization) experiments are used to detect and map chromosomal imbalances. A CGH profile can be viewed as a succession of segments that represent homogeneous regions in the genome whose representative sequences share the same relative copy number on average. Segmentation methods constitute a natural framework for the analysis, but they do not provide a biological status for the detected segments. We propose a new model for this segmentation/clustering problem, combining a segmentation model with a mixture model. We present a new hybrid algorithm called dynamic programming-expectation maximization (DP-EM) to estimate the parameters of the model by maximum likelihood. This algorithm combines DP and the EM algorithm. We also propose a model selection heuristic to select the number of clusters and the number of segments. An example of our procedure is presented, based on publicly available data sets. We compare our method to segmentation methods and to hidden Markov models, and we show that the new segmentation/clustering model is a promising alternative that can be applied in the more general context of signal processing.
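
The dynamic-programming ingredient mentioned in this abstract can be illustrated with a generic least-squares segmentation of a one-dimensional profile. The sketch below is not the paper's DP-EM algorithm: it has no mixture model and no EM step, and the function name `segment` and its return convention are assumptions made for illustration only.

```python
def segment(signal, n_segments):
    """Split `signal` into `n_segments` contiguous pieces minimizing the
    total squared deviation of each piece from its own mean.

    Returns (change_points, cost), where change_points are the start
    indices of segments 2..n_segments. Hypothetical helper, O(K * n^2).
    """
    n = len(signal)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(signal)")

    # Prefix sums of x and x^2 give any segment's cost in O(1).
    s1, s2 = [0.0], [0.0]
    for x in signal:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def cost(i, j):
        # Sum of squared deviations of signal[i:j] from its mean (i < j).
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[k][j]: minimal cost of splitting signal[:j] into k segments.
    best = [[inf] * (n + 1) for _ in range(n_segments + 1)]
    back = [[0] * (n + 1) for _ in range(n_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                candidate = best[k - 1][i] + cost(i, j)
                if candidate < best[k][j]:
                    best[k][j] = candidate
                    back[k][j] = i

    # Walk the back-pointers from the end to recover segment starts.
    starts = []
    j = n
    for k in range(n_segments, 0, -1):
        j = back[k][j]
        starts.append(j)
    starts.reverse()
    return starts[1:], best[n_segments][n]


change_points, total_cost = segment([0, 0, 0, 5, 5, 5, 1, 1], 3)
print(change_points, total_cost)
```

Choosing the number of segments, which the abstract handles with a model selection heuristic, is deliberately left outside this sketch.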

    Evaluation of relevance of stochastic parameters on Hidden Markov Models

    Predicting a particular physical phenomenon relies on knowledge of that phenomenon, which helps us conceptualize it through different models. Hidden Markov Models (HMMs) can be used to model complex processes, and such models serve as tools in fault diagnosis systems. Industrial robots operating in stochastic environments now need fault detection to prevent breakdowns. In this paper, we evaluate the relevance of Hidden Markov Model parameters without a priori knowledge. After a brief introduction to Hidden Markov Models, we present the model selection criteria most used in the current literature and some methods for evaluating the relevance of stochastic events resulting from Hidden Markov Models. We support our study with an example of a simulated industrial process, using the synthetic model from Vrignat's study (Vrignat 2010). We then evaluate the output parameters of the various tested models on this process to identify the most relevant model.

    Single-Step Syngas-to-Distillates (S2D) Synthesis via Methanol and Dimethyl Ether Intermediates: Final Report

    The objective of the work was to enhance price-competitive, synthesis gas (syngas)-based production of transportation fuels that are directly compatible with the existing vehicle fleet (i.e., vehicles fueled by gasoline, diesel, jet fuel, etc.). To accomplish this, modifications to the traditional methanol-to-gasoline (MTG) process were investigated. In this study, we investigated direct conversion of syngas to distillates using methanol and dimethyl ether intermediates. For this application, a Pd/ZnO/Al2O3 (PdZnAl) catalyst previously developed for methanol steam reforming was evaluated. The PdZnAl catalyst was shown to be far superior to a conventional copper-based methanol catalyst when operated at relatively high temperatures (i.e., >300°C), which are necessary for MTG-type applications. Catalytic performance was evaluated through parametric studies. Process conditions such as temperature, pressure, gas hourly space velocity, and syngas feed ratio (i.e., hydrogen:carbon monoxide) were investigated. The PdZnAl catalyst formulation also was optimized to maximize conversion and selectivity to methanol and dimethyl ether while suppressing methane formation. Thus, a PdZn/Al2O3 catalyst optimized for methanol and dimethyl ether formation was developed through combined catalytic material and process parameter exploration. However, even after compositional optimization, a significant amount of undesirable carbon dioxide was produced (formed via the water-gas-shift reaction), and some degree of methane formation could not be avoided. Pd/ZnO/Al2O3 used in combination with ZSM-5 was investigated for direct syngas-to-distillates conversion. High conversion was achieved because thermodynamic constraints are alleviated when methanol and dimethyl ether are intermediates for hydrocarbon formation. When methanol and/or dimethyl ether are products formed separately, equilibrium restrictions occur. Thermodynamic relaxation also enables the use of lower operating pressures than what would be allowed for methanol synthesis alone. Aromatic-rich hydrocarbon liquid (C5+), containing a significant amount of methylated benzenes, was produced under these conditions. However, selectivity control to liquid hydrocarbons was difficult to achieve. Carbon dioxide and methane formation was problematic. Furthermore, saturation of the olefinic intermediates formed in the zeolite, and necessary for gasoline production, occurred over PdZnAl. Thus, yield to desirable hydrocarbon liquid product was limited. Evaluation of other oxygenate-producing catalysts could possibly lead to future advances. Potential exists with discovery of other types of catalysts that suppress carbon dioxide and light hydrocarbon formation. Comparative techno-economics for a single-step syngas-to-distillates process and a more conventional MTG-type process were investigated. Results suggest that only modest operating and capital cost savings could be achieved, given future improvements to catalyst performance. Sensitivity analysis indicated that increased single-pass yield to hydrocarbon liquid is a primary need for this process to achieve cost competitiveness.

    Current tidal power technologies and their suitability for applications in coastal and marine areas

    A considerable body of research is currently being performed to quantify available tidal energy resources and to develop efficient devices with which to harness them. This work is naturally focussed on maximising power generation from the most promising sites, and a review of the literature suggests that the potential for smaller scale, local tidal power generation from shallow near-shore sites has not yet been investigated. If such generation is feasible, it could provide sustainable electricity for nearby coastal homes and communities as part of a distributed generation strategy, and would benefit from easier installation and maintenance, lower cabling and infrastructure requirements and reduced capital costs when compared with larger scale projects. This article reviews tidal barrages and lagoons, tidal turbines, oscillating hydrofoils and tidal kites to assess their suitability for small-scale electricity generation in shallow waters. This is achieved by discussing the power density, scalability, durability, maintainability, economic potential and environmental impacts of each concept. The performance of each technology in each criterion is scored against axial-flow turbines, allowing them to be ranked according to their overall suitability. The review suggests that tidal kites and range devices are not suitable for small-scale shallow water applications due to depth and size requirements respectively. Cross-flow turbines appear to be the most suitable technology, as they have high power densities and a maximum size that is not constrained by water depth.