16 research outputs found

    Untangling hotel industry’s inefficiency: An SFA approach applied to a renowned Portuguese hotel chain

    The present paper explores the technical efficiency of four hotels from the Teixeira Duarte Group, a renowned Portuguese hotel chain. An efficiency ranking of these four hotel units, all located in Portugal, is established using Stochastic Frontier Analysis. This methodology discriminates between measurement error and systematic inefficiency in the estimation process, making it possible to investigate the main causes of inefficiency. Several suggestions for improving efficiency are offered for each hotel studied.
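    For context, the canonical stochastic frontier model of Aigner, Lovell and Schmidt (1977) separates the two error components the abstract refers to; the paper does not state its exact specification, so the following is the standard formulation:

```latex
% Standard stochastic frontier production model
\ln y_i = \mathbf{x}_i^{\top}\boldsymbol{\beta} + v_i - u_i,
\qquad v_i \sim \mathcal{N}(0,\sigma_v^2),
\qquad u_i \sim \mathcal{N}^{+}(0,\sigma_u^2),
```

    where $y_i$ is the output of hotel $i$, $\mathbf{x}_i$ its inputs, $v_i$ symmetric measurement noise, and $u_i \ge 0$ the one-sided systematic inefficiency term; technical efficiency is then $TE_i = \exp(-u_i)$.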

    Exploiting Spatio-Temporal Coherence for Video Object Detection in Robotics

    This paper proposes a method to enhance video object detection in indoor environments for robotics. Concretely, it exploits knowledge of the camera motion between frames to propagate previously detected objects to successive frames. The proposal is rooted in the concepts of planar homography, used to propose regions of interest in which to find objects, and recursive Bayesian filtering, used to integrate observations over time. The proposal is evaluated on six virtual indoor environments, covering the detection of nine object classes over a total of ∼7k frames. Results show that our proposal improves recall and F1-score by factors of 1.41 and 1.27, respectively, and achieves a significant reduction (58.8%) in object-categorization entropy compared with a two-stage video object detection method used as the baseline, at the cost of a small time overhead (120 ms) and a small precision loss (0.92).
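    A minimal sketch of the propagation step, assuming a known 3×3 homography `H` between consecutive frames (the function names and box representation are illustrative, not taken from the paper):

```python
import numpy as np
import cv2

def propagate_boxes(boxes, H):
    """Warp detection boxes from the previous frame into the current one
    via the planar homography H induced by the known camera motion."""
    propagated = []
    for (x1, y1, x2, y2) in boxes:
        corners = np.float32([[x1, y1], [x2, y1],
                              [x2, y2], [x1, y2]]).reshape(-1, 1, 2)
        warped = cv2.perspectiveTransform(corners, H).reshape(-1, 2)
        # Axis-aligned hull of the warped corners as the propagated region.
        propagated.append((*warped.min(axis=0), *warped.max(axis=0)))
    return propagated

def bayes_update(prior, likelihood):
    """Recursive Bayesian update of a box's class distribution:
    posterior is proportional to prior times detector confidence."""
    posterior = prior * likelihood
    return posterior / posterior.sum()
```

    The propagated regions then serve as proposals for the detector in the next frame, and `bayes_update` integrates the per-class confidences over time.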

    Performance modelling and validation of biomass gasifiers for trigeneration plants

    This thesis develops a simple but rigorous model for the simulation, design, and preliminary evaluation of trigeneration plants based on biomass gasification. It includes a review and study of various models proposed for the biomass gasification process and different plant configurations. A modified thermodynamic equilibrium model is developed for application to real processes that do not reach equilibrium. In addition, two artificial neural network models, based on published experimental data, are developed: one for BFB gasifiers and one for CFB gasifiers. Both models make it possible to evaluate the influence of variations in biomass and operating conditions on the quality of the gas produced. These models are integrated into the global model of a small-to-medium-scale biomass gasification trigeneration plant, and three configurations are proposed for the generation of electricity, heat, and cold. These configurations are applied to a case study of the ST-2 polygeneration plant foreseen in Cerdanyola del Vallès.
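    The abstract does not give the model equations, but a common way to "modify" an equilibrium model for real gasifiers, shown here for the water-gas shift reaction, is to scale the equilibrium constant by an empirical correction factor fitted to plant data (an assumed illustrative form, not necessarily the one used in the thesis):

```latex
\mathrm{CO} + \mathrm{H_2O} \;\rightleftharpoons\; \mathrm{CO_2} + \mathrm{H_2},
\qquad
\frac{y_{\mathrm{CO_2}}\, y_{\mathrm{H_2}}}{y_{\mathrm{CO}}\, y_{\mathrm{H_2O}}}
= \beta \, K_{\mathrm{wgs}}(T),
```

    where $\beta = 1$ recovers ideal equilibrium and $\beta \neq 1$ captures the departure from equilibrium observed in real BFB and CFB gasifiers.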

    Exploiting random projections and sparsity with random forests and gradient boosting methods - Application to multi-label and multi-output learning, random forest model compression and leveraging input sparsity

    Within machine learning, supervised learning aims at modeling the input-output relationship of a system from past observations of its behavior. Decision trees characterize the input-output relationship through a series of nested "if-then-else" questions (the testing nodes) leading to a set of predictions (the leaf nodes). Several such trees are often combined for state-of-the-art performance: random forest ensembles average the predictions of randomized decision trees trained independently in parallel, while tree boosting ensembles train decision trees sequentially to refine the predictions made by the previous ones. The emergence of new applications requires supervised learning algorithms that scale, in computational power and memory, with the number of inputs, outputs, and observations, without sacrificing accuracy. In this thesis, we identify three main areas where decision tree methods can be improved, and we provide and evaluate original algorithmic solutions for each: (i) learning over high-dimensional output spaces, (ii) learning from large datasets under stringent memory constraints at prediction time, and (iii) learning over high-dimensional sparse input spaces.

    A first approach to learning tasks with a high-dimensional output space, called binary relevance or single target, trains one decision tree ensemble per output. However, it completely neglects potential correlations between the outputs. An alternative approach, multi-output decision trees, fits a single decision tree ensemble targeting all outputs simultaneously, assuming that all outputs are correlated. Both approaches, however, (i) have exactly the same computational complexity and (ii) target extreme output-correlation structures. In our first contribution, we show how to combine random projection of the output space, a dimensionality reduction method, with the random forest algorithm, decreasing the learning time complexity. Accuracy is preserved, and may even improve by reaching a different bias-variance tradeoff. In our second contribution, we first formally adapt the gradient boosting ensemble method to multi-output supervised learning tasks such as multi-output regression and multi-label classification. We then propose combining single random projections of the output space with gradient boosting on such tasks to adapt automatically to the output-correlation structure.

    The random forest algorithm often generates large ensembles of complex models thanks to the availability of a large number of observations. However, the space complexity of such models, proportional to their total number of nodes, is often prohibitive, so these models are ill suited to stringent memory constraints at prediction time. In our third contribution, we propose compressing these ensembles by solving an L1-regularized problem over the set of indicator functions defined by all their nodes.

    Some supervised learning tasks have a high-dimensional but sparse input space, where each observation has non-zero values for only a few input variables. Standard decision tree implementations are not well adapted to sparse input spaces, unlike other supervised learning techniques such as support vector machines or linear models. In our fourth contribution, we show how to exploit input-space sparsity algorithmically within decision tree methods. Our implementation yields a significant speed-up on both synthetic and real datasets while producing exactly the same model, and it reduces the memory required to grow such models by using sparse instead of dense storage for the input matrix.
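    A minimal sketch of the first contribution's core idea, assembled from scikit-learn building blocks (the thesis has its own implementation; the data sizes and the pseudo-inverse decoding below are illustrative choices):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.random_projection import GaussianRandomProjection

# Toy multi-output problem: 500 samples, 20 inputs, 1000 outputs.
rng = np.random.RandomState(0)
X, Y = rng.randn(500, 20), rng.randn(500, 1000)

# 1. Gaussian random projection of the output space to q << d dimensions.
proj = GaussianRandomProjection(n_components=50, random_state=0)
Y_low = proj.fit_transform(Y)                     # (500, 50)

# 2. One multi-output forest is fitted on the projected outputs, which
#    lowers the cost of evaluating candidate splits during learning.
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X, Y_low)

# 3. Decode predictions back to the original output space through the
#    pseudo-inverse of the projection matrix.
P = proj.components_                              # (50, 1000)
Y_hat = forest.predict(X) @ np.linalg.pinv(P.T)   # (500, 1000)
```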

    Analysis of hydrological extreme events with consideration of uncertainties

    In engineering practice for flood risk assessment, it is of primary importance to provide an accurate design flood estimate corresponding to a given risk level. Developing efficient methodologies for assessing flood quantiles in ungauged river basins requires a focus on Uncertainty Quantification (UQ). While the uncertainty of model parameters and observed measurements is the subject of relevant, ongoing research, in assessing the uncertainty of the design flood we deal with the uncertainty of the model output. In this thesis, flood quantiles and the predictive uncertainty of these variables are evaluated with two different models. Within the framework of regional flood frequency analysis, the Top-Kriging interpolation technique is used, and the results are compared with flood quantile estimates from an at-site flood frequency analysis. Moreover, an identification procedure for the uncertain parameters of the distributed hydrological model MOBIDIC (MOdello di Bilancio Idrologico DIstribuito e Continuo) is developed.

    Efficient tools for parameter identification and for tracking the evolution of uncertainty in hydrological modelling are investigated. While Monte Carlo and related techniques, i.e. sampling or ensemble procedures, are well known, methods based on functional approximation, where the unknown random variables (RVs) are represented as functions of known and simpler independent RVs, are very recent and can help to accelerate the Bayesian update. To find the Bayesian solution of the inverse problem, the Ensemble Kalman Filter (EnKF) and Wiener's Polynomial Chaos Expansion (PCE) methods are compared. The numerical evaluation of the analyzed Bayesian updating methods is carried out with reference to the hydrological model MOBIDIC. The proposed methodologies are applied to a case study of the Arno river basin in Tuscany, Italy. The actual values of some model parameters are described in a Bayesian way through a probabilistic model: the parameters are treated as RVs, and the impact of errors, or uncertainty, in the data is investigated. The accuracy of the different models is quantified, and the results from the interpolation techniques and from the hydrological model MOBIDIC are compared. Finally, there is a preliminary discussion of how to convey the results of UQ to stakeholders and communicate the outcomes for flood risk assessment.
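    For orientation, in the linear-Gaussian setting the first of the two compared schemes reduces to the following analysis step (a generic textbook sketch of the stochastic EnKF, not code from the thesis):

```python
import numpy as np

def enkf_update(ensemble, H, d, R, rng):
    """One stochastic Ensemble Kalman Filter analysis step.

    ensemble : (n_params, n_members) prior parameter ensemble
    H        : (n_obs, n_params) linear(ised) observation operator
    d        : (n_obs,) observed data
    R        : (n_obs, n_obs) observation-error covariance
    """
    n = ensemble.shape[1]
    A = ensemble - ensemble.mean(axis=1, keepdims=True)   # ensemble anomalies
    P = A @ A.T / (n - 1)                                 # sample covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)          # Kalman gain
    # Perturb the data so the updated ensemble has the right posterior spread.
    D = d[:, None] + rng.multivariate_normal(np.zeros(len(d)), R, size=n).T
    return ensemble + K @ (D - H @ ensemble)
```

    PCE-based updating instead replaces the sampled ensemble with a functional (polynomial) representation of the uncertain parameters, which is what allows the Bayesian update to be accelerated.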

    Bayesian inference for protein signalling networks

    Cellular response to a changing chemical environment is mediated by a complex system of interactions involving molecules such as genes, proteins and metabolites. In particular, genetic and epigenetic variation ensures that cellular response is often highly specific to individual cell types, or to different patients in the clinical setting. Conceptually, cellular systems may be characterised as networks of interacting components together with biochemical parameters specifying rates of reaction. Taken together, the network and parameters form a predictive model of cellular dynamics which may be used to simulate the effect of hypothetical drug regimens. In practice, however, both network topology and reaction rates remain partially or entirely unknown, depending on individual genetic variation and environmental conditions. Prediction under parameter uncertainty is a classical statistical problem. Yet doubly uncertain prediction, where both parameters and the underlying network topology are unknown, leads to highly non-trivial probability distributions which currently require gross simplifying assumptions to analyse. Recent advances in molecular assay technology now permit high-throughput, data-driven studies of cellular dynamics. This thesis sought to develop novel statistical methods in this context, focussing primarily on the problems of (i) elucidating biochemical network topology from assay data and (ii) predicting dynamical response to therapy when both network and parameters are uncertain.
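    In standard Bayesian model-averaging notation, the doubly uncertain prediction described above marginalises over both the network topology $G$ and the reaction-rate parameters $\theta$ given data $\mathcal{D}$ (the thesis's exact formulation may differ):

```latex
p(y^{*} \mid \mathcal{D})
= \sum_{G} \int p(y^{*} \mid \theta, G)\,
  p(\theta \mid G, \mathcal{D})\, p(G \mid \mathcal{D})\,\mathrm{d}\theta ,
```

    and it is the intractability of this joint sum and integral that forces the gross simplifying assumptions mentioned above.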

    Decadal sea-level changes in the Baltic Sea

    No full text

    Statistical Analysis and Forecasting of Economic Structural Change

    In 1984, the University of Bonn (FRG) and IIASA created a joint research group to analyze the relationship between economic growth and structural change. The research team was to examine the commodity composition as well as the size and direction of commodity and credit flows among countries and regions. Krelle (1988) reports on the results of this "Bonn-IIASA" research project. At the same time, an informal IIASA Working Group was initiated to deal with problems of the statistical analysis of economic data in the context of structural change: What tools do we have to identify nonconstancy of model parameters? What types of models are particularly applicable to nonconstant structure? How is forecasting affected by the presence of nonconstant structure? What problems should be anticipated in applying these tools and models? Some 50 experts, mainly statisticians and econometricians from about 15 countries, came together in Lodz, Poland (May 1985); Berlin, GDR (June 1986); and Sulejov, Poland (September 1986) to present and discuss their findings. This volume contains a selected set of those conference contributions as well as several specially invited chapters. The introductory chapter, "What can statistics contribute to the analysis of economic structural change?", discusses not only the role of statistics in the detection and assimilation of structural changes, but also the relevance of the respective methods in the evaluation of econometric models. Trends in the development of these methods are indicated, and the contributions to the present volume are put into the broader context of empirical economics to help bridge the gap between economists and statisticians. The chapters in the first section are concerned with the detection of parameter nonconstancy. The procedures discussed range from classical methods, such as the CUSUM test, to new concepts, particularly those based on nonparametric statistics. Several chapters assess the conditions under which these methods can be applied and their robustness under such conditions. The second section addresses models that are in some sense generalizations of nonconstant-parameter models, so that they can assimilate structural changes. The last section deals with real-life structural change situations.
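    As a concrete illustration of the classical end of that toolbox, the CUSUM test mentioned above can be run in a few lines (a generic example using the statsmodels implementation of the CUSUM test on OLS residuals, not taken from the volume):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import breaks_cusumolsresid

# Simulated regression whose slope shifts halfway through the sample.
rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)
beta = np.where(np.arange(n) < n // 2, 1.0, 2.0)   # structural change at t = 100
y = beta * x + rng.normal(scale=0.5, size=n)

# CUSUM test on residuals of a constant-parameter OLS fit: a large
# statistic (small p-value) signals nonconstancy of the parameters.
resid = sm.OLS(y, sm.add_constant(x)).fit().resid
stat, pvalue, crit = breaks_cusumolsresid(resid, ddof=2)
print(f"sup|CUSUM| = {stat:.3f}, p-value = {pvalue:.4f}")
```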

    Literature review of the remote sensing of natural resources

    A bibliography on remote sensing techniques is presented. Abstracts from recent periodicals are included, along with author and keyword indexes.