50 research outputs found

    Confidence ellipsoids based on a general family of shrinkage estimators for a linear model with non-spherical disturbances

    This paper considers a general family of Stein-rule estimators for the coefficient vector of a linear regression model with nonspherical disturbances, and derives estimators for the mean squared error (MSE) matrix and the risk under quadratic loss for this family of estimators. Confidence ellipsoids for the coefficient vector based on this family of estimators are proposed, and their performance is investigated under the criteria of coverage probability and expected volume. The results of a numerical simulation are presented to illustrate the theoretical findings, which could be applicable in the area of economic growth modeling.
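To illustrate the shrinkage idea underlying this family of estimators, the following is a minimal sketch of a positive-part Stein-rule (James-Stein type) estimator applied to least-squares coefficients. The simulated design, coefficient values, and shrinkage constant are illustrative assumptions, not the paper's general family with nonspherical disturbances:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 6
X = rng.normal(size=(n, k))
beta = np.full(k, 0.3)
y = X @ beta + rng.normal(size=n)

# Ordinary least squares fit
b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
s2 = np.sum((y - X @ b_ols) ** 2) / (n - k)   # residual variance estimate

# Stein-rule shrinkage toward the origin: scale OLS down by a data-driven factor.
# c = k - 2 is the classical James-Stein constant (requires k >= 3).
c = k - 2
quad = b_ols @ (X.T @ X) @ b_ols              # weighted length of the OLS estimate
shrink = max(0.0, 1.0 - c * s2 / quad)        # positive-part rule
b_js = shrink * b_ols
```

The positive-part truncation (`max(0.0, ...)`) prevents the sign reversal that the raw Stein rule can produce when the OLS estimate is very small relative to its noise.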

    Stein-Like Estimation and Inference.

    The dissertation addresses three issues in the use of Stein-like estimators of the classical normal linear regression model. The St. Louis equation is used to generate out-of-sample forecasts using least squares. These forecasts are compared with those produced by restricted least squares, pretest estimators, and members of a general family of minimax shrinkage estimators under the root-mean-square error criterion. Bootstrap confidence intervals and ellipsoids are constructed, centered at the least squares and James-Stein estimators, and their coverage probability and size are explored in a Monte Carlo experiment. A Stein-like estimator of the probit regression model is suggested and its quadratic risk properties are explored in a Monte Carlo experiment.

    Non-Parametric Bayesian Methods for Linear System Identification

    Recent contributions have tackled the linear system identification problem by means of non-parametric Bayesian methods, built on widely adopted machine learning techniques such as Gaussian Process regression and kernel-based regularized regression. Following the Bayesian paradigm, these procedures treat the impulse response of the system to be estimated as the realization of a Gaussian process. Typically, a Gaussian prior accounting for stability and smoothness of the impulse response is postulated, as a function of some parameters (called hyper-parameters in the Bayesian framework). These are generally estimated by maximizing the so-called marginal likelihood, i.e. the likelihood after the impulse response has been marginalized out. Once the hyper-parameters have been fixed in this way, the final estimator is computed as the conditional expected value of the impulse response w.r.t. the posterior distribution, which coincides with the minimum variance estimator. Assuming that the identification data are corrupted by Gaussian noise, the above-mentioned estimator coincides with the solution of a regularized estimation problem, in which the regularization term is the l2 norm of the impulse response, weighted by the inverse of the prior covariance function (a.k.a. kernel in the machine learning literature). Recent works have shown how such Bayesian approaches are able to jointly perform estimation and model selection, thus overcoming one of the main issues affecting parametric identification procedures, that is, complexity selection. While keeping the classical system identification methods (e.g. Prediction Error Methods and subspace algorithms) as a benchmark for numerical comparison, this thesis extends and analyzes some key aspects of the above-mentioned Bayesian procedure. In particular, four main topics are considered. 1. PRIOR DESIGN. 
Adopting Maximum Entropy arguments, a new type of l2 regularization is derived: the aim is to penalize the rank of the block Hankel matrix built from Markov coefficients, thus controlling the complexity of the identified model as measured by its McMillan degree. By accounting for the coupling between different input-output channels, this new prior is particularly well suited to the identification of MIMO systems. To reduce the computational requirements of the estimation algorithm, a tailored version of the Scaled Gradient Projection algorithm is designed to optimize the marginal likelihood. 2. CHARACTERIZATION OF UNCERTAINTY. The confidence sets returned by the non-parametric Bayesian identification algorithm are analyzed and compared with those returned by parametric Prediction Error Methods. The comparison is carried out in the impulse response space, by deriving “particle” versions (i.e. Monte Carlo approximations) of the standard confidence sets. 3. ONLINE ESTIMATION. The application of non-parametric Bayesian system identification techniques is extended to an online setting, in which new data become available over time. Specifically, two key modifications of the original “batch” procedure are proposed in order to meet real-time requirements. In addition, the identification of time-varying systems is tackled by introducing a forgetting factor in the estimation criterion and treating it as a hyper-parameter. 4. POST PROCESSING: MODEL REDUCTION. Non-parametric Bayesian identification procedures estimate the unknown system in terms of its impulse response coefficients, thus returning a model with high (possibly infinite) McMillan degree. A tailored procedure is proposed to reduce such a model to one of lower degree, which is more suitable for filtering and control applications. Different criteria for selecting the order of the reduced model are evaluated and compared.
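The kernel-based regularized estimator described above (the posterior mean under a Gaussian prior on the impulse response) can be sketched as follows. The TC-type kernel, the hyper-parameter values, and the simulated system are illustrative assumptions; in the actual procedure the hyper-parameters would be estimated by marginal-likelihood maximization rather than fixed by hand:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 200, 50                      # number of output samples, FIR length
true_g = 0.8 ** np.arange(m)        # hypothetical stable impulse response
u = rng.normal(size=n + m)

# Toeplitz regression matrix: row t collects the past m input samples
Phi = np.column_stack([u[m - k - 1 : n + m - k - 1] for k in range(m)])
y = Phi @ true_g + 0.1 * rng.normal(size=n)

# TC ("tuned/correlated") kernel: K[i, j] = c * lam ** max(i, j),
# encoding exponential decay (stability) and smoothness of the response.
c, lam, sigma2 = 1.0, 0.8, 0.01     # hyper-parameters, fixed here for illustration
i = np.arange(m)
K = c * lam ** np.maximum.outer(i, i)

# Posterior mean / regularized estimate:
# g_hat = K Phi' (Phi K Phi' + sigma2 I)^-1 y
g_hat = K @ Phi.T @ np.linalg.solve(Phi @ K @ Phi.T + sigma2 * np.eye(n), y)
```

The same `g_hat` also solves the regularized least-squares problem with penalty `g' K^-1 g`, which is the equivalence between the Bayesian and regularization views mentioned in the abstract.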

    Advanced Geoscience Remote Sensing

    Nowadays, advanced remote sensing technology plays a tremendous role in building a quantitative and comprehensive understanding of how the Earth system operates. Advanced remote sensing technology is also widely used to monitor and survey natural disasters and man-made pollution. Besides, telecommunication is considered a precise advanced remote sensing tool. Indeed, precise use of remote sensing and telecommunication is not possible without a comprehensive understanding of mathematics and physics. This book has three parts: (i) microwave remote sensing applications; (ii) nuclear, geophysics and telecommunication; and (iii) environmental remote sensing investigations.

    Pharmaceutical development and manufacturing in a Quality by Design perspective: methodologies for design space description

    In the last decade, the pharmaceutical industry has been experiencing a period of drastic change in the way new products and processes are conceived, due to the introduction of the Quality by Design (QbD) initiative put forth by the pharmaceutical regulatory agencies (such as the Food and Drug Administration (FDA) and the European Medicines Agency (EMA)). One of the most important aspects introduced in the QbD framework is that of the design space (DS) of a pharmaceutical product, defined as “the multidimensional combination and interaction of input variables (e.g. material attributes) and process parameters that have been demonstrated to provide assurance of quality”. The identification of the DS represents a key advantage for pharmaceutical companies: once the DS has been approved by the regulatory agency, movements within the DS do not constitute a manufacturing change and therefore do not require any further regulatory post-approval. This translates into enhanced flexibility during process operation, with significant advantages in terms of productivity and process economics. Mathematical modeling, both first-principles and data-driven, has proven to be a valuable tool to assist a DS identification exercise. The development of advanced mathematical techniques for the determination and maintenance of a design space, as well as the quantification of the uncertainty associated with its identification, is a research area that has gained increasing attention in recent years. The objective of this Dissertation is to develop novel methodologies to (i) determine the design space of a new pharmaceutical product, (ii) quantify the assurance of quality for a new pharmaceutical product as advocated by the regulatory agencies, (iii) adapt and maintain a design space during plant operation, and (iv) design optimal experiments for the calibration of first-principles mathematical models to be used for design space identification. 
With respect to the issue of design space determination, a methodology is proposed that combines surrogate-based feasibility analysis and latent-variable modeling for the identification of the design space of a new pharmaceutical product. Projection onto latent structures (PLS) is exploited to obtain a latent representation of the space identified by the model inputs (i.e. raw material properties and process parameters) and surrogate-based feasibility is then used to reconstruct the boundary of the DS on this latent representation, with significant reduction of the overall computational burden. The final result is a compact representation of the DS that can be easily expressed in terms of the original physically-relevant input variables (process parameters and raw material properties) and can then be easily interpreted by industrial practitioners. As regards the quantification of “assurance” of quality, two novel methodologies are proposed to account for the two most common sources of model uncertainty (structural and parametric) in the model-based identification of the DS of a new pharmaceutical product. The first methodology is specifically suited for the quantification of assurance of quality when a PLS model is to be used for DS identification. Two frequentist analytical models are proposed to back-propagate the uncertainty from the quality attributes of the final product to the space identified by the set of raw material properties and process parameters of the manufacturing process. It is shown how these models can be used to identify a subset of input combinations (i.e., raw material properties and process parameters) within which the DS is expected to lie with a given degree of confidence. 
It is also shown how this reduced space of input combinations (called experiment space) can be used to tailor an experimental campaign for the final assessment of the DS, with a significant reduction of the experimental effort required with respect to a non-tailored experimental campaign. The validity of the proposed methodology is tested on granulation and roll compaction processes, involving both simulated and experimental data. The second methodology proposes a joint Bayesian/latent-variable approach, and the assurance of quality is quantified in terms of the probability that the final product will meet its specifications. In this context, the DS is defined in a probabilistic framework as the set of input combinations that guarantee that the probability that the product will meet its quality specifications is greater than a predefined threshold value. Bayesian multivariate linear regression is coupled with latent-variable modeling in order to obtain a computationally friendly implementation of this probabilistic DS. Specifically, PLS is exploited to reduce the computational burden for the discretization of the input domain and to give a compact representation of the DS. On the other hand, Bayesian multivariate linear regression is used to compute the probability that the product will meet the desired quality for each of the discretization points of the input domain. The ability of the methodology to give a scientifically-driven representation of the probabilistic DS is proved with three case studies involving literature experimental data of pharmaceutical unit operations. With respect to the issue of the maintenance of a design space, a methodology is proposed to adapt in real time a model-based representation of a design space during plant operation in the presence of process-model mismatch. 
Based on the availability of a first-principles model (FPM) or semi-empirical model for the manufacturing process, together with measurements from plant sensors, the methodology jointly exploits (i) a dynamic state estimator and (ii) feasibility analysis to perform a risk-based online maintenance of the DS. The state estimator is deployed to obtain an up-to-date FPM by adjusting in real time a small subset of the model parameters. Feasibility analysis and surrogate-based feasibility analysis are used to update the DS in real time by exploiting the up-to-date FPM returned by the state estimator. The effectiveness of the methodology is shown with two simulated case studies, namely the roll compaction of microcrystalline cellulose and the penicillin fermentation in a pilot-scale bioreactor. As regards the design of optimal experiments for the calibration of mathematical models for DS identification, a model-based design of experiments (MBDoE) approach is presented for an industrial freeze-drying process. A preliminary analysis is performed to choose the most suitable process model among different model alternatives and to test the structural consistency of the chosen model. A new experiment is then designed based on this model using MBDoE techniques, in order to increase the precision of the estimates of the most influential model parameters. The results of the MBDoE activity are then tested both in silico and on the real equipment.
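The probabilistic design-space idea from the second methodology above can be sketched in miniature: a Bayesian linear regression posterior is sampled by Monte Carlo to estimate, at each discretized input point, the probability that the product meets its specification, and the DS is the set of points where that probability exceeds a threshold. The model, specification limits, and threshold below are hypothetical, and the PLS-based dimension reduction described in the abstract is omitted:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical example: one quality attribute y modeled by Bayesian linear
# regression on two process parameters x = (x1, x2).
n = 60
X = np.column_stack([np.ones(n), rng.uniform(0, 1, n), rng.uniform(0, 1, n)])
beta_true = np.array([0.2, 1.0, -0.5])
y = X @ beta_true + 0.05 * rng.normal(size=n)

# Conjugate Gaussian posterior (noise variance assumed known for simplicity)
sigma2, tau2 = 0.05 ** 2, 10.0
S = np.linalg.inv(X.T @ X / sigma2 + np.eye(3) / tau2)   # posterior covariance
mu = S @ X.T @ y / sigma2                                # posterior mean

def prob_in_spec(x, lo=0.4, hi=0.9, draws=5000):
    """Monte Carlo estimate of P(lo <= y_new <= hi) at input x."""
    betas = rng.multivariate_normal(mu, S, size=draws)
    y_pred = betas @ x + np.sqrt(sigma2) * rng.normal(size=draws)
    return np.mean((y_pred >= lo) & (y_pred <= hi))

# Discretize the input domain and keep points with probability above 0.9
grid = [(x1, x2) for x1 in np.linspace(0, 1, 21) for x2 in np.linspace(0, 1, 21)]
ds = [(x1, x2) for x1, x2 in grid
      if prob_in_spec(np.array([1.0, x1, x2])) >= 0.9]
```

In the actual methodology the discretization is performed on the PLS latent space rather than on the raw inputs, which is what keeps the computation tractable for higher-dimensional input domains.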

    Environmental contaminants, parasitism, and neoplasia in white perch Morone americana from Chesapeake Bay, USA

    White perch are an abundant demersal fish species in freshwater and oligohaline habitats of the Chesapeake Bay. An avoidance of salinity > 12-15 ppt generally restricts the distribution and movements of fish to within tributaries in the mid to lower Bay, which over time has resulted in the formation of at least three separate stocks in Chesapeake Bay. Sub-populations of white perch that are partially isolated may serve as sentinels of the conditions or stressors in the tributaries in which they reside. Fish are exposed to a variety of environmental contaminants and other anthropogenic stressors that can vary in magnitude based on regional differences in land-use patterns. Health studies of white perch conducted in the 1980s and 1990s revealed a variety of hepatic lesions, including two reports of liver neoplasms, which suggested a sensitivity to degraded habitat or pollution. However, surveys to determine the prevalence and potential etiologies of tumors were not conducted, and the health of white perch in Chesapeake Bay was not investigated again until the studies reported herein. Recent health investigations have revealed associations between neoplasms (cholangiocarcinomas) and bile duct parasites (coccidian and myxozoan) that were not previously described from white perch. These findings raised questions concerning the potential roles of contaminants and parasitism in liver tumor induction in this species. To address knowledge gaps associated with the prevalence and etiology of tumors in white perch, an assessment of environmental contaminants, biomarkers of exposure, biliary parasites, and liver histopathology was required. This study was conducted in two tributaries of the Bay: the Choptank River, an eastern shore tributary with extensive watershed agriculture, and the Severn River, a western shore tributary with extensive development. 
This dissertation addresses: 1) descriptions and taxonomic placement of the coccidian and myxozoan parasites; 2) measurement of waterborne concentrations of polycyclic aromatic hydrocarbons (PAHs), organochlorine pesticides, and brominated diphenyl ethers; 3) detection of biliary metabolites as a biomarker of exposure to PAHs; 4) a histopathological description of parasitic infections, neoplasms, and other lesions in the liver of fish; 5) an assessment of the biological and anthropogenic risk factors for neoplasia; and 6) an assessment of splenic and hepatic macrophage aggregates as an alternate biomarker of contaminant exposure.

    Spatio-temporal characterization of gap dynamics and boreal forest response using multi-temporal lidar data

    The boreal forest is a heterogeneous, dynamic ecosystem shaped by natural disturbances such as fire, insect outbreaks, wind, and regeneration. Gap dynamics play an important role in forest dynamics because they influence the recruitment of new individuals into the canopy and the growth of neighboring vegetation through increased resource availability. Although the importance of gaps in the boreal forest is recognized, the knowledge needed to understand the relationships between the gap regime and forest dynamics, particularly growth, is often lacking. It is difficult to observe and measure gap dynamics or canopy changes extensively in both time and space using field data or two-dimensional images (e.g. aerial photographs), especially in complex systems such as open or fragmented forests. Moreover, most research has relied on only a few representative gaps, and interactions between gaps and forest structure have rarely been studied jointly. Lidar is a system that scans the Earth's surface with laser beams, producing a dense three-dimensional point cloud that captures the structural features of the vegetation and the underlying topography over a large area. We hypothesized that, when lidar returns from near-vertical pulses are dense and accurate, they allow the geometry of gaps to be interpreted and compared over time, informing us about their influence on forest dynamics. In addition, linear measurements taken at different points in time should provide a reliable estimate of growth. 
The objective of this doctoral research was therefore to develop methods and broaden our knowledge of the gap regime and its dynamics, and to determine how the mixed boreal forest responds to these disturbances in terms of growth and mortality at the local scale. A further objective was to understand the short-term role of canopy openings in stand and successional dynamics. These ecological processes were studied by reconstructing the canopy surface height of the boreal forest from lidar data acquired in 1998, 2003, and 2007, with no comparable prior studies to draw on. The 6 km² study area in the Forêt d'Enseignement et de Recherche du Lac Duparquet, Québec, Canada, was large enough to capture the variability of canopy structure and forest response across a range of stands at different developmental stages. The research showed that multi-temporal lidar data can be used for change detection, provided that raster resolution and the choice of interpolation algorithms (for vegetation and ground surfaces) are optimized to obtain accurate gap boundaries. We found that a region-growing technique applied to a lidar surface can delineate gaps with accurate geometry and eliminate between-tree spaces that represent false gaps. Comparison of gaps with their lidar delineation along 980 m linear transects showed a strong correspondence of 96.5%. Lidar was successfully used to delineate both single-tree gaps and larger multi-tree gaps (over 5 m²). 
Using combined time series of lidar-derived gaps, we developed methods to delineate the various types of gap-dynamics events: random gap occurrence, gap expansion, and gap closure, whether by lateral growth or by regeneration. The proposed technique for identifying trees/saplings of various heights on a lidar Canopy Height Model (CHM) image showed nearly 75% correspondence with photogrammetric locations. Free growth rates derived from raw lidar data, after possible sources of error were eliminated, were subsequently used in statistical analyses to quantify height-growth responses, which were found to vary with spatial position relative to the gap edge. By combining data on several species groups (coniferous and deciduous), interpreted from high-resolution imagery, with lidar structural data, we estimated the height-growth patterns of the different tree/sapling groups in several neighborhood contexts. The results showed that the mixed boreal forest around Lac Duparquet is a highly dynamic system in which canopy disturbance plays an important role even over a short time period. The new estimate of the gap formation rate was 0.6%, corresponding to a rotation of 182 years for this forest. The results also showed that trees at the periphery of gaps were more vulnerable to mortality than those within the canopy, resulting in gap expansion. Our results confirm that both lateral growth and the height growth of regeneration contribute to canopy closure, at an annual rate of 1.2%. 
Evidence also showed that coniferous and deciduous gaps exhibit similar lateral growth (mean 22 cm/yr) and vertical growth regardless of location and initial height. The height growth of all saplings was strongly positive, depending on event type and gap area. The results suggest that the growth of coniferous and deciduous saplings reaches its maximum rate at distances of 0.5-2 m and 1.5-4 m from a gap edge, respectively, and for openings of less than 800 m² and 250 m², respectively. The effects of gaps on the height growth of intact forest extended up to 30 m and 20 m from gaps for deciduous and coniferous trees, respectively. Fine-scale analyses of canopy opening show that stands at different developmental stages are highly dynamic and do not necessarily follow the same successional patterns. Overall, the forest is near compositional equilibrium, with a slight increase in deciduous species due mainly to infilling regeneration rather than to a successional transition toward shade-tolerant conifers. Gaps are important for maintaining deciduous species, since understory replacement is vital for some conifers. The study also showed that the last spruce budworm outbreak, which ended 16 years ago, continues to affect old coniferous stands, which still exhibit a high mortality rate. The results demonstrate that lidar is an excellent tool for rapidly acquiring detailed information on the spatially extensive, short-term gap dynamics of complex boreal forest structures. The evidence from this research can serve ecology, silviculture, forest management, and lidar specialists. 
These insights add a new dimension to our understanding of the role of small disturbances and have direct implications for forest managers pursuing ecological forest management and the maintenance of mixed forests. ______________________________________________________________________________ AUTHOR KEYWORDS: Natural disturbance, Forest dynamics, Gap dynamics, Lateral growth, Regeneration, Succession, Discrete-return lidar, Large area, Individual tree location, Height growth
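The region-growing gap delineation described above can be sketched on a toy canopy height model. The height threshold, minimum gap area, raster, and 4-connectivity rule are illustrative assumptions, not the dissertation's actual parameters; the minimum-area filter plays the role of discarding between-tree spaces that would register as false gaps:

```python
import numpy as np
from collections import deque

# Hypothetical 10x10 canopy height model (CHM), 1 m cells, canopy at 15 m
chm = np.full((10, 10), 15.0)
chm[2:6, 2:7] = 1.0                    # a true gap: 4x5 connected low cells
chm[8, 8] = 1.0                        # an isolated low cell (false gap)

def delineate_gaps(chm, height_thr=5.0, min_area=5.0, cell_area=1.0):
    """Label connected regions of cells below height_thr by region growing,
    keeping only regions whose area is at least min_area."""
    low = chm < height_thr
    labels = np.zeros(chm.shape, dtype=int)
    gaps, next_label = [], 1
    for seed in zip(*np.nonzero(low)):
        if labels[seed]:
            continue
        # Grow a region from this seed over 4-connected low cells
        region, queue = [], deque([seed])
        labels[seed] = next_label
        while queue:
            r, c = queue.popleft()
            region.append((r, c))
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if (0 <= nr < chm.shape[0] and 0 <= nc < chm.shape[1]
                        and low[nr, nc] and not labels[nr, nc]):
                    labels[nr, nc] = next_label
                    queue.append((nr, nc))
        if len(region) * cell_area >= min_area:
            gaps.append(region)
        next_label += 1
    return gaps

gaps = delineate_gaps(chm)             # only the 20-cell gap survives the filter
```

On real multi-temporal CHMs the same labeling, applied to rasters from successive acquisitions, is what allows gap events (occurrence, expansion, closure) to be identified by comparing labeled regions across dates.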

    Nonparametric Econometric Methods and Application

    The present Special Issue collects a number of new contributions, both at the theoretical level and in terms of applications, in the areas of nonparametric and semiparametric econometric methods. In particular, this collection of papers, covering areas such as developments in local smoothing techniques, splines, series estimators, and wavelets, will add to the existing rich literature on these subjects and enhance our ability to use data to test economic hypotheses in a variety of fields, such as financial economics, microeconomics, macroeconomics, labor economics, and economic growth, to name a few.
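As a minimal example of the local smoothing techniques mentioned, the following sketches a Nadaraya-Watson kernel regression on simulated data; the bandwidth, kernel, and data-generating process are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated regression data: y = sin(2*pi*x) + noise
n = 200
x = rng.uniform(0, 1, n)
y = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=n)

def nw_estimate(x0, x, y, h=0.05):
    """Nadaraya-Watson estimator of E[y | x = x0]: a locally weighted
    average of y with Gaussian kernel weights and bandwidth h."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    return np.sum(w * y) / np.sum(w)

m_hat = nw_estimate(0.25, x, y)        # true regression value is sin(pi/2) = 1
```

The bandwidth `h` plays the role that model-order selection plays in parametric approaches: smaller `h` lowers bias but raises variance, which is the core trade-off these nonparametric methods manage.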

    Connected Attribute Filtering Based on Contour Smoothness
