Small area estimation of general parameters with application to poverty indicators: A hierarchical Bayes approach
Poverty maps are used to aid important political decisions such as allocation
of development funds by governments and international organizations. Those
decisions should be based on the most accurate poverty figures. However, often
reliable poverty figures are not available at fine geographical levels or for
particular risk population subgroups due to the sample size limitation of
current national surveys. These surveys cannot cover adequately all the desired
areas or population subgroups and, therefore, models relating the different
areas are needed to "borrow strength" from area to area. In particular, the
Spanish Survey on Income and Living Conditions (SILC) produces national poverty
estimates but cannot provide poverty estimates by Spanish provinces due to the
poor precision of direct estimates, which use only the province specific data.
It also raises the ethical question of whether poverty is more severe for women
than for men in a given province. We develop a hierarchical Bayes (HB) approach
for poverty mapping in Spanish provinces by gender that overcomes the small
province sample size problem of the SILC. The proposed approach has a wide
scope of application because it can be used to estimate general nonlinear
parameters. We use a Bayesian version of the nested error regression model in
which Markov chain Monte Carlo procedures and the convergence monitoring
therein are avoided. A simulation study reveals good frequentist properties of
the HB approach. The resulting poverty maps indicate that poverty, both in
frequency and intensity, is localized mostly in the southern and western
provinces and it is more acute for women than for men in most of the provinces.
Comment: Published in the Annals of Applied Statistics
(http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/13-AOAS702 by the
Institute of Mathematical Statistics (http://www.imstat.org
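The "frequency and intensity" of poverty mentioned in the abstract above are commonly measured by the poverty incidence and the poverty gap, two members of the FGT family of indicators. The abstract does not spell out the formula, so the following is only a minimal sketch under that assumption; the function name, the incomes, and the poverty line are hypothetical, and the sketch computes plain direct values for one area, not the hierarchical Bayes estimates.

```python
from typing import Sequence


def fgt_indicator(incomes: Sequence[float], poverty_line: float, alpha: int) -> float:
    """FGT poverty indicator of order alpha for the people in one area.

    alpha = 0 gives the poverty incidence (share of people below the line);
    alpha = 1 gives the poverty gap (average relative shortfall from the line).
    """
    if poverty_line <= 0:
        raise ValueError("poverty_line must be positive")
    if not incomes:
        raise ValueError("incomes must be non-empty")
    total = 0.0
    for income in incomes:
        if income < poverty_line:
            # Relative distance of this person's income from the poverty line.
            shortfall = (poverty_line - income) / poverty_line
            total += shortfall ** alpha
    return total / len(incomes)


# Hypothetical incomes for one small area and a hypothetical poverty line of 100.
area_incomes = [40.0, 80.0, 120.0, 200.0]
incidence = fgt_indicator(area_incomes, 100.0, 0)  # two of four people are below the line
gap = fgt_indicator(area_incomes, 100.0, 1)        # (0.6 + 0.2) / 4
```

Both indicators are nonlinear functions of the individual incomes, which is the kind of general nonlinear parameter the abstract refers to.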
Relaxation Penalties and Priors for Plausible Modeling of Nonidentified Bias Sources
In designed experiments and surveys, known laws or design features provide
checks on the most relevant aspects of a model and identify the target
parameters. In contrast, in most observational studies in the health and social
sciences, the primary study data do not identify and may not even bound target
parameters. Discrepancies between target and analogous identified parameters
(biases) are then of paramount concern, which forces a major shift in modeling
strategies. Conventional approaches are based on conditional testing of
equality constraints, which correspond to implausible point-mass priors. When
these constraints are not identified by available data, however, no such
testing is possible. In response, implausible constraints can be relaxed into
penalty functions derived from plausible prior distributions. The resulting
models can be fit within familiar full or partial likelihood frameworks. The
absence of identification renders all analyses part of a sensitivity analysis.
In this view, results from single models are merely examples of what might be
plausibly inferred. Nonetheless, just one plausible inference may suffice to
demonstrate inherent limitations of the data. Points are illustrated with
misclassified data from a study of sudden infant death syndrome. Extensions to
confounding, selection bias and more complex data structures are outlined.
Comment: Published in Statistical Science (http://www.imstat.org/sts/) at
http://dx.doi.org/10.1214/09-STS291 by the Institute of Mathematical
Statistics (http://www.imstat.org
ISBIS 2016: Meeting on Statistics in Business and Industry
This book includes the abstracts of the talks presented at the 2016 International Symposium on Business and Industrial Statistics, held in Barcelona on June 8-10, 2016, and hosted by the Department of Statistics and Operations Research at the Universitat Politècnica de Catalunya - Barcelona TECH. The meeting took place in the ETSEIB Building (Escola Tecnica Superior d'Enginyeria Industrial) at Avda Diagonal 647.
The meeting organizers celebrated the continued success of the ISBIS and ENBIS societies, and the meeting drew together the international community of statisticians, both academics and industry professionals, who share the goal of making statistics the foundation for decision making in business and related applications. The Scientific Program Committee consisted of:
David Banks, Duke University
Amílcar Oliveira, DCeT - Universidade Aberta and CEAUL
Teresa A. Oliveira, DCeT - Universidade Aberta and CEAUL
Nalini Ravishankar, University of Connecticut
Xavier Tort Martorell, Universitat Politècnica de Catalunya, Barcelona TECH
Martina Vandebroek, KU Leuven
Vincenzo Esposito Vinzi, ESSEC Business School
The Application of Artificial Intelligence in Project Management Research: A Review
The field of artificial intelligence is currently experiencing relentless growth, with innumerable models emerging in the research and development phases across various fields, including science, finance, and engineering. In this work, the authors review a large number of learning techniques aimed at project management. The analysis is largely focused on hybrid systems, which present computational models of blended learning techniques. At present, these models are at a very early stage, and major development efforts are required within the scientific community. In addition, we provide a classification of all the areas within project management and the learning techniques that are used in each, presenting a brief study of the different artificial intelligence techniques used today and the areas of project management in which agents are being applied. This work should serve as a starting point for researchers who wish to work in the exciting world of artificial intelligence in relation to project leadership and management.
A linear regression model for imprecise response
A linear regression model with imprecise response and p real explanatory variables is analyzed. The imprecision of the response variable is functionally described by means of certain kinds of fuzzy sets, the LR fuzzy sets. The LR fuzzy random variables are introduced to model usual random experiments when the characteristic observed on each result can be described with fuzzy numbers of a particular class, determined by three random values: the center, the left spread and the right spread. In fact, these constitute a natural generalization of interval data. To deal with the estimation problem, the space of the LR fuzzy numbers is proved to be isometric to a closed and convex cone of R^3 with respect to a generalization of the most used metric for LR fuzzy numbers. The expression of the estimators in terms of moments is established, and their limit distribution and asymptotic properties are analyzed and applied to the determination of confidence regions and hypothesis testing procedures. The results are illustrated by means of some case studies. © 2010 Elsevier Inc. All rights reserved.
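To make the three-value description concrete, here is a minimal sketch of an LR fuzzy number stored as (center, left spread, right spread). The abstract does not fix the shape functions L and R, so triangular shapes are assumed here; the class and method names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LRFuzzyNumber:
    """An LR fuzzy number described by its center and its two spreads."""

    center: float
    left_spread: float
    right_spread: float

    def __post_init__(self) -> None:
        if self.left_spread < 0 or self.right_spread < 0:
            raise ValueError("spreads must be non-negative")

    def membership(self, x: float) -> float:
        """Degree in [0, 1] to which x belongs (triangular shapes are an assumption)."""
        if x == self.center:
            return 1.0
        # Use the left spread below the center and the right spread above it.
        spread = self.left_spread if x < self.center else self.right_spread
        if spread == 0:
            return 0.0
        return max(0.0, 1.0 - abs(x - self.center) / spread)

    def as_vector(self) -> Tuple[float, float, float]:
        """The triple as a point of R^3 whose last two coordinates are non-negative."""
        return (self.center, self.left_spread, self.right_spread)


# Hypothetical imprecise response: "about 10, possibly as low as 8 or as high as 14".
response = LRFuzzyNumber(center=10.0, left_spread=2.0, right_spread=4.0)
left_degree = response.membership(9.0)    # halfway down the left slope
right_degree = response.membership(12.0)  # halfway down the right slope
outside = response.membership(15.0)       # beyond the right spread
```

Because the spreads cannot be negative, the vectors returned by `as_vector` fill only part of R^3, which is consistent with the cone mentioned in the abstract.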
Fuzzy logistic regression for detecting differential DNA methylation regions
“Epigenetics is the study of changes in gene activity or function that are not related to a change in the DNA sequence. DNA methylation is one of the main types of epigenetic modification; it occurs when a methyl chemical group attaches to a cytosine on the DNA sequence. Although the sequence does not change, the addition of a methyl group can change the way genes are expressed and produce different phenotypes. DNA methylation is involved in many biological processes and has important implications in the fields of biomedicine and agriculture.
Statistical methods have been developed to compare DNA methylation at cytosine nucleotides between populations of interest (e.g., healthy and diseased) across the entire genome from next-generation sequencing (NGS) data. Testing for differences between populations in DNA methylation at specific sites is often followed by an assessment of regional differences using post hoc aggregation procedures to group neighboring sites that are differentially methylated. Although site-level analysis can yield some useful information, there are advantages to testing for differential methylation across entire genomic regions. Examining genomic regions produces less noise, reduces the number of statistical tests, and has the potential to provide more informative results to biologists.
In this research, several different types of logistic regression models are investigated to test for differentially methylated regions (DMRs). The focus of this work is on developing a fuzzy logistic regression model for DMR detection. Two other logistic regression methods (weighted average logistic regression and ordinal logistic regression) are also introduced as alternative approaches. The performance of these novel approaches is then compared with that of an existing logistic regression method (MAGIg) for region-level testing, using data simulated based on two (one plant, one human) real NGS methylation data sets”--Abstract, page iii
Data mining using intelligent systems : an optimized weighted fuzzy decision tree approach
Data mining aims to analyze observational datasets to find relationships and to present the data in ways that are both understandable and useful. In this thesis, some existing intelligent systems techniques such as Self-Organizing Map, Fuzzy C-Means and decision tree are used to analyze several datasets. The techniques are used to provide flexible information processing capability for handling real-life situations. This thesis is concerned with the design, implementation, testing and application of these techniques to those datasets. The thesis also introduces a hybrid intelligent systems technique, the Optimized Weighted Fuzzy Decision Tree (OWFDT), with the aim of improving Fuzzy Decision Trees (FDT) and solving practical problems.
This thesis first proposes an optimized weighted fuzzy decision tree, incorporating the introduction of Fuzzy C-Means to fuzzify the input instances but keeping the expected labels crisp. This leads to a different output layer activation function and weight connection in the neural network (NN) structure obtained by mapping the FDT to the NN. A momentum term was also introduced into the learning process to train the weight connections to avoid oscillation or divergence. A new reasoning mechanism has been also proposed to combine the constructed tree with those weights which had been optimized in the learning process. This thesis also makes a comparison between the OWFDT and two benchmark algorithms, Fuzzy ID3 and weighted FDT.
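As a rough illustration of what fuzzifying an input instance with Fuzzy C-Means involves, the sketch below applies the standard Fuzzy C-Means membership formula to one scalar feature. It assumes the cluster centers have already been found; the function name, the centers, and the fuzzifier value m = 2 are hypothetical choices and are not taken from the thesis.

```python
from typing import List, Sequence


def fcm_memberships(value: float, centers: Sequence[float], m: float = 2.0) -> List[float]:
    """Membership of one scalar value in each cluster, given fixed cluster centers.

    Uses the standard Fuzzy C-Means rule
    u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)), where d_i is the distance to center i.
    """
    if m <= 1.0:
        raise ValueError("the fuzzifier m must be greater than 1")
    if not centers:
        raise ValueError("at least one center is required")
    distances = [abs(value - c) for c in centers]
    # A value that coincides with one or more centers belongs fully to them.
    at_center = [d == 0.0 for d in distances]
    if any(at_center):
        hits = sum(at_center)
        return [1.0 / hits if hit else 0.0 for hit in at_center]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_j) ** exponent for d_j in distances)
        for d_i in distances
    ]


# Hypothetical feature value and two hypothetical cluster centers ("low" and "high").
memberships = fcm_memberships(2.5, [0.0, 10.0])
```

The input instance is replaced by its membership degrees, which sum to one, while the class label attached to the instance stays crisp, as the abstract describes.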
Six datasets ranging from material science to medical and civil engineering were introduced as case study applications. These datasets involve classification of composite material failure mechanisms, classification of electrocorticography (ECoG)/electroencephalogram (EEG) signals, eye bacteria prediction and wave overtopping prediction. Different intelligent systems techniques were used to cluster the patterns and predict the classes, although OWFDT was used to design classifiers for all the datasets. In the material dataset, Self-Organizing Map and Fuzzy C-Means were used to cluster the acoustic event signals and assign those events to different failure mechanisms. After this step, OWFDT was introduced to design a classifier for the acoustic event signals. For the eye bacteria dataset, we use the bagging technique to improve the classification accuracy of Multilayer Perceptrons and Decision Trees. Applying bootstrap aggregating (bagging) to Decision Trees also helped to select the most important sensors (features) so that the dimension of the data could be reduced. The most important features were used to grow the OWFDT, which addressed the curse of dimensionality problem. The last dataset, which is concerned with wave overtopping, was used to benchmark OWFDT against other intelligent systems techniques, such as Adaptive Neuro-Fuzzy Inference System (ANFIS), Evolving Fuzzy Neural Network (EFuNN), Genetic Neural Mathematical Method (GNMM) and Fuzzy ARTMAP.
The analysis of these datasets shows that patterns can be found and classes predicted by combining these intelligent systems techniques. OWFDT has also demonstrated its efficiency and effectiveness compared with a conventional fuzzy Decision Tree and a weighted fuzzy Decision Tree.