Currency Unions and Trade: A PPML Re-Assessment with High-Dimensional Fixed Effects
Recent work on the effects of currency unions (CUs) on trade stresses the importance of using many countries and years in order to obtain reliable estimates. However, for large samples, computational issues associated with the three-way (exporter-time, importer-time, and country-pair) fixed effects currently recommended in the gravity literature have heretofore limited the choice of estimator, leaving an important methodological gap. To address this gap, we introduce an iterative Poisson Pseudo-Maximum Likelihood (PPML) estimation procedure that facilitates the inclusion of these fixed effects for large data sets and also allows for correlated errors across countries and time. When applied to a comprehensive sample with more than 200 countries trading over 65 years, these innovations flip the conclusions of an otherwise rigorously specified linear model. Most importantly, our estimates for both the overall CU effect and the Euro effect specifically are economically small and statistically insignificant. We also document that linear and PPML estimates of the Euro effect increasingly diverge as the sample size grows.
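To make the estimation setup concrete, below is a minimal sketch of a three-way fixed-effects PPML gravity regression on a tiny synthetic panel, using statsmodels dummies rather than the authors' iterative absorption procedure; the variable names, the hypothetical currency-union dates, and the data-generating process are all illustrative assumptions.

```python
# Minimal sketch: PPML with exporter-time, importer-time, and pair fixed
# effects on synthetic data (illustrative; not the authors' procedure).
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
countries, years = list("ABCDE"), range(2000, 2005)
rows = []
for t in years:
    for i in countries:
        for j in countries:
            if i == j:
                continue
            # Hypothetical CU: A and B share a currency from 2002 onward,
            # so membership varies within pair and is identified alongside
            # the pair fixed effects.
            cu = int(i in "AB" and j in "AB" and t >= 2002)
            mu = np.exp(1.0 + 0.1 * cu + rng.normal(0, 0.3))
            rows.append(dict(exp=i, imp=j, year=t, cu=cu,
                             trade=rng.poisson(mu)))
df = pd.DataFrame(rows)
df["exp_time"] = df["exp"] + "_" + df["year"].astype(str)
df["imp_time"] = df["imp"] + "_" + df["year"].astype(str)
df["pair"] = df["exp"] + "_" + df["imp"]

# Poisson pseudo-maximum likelihood with the three sets of fixed effects;
# standard errors clustered by country pair, as is common in this literature.
res = smf.glm("trade ~ cu + C(exp_time) + C(imp_time) + C(pair)",
              data=df, family=sm.families.Poisson()).fit(
                  cov_type="cluster", cov_kwds={"groups": df["pair"]})
print(res.params["cu"], res.bse["cu"])
```

On a 200-country, 65-year sample, the explicit dummy matrices above would be computationally infeasible, which is exactly the gap the paper's iterative procedure is designed to fill.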
Generalized methods and solvers for noise removal from piecewise constant signals. I. Background theory
Removing noise from piecewise constant (PWC) signals is a challenging signal processing problem arising in many practical contexts. For example, in exploration geosciences, noisy drill hole records need to be separated into stratigraphic zones, and in biophysics, jumps between molecular dwell states have to be extracted from noisy fluorescence microscopy signals. Many PWC denoising methods exist, including total variation regularization, mean shift clustering, stepwise jump placement, running medians, convex clustering shrinkage, and bilateral filtering; conventional linear signal processing methods are fundamentally unsuited to the task. This paper (part I, the first of two) shows that most of these methods are associated with a special case of a generalized functional that is minimized to achieve PWC denoising. The minimizer can be obtained by diverse solver algorithms, including stepwise jump placement, convex programming, finite differences, iterated running medians, least angle regression, regularization path following, and coordinate descent. In the second paper, part II, we introduce novel PWC denoising methods and present comparisons between these methods on synthetic and real signals, showing that the new understanding of the problem gained in part I leads to new methods with a useful role to play.
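As an illustration of one of the solver families named above, here is a minimal sketch of iterated running medians applied to a noisy step signal; the window width and stopping rule are arbitrary assumptions, not values from the paper.

```python
# Minimal sketch: PWC denoising by iterated running medians (one of the
# solver families discussed in the paper); parameters are illustrative.
import numpy as np
from scipy.signal import medfilt

def iterated_running_median(y, width=7, max_iter=100, tol=1e-8):
    """Repeatedly apply a running median until the signal stops changing
    (a 'root' signal); this preserves jumps while suppressing noise."""
    x = np.asarray(y, dtype=float)
    for _ in range(max_iter):
        x_new = medfilt(x, kernel_size=width)
        if np.max(np.abs(x_new - x)) < tol:
            break
        x = x_new
    return x

# Demo on a noisy two-level step signal.
rng = np.random.default_rng(0)
clean = np.concatenate([np.zeros(100), np.ones(100)])
denoised = iterated_running_median(clean + rng.normal(0, 0.2, 200))
```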
A geometric approach to visualization of variability in functional data
We propose a new method for the construction and visualization of boxplot-type displays for functional data. We use a recent functional data analysis framework, based on a representation of functions called square-root slope functions, to decompose observed variation in functional data into three main components: amplitude, phase, and vertical translation. We then construct separate displays for each component, using the geometry and metric of each representation space, based on a novel definition of the median, the two quartiles, and extreme observations. Because outlyingness in functional data is a complex, multi-faceted concept, we propose to identify outliers based on any of the three main components after decomposition. We provide a variety of visualization tools for the proposed boxplot-type displays, including surface plots. We evaluate the proposed method using extensive simulations and then focus our attention on three real data applications, including exploratory data analysis of sea surface temperature functions, electrocardiogram functions, and growth curves.
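For concreteness, the square-root slope function (SRSF) underlying this framework maps a function f to q(t) = sign(f'(t)) * sqrt(|f'(t)|); a minimal numerical sketch follows, with the grid and example curve chosen purely for illustration.

```python
# Minimal sketch: the square-root slope function (SRSF) transform,
# q(t) = sign(f'(t)) * sqrt(|f'(t)|), computed on a discrete grid.
import numpy as np

def srsf(f, t):
    """SRSF of samples f on grid t. In this representation, time warping
    (phase) acts by an isometry, which is what makes the amplitude/phase
    decomposition tractable."""
    df = np.gradient(f, t)
    return np.sign(df) * np.sqrt(np.abs(df))

t = np.linspace(0, 1, 200)
q = srsf(np.sin(2 * np.pi * t), t)
```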
Comparison of Tukey's T-Method and Scheffé's S-Method for Various Numbers of All Possible Differences of Averages Contrasts Under Violation of Assumptions
Empirical .05 and .01 rates of Type I error were compared for the Tukey and Scheffé multiple comparison techniques. The experimentwise error rate was defined over five sets of the 25 possible differences-of-averages contrasts. The robustness of the Tukey and Scheffé statistics was related not only to the type of assumption violation but also to the sets containing different numbers of contrasts. The Tukey method could be judged to be as robust a statistic as the Scheffé method.
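A simulation of this kind can be reproduced in miniature for the pairwise-contrast case: the sketch below estimates experimentwise Type I error rates for Tukey's and Scheffé's critical values under a true null with equal group sizes. The group count, sample size, and replication count are illustrative assumptions, and the study's differences-of-averages contrasts are more general than the pairwise ones used here.

```python
# Minimal sketch: Monte Carlo experimentwise Type I error for Tukey vs.
# Scheffe critical values, pairwise contrasts only, H0 true, equal n.
import numpy as np
from scipy import stats

def type_i_rates(k=5, n=10, alpha=0.05, reps=5000, seed=1):
    rng = np.random.default_rng(seed)
    dof = k * (n - 1)
    q_crit = stats.studentized_range.ppf(1 - alpha, k, dof)
    f_crit = stats.f.ppf(1 - alpha, k - 1, dof)
    tukey = scheffe = 0
    for _ in range(reps):
        x = rng.normal(size=(k, n))           # all population means equal
        means, mse = x.mean(axis=1), x.var(axis=1, ddof=1).mean()
        max_diff = means.max() - means.min()  # largest pairwise difference
        # An experimentwise error occurs iff the largest difference exceeds
        # the critical difference for the respective procedure.
        tukey += max_diff > q_crit * np.sqrt(mse / n)
        scheffe += max_diff > np.sqrt((k - 1) * f_crit * mse * 2 / n)
    return tukey / reps, scheffe / reps

print(type_i_rates())  # Scheffe is more conservative for pairwise contrasts
```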
The response of perennial and temporary headwater stream invertebrate communities to hydrological extremes
The headwaters of karst rivers experience considerable hydrological variability, including spates and streambed drying. Extreme summer flooding on the River Lathkill (Derbyshire, UK) provided the opportunity to examine the invertebrate community response to unseasonal spate flows, flow recession and, at temporary sites, streambed drying. Invertebrates were sampled at sites with differing flow permanence regimes during and after the spates. Following streambed drying at temporary sites, dewatered surface sediments were investigated as a refugium for aquatic invertebrates. Experimental rehydration of these dewatered sediments was conducted to promote development of desiccation-tolerant life stages. At perennial sites, spate flows reduced invertebrate abundance and diversity, whilst at temporary sites, flow reactivation facilitated rapid colonisation of the surface channel by a limited number of invertebrate taxa. Following streambed drying, 38 taxa were recorded from the dewatered and rehydrated sediments, with Oligochaeta being the most abundant taxon and Chironomidae (Diptera) the most diverse. Experimental rehydration of dewatered sediments revealed the presence of additional taxa, including Stenophylax sp. (Trichoptera: Limnephilidae) and Nemoura sp. (Plecoptera: Nemouridae). The influence of flow permanence on invertebrate community composition was apparent despite the aseasonal high-magnitude flood events.
Towards Machine Wald
The past century has seen a steady increase in the need to estimate and predict complex systems and to make (possibly critical) decisions with limited information. Although computers have made the numerical evaluation of sophisticated statistical models possible, these models are still designed by humans because there is currently no known recipe or algorithm for dividing the design of a statistical model into a sequence of arithmetic operations. Indeed, enabling computers to "think" as humans can when faced with uncertainty is challenging in several major ways: (1) finding optimal statistical models has yet to be formulated as a well-posed problem when information on the system of interest is incomplete and comes in the form of a complex combination of sample data, partial knowledge of constitutive relations, and a limited description of the distribution of input random variables; (2) the space of admissible scenarios, along with the space of relevant information, assumptions, and/or beliefs, tends to be infinite-dimensional, whereas calculus on a computer is necessarily discrete and finite. To this end, this paper explores the foundations of a rigorous framework for the scientific computation of optimal statistical estimators/models and reviews their connections with Decision Theory, Machine Learning, Bayesian Inference, Stochastic Optimization, Robust Optimization, Optimal Uncertainty Quantification, and Information-Based Complexity.
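As a toy illustration of what "computing an optimal estimator" means in Wald's decision-theoretic sense (an example of my own, not from the paper), the sketch below numerically compares the worst-case squared-error risk of the maximum-likelihood estimator of a Bernoulli parameter with the classical minimax estimator.

```python
# Toy illustration (not from the paper): worst-case risk of the MLE vs. the
# known minimax estimator of a Bernoulli parameter under squared-error loss.
import numpy as np
from scipy.stats import binom

n = 10
s = np.arange(n + 1)                              # possible success counts
p_grid = np.linspace(0.001, 0.999, 999)
pmf = binom.pmf(s[None, :], n, p_grid[:, None])   # shape (grid, n+1)

def worst_risk(delta):
    """Maximum over p of E_p[(delta(S) - p)^2]."""
    risk = (pmf * (delta[None, :] - p_grid[:, None]) ** 2).sum(axis=1)
    return risk.max()

mle = s / n
minimax = (s + np.sqrt(n) / 2) / (n + np.sqrt(n))  # constant-risk Bayes rule
print(worst_risk(mle), worst_risk(minimax))        # minimax has lower worst case
```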
Outlier detection and classification in sensor data streams for proactive decision support systems
This paper addresses the problem of quality assessment in sensor data streams accumulated by proactive decision support systems. A new problem is stated in which outliers must be detected and classified according to their nature of origin. Two types of outliers are defined: the first reflects misoperations of the system itself, while the second is caused by changes in the observed system's behaviour due to internal and external influences. The proposed method follows a data-driven forecasting approach, predicting the value of the incoming data stream at the expected time. It combines a forecasting model, which predicts a value so that the deviation between the observed and predicted values can be measured, with a clustering model used for the taxonomic classification of outliers. Constructive neural network models (CoNNS) and evolving connectionist systems (ECS) are used to predict the sensor data. Two real-world tasks serve as case studies: the maximal accuracy values are 0.992 and 0.974, and the F1 scores are 0.967 and 0.938 for the first and second tasks, respectively. The conclusion presents findings on how to apply the proposed method in proactive decision support systems.
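A minimal sketch of the forecast-and-cluster pattern described above follows; it substitutes a simple moving-average forecaster for the CoNNS/ECS predictors and uses residual magnitude plus local persistence as clustering features, all of which are illustrative assumptions rather than the authors' design.

```python
# Minimal sketch: flag outliers as large forecast residuals, then cluster
# them into two origin types (illustrative stand-in for CoNNS/ECS models).
import numpy as np
from sklearn.cluster import KMeans

def detect_and_classify(y, window=10, z=3.0):
    y = np.asarray(y, dtype=float)
    resid = np.full(len(y), np.nan)
    for t in range(window, len(y)):
        resid[t] = y[t] - y[t - window:t].mean()  # stand-in forecaster
    sigma = np.nanstd(resid)
    out_idx = np.where(np.abs(resid) > z * sigma)[0]
    # Features: residual size and how long the deviation persists; isolated
    # spikes suggest misoperation, sustained shifts suggest behaviour change.
    feats = [[abs(resid[t]),
              np.mean(np.abs(resid[t:t + 5]) > z * sigma)] for t in out_idx]
    if len(out_idx) < 2:
        return out_idx, np.zeros(len(out_idx), dtype=int)
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(np.array(feats))
    return out_idx, labels
```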