Structural Change in (Economic) Time Series
Methods for detecting structural changes, or change points, in time series
data are widely used in many fields of science and engineering. This chapter
sketches some basic methods for the analysis of structural changes in time
series data. The exposition is confined to retrospective methods for univariate
time series. Several recent methods for dating structural changes are compared
using a time series of oil prices spanning more than 60 years. The methods
broadly agree for the first part of the series up to the mid-1980s, for which
changes are associated with major historical events, but provide somewhat
different solutions thereafter, reflecting a gradual increase in oil prices
that is not well described by a step function. As a further illustration, 1990s
data on the volatility of the Hang Seng stock market index are reanalyzed.
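The dating methods compared in the chapter are more elaborate, but the retrospective idea can be sketched with a plain CUSUM statistic for a single mean shift. The function, its name, and the simulated series below are illustrative assumptions, not the chapter's procedures or data:

```python
import random

def cusum_changepoint(x):
    """Estimate a single mean-shift location by the classical CUSUM
    statistic: the index k maximizing |sum_{i<=k} (x_i - mean(x))|."""
    n = len(x)
    mean = sum(x) / n
    s, best_k, best_val = 0.0, 0, -1.0
    for k in range(n - 1):
        s += x[k] - mean
        if abs(s) > best_val:
            best_val, best_k = abs(s), k
    return best_k + 1  # first index belonging to the new regime

random.seed(0)
# simulated series: mean 10 for 50 points, then a jump to mean 14
series = [random.gauss(10, 1) for _ in range(50)] + \
         [random.gauss(14, 1) for _ in range(50)]
print(cusum_changepoint(series))
```

With a shift this large relative to the noise, the estimate lands at or very near the true break at index 50; gradual trends such as the post-1980s oil-price rise are exactly the case where such step-function estimates disagree.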
A Binary Control Chart to Detect Small Jumps
The classic Np chart gives a signal if the number of successes in a sequence
of independent binary variables exceeds a control limit. Motivated by
engineering applications in industrial image processing and, to some extent,
financial statistics, we study a simple modification of this chart, which uses
only the most recent observations. Our aim is to construct a control chart for
detecting a shift of an unknown size, allowing for an unknown distribution of
the error terms. Simulation studies indicate that the proposed chart is
superior in terms of out-of-control average run length when one is interested in
the detection of very small shifts. We provide a (functional) central limit
theorem under a change-point model with local alternatives which explains this
unexpected and interesting behavior. Since real observations are often not
independent, the question arises whether these results still hold true in
the dependent case. Indeed, our asymptotic results hold under the fairly
general condition that the observations form a martingale difference array.
This enlarges the applicability of our results considerably: firstly, to a
large class of time series models and, secondly, to locally dependent image data,
as we demonstrate by an example.
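The mechanics of the modified chart can be sketched as counting successes in a moving window of the m most recent observations. The window size, control limit, and simulated stream below are illustrative assumptions, not the paper's calibration:

```python
import random

def window_chart(x, m, limit):
    """Signal at time t if the number of successes among the m most
    recent binary observations exceeds `limit`; return the first alarm
    time, or None if the chart never signals."""
    count = 0
    for t, v in enumerate(x):
        count += v
        if t >= m:
            count -= x[t - m]      # drop the observation leaving the window
        if t >= m - 1 and count > limit:
            return t
    return None

random.seed(1)
# in-control success probability 0.1; a small upward jump after t = 200
stream = [int(random.random() < (0.1 if t < 200 else 0.35)) for t in range(400)]
print(window_chart(stream, m=20, limit=6))
```

Because the count only ever reflects the last m observations, old in-control data cannot dilute a recent small shift, which is the intuition behind the improved out-of-control average run length.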
A comparative study of nonparametric methods for pattern recognition
The applied research discussed in this report determines and compares the correct-classification percentage of the nonparametric sign test, Wilcoxon's signed rank test, and the K-class classifier with the performance of the Bayes classifier. Performance is determined for data with Gaussian, Laplacian, and Rayleigh probability density functions. The correct-classification percentage is shown graphically for differences in the modes and/or means of the probability density functions for four, eight, and sixteen samples. The K-class classifier performed very well relative to the other classifiers. Because the K-class classifier is a nonparametric technique, it usually performed better than the Bayes classifier, which assumes the data to be Gaussian even when they are not. The K-class classifier has the advantage over the Bayes classifier that it works well with non-Gaussian data without requiring the probability density function of the data to be determined. It should be noted that the data in this experiment were always unimodal.
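A minimal simulation in the spirit of this comparison pits a sign-test rule against the Gaussian-optimal mean rule on Laplacian data. The sample size, class means, and trial count are illustrative assumptions, not the report's exact setup:

```python
import random
import statistics

random.seed(2)

def laplace(mu, b=1.0):
    # the difference of two independent Exp(1) draws is Laplace(0, 1)
    return mu + b * (random.expovariate(1.0) - random.expovariate(1.0))

def trial(n=8, mu0=0.0, mu1=1.0):
    true = random.randint(0, 1)
    sample = [laplace(mu1 if true else mu0) for _ in range(n)]
    mid = (mu0 + mu1) / 2
    sign = int(sum(v > mid for v in sample) > n / 2)   # sign-test rule
    bayes = int(statistics.fmean(sample) > mid)        # Gaussian-Bayes rule
    return sign == true, bayes == true

results = [trial() for _ in range(5000)]
sign_pct = 100 * sum(s for s, _ in results) / len(results)
bayes_pct = 100 * sum(b for _, b in results) / len(results)
print(f"sign: {sign_pct:.1f}%  bayes: {bayes_pct:.1f}%")
```

The sign rule needs only the ordering of the sample around the threshold, so it pays no price for the heavy Laplacian tails that the Gaussian-Bayes rule wrongly assumes away.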
Sparse Model Identification and Learning for Ultra-high-dimensional Additive Partially Linear Models
The additive partially linear model (APLM) combines the flexibility of
nonparametric regression with the parsimony of regression models, and has been
widely used as a popular tool in multivariate nonparametric regression to
alleviate the "curse of dimensionality". A natural question raised in practice
is the choice of structure in the nonparametric part, that is, whether the
continuous covariates enter into the model in linear or nonparametric form. In
this paper, we present a comprehensive framework for simultaneous sparse model
identification and learning for ultra-high-dimensional APLMs where both the
linear and nonparametric components are possibly larger than the sample size.
We propose a fast and efficient two-stage procedure. In the first stage, we
decompose the nonparametric functions into a linear part and a nonlinear part.
The nonlinear functions are approximated by constant spline bases, and a triple
penalization procedure is proposed to select nonzero components using adaptive
group LASSO. In the second stage, we refit data with selected covariates using
higher order polynomial splines, and apply spline-backfitted local-linear
smoothing to obtain asymptotic normality for the estimators. The procedure is
shown to be consistent for model structure identification. It can identify
zero, linear, and nonlinear components correctly and efficiently. Inference can
be made on both linear coefficients and nonparametric functions. We conduct
simulation studies to evaluate the performance of the method and apply the
proposed method to a dataset on the Shoot Apical Meristem (SAM) of maize
genotypes for illustration.
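The paper's triple-penalization procedure is more involved, but the group-level selection it relies on comes down to the group-LASSO soft-thresholding operator, sketched here on made-up spline-coefficient groups:

```python
import math

def group_soft_threshold(groups, lam):
    """Proximal (soft-thresholding) step of the group LASSO: each
    coefficient group is shrunk toward zero by its Euclidean norm, and
    any group whose norm falls below lam is zeroed out entirely -- this
    all-or-nothing shrinkage is what performs component selection."""
    out = []
    for g in groups:
        norm = math.sqrt(sum(b * b for b in g))
        scale = max(0.0, 1 - lam / norm) if norm > 0 else 0.0
        out.append([scale * b for b in g])
    return out

# two coefficient groups: one strong component, one weak one
shrunk = group_soft_threshold([[3.0, 4.0], [0.1, 0.1]], lam=1.0)
print(shrunk)
```

The weak group is set exactly to zero (dropping that component from the model) while the strong group is merely shrunk; the adaptive version reweights lam per group so that truly nonzero components are shrunk less.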
Scientific publications of the bioscience programs division. Volume 5 - Planetary quarantine
Bibliography and indexes on planetary quarantine.
Analyzing Network Traffic for Malicious Hacker Activity
Since the Internet came into being in the 1970s, it has been growing by more than 100% every year. The solutions for detecting network intrusion, on the other hand, have been far outpaced. The economic impact of malicious attacks, in lost revenue to a single e-commerce company, can vary from 66 thousand up to 53 million US dollars. At the same time, there is no effective mathematical model widely available to distinguish anomalous network behaviours such as port scanning, system exploration, and virus and worm propagation from normal traffic.
PDS, proposed by Random Knowledge Inc., detects and localizes traffic patterns consistent with attacks hidden within large amounts of legitimate traffic. With the network's packet traffic stream as its input, PDS relies on high-fidelity models of normal traffic, against which it can critically judge the legitimacy of any substream of packet traffic. Because of this reliance on an accurate baseline model of normal network traffic, in this workshop we concentrate on modelling normal network traffic with a Poisson process.
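A toy version of such a Poisson baseline flags per-second packet counts beyond the upper Poisson tail. The in-control rate, burst timing, and tail probability below are illustrative assumptions, not the PDS model:

```python
import math
import random

def poisson_upper_limit(lam, alpha=1e-3):
    """Smallest k with P(X >= k) <= alpha for X ~ Poisson(lam);
    per-second counts at or above k are flagged as anomalous."""
    k, p, cdf = 0, math.exp(-lam), 0.0
    while cdf < 1 - alpha:
        cdf += p
        k += 1
        p *= lam / k
    return k

random.seed(3)
lam = 50.0                      # assumed baseline packet rate per second
limit = poisson_upper_limit(lam)

def second_of_traffic(rate):
    # Bernoulli thinning of 1000 slots approximates a Poisson count
    return sum(random.random() < rate / 1000 for _ in range(1000))

# 30 seconds of traffic, with a scan burst doubling the rate in seconds 10-12
counts = [second_of_traffic(2 * lam if 10 <= t <= 12 else lam) for t in range(30)]
alarms = [t for t, c in enumerate(counts) if c >= limit]
print(limit, alarms)
```

The burst seconds stand far above the 99.9% Poisson envelope, while normal seconds rarely do; the quality of such detection rests entirely on how well the Poisson baseline fits real traffic, which is the modelling question the workshop addresses.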
Aerospace medicine and biology. A continuing bibliography with indexes, supplement 195
This bibliography lists 148 reports, articles, and other documents introduced into the NASA scientific and technical information system in June 1979.
Keyed Non-Parametric Hypothesis Tests
The recent popularity of machine learning calls for a deeper understanding of
AI security. Amongst the numerous AI threats documented so far, poisoning
attacks currently attract considerable attention. In a poisoning attack the
opponent partially tampers with the dataset used for learning to mislead the
classifier during the testing phase.
This paper proposes a new protection strategy against poisoning attacks. The
technique relies on a new primitive called keyed non-parametric hypothesis
tests, which allow evaluating, under adversarial conditions, the training
input's conformance with a previously learned distribution. To do so we
use a secret key unknown to the opponent.
Keyed non-parametric hypothesis tests differ from classical tests in that
the secrecy of the key prevents the opponent from misleading the keyed test
into concluding that a (significantly) tampered dataset belongs to
the learned distribution.
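One way to picture the primitive, as a sketch only (the paper's construction is more general): the secret key seeds a private subsample of the incoming batch, and an ordinary two-sample Kolmogorov-Smirnov test is applied to that subsample. All names, sizes, and thresholds below are illustrative assumptions:

```python
import random

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap
    between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    return max(abs(sum(v <= x for v in a) / len(a)
                   - sum(v <= x for v in b) / len(b))
               for x in a + b)

def keyed_test(batch, reference, key, m=100, threshold=0.2):
    """The secret key seeds a private subsample of the batch, so an
    opponent who cannot predict which points will be inspected cannot
    craft a tampered batch that passes."""
    rng = random.Random(key)            # key acts as the secret seed
    sub = rng.sample(batch, m)
    return ks_statistic(sub, reference) < threshold

data_rng = random.Random(4)
reference = [data_rng.gauss(0, 1) for _ in range(500)]   # learned distribution
clean = [data_rng.gauss(0, 1) for _ in range(500)]
poisoned = [x + 4 if i % 3 == 0 else x for i, x in enumerate(clean)]

print(keyed_test(clean, reference, key=12345),
      keyed_test(poisoned, reference, key=12345))
```

A clean batch passes while a batch with a third of its points shifted fails, because the keyed subsample still contains enough tampered points to move the empirical CDF.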
Inefficiency in the German Mechanical Engineering Sector
This paper examines the relative efficiency of German engineering firms using a sample of roughly 23,000 observations between 1995 and 2004. As these firms were successful in the examination period in terms of output and export growth, it is expected that a majority of firms operate quite efficiently and that the density of efficiency scores is skewed to the left. Moreover, as the German engineering industry is dominated by medium-sized firms, the question arises whether these firms are the most efficient ones. Finally, an increasing efficiency gap between size classes over time would be important, since it would signal a structural problem within the industry. The analysis, using recently developed DEA methods such as bootstrapping and outlier detection, contradicts the first two expectations. The firms proved to operate quite inefficiently, with an overall mean of 0.69, and efficiency differs significantly with firm size, with medium-sized firms being on average the least efficient ones. When looking at changes in efficiency over time, we find a decreasing efficiency gap between size classes.
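DEA proper solves a linear program per firm; as a lightweight stand-in, the closely related Free Disposal Hull (FDH) estimator can be computed with comparisons alone. The single-input, single-output firm data below are made up for illustration:

```python
def fdh_input_efficiency(data):
    """Free Disposal Hull (FDH) input efficiency, a simple relative of
    DEA needing no linear programming: a firm's score is the smallest
    input among firms producing at least as much output, divided by its
    own input. A score of 1.0 means the firm is on the frontier."""
    scores = []
    for x_i, y_i in data:
        feasible = [x_j for x_j, y_j in data if y_j >= y_i]
        scores.append(min(feasible) / x_i)
    return scores

# (input, output) pairs for five hypothetical firms
firms = [(10, 100), (12, 100), (8, 80), (20, 150), (25, 150)]
print([round(s, 2) for s in fdh_input_efficiency(firms)])
# -> [1.0, 0.83, 1.0, 1.0, 0.8]
```

A score of 0.83 says the firm could, judging from its peers, produce the same output with 83% of its input; the paper's bootstrap corrections address the well-known upward bias of such frontier estimates in finite samples.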
False discovery rate regression: an application to neural synchrony detection in primary visual cortex
Many approaches for multiple testing begin with the assumption that all tests
in a given study should be combined into a global false-discovery-rate
analysis. But this may be inappropriate for many of today's large-scale
screening problems, where auxiliary information about each test is often
available, and where a combined analysis can lead to poorly calibrated error
rates within different subsets of the experiment. To address this issue, we
introduce an approach called false-discovery-rate regression that directly uses
this auxiliary information to inform the outcome of each test. The method can
be motivated by a two-groups model in which covariates are allowed to influence
the local false discovery rate, or equivalently, the posterior probability that
a given observation is a signal. This poses many subtle issues at the interface
between inference and computation, and we investigate several variations of the
overall approach. Simulation evidence suggests that: (1) when covariate effects
are present, FDR regression improves power for a fixed false-discovery rate;
and (2) when covariate effects are absent, the method is robust, in the sense
that it does not lead to inflated error rates. We apply the method to neural
recordings from primary visual cortex. The goal is to detect pairs of neurons
that exhibit fine-time-scale interactions, in the sense that they fire together
more often than expected due to chance. Our method detects roughly 50% more
synchronous pairs versus a standard FDR-controlling analysis. The companion R
package FDRreg implements all methods described in the paper.
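The baseline that FDR regression extends is the global Benjamini-Hochberg step-up procedure, which a short sketch makes concrete (the p-values below are made up):

```python
def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg step-up procedure: reject the k smallest
    p-values, where k is the largest rank i with p_(i) <= q * i / m.
    Returns a rejection flag per test, in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, 1):
        if pvals[i] <= q * rank / m:
            k = rank            # keep the largest rank passing the bound
    rejected = set(order[:k])
    return [i in rejected for i in range(m)]

pv = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(pv, q=0.05))
# -> [True, True, False, False, False, False, False, False, False, False]
```

This global rule applies one threshold to every test; FDR regression instead lets covariates (here, properties of each neuron pair) shift the local false discovery rate per test, which is how it recovers the additional synchronous pairs.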