Outlier Detection and Missing Value Estimation in Time Series Traffic Count Data: Final Report of SERC Project GR/G23180.
A serious problem in analysing traffic count data is what to do when missing or extreme values occur, perhaps as a result of a breakdown in automatic counting equipment. The objectives of this work were to address this problem by:
1) establishing the applicability of time series and influence function techniques for estimating missing values and detecting outliers in time series traffic data;
2) making a comparative assessment of new techniques with those used by traffic engineers in practice for local, regional or national traffic count systems.
Two alternative approaches were identified as being potentially useful and these were evaluated and compared with methods currently employed for 'cleaning' traffic count series. These were based on evaluating the effect of individual observations, or groups of observations, on the estimate of the autocorrelation structure, and on events influencing a parametric model (ARIMA).
These were compared with the existing methods which included visual inspection and smoothing techniques such as the exponentially weighted moving average in which means and variances are updated using observations from the same time and day of week.
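The smoothing approach described above can be sketched roughly as follows. This is a minimal illustration, not the procedure used in the report: the function name `ewma_screen`, the smoothing constant `alpha`, the threshold `k`, the `warmup` count and the slot labels are all assumptions made for the example. Each count is compared with the running mean and variance of its own slot (same time and day of week), and the running statistics are updated only with accepted values.

```python
from collections import defaultdict
from math import sqrt


def ewma_screen(counts, slots, alpha=0.2, k=3.0, warmup=3):
    """Flag counts lying more than k standard deviations from the
    exponentially weighted mean of their own slot.

    counts: observed traffic counts
    slots:  one label per count, e.g. "Mon-08" (hypothetical labelling)
    alpha, k, warmup: illustrative tuning values, not from the report
    """
    mean = {}                   # running mean per slot
    var = defaultdict(float)    # running variance per slot
    seen = defaultdict(int)     # accepted observations per slot
    flags = []
    for y, slot in zip(counts, slots):
        if slot not in mean:
            # First observation of a slot initialises its mean.
            mean[slot] = float(y)
            seen[slot] = 1
            flags.append(False)
            continue
        dev = y - mean[slot]
        sd = sqrt(var[slot])
        is_outlier = seen[slot] >= warmup and sd > 0 and abs(dev) > k * sd
        flags.append(is_outlier)
        if not is_outlier:
            # Standard incremental updates for an EWMA mean and variance.
            mean[slot] += alpha * dev
            var[slot] = (1 - alpha) * (var[slot] + alpha * dev * dev)
            seen[slot] += 1
    return flags
```

In a sketch like this the slot mean at the time of flagging would be the natural replacement value, which is one way to see why such replacements depend heavily on how the running statistics were initialised and updated.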
The results showed advantages and disadvantages for each of the methods.
The exponentially weighted moving average method tended to detect unreasonable outliers and also suggested replacements which were consistently larger than could reasonably be expected.
Methods based on the autocorrelation structure were reasonably successful in detecting events but the replacement values were suspect particularly when there were groups of values needing replacement. The methods also had problems in the presence of non-stationarity, often detecting outliers which were really a result of the changing level of the data rather than extreme values. In the presence of other events, such as a change in level or seasonality, both the influence function and change in autocorrelation present problems of interpretation since there is no way of distinguishing these events from outliers.
It is clear that the outlier problem cannot be separated from that of identifying structural changes as many of the statistics used to identify outliers also respond to structural changes. The ARIMA (1,0,0)(0,1,1)7 was found to describe the vast majority of traffic count series which means that the problem of identifying a starting model can largely be avoided with a high degree of assurance.
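For reference, the ARIMA (1,0,0)(0,1,1)7 model named above can be written out in backshift notation. The symbols are assumptions introduced here, following the usual Box-Jenkins sign convention: y_t is the daily count, B the backshift operator, \phi the non-seasonal autoregressive parameter, \Theta the seasonal moving average parameter and \varepsilon_t white noise.

```latex
(1 - \phi B)\,(1 - B^{7})\, y_t = (1 - \Theta B^{7})\, \varepsilon_t
\quad\Longleftrightarrow\quad
w_t = \phi\, w_{t-1} + \varepsilon_t - \Theta\, \varepsilon_{t-7},
\qquad w_t = y_t - y_{t-7}.
```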
Unfortunately it is clear that a black-box approach to data validation is prone to error but methods such as those described above lend themselves to an interactive graphics data-validation technique in which outliers and other events are highlighted requiring acceptance or otherwise manually. An adaptive approach to fitting the model may result in something which can be more automatic and this would allow for changes in the underlying model to be accommodated.
In conclusion it was found that methods based on the autocorrelation structure are the most computationally efficient but lead to problems of interpretation both between different types of event and in the presence of non-stationarity. Using the residuals from a fitted ARIMA model is the most successful method at finding outliers and distinguishing them from other events, being less expensive than case deletion. The replacement values derived from the ARIMA model were found to be the most accurate.
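A residual-based check of the kind summarised above might look like the following sketch. It assumes the ARIMA (1,0,0)(0,1,1)7 form with parameters `phi` and `theta` already estimated elsewhere, sets pre-sample values to zero, and uses a median absolute deviation scale with threshold `k`. These choices and the function names are illustrative assumptions, not details taken from the report.

```python
from statistics import median


def sarima_residuals(y, phi, theta, s=7):
    """Residuals of (1 - phi*B)(1 - B^s) y_t = (1 - theta*B^s) e_t,
    with pre-sample values set to zero (a conditional approximation)."""
    w = [y[t] - y[t - s] for t in range(s, len(y))]  # seasonal difference
    e = []
    for t in range(len(w)):
        w_prev = w[t - 1] if t >= 1 else 0.0
        e_prev = e[t - s] if t >= s else 0.0
        e.append(w[t] - phi * w_prev + theta * e_prev)
    return e


def flag_outliers(y, phi, theta, s=7, k=4.0):
    """Return indices into y whose residual is more than k robust
    standard deviations from the median residual."""
    e = sarima_residuals(y, phi, theta, s)
    med = median(e)
    mad = median([abs(r - med) for r in e])
    scale = 1.4826 * mad  # MAD rescaled to match a normal standard deviation
    if scale == 0:
        return []
    return [i + s for i, r in enumerate(e) if abs(r - med) > k * scale]
```

A single extreme value also disturbs the residuals one step and one season later, so a flagged index a day or a week after a large outlier may be an echo rather than a separate event.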
SETAR Modelling of Traffic Count Data.
As part of a SERC funded project investigating outlier detection and replacement with transport data, univariate Box-Jenkins (1976) models have already been successfully applied to traffic count series (see Redfern et al., 1992). However, the underlying assumption of normality for ARIMA models implies they are not ideally suited for time series exhibiting certain behavioural characteristics. The limitations of ARIMA models are discussed in some detail by Tong (1983), including problems with time irreversibility, non-normality, cyclicity and asymmetry. Data with irregularly spaced extreme values are unlikely to be modelled well by ARIMA models, which are better suited to data where the probability of a very high value is small. Tong (1983) argues that one way of modelling such non-normal behaviour might be to retain the general ARIMA framework and allow the white noise element to be non-Gaussian. As an alternative he proposes abandoning the linearity assumption and defines a group of non-linear structures, one of which is the Self-Exciting Threshold Autoregressive (SETAR) model. The model form is described in more detail below but basically consists of two (or more) piecewise linear models, with the time series "tripping" between each model according to its value with respect to a threshold point. The model is called "Self-Exciting" because the indicator variable determining the appropriate linear model for each piece of data is itself a function of the data series. Intuitively this means the mechanism driving the alternation between each model form is not an external input such as a related time series (other models can be defined where this exists), but is actually contained within the series itself. The series is thus Self-Exciting.
The three concepts embedded within the SETAR model structure are those of the threshold, limit cycle and time delay, each of which can be illustrated by the diverse applications such models can take.
The threshold can be defined as some point beyond which, if the data falls, the series structure changes inherently and so an alternative linear model form would be appropriate. In hydrology this is seen as the non-linearity of soil infiltration, where at the soil saturation point (threshold) a new model for infiltration would become appropriate.
Limit cycles describe the stable cyclical phenomena which we sometimes observe within time series. The cyclical behaviour is stationary, i.e. consists of regular, sustained oscillations and is an intrinsic property of the data. The limit cycle phenomenon is physically observable in the field of radio-engineering where a triode valve is used to generate oscillations (see Tong, 1983 for a full description). Essentially the triode valve produces self-sustaining oscillations between emitting and collecting electrons, according to the voltage value of a grid placed between the anode and cathode (thereby acting as the threshold indicator).
The third essential concept within the SETAR structure is that of the time delay and is perhaps intuitively the easiest to grasp. It can be seen within the field of population biology where many types of non-linear model may apply. For example within the cyclical oscillations of blowfly population data there is an inbuilt "feedback" mechanism given by the hatching period for eggs, which would give rise to a time delay parameter within the model. For some processes this inherent delay may be so small as to be virtually instantaneous and so the delay parameter could be omitted.
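The threshold and time delay ideas can be made concrete with a small simulation. The sketch below assumes a two-regime model with a single autoregressive lag in each regime. The function name, the threshold `r`, the delay `d` and the (intercept, slope) pairs are illustrative choices, not values from the paper.

```python
import random


def simulate_setar(n, r, d, low, high, sigma=1.0, seed=0):
    """Simulate x_t = c + a * x_{t-1} + e_t, where (c, a) is `low`
    when x_{t-d} <= r and `high` otherwise.

    The regime is chosen by the series' own past value x_{t-d},
    which is what makes the model self-exciting. Requires d >= 1.
    """
    rng = random.Random(seed)
    x = [0.0] * d          # start-up values
    regimes = []           # 0 = lower regime, 1 = upper regime
    for t in range(d, n + d):
        in_low = x[t - d] <= r
        c, a = low if in_low else high
        regimes.append(0 if in_low else 1)
        x.append(c + a * x[t - 1] + rng.gauss(0.0, sigma))
    return x[d:], regimes
```

With the noise switched off (`sigma=0.0`), `r=0.0`, `d=1`, `low=(1.0, 0.5)` and `high=(-1.0, 0.5)`, the series alternates between the two regimes at every step, which is a simple instance of the sustained oscillation that the limit cycle concept describes.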
In general time series Tong (1983) found the SETAR model well suited to the cyclical nature of the Canadian Lynx trapping series and for modelling riverflow systems (Tong, Thanoon & Gudmundsson, 1984). Here we investigate their applicability with time series traffic counts, some of which have exhibited the type of non-linear and cyclical characteristics which could undermine a straightforward linear modelling process.
Broadening the Scope of Nanopublications
In this paper, we present an approach for extending the existing concept of
nanopublications --- tiny entities of scientific results in RDF representation
--- to broaden their application range. The proposed extension uses English
sentences to represent informal and underspecified scientific claims. These
sentences follow a syntactic and semantic scheme that we call AIDA (Atomic,
Independent, Declarative, Absolute), which provides a uniform and succinct
representation of scientific assertions. Such AIDA nanopublications are
compatible with the existing nanopublication concept and enjoy most of its
advantages such as information sharing, interlinking of scientific findings,
and detailed attribution, while being more flexible and applicable to a much
wider range of scientific results. We show that users are able to create AIDA
sentences for given scientific results quickly and at high quality, and that it
is feasible to automatically extract and interlink AIDA nanopublications from
existing unstructured data sources. To demonstrate our approach, a web-based
interface is introduced, which also exemplifies the use of nanopublications for
non-scientific content, including meta-nanopublications that describe other
nanopublications.
Comment: To appear in the Proceedings of the 10th Extended Semantic Web
Conference (ESWC 2013)
Intra- and interspecies interactions between prion proteins and effects of mutations and polymorphisms
Recently, crystallization of the prion protein in a dimeric form was reported. Here we show that native soluble homogenous FLAG-tagged prion proteins from hamster, man and cattle expressed in the baculovirus system are predominantly dimeric. The PrP/PrP interaction was confirmed in Semliki Forest virus-RNA transfected BHK cells co-expressing FLAG- and oligohistidine-tagged human PrP. The yeast two-hybrid system identified the octarepeat region and the C-terminal structured domain (aa90-aa230) of PrP as PrP/PrP interaction domains. Additional octarepeats identified in patients suffering from fCJD reduced (wtPrP versus PrP+90R) and completely abolished (PrP+90R versus PrP+90R) the PrP/PrP interaction in the yeast two-hybrid system. In contrast, the Met/Val polymorphism (aa129), the GSS mutation Pro102Leu and the FFI mutation Asp178Asn did not affect PrP/PrP interactions. Proof of interactions between human or sheep and bovine PrP, and sheep and human PrP, as well as lack of interactions between human or bovine PrP and hamster PrP suggest that interspecies PrP interaction studies in the yeast two-hybrid system may serve as a rapid pre-assay to investigate species barriers in prion diseases.
Trust in Crowds: probabilistic behaviour in anonymity protocols
The existing analysis of the Crowds anonymity protocol assumes that a participating member is either ‘honest’ or ‘corrupted’. This paper generalises this analysis so that each member is assumed to maliciously disclose the identity of other nodes with a probability determined by her vulnerability to corruption. Within this model, the trust in a principal is defined to be the probability that she behaves honestly. We investigate the effect of such a probabilistic behaviour on the anonymity of the principals participating in the protocol, and formulate the necessary conditions to achieve ‘probable innocence’. Using these conditions, we propose a generalised Crowds-Trust protocol which uses trust information to achieve ‘probable innocence’ for principals exhibiting probabilistic behaviour.
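As background for the two-valued baseline that the abstract refers to, the classic Crowds analysis (Reiter and Rubin) gives a sufficient condition for probable innocence: with n members, c corrupted members and forwarding probability pf > 1/2, it holds when n >= pf / (pf - 1/2) * (c + 1). The sketch below encodes only that classic condition. It is not the generalised, trust-based condition derived in this paper, and the function name is a hypothetical choice.

```python
def classic_probable_innocence(n, c, pf):
    """Sufficient condition from the original Crowds analysis, where
    every member is either fully honest or fully corrupted.

    n:  total number of crowd members
    c:  number of corrupted members
    pf: probability that a member forwards to another member
    """
    if pf <= 0.5:
        return False  # the bound is only stated for pf > 1/2
    return n >= pf / (pf - 0.5) * (c + 1)
```

For example, with pf = 0.75 and c = 5 the bound requires at least 18 members.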
The Christiansen Effect in Saturn's narrow dusty rings and the spectral identification of clumps in the F ring
Stellar occultations by Saturn's rings observed with the Visual and Infrared
Mapping Spectrometer (VIMS) onboard the Cassini spacecraft reveal that dusty
features such as the F ring and the ringlets in the Encke and the Laplace Gaps
have distinctive infrared transmission spectra. These spectra show a narrow
optical depth minimum at wavelengths around 2.87 microns. This minimum is
likely due to the Christiansen Effect, a reduction in the extinction of small
particles when their (complex) refractive index is close to that of the
surrounding medium. Simple Mie-scattering models demonstrate that the strength
of this opacity dip is sensitive to the size distribution of particles between
1 and 100 microns across. Furthermore, the spatial resolution of the
occultation data is sufficient to reveal variations in the transmission spectra
within and among these rings. For example, in both the Encke Gap ringlets and F
ring, the opacity dip weakens with increasing local optical depth, which is
consistent with the larger particles being concentrated near the cores of these
rings. The strength of the opacity dip varies most dramatically within the F
ring; certain compact regions of enhanced optical depth lack an opacity dip and
therefore appear to have a greatly reduced fraction of grains in the few-micron
size range. Such spectrally-identifiable structures probably represent a subset
of the compact optically-thick clumps observed by other Cassini instruments.
These variations in the ring's particle size distribution can provide new
insights into the processes of grain aggregation, disruption and transport
within dusty rings. For example, the unusual spectral properties of the F-ring
clumps could perhaps be ascribed to small grains adhering onto the surface of
larger particles in regions of anomalously low velocity dispersion.
Comment: 42 pages, 15 figures, accepted for publication in Icarus. A few small
typographical errors fixed to match correction in proof
The algebra of adjacency patterns: Rees matrix semigroups with reversion
We establish a surprisingly close relationship between universal Horn classes
of directed graphs and varieties generated by so-called adjacency semigroups
which are Rees matrix semigroups over the trivial group with the unary
operation of reversion. In particular, the lattice of subvarieties of the
variety generated by adjacency semigroups that are regular unary semigroups is
essentially the same as the lattice of universal Horn classes of reflexive
directed graphs. A number of examples follow, including a limit variety of
regular unary semigroups and finite unary semigroups with NP-hard variety
membership problems.
Comment: 30 pages, 9 figures
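The construction behind the abstract above can be illustrated in a few lines. The sketch assumes the usual definition of an adjacency semigroup: for a directed graph with vertex set V and edge set E, the elements are the pairs in V x V together with a zero, the product (x, y)(z, t) equals (x, t) when (y, z) is an edge and zero otherwise, and reversion swaps the two coordinates. The function names and the use of `None` for the zero element are choices made for the example.

```python
from itertools import product

ZERO = None  # the zero element of the semigroup


def multiply(edges, a, b):
    """(x, y)(z, t) = (x, t) if (y, z) is an edge, otherwise zero."""
    if a is ZERO or b is ZERO:
        return ZERO
    (x, y), (z, t) = a, b
    return (x, t) if (y, z) in edges else ZERO


def reverse(a):
    """Reversion: (x, y) -> (y, x), with zero fixed."""
    return ZERO if a is ZERO else (a[1], a[0])


def elements(vertices):
    return [ZERO] + list(product(vertices, repeat=2))


def is_associative(vertices, edges):
    els = elements(vertices)
    return all(
        multiply(edges, multiply(edges, a, b), c)
        == multiply(edges, a, multiply(edges, b, c))
        for a in els for b in els for c in els
    )


def is_regular_unary(vertices, edges):
    """Check the identity x x' x = x for every element x."""
    return all(
        multiply(edges, multiply(edges, a, reverse(a)), a) == a
        for a in elements(vertices)
    )
```

On small examples the identity x x' x = x holds when every vertex carries a loop and fails otherwise, which matches the pairing of regular unary semigroups with reflexive directed graphs stated in the abstract.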
The Stochastic Dynamics of an Array of Atomic Force Microscopes in a Viscous Fluid
We consider the stochastic dynamics of an array of two closely spaced atomic
force microscope cantilevers in a viscous fluid for use as a possible
biomolecule sensor. The cantilevers are not driven externally, as is common in
applications of atomic force microscopy, and we explore the stochastic
cantilever dynamics due to the constant buffeting of fluid particles by
Brownian motion. The stochastic dynamics of two adjacent cantilevers are
correlated due to long range effects of the viscous fluid. Using a recently
proposed thermodynamic approach the hydrodynamic correlations are quantified
for precise experimental conditions through deterministic numerical
simulations. Results are presented for an array of two readily available atomic
force microscope cantilevers. It is shown that the force on a cantilever due to
the fluid correlations with an adjacent cantilever is more than 3 times smaller
than the Brownian force on an individual cantilever. Our results indicate that
measurements of the correlations in the displacement of an array of atomic
force microscopes can detect piconewton forces with microsecond time
resolution.
Comment: 7 page article with 11 images submitted to the International Journal
of Nonlinear Mechanics
The impact of deep-sea fisheries and implementation of the UNGA Resolutions 61/105 and 64/72. Report of an international scientific workshop
The scientific workshop to review fisheries management, held in Lisbon in May 2011, brought together 22 scientists and fisheries experts from around the world to consider the United Nations General Assembly (UNGA) resolutions on high seas bottom fisheries: what progress has been made and what the outstanding issues are. This report summarises the workshop conclusions, identifying examples of good practice and making recommendations in areas where it was agreed that the current management measures fall short of their target.
Stationary solutions of the one-dimensional nonlinear Schroedinger equation: I. Case of repulsive nonlinearity
All stationary solutions to the one-dimensional nonlinear Schroedinger
equation under box and periodic boundary conditions are presented in analytic
form. We consider the case of repulsive nonlinearity; in a companion paper we
treat the attractive case. Our solutions take the form of stationary trains of
dark or grey density-notch solitons. Real stationary states are in one-to-one
correspondence with those of the linear Schrödinger equation. Complex
stationary states are uniquely nonlinear, nodeless, and symmetry-breaking. Our
solutions apply to many physical contexts, including the Bose-Einstein
condensate and optical pulses in fibers.
Comment: 11 pages, 7 figures -- revised version
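As a small numerical illustration of a dark soliton in the repulsive case, the sketch below assumes the stationary equation in the normalisation -f'' + eta * f^3 = mu * f with eta > 0 (the paper's own scaling may differ) and checks the textbook infinite-line solution f(x) = sqrt(mu / eta) * tanh(sqrt(mu / 2) * x) by finite differences. The solutions under box and periodic boundary conditions that the abstract describes are not reproduced here, and the function names and step size are illustrative.

```python
from math import sqrt, tanh


def dark_soliton(x, mu, eta):
    """Infinite-line dark soliton for -f'' + eta * f**3 = mu * f."""
    return sqrt(mu / eta) * tanh(sqrt(mu / 2.0) * x)


def residual(x, mu, eta, h=1e-3):
    """Left side minus right side of the stationary equation at x,
    using a central difference for the second derivative."""
    f_minus = dark_soliton(x - h, mu, eta)
    f_mid = dark_soliton(x, mu, eta)
    f_plus = dark_soliton(x + h, mu, eta)
    second = (f_plus - 2.0 * f_mid + f_minus) / (h * h)
    return -second + eta * f_mid ** 3 - mu * f_mid
```

The residual stays at the level of the discretisation error across the notch, and the density f^2 vanishes at x = 0, which is the density notch the abstract refers to.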
