
    Outlier Detection and Missing Value Estimation in Time Series Traffic Count Data: Final Report of SERC Project GR/G23180.

    A serious problem in analysing traffic count data is what to do when missing or extreme values occur, perhaps as a result of a breakdown in automatic counting equipment. The objectives of this work were to look at ways of solving this problem by: 1) establishing the applicability of time series and influence function techniques for estimating missing values and detecting outliers in time series traffic data; 2) making a comparative assessment of new techniques with those used by traffic engineers in practice for local, regional or national traffic count systems. Two alternative approaches were identified as being potentially useful and these were evaluated and compared with methods currently employed for 'cleaning' traffic count series. These were based on evaluating the effect of individual or groups of observations on the estimate of the autocorrelation structure and events influencing a parametric model (ARIMA). These were compared with the existing methods which included visual inspection and smoothing techniques such as the exponentially weighted moving average in which means and variances are updated using observations from the same time and day of week. The results showed advantages and disadvantages for each of the methods. The exponentially weighted moving average method tended to detect unreasonable outliers and also suggested replacements which were consistently larger than could reasonably be expected. Methods based on the autocorrelation structure were reasonably successful in detecting events but the replacement values were suspect, particularly when there were groups of values needing replacement. The methods also had problems in the presence of non-stationarity, often detecting outliers which were really a result of the changing level of the data rather than extreme values.
In the presence of other events, such as a change in level or seasonality, both the influence function and change in autocorrelation present problems of interpretation since there is no way of distinguishing these events from outliers. It is clear that the outlier problem cannot be separated from that of identifying structural changes, as many of the statistics used to identify outliers also respond to structural changes. The ARIMA (1,0,0)(0,1,1)₇ model was found to describe the vast majority of traffic count series, which means that the problem of identifying a starting model can largely be avoided with a high degree of assurance. Unfortunately, it is clear that a black-box approach to data validation is prone to error, but methods such as those described above lend themselves to an interactive graphics data-validation technique in which outliers and other events are highlighted and must be accepted or rejected manually. An adaptive approach to fitting the model may result in something which can be more automatic, and this would allow for changes in the underlying model to be accommodated. In conclusion, it was found that methods based on the autocorrelation structure are the most computationally efficient but lead to problems of interpretation both between different types of event and in the presence of non-stationarity. Using the residuals from a fitted ARIMA model is the most successful method at finding outliers and distinguishing them from other events, being less expensive than case deletion. The replacement values derived from the ARIMA model were found to be the most accurate.
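The residual-based detection described in this abstract can be illustrated with a small sketch. It is not the report's implementation: it assumes a (1,0,0)(0,1,1) model with seasonal period 7, fits it with a crude conditional-sum-of-squares grid search, and flags residuals that are large relative to a robust (MAD-based) scale. The function names, the grid, the threshold `k` and the synthetic daily counts are all hypothetical choices made for illustration.

```python
import random
import statistics


def css_residuals(y, phi, theta, s=7):
    """Residuals of (1 - phi*B)(1 - B^s) y_t = (1 - theta*B^s) e_t."""
    w = [y[t] - y[t - s] for t in range(s, len(y))]  # seasonal difference
    e = [0.0] * len(w)
    for t in range(1, len(w)):
        past_e = e[t - s] if t >= s else 0.0  # unknown early shocks set to 0
        e[t] = w[t] - phi * w[t - 1] + theta * past_e
    return e  # e[i] belongs to observation y[i + s]


def fit_by_grid(y, s=7):
    """Coarse grid search minimising the conditional sum of squares."""
    grid = [i / 20 for i in range(-18, 19)]  # -0.90 ... 0.90
    best = None
    for phi in grid:
        for theta in grid:
            sse = sum(r * r for r in css_residuals(y, phi, theta, s))
            if best is None or sse < best[0]:
                best = (sse, phi, theta)
    return best[1], best[2]


def flag_outliers(y, s=7, k=4.0):
    """Return indices of y whose residual exceeds k robust standard deviations."""
    phi, theta = fit_by_grid(y, s)
    e = css_residuals(y, phi, theta, s)
    centre = statistics.median(e)
    mad = statistics.median([abs(r - centre) for r in e])
    scale = 1.4826 * mad  # MAD rescaled to match a normal standard deviation
    return [i + s for i, r in enumerate(e) if abs(r - centre) > k * scale]


# Synthetic daily counts with a weekly pattern and one simulated counter fault.
random.seed(1)
weekly = [100, 120, 125, 123, 130, 90, 70]
counts = [weekly[t % 7] + random.gauss(0, 3) for t in range(140)]
counts[70] += 60
flagged = flag_outliers(counts)
```

A single additive outlier can also leave smaller echoes in later residuals (for example one seasonal period on), so any extra flags near the fault need interpretation rather than automatic replacement.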

    Setar Modelling of Traffic Count Data.

    As part of a SERC funded project investigating outlier detection and replacement with transport data, univariate Box-Jenkins (1976) models have already been successfully applied to traffic count series (see Redfern et al., 1992). However, the underlying assumption of normality for ARIMA models implies they are not ideally suited for time series exhibiting certain behavioural characteristics. The limitations of ARIMA models are discussed in some detail by Tong (1983), including problems with time irreversibility, non-normality, cyclicity and asymmetry. Data with irregularly spaced extreme values are unlikely to be modelled well by ARIMA models, which are better suited to data where the probability of a very high value is small. Tong (1983) argues that one way of modelling such non-normal behaviour might be to retain the general ARIMA framework and allow the white noise element to be non-Gaussian. As an alternative he proposes abandoning the linearity assumption and defines a group of non-linear structures, one of which is the Self-Exciting Threshold Autoregressive (SETAR) model. The model form is described in more detail below but basically consists of two (or more) piecewise linear models, with the time series "tripping" between each model according to its value with respect to a threshold point. The model is called "Self-Exciting" because the indicator variable determining the appropriate linear model for each piece of data is itself a function of the data series. Intuitively this means the mechanism driving the alternation between each model form is not an external input such as a related time series (other models can be defined where this exists), but is actually contained within the series itself. The series is thus Self-Exciting. The three concepts embedded within the SETAR model structure are those of the threshold, limit cycle and time delay, each of which can be illustrated by the diverse applications such models can take.
The threshold can be defined as some point beyond which, if the data falls, the series structure changes inherently and so an alternative linear model form would be appropriate. In hydrology this is seen as the non-linearity of soil infiltration, where at the soil saturation point (threshold) a new model for infiltration would become appropriate. Limit cycles describe the stable cyclical phenomena which we sometimes observe within time series. The cyclical behaviour is stationary, i.e. consists of regular, sustained oscillations, and is an intrinsic property of the data. The limit cycle phenomenon is physically observable in the field of radio-engineering where a triode valve is used to generate oscillations (see Tong, 1983 for a full description). Essentially the triode valve produces self-sustaining oscillations between emitting and collecting electrons, according to the voltage value of a grid placed between the anode and cathode (thereby acting as the threshold indicator). The third essential concept within the SETAR structure is that of the time delay and is perhaps intuitively the easiest to grasp. It can be seen within the field of population biology where many types of non-linear model may apply. For example, within the cyclical oscillations of blowfly population data there is an inbuilt "feedback" mechanism given by the hatching period for eggs, which would give rise to a time delay parameter within the model. For some processes this inherent delay may be so small as to be virtually instantaneous and so the delay parameter could be omitted. In general time series, Tong (1983) found the SETAR model well suited to the cyclical nature of the Canadian Lynx trapping series and for modelling riverflow systems (Tong, Thanoon & Gudmundsson, 1984). Here we investigate their applicability with time series traffic counts, some of which have exhibited the type of non-linear and cyclical characteristics which could undermine a straightforward linear modelling process.
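The threshold, delay and self-exciting switching described in this abstract can be made concrete with a small simulation. This is only a sketch of a two-regime SETAR process with first-order autoregressions in each regime. The function name, parameter values and noise level are hypothetical and are not taken from the paper.

```python
import random


def simulate_setar(n, threshold, delay, lower, upper, sigma=0.5, seed=0):
    """Simulate a two-regime SETAR series.

    lower and upper are (constant, slope) pairs for the AR(1) model used
    when the value `delay` steps back is <= threshold or > threshold.
    """
    rng = random.Random(seed)
    y = [0.0] * delay          # start-up values, dropped from the output
    regimes = []
    for _ in range(n):
        # The switch depends on the series' own past value: "self-exciting".
        in_lower = y[-delay] <= threshold
        const, slope = lower if in_lower else upper
        regimes.append(0 if in_lower else 1)
        y.append(const + slope * y[-1] + rng.gauss(0, sigma))
    return y[delay:], regimes


series, regimes = simulate_setar(
    n=200, threshold=0.0, delay=2, lower=(1.0, 0.6), upper=(-1.0, 0.4)
)
```

With these illustrative parameters the lower regime pushes the series up and the upper regime pushes it down, so the series keeps crossing the threshold and oscillates, which is a simple picture of the sustained cyclical behaviour the abstract calls a limit cycle.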

    Polar Bear Conservation in Canada: Defining the Policy Problems

    Conservation of polar bears (Ursus maritimus) in Canada is based on the goals and principles of the 1973 International Agreement on the Conservation of Polar Bears and Their Habitat, and has long been considered an exemplar of science-based wildlife management. However, accelerating social and ecological changes in the Arctic raise questions about the polar bear management regime’s ability to adapt successfully to new challenges. We apply the analytic framework of the policy sciences to develop a comprehensive orientation to this evolving situation, and we suggest possible ways to define and advance shared goals of stakeholders and other participants. We conclude that the decision process in polar bear management does not sufficiently foster identification and securing of common interests among participants who express multiple, competing perspectives in an arena that has been increasingly fragmented and symbolically charged by issues such as the recent listing of polar bears under the U.S. Endangered Species Act. The fundamental challenge for polar bear conservation in Canada is to design a better decision process so that it can constructively reconcile the various perspectives, demands, and expectations of stakeholders.

    Application of Outlier Detection and Missing Value Estimation Techniques to Various Forms of Traffic Count Data

    This paper reports on the application of suitable techniques for detecting outliers and suggesting estimates for missing values in various forms of traffic count data. The data used in this study came from three sources. The first set was provided by the Department of Transport's (DOT) regional office in Leeds and consists of automatic hourly traffic counts at four sites. The second set was part of a larger database provided by West Yorkshire Highways, Engineering and Technical Services (HETS). This set consists of automatic half hourly traffic counts on a single site. The third and final set was provided by Nottingham University and consists of automatic five minute traffic counts at 40 locations in close proximity to each other in Leicester. Three suitable techniques emerged from pilot studies of such series conducted by Watson et al. (1992a) and Redfern et al. (1992). The three techniques are: a) maintaining an average and variability measure over time; b) ARIMA modelling with detection of large residuals; c) a point's influence on the correlation structure of the series. A fourth technique, by-eye detection and estimation, provides an intuitive comparison for the first three techniques.
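Technique a) can be sketched as follows. The abstract does not say how the average and variability measure are maintained, so this example assumes an exponentially weighted mean and variance kept separately for each time-of-week slot. The function name, the smoothing weight `alpha`, the threshold `k`, the warm-up length and the clipped update for flagged points are all assumptions of this sketch.

```python
import random


def ewma_screen(counts, slots_per_week, alpha=0.2, k=3.0, warmup=3):
    """Flag counts far from the running mean of their time-of-week slot.

    Returns a list of (index, suggested_replacement) pairs.
    """
    mean = [None] * slots_per_week
    var = [0.0] * slots_per_week
    seen = [0] * slots_per_week
    flags = []
    for t, x in enumerate(counts):
        s = t % slots_per_week
        if mean[s] is None:          # first observation for this slot
            mean[s], seen[s] = x, 1
            continue
        dev = x - mean[s]
        sd = var[s] ** 0.5
        if seen[s] >= warmup and abs(dev) > k * sd:
            flags.append((t, mean[s]))  # suggest the current slot mean
            # Assumption: limit the update so one bad value cannot dominate.
            dev = max(-k * sd, min(k * sd, dev))
        # Standard exponentially weighted mean and variance updates.
        mean[s] += alpha * dev
        var[s] = (1 - alpha) * (var[s] + alpha * dev * dev)
        seen[s] += 1
    return flags


# Synthetic daily counts (7 slots per week) with one simulated fault.
random.seed(2)
weekly = [100, 120, 125, 123, 130, 90, 70]
counts = [weekly[t % 7] + random.gauss(0, 3) for t in range(140)]
counts[70] += 60
flags = ewma_screen(counts, slots_per_week=7)
flagged_times = [t for t, _ in flags]
replacement = dict(flags)[70]
```

For hourly data `slots_per_week` would be 168. Because the variance estimate is built from very few values per slot early on, a scheme like this can also flag ordinary observations, so the output is best treated as a list of candidates for inspection.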

    An Influence Method for Outliers Detection Applied to Time Series Traffic Data

    The applicability of an outlier detection statistic developed for standard time series is assessed in estimating missing values and detecting outliers in traffic count data. The work of Chernick, Downing and Pike (1982) is extended to form a quantitative outlier detection statistic for use with time series data. The statistic is formed from the squared elements of the Influence Function Matrix, where each element of the matrix gives the influence on ρ_k of a pair of observations at time lag k. Approximate first four moments for the statistic are derived and, by fitting Johnson curves to those theoretical moments, critical points are also produced. The statistic is also used to form the basis of an adjustment procedure to treat outliers or estimate missing values in the time series. Chernick et al.'s (1982) nuclear power data and the Department of Transport's traffic count data are used for practical illustration.
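A rough sketch of an influence function matrix and a score built from its squared elements is given below. The abstract does not state the formulas, so two assumptions are made: each element uses the standard influence function for a correlation coefficient, z_t z_{t+k} - ρ_k (z_t² + z_{t+k}²)/2 on standardised values, and the score for an observation is the sum of the squared elements it takes part in. The paper's actual statistic, its moments and its critical points are not reproduced here.

```python
import random
import statistics


def influence_matrix(series, max_lag):
    """Row k-1 holds the influence on rho_k of each pair (t, t + k)."""
    n = len(series)
    mu = statistics.mean(series)
    sd = statistics.pstdev(series)
    z = [(x - mu) / sd for x in series]  # standardised observations
    rows = []
    for k in range(1, max_lag + 1):
        rho_k = sum(z[t] * z[t + k] for t in range(n - k)) / n
        rows.append([
            z[t] * z[t + k] - rho_k * (z[t] ** 2 + z[t + k] ** 2) / 2
            for t in range(n - k)
        ])
    return rows


def squared_influence_scores(series, max_lag):
    """Assumed score: sum of squared elements involving each observation."""
    scores = [0.0] * len(series)
    for k, row in enumerate(influence_matrix(series, max_lag), start=1):
        for t, value in enumerate(row):
            scores[t] += value * value
            scores[t + k] += value * value
    return scores


# White-noise series with one gross error at index 50.
random.seed(3)
series = [random.gauss(0, 1) for _ in range(100)]
series[50] += 15
scores = squared_influence_scores(series, max_lag=5)
most_influential = max(range(len(scores)), key=scores.__getitem__)
```

The outlying point appears in every lag's row twice (as the earlier and the later member of a pair), which is why its accumulated score stands out from that of its neighbours.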

    Development of an Influence Statistic for Outlier Detection With Time Series Traffic Data.

    As part of a SERC funded project investigating the detection and treatment of outlying time series transport data, the practical applicability of the Influence Statistic described by Watson et al. (1991) is assessed here. Missing or outlying data occur in a variety of transport time series such as traffic counts or journey times for many reasons including broken machinery and recording errors. In practice such data are patched largely by subjective opinion or using simple aggregate methods. In the analysis of non-transport time series several methods have been recently developed to both detect and treat outliers, including work by Kohn and Ansley (1986), Hau and Tong (1984) and Bruce and Martin (1989). These methods use either an intervention modelling approach (where the outlier is modelled as part of an ARIMA structure) or look at the influence an observation exerts on a particular parameter associated with the model. An alternative is the Influence Statistic proposed by Watson (1987) and Watson et al. (1992) which examines the influence of an observation on the sample autocorrelation function. Initial research showed the statistic has practical application in a transport context, and a replacement procedure based on the method was found to be effective in treating maverick data. Here we report the results from a wider application of the statistic using traffic count data from the Department of Transport. Further developments are suggested and investigated for the replacement procedure and a comparison is made between possible variations in the method.

    Modelling Outliers and Missing Values in traffic Count Data Using the ARIMA Model.

    This paper considers the application of the methodology to traffic count time series in which both missing values and outliers are present. Intervention analysis and detection using large residuals are shown to be reasonably effective but possible problems that result from non-stationarity in the data are identified. It is shown that despite considerable variability in the types of series the model selected from the ARIMA family is surprisingly homogeneous.

    Broadening the Scope of Nanopublications

    In this paper, we present an approach for extending the existing concept of nanopublications (tiny entities of scientific results in RDF representation) to broaden their application range. The proposed extension uses English sentences to represent informal and underspecified scientific claims. These sentences follow a syntactic and semantic scheme that we call AIDA (Atomic, Independent, Declarative, Absolute), which provides a uniform and succinct representation of scientific assertions. Such AIDA nanopublications are compatible with the existing nanopublication concept and enjoy most of its advantages such as information sharing, interlinking of scientific findings, and detailed attribution, while being more flexible and applicable to a much wider range of scientific results. We show that users are able to create AIDA sentences for given scientific results quickly and at high quality, and that it is feasible to automatically extract and interlink AIDA nanopublications from existing unstructured data sources. To demonstrate our approach, a web-based interface is introduced, which also exemplifies the use of nanopublications for non-scientific content, including meta-nanopublications that describe other nanopublications. Comment: To appear in the Proceedings of the 10th Extended Semantic Web Conference (ESWC 2013).

    Intra- and interspecies interactions between prion proteins and effects of mutations and polymorphisms

    Recently, crystallization of the prion protein in a dimeric form was reported. Here we show that native soluble homogenous FLAG-tagged prion proteins from hamster, man and cattle expressed in the baculovirus system are predominantly dimeric. The PrP/PrP interaction was confirmed in Semliki Forest virus-RNA transfected BHK cells co-expressing FLAG- and oligohistidine-tagged human PrP. The yeast two-hybrid system identified the octarepeat region and the C-terminal structured domain (aa90-aa230) of PrP as PrP/PrP interaction domains. Additional octarepeats identified in patients suffering from fCJD reduced (wtPrP versus PrP+90R) and completely abolished (PrP+90R versus PrP+90R) the PrP/PrP interaction in the yeast two-hybrid system. In contrast, the Met/Val polymorphism (aa129), the GSS mutation Pro102Leu and the FFI mutation Asp178Asn did not affect PrP/PrP interactions. Proof of interactions between human or sheep and bovine PrP, and sheep and human PrP, as well as lack of interactions between human or bovine PrP and hamster PrP suggest that interspecies PrP interaction studies in the yeast two-hybrid system may serve as a rapid pre-assay to investigate species barriers in prion diseases

    Reduced anisotropy of water diffusion in structural cerebral abnormalities demonstrated with diffusion tensor imaging

    We used diffusion tensor imaging (DTI) to investigate the behavior of water diffusion in cerebral structural abnormalities. The fractional anisotropy, a measure of directionality of the molecular motion of water, and the mean diffusivity, a measure of the magnitude of the molecular motion of water, were measured in 18 patients with longstanding partial epilepsy and structural abnormalities on standard magnetic resonance imaging and the results compared with measurements in the white matter of 10 control subjects. Structural abnormalities were brain damage (postsurgical brain damage, nonspecific brain damage, perinatal brain damage, perinatal infarct, ischemic infarct, perinatal hypoxia, traumatic brain damage (n = 3), mitochondrial cytopathy and mesiotemporal sclerosis), dysgenesis (cortical dysplasia (n = 2) and heterotopia) and tumors (meningioma (n = 2), hypothalamic hamartoma and glioma). Anisotropy was reduced in all structural abnormalities. In the majority of abnormalities this was associated with an increased mean diffusivity; however, 30% of all structural abnormalities (some patients with brain damage and dysgenesis) had a normal mean diffusivity in combination with a reduced anisotropy. There was no correlation between fractional anisotropy and mean diffusivity measurements in structural abnormalities (r = -0.1). Our findings suggest that DTI is sensitive for the detection of a variety of structural abnormalities, that a reduced anisotropy is the common denominator in structural cerebral abnormalities of different etiologies and that mean diffusivity and fractional anisotropy may be, in part, independent. Combined measurements of mean diffusivity and fractional anisotropy are likely to increase the specificity of DTI
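The two measures named in this abstract have standard definitions in terms of the three eigenvalues of the diffusion tensor, which the abstract itself does not spell out. The sketch below uses those standard definitions. The function names and the eigenvalue triples are illustrative only and are not data from the study.

```python
import math


def mean_diffusivity(l1, l2, l3):
    """Average of the three diffusion tensor eigenvalues."""
    return (l1 + l2 + l3) / 3.0


def fractional_anisotropy(l1, l2, l3):
    """Normalised spread of the eigenvalues: 0 is isotropic, 1 is one-directional."""
    md = mean_diffusivity(l1, l2, l3)
    spread = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    magnitude = l1 ** 2 + l2 ** 2 + l3 ** 2
    if magnitude == 0:
        return 0.0
    return math.sqrt(1.5 * spread / magnitude)


# Illustrative eigenvalue triples (arbitrary units).
fa_isotropic = fractional_anisotropy(1.0, 1.0, 1.0)
fa_linear = fractional_anisotropy(1.0, 0.0, 0.0)
md_example = mean_diffusivity(1.5, 0.5, 0.4)
fa_example = fractional_anisotropy(1.5, 0.5, 0.4)
```

Because mean diffusivity depends only on the sum of the eigenvalues and fractional anisotropy only on their relative spread, one can change while the other stays fixed, which is consistent with the abstract's suggestion that the two measures may be partly independent.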