A Bayesian spatio-temporal model of panel design data: airborne particle number concentration in Brisbane, Australia
This paper outlines a methodology for semi-parametric spatio-temporal
modelling of data which is dense in time but sparse in space, obtained from a
split panel design, the most feasible approach to covering space and time with
limited equipment. The data are hourly averaged particle number concentration
(PNC) and were collected as part of the Ultrafine Particles from Transport
Emissions and Child Health (UPTECH) project. Two weeks of continuous
measurements were taken at each of a number of government primary schools in
the Brisbane Metropolitan Area. The monitoring equipment was taken to each
school sequentially. The school data are augmented by data from long term
monitoring stations at three locations in Brisbane, Australia.
Fitting the model helps describe the spatial and temporal variability at a
subset of the UPTECH schools and the long-term monitoring sites. The temporal
variation is modelled hierarchically with penalised random walk terms, one
common to all sites and a term accounting for the remaining temporal trend at
each site. Parameter estimates and their uncertainty are computed in a
computationally efficient approximate Bayesian inference environment, R-INLA.
The temporal part of the model explains daily and weekly cycles in PNC at the
schools, which can be used to estimate the exposure of school children to
ultrafine particles (UFPs) emitted by vehicles. At each school and long-term
monitoring site, peaks in PNC can be attributed to morning and afternoon
rush-hour traffic and to new particle formation events. The spatial component of
the model describes the variation in mean PNC from school to school and within
each school ground. It is shown how the spatial model can be
expanded to identify spatial patterns at the city scale with the inclusion of
more spatial locations.
Comment: Draft of this paper presented as a poster at ISBA 2012; part of the
UPTECH project.
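The penalised random-walk idea behind the temporal terms can be sketched as a penalised least-squares smoother: the fitted trend x minimises ||y - x||^2 + lam * ||Dx||^2, where D takes first differences (an RW1-type penalty). The toy two-week hourly series and the penalty value below are illustrative, not the UPTECH data or the R-INLA fit itself.

```python
import numpy as np

def rw1_smooth(y, lam):
    """Penalised-least-squares fit with a first-difference (RW1-type) penalty."""
    n = len(y)
    # First-difference matrix D: row i is e_{i+1} - e_i, shape (n-1, n)
    D = np.diff(np.eye(n), axis=0)
    # Normal equations: (I + lam * D'D) x = y
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)

# Toy hourly PNC-like series: a daily cycle plus noise, over a two-week panel
rng = np.random.default_rng(0)
hours = np.arange(24 * 14)
y = 10 + 3 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 1, hours.size)
trend = rw1_smooth(y, lam=5.0)   # smoother than y, but preserving the mean level
```

Because the penalty matrix D'D annihilates constants, the smoother reproduces the mean of the series exactly while shrinking hour-to-hour roughness, which is the behaviour the hierarchical temporal terms rely on.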
Two Approaches to Imputation and Adjustment of Air Quality Data from a Composite Monitoring Network
An analysis of air quality data is provided for the municipal area of Taranto, which is characterized by high environmental risk due to the massive presence of industrial sites with high-environmental-impact activities. The present study focuses on particulate matter as measured by PM10 concentrations. Preliminary analysis involved addressing several data problems, mainly: (i) imputation techniques were considered to cope with the large number of missing data, due both to different working periods for groups of monitoring stations and to occasional malfunction of PM10 sensors; (ii) because a different validation technique is used for each of the three monitoring networks, a calibration procedure was devised to allow for data comparability. Missing-data imputation and calibration were addressed by three alternative procedures sharing a leave-one-out mechanism and based on ad hoc exploratory tools and on recursive Bayesian estimation and prediction of spatial linear mixed-effects models. The three procedures are introduced by motivating issues and compared in terms of performance.
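The shared leave-one-out mechanism can be sketched as follows: each station is held out in turn and its PM10 series is predicted from the remaining stations. Inverse-distance weighting is used here as a simple stand-in for the paper's spatial mixed-effects predictors; the coordinates and concentrations are toy values.

```python
import numpy as np

def loo_idw(coords, series, power=2.0):
    """Leave-one-out inverse-distance-weighted predictions, one row per station."""
    coords = np.asarray(coords, float)
    series = np.asarray(series, float)   # shape (n_stations, n_times)
    n = len(coords)
    preds = np.empty_like(series)
    for i in range(n):
        others = np.delete(np.arange(n), i)
        d = np.linalg.norm(coords[others] - coords[i], axis=1)
        w = 1.0 / d ** power
        # Weighted average of the other stations' series at every time point
        preds[i] = (w[:, None] * series[others]).sum(axis=0) / w.sum()
    return preds

coords = [(0, 0), (1, 0), (0, 1), (2, 2)]                  # toy station locations
series = np.array([[30, 35], [32, 36], [31, 34], [40, 45]], float)  # toy PM10
preds = loo_idw(coords, series)
rmse = float(np.sqrt(np.mean((preds - series) ** 2)))      # leave-one-out skill
```

Comparing `preds` against the held-out observations (here via RMSE) is the performance comparison the three procedures share.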
Idealized computational models for auditory receptive fields
This paper presents a theory by which idealized models of auditory receptive
fields can be derived in a principled axiomatic manner, from a set of
structural properties to enable invariance of receptive field responses under
natural sound transformations and ensure internal consistency between
spectro-temporal receptive fields at different temporal and spectral scales.
For defining a time-frequency transformation of a purely temporal sound
signal, it is shown that the framework allows for a new way of deriving the
Gabor and Gammatone filters as well as a novel family of generalized Gammatone
filters, with additional degrees of freedom to obtain different trade-offs
between the spectral selectivity and the temporal delay of time-causal temporal
window functions.
When applied to the definition of a second layer of receptive fields from a
spectrogram, it is shown that the framework leads to two canonical families of
spectro-temporal receptive fields, in terms of spectro-temporal derivatives of
either spectro-temporal Gaussian kernels for non-causal time or the combination
of a time-causal generalized Gammatone filter over the temporal domain and a
Gaussian filter over the log-spectral domain. For each filter family, the
spectro-temporal receptive fields can be either separable over the
time-frequency domain or be adapted to local glissando transformations that
represent variations in logarithmic frequencies over time. Within each domain
of either non-causal or time-causal time, these receptive field families are
derived by uniqueness from the assumptions.
It is demonstrated how the presented framework allows for computation of
basic auditory features for audio processing and that it leads to predictions
about auditory receptive fields with good qualitative similarity to biological
receptive fields measured in the inferior colliculus (ICC) and primary auditory
cortex (A1) of mammals.
Comment: 55 pages, 22 figures, 3 tables.
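The Gammatone family referred to above has the standard impulse response g(t) = t^(n-1) e^(-2*pi*b*t) cos(2*pi*f*t). A minimal sketch with illustrative order, bandwidth, and centre frequency (not parameters from the paper):

```python
import numpy as np

def gammatone(t, n=4, b=125.0, f=1000.0):
    """Impulse response of an order-n Gammatone filter (t in seconds, t >= 0)."""
    t = np.asarray(t, float)
    return t ** (n - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * f * t)

fs = 16000                         # sample rate (Hz)
t = np.arange(0, 0.05, 1.0 / fs)   # 50 ms of support
g = gammatone(t)
# The envelope t^(n-1) exp(-2 pi b t) is time-causal and peaks at
# t = (n-1)/(2 pi b), i.e. higher order n trades temporal delay for
# spectral selectivity, the trade-off discussed in the abstract.
t_peak = (4 - 1) / (2 * np.pi * 125.0)
```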
Data-driven model development in environmental geography - Methodological advancements and scientific applications
The acquisition of spatially continuous data and of spatio-temporal dynamics is a core research focus of environmental geography. To this end, modelling methods are required that make it possible to derive spatio-temporal statements from limited field data. The complexity of environmental systems in turn demands modelling strategies that can accommodate arbitrary relationships among a large number of potential predictors. This requirement calls for a paradigm shift from parametric towards non-parametric, data-driven model development, a shift further reinforced by the growing availability of geodata.
In this context, machine learning methods have proven to be an important tool for capturing patterns in non-linear and complex systems. The growing popularity of machine learning in scientific journals and the development of convenient software packages increasingly create the false impression that these methods are easy to apply. In reality, their complexity can be controlled in detail only through comprehensive methodological expertise.
This problem applies in particular to geodata, which exhibit special characteristics, above all spatial dependence, that set them apart from "ordinary" data, yet which have so far been largely ignored in machine learning applications.
The present thesis addresses the potential and the sensitivity of machine learning in environmental geography. To this end, a series of machine learning applications covering a broad spectrum of environmental geography was published. The individual contributions are united by the overarching hypothesis that data-driven modelling strategies yield an information gain and robust spatio-temporal results only when the characteristics of geographic data are taken into account. Beyond this overarching methodological focus, each application aims to deliver new insights in its respective field of research through adequately applied methods.
In the course of this work, a variety of relevant environmental-monitoring products were developed. The results make clear that both strong domain expertise and strong methodological expertise are indispensable for advancing the field of data-driven environmental geography. The thesis demonstrates for the first time the relevance of spatial overfitting in geographic machine learning applications and sets out its consequences for model results. To counter this problem, a new model-development method adapted to geodata is introduced, with which markedly improved results can be achieved.
Finally, this thesis should be read as an appeal to think beyond standard applications of machine learning, since it proves that applying standard procedures to geodata leads to severe overfitting and to misinterpretation of the results. Only when the characteristics of geographic data are taken into account does machine learning offer a powerful tool for delivering scientifically reliable results for environmental geography.
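The model-development strategy argued for above, leave-location-out ("spatial") cross-validation, can be sketched in a few lines: folds are formed from whole measurement locations, so a model is never tested on samples from a location it was trained on. The location grouping below is illustrative.

```python
import numpy as np

def spatial_cv_splits(location_ids):
    """Yield (train_idx, test_idx) pairs, holding out one whole location at a time."""
    location_ids = np.asarray(location_ids)
    for loc in np.unique(location_ids):
        test = np.where(location_ids == loc)[0]   # every sample from this location
        train = np.where(location_ids != loc)[0]  # all samples from other locations
        yield train, test

# Eight samples taken at three measurement locations (toy grouping)
locs = np.array([0, 0, 0, 1, 1, 2, 2, 2])
splits = list(spatial_cv_splits(locs))
# Three folds, one per location; no sample index appears in both halves of a fold.
```

Random (sample-wise) splits would place spatially dependent neighbours in both train and test sets, which is exactly the spatial-overfitting mechanism the thesis identifies.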
Bayesian model averaging over tree-based dependence structures for multivariate extremes
Describing the complex dependence structure of extreme phenomena is
particularly challenging. To tackle this issue we develop a novel statistical
algorithm that describes extremal dependence taking advantage of the inherent
hierarchical dependence structure of the max-stable nested logistic
distribution and that identifies possible clusters of extreme variables using
reversible jump Markov chain Monte Carlo techniques. Parsimonious
representations are achieved when clusters of extreme variables are found to be
completely independent. Moreover, we significantly decrease the computational
complexity of full likelihood inference by deriving a recursive formula for the
nested logistic model likelihood. The algorithm performance is verified through
extensive simulation experiments which also compare different likelihood
procedures. The new methodology is used to investigate the dependence
relationships between extreme concentration of multiple pollutants in
California and how these pollutants are related to extreme weather conditions.
Overall, we show that our approach allows for the representation of complex
extremal dependence structures and has valid applications in multivariate data
analysis, such as air pollution monitoring, where it can guide policymaking.
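The nested logistic distribution at the heart of the algorithm has, with unit Frechet margins, clusters C_k, within-cluster dependence parameters alpha_k and a between-cluster parameter alpha0 (0 < alpha_k <= alpha0 <= 1), the form G(z) = exp(-V(z)) with V(z) = [ sum_k ( sum_{j in C_k} z_j^(-1/alpha_k) )^(alpha_k/alpha0) ]^alpha0. A minimal sketch with an illustrative two-cluster layout:

```python
import math

def nested_logistic_cdf(z, clusters, alphas, alpha0):
    """G(z) for the nested logistic model; clusters is a list of index lists."""
    V = 0.0
    for C_k, a_k in zip(clusters, alphas):
        inner = sum(z[j] ** (-1.0 / a_k) for j in C_k)   # within-cluster term
        V += inner ** (a_k / alpha0)                     # nested between clusters
    return math.exp(-(V ** alpha0))

# Two clusters of pollutants: strong dependence within, weaker between
z = [2.0, 2.0, 3.0, 3.0]
p = nested_logistic_cdf(z, clusters=[[0, 1], [2, 3]], alphas=[0.3, 0.3], alpha0=0.9)
```

With alpha_k = alpha0 = 1 the model reduces to independence, and each univariate margin is unit Frechet (G(z_j) = exp(-1/z_j)), which is what makes the "completely independent clusters" case of the abstract a parsimonious special case.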
Estimating daily nitrogen dioxide level: Exploring traffic effects
Data used to assess acute health effects from air pollution typically have
good temporal but poor spatial resolution or the opposite. A modified
longitudinal model was developed that sought to improve resolution in both
domains by bringing together data from three sources to estimate daily levels
of nitrogen dioxide (NO2) at a geographic location. Monthly NO2 measurements at
316 sites were made available by the Study of Traffic, Air quality and
Respiratory health (STAR). Four US Environmental Protection Agency monitoring
stations have hourly measurements of NO2. Finally, the Connecticut Department of
Transportation provides data on traffic density on major roadways, a primary
contributor to NO2
pollution. Inclusion of a traffic variable improved performance of the model,
and it provides a method for estimating exposure at points that do not have
direct measurements of the outcome. This approach can be used to estimate daily
variation in levels of NO2 over a region.
Comment: Published at http://dx.doi.org/10.1214/13-AOAS642 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
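The idea of the modified longitudinal model can be sketched as a regression of daily NO2 at a location on a shared regional signal plus a local traffic-density covariate, so that locations without direct NO2 measurements can still be predicted. All data and coefficients below are simulated toy values, not the STAR or EPA data.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
regional = rng.normal(25, 5, n)    # daily regional NO2 signal, e.g. from fixed monitors (ppb)
traffic = rng.uniform(0, 1, n)     # scaled local traffic density
# Simulated truth: local NO2 tracks the regional signal plus a traffic contribution
no2 = 0.8 * regional + 10 * traffic + rng.normal(0, 1, n)

# Ordinary least squares: intercept, regional coefficient, traffic coefficient
X = np.column_stack([np.ones(n), regional, traffic])
beta, *_ = np.linalg.lstsq(X, no2, rcond=None)

# Prediction at an unmonitored point needs only the regional signal and traffic
def predict(regional_value, traffic_value):
    return beta[0] + beta[1] * regional_value + beta[2] * traffic_value
```

The recovered traffic coefficient is what "inclusion of a traffic variable improved performance" refers to: it carries the spatial information that the sparse monitors cannot.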
Quality of Information in Mobile Crowdsensing: Survey and Research Challenges
Smartphones have become the most pervasive devices in people's lives, and are
clearly transforming the way we live and perceive technology. Today's
smartphones benefit from almost ubiquitous Internet connectivity and come
equipped with a plethora of inexpensive yet powerful embedded sensors, such as
accelerometer, gyroscope, microphone, and camera. This unique combination has
enabled revolutionary applications based on the mobile crowdsensing paradigm,
such as real-time road traffic monitoring, air and noise pollution mapping,
crime control, and wildlife monitoring, to name just a few. Differently from prior
sensing paradigms, humans are now the primary actors of the sensing process,
since they become fundamental in retrieving reliable and up-to-date information
about the event being monitored. As humans may behave unreliably or
maliciously, assessing and guaranteeing Quality of Information (QoI) becomes
more important than ever. In this paper, we provide a new framework for
defining and enforcing the QoI in mobile crowdsensing, and analyze in depth the
current state-of-the-art on the topic. We also outline novel research
challenges, along with possible directions of future work.
Comment: To appear in ACM Transactions on Sensor Networks (TOSN).
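One generic way to operationalise QoI, not the paper's framework but a common truth-discovery scheme, is to iterate between estimating each event's true value as a reliability-weighted average of user reports and re-scoring each user's reliability from their distance to the current estimates:

```python
def truth_discovery(reports, iters=10):
    """reports[user][event] -> reported value; returns (estimates, user weights)."""
    users = list(reports)
    events = list(next(iter(reports.values())))
    w = {u: 1.0 for u in users}          # start with uniform reliability
    est = {}
    for _ in range(iters):
        # 1) Estimate each event as a reliability-weighted average of reports
        for e in events:
            total = sum(w[u] for u in users)
            est[e] = sum(w[u] * reports[u][e] for u in users) / total
        # 2) Re-score each user: small squared error -> high reliability
        for u in users:
            err = sum((reports[u][e] - est[e]) ** 2 for e in events)
            w[u] = 1.0 / (err + 1e-9)
    return est, w

# Toy crowdsensing reports: two honest users, one faulty sensor
reports = {
    "honest_a": {"noise_db": 60.0, "pm": 35.0},
    "honest_b": {"noise_db": 61.0, "pm": 34.0},
    "faulty":   {"noise_db": 90.0, "pm": 80.0},
}
est, w = truth_discovery(reports)
# The faulty user's weight collapses, pulling estimates toward the honest reports.
```

This captures the core QoI intuition from the abstract: because humans may behave unreliably or maliciously, per-user reliability must be inferred from the data itself rather than assumed.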