Multivariate emulation of computer simulators: model selection and diagnostics with application to a humanitarian relief model
We present a common framework for Bayesian emulation methodologies for multivariate-output simulators, or computer models, that employ either parametric linear models or nonparametric Gaussian processes. Novel diagnostics suitable for multivariate covariance-separable emulators are developed and techniques to improve the adequacy of an emulator are discussed and implemented. A variety of emulators are compared for a humanitarian relief simulator, modelling aid missions to Sicily after a volcanic eruption and earthquake, and a sensitivity analysis is conducted to determine the sensitivity of the simulator output to changes in the input variables. The results from parametric and nonparametric emulators are compared in terms of prediction accuracy, uncertainty quantification and scientific interpretability.
Effect of Suburban Transit Oriented Developments on Residential Property Values, MTI Report 08-07
The development of successful TODs often encounters several barriers. These barriers include a lack of inter-jurisdictional cooperation, auto-oriented design that favors park-and-ride lots over ridership-generating uses, and community opposition. The community opposition may be more vocal in suburban areas, where residents of predominantly single-family neighborhoods may feel that the proposed high-density, mixed-use TOD will bring noise, air pollution, increased congestion and crime into their area. Community opposition has been instrumental in stopping many TOD projects in the San Francisco Bay Area. While community opposition to TODs has been pronounced, very little empirical research exists that indicates whether this opposition is well-founded. Economic theory suggests that if a TOD has a negative effect on the surrounding residential neighborhoods, then that effect should lower land prices and, in turn, housing prices in these neighborhoods. Similarly, an increase in housing prices would mean a positive effect of TODs on the surrounding neighborhoods. This study empirically estimates the impact of four San Francisco Bay Area suburban TODs on single-family home sale prices. The study finds that the case study suburban TODs either had no impact or had a positive impact on the surrounding single-family home sale prices.
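The economic argument here is typically tested with a hedonic price regression: regress (log) sale price on house characteristics plus distance to the TOD, and read the sign of the distance coefficient. The sketch below uses entirely synthetic data and assumed coefficients, not the study's dataset or results.

```python
# Sketch of a hedonic price regression of the kind used to test whether
# TOD proximity capitalizes into home prices. Data and effect sizes are
# synthetic assumptions for illustration.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 500
sqft = rng.uniform(800, 3000, n)                # house size
dist_to_tod_km = rng.uniform(0.1, 5.0, n)       # distance to the TOD

log_price = (11.0
             + 0.0004 * sqft                    # assumed size premium
             - 0.02 * dist_to_tod_km            # assumed proximity premium
             + rng.normal(0, 0.05, n))          # idiosyncratic noise

X = np.column_stack([sqft, dist_to_tod_km])
model = LinearRegression().fit(X, log_price)

# A negative distance coefficient means prices fall with distance from the
# TOD, i.e. a positive effect of the TOD on nearby home values.
print(model.coef_)
```

A real hedonic study would add many more controls (lot size, age, school quality, neighborhood fixed effects); the point here is only the interpretation of the distance coefficient.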
The very same thing: Extending the object token concept to incorporate causal constraints on individual identity
The contributions of feature recognition, object categorization, and recollection of episodic memories to the re-identification of a perceived object as the very same thing encountered in a previous perceptual episode are well understood in terms of both cognitive-behavioral phenomenology and neurofunctional implementation. Human beings do not, however, rely solely on features and context to re-identify individuals; in the presence of featural change and similarly-featured distractors, people routinely employ causal constraints to establish object identities. Based on available cognitive and neurofunctional data, the standard object-token based model of individual re-identification is extended to incorporate the construction of unobserved and hence fictive causal histories (FCHs) of observed objects by the pre-motor action planning system. Cognitive-behavioral and implementation-level predictions of this extended model and methods for testing them are outlined. It is suggested that functional deficits in the construction of FCHs are associated with clinical outcomes in both Autism Spectrum Disorders and later-stage Alzheimer's disease.
PRESISTANT: Learning based assistant for data pre-processing
Data pre-processing is one of the most time-consuming and relevant steps in a
data analysis process (e.g., classification task). A given data pre-processing
operator (e.g., transformation) can have positive, negative or zero impact on
the final result of the analysis. Expert users have the required knowledge to
find the right pre-processing operators. Non-experts, however, are
overwhelmed by the number of pre-processing operators, and it is challenging
for them to find operators that would positively impact their analysis
(e.g., increase the predictive accuracy of a classifier). Existing
solutions either assume that users have expert knowledge, or they recommend
pre-processing operators that are only "syntactically" applicable to a dataset,
without taking into account their impact on the final analysis. In this work,
we aim at providing assistance to non-expert users by recommending data
pre-processing operators that are ranked according to their impact on the final
analysis. We developed a tool, PRESISTANT, that uses Random Forests to learn
the impact of pre-processing operators on the performance (e.g., predictive
accuracy) of five classification algorithms: J48, Naive Bayes, PART, Logistic
Regression, and Nearest Neighbor. Extensive evaluations of the recommendations
provided by our tool show that PRESISTANT can effectively help non-experts
achieve improved results in their analytical tasks.
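The core ranking idea can be illustrated directly: score each candidate pre-processing operator by its impact on a classifier's cross-validated accuracy and recommend the highest-scoring ones. Note the hedge: PRESISTANT *predicts* this impact with a Random Forest meta-model trained offline, whereas the sketch below simply measures it; the datasets and operators here are standard scikit-learn stand-ins.

```python
# Sketch of impact-based ranking of pre-processing operators
# (PRESISTANT predicts these impacts with a learned meta-model rather
# than measuring them directly as done here).
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler, MinMaxScaler, PowerTransformer
from sklearn.neighbors import KNeighborsClassifier

X, y = load_wine(return_X_y=True)
baseline = cross_val_score(KNeighborsClassifier(), X, y, cv=5).mean()

operators = {
    "standardize": StandardScaler(),
    "minmax": MinMaxScaler(),
    "power": PowerTransformer(),
}

# Impact = change in cross-validated accuracy relative to no pre-processing
impact = {}
for name, op in operators.items():
    pipe = make_pipeline(op, KNeighborsClassifier())
    impact[name] = cross_val_score(pipe, X, y, cv=5).mean() - baseline

# Recommend operators ranked by estimated impact
for name, gain in sorted(impact.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {gain:+.3f}")
```

On a distance-based classifier like Nearest Neighbor, scaling operators tend to rank highly, which is exactly the kind of classifier-dependent effect the tool's learned rankings capture.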
Predicting and simulating future land use pattern: a case study of Seremban district
Rapid urbanization, driven by natural population growth, by rural-urban migration under social and economic push and pull factors, and by the movement of urban populations from major city centres to fringe areas in search of a more spacious, comfortable and environmentally friendly living environment, means that towns and cities will continue to grow and expand to accommodate the growing and complex demands of their people. Experience has shown that rapid and uncontrolled expansion of towns and cities leads to, among other things, deterioration in the quality of the urban environment, sprawl of urban development onto prime agricultural and forest areas, and cities starting to lose their identity. To prevent such phenomena, particularly in the Kuala Lumpur Conurbation Area, towns and cities need to be properly planned and managed so that their growth or expansion can be controlled in a sustainable manner. One strategy adopted to curb sprawling development is the delineation of urban growth or development limits (UGL): the limits of towns and cities are studied and identified so that urban development can be directed to areas specified as suitable for such development. The delineation of UGL has accordingly been included as an important task in the preparation of development plans. In line with this policy, a research study is being carried out to develop a spatial modelling framework for delineating UGL through the application and integration of spatial technologies; this framework is intended to serve as a basis for land use planners, managers and policy makers to formulate urban land use policies and monitor urban land use development.
One of the main analyses involved in this task is to understand past urban land development trends and to predict and identify future urban growth areas of the selected study area. This paper highlights the integration of a statistical modelling technique, binary logistic regression, with GIS technology to understand and predict urban growth patterns and areas, applied to the District of Seremban, Negeri Sembilan. The results show that urban land use patterns in the study area over the study period are significantly related to more than half of the predictors used in the analysis.
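The statistical core of this approach is a binary logistic regression: each land cell is coded urban/non-urban and regressed on spatial predictors extracted from GIS layers. The sketch below uses synthetic cells and assumed predictors (distance to road, distance to town, slope) as stand-ins for the Seremban study's actual GIS data.

```python
# Sketch of binary logistic regression for urban-growth prediction.
# Predictors and conversion probabilities are synthetic assumptions,
# not the Seremban study's GIS layers or estimates.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 2000
dist_road = rng.uniform(0, 10, n)   # km to nearest road (assumed layer)
dist_town = rng.uniform(0, 20, n)   # km to existing urban area (assumed)
slope = rng.uniform(0, 30, n)       # terrain slope in degrees (assumed)

# Assumed ground truth: conversion is likelier near roads and towns,
# and on flatter land.
logit = 2.0 - 0.4 * dist_road - 0.15 * dist_town - 0.05 * slope
p = 1 / (1 + np.exp(-logit))
urban = rng.binomial(1, p)          # 1 = cell converted to urban use

X = np.column_stack([dist_road, dist_town, slope])
model = LogisticRegression(max_iter=1000).fit(X, urban)

# Negative coefficients: conversion probability falls with distance/slope.
print(model.coef_)
```

In the GIS workflow, the fitted probabilities are then mapped back onto the raster to highlight likely future growth areas, which is what feeds the UGL delineation.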
Scalable Population Synthesis with Deep Generative Modeling
Population synthesis is concerned with the generation of synthetic yet
realistic representations of populations. It is a fundamental problem in the
modeling of transport where the synthetic populations of micro-agents represent
a key input to most agent-based models. In this paper, a new methodological
framework for how to 'grow' pools of micro-agents is presented. The model
framework adopts a deep generative modeling approach from machine learning
based on a Variational Autoencoder (VAE). Compared to previous population
synthesis approaches, including Iterative Proportional Fitting (IPF), Gibbs
sampling and traditional generative models such as Bayesian Networks or Hidden
Markov Models, the proposed method allows fitting the full joint distribution
for high dimensions. The proposed methodology is compared with a conventional
Gibbs sampler and a Bayesian Network by using a large-scale Danish trip diary.
It is shown that, while these two methods outperform the VAE in the
low-dimensional case, they both suffer from scalability issues when the number
of modeled attributes increases. It is also shown that the Gibbs sampler
essentially replicates the agents from the original sample when the required
conditional distributions are estimated as frequency tables. In contrast, the
VAE allows addressing the problem of sampling zeros by generating agents that
are virtually different from those in the original data but have similar
statistical properties. The presented approach can support agent-based modeling
at all levels by enabling richer synthetic populations with smaller zones and
more detailed individual characteristics.
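The abstract's point about the Gibbs sampler is easy to demonstrate: when the conditionals are empirical frequency tables, the sampler can only produce attribute combinations already present in the sample, so "sampling zeros" (combinations possible in the population but absent from the data) are never generated. The two-attribute toy example below illustrates this; a VAE is designed precisely to generate such unseen but plausible combinations.

```python
# Toy demonstration of the sampling-zeros problem for a Gibbs sampler
# with frequency-table conditionals (not the paper's Danish trip-diary
# data; attributes and values are assumptions for illustration).
import numpy as np

rng = np.random.default_rng(3)

# Observed micro-sample of (age_group, mode) pairs. The combination
# (2, 0) is assumed possible in the population but absent here.
sample = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [2, 1]])

def empirical_conditional(data, given_col, given_val, target_col):
    """Frequency-table conditional P(target | given) from the sample."""
    rows = data[data[:, given_col] == given_val]
    vals, counts = np.unique(rows[:, target_col], return_counts=True)
    return vals, counts / counts.sum()

state = sample[0].copy()
draws = set()
for _ in range(2000):
    for col in (0, 1):                      # one Gibbs sweep
        other = 1 - col
        vals, probs = empirical_conditional(sample, other, state[other], col)
        state[col] = rng.choice(vals, p=probs)
    draws.add(tuple(int(v) for v in state))

# (2, 0) never appears: the sampler only replicates observed combinations.
print(sorted(draws))
```

With many attributes this replication behavior becomes the dominant failure mode, which is the scalability issue the VAE-based synthesis addresses.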
Efficient Benchmarking of Algorithm Configuration Procedures via Model-Based Surrogates
The optimization of algorithm (hyper-)parameters is crucial for achieving
peak performance across a wide range of domains, ranging from deep neural
networks to solvers for hard combinatorial problems. The resulting algorithm
configuration (AC) problem has attracted much attention from the machine
learning community. However, the proper evaluation of new AC procedures is
hindered by two key hurdles. First, AC benchmarks are hard to set up. Second
and even more significantly, they are computationally expensive: a single run
of an AC procedure involves many costly runs of the target algorithm whose
performance is to be optimized in a given AC benchmark scenario. One common
workaround is to optimize cheap-to-evaluate artificial benchmark functions
(e.g., Branin) instead of actual algorithms; however, these have different
properties than realistic AC problems. Here, we propose an alternative
benchmarking approach that is similarly cheap to evaluate but much closer to
the original AC problem: replacing expensive benchmarks by surrogate benchmarks
constructed from AC benchmarks. These surrogate benchmarks approximate the
response surface corresponding to true target algorithm performance using a
regression model, and the original and surrogate benchmark share the same
(hyper-)parameter space. In our experiments, we construct and evaluate
surrogate benchmarks for hyperparameter optimization as well as for AC problems
that involve performance optimization of solvers for hard combinatorial
problems, drawing training data from the runs of existing AC procedures. We
show that our surrogate benchmarks capture overall important characteristics of
the AC scenarios, such as high- and low-performing regions, from which they
were derived, while being much easier to use and orders of magnitude cheaper to
evaluate.
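The surrogate-benchmark construction can be sketched concisely: collect (configuration, performance) pairs from past runs of the expensive target algorithm, fit a regression model over the same parameter space, and evaluate candidate configurations against the model instead of the algorithm. The "target algorithm" below is a cheap synthetic stand-in, and the random-forest surrogate is one reasonable model choice, not necessarily the paper's exact setup.

```python
# Sketch of a model-based surrogate benchmark for algorithm configuration.
# The expensive target algorithm is simulated by a synthetic runtime
# surface over two hyperparameters (an assumption for illustration).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(4)

def expensive_target_algorithm(cfg):
    # hypothetical performance (e.g., runtime) surface; lower is better
    return (cfg[:, 0] - 0.3) ** 2 + 2 * (cfg[:, 1] - 0.7) ** 2

# Training data, as if logged from previous AC-procedure runs
cfgs = rng.uniform(0, 1, size=(500, 2))
perf = expensive_target_algorithm(cfgs)

surrogate = RandomForestRegressor(n_estimators=100, random_state=0)
surrogate.fit(cfgs, perf)

# Cheap surrogate-benchmark evaluations over the same parameter space
new_cfgs = rng.uniform(0, 1, size=(100, 2))
approx = surrogate.predict(new_cfgs)
true = expensive_target_algorithm(new_cfgs)
print(np.mean(np.abs(approx - true)))   # surrogate approximation error
```

The key property being checked is the one the abstract emphasizes: the surrogate should preserve high- and low-performing regions of the response surface, so an AC procedure optimizing against it behaves much as it would against the real algorithm.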