MPI-Vector-IO: Parallel I/O and Partitioning for Geospatial Vector Data
In recent times, geospatial datasets are growing in terms of size, complexity and heterogeneity. High performance systems are needed to analyze such data to produce actionable insights in an efficient manner. For polygonal (a.k.a. vector) datasets, operations such as I/O, data partitioning, communication, and load balancing become challenging in a cluster environment. In this work, we present MPI-Vector-IO, a parallel I/O library that we have designed using MPI-IO specifically for partitioning and reading irregular vector data formats such as Well Known Text. It makes MPI aware of spatial data and spatial primitives, and provides support for spatial data types embedded within collective computation and communication using the MPI message-passing library. These abstractions, along with parallel I/O support, are useful for parallel Geographic Information System (GIS) application development on HPC platforms.
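The abstract does not describe how the library splits a file, so the following is only a minimal sketch of one common way to partition a line-delimited Well Known Text file across ranks: each rank takes an equal byte range and then adjusts to record boundaries, so that every record is read by exactly one rank. The function name `read_partition`, the one-record-per-line layout, and the use of plain Python file I/O in place of MPI-IO are all assumptions made for illustration; this is not MPI-Vector-IO's API.

```python
import os
import tempfile


def read_partition(path, rank, size):
    """Return the complete lines owned by `rank` out of `size` byte-range partitions.

    Hypothetical helper: a record is owned by the rank whose byte range
    contains the record's first byte.
    """
    file_size = os.path.getsize(path)
    chunk = file_size // size
    start = rank * chunk
    end = file_size if rank == size - 1 else start + chunk
    records = []
    with open(path, "rb") as f:
        if start > 0:
            # Step back one byte and discard up to the next newline, so a
            # record that began before `start` is left to the previous rank.
            f.seek(start - 1)
            f.readline()
        # Read every record that starts inside [start, end), even if it
        # extends past `end`.
        while f.tell() < end:
            line = f.readline()
            if not line:
                break
            records.append(line.decode("utf-8").rstrip("\n"))
    return records


# Small demonstration with records of irregular length.
wkt_lines = [
    "POINT (1 2)",
    "LINESTRING (0 0, 1 1, 2 2)",
    "POLYGON ((0 0, 4 0, 4 4, 0 4, 0 0))",
    "POINT (3 4)",
]
with tempfile.NamedTemporaryFile("wb", suffix=".wkt", delete=False) as tmp:
    tmp.write(("\n".join(wkt_lines) + "\n").encode("utf-8"))
demo_path = tmp.name

# Simulate three ranks sequentially; in an MPI program each rank would call
# read_partition with its own rank number.
per_rank = [read_partition(demo_path, r, 3) for r in range(3)]
```

Concatenating the per-rank lists in rank order reproduces the original sequence of records, with no record split or duplicated, even though the byte ranges cut through the middle of records.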
Measures and adjustments of pattern frequency distributions
Frequent pattern mining over large databases is fundamental to many data mining applications, where pattern frequency distribution plays a central role. Various approaches have been proposed for pattern mining with respectable computational performance. However, the appropriate evaluation of pattern frequentness and the refinement of the mining result set have been somewhat ignored. This has created a set of problems in conventional mining approaches, which are identified in this thesis. Most conventional mining approaches evaluate pattern frequentness with an ill-formed "support" measure, and generate patterns in full enumeration mode, which produces an excessive number of patterns in an application. Consequently, the mining result sets exhibit, among other issues, overfitting and underfitting, probability anomaly, and bias for generated against original observations. Even worse, these results are delivered to users without any refinement. Overcoming these drawbacks is challenging, since these problems are philosophical rather than computational, and hence their resolution demands a well-established theory to reform the mining foundations and to pursue graceful knowledge degeneration. Based on the problems identified, this thesis first proposes a reformulation of the frequentness measure, which effectively resolves the probability anomaly and other related issues. To deal with the profound full enumeration mode, we first explore a set of properties governing raw pattern frequency distributions, such that a number of important mining parameters can be predetermined. Based on these explorations, an approach to adjust the raw pattern frequency distributions is established and its theoretical merits are justified. This refinement theory shows that unconditional pattern reduction is achievable before domain constraints are imposed. The thesis then presents a maximum likelihood pattern sampling model and strategies to realize the adjustment.
Findings presented in this thesis are based on known set theory, combinatorics, and probability theory, and they are theoretically fundamental and applicable to every item-based or keyword-based pattern mining task and to the improvement of mining effectiveness. We expect that these findings will pave the way to replacing full enumeration pattern generation with a selective generation mode, which would then radically change the state of the art of pattern mining.
Eleventh European Powder Diffraction Conference. Warsaw, September 19-22, 2008
Zeitschrift für Kristallographie. Supplement Volume 30 presents the complete Proceedings of all contributions to the XI European Powder Diffraction Conference in Warsaw 2008: Method Development and Application, Instrumental, Software Development, Materials. Supplement Series of Zeitschrift für Kristallographie publishes Proceedings and Abstracts of international conferences on the interdisciplinary field of crystallography.
LANDSLIDE INVESTIGATION IN THE RÍO AGUAS CATCHMENT, SOUTHEAST SPAIN
Many remote and/or rural communities live under the threat of major landslide activity. These
remote areas are also increasingly the focus of development programmes which, without careful
consideration of the ground conditions, will be at risk. The cost of mitigating against landslide
activity can also be extremely high, possibly making prevention or avoidance the better long-term
option. It is, therefore, of prime importance to assess the type and magnitude of the
landslide activity affecting an area, particularly during the feasibility and planning stages of any
project. The most straightforward approach to any landslide investigation is the compilation of
a landslide inventory and the development of a geological and geomorphological ground model
for the area under investigation.
Previous landslide inventory-based projects (for example, in the UK, Hong Kong and Nepal) have
utilised either desk study or remotely sensed sources (aerial photographs or satellite imagery)
with only limited field mapping, have focused on wet monsoonal environments in either
mountainous and/or heavily populated areas and have usually been completed without reference
to a ground model for the area being investigated. Therefore, an investigation of the landslide
activity affecting the 425 km² Rio Aguas Catchment area has been completed. This is a remote
and rural part of southeastern Spain, which has an arid to semi-arid climate and is periodically
affected by both earthquake and flash-flood activity.
Through a combination of aerial photographic interpretation (API) and field verification and
mapping, over 300 landslides have been mapped and documented. These data have been used
to develop a landslide inventory and a conceptual geological and geomorphological ground
model of the study area, with respect to the landslide activity. The landslide inventory data
have been used to complete a statistical analysis investigating the factors that control the
distribution, style and mechanisms of the landslide activity, as well as to examine the
relationships between landslide volume, runout length and angle of reach. The basis of the
ground model is a project-derived terrain classification for the study area.
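The abstract does not state how the angle of reach was computed. As a hedged illustration, the snippet below uses the definition commonly found in the landslide literature, in which the angle of reach is the arctangent of the vertical drop H divided by the horizontal runout length L. The function name and the sample values are hypothetical and are not taken from the thesis.

```python
import math


def angle_of_reach_deg(drop_height_m, runout_length_m):
    """Angle of reach in degrees, assuming the common definition arctan(H / L).

    drop_height_m: vertical distance from the crown of the source area to
                   the toe of the deposit (H).
    runout_length_m: horizontal distance between the same two points (L).
    """
    if drop_height_m < 0 or runout_length_m <= 0:
        raise ValueError("H must be non-negative and L must be positive")
    return math.degrees(math.atan2(drop_height_m, runout_length_m))


# Illustrative values only: a 60 m drop over a 100 m horizontal runout.
example_angle = angle_of_reach_deg(60.0, 100.0)
```

Under this definition, a longer runout for the same drop height gives a smaller angle of reach, which is why the quantity is often examined against landslide volume as a measure of mobility.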
The data analysis has shown that although a variety of landslide failure mechanisms are seen
within the study area, the majority of the landslides are rock falls and topples and/or occur within
incised sections of the drainage network. The analysis has also shown that the landslide activity
is controlled by a combination of the discontinuities within the rock mass, as well as contrasts in
the permeability and stiffness of the rock masses/types involved. The influence of human
activity, as well as tectonic activity, rainfall and expansive clay soils has also been considered.
However, a lack of detailed historical landslide and rainfall data limits the conclusions that can
be drawn.
The mapped landslide distribution (supported by the geological and geomorphological ground
model) has highlighted that the majority of the landslides in the Rio Aguas Catchment are
related to a major river capture and modification of the drainage network that occurred
approximately 100 ka BP, and that they are a key component of the geomorphological processes
active within the study area. This river capture, driven by differential tectonic uplift between
sedimentary basins, has caused a wave of incision to pass through a substantial section of the
south central part of the study area leading to the oversteepening of slopes, the incision of the
drainage network and the majority of the landslide activity that is seen within the study area.
The development of the drainage network has been recorded by a series of river terrace deposits
that reflect the overall tectonically induced incision as well as the variable Quaternary climate.
These river terrace deposits have been used to provide a relative temporal framework for the
landslide activity, in the absence of any dated landslide chronology.