3,192 research outputs found
On the selection of secondary indices in relational databases
An important problem in the physical design of databases is the selection of secondary indices. In general, this problem cannot be solved optimally because of the complexity of the selection process. Heuristics such as the well-known ADD and DROP algorithms are often used instead. In this paper it will be shown that frequently used cost functions can be classified as super- or submodular functions. For these functions several mathematical properties have been derived which reduce the complexity of the index selection problem. These properties will be used to develop a tool for physical database design and also give a mathematical foundation for the success of the aforementioned ADD and DROP algorithms.
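As a rough illustration of the ADD heuristic named in this abstract, the following minimal sketch starts from an empty index set and repeatedly adds the single index that lowers total cost the most, stopping when no addition helps. The workload, the per-index maintenance charge and the names `toy_cost` and `add_heuristic` are assumptions made for this example only; they are not taken from the paper, whose cost functions are not given here.

```python
from typing import Callable, FrozenSet, Iterable

# Hypothetical workload: (full-scan cost, {index name: cost when that index is used}).
QUERIES = [
    (50.0, {"a": 10.0}),
    (40.0, {"a": 30.0, "b": 5.0}),
    (10.0, {"c": 8.0}),
]
MAINTENANCE_PER_INDEX = 6.0  # assumed flat update cost per index


def toy_cost(indices: FrozenSet[str]) -> float:
    """Each query uses its cheapest available access path; every index adds upkeep."""
    total = MAINTENANCE_PER_INDEX * len(indices)
    for scan_cost, per_index in QUERIES:
        usable = [c for name, c in per_index.items() if name in indices]
        total += min([scan_cost] + usable)
    return total


def add_heuristic(candidates: Iterable[str],
                  cost: Callable[[FrozenSet[str]], float]) -> FrozenSet[str]:
    """Greedy ADD: grow the index set while some single addition lowers the cost."""
    chosen: FrozenSet[str] = frozenset()
    remaining = set(candidates)
    current = cost(chosen)
    while remaining:
        # sorted() keeps tie-breaking deterministic
        best = min(sorted(remaining), key=lambda c: cost(chosen | {c}))
        best_cost = cost(chosen | {best})
        if best_cost >= current:
            break  # no candidate improves the design any further
        chosen = chosen | {best}
        remaining.remove(best)
        current = best_cost
    return chosen


selected = add_heuristic(["a", "b", "c"], toy_cost)
```

The DROP heuristic is the mirror image: it starts from all candidate indices and removes one at a time while the cost decreases.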
Modelling Uncertainty in Physical Database Design
Physical database design can be regarded as a crucial step in the overall design process of databases. The outcome of physical database design is a physical schema which describes the storage and access structures of the stored database. The selection of an efficient physical schema is an NP-complete problem. A significant number of efforts to develop tools that assist in the selection of physical schemas have been reported. Most of these efforts implicitly apply a number of heuristics to avoid the evaluation of all schemas. In this paper, we present an approach, based on the Dempster-Shafer theory, that explicitly models a rich set of heuristics (used for the selection of an efficient physical schema) as knowledge rules. These rules may be loaded into a knowledge base, which, in turn, can be embedded in physical database design tools.
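For readers unfamiliar with the Dempster-Shafer theory mentioned in this abstract, the sketch below shows Dempster's rule of combination, the standard way two independent bodies of evidence are merged. The two "rules" and their mass values are hypothetical and chosen only to make the arithmetic visible; the paper's actual knowledge rules are not reproduced here.

```python
from itertools import product
from typing import Dict, FrozenSet

Mass = Dict[FrozenSet[str], float]


def combine(m1: Mass, m2: Mass) -> Mass:
    """Dempster's rule: multiply masses, keep non-empty intersections, renormalise."""
    combined: Mass = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to contradictory evidence
    if conflict >= 1.0:
        raise ValueError("the two sources are in total conflict")
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


# Hypothetical frame of discernment: candidate access structures for one attribute.
FRAME = frozenset({"btree", "hash", "none"})

# Hypothetical heuristic 1: range queries suggest a B-tree.
rule_1: Mass = {frozenset({"btree"}): 0.6, FRAME: 0.4}
# Hypothetical heuristic 2: high selectivity suggests some index, B-tree or hash.
rule_2: Mass = {frozenset({"btree", "hash"}): 0.7, FRAME: 0.3}

belief = combine(rule_1, rule_2)
```

Mass left on the whole frame expresses ignorance rather than evidence against any option, which is what lets heuristics of differing strength be combined.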
Service Offshoring and White-Collar Employment
I study the effects of service offshoring on white-collar employment, using highly disaggregated occupational data for the U.S. I present a structural model of the firm's behavior that allows tractable derivation of labor demand elasticities for highly detailed occupations. I estimate the model using Quasi-Maximum Likelihood, to simultaneously account for the high degree of censoring of the employment variable and the small cross-sectional dimension of the panel. I find that service offshoring is skill-biased, because it raises employment among high-skilled occupations and lowers employment among medium- and low-skilled ones. Within each skill group, service offshoring penalizes tradeable occupations and tends to benefit complex non-tradeable jobs.
Keywords: service offshoring, white-collar occupations, labor demand elasticities, homothetic weak separability, censored demand system estimation
An exploration into the sparse representation of spectra
Includes bibliographical references (leaves 73-76). This thesis describes an exploration in achieving sparse representations of objects, with special focus on spectral data. Given a database of objects, one would like to know the actual aspects of each class that distinguish it from any other class in the database. We explore the hypothesis that simple abstractions (descriptions) that humans normally make, especially based on the visual phenomenology or physics of the problem, can be helpful in extracting and formulating useful sparse representations of the observed objects. In this thesis we focus on the discovery of such underlying features, employing a number of recent methods from machine learning. Firstly, we find that an approach to automatic feature discovery recently proposed in the literature (Non-Negative Matrix Factorization) is not as it seems. We show the limitations of this approach and demonstrate a more efficient method on a synthetic problem. Secondly, we explore a more empirical approach to extracting visually attractive features of spectra, from which we formulate a simple re-representation of spectral data and show that the identification and discovery of certain intuitive features at various scales can be sufficient to describe a spectrum profile. Finally, we explore a more traditional and principled automatic method of analyzing a spectrum at different resolutions (Wavelets). We find that certain classes of spectra can easily be discriminated by a simple approximation of the spectrum profile, while in other cases only the finer profile details are important. Throughout this thesis we employ a measure called the separability index as our measure of how easy it is to discriminate objects in a database with the proposed representations.
Scale challenges in inventory of forests aided by remote sensing
The impact of changing the scale of observation on information derived from forest inventories
is the basis of scale-related research in forest inventory and analysis (FIA). Interactions between
the scale of observation and observed heterogeneity in studied variables highlight a dependence
on scale that affects measurements, estimates, and relationships between inventory data from
terrestrial and remote sensing surveys. This doctoral research defines "scale" as the divisions
of continuous space over which measurements are made, or hierarchies of discrete units of
study/analysis in space. Therefore, the "scale of observation" (also known as support) refers
to that integral of space over which statistics are computed and forest inventory variables
regionalized.
Given the ubiquitous nature of scale issues, a case study approach was undertaken in
this research (Articles I-IV) with the goal to provide fundamental understanding of responses
to the scale of observation for specific FIA variables. The studied forest inventory variables
are: forest stand structural heterogeneity, forest cover proportion and tree species identities.
Forest cover proportion (or simply forest area) and tree species are traditional and fundamental
forest inventory variables commonly assessed over large areas using both terrestrial samples
and remote sensing data, whereas forest stand structural heterogeneity is a contemporary FIA
variable that is increasingly demanded in multi-resource inventories to inform management
and conservation efforts, as it is linked to biodiversity, productivity and ecosystem functioning,
and is used as auxiliary data in forest inventory.
This research has two overall aims:
1. To improve the understanding of the association between the scale of observation and
observed heterogeneity in inventory of forest stand structural heterogeneity, forest-cover
proportions, and identification of tree species from a combination of terrestrial samples
and remote sensing data.
2. To contribute knowledge to the estimation of scale-dependence in inventory of forest
stand structural heterogeneity, forest-cover proportions, and identification of tree species
from a combination of terrestrial samples and remote sensing data.
Different scales of observation were considered across the four case studies encompassing
individual leaf, crown-part or branch, single-tree crown, forest stand, landscape and global levels
of analysis. Terrestrial and remote sensing data sets from a variety of temperate forests in
Germany and France were utilized across case studies. In cases where no inventory data were
available, synthetic data were simulated at different scales of observation. Heterogeneity in FIA
variable estimates was monitored across scales of observation using estimators of variance and
associated precision. Because excessive heterogeneity is hard to interpret owing to a low signal-to-noise
ratio, object-based image analysis (OBIA) methods were used to manage heterogeneity in high-resolution
remote sensing data before evaluating scale dependence or scaling across observed
scales. Similarly, ensemble classification techniques were applied to address methodological
heterogeneity across classifiers in a case study on classification of two physically and spectrally
similar Pinus species. Across case studies, a dependence on the scale of observation was
determined by linking estimates of heterogeneity to their respective scales of observation using
linear regression and a combination of geo-statistics and Monte-Carlo approaches. In order to
address scale-dependence, thresholds to scale domains were identified so as to enable efficient
observation of studied FIA variables, and scaling approaches were proposed to bridge observations
across scales. For scaling, this research evaluated the potential of different regression techniques
to map forest stand structural heterogeneity and tree species wall-to-wall from remote sensing
data. In addition, radiative transfer modelling was evaluated in the transfer between leaf and
crown hyperspectra, and a global sampling grid framework was proposed to efficiently link different
stages of survey sampling.
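The idea of monitoring heterogeneity across scales of observation can be sketched as follows: aggregate a gridded variable into non-overlapping square blocks of increasing size (the support) and compute the variance of the block means at each size. This is only a toy illustration under stated assumptions; the checkerboard grid, the function names and the use of simple block averaging are hypothetical and do not reproduce the thesis's estimators or data.

```python
from statistics import pvariance
from typing import Dict, Iterable, List

Grid = List[List[float]]


def block_means(grid: Grid, k: int) -> List[float]:
    """Aggregate a square grid into non-overlapping k x k blocks; return block means."""
    n = len(grid)
    if n % k != 0:
        raise ValueError("grid size must be divisible by the block size")
    means = []
    for r in range(0, n, k):
        for c in range(0, n, k):
            cells = [grid[i][j] for i in range(r, r + k) for j in range(c, c + k)]
            means.append(sum(cells) / len(cells))
    return means


def variance_by_support(grid: Grid, sizes: Iterable[int]) -> Dict[int, float]:
    """Observed heterogeneity (population variance of block means) per support size."""
    return {k: pvariance(block_means(grid, k)) for k in sizes}


# Synthetic 8 x 8 checkerboard: maximal contrast between neighbouring cells.
checkerboard: Grid = [[float((i + j) % 2) for j in range(8)] for i in range(8)]
profile = variance_by_support(checkerboard, [1, 2, 4])
```

On this synthetic pattern all observed variance sits at the finest support and vanishes once blocks span two cells, which is the kind of threshold between scale domains that the text describes.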
This research shows that the scale of observation affected all studied FIA variables albeit
to varying degrees, conditioned on the spatial structure and aggregation properties of the
assessed FIA variable (i.e. whether the variable is extensive, intensive or scale-specific) and
the method used in aggregation on support (e.g. mean, variance, quantile etc.). The scale
of observation affected measurements or estimates of the studied FIA variables as well as
relationships between spatially structured FIA variables. The scale of observation determined
observed heterogeneity in FIA variables, affected parameter retrieval from radiative transfer
models, and affected variable selection and performance of models linking terrestrial and remote
sensing data. On the other hand, this research shows that it is possible to determine domains
of scale dependence within which to efficiently observe the studied FIA variables and to bridge
between scales of observation using various scaling methods.
The findings of this doctoral research are relevant for the general understanding of scale
issues in FIA. Research in Article I, for example, informs optimization of plot sizes for efficient
inventory and mapping of forest structural heterogeneity, as well as for the design of natural
resource inventories. Similarly, research in Article II is applicable in large area forest (or general
land) cover monitoring from sampling by both visual interpretation of high resolution remote
sensing imagery and terrestrial surveys. This research is also useful to determine observation
design for efficient inventory of land cover. Research in Article III contributes to many contexts
of remote-sensing-assisted inventory of forests, especially management and conservation
planning, pest and disease control, and the estimation of biomass. Lastly, research in Article IV
highlights currently understudied scale-related effects in passive optical remote sensing of forests
and can ultimately contribute to sensor calibration and modelling approaches.
A Residential Energy Demand System for Spain
Sharp price fluctuations and increasing environmental and distributional concerns, among other issues, have led to a renewed academic interest in energy demand. In this paper we estimate, for the first time in Spain, an energy demand system with household microdata. In doing so, we tackle several econometric and data problems that are generally recognized to bias parameter estimates. This is obviously relevant, as obtaining correct price and income responses is essential if they are to be used for assessing the economic consequences of hypothetical or real changes. With this objective, we combine data sources for a long time period and choose a demand system with flexible income and price responses. We also estimate the model in different sub-samples to capture varying responses to energy price changes by households living in rural, intermediate and urban areas. This constitutes a first attempt in the literature and it proved to be a very successful choice.
Keywords: households, energy, demand, Spain, location
Three-dimensional hydrodynamic models coupled with GIS-based neuro-fuzzy classification for assessing environmental vulnerability of marine cage aquaculture
There is considerable opportunity to develop new modelling techniques within a
Geographic Information Systems (GIS) framework for the development of sustainable
marine cage culture. However, the spatial data sets are often uncertain and incomplete;
therefore new spatial models employing "soft computing" methods such as fuzzy logic
may be more suitable.
The aim of this study is to develop a model using Neuro-fuzzy techniques in a 3D GIS
(Arc View 3.2) to predict coastal environmental vulnerability for Atlantic salmon cage
aquaculture. A 3D hydrodynamic model (3DMOHID) coupled to a particle-tracking
model is applied to study the circulation patterns, dispersion processes and residence
time in Mulroy Bay, Co. Donegal, Ireland, an Irish fjard (shallow fjordic system), an
area of restricted exchange, geometrically complicated with important aquaculture
activities.
The hydrodynamic model was calibrated and validated by comparison with sea surface
and water flow measurements. The model provided spatial and temporal information on
circulation and renewal time, helping to determine the influence of winds on circulation
patterns and in particular the assessment of the hydrographic conditions with a strong
influence on the management of fish cage culture.
The particle-tracking model was used to study the transport and flushing processes.
Instantaneous massive releases of particles from key boxes are modelled to analyse the
ocean-fjord exchange characteristics and, by emulating discharge from finfish cages, to
show the behaviour of waste in terms of water circulation and water exchange.
In this study the results from the hydrodynamic model have been incorporated into GIS
to provide an easy-to-use graphical user interface for 2D (maps), 3D and temporal
visualization (animations), for interrogation of results.
Data on the physical environment and aquaculture suitability were derived from a
3-dimensional hydrodynamic model and GIS for incorporation into the final model
framework and included mean and maximum current velocities, current flow quiescence
time, water column stratification, sediment granulometry, particulate waste dispersion
distance, oxygen depletion, water depth, coastal protection zones, and slope.
The Neuro-fuzzy classification model NEFCLASS-J was used to develop learning
algorithms to create the structure (rule base) and the parameters (fuzzy sets) of a fuzzy
classifier from a set of classified training data. A total of 42 training sites were sampled
using stratified random sampling from the GIS raster data layers, and the vulnerability
categories for each were manually classified into four categories based on the opinions
of experts with field experience and specific knowledge of the environmental problems
investigated.
The final products, GIS-based Neuro-fuzzy maps, were obtained by combining modelled
and real environmental parameters relevant to marine finfish aquaculture.
Environmental vulnerability models, based on Neuro-fuzzy techniques, showed
sensitivity to the membership shapes of the fuzzy sets, the nature of the weightings
applied to the model rules, and validation techniques used during the learning and
validation process. The accuracy of the final classifier selected was R=85.71%
(estimated error value of ±16.5% from Cross Validation, N=10) with a Kappa
coefficient of agreement of 81%. Unclassified cells in the whole spatial domain (of
1623 GIS cells) ranged from 0% to 24.18%.
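Overall accuracy and the Kappa coefficient of agreement, the two quality measures quoted above, are both derived from a confusion matrix. The sketch below shows the standard Cohen's kappa calculation; the 2 x 2 matrix is hypothetical and is not the study's data, and the function name `agreement_stats` is an assumption made for this example.

```python
from typing import List, Tuple


def agreement_stats(confusion: List[List[int]]) -> Tuple[float, float]:
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    Rows are reference classes, columns are predicted classes.
    """
    size = len(confusion)
    n = sum(sum(row) for row in confusion)
    if n == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(size)) for j in range(size)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class example, for illustration only.
example = [[10, 2],
           [3, 5]]
accuracy, kappa = agreement_stats(example)
```

Kappa is lower than raw accuracy because it discounts the agreement that would be expected by chance given the class frequencies.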
A statistical comparison between vulnerability scores and a significant product of
aquaculture waste (nitrogen concentrations in sediment under the salmon cages) showed
that the final model gave a good correlation between predicted environmental
vulnerability and sediment nitrogen levels, highlighting a number of areas with variable
sensitivity to aquaculture.
Further evaluation and analysis of the quality of the classification was achieved and the
applicability of separability indexes was also studied. The inter-class separability
estimations were performed on two different training data sets to assess the difficulty of
the class separation problem under investigation. The Neuro-fuzzy classifier for a
supervised and hard classification of coastal environmental vulnerability has
demonstrated an ability to derive an accurate and reliable classification into areas of
different levels of environmental vulnerability using a minimal number of training sets.
The output will be an environmental spatial model for application in coastal areas
intended to facilitate policy decision and to allow input into wider ranging spatial
modelling projects, such as coastal zone management systems and effective
environmental management of fish cage aquaculture.