The VIMOS Public Extragalactic Redshift Survey (VIPERS): PCA-based automatic cleaning and reconstruction of survey spectra
Identifying spurious reduction artefacts in galaxy spectra is a challenge for
large surveys. We present an algorithm for identifying and repairing residual
spurious features in sky-subtracted galaxy spectra with application to the
VIPERS survey. The algorithm uses principal component analysis (PCA) applied to
the galaxy spectra in the observed frame to identify sky line residuals
imprinted at characteristic wavelengths. We further model the galaxy spectra in
the rest-frame using PCA to estimate the most probable continuum in the
corrupted spectral regions, which are then repaired. We apply the method to
90,000 spectra from the VIPERS survey and compare the results with a subset
where careful editing was performed by hand. We find that the automatic
technique does an extremely good job in reproducing the time-consuming manual
cleaning and does it in a uniform and objective manner across a large data
sample. The mask data products produced in this work are released together with
the VIPERS second public data release (PDR-2). Comment: Find the VIPERS data release at http://vipers.inaf.i
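The repair step described in this abstract can be illustrated with a minimal sketch. It is not the VIPERS pipeline: it assumes the corrupted pixels are already flagged, uses a single principal component found by power iteration, and all function names (`mean_spectrum`, `first_component`, `repair`) are hypothetical. The idea shown is only that the PCA coefficient is fitted on the uncorrupted pixels and then used to fill in the flagged ones.

```python
import math

def mean_spectrum(spectra):
    """Pixel-wise mean of a list of equal-length spectra."""
    n = len(spectra)
    return [sum(s[i] for s in spectra) / n for i in range(len(spectra[0]))]

def first_component(spectra, mean, iters=200):
    """Leading principal component via power iteration (assumption: one
    component is enough for this toy example; real spectra need several)."""
    centered = [[x - m for x, m in zip(s, mean)] for s in spectra]
    npix = len(mean)
    v = [1.0 / math.sqrt(npix)] * npix  # start vector; must not be orthogonal to the answer
    for _ in range(iters):
        # w = (sum over spectra of s s^T) v, i.e. the unnormalised covariance times v
        w = [0.0] * npix
        for s in centered:
            proj = sum(a * b for a, b in zip(s, v))
            for i in range(npix):
                w[i] += proj * s[i]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def repair(spectrum, bad, mean, comp):
    """Fit the component amplitude on good pixels only, then replace the
    flagged pixels with the reconstructed values."""
    good = [i for i in range(len(spectrum)) if i not in bad]
    num = sum((spectrum[i] - mean[i]) * comp[i] for i in good)
    den = sum(comp[i] ** 2 for i in good)
    coeff = num / den if den else 0.0
    return [mean[i] + coeff * comp[i] if i in bad else spectrum[i]
            for i in range(len(spectrum))]

# Toy training set: a flat continuum plus a scaled "feature" shape.
shape = [1.0, 2.0, 3.0, 2.0, 1.0]
training = [[1.0 + a * x for x in shape] for a in (-1.0, 0.0, 1.0, 2.0)]
mean = mean_spectrum(training)
comp = first_component(training, mean)

# A spectrum whose central pixel was corrupted (true value would be 10).
corrupted = [4.0, 7.0, 50.0, 7.0, 4.0]
fixed = repair(corrupted, {2}, mean, comp)
print(fixed)
```

In this toy case the flagged pixel is restored to a value very close to 10, because the training set is exactly rank one after mean subtraction. Identifying which pixels to flag (the sky-residual detection in the observed frame) is a separate step not covered here.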
Cleaning uncertain data for top-k queries
The information managed in emerging applications, such as sensor networks, location-based services, and data integration, is inherently imprecise. To handle data uncertainty, probabilistic databases have been recently developed. In this paper, we study how to quantify the ambiguity of answers returned by a probabilistic top-k query. We develop efficient algorithms to compute the quality of this query under the possible world semantics. We further address the cleaning of a probabilistic database, in order to improve top-k query quality. Cleaning involves the reduction of ambiguity associated with the database entities. For example, the uncertainty of a temperature value acquired from a sensor can be reduced, or cleaned, by requesting its newest value from the sensor. While this 'cleaning operation' may produce a better query result, it may involve a cost and fail. We investigate the problem of selecting entities to be cleaned under a limited budget. In particular, we propose an optimal solution and several heuristics. Experiments show that the greedy algorithm is efficient and close to optimal. © 2013 IEEE.
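The budgeted selection problem mentioned in this abstract can be sketched as follows. The abstract does not state the greedy criterion, so this example assumes one common choice: rank entities by expected quality gain per unit cost, where the expectation accounts for the chance that a cleaning attempt fails. The `Candidate` fields and the `greedy_select` function are hypothetical names, and the gain values are made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    cost: int         # cost of one cleaning attempt
    gain: float       # assumed quality improvement if cleaning succeeds
    p_success: float  # probability that the cleaning operation succeeds

def greedy_select(candidates, budget):
    """Choose entities in decreasing order of expected gain per unit cost,
    skipping any entity that no longer fits in the remaining budget."""
    ranked = sorted(candidates,
                    key=lambda c: c.p_success * c.gain / c.cost,
                    reverse=True)
    chosen, remaining = [], budget
    for c in ranked:
        if c.cost <= remaining:
            chosen.append(c.name)
            remaining -= c.cost
    return chosen, remaining

sensors = [
    Candidate("A", cost=4, gain=8.0, p_success=0.5),
    Candidate("B", cost=2, gain=3.0, p_success=1.0),
    Candidate("C", cost=5, gain=10.0, p_success=0.9),
]
print(greedy_select(sensors, budget=7))
```

A ratio-based greedy rule like this is cheap to compute (one sort) but is a heuristic: it can miss the best subset when one expensive entity would outperform several cheap ones.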
Graph Summarization
The continuous and rapid growth of highly interconnected datasets, which are
both voluminous and complex, calls for the development of adequate processing
and analytical techniques. One method for condensing and simplifying such
datasets is graph summarization. It denotes a series of application-specific
algorithms designed to transform graphs into more compact representations while
preserving structural patterns, query answers, or specific property
distributions. As this problem is common to several areas studying graph
topologies, different approaches, such as clustering, compression, sampling, or
influence detection, have been proposed, primarily based on statistical and
optimization methods. The focus of our chapter is to pinpoint the main graph
summarization methods, but especially to focus on the most recent approaches
and novel research trends on this topic, not yet covered by previous surveys. Comment: To appear in the Encyclopedia of Big Data Technologie
Production planning and control of closed-loop supply chains
More and more supply chains emerge that include a return flow of materials. Many original equipment manufacturers are nowadays engaged in the remanufacturing business. In many process industries, production defectives and by-products are reworked. These closed-loop supply chains deserve special attention. Production planning and control in such hybrid systems is a real challenge, especially due to increased uncertainties. Even companies that are engaged only in remanufacturing operations face more complicated planning situations than traditional manufacturing companies. We point out the main complicating characteristics in closed-loop systems with both remanufacturing and rework, and indicate the need for new or modified/extended production planning and control approaches. An overview of the existing scientific contributions is given. It appears that we stand only at the beginning of this line of research, and that many more contributions are needed and expected in the future. Keywords: closed-loop supply chains; production planning and control