36,390 research outputs found
Online Bivariate Outlier Detection in Final Test Using Kernel Density Estimation
In parametric IC testing, outlier detection is applied to filter out potentially unreliable devices. Most outlier detection methods are used in an offline setting and hence are not applicable to Final Test, where immediate pass/fail decisions are required. Therefore, we developed a new bivariate online outlier detection method that is applicable to Final Test and makes no assumptions about the specific form of the relation between two test parameters. An acceptance region is constructed using kernel density estimation. We use a grid discretization to enable a fast outlier decision. After each accepted device the grid is updated, so the method is able to adapt to shifting measurements.
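The grid idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `GridKDE`, the Gaussian kernel, the fixed bandwidth, the density threshold, and the seeding data are all assumptions made for illustration. Kernel sums are stored at cell centres, so a pass/fail decision is a single lookup, and the grid is refreshed only after a device is accepted.

```python
import math


class GridKDE:
    """Bivariate Gaussian KDE accumulated on a fixed grid (hypothetical sketch)."""

    def __init__(self, x_range, y_range, n_cells=50, bandwidth=0.5):
        self.x0, self.x1 = x_range
        self.y0, self.y1 = y_range
        self.n = n_cells
        self.h = bandwidth
        self.dx = (self.x1 - self.x0) / n_cells
        self.dy = (self.y1 - self.y0) / n_cells
        # Unnormalised kernel sums evaluated at the cell centres.
        self.grid = [[0.0] * n_cells for _ in range(n_cells)]
        self.count = 0

    def update(self, x, y):
        """Add one Gaussian kernel centred on (x, y) to every cell."""
        for i in range(self.n):
            cx = self.x0 + (i + 0.5) * self.dx
            for j in range(self.n):
                cy = self.y0 + (j + 0.5) * self.dy
                r2 = ((cx - x) ** 2 + (cy - y) ** 2) / self.h ** 2
                self.grid[i][j] += math.exp(-0.5 * r2)
        self.count += 1

    def density(self, x, y):
        """Look up the estimated density of the cell containing (x, y)."""
        i = math.floor((x - self.x0) / self.dx)
        j = math.floor((y - self.y0) / self.dy)
        if self.count == 0 or not (0 <= i < self.n and 0 <= j < self.n):
            return 0.0
        norm = 2.0 * math.pi * self.h ** 2 * self.count
        return self.grid[i][j] / norm

    def accept(self, x, y, threshold):
        """Decide by lookup; update the grid only if the device is accepted."""
        ok = self.density(x, y) >= threshold
        if ok:
            self.update(x, y)
        return ok


# Seed the grid with 100 made-up devices lying on the trend y = x.
kde = GridKDE(x_range=(0.0, 10.0), y_range=(0.0, 10.0))
for k in range(100):
    kde.update(2.0 + 0.06 * k, 2.0 + 0.06 * k)

on_trend = kde.accept(5.0, 5.0, threshold=0.01)   # close to the seeded trend
off_trend = kde.accept(2.0, 8.0, threshold=0.01)  # far from the seeded trend
```

In this sketch the grid must be seeded with an initial sample before any device can be accepted, and old measurements are never down-weighted; how the published method handles either point is not stated in the abstract.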
Azorean agriculture efficiency by PAR
Producers always aspire to increase the efficiency of their production process. However, they do not always succeed in optimizing their production. In recent years, interest in Data Envelopment Analysis (DEA) as a powerful tool for measuring efficiency has increased. This is due to the large number of data sets collected to better understand the phenomena under study and, at the same time, to the need for timely and inexpensive information.
The "Productivity Analysis with R" (PAR) framework establishes a user-friendly data envelopment analysis environment with special emphasis on variable selection and aggregation, and on summarization and interpretation of the results. The starting point is the following R packages: DEA (Diaz-Martinez and Fernandez-Menendez, 2008) and FEAR (Wilson, 2007). The DEA package implements some of the Data Envelopment Analysis models presented in (Cooper et al., 2007). FEAR is a software package for computing nonparametric efficiency estimates and testing hypotheses in frontier models. FEAR implements the bootstrap methods described in (Simar and Wilson, 2000).
PAR is a software framework that uses a portfolio of models for efficiency estimation and also provides functionality for explaining the results. The PAR framework has been developed to distinguish between efficient and inefficient observations and to explicitly advise producers about possibilities for production optimization. The PAR framework offers several R functions for a reasonable interpretation of the data analysis results and a text presentation of the obtained information. The output of an efficiency study with the PAR software is self-explanatory.
We are applying the PAR framework to estimate the efficiency of the agricultural system in the Azores (Mendes et al., 2009). All Azorean farms will be clustered into homogeneous groups according to their efficiency measurements to define clusters of "good" practices and clusters of "less good" practices. This makes PAR appropriate for supporting public policies in the agricultural sector in the Azores.
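For intuition about what a DEA efficiency score expresses, the simplest special case can be computed by hand: with one input and one output under constant returns to scale, a unit's efficiency is its output/input ratio divided by the best ratio observed. The sketch below is in Python rather than R, does not use the DEA or FEAR packages, and the farm figures are invented for illustration.

```python
def crs_efficiency(inputs, outputs):
    """Single-input, single-output efficiency under constant returns to scale.

    Each unit's productivity ratio is divided by the best observed ratio,
    so units on the frontier score 1.0 and all others score below 1.0.
    """
    ratios = [out / inp for inp, out in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]


# Hypothetical farms: input = hectares of pasture, output = tonnes of milk.
hectares = [10.0, 20.0, 30.0, 40.0]
milk = [50.0, 120.0, 150.0, 160.0]

scores = crs_efficiency(hectares, milk)
efficient = [k for k, s in enumerate(scores) if s >= 1.0 - 1e-9]
```

With several inputs and outputs the score is obtained by solving a linear program per unit, which is the job of the packages named in the abstract.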
Data Improving in Time Series Using ARX and ANN Models
Anomalous data can negatively impact energy forecasting by causing model parameters to be incorrectly estimated. This paper presents two approaches for the detection and imputation of anomalies in time series data. Autoregressive with exogenous inputs (ARX) and artificial neural network (ANN) models are used to extract the characteristics of time series. Anomalies are detected by performing hypothesis testing on the extrema of the residuals, and the anomalous data points are imputed using the ARX and ANN models. Because the anomalies affect the model coefficients, the data cleaning process is performed iteratively. The models are re-learned on "cleaner" data after an anomaly is imputed. The anomalous data are re-imputed at each iteration using the updated ARX and ANN models. The ARX and ANN data cleaning models are evaluated on natural gas time series data. This paper demonstrates that the proposed approaches are able to identify and impute anomalous data points. Forecasting models learned on the unclean data and the cleaned data are tested on an uncleaned out-of-sample dataset. The forecasting model learned on the cleaned data outperforms the model learned on the unclean data, with a 1.67% improvement in the mean absolute percentage error and a 32.8% improvement in the root mean squared error. Existing challenges include correctly identifying specific types of anomalies such as negative flows.
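The iterative fit, test, impute, refit loop described in this abstract can be sketched as follows. This is a simplified stand-in, not the paper's method: a straight-line regression on one exogenous input replaces the ARX and ANN models, a fixed cut-off on the largest standardized residual (`z_crit`, an assumed parameter) replaces the formal hypothesis test, and the data are synthetic.

```python
import statistics


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def clean_series(x, y, z_crit=4.0, max_iter=20):
    """Flag and impute extreme residuals, refitting after each imputation."""
    y = list(y)
    flagged = []
    for _ in range(max_iter):
        slope, intercept = fit_line(x, y)
        pred = [intercept + slope * xi for xi in x]
        # Re-impute every previously flagged point with the updated model.
        for t in flagged:
            y[t] = pred[t]
        resid = [yi - pi for yi, pi in zip(y, pred)]
        sigma = statistics.pstdev(resid)
        if sigma == 0.0:
            break
        # Test only the extreme residual, one candidate per iteration.
        worst = max(range(len(y)), key=lambda t: abs(resid[t]))
        if abs(resid[worst]) / sigma < z_crit:
            break
        y[worst] = pred[worst]
        flagged.append(worst)
    return y, flagged


# Synthetic data: y = 3 + 2x plus small deterministic noise, one injected spike.
x = [float(t % 10) for t in range(40)]
noise = [0.1 * ((7 * t) % 5 - 2) for t in range(40)]
y = [3.0 + 2.0 * xi + ni for xi, ni in zip(x, noise)]
y[10] += 50.0

cleaned, flagged = clean_series(x, y)
```

The first imputation is biased because the spike has pulled the fitted line towards itself; refitting and re-imputing on later iterations reduces that bias, which is the reason the abstract gives for iterating.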
Autoencoders for strategic decision support
In the majority of executive domains, a notion of normality is involved in
most strategic decisions. However, few data-driven tools that support strategic
decision-making are available. We introduce and extend the use of autoencoders
to provide strategically relevant granular feedback. A first experiment
indicates that experts are inconsistent in their decision making, highlighting
the need for strategic decision support. Furthermore, using two large
industry-provided human resources datasets, the proposed solution is evaluated
in terms of ranking accuracy, synergy with human experts, and dimension-level
feedback. This three-point scheme is validated using (a) synthetic data, (b)
the perspective of data quality, (c) blind expert validation, and (d)
transparent expert evaluation. Our study confirms several principal weaknesses
of human decision-making and stresses the importance of synergy between a model
and humans. Moreover, unsupervised learning and in particular the autoencoder
are shown to be valuable tools for strategic decision-making.
Spectrophotometric Redshifts In The Faint Infrared Grism Survey: Finding Overdensities Of Faint Galaxies
We improve the accuracy of photometric redshifts by including low-resolution
spectral data from the G102 grism on the Hubble Space Telescope, which assists
in redshift determination by further constraining the shape of the broadband
Spectral Energy Distribution (SED) and identifying spectral features. The
photometry used in the redshift fits includes near-IR photometry from
FIGS+CANDELS, as well as optical data from ground-based surveys and HST ACS,
and mid-IR data from Spitzer. We calculated the redshifts through the
comparison of measured photometry with template galaxy models, using the EAZY
photometric redshift code. For objects with F105W AB mag with a
redshift range of , we find a typical error of for the purely photometric redshifts; with the addition of FIGS spectra,
these become , an improvement of 50%. Addition of
grism data also reduces the outlier rate from 8% to 7% across all fields.
With the more-accurate spectrophotometric redshifts (SPZs), we searched the
FIGS fields for galaxy overdensities. We identified 24 overdensities across the
4 fields. The strongest overdensity, matching a spectroscopically identified
cluster at , has 28 potential member galaxies, of which 8 have previous
spectroscopic confirmation, and features a corresponding X-ray signal. Another
corresponding to a cluster at has 22 members, 18 of which are
spectroscopically confirmed. Additionally, we find 4 overdensities that are
detected at an equal or higher significance in at least one metric to the two
confirmed clusters.
Comment: 17 pages, 13 figures. To appear in Ap