    Online Bivariate Outlier Detection in Final Test Using Kernel Density Estimation

    In parametric IC testing, outlier detection is applied to filter out potentially unreliable devices. Most outlier detection methods are used in an offline setting and hence are not applicable to Final Test, where immediate pass/fail decisions are required. We therefore developed a new bivariate online outlier detection method that is applicable to Final Test and makes no assumptions about a specific form of relation between two test parameters. An acceptance region is constructed using kernel density estimation, and a grid discretization enables a fast outlier decision. The grid is updated after each accepted device, so the method can adapt to shifting measurements.
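    The core loop is compact enough to sketch. Below is a minimal Python illustration of the idea as described in the abstract, assuming a Gaussian kernel, a fixed rectangular grid, and a simple density threshold; the class name, grid size, bandwidth, threshold, and demo data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class OnlineKDEOutlierDetector:
    """Grid-discretized bivariate KDE for online pass/fail decisions (sketch)."""

    def __init__(self, bounds, n_bins=100, bandwidth=0.5, threshold=1e-3):
        self.bounds = bounds              # ((x_min, x_max), (y_min, y_max))
        self.n_bins = n_bins
        self.bandwidth = bandwidth
        self.threshold = threshold        # density below this => outlier
        xs = np.linspace(bounds[0][0], bounds[0][1], n_bins)
        ys = np.linspace(bounds[1][0], bounds[1][1], n_bins)
        self.gx, self.gy = np.meshgrid(xs, ys)
        self.density = np.zeros((n_bins, n_bins))
        self.n_seen = 0

    def _cell(self, x, y):
        # Map a measurement to its nearest grid cell for a fast lookup.
        (x0, x1), (y0, y1) = self.bounds
        j = int(np.clip((x - x0) / (x1 - x0) * (self.n_bins - 1), 0, self.n_bins - 1))
        i = int(np.clip((y - y0) / (y1 - y0) * (self.n_bins - 1), 0, self.n_bins - 1))
        return i, j

    def update(self, x, y):
        # Fold the accepted device into a running-mean Gaussian KDE on the grid.
        self.n_seen += 1
        d2 = (self.gx - x) ** 2 + (self.gy - y) ** 2
        kernel = np.exp(-d2 / (2.0 * self.bandwidth ** 2))
        self.density += (kernel - self.density) / self.n_seen

    def is_outlier(self, x, y):
        # Immediate decision: one grid lookup, no recomputation of the KDE.
        i, j = self._cell(x, y)
        return self.density[i, j] < self.threshold

# Hypothetical usage: warm up on an initial batch, then decide online.
det = OnlineKDEOutlierDetector(bounds=((0.0, 5.0), (0.0, 5.0)))
for x, y in [(1.0, 1.1), (1.2, 0.9), (0.9, 1.0)]:   # warm-up devices
    det.update(x, y)
print(det.is_outlier(1.0, 1.0), det.is_outlier(4.5, 0.2))
```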

    Azorean agriculture efficiency by PAR

    Producers always aspire to increase the efficiency of their production process, yet they do not always succeed in optimizing their production. In recent years, interest in Data Envelopment Analysis (DEA) as a powerful tool for measuring efficiency has grown, driven by the large data sets collected to better understand the phenomena under study and, at the same time, by the need for timely and inexpensive information. The “Productivity Analysis with R” (PAR) framework establishes a user-friendly data envelopment analysis environment with special emphasis on variable selection and aggregation, and on summarization and interpretation of the results. The starting point is the following R packages: DEA (Diaz-Martinez and Fernandez-Menendez, 2008) and FEAR (Wilson, 2007). The DEA package performs several models of Data Envelopment Analysis presented in (Cooper et al., 2007). FEAR is a software package for computing nonparametric efficiency estimates and testing hypotheses in frontier models; it implements the bootstrap methods described in (Simar and Wilson, 2000). PAR is a software framework using a portfolio of models for efficiency estimation that also provides results-explanation functionality. It has been developed to distinguish between efficient and inefficient observations and to explicitly advise producers about possibilities for production optimization. The PAR framework offers several R functions for a reasonable interpretation of the data analysis results and a text presentation of the obtained information, so the output of an efficiency study with PAR is self-explanatory. We apply the PAR framework to estimate the efficiency of the agricultural system in the Azores (Mendes et al., 2009). All Azorean farms will be clustered into homogeneous groups according to their efficiency measurements, to define clusters of “good” practices and clusters of “less good” practices. This makes PAR appropriate for supporting public policies in the agricultural sector in the Azores.
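    PAR itself is an R framework built on the DEA and FEAR packages; as a language-neutral illustration of the kind of model those packages solve, here is a minimal Python sketch of the input-oriented CCR envelopment program (Cooper et al., 2007) using a standard LP solver. The function name, the random demo data, and the 0.9 cut-off for the "good practice" cluster are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def dea_ccr_input(X, Y, k):
    """Input-oriented CCR efficiency of unit k.

    X: (m_inputs, n_units), Y: (s_outputs, n_units).
    Returns theta in (0, 1]; theta == 1 means unit k lies on the frontier.
    """
    m, n = X.shape
    s = Y.shape[0]
    # Decision variables: [theta, lambda_1, ..., lambda_n]; minimize theta.
    c = np.zeros(n + 1)
    c[0] = 1.0
    A_in = np.hstack([-X[:, [k]], X])            # sum_j lam_j x_ij <= theta x_ik
    A_out = np.hstack([np.zeros((s, 1)), -Y])    # sum_j lam_j y_rj >= y_rk
    res = linprog(c,
                  A_ub=np.vstack([A_in, A_out]),
                  b_ub=np.concatenate([np.zeros(m), -Y[:, k]]),
                  bounds=[(0, None)] * (n + 1))
    return res.x[0]

# Hypothetical demo data: 2 inputs and 1 output for 30 synthetic "farms".
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 10.0, size=(2, 30))
Y = rng.uniform(1.0, 10.0, size=(1, 30))
scores = [dea_ccr_input(X, Y, k) for k in range(X.shape[1])]
good = [k for k, t in enumerate(scores) if t > 0.9]   # crude "good practice" cluster
```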

    Data Improving in Time Series Using ARX and ANN Models

    Anomalous data can negatively impact energy forecasting by causing model parameters to be incorrectly estimated. This paper presents two approaches for the detection and imputation of anomalies in time series data. Autoregressive with exogenous inputs (ARX) and artificial neural network (ANN) models are used to extract the characteristics of the time series. Anomalies are detected by performing hypothesis tests on the extrema of the residuals, and the anomalous data points are imputed using the ARX and ANN models. Because the anomalies affect the model coefficients, the data cleaning process is performed iteratively: the models are re-learned on “cleaner” data after an anomaly is imputed, and the anomalous data are re-imputed at each iteration using the updated ARX and ANN models. The ARX and ANN data cleaning models are evaluated on natural gas time series data, and the paper demonstrates that the proposed approaches are able to identify and impute anomalous data points. Forecasting models learned on the unclean data and on the cleaned data are tested on an uncleaned out-of-sample dataset. The forecasting model learned on the cleaned data outperforms the model learned on the unclean data, with a 1.67% improvement in mean absolute percentage error and a 32.8% improvement in root mean squared error. Remaining challenges include correctly identifying specific types of anomalies, such as negative flows.
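    To make the iterative clean-and-refit loop concrete, here is a minimal Python sketch that stands in for the paper's ARX model with a plain AR(p) fit by least squares and a simple z-test on the largest residual; the lag order, critical value, iteration cap, and function names are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np

def fit_ar(series, p):
    """Least-squares AR(p) fit; returns [intercept, phi_1, ..., phi_p]."""
    rows = np.array([series[t - p:t] for t in range(p, len(series))])
    X = np.column_stack([np.ones(len(rows)), rows])
    coef, *_ = np.linalg.lstsq(X, series[p:], rcond=None)
    return coef

def predict_ar(series, coef, p):
    # One-step-ahead predictions for t = p .. len(series)-1.
    rows = np.array([series[t - p:t] for t in range(p, len(series))])
    return np.column_stack([np.ones(len(rows)), rows]) @ coef

def clean_series(series, p=24, z_crit=4.0, max_iter=20):
    """Iteratively flag the most extreme residual and impute it, refitting
    the model on the 'cleaner' data after every imputation."""
    s = np.asarray(series, dtype=float).copy()
    for _ in range(max_iter):
        coef = fit_ar(s, p)
        pred = predict_ar(s, coef, p)
        resid = s[p:] - pred
        z = (resid - resid.mean()) / resid.std()
        worst = int(np.argmax(np.abs(z)))
        if abs(z[worst]) < z_crit:
            break                      # no residual extremum fails the test
        s[worst + p] = pred[worst]     # impute with the model prediction
    return s
```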

    Autoencoders for strategic decision support

    In the majority of executive domains, a notion of normality is involved in most strategic decisions. However, few data-driven tools that support strategic decision-making are available. We introduce and extend the use of autoencoders to provide strategically relevant, granular feedback. A first experiment indicates that experts are inconsistent in their decision-making, highlighting the need for strategic decision support. Furthermore, using two large industry-provided human resources datasets, the proposed solution is evaluated in terms of ranking accuracy, synergy with human experts, and dimension-level feedback. This three-point scheme is validated using (a) synthetic data, (b) the perspective of data quality, (c) blind expert validation, and (d) transparent expert evaluation. Our study confirms several principal weaknesses of human decision-making and stresses the importance of synergy between a model and humans. Moreover, unsupervised learning and, in particular, the autoencoder are shown to be valuable tools for strategic decision-making.
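    As a rough illustration of how an autoencoder yields both a record-level ranking and dimension-level feedback, here is a minimal PyTorch sketch; the architecture, hyperparameters, and the synthetic input are assumptions for demonstration, not the paper's models or HR data.

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    """Tiny autoencoder: reconstruction error acts as an atypicality score,
    and per-dimension errors provide the granular feedback."""
    def __init__(self, n_features, n_hidden=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, n_hidden), nn.ReLU())
        self.decoder = nn.Linear(n_hidden, n_features)

    def forward(self, x):
        return self.decoder(self.encoder(x))

def train(model, X, epochs=200, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(X), X)   # learn to reconstruct "normal" records
        loss.backward()
        opt.step()

# Hypothetical demo: 500 records with 12 features scaled to [0, 1].
X = torch.rand(500, 12)
model = Autoencoder(n_features=12)
train(model, X)
with torch.no_grad():
    err = (model(X) - X) ** 2          # per-dimension reconstruction errors
scores = err.mean(dim=1)               # record-level atypicality score
ranking = torch.argsort(scores, descending=True)   # most atypical first
```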

    Spectrophotometric Redshifts In The Faint Infrared Grism Survey: Finding Overdensities Of Faint Galaxies

    We improve the accuracy of photometric redshifts by including low-resolution spectral data from the G102 grism on the Hubble Space Telescope, which assists in redshift determination by further constraining the shape of the broadband Spectral Energy Distribution (SED) and identifying spectral features. The photometry used in the redshift fits includes near-IR photometry from FIGS+CANDELS, as well as optical data from ground-based surveys and HST ACS, and mid-IR data from Spitzer. We calculated the redshifts through comparison of the measured photometry with template galaxy models, using the EAZY photometric redshift code. For objects with F105W < 26.5 AB mag in the redshift range 0 < z < 6, we find a typical error of Δz = 0.03(1 + z) for the purely photometric redshifts; with the addition of FIGS spectra, this becomes Δz = 0.02(1 + z), an improvement of 50%. The addition of grism data also reduces the outlier rate from 8% to 7% across all fields. With the more accurate spectrophotometric redshifts (SPZs), we searched the FIGS fields for galaxy overdensities and identified 24 across the 4 fields. The strongest overdensity, matching a spectroscopically identified cluster at z = 0.85, has 28 potential member galaxies, of which 8 have previous spectroscopic confirmation, and features a corresponding X-ray signal. Another, corresponding to a cluster at z = 1.84, has 22 members, 18 of which are spectroscopically confirmed. Additionally, we find 4 overdensities that are detected at an equal or higher significance, in at least one metric, than the two confirmed clusters.
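    The template-fitting step can be illustrated with a toy version of what codes like EAZY do far more carefully (template combinations, priors, filter convolution). The sketch below, with hypothetical inputs and function names, finds the minimum-chi-square redshift over a grid and computes a robust Δz/(1 + z) scatter of the kind quoted as the accuracy metric above.

```python
import numpy as np

def photoz_chi2(flux, flux_err, templates, z_grid):
    """Brute-force fit: best-scaling chi^2 per (template, z); keep the minimum.

    templates[t][iz] is the model flux of template t in each band, assumed
    already redshifted to z_grid[iz] and convolved with the filters upstream.
    """
    w = 1.0 / flux_err ** 2
    best = (np.inf, None, None)
    for t, model_grid in enumerate(templates):
        for iz, model in enumerate(model_grid):
            # Amplitude minimizing chi^2 = sum w * (flux - a*model)^2.
            a = np.sum(w * flux * model) / np.sum(w * model ** 2)
            chi2 = np.sum(w * (flux - a * model) ** 2)
            if chi2 < best[0]:
                best = (chi2, z_grid[iz], t)
    return best                        # (chi2_min, z_best, template index)

def nmad(z_est, z_spec):
    # Robust Delta z / (1 + z) scatter between estimated and known redshifts.
    dz = (z_est - z_spec) / (1.0 + z_spec)
    return 1.48 * np.median(np.abs(dz - np.median(dz)))
```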