108 research outputs found

Design and Implementation of an Anomaly Detector

Author: Bagherjeiran A.
Cantu-Paz E.
Kamath C.
Publication venue: Lawrence Livermore National Laboratory
Publication date: 11/07/2005

This paper describes the design and implementation of a general-purpose anomaly detector for streaming data. Based on a survey of similar work from the literature, a basic anomaly detector builds a model on normal data, compares this model to incoming data, and uses a threshold to determine when the incoming data represent an anomaly. Models compactly represent the data but still allow for effective comparison. Comparison methods determine the distance between two models of data or the distance between a model and a point. Threshold selection is a largely neglected problem in the literature, but the current implementation includes two methods to estimate thresholds from normal data. With these components, a user can construct a variety of anomaly detection schemes. The implementation contains several methods from the literature. Three separate experiments tested the performance of the components on two well-known datasets and one completely artificial dataset. The results indicate that the implementation works and can reproduce results from previous experiments.

UNT Digital Library
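
The three components named in the abstract above (a model built on normal data, a comparison method, and a threshold estimated from normal data) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the mean/standard-deviation model, the z-score distance, the quantile-based threshold, and all function names are assumptions chosen for brevity.

```python
import statistics

def fit_model(normal):
    """Assumed model: mean and population standard deviation of the normal data."""
    return statistics.fmean(normal), statistics.pstdev(normal)

def distance(model, x):
    """Assumed comparison: distance between the model and a point (a z-score)."""
    mean, std = model
    return abs(x - mean) / std if std > 0 else abs(x - mean)

def estimate_threshold(model, normal, quantile=0.99):
    """Assumed threshold rule: an upper quantile of distances seen on normal data."""
    dists = sorted(distance(model, x) for x in normal)
    idx = min(len(dists) - 1, int(quantile * len(dists)))
    return dists[idx]

def detect(model, threshold, stream):
    """Yield (value, is_anomaly) for each incoming point."""
    for x in stream:
        yield x, distance(model, x) > threshold

normal = [10.0, 10.5, 9.5, 10.2, 9.8, 10.1, 9.9, 10.3, 9.7, 10.0]
model = fit_model(normal)
threshold = estimate_threshold(model, normal)
flags = [is_anomaly for _, is_anomaly in detect(model, threshold, [10.1, 14.0, 9.9])]
```

Swapping `fit_model`, `distance`, or `estimate_threshold` for another method changes the detection scheme without touching `detect`, which is the kind of composition the abstract describes.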

Group Leaders Optimization Algorithm

Author: Anmer Daskin
Bäck T
Cantu-Paz E
Dennis JE
Goldberg DE
Nielsen MA
Sabre Kais
Wales DJ
Weise T
Winter G
Yang Z
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2011

We present a new global optimization algorithm in which the influence of the leaders in social groups is used as an inspiration for the evolutionary technique, which is designed into a group architecture. To demonstrate the efficiency of the method, we give results for a standard suite of single- and multidimensional optimization functions, the energies and geometric structures of Lennard-Jones clusters, and an application of the algorithm to quantum circuit design problems. We show that, as an improvement over previous methods, the algorithm scales as N^2.5 for Lennard-Jones clusters of N particles. In addition, an efficient circuit design is shown for the two-qubit Grover search algorithm, a quantum algorithm that provides a quadratic speed-up over its classical counterpart.

arXiv.org e-Print Archive

Crossref

Purdue E-Pubs

Mutagenesis as a Diversity Enhancer and Preserver in Evolution Strategies

Author: A. Toffolo
C. Broyden
C. Kelley
E. Cantu-Paz
M. Arioli
N. Hansen
N. Krasnogor
S. García
Z. Michalewicz
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

Proceedings of: 9th International Symposium on Distributed Computing and Artificial Intelligence (DCAI 2012). Salamanca, March 28-30, 2012. Mutagenesis is a process which forces the coverage of certain zones of the search space during the generations of an evolution strategy, by keeping track of the covered ranges for the different variables in the so-called gene matrix. Originally introduced as an artifact to control the automated stopping criterion in a memetic algorithm, ESLAT, it also improved the exploration capabilities of the algorithm, even though this was considered a secondary matter and not properly analyzed or tested. This work focuses on this diversity enhancement, redefining mutagenesis to increase this characteristic, measuring this improvement over a set of twenty-seven unconstrained optimization functions to provide statistically significant results. This work was supported in part by Projects CICYT TIN2008-06742-C02-02/TSI, CICYT TEC2008-06732-C02-02/TEC, CAM CONTEXTS (S2009/TIC-1485) and DPS2008-07029-C02-02.

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

e-Archivo (Univ. Carlos III de Madrid e-Archivo)

Grammatical evolution decision trees for detecting gene-gene interactions

Author: AA Motsinger
AA Motsinger-Reif
Alison A Motsinger-Reif
BA Shepherd
BLG Miller
CS Greene
D Altshuler
DB Goldstein
DR Velez
E Alpaydin
E Cantu-Paz
HJ Cordell
IH Witten
J Koza
JH Moore
JN Hirschhorn
JR Quinlan
JS Aguilar-Ruiz
L Breiman
LGL Devroy
M Hall
M O'Neill
MD Ritchie
MR Nelson
Nicholas E Hardison
R Bellman
R Culverhouse
RJ Neuman
SM Dudek
Stacey J Winham
Sushamna Deodhar
TJ Hastie
W Li
X Yao
Publication venue: BioMed Central
Publication date: 01/11/2010

BACKGROUND: A fundamental goal of human genetics is the discovery of polymorphisms that predict common, complex diseases. It is hypothesized that complex diseases are due to a myriad of factors including environmental exposures and complex genetic risk models, including gene-gene interactions. Such epistatic models present an important analytical challenge, requiring that methods perform not only statistical modeling, but also variable selection to generate testable genetic model hypotheses. This challenge is amplified by recent advances in genotyping technology, as the number of potential predictor variables is rapidly increasing. METHODS: Decision trees are a highly successful, easily interpretable data-mining method that is typically optimized with a hierarchical model building approach, which limits their potential to identify interacting effects. To overcome this limitation, we utilize evolutionary computation, specifically grammatical evolution, to build decision trees to detect and model gene-gene interactions. In the current study, we introduce the Grammatical Evolution Decision Trees (GEDT) method and software and evaluate this approach on simulated data representing gene-gene interaction models of a range of effect sizes. We compare the performance of the method to a traditional decision tree algorithm and a random search approach and demonstrate the improved performance of the method to detect purely epistatic interactions. RESULTS: The results of our simulations demonstrate that GEDT has high power to detect even very moderate genetic risk models. GEDT has high power to detect interactions with and without main effects. CONCLUSIONS: GEDT, while still in its initial stages of development, is a promising new approach for identifying gene-gene interactions in genetic association studies.

Crossref

Directory of Open Access Journals

PubMed Central

A Deeper Look at DES Dwarf Galaxy Candidates: Grus I and Indus II

Author: Aguena M.
Allam S.
Amara A.
Avila S.
Bechtol K.
Brooks D.
Cantu Sarah A.
Carnero Rosell A.
Carrasco Kind M.
Carretero J.
Costanzi M.
Crnojevic Denija
Da Costa L. N.
De Vicente J.
DES Collaboration
Desai S.
Diehl H. T.
Doel P.
Drlica-Wagner A.
Eifler T. F.
Everett S.
Frieman J.
García-Bellido J.
Gaztanaga E.
Gruen D.
Gruendl R. A.
Gschwend J.
Gutierrez G.
Hinton S. R.
Hollowood D. L.
Honscheid K.
James D. J.
Kuehn K.
Maia M. A. G.
Marshall Jennifer
Martínez-Vázquez Clara E.
Menanteau F.
Miquel R.
Pace Andrew B.
Palmese A.
Paz-Chinchón F.
Plazas A. A.
Sanchez E.
Santiago B.
Scarpine V.
Schubnell M.
Serrano S.
Sevilla-Noarbe I.
Simon Joshua D.
Smith M.
Soares-Santos M.
Strigari Louis E.
Stringer K. M.
Suchyta E.
Swanson M. E. C.
Tarle G.
Walker A. R.
Wilkinson R. D.
Publication venue: 'American Astronomical Society'
Publication date: 01/01/2021

We present deep g- and r-band Magellan/Megacam photometry of two dwarf galaxy candidates discovered in the Dark Energy Survey (DES), Grus I and Indus II (DES J2038-4609). For the case of Grus I, we resolved the main sequence turn-off (MSTO) and ∼2 mag below it. The MSTO can be seen at g_0 ∼ 24 with a photometric uncertainty of 0.03 mag. We show Grus I to be consistent with an old, metal-poor (∼13.3 Gyr, [Fe/H] ∼ −1.9) dwarf galaxy. We derive updated distance and structural parameters for Grus I using this deep, uniform, wide-field data set. We find an azimuthally-averaged half-light radius more than two times larger (∼151 +21/−31 pc; ∼4.16′ +0.54/−0.74) and an absolute V-band magnitude of ∼ −4.1 that is ∼1 magnitude brighter than previous studies. We obtain updated distance, ellipticity, and centroid parameters that are in agreement with other studies within uncertainties. Although our photometry of Indus II is ∼2-3 magnitudes deeper than the DES Y1 public release, we find no coherent stellar population at its reported location. The original detection was located in an incomplete region of sky in the DES Y2Q1 data set and was flagged due to potential blue horizontal branch member stars. The best-fit isochrone parameters are physically inconsistent with both dwarf galaxies and globular clusters. We conclude that Indus II is likely a false positive, flagged due to a chance alignment of stars along the line of sight.

Archivio istituzionale della ricerca - Università di Trieste

UCL Discovery

The University of Arizona

Oblique decision trees for spatial pattern detection: optimal algorithm and application to malaria risk

Author: AJ Thomas
B Ghattas
Belco Poudiougou
BW Turnbull
C Schmoor
CE Brodley
CY Fu
D Heath
E Cantu-Paz
F Tanser
G Rushton
GF Killeen
GP Patil
H Zhang
J Cuzick
J Wakefield
Jean Gaudart
JF Bithell
JK Baird
L Breiman
L Duczmal
LA Waller
M Booman
M Kulldorff
M Leblanc
MR Segal
NH Anderson
NJ Crichton
Ogobara Doumbo
OK Doumbo
PJ Diggle
R Xu
RE Gangnon
RG Newcombe
S Gey
SK Murthy
Stéphane Ranque
T Tango
TJ Sheehan
U Hjalmars
V Gomez-Rubio
WJH McBride
World Health Organization
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: In order to detect potential disease clusters where a putative source cannot be specified, classical procedures scan the geographical area with circular windows through a specified grid imposed on the map. However, the choice of the windows' shapes, sizes and centers is critical and different choices may not provide exactly the same results. The aim of our work was to use an Oblique Decision Tree model (ODT) which provides potential clusters without pre-specifying shapes, sizes or centers. For this purpose, we have developed an ODT-algorithm to find an oblique partition of the space defined by the geographic coordinates. METHODS: ODT is based on the classification and regression tree (CART). Whereas CART finds rectangular partitions of the covariate space, ODT provides oblique partitions maximizing the interclass variance of the independent variable. Since this is an NP-hard problem in R^N, classical ODT-algorithms use evolutionary procedures or heuristics. We have developed an optimal ODT-algorithm in R^2, based on the directions defined by each pair of point locations. This partition provided potential clusters which can be tested with Monte-Carlo inference. We applied the ODT-model to a dataset in order to identify potential high risk clusters of malaria in a village in Western Africa during the dry season. The ODT results were compared with those of Kulldorff's SaTScan™. RESULTS: The ODT procedure provided four classes of risk of infection. In the first high risk class, 60% of the children were infected, 95% confidence interval (CI95%) [52.22–67.55]. Monte-Carlo inference showed that the spatial pattern issued from the ODT-model was significant (p < 0.0001). SaTScan results yielded one significant cluster where the risk of disease was high, with an infection rate of 54.21%, CI95% [47.51–60.75]. Its center was located within the first high risk ODT class. Both procedures provided similar results, identifying a high risk cluster in the western part of the village where a mosquito breeding point was located. CONCLUSION: ODT-models improve the classical scanning procedures by detecting potential disease clusters independently of any specification of the shapes, sizes or centers of the clusters.

Crossref

HAL AMU

Springer - Publisher Connector

Directory of Open Access Journals

Neural networks for genetic epidemiology: past, present, and future

During the past two decades, the field of human genetics has experienced an information explosion. The completion of the human genome project and the development of high throughput SNP technologies have created a wealth of data; however, the analysis and interpretation of these data have created a research bottleneck. While technology facilitates the measurement of hundreds or thousands of genes, statistical and computational methodologies are lacking for the analysis of these data. New statistical methods and variable selection strategies must be explored for identifying disease susceptibility genes for common, complex diseases. Neural networks (NN) are a class of pattern recognition methods that have been successfully implemented for data mining and prediction in a variety of fields. The application of NN for statistical genetics studies is an active area of research. Neural networks have been applied in both linkage and association analysis for the identification of disease susceptibility genes

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Dark Energy Survey Year 3 Results: Deep Field optical + near-infrared images and catalogue

Author: Abbott T. M. C.
Aguena M.
Alarcon A.
Allam S.
Amon A.
Annis J.
Bacon D.
Banerji M.
Bechtol K.
Bernstein G. M.
Bertin E.
Bhargava S.
Brooks D.
Buchs R.
Burke D. L.
Cantu S.
Carretero J.
Castander F. J.
Choi A.
Conselice C.
Cordero J.
Costanzi M.
Crocce M.
da Costa L. N.
Davis C.
Davis T. M.
De Vicente J.
DeRose J.
Desai S.
Diehl H. T.
Dietrich J. P.
Dodelson S.
Drlica-Wagner A.
Eckert K.
Eifler T. F.
Elvin-Poole J.
Everett S.
Ferrero I.
Ferté A.
Flaugher B.
Fosalba P.
García-Bellido J.
Gaztanaga E.
Gerdes D. W.
Gruen D.
Gruendl R. A.
Gschwend J.
Gutierrez G.
Harrison I.
Hartley W. G.
Hinton S. R.
Hollowood D. L.
Honscheid K.
Huterer D.
James D. J.
Jarvis M.
Johnson M. D.
Kent S.
Kind M. Carrasco
Kokron N.
Krause E.
Kuehn K.
Kuropatkin N.
Lahav O.
Lin H.
MacCrann N.
Maia M. A. G.
March M.
Marshall J. L.
Martini P.
Melchior P.
Menanteau F.
Miquel R.
Mohr J. J.
Morgan R.
Myles J.
Neilsen E.
Ogando R. L. C.
Pace A. B.
Palmese A.
Pandey S.
Paz-Chinchón F.
Pereira M. E. S.
Plazas A. A.
Prat J.
Rodriguez-Monroy M.
Romer A. K.
Roodman A.
Rosell A. Carnero
Rykoff E. S.
Sako M.
Samuroff S.
Sanchez E.
Scarpine V.
Secco L. F.
Serrano S.
Sevilla-Noarbe I.
Sheldon E.
Smith M.
Soares-Santos M.
Suchyta E.
Swanson M. E. C.
Sánchez C.
Tarle G.
Tarsitano F.
Thomas D.
To C.
Tong A.
Troxel M. A.
Varga T. N.
Vasquez Z.
Walker A. R.
Wang K.
Wester W.
Wilkinson R. D.
Yanny B.
Zhou C.
Zuntz J.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 23/12/2020

We describe the Dark Energy Survey (DES) Deep Fields, a set of images and associated multiwavelength catalogue (ugrizJHKs) built from Dark Energy Camera (DECam) and Visible and Infrared Survey Telescope for Astronomy (VISTA) data. The DES Deep Fields comprise 11 fields (10 DES supernova fields plus COSMOS), with a total area of ∼30 sq. deg. in ugriz bands and reaching a maximum i-band depth of 26.75 (AB, 10σ, 2 arcsec). We present a catalogue for the DES 3-yr cosmology analysis of those four fields with full 8-band coverage, totalling 5.88 sq. deg. after masking. Numbering 2.8 million objects (1.6 million post-masking), our catalogue is drawn from images coadded to consistent depths of r = 25.7, i = 25, and z = 24.3 mag. We use a new model-fitting code, built upon established methods, to deblend sources and ensure consistent colours across the u-band to Ks-band wavelength range. We further detail the tight control we maintain over the point-spread function modelling required for the model fitting, astrometry and consistency of photometry between the four fields. The catalogue allows us to perform a careful star-galaxy separation and produces excellent photometric redshift performance (NMAD = 0.023 at i < 23). The Deep-Fields catalogue will be made available as part of the cosmology data products release, following the completion of the DES 3-yr weak lensing and galaxy clustering cosmology work

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Trieste

Southampton (e-Prints Soton)

HAL-INSU

UCL Discovery

NORA - Norwegian Open Research Archives

Sussex Research Online

MPG.PuRe

Selection Intensity in Genetic Algorithms with Generation Gaps

Author: Cantu-Paz E.
Publication venue: Lawrence Livermore National Laboratory
Publication date: 19/01/2000

This paper presents calculations of the selection intensity of common selection and replacement methods used in genetic algorithms (GAs) with generation gaps. The selection intensity measures the increase of the average fitness of the population after selection, and it can be used to predict the average fitness of the population at each iteration as well as the number of steps until the population converges to a unique solution. In addition, the theory explains the fast convergence of some algorithms with small generation gaps. The accuracy of the calculations was verified experimentally with a simple test function. The results of this study facilitate comparisons between different algorithms, and provide a tool to adjust the selection pressure, which is indispensable to obtain robust algorithms

UNT Digital Library
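
As a rough illustration of the quantity discussed in the abstract above, selection intensity can be measured empirically as the rise in mean fitness after selection, divided by the fitness standard deviation before selection. The sketch below is an assumption-laden example, not the paper's calculation: it uses plain binary tournament selection on normally distributed fitness values, it does not model generation gaps or replacement methods, and the function names are hypothetical.

```python
import random
import statistics

def tournament_select(fitness, size, rng):
    """Return the best of `size` individuals drawn at random with replacement."""
    return max(rng.choice(fitness) for _ in range(size))

def selection_intensity(fitness, size, rng):
    """(mean after selection - mean before) / standard deviation before."""
    mean_before = statistics.fmean(fitness)
    std_before = statistics.pstdev(fitness)
    selected = [tournament_select(fitness, size, rng) for _ in fitness]
    return (statistics.fmean(selected) - mean_before) / std_before

rng = random.Random(0)
population = [rng.gauss(0.0, 1.0) for _ in range(20000)]
intensity = selection_intensity(population, size=2, rng=rng)
```

For standard-normal fitness and binary tournaments the theoretical value is 1/sqrt(pi), about 0.564, so the measured `intensity` should land close to that figure up to sampling noise.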

Using Evolutionary Algorithms to Induce Oblique Decision Trees

Author: Cantu-Paz E.
Kamath C.
Publication venue: Lawrence Livermore National Laboratory
Publication date: 21/01/2000

This paper illustrates the application of evolutionary algorithms (EAs) to the problem of oblique decision tree induction. The objectives are to demonstrate that EAs can find classifiers whose accuracy is competitive with other oblique tree construction methods, and that this can be accomplished in a shorter time. Experiments were performed with a (1+1) evolutionary strategy and a simple genetic algorithm on public domain and artificial data sets. The empirical results suggest that the EAs quickly find competitive classifiers, and that EAs scale up better than traditional methods to the dimensionality of the domain and the number of training instances.

UNT Digital Library
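
To make the idea in the abstract above concrete, the following sketch shows how a (1+1) evolution strategy could search for a single oblique split, that is, a hyperplane w·x + b > 0 over all attributes at once. It is only an illustration under stated assumptions: the accuracy-based fitness, the fixed mutation step `sigma`, the synthetic data, and the function names are not taken from the paper, which may use a different split criterion and builds full trees.

```python
import random

def split_accuracy(weights, points, labels):
    """Fraction of points whose side of the hyperplane matches their label.

    `weights` holds one coefficient per attribute followed by a bias term.
    """
    correct = 0
    for x, y in zip(points, labels):
        side = sum(w * xi for w, xi in zip(weights[:-1], x)) + weights[-1] > 0
        correct += side == y
    return correct / len(points)

def one_plus_one_es(points, labels, steps=2000, sigma=0.5, seed=0):
    """(1+1)-ES: mutate the single parent, keep the child if it is not worse."""
    rng = random.Random(seed)
    dim = len(points[0]) + 1
    parent = [rng.gauss(0, 1) for _ in range(dim)]
    parent_fit = split_accuracy(parent, points, labels)
    for _ in range(steps):
        child = [w + rng.gauss(0, sigma) for w in parent]
        child_fit = split_accuracy(child, points, labels)
        if child_fit >= parent_fit:
            parent, parent_fit = child, child_fit
    return parent, parent_fit

# Assumed synthetic data: the true class boundary is the oblique line x0 + x1 = 1.
data_rng = random.Random(42)
points = [(data_rng.uniform(0, 2), data_rng.uniform(0, 2)) for _ in range(200)]
labels = [x0 + x1 > 1 for x0, x1 in points]
best_weights, best_fit = one_plus_one_es(points, labels)
```

Because a child replaces the parent only when it is at least as accurate, `best_fit` can never fall below the accuracy of the random starting hyperplane. An axis-parallel tree would need several splits to approximate the same diagonal boundary.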