1,277 research outputs found

    A Framework for Genetic Algorithms Based on Hadoop

    Genetic Algorithms (GAs) are powerful metaheuristic techniques widely used in real-world applications. The sequential execution of GAs requires considerable computational power both in time and resources. Nevertheless, GAs are naturally parallel, and accessing a parallel platform such as the Cloud is easy and cheap. Apache Hadoop is one of the common services that can be used for parallel applications. However, using Hadoop to develop a parallel version of GAs is not simple without dealing with its inner workings. Even though some sequential frameworks for GAs already exist, there is no framework supporting the development of GA applications that can be executed in parallel. This paper describes a framework for parallel GAs on the Hadoop platform, following the MapReduce paradigm. The main purpose of this framework is to allow the user to focus on the aspects of the GA that are specific to the problem to be addressed, with the assurance that this task will be executed correctly on the Cloud with good performance. The framework has also been used to develop an application for the Feature Subset Selection problem. A preliminary analysis of the performance of the developed GA application was performed using three datasets and showed very promising performance.
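    To illustrate how a GA generation can be split along MapReduce lines, here is a minimal single-process sketch in plain Python. It is not the framework described in the abstract and does not use Hadoop: the "map" step scores individuals independently, and the "reduce" step gathers the scores to select and breed the next generation. The toy fitness function (which treats features 0 and 2 as useful), the function names, and all parameter values are assumptions made for illustration only.

```python
import random


def fitness(individual):
    # Hypothetical toy objective for feature subset selection:
    # reward features 0 and 2, penalize every selected feature.
    useful = {0, 2}
    hits = sum(1.0 for i, bit in enumerate(individual) if bit and i in useful)
    return hits - 0.1 * sum(individual)


def map_phase(population):
    # "Map": each individual is scored independently of the others,
    # which is the part that parallelizes naturally.
    return [(fitness(ind), ind) for ind in population]


def tournament(scored, rng, k=2):
    # Pick k scored individuals at random and keep the fittest one.
    return max(rng.sample(scored, k), key=lambda pair: pair[0])[1]


def reduce_phase(scored, rng, mutation_rate=0.05):
    # "Reduce": collect all scores, then select, cross over, and mutate.
    best = max(scored, key=lambda pair: pair[0])[1]
    next_pop = [list(best)]  # elitism: the best individual survives unchanged
    while len(next_pop) < len(scored):
        p1, p2 = tournament(scored, rng), tournament(scored, rng)
        cut = rng.randrange(1, len(p1))  # single-point crossover
        child = p1[:cut] + p2[cut:]
        child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
        next_pop.append(child)
    return next_pop


def run_ga(n_features=6, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population = reduce_phase(map_phase(population), rng)
    # Return (best_fitness, best_individual) from the final population.
    return max(map_phase(population), key=lambda pair: pair[0])
```

    In this sketch only `fitness` is problem-specific, which mirrors the stated goal of letting the user concentrate on the problem-dependent parts of the GA.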

    Orion revisited. II. The foreground population to Orion A

    Following the recent discovery of a large population of young stars in front of the Orion Nebula, we carried out an observational campaign with the DECam wide-field camera covering ~10 deg^2 centered on NGC 1980 to confirm, probe the extent of, and characterize this foreground population of pre-main-sequence stars. We confirm the presence of a large foreground population towards the Orion A cloud. This population contains several distinct subgroups, including NGC 1980 and NGC 1981, and stretches across several degrees in front of the Orion A cloud. By comparing the location of their sequence in various color-magnitude diagrams with other clusters, we found a distance and an age of 380 pc and 5~10 Myr, in good agreement with previous estimates. Our final sample includes 2123 candidate members and is complete from below the hydrogen-burning limit to about 0.3 Msun, where the data start to be limited by saturation. Extrapolating the mass function to high masses, we estimate a total number of ~2600 members in the surveyed region. We confirm the presence of a rich, contiguous, and essentially coeval population of about 2600 foreground stars in front of the Orion A cloud, loosely clustered around NGC 1980, NGC 1981, and a new group in the foreground of the OMC-2/3. For the area of the cloud surveyed, this result implies that there are more young stars in the foreground population than young stars inside the cloud. Assuming a normal initial mass function, we estimate that between one and a few supernovae must have exploded in the foreground population in the past few million years, close to the surface of Orion A, which might be responsible, together with stellar winds, for the structure and star formation activity in these clouds. This long-overlooked foreground stellar population is of great significance, calling for a revision of the star formation history in this region of the Galaxy. Comment: Accepted for publication in A&

    Did You Do Your Homework? Raising Awareness on Software Fairness and Discrimination

    Machine Learning is a vital part of various modern-day decision-making software. At the same time, it has been shown to exhibit bias, which can cause unjust treatment of individuals and population groups. One method to achieve fairness in machine learning software is to provide individuals with the same degree of benefit, regardless of sensitive attributes (e.g., students receive the same grade, independent of their sex or race). However, there can be other attributes that one might want to discriminate against (e.g., students with homework should receive higher grades). We will call such attributes anti-protected attributes. When reducing the bias of machine learning software, one risks losing the discriminatory behaviour with respect to anti-protected attributes. To combat this, we use grid search to show that machine learning software can be debiased (e.g., reduce gender bias) while also improving the ability to discriminate against anti-protected attributes.

    The Effect of Offspring Population Size on NSGA-II: A Preliminary Study

    The Non-Dominated Sorting Genetic Algorithm (NSGA-II) is one of the most popular Multi-Objective Evolutionary Algorithms (MOEAs) and has been applied to a large range of problems. Previous studies have shown that parameter tuning can improve NSGA-II performance. However, the tuning of the offspring population size, which guides the exploration-exploitation trade-off in NSGA-II, has been overlooked so far. Previous work has generally used the population size as the default offspring population size for NSGA-II. We therefore investigate the impact of offspring population size on the performance of NSGA-II. We carry out an empirical study by comparing the effectiveness of three configurations vs. the default NSGA-II configuration on six optimization problems based on four Pareto front quality indicators and statistical tests. Our findings show that the performance of NSGA-II can be improved by reducing the offspring population size and in turn increasing the number of generations. This leads to similar or statistically significantly better results than those obtained by using the default NSGA-II configuration in 92% of the experiments performed.
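    The trade-off between offspring population size and number of generations can be made concrete with a little arithmetic, assuming a fixed budget of fitness evaluations. The abstract does not state the budget or the configurations tested, so the function name and the numbers below (a budget of 10,000 evaluations and a population of 100) are hypothetical.

```python
def generations_for_budget(budget, pop_size, offspring_size):
    """Number of full generations that fit in a fixed evaluation budget.

    Assumes the initial population costs `pop_size` evaluations and every
    generation afterwards costs `offspring_size` evaluations.
    """
    if pop_size <= 0 or offspring_size <= 0:
        raise ValueError("pop_size and offspring_size must be positive")
    remaining = budget - pop_size
    if remaining < 0:
        raise ValueError("budget is smaller than the initial population")
    return remaining // offspring_size


# Hypothetical comparison: the default (offspring size equal to the
# population size) against two smaller offspring sizes.
for offspring in (100, 50, 10):
    print(offspring, generations_for_budget(10_000, 100, offspring))
```

    Under these assumptions, cutting the offspring size from 100 to 10 raises the number of generations from 99 to 990 for the same evaluation cost, which is the kind of exchange the study examines.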

    A hierarchical Bayesian model to infer PL(Z) relations using Gaia parallaxes

    Aims. We aim to create a Bayesian model to infer the coefficients of PL or PLZ relations that propagates uncertainties in the observables in a rigorous and well-founded way. Methods. We propose a directed acyclic graph to encode the conditional probabilities of the inference model that will allow us to infer probability distributions for the PL and PL(Z) relations. We evaluate the model with several semi-synthetic data sets and apply it to a sample of 200 fundamental mode and first overtone mode RR Lyrae stars for which Gaia DR1 parallaxes and literature Ks-band mean magnitudes are available. We define and test several hyperprior probabilities to verify their adequacy and check the sensitivity of the solution with respect to the prior choice. Results. The main conclusion of this work is the absolute necessity of incorporating the existing correlations between the observed variables (periods, metallicities and parallaxes) in the form of model priors in order to avoid systematically biased results, especially in the case of non-negligible uncertainties in the parallaxes. The tests with the semi-synthetic data based on the data set used in Gaia Collaboration et al. (2017) reveal the significant impact that the existing correlations between parallax, metallicity and periods have on the inferred parameters. The relation coefficients obtained here have been superseded by those presented in Muraveva et al. (2018a), which incorporates the findings of this work and the more recent Gaia DR2 measurements. Comment: 14 pages, 12 figures. Submitted to A&
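    As a rough illustration of how a PL(Z) relation connects to a parallax, the sketch below computes the parallax predicted by a linear relation M = a log10(P) + b + c [Fe/H] through the distance modulus m - M = 5 log10(d) - 5, with d in parsecs and parallax in milliarcseconds equal to 1000/d. This is not the hierarchical model from the abstract: the linear form, the neglect of extinction, the function names, and any coefficient values passed in are assumptions for illustration.

```python
import math


def absolute_magnitude(log_period, feh, a, b, c=0.0):
    # Assumed linear PL(Z) form; a, b, c are hypothetical coefficients.
    return a * log_period + b + c * feh


def predicted_parallax_mas(apparent_mag, log_period, feh, a, b, c=0.0):
    # Distance modulus: m - M = 5 log10(d_pc) - 5, with d_pc = 1000 / parallax_mas,
    # so log10(parallax_mas) = (M - m + 10) / 5. Extinction is ignored here.
    abs_mag = absolute_magnitude(log_period, feh, a, b, c)
    return 10.0 ** (0.2 * (abs_mag - apparent_mag + 10.0))


def distance_pc(parallax_mas):
    # Naive inversion, only meaningful for a noise-free, positive parallax.
    return 1000.0 / parallax_mas
```

    Predicting the parallax from the model parameters and comparing it with the measured value keeps the noisy parallax on the data side of the comparison, so it never has to be inverted into a distance directly.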

    Statistical techniques for the detection and analysis of solar explosive events

    Solar explosive events are commonly explained as small-scale magnetic reconnection events, although unambiguous confirmation of this scenario remains elusive due to the lack of spatial resolution and of statistical analysis of large enough samples of this type of event. In this work, we propose a sound statistical treatment of data cubes consisting of a temporal sequence of long-slit spectra of the solar atmosphere. The analysis comprises all the stages from the explosive event detection to its characterization and the subsequent sample study. We have designed two complementary approaches based on the combination of standard statistical techniques (Robust Principal Component Analysis in one approach and wavelet decomposition and Independent Component Analysis in the second) in order to obtain the least biased samples. These techniques are implemented in the spirit of letting the data speak for themselves. The analysis is carried out for two spectral lines: the C IV line at 1548.2 angstroms and the Ne VIII line at 770.4 angstroms. We find significant differences between the characteristics of the line profiles emitted in the proximity of two active regions and those in the quiet Sun, most visible in the relative importance of a separate population of redshifted profiles. We also find a higher frequency of explosive events near the active regions, and in the C IV line. The distribution of the explosive event characteristics is interpreted in the light of recent numerical simulations. Finally, we point out several regions of the parameter space where the reconnection model has to be refined in order to explain the observations. Comment: Accepted for publication in Astronomy and Astrophysics (in Section 9. The Sun) on 18/01/2011. 17 pages, 22 Figure