113 research outputs found
Inheritance-Based Diversity Measures for Explicit Convergence Control in Evolutionary Algorithms
Diversity is an important factor in evolutionary algorithms to prevent
premature convergence towards a single local optimum. In order to maintain
diversity throughout the process of evolution, various means exist in the
literature. We analyze approaches to diversity that (a) have an explicit and
quantifiable influence on fitness at the individual level and (b) require no
(or very little) additional domain knowledge such as domain-specific distance
functions. We also introduce the concept of genealogical diversity in a broader
study. We show that employing these approaches can help evolutionary algorithms
for global optimization in many cases.
Comment: GECCO '18: Genetic and Evolutionary Computation Conference, 2018, Kyoto, Japan
Learning Action Models as Reactive Behaviors
Autonomous vehicles will require both projective planning and reactive components in order to perform robustly. Projective components are needed for long-term planning and replanning where explicit reasoning about future states is required. Reactive components allow the system to always have some action available in real time, and can themselves exhibit robust behavior, but lack the ability to explicitly reason about future states over a long time period. This work addresses the problem of learning reactive components (normative action models) for autonomous vehicles from simulation models. Two main thrusts of our current work are described here. First, we wish to show that behaviors learned from simulation are useful in the actual physical system operating in the real world. Second, in order to scale the technique, we demonstrate how behaviors can be built up by first learning lower-level behaviors, and then fixing these to use as base components of higher-level behaviors.
Application of machine learning in SNP discovery
Background: Single nucleotide polymorphisms (SNP) constitute more than 90% of the genetic variation, and hence can account for most trait differences among individuals in a given species. Polymorphism detection software PolyBayes and PolyPhred give high false positive SNP predictions even with stringent parameter values. We developed a machine learning (ML) method to augment PolyBayes to improve its prediction accuracy. ML methods have also been successfully applied to other bioinformatics problems in predicting genes, promoters, transcription factor binding sites and protein structures.
Results: The ML program C4.5 was applied to a set of features in order to build a SNP classifier from training data based on human expert decisions (True/False). The training data were 27,275 candidate SNP generated by sequencing 1973 STS (sequence tag sites) (12 Mb) in both directions from 6 diverse homozygous soybean cultivars and PolyBayes analysis. Test data of 18,390 candidate SNP were generated similarly from 1359 additional STS (8 Mb). SNP from both sets were classified by experts. After training the ML classifier, it agreed with the experts on 97.3% of test data compared with 7.8% agreement between PolyBayes and experts. The PolyBayes positive predictive values (PPV) (i.e., fraction of candidate SNP being real) were 7.8% for all predictions and 16.7% for those with 100% posterior probability of being real. Using ML improved the PPV to 84.8%, a 5- to 10-fold increase. While both ML and PolyBayes produced a similar number of true positives, the ML program generated only 249 false positives as compared to 16,955 for PolyBayes. The complexity of the soybean genome may have contributed to high false SNP predictions by PolyBayes and hence results may differ for other genomes.
Conclusion: A machine learning (ML) method was developed as a supplementary feature to the polymorphism detection software for improving prediction accuracies. The results from this study indicate that a trained ML classifier can significantly reduce human intervention and in this case achieved a 5–10 fold enhanced productivity. The optimized feature set and ML framework can also be applied to all polymorphism discovery software. ML support software is written in Perl and can be easily integrated into an existing SNP discovery pipeline.
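The PPV figures in this abstract can be cross-checked with a short calculation. The sketch below assumes the usual definition PPV = TP / (TP + FP); the function names are illustrative and not taken from the paper's Perl software. It inverts the definition to estimate the number of true positives implied by the quoted PPV and false-positive counts.

```python
def positive_predictive_value(true_pos, false_pos):
    """Fraction of predicted SNPs that are real: TP / (TP + FP)."""
    return true_pos / (true_pos + false_pos)


def true_positives_from_ppv(ppv, false_pos):
    """Invert PPV = TP / (TP + FP) to estimate TP from PPV and FP."""
    return ppv * false_pos / (1.0 - ppv)


# Figures quoted in the abstract: PPV and false-positive counts on the test set.
ml_tp = true_positives_from_ppv(0.848, 249)         # ML classifier
polybayes_tp = true_positives_from_ppv(0.078, 16955)  # PolyBayes, all predictions

# Ratio of ML PPV to the two PolyBayes PPVs (all predictions, 100% posterior).
fold_vs_all = 0.848 / 0.078
fold_vs_certain = 0.848 / 0.167

print(f"implied ML true positives:        {ml_tp:.0f}")
print(f"implied PolyBayes true positives: {polybayes_tp:.0f}")
print(f"PPV improvement: {fold_vs_certain:.1f}x to {fold_vs_all:.1f}x")
```

Under this definition both methods imply a similar number of true positives, and the PPV ratio falls roughly between 5 and 11, which is consistent with the abstract's statements.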
SNP-PHAGE – High throughput SNP discovery pipeline
BACKGROUND: Single nucleotide polymorphisms (SNPs) as defined here are single base sequence changes or short insertion/deletions between or within individuals of a given species. As a result of their abundance and the availability of high throughput analysis technologies SNP markers have begun to replace other traditional markers such as restriction fragment length polymorphisms (RFLPs), amplified fragment length polymorphisms (AFLPs) and simple sequence repeats (SSRs or microsatellite) markers for fine mapping and association studies in several species. For SNP discovery from chromatogram data, several bioinformatics programs have to be combined to generate an analysis pipeline. Results have to be stored in a relational database to facilitate interrogation through queries or to generate data for further analyses such as determination of linkage disequilibrium and identification of common haplotypes. Although these tasks are routinely performed by several groups, an integrated open source SNP discovery pipeline that can be easily adapted by new groups interested in SNP marker development is currently unavailable. RESULTS: We developed SNP-PHAGE (SNP discovery Pipeline with additional features for identification of common haplotypes within a sequence tagged site (Haplotype Analysis) and GenBank (-dbSNP) submissions). This tool was applied for analyzing sequence traces from diverse soybean genotypes to discover over 10,000 SNPs. This package was developed on UNIX/Linux platform, written in Perl and uses a MySQL database. Scripts to generate a user-friendly web interface are also provided with common queries for preliminary data analysis. A machine learning tool developed by this group for increasing the efficiency of SNP discovery is integrated as a part of this package as an optional feature. The SNP-PHAGE package is being made available open source at .
CONCLUSION: SNP-PHAGE provides a bioinformatics solution for high throughput SNP discovery, identification of common haplotypes within an amplicon, and GenBank (dbSNP) submissions. SNP selection and visualization are aided through a user-friendly web interface. This tool is useful for analyzing sequence tagged sites (STSs) of genomic sequences, and this software can serve as a starting point for groups interested in developing SNP markers.
Interventions in measles outbreaks: the potential reduction in cases associated with school suspension and vaccination interventions
Background: Measles is resurgent in the US, with more cases in 2019 than any year since 1992. Many of the cases were concentrated in three outbreaks in New York and Washington states, where local governments enacted intervention strategies in an attempt to limit the spread of measles. Regulations differed by location, suggesting guidance on the optimal interventions may be beneficial.
Methods: We simulate the daily interactions of the populations of six metropolitan areas of Texas, US, using an agent-based model. The real-life vaccination rates of each school in these metropolitan areas are applied to simulated equivalents. A single case of measles is introduced to the population and the resulting number of cases counted. A range of public health interventions, focused on suspending unvaccinated students and mandatory vaccinations, were simulated during measles outbreaks and the reduction in the number of measles cases, relative to no intervention, recorded. Interventions were simulated both in only those schools with measles cases and in all schools in each metropolitan area.
Results: Suspending unvaccinated students from school was associated with the greatest reduction in measles cases. In a plausible worst-case outbreak scenario, the number of cases is forecast to reduce by 68-96%. Interventions targeting all schools in a metropolitan area are not found to be associated with fewer measles cases than only targeting schools with measles cases, at 2018 vaccination rates. Targeting all schools also increases the cumulative number of school days missed by suspended students by a factor of 10-100, depending on the metropolitan area, compared to targeting only schools with measles cases. If vaccination rates drop 5% in the schools which are under-vaccinated in 2018, metropolitan area-wide interventions are forecast to be associated with fewer cases than school-specific interventions.
Conclusions: Interventions that are quickly implemented and widely followed may reduce the size of measles outbreaks by up to 96%. If vaccination rates continue to fall in Texas, metropolitan area-wide interventions should be considered in the event of an outbreak.
NuSTAR and Swift observations of the black hole candidate XTE J1908+094 during its 2013 outburst
In October 2013, the black hole candidate XTE J1908+094 went into outburst for
the first time since 2003. We report on an observation with the Nuclear
Spectroscopic Telescope Array (NuSTAR) and monitoring observations with Swift
during the outburst. NuSTAR caught the source in the soft state: the spectra
show a broad relativistic iron line, and the light curves reveal a ~40 ks flare
with the count rate peaking about 40% above the non-flare level and with
significant spectral variation. A model combining a multi-temperature thermal
component, a power-law, and a reflection component with an iron line provides a
good description of the NuSTAR spectrum. Although relativistic broadening of
the iron line is observed, it is not possible to constrain the black hole spin
with these data. The variability of the power-law component, which can also be
modeled as a Comptonization component, is responsible for the flux and spectral
change during the flare, suggesting that changes in the corona (or possibly
continued jet activity) are the likely cause of the flare.
Comment: 9 pages, 6 figures, 3 tables, accepted for publication in Ap
The smooth cyclotron line in Her X-1 as seen with NuSTAR
Her X-1, one of the brightest and best studied X-ray binaries, shows a
cyclotron resonant scattering feature (CRSF) near 37 keV. This makes it an
ideal target for detailed study with the Nuclear Spectroscopic Telescope Array
(NuSTAR), taking advantage of its excellent hard X-ray spectral resolution. We
observed Her X-1 three times, coordinated with Suzaku, during one of the high
flux intervals of its 35d super-orbital period. This paper focuses on the shape
and evolution of the hard X-ray spectrum. The broad-band spectra can be fitted
with a powerlaw with a high-energy cutoff, an iron line, and a CRSF. We find
that the CRSF has a very smooth and symmetric shape, in all observations and at
all pulse-phases. We compare the residuals of a line with a Gaussian optical
depth profile to a Lorentzian optical depth profile and find no significant
differences, strongly constraining the very smooth shape of the line. Even
though the line energy changes dramatically with pulse phase, we find that its
smooth shape does not. Additionally, our data show that the continuum changes
only marginally between the three observations. These changes can be
explained with varying amounts of Thomson scattering in the hot corona of the
accretion disk. The average, luminosity-corrected CRSF energy is lower than in
past observations and follows a secular decline. The excellent data quality of
NuSTAR provides the best constraint on the CRSF energy to date.
Comment: 13 pages, 13 figures, accepted for publication in Ap
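The comparison between Gaussian and Lorentzian optical depth profiles can be illustrated with a minimal sketch. It assumes generic textbook forms for the two profiles, applied as a multiplicative factor exp(-tau(E)) on the continuum; the exact parametrization used in the paper may differ. The centroid of 37 keV comes from the abstract, while the width and depth values are hypothetical and chosen only for illustration.

```python
import math


def gaussian_tau(energy, e0, sigma, tau0):
    """Gaussian optical depth profile with peak depth tau0 at energy e0."""
    return tau0 * math.exp(-0.5 * ((energy - e0) / sigma) ** 2)


def lorentzian_tau(energy, e0, fwhm, tau0):
    """Lorentzian optical depth profile with peak depth tau0 at energy e0."""
    half = fwhm / 2.0
    return tau0 * half ** 2 / ((energy - e0) ** 2 + half ** 2)


def line_factor(tau):
    """Multiplicative absorption factor applied to the continuum."""
    return math.exp(-tau)


E0 = 37.0     # keV, line energy quoted in the abstract
SIGMA = 6.0   # keV, hypothetical Gaussian width
TAU0 = 0.6    # hypothetical peak optical depth
# Match the full width at half maximum so only the wing shapes differ.
FWHM = 2.0 * math.sqrt(2.0 * math.log(2.0)) * SIGMA

for energy in (25.0, 31.0, 37.0, 43.0, 49.0, 55.0):
    g = line_factor(gaussian_tau(energy, E0, SIGMA, TAU0))
    l = line_factor(lorentzian_tau(energy, E0, FWHM, TAU0))
    print(f"{energy:5.1f} keV  gaussian={g:.4f}  lorentzian={l:.4f}")
```

With matched peak depth and width, the two profiles agree at the line center and at half maximum and differ mainly in the wings, where the Lorentzian absorbs more. Residuals that cannot distinguish the two therefore constrain how much structure the line shape can have.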
No Time for Dead Time: Timing analysis of bright black hole binaries with NuSTAR
Timing of high-count rate sources with the NuSTAR Small Explorer Mission
requires specialized analysis techniques. NuSTAR was primarily designed for
spectroscopic observations of sources with relatively low count-rates rather
than for timing analysis of bright objects. The instrumental dead time per
event is relatively long (~2.5 msec), and varies by a few percent
event-to-event. The most obvious effect is a distortion of the white noise
level in the power density spectrum (PDS) that cannot be modeled easily with
the standard techniques due to the variable nature of the dead time. In this
paper, we show that it is possible to exploit the presence of two completely
independent focal planes and use the cross power density spectrum to obtain a
good proxy of the white noise-subtracted PDS. Thereafter, one can use a Monte
Carlo approach to estimate the remaining effects of dead time, namely a
frequency-dependent modulation of the variance and a frequency-independent drop
of the sensitivity to variability. In this way, most of the standard timing
analysis can be performed, albeit with a sacrifice in signal to noise relative
to what would be achieved using more standard techniques. We apply this
technique to NuSTAR observations of the black hole binaries GX 339-4, Cyg X-1
and GRS 1915+105.
Comment: 13 pages, 8 figures, submitted to Ap
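Why two independent focal planes help can be seen in a toy example. The sketch below is not the paper's pipeline: it ignores dead time entirely, uses Gaussian white noise in place of counting noise, and leaves the spectra unnormalized. All constants and function names are hypothetical. It only shows that noise which is uncorrelated between two detectors averages out of the real part of the cross spectrum, while it adds a constant level to a single-detector PDS.

```python
import cmath
import math
import random

N_BINS = 32        # samples per segment (hypothetical)
N_SEGMENTS = 400   # number of segments averaged (hypothetical)
SIGNAL_K = 4       # Fourier index of the injected sinusoid (hypothetical)


def dft(x):
    """Plain discrete Fourier transform for indices 0 .. len(x)//2 - 1."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2)
    ]


def averaged_spectra(rng):
    """Average the detector-A PDS and the A-B cospectrum over all segments."""
    pds_a = [0.0] * (N_BINS // 2)
    cospec = [0.0] * (N_BINS // 2)
    for _ in range(N_SEGMENTS):
        phase = rng.uniform(0.0, 2.0 * math.pi)
        # Both detectors see the same source signal ...
        signal = [
            math.sin(2.0 * math.pi * SIGNAL_K * t / N_BINS + phase)
            for t in range(N_BINS)
        ]
        # ... but each adds its own independent white noise.
        a = [s + rng.gauss(0.0, 1.0) for s in signal]
        b = [s + rng.gauss(0.0, 1.0) for s in signal]
        fa, fb = dft(a), dft(b)
        for k in range(N_BINS // 2):
            pds_a[k] += abs(fa[k]) ** 2 / N_SEGMENTS
            cospec[k] += (fa[k] * fb[k].conjugate()).real / N_SEGMENTS
    return pds_a, cospec


pds_a, cospec = averaged_spectra(random.Random(0))

noise_bins = [k for k in range(1, N_BINS // 2) if k != SIGNAL_K]
pds_noise = sum(pds_a[k] for k in noise_bins) / len(noise_bins)
cospec_noise = sum(cospec[k] for k in noise_bins) / len(noise_bins)

print(f"mean PDS in signal-free bins:        {pds_noise:.2f}")
print(f"mean cospectrum in signal-free bins: {cospec_noise:.2f}")
print(f"PDS at signal bin:        {pds_a[SIGNAL_K]:.1f}")
print(f"cospectrum at signal bin: {cospec[SIGNAL_K]:.1f}")
```

In this setup the single-detector PDS sits on a noise floor of about N_BINS times the noise variance, whereas the cospectrum scatters around zero in signal-free bins and still recovers the signal power of (N_BINS / 2) squared at the injected frequency. The remaining dead-time effects described in the abstract would need the Monte Carlo step, which this sketch does not attempt.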