OMMA enables population-scale analysis of complex genomic features and phylogenomic relationships from nanochannel-based optical maps.
Background: Optical mapping is an emerging technology that complements sequencing-based methods in genome analysis. It is widely used to improve genome assemblies and detect structural variations by providing information over much longer (up to 1 Mb) reads. Current standards in optical mapping analysis involve assembling optical maps into contigs and aligning them to a reference, which is limited to pairwise comparison and becomes bias-prone when analyzing multiple samples.
Findings: We present a new method, OMMA, that extends optical mapping to the study of complex genomic features by simultaneously interrogating optical maps across many samples in a reference-independent manner. OMMA captures and characterizes complex genomic features, e.g., multiple haplotypes, copy number variations, and subtelomeric structures, when applied to 154 human samples across the 26 populations sequenced in the 1000 Genomes Project. For small genomes such as those of pathogenic bacteria, OMMA accurately reconstructs the phylogenomic relationships and identifies functional elements across 21 Acinetobacter baumannii strains.
Conclusions: With the increasing data throughput of optical mapping systems, the use of this technology in comparative genome analysis across many samples will become feasible. OMMA is a timely solution that addresses this computational need. The OMMA software is available at https://github.com/TF-Chan-Lab/OMTools
Uv-to-fir analysis of spitzer/irac sources in the extended groth strip i: Multi-wavelength photometry and spectral energy distributions
We present an IRAC 3.6+4.5 microns selected catalog in the Extended Groth
Strip (EGS) containing photometry from the ultraviolet to the far-infrared and
stellar parameters derived from the analysis of the multi-wavelength data. In
this paper, we describe the method used to build coherent spectral energy
distributions (SEDs) for all the sources. In a companion paper, we analyze
those SEDs to obtain robust estimations of stellar parameters such as
photometric redshifts, stellar masses, and star formation rates. The catalog
comprises 76,936 sources with [3.6]<23.75 mag (85% completeness level of the
IRAC survey in the EGS) over 0.48 square degrees. For approximately 16% of this
sample, we are able to deconvolve the IRAC data to obtain robust fluxes for the
multiple counterparts found in ground-based optical images. Typically, the SEDs
of the IRAC sources in our catalog include more than 15 photometric data
points, spanning from the UV to the FIR. Approximately 95% and 90% of all IRAC
sources are detected in the deepest optical and near-infrared bands. Only 10%
of the sources have optical spectroscopy and redshift estimates. Almost 20%
and 2% of the sources are detected by MIPS at 24 and 70 microns, respectively.
We also cross-correlate our catalog with public X-ray and radio catalogs.
Finally, we present the Rainbow Navigator public web-interface utility designed
to browse all the data products resulting from this work, including images,
spectra, photometry, and stellar parameters.
Comment: 28 pages, 12 figures. Accepted for publication in ApJ. Access the Rainbow Database at: http://rainbowx.fis.ucm.e
Extreme Value Theory and Fat Tails in Equity Markets
Equity market crashes or booms are extreme realizations of the underlying return distribution. This paper asks whether booms are more or less likely than crashes and whether emerging markets crash more frequently than developed equity markets. We apply Extreme Value Theory (EVT) to construct statistical tests of both of these questions. EVT elegantly frames the problem of extreme events in the context of the limiting distributions of sample maxima and minima. This paper applies generalized extreme value theory to understand the probability of extreme events and estimate the level of "fatness" in the tails of emerging and developed markets. We disentangle the major "tail index" estimators in the literature and evaluate their small-sample properties and sensitivities to the number of extreme observations. We choose the Hill index to measure the shape of the distribution in the tail. We then apply nonparametric techniques to assess the significance of differences in tail thickness between the positive and negative tails of a given market and in the tail behavior of the developed and emerging regions. We construct Monte Carlo and wild bootstrap tests of the null of tail symmetry and find that negative tails are statistically significantly fatter than positive tails for a subset of markets in both regions. We frame group bootstrap tests of universal tail behavior for each region and show that the tail index is statistically similar across countries within the same region. This allows us to pool returns and estimate region-wide tail behavior. We form bootstrap tests of pooled returns and document evidence that emerging markets have fatter negative tails than the developed region. Our findings are consistent with prevalent notions of crashes being more frequent in the emerging region than among developed markets. However, our finding of asymmetry in several markets in both regions suggests that the risk of market crashes varies significantly within each region.
This has important implications for any international portfolio allocation decisions made with a regional view.
Keywords: extreme value theory, fat tails, emerging markets
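The Hill estimator chosen above admits a compact implementation. A minimal sketch, assuming daily returns in a NumPy array and examining the negative (loss) tail; the function name `hill_index` and the choice of the order-statistic count k are illustrative, not taken from the paper:

```python
import numpy as np

def hill_index(returns, k):
    """Hill estimate of the tail index alpha from the k largest losses.

    Illustrative sketch: `returns` is a 1-D array of returns; the negative
    tail is examined by negating, so losses become positive numbers.
    """
    losses = np.sort(-returns)[::-1]      # largest losses first
    losses = losses[losses > 0]
    if k >= losses.size:
        raise ValueError("k must be smaller than the number of positive losses")
    top = losses[:k + 1]
    # gamma_hat = mean log-excess over the (k+1)-th order statistic
    gamma = np.mean(np.log(top[:k] / top[k]))
    return 1.0 / gamma                    # tail index alpha = 1 / gamma
```

A fatter tail corresponds to a smaller alpha, and the estimate is sensitive to the choice of k, which is why the paper evaluates small-sample properties and sensitivity to the number of extreme observations.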
The FORS Deep Field: Field selection, photometric observations and photometric catalog
The FORS Deep Field project is a multi-colour, multi-object spectroscopic
investigation of an approx. 7 x 7 arcmin region near the south galactic pole based
mostly on observations carried out with the FORS instruments attached to the
VLT telescopes. It includes the QSO Q 0103-260 (z = 3.36). The goal of this
study is to improve our understanding of the formation and evolution of
galaxies in the young Universe. In this paper the field selection, the
photometric observations, and the data reduction are described. The source
detection and photometry of objects in the FORS Deep Field is discussed in
detail. A combined B and I selected UBgRIJKs photometric catalog of 8753
objects in the FDF is presented and its properties are briefly discussed. The
formal 50% completeness limits for point sources, derived from the co-added
images, are 25.64, 27.69, 26.86, 26.68, 26.37, 23.60 and 21.57 in U, B, g, R,
I, J and Ks (Vega-system), respectively. A comparison of the number counts in
the FORS Deep Field to those derived in other deep field surveys shows very
good agreement.
Comment: 15 pages, 11 figures (included), accepted for publication in A&A
Decomposition Methods for Large Scale LP Decoding
When binary linear error-correcting codes are used over symmetric channels, a
relaxed version of the maximum likelihood decoding problem can be stated as a
linear program (LP). This LP decoder can be used to decode error-correcting
codes at bit-error-rates comparable to state-of-the-art belief propagation (BP)
decoders, but with significantly stronger theoretical guarantees. However, LP
decoding when implemented with standard LP solvers does not easily scale to the
block lengths of modern error correcting codes. In this paper we draw on
decomposition methods from optimization theory, specifically the Alternating
Directions Method of Multipliers (ADMM), to develop efficient distributed
algorithms for LP decoding.
The key enabling technical result is a "two-slice" characterization of the
geometry of the parity polytope, which is the convex hull of all codewords of a
single parity check code. This new characterization simplifies the
representation of points in the polytope. Using this simplification, we develop
an efficient algorithm for Euclidean norm projection onto the parity polytope.
This projection is required by ADMM and allows us to use LP decoding, with all
its theoretical guarantees, to decode large-scale error correcting codes
efficiently.
We present numerical results for LDPC codes of lengths more than 1000. The
waterfall region of LP decoding is seen to initiate at a slightly higher
signal-to-noise ratio than for sum-product BP, however an error floor is not
observed for LP decoding, which is not the case for BP. Our implementation of
LP decoding using ADMM executes as fast as our baseline sum-product BP decoder,
is fully parallelizable, and can be seen to implement a type of message-passing
with a particularly simple schedule.
Comment: 35 pages, 11 figures. An early version of this work appeared at the 49th Annual Allerton Conference, September 2011. This version to appear in IEEE Transactions on Information Theory.
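The projection step at the heart of the ADMM decoder above can be sketched in a few lines. This is our reading of the clip-then-project idea enabled by the "two-slice" characterization, not the authors' implementation: clip to the unit hypercube, identify the single potentially violated odd-set facet of the parity polytope, and, if it is violated, project onto that facet by bisection on a scalar dual variable. The function name, iteration count, and tolerance are our own choices:

```python
import numpy as np

def project_parity_polytope(v):
    """Euclidean projection onto the parity polytope PP_d (a sketch).

    PP_d is the convex hull of all even-weight binary vectors of length d,
    cut out of [0,1]^d by the odd-set facet inequalities
    sum_{i in f} (1 - x_i) + sum_{i not in f} x_i >= 1 for odd |f|.
    """
    v = np.asarray(v, dtype=float)
    z = np.clip(v, 0.0, 1.0)
    f = (z > 0.5).astype(float)
    if int(f.sum()) % 2 == 0:            # facet sets must have odd weight:
        i = np.argmin(np.abs(z - 0.5))   # flip the coordinate nearest 0.5
        f[i] = 1.0 - f[i]
    theta = 2.0 * f - 1.0                # +1 inside the set f, -1 outside
    rhs = f.sum() - 1.0                  # facet reads theta.x <= |f| - 1
    if theta @ z <= rhs + 1e-12:
        return z                         # clipped point is already feasible
    # Otherwise project v onto {x in [0,1]^d : theta.x = rhs} by bisection
    # on the dual variable mu; theta.clip(v - mu*theta) is nonincreasing in mu.
    lo, hi = 0.0, np.max(np.abs(v)) + 1.0
    for _ in range(100):
        mu = 0.5 * (lo + hi)
        x = np.clip(v - mu * theta, 0.0, 1.0)
        if theta @ x > rhs:
            lo = mu
        else:
            hi = mu
    return np.clip(v - hi * theta, 0.0, 1.0)
```

Within an ADMM decoder, each parity check applies such a projection to its local copy of the variables at every iteration, which is why an efficient projection is what makes large block lengths practical.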
Online detection and sorting of extracellularly recorded action potentials in human medial temporal lobe recordings, in vivo
Understanding the function of complex cortical circuits requires the
simultaneous recording of action potentials from many neurons in awake and
behaving animals. Practically, this can be achieved by extracellularly
recording from multiple brain sites using single wire electrodes. However, in
densely packed neural structures such as the human hippocampus, a single
electrode can record the activity of multiple neurons. Thus, analytic
techniques that differentiate action potentials of different neurons are
required. Offline spike sorting approaches are currently used to detect and
sort action potentials after finishing the experiment. Because the
opportunities to record from the human brain are relatively rare, it is
desirable to analyze large numbers of simultaneous recordings quickly using
online sorting and detection algorithms. In this way, the experiment can be
optimized for the particular response properties of the recorded neurons. Here
we present and evaluate a method that is capable of detecting and sorting
extracellular single-wire recordings in realtime. We demonstrate the utility of
the method by applying it to an extensive data set we acquired from
chronically-implanted depth electrodes in the hippocampus of human epilepsy
patients. This dataset is particularly challenging because it was recorded in a
noisy clinical environment. This method will allow the development of
closed-loop experiments, which immediately adapt the experimental stimuli
and/or tasks to the neural response observed.
Comment: 9 figures, 2 tables. Journal of Neuroscience Methods, 2006 (in press).
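As a point of reference for what the detection stage of such a pipeline involves, here is a generic amplitude-threshold spike detector with a robust noise estimate. This is a deliberately simplified sketch and not the method of the paper (which also sorts the detected waveforms by neuron); the median-based noise estimator sigma = median(|x|)/0.6745 is a standard choice for extracellular recordings, while the function name, threshold factor, and refractory period are illustrative:

```python
import numpy as np

def detect_spikes(signal, fs, threshold_factor=5.0, refractory_ms=1.0):
    """Amplitude-threshold spike detection (illustrative sketch only).

    Estimates the noise level robustly as sigma = median(|x|) / 0.6745,
    flags samples exceeding threshold_factor * sigma, and enforces a
    refractory period so one spike is not counted multiple times.
    Returns (spike sample indices, threshold used).
    """
    sigma = np.median(np.abs(signal)) / 0.6745
    thresh = threshold_factor * sigma
    refractory = int(fs * refractory_ms / 1000.0)
    crossings = np.flatnonzero(np.abs(signal) > thresh)
    spike_times = []
    last = -refractory
    for t in crossings:
        if t - last >= refractory:       # skip crossings within refractory
            spike_times.append(t)
            last = t
    return np.array(spike_times, dtype=int), thresh
```

In an online setting this would run on short buffers of streaming data, with the detected waveforms handed to a sorting stage; the robust median estimator matters in a noisy clinical environment because the spikes themselves would inflate a naive standard-deviation estimate.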