Do consumers gamble to convexify?
The combination of credit constraints and indivisible consumption goods may induce some risk-averse individuals to gamble to have a chance of crossing a purchasing threshold. This idea has been demonstrated theoretically, but not explored empirically. We test this idea by focusing on a key implication: income effects for individuals who choose to gamble are likely to be larger than for the general population. Using UK data on gambling wins, other windfalls and durable goods purchases, we show that winners display higher income effects than non-winners but only amongst those likely to be credit-constrained. This is consistent with credit-constrained, risk-averse agents gambling to convexify their budget set. This work was supported in part by the ESRC-funded Centre for Microeconomic Analysis of Public Policy at the Institute for Fiscal Studies (grant number RES-544-28-5001). This is the final version of the article. It first appeared from Elsevier via http://dx.doi.org/10.1016/j.jebo.2016.07.02
Parallel approach to sliding window sums
Sliding window sums are widely used in bioinformatics applications, including
sequence assembly, k-mer generation, hashing and compression. New vector
algorithms which utilize the advanced vector extension (AVX) instructions
available on modern processors, or the parallel compute units on GPUs and
FPGAs, would provide a significant performance boost for the bioinformatics
applications. We develop a generic vectorized sliding sum algorithm whose
speedup, for window size w and number of processors P, is O(P/w) for a generic
sliding sum. For a sum with a commutative operator the speedup improves to
O(P/log(w)). When applied to the genomic application of minimizer-based k-mer
table generation using AVX instructions, we obtain a speedup of over 5X. Comment: 10 pages, 5 figures
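To make the notion of a sliding window sum concrete, the following is a minimal Python sketch, not the vectorized algorithm of the paper. It assumes an associative operator `op` and compares a naive reference with a doubling scheme that needs only O(log w) element-wise passes over the data; each such pass is the kind of step that could, in principle, be mapped to vector lanes. The function names `sliding_naive` and `sliding_doubling` are hypothetical.

```python
from operator import add


def sliding_naive(xs, w, op=add):
    """Reference version: fold each window element by element (O(w) per output)."""
    if w < 1 or w > len(xs):
        return []
    out = []
    for i in range(len(xs) - w + 1):
        acc = xs[i]
        for x in xs[i + 1:i + w]:
            acc = op(acc, x)
        out.append(acc)
    return out


def sliding_doubling(xs, w, op=add):
    """Build window-w results from power-of-two windows in O(log w) passes.

    power[i]  holds op over xs[i : i + size]  (size doubles every pass)
    result[i] holds op over xs[i : i + done]  (done grows by the set bits of w)
    Operands are always combined left to right, so op only has to be associative.
    """
    n = len(xs)
    if w < 1 or w > n:
        return []
    power, size = list(xs), 1
    result, done = None, 0
    while size <= w:
        if w & size:
            if result is None:
                result = list(power)
            else:
                result = [op(result[i], power[i + done])
                          for i in range(n - done - size + 1)]
            done += size
        if 2 * size <= w:
            # Double the window length covered by each entry of `power`.
            power = [op(power[i], power[i + size])
                     for i in range(len(power) - size)]
        size *= 2
    return result
```

For example, `sliding_doubling([1, 2, 3, 4, 5, 6, 7], 3)` and `sliding_naive([1, 2, 3, 4, 5, 6, 7], 3)` both return `[6, 9, 12, 15, 18]`. Because the order of operands is preserved, the sketch also works for non-commutative operators such as string concatenation.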
Space-efficient Feature Maps for String Alignment Kernels
String kernels are attractive data analysis tools for analyzing string data.
Among them, alignment kernels are known for their high prediction accuracies in
string classifications when tested in combination with SVM in various
applications. However, alignment kernels have a crucial drawback in that they
scale poorly due to their quadratic computation complexity in the number of
input strings, which limits large-scale applications in practice. We address
this need by presenting the first approximation for string alignment kernels,
which we call space-efficient feature maps for edit distance with moves
(SFMEDM), by leveraging a metric embedding named edit sensitive parsing (ESP)
and feature maps (FMs) of random Fourier features (RFFs) for large-scale string
analyses. The original FMs for RFFs consume a huge amount of memory
proportional to the dimension d of input vectors and the dimension D of output
vectors, which prohibits their large-scale application. We present novel
space-efficient feature maps (SFMs) of RFFs for a space reduction from O(dD) of
the original FMs to O(d) of SFMs with a theoretical guarantee with respect to
concentration bounds. We experimentally test SFMEDM on its ability to learn SVM
for large-scale string classifications with various massive string data, and we
demonstrate the superior performance of SFMEDM with respect to prediction
accuracy, scalability and computation efficiency. Comment: Full version for ICDM'19 paper
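As background for the memory cost mentioned above, here is a minimal Python sketch of standard random Fourier features, not the space-efficient SFM construction of the paper. It assumes a Gaussian kernel with bandwidth `sigma`, which is an illustrative choice and may differ from the kernel used in the paper. The helper names `make_rff` and `gaussian_kernel` are hypothetical. The projection matrix `W` stores D x d random numbers, which is where the O(dD) memory of the original feature maps comes from.

```python
import math
import random


def make_rff(d, D, sigma, seed=0):
    """Return a feature map z: R^d -> R^D with z(x).z(y) ~ exp(-|x-y|^2 / (2 sigma^2))."""
    rng = random.Random(seed)
    # D x d Gaussian projection matrix: this is the O(dD) memory cost.
    W = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(d)] for _ in range(D)]
    # One random phase per output dimension.
    b = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(D)]
    scale = math.sqrt(2.0 / D)

    def z(x):
        return [scale * math.cos(sum(wj * xj for wj, xj in zip(w, x)) + bi)
                for w, bi in zip(W, b)]

    return z


def gaussian_kernel(x, y, sigma):
    """Exact Gaussian kernel, used only to check the approximation."""
    sq = sum((a - c) ** 2 for a, c in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))
```

A typical use is to map every input once with `z` and then train a linear model on the mapped vectors; the inner product `sum(p * q for p, q in zip(z(x), z(y)))` approximates the kernel value, with an error that shrinks as D grows.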
Link Mining for Kernel-based Compound-Protein Interaction Predictions Using a Chemogenomics Approach
Virtual screening (VS) is widely used during computational drug discovery to
reduce costs. Chemogenomics-based virtual screening (CGBVS) can be used to
predict new compound-protein interactions (CPIs) from known CPI network data
using several methods, including machine learning and data mining. Although
CGBVS facilitates highly efficient and accurate CPI prediction, it has poor
performance for prediction of new compounds for which CPIs are unknown. The
pairwise kernel method (PKM) is a state-of-the-art CGBVS method and shows high
accuracy for prediction of new compounds. In this study, on the basis of link
mining, we improved the PKM by combining link indicator kernel (LIK) and
chemical similarity and evaluated the accuracy of these methods. The proposed
method obtained an average area under the precision-recall curve (AUPR) value
of 0.562, which was higher than that achieved by the conventional Gaussian
interaction profile (GIP) method (0.425), and the calculation time was only
increased by a few percent.
galign: A Tool for Rapid Genome Polymorphism Discovery
BACKGROUND: Highly parallel sequencing technologies have become important tools in the analysis of sequence polymorphisms on a genomic scale. However, the development of customized software to analyze data produced by these methods has lagged behind. METHODS/PRINCIPAL FINDINGS: Here I describe a tool, 'galign', designed to identify polymorphisms between sequence reads obtained using Illumina/Solexa technology and a reference genome. The 'galign' alignment tool does not use Smith-Waterman matrices for sequence comparisons. Instead, a simple algorithm comparing parsed sequence reads to parsed reference genome sequences is used. 'galign' output is geared towards immediate user application, displaying polymorphism locations, nucleotide changes, and relevant predicted amino-acid changes for ease of information processing. To do so, 'galign' requires several accessory files easily derived from an annotated reference genome. Direct sequencing as well as in silico studies demonstrate that 'galign' provides lesion predictions comparable in accuracy to available prediction programs, accompanied by greater processing speed and more user-friendly output. We demonstrate the use of 'galign' to identify mutations leading to phenotypic consequences in C. elegans. CONCLUSION/SIGNIFICANCE: Our studies suggest that 'galign' is a useful tool for polymorphism discovery, and is of immediate utility for sequence mining in C. elegans.
Clustering exact matches of pairwise sequence alignments by weighted linear regression
Background: At intermediate stages of genome assembly projects, when a number of contigs have been generated and their validity needs to be verified, it is desirable to align these contigs to a reference genome when it is available. The interest is not to analyze a detailed alignment between a contig and the reference genome at the base level, but rather to have a rough estimate of where the contig aligns to the reference genome, specifically, by identifying the starting and ending positions of such a region. This information is very useful in ordering the contigs, facilitating post-assembly analysis such as gap closure and resolving repeats. There exist programs, such as BLAST and MUMmer, that can quickly align and identify high similarity segments between two sequences, which, when seen in a dot plot, tend to agglomerate along a diagonal but can also be disrupted by gaps or shifted away from the main diagonal due to mismatches between the contig and the reference. It is a tedious and practically impossible task to visually inspect the dot plot to identify the regions covered by a large number of contigs from sequence assembly projects. A forced global alignment between a contig and the reference is not only time consuming but often meaningless. Results: We have developed an algorithm that uses the coordinates of all the exact matches or high similarity local alignments, clusters them with respect to the main diagonal in the dot plot using a weighted linear regression technique, and identifies the starting and ending coordinates of the region of interest. Conclusion: This algorithm complements existing pairwise sequence alignment packages by replacing the time-consuming seed extension phase with a weighted linear regression for the alignment seeds. It was experimentally shown that the gain in execution time can be outstanding without compromising the accuracy. This method should be of great utility to sequence assembly and genome comparison projects.
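The idea of fitting a line through alignment seeds can be illustrated with a short Python sketch. This is a simplified single-pass illustration under stated assumptions, not the published algorithm: each match is assumed to be a tuple `(contig_pos, ref_pos, length)`, the match length is used as the regression weight, and matches farther than a hypothetical tolerance `tol` from the fitted diagonal are ignored when reporting the region. The function names are hypothetical as well.

```python
def weighted_line_fit(matches):
    """Weighted least-squares fit ref_pos = slope * contig_pos + intercept.

    matches: list of (contig_pos, ref_pos, length); length is the weight.
    """
    total = sum(w for _, _, w in matches)
    mx = sum(w * x for x, _, w in matches) / total  # weighted mean of contig_pos
    my = sum(w * y for _, y, w in matches) / total  # weighted mean of ref_pos
    sxx = sum(w * (x - mx) ** 2 for x, _, w in matches)
    sxy = sum(w * (x - mx) * (y - my) for x, y, w in matches)
    if sxx == 0:
        raise ValueError("all matches share the same contig position")
    slope = sxy / sxx
    return slope, my - slope * mx


def aligned_region(matches, tol):
    """Start and end reference coordinates of the matches lying near the fitted line."""
    slope, intercept = weighted_line_fit(matches)
    near = [(x, y, w) for x, y, w in matches
            if abs(y - (slope * x + intercept)) <= tol]
    start = min(y for _, y, _ in near)
    end = max(y + w for _, y, w in near)
    return start, end
```

Weighting by match length means that a short spurious match far from the diagonal shifts the fitted line only slightly, so the long, consistent matches still determine the reported region.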
Multipurpose High Frequency Electron Spin Resonance Spectrometer for Condensed Matter Research
We describe a quasi-optical multifrequency ESR spectrometer operating in the
75-225 GHz range and optimized at 210 GHz for general use in condensed matter
physics, chemistry and biology. The quasi-optical bridge detects the change of
mm wave polarization at the ESR. A controllable reference arm maintains a mm
wave bias at the detector. The attained sensitivity of 2x10^10 spin/G/(Hz)^(1/2),
measured on a dilute Mn:MgO sample in a non-resonant probe head at 222.4 GHz
and 300 K, is comparable to that of commercial high-sensitivity X band spectrometers. The
spectrometer has a Fabry-Perot resonator based probe head to measure aqueous
solutions, and a probe head to measure magnetic field angular dependence of
single crystals. The spectrometer is robust and easy to use and may be operated
by undergraduate students. Its performance is demonstrated by examples from
various fields of condensed matter physics. Comment: submitted to Journal of Magnetic Resonance
Neutron studies of Na-ion battery materials
The relatively vast abundance and more equitable global distribution of terrestrial sodium make sodium-ion batteries (NIBs) potentially cheaper and more sustainable alternatives to commercial lithium-ion batteries (LIBs). However, the practical capacities and cycle lives of NIBs at present do not match those of LIBs and have therefore hindered their progress to commercialisation. The present drawback of NIB technology stems largely from the electrode materials and their associated Na+ ion storage mechanisms. Increased understanding of the electrochemical storage mechanisms and kinetics is therefore vital for the development of current and novel materials to realise the commercial NIB. In contrast to x-ray techniques, the non-dependency of neutron scattering on the atomic number of elements (Z) can substantially increase the scattering contrast of small elements such as sodium and carbon, making neutron techniques powerful for the investigation of NIB electrode materials. Moreover, neutrons are far more penetrating, which enables more complex sample environments including in situ and operando studies. Here, we introduce the theory of, and review the use of, neutron diffraction and quasi-elastic neutron scattering, to investigate the structural and dynamic properties of electrode and electrolyte materials for NIBs. To improve our understanding of the actual sodium storage mechanisms and identify intermediate stages during charge/discharge, ex situ, in situ, and operando neutron experiments are required. However, to date there are few studies where operando experiments are conducted during electrochemical cycling. This highlights an opportunity for research to elucidate the operating mechanisms within NIB materials that are under much debate at present.