
    Uniform Sampling for Matrix Approximation

    Random sampling has become a critical tool in solving massive matrix problems. For linear regression, a small, manageable set of data rows can be randomly selected to approximate a tall, skinny data matrix, improving processing time significantly. For theoretical performance guarantees, each row must be sampled with probability proportional to its statistical leverage score. Unfortunately, leverage scores are difficult to compute. A simple alternative is to sample rows uniformly at random. While this often works, uniform sampling will eliminate critical row information for many natural instances. We take a fresh look at uniform sampling by examining what information it does preserve. Specifically, we show that uniform sampling yields a matrix that, in some sense, well approximates a large fraction of the original. While this weak form of approximation is not enough for solving linear regression directly, it is enough to compute a better approximation. This observation leads to simple iterative row sampling algorithms for matrix approximation that run in input-sparsity time and preserve row structure and sparsity at all intermediate steps. In addition to an improved understanding of uniform sampling, our main proof introduces a structural result of independent interest: we show that every matrix can be made to have low coherence by reweighting a small subset of its rows
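The contrast between leverage-score sampling and uniform sampling described in this abstract can be illustrated with a minimal NumPy sketch. It is not the paper's iterative algorithm: it computes exact leverage scores through a thin QR factorization (which assumes the matrix has full column rank) and then draws rescaled rows under either distribution. The function names `leverage_scores` and `sample_rows`, the matrix sizes, and the single inflated row are assumptions chosen for illustration.

```python
import numpy as np

def leverage_scores(A):
    """Exact row leverage scores of A, assuming full column rank."""
    # Thin QR: the columns of Q are an orthonormal basis for the column space of A.
    Q, _ = np.linalg.qr(A)
    # The leverage score of row i is the squared norm of row i of Q.
    return np.sum(Q**2, axis=1)

def sample_rows(A, m, probs, rng):
    """Draw m rows with replacement according to probs and rescale them."""
    idx = rng.choice(A.shape[0], size=m, replace=True, p=probs)
    # Rescaling by 1/sqrt(m * p_i) makes the sampled Gram matrix unbiased for A^T A.
    scale = 1.0 / np.sqrt(m * probs[idx])
    return A[idx] * scale[:, None]

rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 5))   # tall, skinny matrix (illustrative sizes)
A[0] *= 100.0                        # one row that carries critical information

scores = leverage_scores(A)
lev_probs = scores / scores.sum()
uniform_probs = np.full(A.shape[0], 1.0 / A.shape[0])

S_lev = sample_rows(A, 200, lev_probs, rng)      # leverage-score sample
S_uni = sample_rows(A, 200, uniform_probs, rng)  # uniform sample
```

In this toy setup the inflated row receives by far the largest leverage score, so leverage-score sampling is very likely to select it, whereas uniform sampling selects it with probability only 1/1000 per draw. That is the kind of lost row information the abstract refers to.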

    ENCONTER: Entity constrained progressive sequence generation via insertion-based transformer

    National Research Foundation (NRF) Singapore under International Research Centres in Singapore Funding Initiative. Code is available at https://github.com/LARC-CMU-SMU/Enconter

    Comparative mitogenomics of the Decapoda reveals evolutionary heterogeneity in architecture and composition

    The emergence of cost-effective and rapid sequencing approaches has resulted in an exponential rise in the number of mitogenomes on public databases in recent years, providing greater opportunity for undertaking large-scale comparative genomic and systematic research. Nonetheless, current datasets predominately come from small and disconnected studies on a limited number of related species, introducing sampling biases and impeding research of broad taxonomic relevance. This study contributes 21 crustacean mitogenomes from several under-represented decapod infraorders including Polychelida and Stenopodidea, which are used in combination with 225 mitogenomes available on NCBI to investigate decapod mitogenome diversity and phylogeny. An overview of mitochondrial gene orders (MGOs) reveals a high level of genomic variability within the Decapoda, with a large number of MGOs deviating from the ancestral arthropod ground pattern and unevenly distributed among infraorders. Despite the substantial morphological and ecological variation among decapods, there was limited evidence for correlations between gene rearrangement events and species ecology or lineage specific nucleotide substitution rates. Within a phylogenetic context, predicted scenarios of rearrangements show some MGOs to be informative synapomorphies for some taxonomic groups providing strong independent support for phylogenetic relationships. Additional comparisons for a range of mitogenomic features including nucleotide composition, strand asymmetry, unassigned regions and codon usage indicate several clade-specific trends that are of evolutionary and ecological interest

    Spectral hardness evolution characteristics of tracking Gamma-ray Burst pulses

    Employing a sample presented by Kaneko et al. (2006) and Kocevski et al. (2003), we select 42 individual tracking pulses (here we define tracking as the cases in which the hardness follows the same pattern as the flux or count rate time profile) within 36 Gamma-ray Bursts (GRBs) containing 527 time-resolved spectra, and investigate the evolutionary characteristics of the spectral hardness, $E_{peak}$ (where $E_{peak}$ is the maximum of the $\nu F_{\nu}$ spectrum). The evolution of these pulses follows a soft-to-hard-to-soft pattern with time (the soft-to-hard and hard-to-soft phases are denoted the rise phase and decay phase, respectively). The overall characteristics of $E_{peak}$ in our selected sample are: 1) the $E_{peak}$ evolution in the rise phase always starts in the high state (the values of $E_{peak}$ are always higher than 50 keV); 2) the spectra of the rise phase clearly start at higher energy (the median of $E_{peak}$ is about 300 keV), whereas the spectra of the decay phase end at much lower energy (the median of $E_{peak}$ is about 200 keV); 3) the spectra of the rise phase are harder than those of the decay phase, and the duration of the rise phase is much shorter than that of the decay phase. In other words, for a complete pulse the initial $E_{peak}$ is higher than the final $E_{peak}$, and the duration of the initial phase (rise phase) is much shorter than that of the final phase (decay phase). These results are in good agreement with the predictions of Lu et al. (2007) and the current popular view on the production of GRBs. We argue that the spectral evolution of tracking pulses may be related to both kinematic and dynamic processes, even though we currently cannot provide further evidence to distinguish which one is dominant. Moreover, our statistical results provide some evidence to constrain the current GRB model. Comment: 32 pages, 26 figures, 3 tables, accepted for publication in New Astronomy

    Numerical Simulation of Epidemic Prevention and Ventilation Efficiency in Indoor Spaces with Partitions and an Air curtain

    In this study, computational fluid dynamics (CFD) was used to simulate the effect of a partition and an air curtain on the concentration of a pollution source in an indoor space with different ventilation configurations. First, in the partition simulation, the performances of six different ventilation configurations were compared. Based on the results obtained, air curtain simulations were then carried out. Carbon dioxide was chosen as the tracer gas in all simulations, and the realizable k-ε turbulence model was selected. In the partition simulation, a front-and-back ventilation configuration with ventilation inlets/outlets placed diagonally near the side walls showed the best performance. This configuration was adopted for the air curtain simulation in order to investigate the effect of different air inlet velocities and air curtain velocities. It was found that as the height of the partition increases, although it has a higher chance of blocking the COVID-19 virus, it lowers the ventilation efficiency, resulting in an increase of the carbon dioxide concentration in the indoor space. When the partition was replaced with an air curtain, it was found that the greater the height of the air curtain, the lower the carbon dioxide concentration in the indoor space. Compared with the partition, the air curtain can reduce the carbon dioxide concentration by up to 74.6%, indicating that introducing the air curtain can improve ventilation in the indoor space.

    An Empirical Study on Challenging Math Problem Solving with GPT-4

    Employing Large Language Models (LLMs) to address mathematical problems is an intriguing research endeavor, considering the abundance of math problems expressed in natural language across numerous science and engineering fields. While several prior works have investigated solving elementary mathematics using LLMs, this work explores the frontier of using GPT-4 for solving more complex and challenging math problems. We evaluate various ways of using GPT-4. Some of them are adapted from existing work, and one is MathChat, a conversational problem-solving framework newly proposed in this work. We perform the evaluation on difficult high school competition problems from the MATH dataset, which shows the advantage of the proposed conversational approach.

    Defect formation during preforming of a bi-axial non-crimp fabric with a pillar stitch pattern

    To capture the asymmetrical shear behaviour of a bi-axial NCF with a pillar stitch, a non-orthogonal constitutive model was developed and implemented in finite element forming simulations. Preforming experiments indicate that the local distribution of defects is significantly different on both sides of each bi-axial ply, with two different defect mechanisms observed. Correlation with simulation results indicates that one defect type is caused by excessive shear, inducing out-of-plane wrinkling in regions of positive shear (macro-scale wrinkling). The other defect type is caused by fibre compression, inducing in-plane wrinkling in regions of negative shear (meso-scale wrinkling). Local distributions of shear angle and wrinkling strain were used to determine the wrinkling mode and to confirm the corresponding defect mechanism. Results indicate that simulations based on the advanced constitutive model can predict local shear angles within ±5° of experimental values and that predicted wrinkling positions and defect types correlate well with the experiments.

    Abnormal Mammary Gland Development and Growth Retardation in Female Mice and MCF7 Breast Cancer Cells Lacking Androgen Receptor

    Phenotype analysis of androgen receptor (AR)-deficient (AR−/−) female mice indicates that the development of mammary glands is retarded, with reduced ductal branching in the prepubertal stages and fewer Cap cells in the terminal end buds, as well as decreased lobuloalveolar development in adult females and fewer milk-producing alveoli in the lactating glands. The defective development of AR−/− mammary glands involves defects in insulin-like growth factor I–insulin-like growth factor I receptor and mitogen-activated protein kinase (MAPK) signals as well as in estrogen receptor (ER) activity. Similar growth retardation and defects in the growth factor–mediated Ras/Raf/MAPK cascade and ER signaling are also found in AR−/− MCF7 breast cancer cells. The restoration assays show that the AR NH2-terminal/DNA-binding domain, but not the ligand-binding domain, is essential for normal MAPK function in MCF7 cells, and that an AR mutant (R608K), found in male breast cancer, is associated with excessive activation of MAPK. Together, our data provide the first in vivo evidence that AR-mediated MAPK and ER activation may play important roles in mammary gland development and MCF7 breast cancer cell proliferation.