3,655 research outputs found

    Robust Sound Event Classification using Deep Neural Networks

    The automatic recognition of sound events by computers is an important aspect of emerging applications such as automated surveillance, machine hearing and auditory scene understanding. Recent advances in machine learning, as well as in computational models of the human auditory system, have contributed to advances in this increasingly popular research field. Robust sound event classification, the ability to recognise sounds under real-world noisy conditions, is an especially challenging task. Classification methods translated from the speech recognition domain, using features such as mel-frequency cepstral coefficients, have been shown to perform reasonably well for the sound event classification task, although spectrogram-based or auditory image analysis techniques reportedly achieve superior performance in noise. This paper outlines a sound event classification framework that compares auditory image front-end features with spectrogram image-based front-end features, using support vector machine and deep neural network classifiers. Performance is evaluated on a standard robust classification task in different levels of corrupting noise, and with several system enhancements, and is shown to compare very well with current state-of-the-art classification techniques.

    A BMS-invariant free scalar model

    The BMS (Bondi-van der Burg-Metzner-Sachs) symmetry arises as the asymptotic symmetry of flat spacetime at null infinity. In particular, the BMS algebra for three-dimensional flat spacetime (BMS$_3$) is generated by the super-rotation generators, which form a Virasoro sub-algebra with central charge $c_L$, together with mutually commuting super-translation generators. The super-rotation and super-translation generators have non-trivial commutation relations with another central charge $c_M$. In this paper, we study a free scalar theory in two dimensions exhibiting BMS$_3$ symmetry, which can also be understood as the ultra-relativistic limit of a free scalar CFT$_2$. Upon canonical quantization on the highest weight vacuum, the central charges are found to be $c_L=2$ and $c_M=0$. Because of the vanishing central charge $c_M=0$, the theory features novel properties: there exist primary states which form a multiplet, and the Hilbert space can be organized by an enlarged version of BMS modules dubbed the staggered modules. We further calculate correlation functions and the torus partition function, the latter of which is also shown explicitly to be modular invariant. Comment: 59 pages, 5 figures. v2, minor revision: typos corrected and some statements rephrased.
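
For orientation, the algebra described in this abstract is commonly written in the form below. The mode labels $L_n$ (super-rotations) and $M_n$ (super-translations) and the $1/12$ normalization of the central terms are a standard convention assumed here; they are not stated in the abstract itself.

```latex
\begin{aligned}
[L_m, L_n] &= (m-n)\,L_{m+n} + \frac{c_L}{12}\,m(m^2-1)\,\delta_{m+n,0},\\
[L_m, M_n] &= (m-n)\,M_{m+n} + \frac{c_M}{12}\,m(m^2-1)\,\delta_{m+n,0},\\
[M_m, M_n] &= 0.
\end{aligned}
```

In this notation, the quoted result $c_M=0$ means the central term in the mixed commutator $[L_m, M_n]$ vanishes for the free scalar model.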

    γ-MYN: a new algorithm for estimating Ka and Ks with consideration of variable substitution rates

    Background: Over the past two decades, several approximate methods that adopt different mutation models have been used to estimate nonsynonymous and synonymous substitution rates (Ka and Ks) from protein-coding sequences across species or even different evolutionary lineages. Among them, the MYN method (a Modified version of the Yang-Nielsen method) considers three major dynamic features of evolving DNA sequences (bias in transition/transversion rate, nucleotide frequency, and unequal transitional substitution) but leaves out another important feature: unequal substitution rates among different sites or nucleotide positions. Results: We incorporated a new feature for analyzing evolving DNA sequences, unequal substitution rates among different sites, into the MYN method and proposed a modified version, namely γ (gamma)-MYN, based on the assumption that the evolutionary rate at each site follows a γ-distribution. We applied γ-MYN to analyze the key estimator of selective pressure ω (Ka/Ks) and other relevant parameters in comparison to two other related methods, YN and MYN, and found that neglecting the variation of substitution rates among different sites may lead to biased estimations of ω. Our new method appears to have minimal deviations when relevant parameters vary within normal ranges defined by empirical data. Conclusion: Our results indicate that unequal substitution rates among different sites have variable influences on ω under different evolutionary rates, while both the transition/transversion rate ratio and unequal nucleotide frequencies affect Ka and Ks and thus the selective pressure ω. Reviewers: This paper was reviewed by Kateryna Makova, David A. Liberles (nominated by David H Ardell), Zhaolei Zhang (nominated by Mark Gerstein), and Shamil Sunyaev.

    Glue-on AdS holography for $T\bar T$-deformed CFTs

    The $T\bar T$ deformation is a solvable irrelevant deformation whose properties depend on the sign of the deformation parameter $\mu$. In particular, $T\bar T$-deformed CFTs with $\mu<0$ have been proposed to be holographically dual to Einstein gravity where the metric satisfies Dirichlet boundary conditions at a finite cutoff surface. In this paper, we put forward a holographic proposal for $T\bar T$-deformed CFTs with $\mu>0$, in which case the bulk geometry is constructed by gluing a patch of AdS$_3$ to the original spacetime. As evidence, we show that the $T\bar T$ trace flow equation, the spectrum on the cylinder, and the partition function on the torus and the sphere, among other results, can all be reproduced from bulk calculations in glue-on AdS$_3$. Comment: 33 pages, 1 figure; v2: clarifications and references added, matches published version.

    Improving Compositional Text-to-image Generation with Large Vision-Language Models

    Recent advancements in text-to-image models, particularly diffusion models, have shown significant promise. However, compositional text-to-image models frequently encounter difficulties in generating high-quality images that accurately align with input texts describing multiple objects, variable attributes, and intricate spatial relationships. To address this limitation, we employ large vision-language models (LVLMs) for multi-dimensional assessment of the alignment between generated images and their corresponding input texts. Utilizing this assessment, we fine-tune the diffusion model to enhance its alignment capabilities. During the inference phase, an initial image is produced using the fine-tuned diffusion model. The LVLM is then employed to pinpoint areas of misalignment in the initial image, which are subsequently corrected using an image editing algorithm until no further misalignments are detected by the LVLM. The resultant image is consequently more closely aligned with the input text. Our experimental results validate that the proposed methodology significantly improves text-image alignment in compositional image generation, particularly with respect to object number, attribute binding, spatial relationships, and aesthetic quality.
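
The inference-time loop described in this abstract (generate, detect misalignments, edit, repeat) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the callables `generate`, `find_misalignments` and `edit` are hypothetical stand-ins for the fine-tuned diffusion model, the LVLM check and the image editing step, and the `max_rounds` cap is an added assumption so the loop always terminates.

```python
def refine_until_aligned(prompt, generate, find_misalignments, edit, max_rounds=5):
    """Generate an image, then repeatedly repair whatever the checker flags.

    All three callables are hypothetical placeholders:
      generate(prompt) -> image
      find_misalignments(image, prompt) -> list of issues (empty when aligned)
      edit(image, issues) -> corrected image
    """
    image = generate(prompt)
    for _ in range(max_rounds):  # assumed safety cap, not from the abstract
        issues = find_misalignments(image, prompt)
        if not issues:
            break  # the checker reports no remaining misalignment
        image = edit(image, issues)
    return image
```

Passing the three stages in as functions keeps the sketch independent of any particular model or library.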

    Generalized bias-variance evaluation of TREC participated systems

    Recent research has shown that improving mean retrieval effectiveness (e.g., MAP) may sacrifice retrieval stability across queries, implying a tradeoff between effectiveness and stability. The evaluation of both effectiveness and stability is often based on a baseline model, which could be weak or biased. In addition, the effectiveness-stability tradeoff has not been systematically or quantitatively evaluated over TREC participated systems. These two problems, to some extent, limit our awareness of this tradeoff and its impact on developing future IR models. In this paper, motivated by a recently proposed bias-variance based evaluation, we adopt a strong and unbiased “baseline”, which is a virtual target model constructed from the best performance (for each query) among all the participated systems in a retrieval task. We also propose generalized bias-variance metrics, based on which a systematic and quantitative evaluation of the effectiveness-stability tradeoff is carried out over the participated systems in the TREC Ad-hoc Track (1993-1999) and Web Track (2010-2012). We observe a clear effectiveness-stability tradeoff, which tends to become more pronounced in more recent years. This implies that as more effective IR systems have been pursued over the years, stability has become problematic and may have been largely overlooked.
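
The virtual target model in this abstract can be illustrated with a short sketch. The abstract does not define its generalized metrics, so the definitions below are an assumption: bias is taken as a system's mean per-query gap to the target, and variance as the population variance of that gap. The function name `bias_variance` and the input layout are likewise hypothetical.

```python
from statistics import mean, pvariance


def bias_variance(scores):
    """Compare each system with a per-query best-of-all-systems target.

    scores: dict mapping system name -> list of per-query effectiveness
            values (e.g. average precision), all in the same query order.
    Returns a dict mapping system name -> {"bias": ..., "variance": ...}.
    """
    systems = list(scores)
    n_queries = len(scores[systems[0]])
    # Virtual target: the best score any system achieved on each query.
    target = [max(scores[s][q] for s in systems) for q in range(n_queries)]
    result = {}
    for s in systems:
        gaps = [target[q] - scores[s][q] for q in range(n_queries)]
        # Assumed definitions: mean gap as bias, spread of the gap as variance.
        result[s] = {"bias": mean(gaps), "variance": pvariance(gaps)}
    return result
```

Under these assumed definitions, a system that matches the target on every query has zero bias and zero variance, while an effective but erratic system can show a small bias alongside a large variance.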