92 research outputs found

    Automated call detection for acoustic surveys with structured calls of varying length

    Funding: Y.W. is partly funded by the China Scholarship Council (CSC) for Ph.D. study at the University of St Andrews, UK.
    1. When recorders are used to survey acoustically conspicuous species, identifying calls of the target species in recordings is essential for estimating density and abundance. We investigate how well deep neural networks identify vocalisations consisting of phrases of varying lengths, each containing a variable number of syllables. We use recordings of Hainan gibbon (Nomascus hainanus) vocalisations to develop and test the methods.
    2. We propose two methods for exploiting the two-level structure of such data. The first combines convolutional neural network (CNN) models with a hidden Markov model (HMM) and the second uses a convolutional recurrent neural network (CRNN). Both models learn acoustic features of syllables via a CNN and temporal correlations of syllables into phrases via either an HMM or a recurrent network. We compare their performance to the commonly used CNNs LeNet and VGGNet, and to a support vector machine (SVM). We also propose a dynamic programming method to evaluate how well phrases are predicted. This is useful for evaluating performance when vocalisations are labelled by phrases, not syllables.
    3. Our methods perform substantially better than the commonly used methods when applied to the gibbon acoustic recordings. The CRNN has an F-score of 90% on phrase prediction, which is 18% higher than the best of the SVM, LeNet and VGGNet methods. HMM post-processing raised the F-score of these last three methods to as much as 87%. The number of phrases is overestimated by the CNNs and SVM, leading to error rates between 49% and 54%. With an HMM, these error rates can be reduced to as low as 0.4%. Similarly, the error rate of the CRNN's prediction is no more than 0.5%.
    4. CRNNs are better at identifying phrases of varying lengths composed of a varying number of syllables than simpler CNN or SVM models. We find a CRNN model to be best at this task, with a CNN combined with an HMM performing almost as well. We recommend that these kinds of models be used for species whose vocalisations are structured into phrases of varying lengths.
    Publisher PDF. Peer reviewed.
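The abstract above reports that raw CNN and SVM outputs overestimate the number of phrases and that HMM post-processing largely removes this error. The following is a minimal sketch of that general idea, not the authors' implementation: a two-state (background/call) HMM decoded with the Viterbi algorithm smooths per-frame classifier probabilities. The function names, the `p_stay` transition probability, the uniform initial prior, and the example probabilities are all assumptions made for illustration.

```python
import math


def viterbi_smooth(call_probs, p_stay=0.9, eps=1e-6):
    """Most likely 0/1 (background/call) state sequence for a two-state HMM.

    call_probs: per-frame 'call' probabilities from some classifier
                (hypothetical input, used here as emission scores).
    p_stay:     assumed probability of staying in the same state.
    """
    if not call_probs:
        return []
    log_trans = [[math.log(p_stay), math.log(1 - p_stay)],
                 [math.log(1 - p_stay), math.log(p_stay)]]

    def log_emit(p):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        return [math.log(1 - p), math.log(p)]

    # Uniform prior over the initial state (an assumption).
    score = [math.log(0.5) + e for e in log_emit(call_probs[0])]
    backptrs = []
    for p in call_probs[1:]:
        emit = log_emit(p)
        new_score, ptr = [], []
        for s in (0, 1):
            # Best predecessor state for landing in state s at this frame.
            best_prev = max((0, 1), key=lambda r: score[r] + log_trans[r][s])
            ptr.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s] + emit[s])
        score = new_score
        backptrs.append(ptr)

    # Trace back from the best final state.
    state = max((0, 1), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def count_segments(path):
    """Count maximal runs of 1s, i.e. predicted call segments."""
    return sum(1 for i, s in enumerate(path)
               if s == 1 and (i == 0 or path[i - 1] == 0))


# Illustrative (made-up) frame probabilities with a one-frame dip mid-call.
probs = [0.1, 0.1, 0.9, 0.9, 0.2, 0.9, 0.9, 0.1, 0.1]
raw = [1 if p > 0.5 else 0 for p in probs]   # plain thresholding
smoothed = viterbi_smooth(probs)
print(count_segments(raw), count_segments(smoothed))  # prints "2 1"
```

In this toy input, thresholding splits one call into two segments because of the single low-probability frame, while the HMM keeps it as one segment because switching states twice costs more than one unlikely emission. This is one plausible mechanism for the overcounting described in the abstract.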

    Towards Automated Animal Density Estimation with Acoustic Spatial Capture-Recapture

    Passive acoustic monitoring can be an effective way of monitoring wildlife populations that are acoustically active but difficult to survey visually. Digital recorders allow surveyors to gather large volumes of data at low cost, but identifying target species vocalisations in these data is non-trivial. Machine learning (ML) methods are often used to do the identification. They can process large volumes of data quickly, but they do not detect all vocalisations and they do generate some false positives (vocalisations that are not from the target species). Existing wildlife abundance survey methods have been designed specifically to deal with the first of these mistakes, but current methods of dealing with false positives are not well-developed. They do not take account of features of individual vocalisations, some of which are more likely to be false positives than others. We propose three methods for acoustic spatial capture-recapture inference that integrate individual-level measures of confidence from ML vocalisation identification into the likelihood and hence integrate ML uncertainty into inference. The methods include a mixture model in which species identity is a latent variable. We test the methods by simulation and find that in a scenario based on acoustic data from Hainan gibbons, in which ignoring false positives results in 17% positive bias, our methods give negligible bias and coverage probabilities that are close to the nominal 95% level.
    Comment: 35 pages, 5 figures

    Molecular state interpretation of charmed baryons in the quark model

    Stimulated by the observation of $\Lambda_c(2910)^+$ by the Belle Collaboration, the $S$-wave $qqq\bar{q}c~(q=u~\text{or}~d)$ pentaquark systems with $I = 0$, $J^P = \frac{1}{2}^-,~\frac{3}{2}^-$ and $\frac{5}{2}^-$ are investigated in the framework of the quark delocalization color screening model (QDCSM). The real-scaling method is utilized to check the bound states and the genuine resonance states. The root mean square of cluster spacing is also calculated to study the structure of the states and to estimate whether a state is a resonance state or not. The numerical results show that $\Lambda_{c}(2910)$ cannot be interpreted as a molecular state, and $\Sigma_{c}(2800)$ cannot be explained as the $ND$ molecular state with $J^P=1/2^-$. $\Lambda_{c}(2595)$ can be interpreted as the molecular state with $J^P=\frac{1}{2}^-$ and the main component is $\Sigma_{c}\pi$. $\Lambda_{c}(2625)$ can be interpreted as the molecular state with $J^P=\frac{3}{2}^-$ and the main component is $\Sigma_{c}^{*}\pi$. $\Lambda_{c}(2940)$ is likely to be interpreted as a molecular state with $J^P=3/2^-$, and the main component is $ND^{*}$. Besides, two new molecular states are predicted: one is the $J^P=3/2^-$ $\Sigma_{c}\rho$ resonance state with a mass around 3140 MeV, and the other is the $J^P=\frac{5}{2}^-$ $\Sigma_{c}^*\rho$ state with a mass of 3188.3 MeV.
    Comment: 12 pages, 3 figures

    DynamicRead: Exploring Robust Gaze Interaction Methods for Reading on Handheld Mobile Devices under Dynamic Conditions

    Enabling gaze interaction in real-time on handheld mobile devices has attracted significant attention in recent years. An increasing number of research projects have focused on sophisticated appearance-based deep learning models to enhance the precision of gaze estimation on smartphones. This inspires important research questions, including how gaze can be used in a real-time application, and what types of gaze interaction methods are preferable under dynamic conditions in terms of both user acceptance and reliable performance. To address these questions, we design four types of gaze scrolling techniques: three explicit techniques based on Gaze Gesture, Dwell time, and Pursuit; and one implicit technique based on reading speed, to support touch-free page-scrolling in a reading application. We conduct a 20-participant user study under both sitting and walking settings. Our results reveal that Gaze Gesture and Dwell time-based interfaces are more robust while walking, and that Gaze Gesture achieves consistently good usability scores without causing high cognitive workload.
    Comment: Accepted by ETRA 2023 as Full paper, and as journal paper in Proceedings of the ACM on Human-Computer Interaction

    Efficient and durable uranium extraction from uranium mine tailings seepage water via a photoelectrochemical method

    Current photocatalytic uranium (U) extraction methods have intrinsic obstacles, such as the recombination of charge carriers and the deactivation of catalysts by extracted U. Here we show that, by applying a bias potential on the photocatalyst, the photoelectrochemical (PEC) method can address these limitations. We demonstrate that, owing to efficient spatial charge-carrier separation driven by the applied bias, the PEC method enables efficient and durable U extraction. The effects of multiple operation conditions are investigated. The U extraction proceeds via single-step one-electron reduction, resulting in the formation of pentavalent U, which can facilitate future studies on this often-overlooked U species. In real seepage water the PEC method achieves an extraction capacity of 0.67 g U m⁻³ h⁻¹ without deactivation over 156 h of continuous operation, which is 17 times faster than the photocatalytic method. This work provides an alternative tool for U resource recovery and facilitates future studies on U(V) chemistry.