27 research outputs found

    Approximate Bayesian inference for robust speech processing

    Speech processing applications such as speech enhancement and speaker identification rely on the estimation of relevant parameters from the speech signal. These parameters must often be estimated from noisy observations, since speech signals are rarely obtained in 'clean' acoustic environments in the real world. As a result, the parameter estimation algorithms we employ must be robust to environmental factors such as additive noise and reverberation. In this work we derive and evaluate approximate Bayesian algorithms for the following speech processing tasks: 1) speech enhancement, 2) speaker identification, 3) speaker verification, and 4) voice activity detection. Building on previous work in the field of statistical model based speech enhancement, we derive speech enhancement algorithms that rely on speaker dependent priors over linear prediction parameters. These speaker dependent priors allow us to handle speech enhancement and speaker identification in a joint framework. Furthermore, we show how these priors allow voice activity detection to be performed in a robust manner. We also develop algorithms in the log spectral domain with applications in robust speaker verification. The use of speaker dependent priors in the log spectral domain is shown to improve equal error rates in noisy environments and to compensate for mismatch between training and testing conditions. Ph.D., Electrical Engineering -- Drexel University, 201

    Raspberry Pi based recording system for acoustic monitoring of bird species

    Severe degradation of ecosystems due to human encroachment and climate change calls for close monitoring of these ecosystems in order to conserve them. Ecosystems produce a large amount of acoustic data that can be used to study, remotely, the changes taking place in them. In this paper, we present an acoustic system that is based on the Raspberry Pi and is used to collect audio recordings for acoustic monitoring of birds. The system has been designed to work optimally in the field. It collected good quality acoustic data of several bird species during its pilot deployment. Acoustic data collected over a reasonable amount of time will be used to create datasets for developing machine learning models for automatic classification of bird species. This will offer a tool for continuous monitoring of ecosystems. National Research Fund Kenya (NRF)

    Inference of RNA Polymerase II Transcription Dynamics from Chromatin Immunoprecipitation Time Course Data

    Gene transcription mediated by RNA polymerase II (pol-II) is a key step in gene expression. The dynamics of pol-II moving along the transcribed region influence the rate and timing of gene expression. In this work, we present a probabilistic model of transcription dynamics which is fitted to pol-II occupancy time course data measured using ChIP-seq. The model can be used to estimate transcription speed and to infer the temporal pol-II activity profile at the gene promoter. Model parameters are estimated either by maximum likelihood or by Bayesian inference using Markov chain Monte Carlo sampling. The Bayesian approach provides confidence intervals for parameter estimates and allows the use of priors that capture domain knowledge, e.g. the expected range of transcription speeds, based on previous experiments. The model describes the movement of pol-II down the gene body and can be used to identify the time of induction for transcriptionally engaged genes. By clustering the inferred promoter activity time profiles, we are able to determine which genes respond quickly to stimuli and to group genes that share activity profiles and may therefore be co-regulated. We apply our methodology to biological data obtained using ChIP-seq to measure pol-II occupancy genome-wide when MCF-7 human breast cancer cells are treated with estradiol (E2). The transcription speeds we obtain agree with those obtained previously for smaller numbers of genes, with the advantage that our approach can be applied genome-wide. We validate the biological significance of the pol-II promoter activity clusters by investigating cluster-specific transcription factor binding patterns and determining canonical pathway enrichment. We find that rapidly induced genes are enriched for both estrogen receptor alpha (ER) and FOXA1 binding in their proximal promoter regions. Peer reviewed

    DSAIL power management board : powering the Raspberry Pi autonomously off the grid

    The Raspberry Pi is a credit card sized single board computer that is used in very diverse projects. Being a computer, it runs a full operating system and can be interfaced with a wide range of hardware. Its ability to collect and store data and its superior processing capabilities give it an edge over other microprocessors. When it is used to collect data away from the grid, alternative methods of powering the Raspberry Pi have to be used. An ideal powering system should be autonomous, allowing the Raspberry Pi to be deployed indefinitely without the need to check on the system because of power shortfalls. In this paper we introduce the DSAIL Power Management Board, which is used to power the Raspberry Pi autonomously. We have developed a prototype and used it to collect ecological data from a conservancy in Central Kenya. The Swedish International Development Cooperation Agency (Sida); Google; The National Research Fund (NRF) Kenya

    Speech recognition datasets for low-resource Congolese languages

    Large pre-trained Automatic Speech Recognition (ASR) models have shown improved performance in low-resource languages due to the increased availability of benchmark corpora and the advantages of transfer learning. However, only a limited number of languages possess ample resources to fully leverage transfer learning. In such contexts, benchmark corpora become crucial for advancing methods. In this article, we introduce two new benchmark corpora designed for low-resource languages spoken in the Democratic Republic of the Congo: the Lingala Read Speech Corpus, with 4 h of labelled audio, and the Congolese Speech Radio Corpus, which offers 741 h of unlabelled audio spanning four significant low-resource languages of the region. During data collection, thirty-two distinct adult speakers were recorded for Lingala Read Speech, each with a unique context, under various settings and with different accents. Concurrently, the Congolese Speech Radio raw data were taken from the archive of a broadcast station and then passed through a designed curation process. During data preparation, numerous strategies were used to pre-process the data. The datasets, which have been made freely accessible to all researchers, serve as a valuable resource not only for investigating and developing monolingual methods and approaches that employ linguistically distant languages but also for multilingual approaches with linguistically similar languages. Using techniques such as supervised learning and self-supervised learning, we establish inaugural benchmarks of speech recognition systems for Lingala and present the first multilingual model tailored for four Congolese languages spoken by an aggregated population of 95 million. Moreover, two models were applied to this dataset: the first uses supervised learning and the second uses self-supervised pre-training.

    BER Performance of Stratified ACO-OFDM for Optical Wireless Communications over Multipath Channel

    In intensity modulation/direct detection (IM/DD) based optical OFDM systems, the requirement that the input signal be real and unipolar (non-negative) reduces system performance. Among the unipolar optical OFDM schemes previously proposed for optical wireless communications (OWC), asymmetrically clipped optical OFDM (ACO-OFDM) and direct current biased optical OFDM (DCO-OFDM) are the most widely accepted. However, these schemes suffer either a spectral efficiency loss or an energy efficiency loss, which is a major obstacle to realizing high speed OWC. To improve the spectral and energy efficiencies, we previously proposed a multistratum-based stratified asymmetrically clipped optical OFDM (STACO-OFDM), and its performance was analyzed for the AWGN channel. STACO-OFDM utilizes even subcarriers on the first stratum and odd subcarriers on the rest of the strata to transmit multiple ACO-OFDM frames simultaneously. STACO-OFDM provides the same spectral efficiency as DCO-OFDM and better spectral efficiency than ACO-OFDM. In this paper, we analyze the BER performance of STACO-OFDM under the effect of multipath fading. The theoretical bit error rate (BER) bound is derived and compared with the simulation results, and good agreement is achieved. Moreover, STACO-OFDM shows better BER performance than ACO-OFDM and DCO-OFDM.
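
As a rough illustration of the unipolar constraint discussed in this abstract, the sketch below implements conventional single-layer ACO-OFDM in its usual textbook form (data on odd subcarriers, Hermitian symmetry, clipping at zero). It is not the STACO-OFDM scheme of the paper, and it models no channel or noise. The function names, the 16-point transform size and the QPSK-like symbols are assumptions chosen for the example.

```python
import cmath

def idft(X):
    """Naive inverse DFT (O(N^2)), adequate for a small illustration."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def dft(x):
    """Naive forward DFT (O(N^2))."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def aco_ofdm_modulate(symbols, n_fft):
    """Load symbols on odd subcarriers, enforce Hermitian symmetry, clip at zero."""
    odd = list(range(1, n_fft // 2, 2))
    if len(symbols) != len(odd):
        raise ValueError("expected one symbol per odd subcarrier below n_fft/2")
    X = [0j] * n_fft
    for k, s in zip(odd, symbols):
        X[k] = s
        X[n_fft - k] = s.conjugate()   # Hermitian symmetry gives a real time signal
    x = [v.real for v in idft(X)]
    return [max(v, 0.0) for v in x]    # asymmetric clipping: drop negative samples

def aco_ofdm_demodulate(x_clipped):
    """Recover the symbols; clipping halves the amplitude on odd subcarriers."""
    n_fft = len(x_clipped)
    Y = dft(x_clipped)
    return [2 * Y[k] for k in range(1, n_fft // 2, 2)]

# Hypothetical QPSK-like symbols for a 16-point transform (4 usable odd subcarriers).
symbols = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]
tx = aco_ofdm_modulate(symbols, 16)
rx = aco_ofdm_demodulate(tx)
```

Only a quarter of the subcarriers carry independent data here, which is the spectral efficiency loss the abstract attributes to ACO-OFDM. The clipping distortion falls on the even subcarriers, so the odd-subcarrier symbols are recovered exactly in this noiseless setting.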

    Cross-Tier Interference Mitigation for RIS-Assisted Heterogeneous Networks

    With the development of the next generation of mobile networks, new research challenges have emerged, and new technologies have been proposed to address them. Reconfigurable intelligent surface (RIS) technology is being investigated as a means of partially controlling wireless channels. RIS is a promising technology for improving signal quality by controlling the scattering of electromagnetic waves in a nearly passive manner. Heterogeneous networks (HetNets) are another promising technology, designed to meet the capacity requirements of the network. RIS technology can be used to improve system performance in the context of HetNets. This study investigates the application of RISs in heterogeneous downlink networks. Due to network densification, the small cell base station (SBS) interferes with the macrocell users (MUEs). In this paper, we utilise RIS to mitigate cross-tier interference in a HetNet via directional beamforming, achieved by adjusting the phase shift of the RIS. We consider RIS-assisted heterogeneous networks consisting of multiple SBS nodes and MUEs that utilise both direct paths and reflected paths. The aim of this study is to maximise the sum rate of all MUEs by jointly optimising the transmit beamforming of the macrocell base station (MBS) and the phase shift of the RIS. An efficient RIS reflecting coefficient-based optimisation (RCO) is proposed based on a successive convex approximation approach. Simulation results show the effectiveness of the proposed scheme in terms of its sum rate, in comparison with a HetNet without RIS and a HetNet with RIS but with random phase shifts.
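
To make the role of the RIS phase shifts concrete, the sketch below considers a deliberately simplified single-antenna, single-user link with one direct path and one reflected path per RIS element. It compares random phase shifts with phases chosen so that every reflected path adds coherently to the direct path. This is not the RCO / successive convex approximation scheme of the paper and ignores interference and the multi-user sum rate. All names, the element count and the randomly drawn channel coefficients are assumptions made for the example.

```python
import cmath
import random

def effective_channel(h_direct, h_ris_user, g_bs_ris, phases):
    """Direct path plus the sum of reflected paths, one per RIS element."""
    reflected = sum(h * g * cmath.exp(1j * theta)
                    for h, g, theta in zip(h_ris_user, g_bs_ris, phases))
    return h_direct + reflected

def aligned_phases(h_direct, h_ris_user, g_bs_ris):
    """Rotate each reflected path onto the phase of the direct path."""
    target = cmath.phase(h_direct)
    return [target - cmath.phase(h * g) for h, g in zip(h_ris_user, g_bs_ris)]

def random_complex(rng):
    """Draw a complex Gaussian coefficient (illustrative channel model only)."""
    return complex(rng.gauss(0, 1), rng.gauss(0, 1))

rng = random.Random(0)          # fixed seed so the example is repeatable
n_elements = 16                 # hypothetical number of RIS elements
h_d = random_complex(rng)
h = [random_complex(rng) for _ in range(n_elements)]   # RIS-to-user coefficients
g = [random_complex(rng) for _ in range(n_elements)]   # base-station-to-RIS coefficients

random_phases = [rng.uniform(0, 2 * cmath.pi) for _ in range(n_elements)]
random_gain = abs(effective_channel(h_d, h, g, random_phases))
aligned_gain = abs(effective_channel(h_d, h, g, aligned_phases(h_d, h, g)))
upper_bound = abs(h_d) + sum(abs(a * b) for a, b in zip(h, g))
```

By the triangle inequality, the aligned configuration reaches `upper_bound`, the largest magnitude any choice of phases can give for this link, which is why a random-phase RIS is a natural baseline.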

    DeKUWC Audio Recordings 2016

    We provide acoustic recordings from the Dedan Kimathi University Wildlife Conservancy (DeKUWC) in the Mt. Kenya ecosystem, obtained using a low cost acoustic recorder. A total of 2701 minute-long recordings are provided, including both daytime and nighttime recordings. We present an annotation of a subset of the daytime recordings indicating the bird species present in the recordings. The dataset contains recordings of at least 36 bird species. In addition, the presence of a few nocturnal species within the conservancy is confirmed.

    Low cost, LoRa based river water level data acquisition system

    In recent years, climate change and catchment degradation have negatively affected stage patterns in rivers, which in turn has affected the availability of enough water for various ecosystems. To understand and quantify the effects of climate change and catchment degradation on rivers, water level monitoring is essential. The river water level monitoring infrastructure developed and deployed in developing countries over the years, although effective, is often bulky, complex and expensive to build and maintain. Additionally, most of it is not equipped with communication hardware that can enable wireless data transmission. This paper presents a river water level data acquisition system that improves on the effectiveness, size, deployment design and data transmission capabilities of the systems currently in use. The main component of the system is a river water level sensor node. The node is based on the MultiTech mDot (an ARM-Mbed programmable, low power RF module) interfaced with an ultrasonic sensor for data acquisition. The data is transmitted via LoRaWAN and stored on servers. The quality of the stored raw data is controlled using various outlier detection and prediction machine learning models. Simplified firmware and easy to connect hardware make the sensor node design easy to develop. The developed sensor nodes were deployed along River Muringato in Nyeri, Kenya, for a period of 18 months for continuous data collection. The results showed that the developed system can practically and accurately obtain data that is useful for analysis of river catchment areas.
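
The abstract does not say which outlier detection models are applied to the stored water level data. As a stand-in, the sketch below flags readings that deviate strongly from a rolling median, scaled by the median absolute deviation (a Hampel-style filter). The function name, window size, threshold and sample readings are all assumptions made for illustration, not values from the deployment.

```python
from statistics import median

def flag_outliers(levels, window=5, threshold=3.0):
    """Return one boolean per reading: True where the value is a likely outlier.

    Each reading is compared with the median of its neighbourhood; the
    deviation is scaled by 1.4826 * MAD, which approximates a standard
    deviation for normally distributed data.
    """
    half = window // 2
    flags = []
    for i, value in enumerate(levels):
        lo, hi = max(0, i - half), min(len(levels), i + half + 1)
        neighbourhood = levels[lo:hi]
        med = median(neighbourhood)
        mad = median(abs(v - med) for v in neighbourhood)
        scale = 1.4826 * mad
        flags.append(scale > 0 and abs(value - med) > threshold * scale)
    return flags

# Hypothetical water levels in metres with one spurious ultrasonic echo.
readings = [1.20, 1.21, 1.19, 1.22, 3.50, 1.20, 1.21, 1.23, 1.22]
flags = flag_outliers(readings)
cleaned = [r for r, bad in zip(readings, flags) if not bad]
```

A median-based rule is used here because a single spurious echo barely moves the median, whereas it would inflate a mean and standard deviation computed over the same window.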