21,995 research outputs found

    Robust audio indexing for Dutch spoken-word collections

    Whereas the growth of storage capacity is in accordance with widely acknowledged predictions, the possibilities to index and access the archives created are lagging behind. This is especially the case in the oral history domain, where much of the rich content in these collections runs the risk of remaining inaccessible for lack of robust search technologies. This paper addresses the history and development of robust audio indexing technology for searching Dutch spoken-word collections and compares Dutch audio indexing in the well-studied broadcast news domain with an oral-history case study. It is concluded that, despite significant advances in Dutch audio indexing technology and demonstrated applicability in several domains, further research is indispensable for successful automatic disclosure of spoken-word collections.

    Access to recorded interviews: A research agenda

    Recorded interviews form a rich basis for scholarly inquiry. Examples include oral histories, community memory projects, and interviews conducted for broadcast media. Emerging technologies offer the potential to radically transform the way in which recorded interviews are made accessible, but this vision will demand substantial investments from a broad range of research communities. This article reviews the present state of practice for making recorded interviews available and the state of the art for key component technologies. A large number of important research issues are identified, and from that set of issues, a coherent research agenda is proposed.

    Spoken content retrieval: A survey of techniques and technologies

    Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition, and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight into how these fields are integrated to support research and development, thus addressing the core challenges of SCR.

    Experimental and computational applications of microarray technology for malaria eradication in Africa

    Mutation-assisted drug resistance that has evolved in Plasmodium falciparum strains and insecticide resistance in female Anopheles mosquitoes account for major biomedical catastrophes standing against all efforts to eradicate malaria in Sub-Saharan Africa. Malaria is endemic in more than 100 countries and is by far the most costly disease in terms of human health, causing major losses among many African nations including Nigeria. The fight against malaria is failing, and DNA microarray analysis needs to keep pace in order to unravel the evolving parasite's gene expression profile, which is a pointer to monitoring the genes involved in malaria's infective metabolic pathway. Huge amounts of data are generated, and biologists face the challenge of extracting useful information from volumes of microarray data. Expression levels for tens of thousands of genes can be simultaneously measured in a single hybridization experiment and are collectively called a "gene expression profile". Gene expression profiles can also be used in studying various states of malaria development, in which expression profiles of different disease states at different time points are collected and compared to each other to establish a classification scheme for purposes such as diagnosis and treatment with adequate drugs. This paper examines microarray technology and its application as supported by appropriate software tools, from experimental set-up to the level of data analysis. It also assesses the level of microarray technology in Africa, its availability, and the techniques required for malaria eradication and effective healthcare in Nigeria and Africa in general.

    From aggregation to interpretation: how assessors judge complex data in a competency-based portfolio

    While portfolios are increasingly used to assess competence, the validity of such portfolio-based assessments has hitherto remained unconfirmed. The purpose of the present research is therefore to further our understanding of how assessors form judgments when interpreting the complex data included in a competency-based portfolio. Eighteen assessors appraised one of three competency-based mock portfolios while thinking aloud, before taking part in semi-structured interviews. A thematic analysis of the think-aloud protocols and interviews revealed that assessors reached judgments through a 3-phase cyclical cognitive process of acquiring, organizing, and integrating evidence. Upon conclusion of the first cycle, assessors reviewed the remaining portfolio evidence to look for confirming or disconfirming evidence. Assessors were inclined to stick to their initial judgments even when confronted with seemingly disconfirming evidence. Although assessors reached similar final (pass-fail) judgments of students' professional competence, they differed in their information-processing approaches and the reasoning behind their judgments. Differences sprang from assessors' divergent assessment beliefs, performance theories, and inferences about the student. Assessment beliefs refer to assessors' opinions about what kind of evidence gives the most valuable and trustworthy information about the student's competence, whereas assessors' performance theories concern their conceptualizations of what constitutes professional competence and competent performance. Furthermore, even when using the same pieces of information, assessors differed with respect to inferences about the student as a person as well as a (future) professional. Our findings support the notion that assessors' reasoning in judgment and decision-making varies and is guided by their mental models of performance assessment, potentially impacting feedback and the credibility of decisions. Our findings also lend further credence to the assertion that portfolios should be judged by multiple assessors who should, moreover, thoroughly substantiate their judgments. Finally, it is suggested that portfolios be designed in such a way that they facilitate the selection of and navigation through the portfolio evidence.

    Molecular processes underlying synergistic linuron mineralization in a triple-species bacterial consortium biofilm revealed by differential transcriptomics

    The proteobacteria Variovorax sp. WDL1, Comamonas testosteroni WDL7, and Hyphomicrobium sulfonivorans WDL6 compose a triple-species consortium that synergistically degrades and grows on the phenylurea herbicide linuron. To acquire a better insight into the interactions between the consortium members and the underlying molecular mechanisms, we compared the transcriptomes of the key biodegrading strains WDL7 and WDL1 grown as biofilms in either isolation or consortium conditions by differential RNAseq analysis. Differentially expressed pathways and cellular systems were inferred using the network-based algorithm PheNetic. Coculturing affected mainly metabolism in WDL1. Significantly enhanced expression of hylA encoding linuron hydrolase was observed. Moreover, differential expression of several pathways involved in carbohydrate, amino acid, nitrogen, and sulfur metabolism was observed, indicating that WDL1 gains carbon and energy from linuron indirectly by consuming excretion products from WDL7 and/or WDL6. In addition, in consortium conditions, WDL1 showed a pronounced stress response and overexpression of cell-to-cell interaction systems such as quorum sensing, contact-dependent inhibition, and Type VI secretion. Since the latter two systems can mediate interference competition, this raises the question of whether synergistic linuron degradation is the result of true adaptive cooperation or rather a facultative interaction between bacteria that coincidentally occupy complementary metabolic niches.

    The Spoken British National Corpus 2014: design, compilation and analysis

    The ESRC-funded Centre for Corpus Approaches to Social Science at Lancaster University (CASS) and the English Language Teaching group at Cambridge University Press (CUP) have compiled a new, publicly accessible corpus of spoken British English from the 2010s, known as the Spoken British National Corpus 2014 (Spoken BNC2014). The 11.5 million-word corpus, gathered solely in informal contexts, is the first freely accessible corpus of its kind since the spoken component of the original British National Corpus (the Spoken BNC1994), which, despite its age, is still used as a proxy for present-day English in research today. This thesis presents a detailed account of each stage of the Spoken BNC2014's construction, including its conception, design, transcription, processing and dissemination. It also demonstrates the research potential of the corpus by presenting a diachronic analysis of 'bad language' in spoken British English, comparing the 1990s to the 2010s. The thesis shows how the research team struck a delicate balance between backwards compatibility with the Spoken BNC1994 and optimal practice in the context of compiling a new corpus. Although comparable with its predecessor, the Spoken BNC2014 is shown to represent innovation in approaches to the compilation of spoken corpora. This thesis makes several useful contributions to the linguistic research community. First, the Spoken BNC2014 itself should be of use to many researchers, educators and students in the corpus linguistics and English language communities and beyond. Second, the thesis represents an example of good practice with regard to academic collaboration with a commercial stakeholder. Third, although not a 'user guide', the methodological discussions and analysis presented in this thesis are intended to help the Spoken BNC2014 to be as useful to as many people, and for as many purposes, as possible.