Search CORE

21,995 research outputs found

Robust audio indexing for Dutch spoken-word collections

Author: Huijbregts Marijn
Jong Franciska de
Leeuwen David van
Ordelman Roeland
Publication venue: KNAW
Publication date: 01/01/2005
Field of study

Abstract—Whereas the growth of storage capacity is in accordance with widely acknowledged predictions, the possibilities to index and access the archives created is lagging behind. This is especially the case in the oral history domain and much of the rich content in these collections runs the risk to remain inaccessible for lack of robust search technologies. This paper addresses the history and development of robust audio indexing technology for searching Dutch spoken-word collections and compares Dutch audio indexing in the well-studied broadcast news domain with an oral-history case-study. It is concluded that despite significant advances in Dutch audio indexing technology and demonstrated applicability in several domains, further research is indispensable for successful automatic disclosure of spoken-word collections

University of Twente Research Information

Access to recorded interviews: A research agenda

Author: Heeren W.F.L.
Jong F.M.G. de
Oard D.W.
Ordelman R.J.F.
Publication venue: ACM
Publication date: 01/01/2008
Field of study

Recorded interviews form a rich basis for scholarly inquiry. Examples include oral histories, community memory projects, and interviews conducted for broadcast media. Emerging technologies offer the potential to radically transform the way in which recorded interviews are made accessible, but this vision will demand substantial investments from a broad range of research communities. This article reviews the present state of practice for making recorded interviews available and the state-of-the-art for key component technologies. A large number of important research issues are identified, and from that set of issues, a coherent research agenda is proposed

University of Twente Research Information

Spoken content retrieval: A survey of techniques and technologies

Author: Ani Nenkova
C A. Nenkova
K. Mckeown
Kathleen Mckeown
Publication venue: 'Now Publishers'
Publication date: 01/01/2012
Field of study

Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight on how these fields are integrated to support research and development, thus addressing the core challenges of SCR

CiteSeerX

Crossref

Irish Universities

DCU Online Research Access Service

Experimental and computational applications of microarray technology for malaria eradication in Africa

Author: Osamor V. C.
Publication venue: 'Academic Journals'
Publication date: 01/07/2009
Field of study

Various mutation assisted drug resistance evolved in Plasmodium falciparum strains and insecticide resistance to female Anopheles mosquito account for major biomedical catastrophes standing against all efforts to eradicate malaria in Sub-Saharan Africa. Malaria is endemic in more than 100 countries and by far the most costly disease in terms of human health causing major losses among many African nations including Nigeria. The fight against malaria is failing and DNA microarray analysis need to keep up the pace in order to unravel the evolving parasite’s gene expression profile which is a pointer to monitoring the genes involved in malaria’s infective metabolic pathway. Huge data is generated and biologists have the challenge of extracting useful information from volumes of microarray data. Expression levels for tens of thousands of genes can be simultaneously measured in a single hybridization experiment and are collectively called a “gene expression profile”. Gene expression profiles can also be used in studying various state of malaria development in which expression profiles of different disease states at different time points are collected and compared to each other to establish a classifying scheme for purposes such as diagnosis and treatments with adequate drugs. This paper examines microarray technology and its application as supported by appropriate software tools from experimental set-up to the level of data analysis. An assessment of the level of microarray technology in Africa, its availability and techniques required for malaria eradication and effective healthcare in Nigeria and Africa in general were also underscored

Covenant University Repository

Recommended from our members

MC2: MPEG-7 content modelling communities

Author: Daylamani Zad Damon
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2013
Field of study

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel UniversityThe use of multimedia content on the web has grown significantly in recent years. Websites such as Facebook, YouTube and Flickr cater for enormous amounts of multimedia content uploaded by users. This vast amount of multimedia content requires comprehensive content modelling otherwise retrieving relevant content will be challenging. Modelling multimedia content can be an extremely time consuming task that may seem impossible particularly when undertaken by individual users. However, the advent of Web 2.0 and associated communities, such as YouTube and Flickr, has shown that users appear to be more willing to collaborate in order to take on enormous tasks such as multimedia content modelling. Harnessing the power of communities to achieve comprehensive content modelling is the primary focus of this research. The aim of this thesis is to explore collaborative multimedia content modelling and in particular the effectiveness of existing multimedia content modelling tools, taking into account the key development challenges of existing collaborative content modelling research and the associated modelling tools. Four research objectives are pursued in order to achieve this; first, design a user experiment to study users’ tagging behaviour with existing multimedia tagging tools and identify any relationships between such user behaviour; second, design and develop a framework for MPEG-7 content modelling communities based on the results of the experiment; third, implement an online service as a proof of concept of the framework; fourth, validate the framework through the online service during a repeat of the initial user experiment. This research contributes first, a conceptual model of user behaviour visualised as a fuzzy cognitive map and, second, an MPEG-7 framework for multimedia content modelling communities (MC2) and its proof of concept as an online service. The fuzzy cognitive model embodies relationships between user tagging behaviour and context and provides an understanding of user priorities in the description of content features and the relationships that exist between them. The MC2 framework, developed based on the fuzzy cognitive model, is deep-rooted in user content modelling behaviour and content preferences. A proof of concept of the MC2 framework is implemented as an online service in which all metadata is modelled using MPEG-7. The online service is validated, first, empirically with the same group of users and through the same experiment that led to the development of the fuzzy cognitive model and, second, functionally against the folksonomy and MPEG-7 content modelling tools used in the initial experiment. The validation demonstrates that MC2 has the advantages without the shortcomings of existing multimedia tagging tools by harnessing the ease of use of folksonomy tools while producing comprehensive structured metadata.Supported by UK Engineering and Physical Sciences Research Council (EPSRC

Brunel University Research Archive

The Production of Speech Corpora

Author: Baumann Angela
Draxler Christoph
Ellbogen Tania
Schiel Florian
Steffen Alexander
Publication venue
Publication date: 21/03/2012
Field of study

Open Access LMU

From aggregation to interpretation:how assessors judge complex data in a competency-based portfolio

Author: Driessen Erik W
Govaerts Marjan J B
Jaarsma Debbie A D C
Oudkerk Pool Andrea
Publication venue
Publication date: 01/05/2018
Field of study

While portfolios are increasingly used to assess competence, the validity of such portfolio-based assessments has hitherto remained unconfirmed. The purpose of the present research is therefore to further our understanding of how assessors form judgments when interpreting the complex data included in a competency-based portfolio. Eighteen assessors appraised one of three competency-based mock portfolios while thinking aloud, before taking part in semi-structured interviews. A thematic analysis of the think-aloud protocols and interviews revealed that assessors reached judgments through a 3-phase cyclical cognitive process of acquiring, organizing, and integrating evidence. Upon conclusion of the first cycle, assessors reviewed the remaining portfolio evidence to look for confirming or disconfirming evidence. Assessors were inclined to stick to their initial judgments even when confronted with seemingly disconfirming evidence. Although assessors reached similar final (pass-fail) judgments of students' professional competence, they differed in their information-processing approaches and the reasoning behind their judgments. Differences sprung from assessors' divergent assessment beliefs, performance theories, and inferences about the student. Assessment beliefs refer to assessors' opinions about what kind of evidence gives the most valuable and trustworthy information about the student's competence, whereas assessors' performance theories concern their conceptualizations of what constitutes professional competence and competent performance. Even when using the same pieces of information, assessors furthermore differed with respect to inferences about the student as a person as well as a (future) professional. Our findings support the notion that assessors' reasoning in judgment and decision-making varies and is guided by their mental models of performance assessment, potentially impacting feedback and the credibility of decisions. Our findings also lend further credence to the assertion that portfolios should be judged by multiple assessors who should, moreover, thoroughly substantiate their judgments. Finally, it is suggested that portfolios be designed in such a way that they facilitate the selection of and navigation through the portfolio evidence

Maastricht University Research Portal

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Molecular processes underlying synergistic linuron mineralization in a triple-species bacterial consortium biofilm revealed by differential transcriptomics

Author: Anders
Aziz
Beliaev
Bers
Bleichrodt
Boon
Breugelmans
Breugelmans
Cerqueda-Garcia
Colwell
Dealtry
Dejonghe
Dwyer
Foster
Frias-Lopez
Garbeva
Garcia
Gonzalez-Torres
Hansen
Horemans
Horemans
Horemans
Horemans
Horinouchi
Ilut
Kanehisa
Kell
Kimes
Krol
Love
MacLean
Maeyer
Nadell
Palama
Perez
Reasoner
Rickard
Rosenthal
Russell
Sabag-Daigle
Sambrook
Shimoyama
Subramoni
Suzuki
Tai
Teramoto
Ward
Wenter
West
Wu
Zerbino
Zgur-Bertok
Zhou
Publication venue: 'Wiley'
Publication date: 01/01/2018
Field of study

The proteobacteria Variovorax sp. WDL1, Comamonas testosteroni WDL7, and Hyphomicrobium sulfonivorans WDL6 compose a triple-species consortium that synergistically degrades and grows on the phenylurea herbicide linuron. To acquire a better insight into the interactions between the consortium members and the underlying molecular mechanisms, we compared the transcriptomes of the key biodegrading strains WDL7 and WDL1 grown as biofilms in either isolation or consortium conditions by differential RNAseq analysis. Differentially expressed pathways and cellular systems were inferred using the network-based algorithm PheNetic. Coculturing affected mainly metabolism in WDL1. Significantly enhanced expression of hylA encoding linuron hydrolase was observed. Moreover, differential expression of several pathways involved in carbohydrate, amino acid, nitrogen, and sulfur metabolism was observed indicating that WDL1 gains carbon and energy from linuron indirectly by consuming excretion products from WDL7 and/or WDL6. Moreover, in consortium conditions, WDL1 showed a pronounced stress response and overexpression of cell to cell interaction systems such as quorum sensing, contact-dependent inhibition, and Type VI secretion. Since the latter two systems can mediate interference competition, it prompts the question if synergistic linuron degradation is the result of true adaptive cooperation or rather a facultative interaction between bacteria that coincidentally occupy complementary metabolic niches

Crossref

Ghent University Academic Bibliography

Archivsystem Ask23

The Spoken British National Corpus 2014:design, compilation and analysis

Author: Love Robbie
Publication venue: Lancaster University
Publication date: 29/01/2018
Field of study

The ESRC-funded Centre for Corpus Approaches to Social Science at Lancaster University (CASS) and the English Language Teaching group at Cambridge University Press (CUP) have compiled a new, publicly-accessible corpus of spoken British English from the 2010s, known as the Spoken British National Corpus 2014 (Spoken BNC2014). The 11.5 million-word corpus, gathered solely in informal contexts, is the first freely-accessible corpus of its kind since the spoken component of the original British National Corpus (the Spoken BNC1994), which, despite its age, is still used as a proxy for present-day English in research today. This thesis presents a detailed account of each stage of the Spoken BNC2014’s construction, including its conception, design, transcription, processing and dissemination. It also demonstrates the research potential of the corpus, by presenting a diachronic analysis of ‘bad language’ in spoken British English, comparing the 1990s to the 2010s. The thesis shows how the research team struck a delicate balance between backwards compatibility with the Spoken BNC1994 and optimal practice in the context of compiling a new corpus. Although comparable with its predecessor, the Spoken BNC2014 is shown to represent innovation in approaches to the compilation of spoken corpora. This thesis makes several useful contributions to the linguistic research community. The Spoken BNC2014 itself should be of use to many researchers, educators and students in the corpus linguistics and English language communities and beyond. In addition, the thesis represents an example of good practice with regards to academic collaboration with a commercial stakeholder. Thirdly, although not a ‘user guide’, the methodological discussions and analysis presented in this thesis are intended to help the Spoken BNC2014 to be as useful to as many people, and for as many purposes, as possible

Lancaster E-Prints