12,677 research outputs found
Forensic Face Recognition: A Survey
Apart from a few papers that focus on the forensic aspects of automatic face recognition, little has been published on the topic, in contrast to the extensive literature on new techniques and methodologies for biometric face recognition. In this report, we review forensic facial identification, which is the forensic experts' way of manual facial comparison. We then review well-known works in the domain of forensic face recognition. Some of these papers describe general trends in forensics [1] and guidelines for manual forensic facial comparison and for training the face examiners who will be required to verify the outcome of an automatic forensic face recognition system [2]. Others propose a theoretical framework for applying face recognition technology in forensics [3] and for automatic forensic facial comparison [4, 5]. The Bayesian framework is discussed in detail, and we elaborate on how it can be adapted to forensic face recognition. Several issues related to court admissibility and system reliability are also discussed.
To date, no operational system is available that automatically compares an image of a suspect with a mugshot database and provides a result usable in court. Biometric face recognition can in most cases be used for forensic purposes, but the issues of integrating the technology with the legal system still remain to be solved. There is a great need for multi-disciplinary research that will integrate face recognition technology with existing legal systems. In this report we present a review of the existing literature in this domain and discuss various aspects of, and requirements for, forensic face recognition systems, focusing in particular on the Bayesian framework.
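As a brief illustration of the Bayesian framework this abstract refers to, the odds form of Bayes' theorem is commonly written as below. This is the standard textbook identity, not a formula quoted from the report, and the symbols $E$ (the facial comparison evidence), $H_p$ (the suspect is the person in the image) and $H_d$ (someone else is) are notation assumed here for illustration.

```latex
\[
\underbrace{\frac{P(H_p \mid E)}{P(H_d \mid E)}}_{\text{posterior odds}}
=
\underbrace{\frac{P(E \mid H_p)}{P(E \mid H_d)}}_{\text{likelihood ratio}}
\times
\underbrace{\frac{P(H_p)}{P(H_d)}}_{\text{prior odds}}
\]
```

Under this reading, a recognition system would contribute only the likelihood ratio, while the prior odds come from the other evidence in the case.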
Why a diagnosis of neurofibromatosis calls for the attention of a deaf educator
This paper will seek to describe neurofibromatosis (NF), the scope of its impact, how NF relates to hearing loss, and why someone with a teacher of the deaf's expertise may have information to offer the intervention team for a child diagnosed with NF.
Annotated Speech Corpus for Low Resource Indian Languages: Awadhi, Bhojpuri, Braj and Magahi
In this paper we discuss ongoing work on the development of a speech
corpus for four low-resource Indo-Aryan languages -- Awadhi, Bhojpuri, Braj and
Magahi using the field methods of linguistic data collection. The total size of
the corpus currently stands at approximately 18 hours (approx. 4-5 hours each
language) and it is transcribed and annotated with grammatical information such
as part-of-speech tags, morphological features and Universal dependency
relationships. We discuss our methodology for data collection in these
languages, most of which was done in the middle of the COVID-19 pandemic, with
one of the aims being to generate some additional income for low-income groups
speaking these languages. In the paper, we also discuss the results of the
baseline experiments for automatic speech recognition systems in these
languages.
Comment: Speech for Social Good Workshop, 2022, Interspeech 202
Voice-controlled in-vehicle infotainment system
Abstract. Speech is a form of human-to-human communication that can convey information in a context-rich way that is natural to humans. This naturalness enables us to speak while doing other things, such as driving a vehicle. With the advancement of computing technologies, more and more personal services are being introduced into the in-vehicle environment. A limiting factor for these advancements is the driver distraction they cause through increased cognitive load. This has led to the development of in-vehicle devices and applications with a heightened focus on lessening distraction.
Amazon Alexa is a natural language processing system that enables its users to receive information and operate smart devices with their voices. This Master's thesis aims to demonstrate how Alexa could be utilized when operating in-vehicle infotainment (IVI) systems. The research was conducted using the design science research methodology. The feasibility of voice-based interaction was assessed by implementing the system as a demonstrable use case in collaboration with the APPSTACLE project. Prior research was gathered by conducting a literature review on voice-based interaction and its integration into the vehicular domain. The system was designed by applying existing theories together with the requirements of the application domain.
The designed system utilized the Amazon Alexa ecosystem and AWS services to provide the vehicular environment with new functionalities. Access to cloud-based speech processing and decision-making makes it possible to design an extendable speech interface where the driver can carry out secondary tasks by using their voice, such as requesting navigation information. The evaluation was done by comparing the system's performance against the derived requirements.
Based on the results of the evaluation, the feasibility of the system could be assessed against the objectives of the study: the resulting artefact enables the user to operate the in-vehicle infotainment system while focusing on a separate task. The research proved that speech interfaces built with modern technology can improve the handling of secondary tasks while driving, and the resulting system was operable without introducing additional distractions to the driver. The resulting artefact can be integrated into similar systems and used as a base tool for future research on voice-controlled interfaces.
Biometrics – Developments and Potential
This article describes the use of biometric technology in forensic science for developing new methods and tools, improving current forensic biometric applications, and enabling the creation of new ones. The article begins with a definition and a summary of the development of this field. It then describes the data and automated biometric modalities of interest in forensic science and the forensic applications embedding biometric technology. On this basis, it describes the solutions and limitations of current practice regarding the data, the technology, and the inference models. Finally, it proposes research orientations for improving current forensic biometric applications and suggests ideas for developing new ones.
A Formal Framework for Linguistic Annotation
`Linguistic annotation' covers any descriptive or analytic notations applied
to raw language data. The basic data may be in the form of time functions --
audio, video and/or physiological recordings -- or it may be textual. The added
notations may include transcriptions of all sorts (from phonetic features to
discourse structures), part-of-speech and sense tagging, syntactic analysis,
`named entity' identification, co-reference annotation, and so on. While there
are several ongoing efforts to provide formats and tools for such annotations
and to publish annotated linguistic databases, the lack of widely accepted
standards is becoming a critical problem. Proposed standards, to the extent
they exist, have focussed on file formats. This paper focuses instead on the
logical structure of linguistic annotations. We survey a wide variety of
existing annotation formats and demonstrate a common conceptual core, the
annotation graph. This provides a formal framework for constructing,
maintaining and searching linguistic annotations, while remaining consistent
with many alternative data structures and file formats.
Comment: 49 pages
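To make the idea of an annotation graph more concrete, here is a minimal sketch. It assumes the common reading of the term: a directed graph whose nodes may carry time offsets and whose arcs carry an annotation type and a label. The class and method names (`Arc`, `AnnotationGraph`, `labels`, `spanning`, `build_example`) are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Arc:
    """One annotation: a labelled arc between two nodes."""
    src: str
    dst: str
    tier: str   # annotation type, e.g. "word" or "pos" (hypothetical names)
    label: str


@dataclass
class AnnotationGraph:
    # node id -> time offset in seconds, or None if the node is unanchored
    times: Dict[str, Optional[float]] = field(default_factory=dict)
    arcs: List[Arc] = field(default_factory=list)

    def add_node(self, node_id: str, time: Optional[float] = None) -> None:
        self.times[node_id] = time

    def add_arc(self, src: str, dst: str, tier: str, label: str) -> None:
        if src not in self.times or dst not in self.times:
            raise KeyError("both end nodes must be added before the arc")
        self.arcs.append(Arc(src, dst, tier, label))

    def labels(self, tier: str) -> List[str]:
        """Labels of one annotation type, in insertion order."""
        return [a.label for a in self.arcs if a.tier == tier]

    def spanning(self, t: float, tier: str) -> List[Arc]:
        """Arcs of one type whose anchored time span contains t."""
        hits = []
        for a in self.arcs:
            start, end = self.times[a.src], self.times[a.dst]
            if a.tier == tier and start is not None and end is not None:
                if start <= t < end:
                    hits.append(a)
        return hits


def build_example() -> AnnotationGraph:
    """Two words with part-of-speech tags sharing the same nodes."""
    g = AnnotationGraph()
    g.add_node("n0", 0.0)
    g.add_node("n1", 0.32)
    g.add_node("n2", 0.71)
    g.add_arc("n0", "n1", "word", "the")
    g.add_arc("n1", "n2", "word", "cat")
    g.add_arc("n0", "n1", "pos", "DET")
    g.add_arc("n1", "n2", "pos", "NOUN")
    return g
```

Because different annotation types share the same nodes, a query such as `build_example().spanning(0.5, "word")` can be answered without knowing anything about the file format the annotations came from.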
Indexing, browsing and searching of digital video
Video is a communications medium that normally brings together moving pictures with a synchronised audio track into a discrete piece or pieces of information. The size of a 'piece' of video can variously be referred to as a frame, a shot, a scene, a clip, a programme or an episode, and these are distinguished by their lengths and by their composition. We shall return to the definition of each of these in section 4 of this chapter. In modern society, video is ver
Open-set Speaker Identification
This study is motivated by the growing need for effective extraction of intelligence and evidence from audio recordings in the fight against crime, a need made ever more apparent by the recent expansion of criminal and terrorist organisations. The main focus is to enhance the open-set speaker identification process within speaker identification systems, which are affected by noisy audio data obtained in uncontrolled environments such as the street, restaurants or other places of business. Consequently, two investigations are carried out first: one on the effects of environmental noise on the accuracy of open-set speaker recognition, thoroughly covering relevant conditions in the considered application areas such as variable training data length, background noise and real-world noise; and one on the effects of short and varied-duration reference data in open-set speaker recognition.
The investigations led to a novel method termed "vowel boosting" to enhance the reliability of speaker identification when operating with varied-duration speech data under uncontrolled conditions. Vowels naturally contain more speaker-specific information. Emphasising this natural phenomenon in speech data therefore enables better identification performance. The traditional state-of-the-art GMM-UBMs and i-vectors are used to evaluate "vowel boosting". The proposed approach boosts the impact of the vowels on the speaker scores, which improves the recognition accuracy for the specific case of open-set identification with short and varied durations of speech material.
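The open-set decision described in this abstract can be sketched as follows. This is an illustrative sketch only: `identify_open_set` shows the generic open-set rule (pick the best-scoring enrolled speaker, then reject if the score is below a threshold), and `weighted_score` is a hypothetical way of giving vowel frames more influence on a speaker score. Neither the function names nor the weighting scheme or the default `vowel_weight` come from the thesis.

```python
from typing import Dict, Optional, Sequence


def weighted_score(frame_scores: Sequence[float],
                   is_vowel: Sequence[bool],
                   vowel_weight: float = 2.0) -> float:
    """Weighted mean of per-frame scores; vowel frames count more.

    Assumption: vowel_weight and the weighted-mean form are illustrative,
    not the scheme used in the study.
    """
    if len(frame_scores) == 0 or len(frame_scores) != len(is_vowel):
        raise ValueError("need equal-length, non-empty inputs")
    weights = [vowel_weight if v else 1.0 for v in is_vowel]
    total = sum(w * s for w, s in zip(weights, frame_scores))
    return total / sum(weights)


def identify_open_set(scores: Dict[str, float],
                      threshold: float) -> Optional[str]:
    """Return the best-matching enrolled speaker, or None for 'unknown'."""
    if not scores:
        return None
    best = max(scores, key=lambda name: scores[name])
    # Open-set step: the best closed-set match is still rejected
    # if its score does not reach the acceptance threshold.
    return best if scores[best] >= threshold else None
```

For example, `identify_open_set({"alice": 1.2, "bob": 0.4}, threshold=1.0)` returns `"alice"`, while lowering Alice's score to 0.7 makes the same call return `None`, meaning the speaker is treated as not enrolled.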