Search CORE

303 research outputs found

Evaluation of preprocessors for neural network speaker verification

Author: Salleh Sheikh-Hussain
Publication venue: The University of Edinburgh
Publication date: 01/01/1997

Microphone smart device fingerprinting from video recordings

Author: BESLAY LAURENT
FERRARA PASQUALE
Publication venue: Publications Office of the European Union
Publication date: 16/01/2018

This report summarizes the ongoing research carried out by DG-JRC within the institutional project Authors and Victims Identification of Child Abuse on-line, concerning the use of microphone fingerprinting for source device classification. Starting from an exhaustive study of the state of the art, it describes a feasibility study on adopting microphone fingerprinting for source identification of video recordings. A set of operational scenarios has been established in collaboration with EUROPOL law enforcers, according to investigators' needs. A critical analysis of the results has demonstrated the feasibility of microphone fingerprinting and has suggested a set of recommendations, both in terms of usability and of future research in the field. JRC.E.3-Cyber and Digital Citizens' Securit

JRC Publications Repository

ROBUST SPEAKER RECOGNITION BASED ON LATENT VARIABLE MODELS

Author: Garcia-Romero Daniel
Publication date: 01/01/2012

Automatic speaker recognition in uncontrolled environments is a very challenging task due to channel distortions, additive noise and reverberation. To address these issues, this thesis studies probabilistic latent variable models of short-term spectral information that leverage large amounts of data to achieve robustness in challenging conditions. Current speaker recognition systems represent an entire speech utterance as a single point in a high-dimensional space. This representation is known as "supervector". This thesis starts by analyzing the properties of this representation. A novel visualization procedure of supervectors is presented by which qualitative insight about the information being captured is obtained. We then propose the use of an overcomplete dictionary to explicitly decompose a supervector into a speaker-specific component and an undesired variability component. An algorithm to learn the dictionary from a large collection of data is discussed and analyzed. A subset of the entries of the dictionary is learned to represent speaker-specific information and another subset to represent distortions. After encoding the supervector as a linear combination of the dictionary entries, the undesired variability is removed by discarding the contribution of the distortion components. This paradigm is closely related to the previously proposed paradigm of Joint Factor Analysis modeling of supervectors. We establish a connection between the two approaches and show how our proposed method provides improvements in terms of computation and recognition accuracy. An alternative way to handle undesired variability in supervector representations is to first project them into a lower dimensional space and then to model them in the reduced subspace. This low-dimensional projection is known as "i-vector". Unfortunately, i-vectors exhibit non-Gaussian behavior, and direct statistical modeling requires the use of heavy-tailed distributions for optimal performance. 
These approaches lack closed-form solutions and are therefore hard to analyze. Moreover, they do not scale well to large datasets. Instead of directly modeling i-vectors, we propose to first apply a non-linear transformation and then use a linear-Gaussian model. We present two alternative transformations and show experimentally that the transformed i-vectors can be optimally modeled by a simple linear-Gaussian model (factor analysis). We evaluate our method on a benchmark dataset with a large amount of channel variability and show that the results compare favorably against the competitors. Also, our approach has closed-form solutions and scales gracefully to large datasets. Finally, a multi-classifier architecture trained in a multicondition fashion is proposed to address the problem of speaker recognition in the presence of additive noise. A large number of experiments are conducted to analyze the proposed architecture and to obtain guidelines for optimal performance in noisy environments. Overall, it is shown that multicondition training of multi-classifier architectures not only produces great robustness in the anticipated conditions, but also generalizes well to unseen conditions.
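
The abstract does not name its two non-linear transformations. As a purely illustrative assumption, the sketch below uses centering followed by scaling to unit length, one simple non-linear map that could be applied before fitting a linear-Gaussian model. The function name and the toy vectors are hypothetical and are not taken from the thesis.

```python
import math

def length_normalize(vectors):
    """Center vectors on their mean, then scale each to unit Euclidean length."""
    dim = len(vectors[0])
    mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    out = []
    for v in vectors:
        centered = [x - m for x, m in zip(v, mean)]
        norm = math.sqrt(sum(x * x for x in centered))
        # Leave an all-zero vector untouched to avoid dividing by zero.
        out.append([x / norm for x in centered] if norm > 0 else centered)
    return out

# Toy 2-D stand-ins for i-vectors (assumed data, real ones have hundreds of dimensions).
toy_ivectors = [[3.0, 4.0], [-3.0, -4.0], [6.0, -8.0], [-6.0, 8.0]]
normalized = length_normalize(toy_ivectors)
```

After this step every vector lies on the unit sphere, so differences in vector length no longer influence the model fitted afterwards.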

CiteSeerX

Digital Repository at the University of Maryland

A knowledge acquisition tool to assist case authoring from texts.

Author: Asiimwe Stella Maris
Publication date: 31/03/2009

Case-Based Reasoning (CBR) is a technique in Artificial Intelligence where a new problem is solved by making use of the solution to a similar past problem situation. People naturally solve problems in this way, without even thinking about it. For example, an occupational therapist (OT) who assesses the needs of a new disabled person may be reminded of a previous person with similar disabilities. The therapist may or may not decide to recommend the same devices, depending on the outcome for that earlier person. Case-based reasoning makes use of a collection of past problem-solving experiences, thus enabling users to exploit the information of others' successes and failures to solve their own problem(s). This project has developed a CBR tool to assist in matching SmartHouse technology to the needs of the elderly and people with disabilities. The tool suggests SmartHouse devices that could assist with given impairments. Textual reports of past SmartHouse problem solving have been used to obtain knowledge for the CBR system. Creating a case-based reasoning system from textual sources is challenging because it requires that the text be interpreted in a meaningful way in order to create cases that are effective in problem-solving and to be able to reasonably interpret queries. Effective case retrieval and query interpretation are only possible if a domain-specific conceptual model is available and if the different meanings that a word can take can be recognised in the text. Approaches based on methods in information retrieval require large amounts of data and typically result in knowledge-poor representations. The costs become prohibitive if an expert is engaged to manually craft cases or hand-tag documents for learning.
Furthermore, hierarchically structured case representations are preferred to flat-structured ones for problem-solving because they allow for comparison at different levels of specificity, which results in more effective retrieval. This project has developed SmartCAT-T, a tool that creates knowledge-rich hierarchically structured cases from semi-structured textual reports. SmartCAT-T highlights important phrases in the textual SmartHouse problem-solving reports and uses the phrases to create a conceptual model of the domain. The model then becomes a standard structure onto which each semi-structured SmartHouse report is mapped in order to obtain the correspondingly structured case. SmartCAT-T also relies on an unsupervised methodology that recognises word synonyms in text. The methodology is used to create a uniform vocabulary for the textual reports, and the resulting harmonised text is used to create the standard conceptual model of the domain. The technique is also employed in query interpretation during problem solving. SmartCAT-T does not require large sets of tagged data for learning, and the concepts in the conceptual model are interpretable, allowing for expert refinement of knowledge. Evaluation results show that the created cases contain knowledge that is useful for problem solving. An improvement in results is also observed when the text and queries are harmonised. A further evaluation highlights a high potential for the techniques developed in this research to be useful in domains other than SmartHouse. All this has been implemented in the Smarter case-based reasoning system.
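
To make the retrieval idea concrete, here is a minimal sketch of case retrieval over harmonised vocabulary. It is not SmartCAT-T: the synonym table, the two cases, the device names and the Jaccard similarity measure are all hypothetical choices made for illustration, and the cases are flat rather than hierarchically structured.

```python
# Hypothetical synonym table mapping variant phrases to one canonical term.
SYNONYMS = {
    "wheelchair-bound": "wheelchair user",
    "poor sight": "visual impairment",
}

def harmonise(terms):
    """Replace each term by its canonical form so that synonyms match."""
    return {SYNONYMS.get(t, t) for t in terms}

def jaccard(a, b):
    """Overlap between two term sets, from 0.0 (disjoint) to 1.0 (identical)."""
    return len(a & b) / len(a | b) if a | b else 0.0

# Each invented case pairs a set of impairments with the devices recommended.
CASES = [
    ({"visual impairment", "wheelchair user"}, ["voice-controlled lighting", "door opener"]),
    ({"hearing impairment"}, ["flashing doorbell"]),
]

def retrieve(query_terms):
    """Return the stored case whose impairments best match the harmonised query."""
    q = harmonise(query_terms)
    return max(CASES, key=lambda case: jaccard(harmonise(case[0]), q))

best = retrieve(["poor sight", "wheelchair-bound"])
```

Without the harmonisation step the query shares no term with either case, which illustrates why a uniform vocabulary matters for retrieval.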

CiteSeerX

Open Access Institutional Repository at Robert Gordon University

Exploring ICMetrics to detect abnormal program behaviour on embedded devices

Author: Appiah
Dongbing Gu
Gareth Howells
Hanilci
Hopkins
Hopkins
Huosheng Hu
Klaus McDonald-Maier
Kofi Appiah
Kovalchuk
Rahmatian
Shoaib Ehsan
Xiaojun Zhai
Publication venue: Elsevier BV
Publication date: 01/01/2015

Execution of unknown or malicious software on an embedded system may trigger harmful system behaviour targeted at stealing sensitive data and/or causing damage to the system. It is thus considered a potential and significant threat to the security of embedded systems. Generally, the resource-constrained nature of commercial off-the-shelf (COTS) embedded devices, such as embedded medical equipment, does not allow computationally expensive protection solutions to be deployed on these devices, rendering them vulnerable. Self-Organising Map (SOM) based and Fuzzy C-means based approaches are proposed in this paper for detecting abnormal program behaviour to boost embedded system security. The presented technique extracts features derived from the processor's Program Counter (PC) and Cycles per Instruction (CPI), and then utilises the features to identify abnormal behaviour using the SOM. Results achieved in our experiment show that the proposed SOM based and Fuzzy C-means based methods can identify unknown program behaviours not included in the training set with 90.9% and 98.7% accuracy, respectively.
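
A minimal sketch of SOM-style anomaly detection is given below, under stated assumptions: the feature pairs are invented stand-ins for normalised (PC-derived, CPI) values, the map is a tiny one-dimensional chain of three units, and a sample is flagged when its distance to the nearest unit exceeds a margin over the worst training distance. None of these choices is taken from the paper.

```python
import math

def nearest(prototypes, x):
    """Return (index, distance) of the prototype closest to x."""
    dists = [math.dist(p, x) for p in prototypes]
    i = min(range(len(dists)), key=dists.__getitem__)
    return i, dists[i]

def train_som(samples, n_units=3, epochs=20, lr=0.5):
    """Train a tiny 1-D SOM; units start at the first n_units samples."""
    protos = [list(s) for s in samples[:n_units]]
    for epoch in range(epochs):
        rate = lr * (1 - epoch / epochs)  # learning rate decays linearly
        for x in samples:
            winner, _ = nearest(protos, x)
            for j, p in enumerate(protos):
                # Winner gets the full update, direct neighbours half of it.
                h = 1.0 if j == winner else 0.5 if abs(j - winner) == 1 else 0.0
                for d in range(len(p)):
                    p[d] += rate * h * (x[d] - p[d])
    return protos

# Assumed training data: feature pairs recorded from known-good programs.
normal = [[0.10, 0.20], [0.15, 0.25], [0.12, 0.22],
          [0.80, 0.90], [0.85, 0.88], [0.82, 0.93]]
som = train_som(normal)

# Flag anything farther from the map than 1.5x the worst training distance.
threshold = 1.5 * max(nearest(som, x)[1] for x in normal)

def is_abnormal(x):
    """True when x lies farther from every SOM unit than the threshold."""
    return nearest(som, x)[1] > threshold
```

Calling `is_abnormal([10.0, 10.0])` flags the sample, because it lies far outside the region covered by the training data, while any training sample passes.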

University of Essex Research Repository

Crossref

Southampton (e-Prints Soton)

Nottingham Trent Institutional Repository (IRep)

Sheffield Hallam University Research Archive

Kent Academic Repository

UDORA - University of Derby Online Research Archive

Semi-continuous hidden Markov models for automatic speaker verification

Author: Forsyth Mark Eric
Publication venue: The University of Edinburgh
Publication date: 01/01/1995

Edinburgh Research Archive

Audio Splicing Detection and Localization Based on Acquisition Device Traces

Author: Aichroth P.
Bestagini P.
Cuccovillo L.
Leonzio D. U.
Marcon M.
Tubaro S.
Publication date: 01/01/2023

In recent years, the multimedia forensic community has put great effort into developing solutions to assess the integrity and authenticity of multimedia objects, focusing especially on manipulations applied by means of advanced deep learning techniques. However, in addition to complex forgeries such as deepfakes, very simple yet effective manipulation techniques that do not involve any state-of-the-art editing tools still exist and prove dangerous. This is the case of audio splicing for speech signals, i.e., concatenating and combining multiple speech segments obtained from different recordings of a person in order to create a new fake speech. Indeed, by simply adding a few words to an existing speech we can completely alter its meaning. In this work, we address the overlooked problem of detection and localization of audio splicing from different models of acquisition devices. Our goal is to determine whether an audio track under analysis is pristine, or whether it has been manipulated by splicing one or multiple segments obtained from different device models. Moreover, if a recording is detected as spliced, we identify where the modification has been introduced in the temporal dimension. The proposed method is based on a Convolutional Neural Network (CNN) that extracts model-specific features from the audio recording. After extracting the features, we determine whether there has been a manipulation through a clustering algorithm. Finally, we identify the point where the modification has been introduced through a distance-measuring technique. The proposed method can detect and localize multiple splicing points within a recording.
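
The localization step can be illustrated with a simplified sketch. It assumes that one feature vector per analysis window has already been extracted (the vectors below are invented, not CNN outputs), and it swaps the paper's clustering stage for a plain distance threshold between consecutive windows; the function name and the threshold value are hypothetical.

```python
import math

def splice_points(features, threshold):
    """Return indices i where window i differs from window i-1 by more than threshold."""
    return [i for i in range(1, len(features))
            if math.dist(features[i - 1], features[i]) > threshold]

# Assumed per-window features: windows 3 and 4 come from a different device.
windows = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0],
           [5.0, 5.1], [5.1, 5.0], [0.1, 0.1]]

points = splice_points(windows, threshold=1.0)
is_spliced = bool(points)  # any boundary found means the track is not pristine
```

Here `points` marks the windows where the device trace changes, so a single inserted segment yields two boundaries, one where it starts and one where the original recording resumes.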

Archivio istituzionale della ricerca - Politecnico di Milano

Optimizing spectral feature based text-independent speaker recognition

Author: Kinnunen Tomi H.
Publication venue: University of Joensuu

UEF Electronic Publications