
    Bringing Introspection into BlobSeer: Towards a Self-Adaptive Distributed Data Management System

    Introspection is the prerequisite of autonomic behavior, the first step towards performance improvement and resource-usage optimization for large-scale distributed systems. In Grid environments, the task of observing application behavior is assigned to monitoring systems. However, most of them are designed to provide general resource information and do not consider specific information for higher-level services. More precisely, in the context of data-intensive applications, a specific introspection layer is required to collect data about the usage of storage resources, data access patterns, etc. This paper discusses the requirements for an introspection layer in a data-management system for large-scale distributed infrastructures. We focus on the case of BlobSeer, a large-scale distributed system for storing massive data. The paper explains why and how to enhance BlobSeer with introspective capabilities and proposes a three-layered architecture relying on the MonALISA monitoring framework. We illustrate the autonomic behavior of BlobSeer with a self-configuration component that aims to provide storage elasticity by dynamically scaling the number of data providers. We then propose a preliminary approach for enabling self-protection in the BlobSeer system, through a malicious-client detection component. The introspective architecture has been evaluated on the Grid'5000 testbed, with experiments that prove the feasibility of generating relevant information about the state and behavior of the system.
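The storage-elasticity idea above can be sketched as a simple control loop: the monitoring layer reports aggregate storage utilization, and a self-configuration rule scales the number of data providers up or down. The function name, thresholds, and scaling step below are illustrative assumptions, not BlobSeer's actual policy.

```python
# Hypothetical sketch of the self-configuration rule described in the
# abstract: scale the data-provider count based on storage utilization
# reported by the monitoring layer. All thresholds are assumptions.

def scale_providers(current: int, utilization: float,
                    high: float = 0.8, low: float = 0.3,
                    min_providers: int = 1) -> int:
    """Return the new provider count for one control-loop iteration."""
    if utilization > high:                       # storage pressure: scale out
        return current + 1
    if utilization < low and current > min_providers:
        return current - 1                       # underused: release a provider
    return current                               # within band: no change

# One monitoring tick each: scale out, scale in, hold steady.
print(scale_providers(4, 0.85))  # 5
print(scale_providers(5, 0.25))  # 4
print(scale_providers(3, 0.50))  # 3
```

A real deployment would damp this rule (e.g. with hysteresis or a cooldown) so that short utilization spikes do not cause provider churn.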

    Explainable Information Retrieval: A Survey

    Explainable information retrieval is an emerging research area aiming to make information retrieval systems transparent and trustworthy. Given the increasing use of complex machine learning models in search systems, explainability is essential in building and auditing responsible information retrieval models. This survey fills a vital gap in the otherwise topically diverse literature of explainable information retrieval. It categorizes and discusses recent explainability methods developed for different application domains in information retrieval, providing a common framework and unifying perspectives. In addition, it reflects on the common concern of evaluating explanations and highlights open challenges and opportunities.

    Comment: 35 pages, 10 figures. Under review.

    Deep Generative Variational Autoencoding for Replay Spoof Detection in Automatic Speaker Verification

    Automatic speaker verification (ASV) systems are highly vulnerable to presentation attacks, also called spoofing attacks. Replay is among the simplest attacks to mount, yet difficult to detect reliably. The generalization failure of spoofing countermeasures (CMs) has driven the community to study various alternative deep learning CMs. The majority of them are supervised approaches that learn a human-spoof discriminator. In this paper, we advocate a different, deep generative approach that leverages powerful unsupervised manifold learning for classification. The potential benefits include the possibility to sample new data and to obtain insights into the latent features of genuine and spoofed speech. To this end, we propose to use variational autoencoders (VAEs) as an alternative backend for replay attack detection, via three alternative models that differ in their class-conditioning. The first one, similar to the use of Gaussian mixture models (GMMs) in spoof detection, is to train two VAEs independently, one for each class. The second one is to train a single conditional model (C-VAE) by injecting a one-hot class label vector into the encoder and decoder networks. Our final proposal integrates an auxiliary classifier to guide the learning of the latent space. Our experimental results using constant-Q cepstral coefficient (CQCC) features on the ASVspoof 2017 and 2019 physical access subtask datasets indicate that the C-VAE offers substantial improvement over training two separate VAEs, one per class. On the 2019 dataset, the C-VAE outperforms the VAE and the baseline GMM by an absolute 9-10% in both equal error rate (EER) and tandem detection cost function (t-DCF) metrics. Finally, we propose VAE residuals, the absolute difference between the original input and its reconstruction, as features for spoofing detection. The proposed frontend approach, augmented with a convolutional neural network classifier, demonstrated substantial improvement over the VAE backend use case.
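The VAE-residual feature from the final proposal can be illustrated in a few lines: it is the element-wise absolute difference between the input feature matrix (e.g. CQCC frames) and the decoder's reconstruction. The toy "reconstruction" below is a noisy copy of the input, standing in for a trained VAE's output; shapes and names are assumptions for illustration.

```python
import numpy as np

def vae_residual(x: np.ndarray, x_hat: np.ndarray) -> np.ndarray:
    """Residual features |x - x_hat|: same shape as the input frames."""
    return np.abs(x - x_hat)

rng = np.random.default_rng(0)
x = rng.normal(size=(10, 30))                       # 10 frames x 30 CQCC coefficients
x_hat = x + rng.normal(scale=0.1, size=x.shape)     # stand-in for a VAE reconstruction

r = vae_residual(x, x_hat)
print(r.shape)          # (10, 30)
print(bool((r >= 0).all()))  # True: residuals are magnitudes
```

In the paper's setup, a residual map like `r` would then be fed to a convolutional classifier instead of the raw features.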

    Lexical Strategies of Chinese Learners of English in L1-L2 Translation

    This dissertation investigates the use of lexical search strategies in L1-to-L2 writing by advanced and lower-intermediate Chinese learners of English as a foreign language. The aims of the study were: (a) to describe and analyze the lexical strategies Chinese learners of English use when they attempt to translate a word or phrase whose form or meaning they cannot produce in a written text; (b) to examine whether there are relationships between lexical strategies and L2 proficiency; (c) to determine the effectiveness of different strategy types for groups at different proficiency levels; (d) to identify group and individual preferences for L2-based strategies; (e) to describe the discrepancy between L2 proficiency levels and translation ability, and to characterize learners' thinking and lexical strategies when translating from L1 into L2. Think-aloud and retrospective methods were adopted to collect empirical data. All think-aloud protocols and retrospective data were recorded and analyzed to identify the lexical strategies used by Chinese learners of English at different proficiency levels. Strategy use across proficiency groups was analyzed statistically, and the significance of differences in the use of lexical strategies was established with inferential statistics.
    Analyzing the data against the theoretical background of the bilingual mental lexicon (De Bot, 1993), language transfer (Odlin, 1989; Ringbom, 1987, 1991, 2001), and communication strategies (Bialystok, 1990; Kasper & Kellerman, 1997; Tarone, 1983), the study obtained the following results: (1) a taxonomy of the lexical strategies of lower-intermediate and advanced Chinese learners of English was established; (2) advanced learners preferred L2-based strategies, while lower-intermediate learners chose strategies drawn from their mother tongue; (3) learners may draw on both L1- and L2-based strategies; (4) the effectiveness of lexical strategies depends on "the ease of comprehension" (Littlemore, 2003); (5) noun-plus-noun compound structures were used more by lower-intermediate learners, though advanced learners also favored this technique, because Chinese makes extensive use of compounding; (6) higher L2 proficiency does not necessarily yield a better translation; other factors also play a role. The study concludes that Chinese learners of English at different L2 proficiency levels use a combination of lexical strategies, with preferences that differ between individuals and proficiency levels. The teaching implications of the study could aid learners' word searches in the mental lexicon, so teaching lexical strategies is worthwhile (Zimmermann, 1999). Although the findings contribute to a better understanding of L2 acquisition and bilingualism, the study is naturally limited and calls for further research.

    Threat Detection with Computer Vision

    Internship Report presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Business Analytics.
    This document describes the work conducted during an internship at the AI Innovation Department of Everis UK (now NTT Data). It reports what was done, learned, and developed with the sole objective of delivering a commercial product solution for the company's clients. The primary goal was to implement a solution in retail stores to assist the security team with threat detection. The solution consists of deploying trained deep learning models onto hardware connected to the CCTV security cameras and detecting potential threats in the live feed. By the time I started working on this project, it was at an advanced stage, so I had to study all the work previously done to understand what was needed and to integrate fully into the team. My contribution was focused on the model training process, where I had to create and structure a dataset and train a model capable of detecting the targeted classes quickly and accurately.

    Optimizing Real Time fMRI Neurofeedback for Therapeutic Discovery and Development [preprint]

    While reducing the burden of brain disorders remains a top priority of organizations like the World Health Organization and National Institutes of Health (BRAIN, 2013), the development of novel, safe, and effective treatments for brain disorders has been slow. In this paper, we describe the state of the science for an emerging technology, real-time functional magnetic resonance imaging (rtfMRI) neurofeedback, in clinical neurotherapeutics. We review the scientific potential of rtfMRI and outline research strategies to optimize the development and application of rtfMRI neurofeedback as a next-generation therapeutic tool. We propose that rtfMRI can be used to address a broad range of clinical problems by improving our understanding of brain-behavior relationships in order to develop more specific and effective interventions for individuals with brain disorders. We focus on the use of rtfMRI neurofeedback as a clinical neurotherapeutic tool to drive plasticity in brain function, cognition, and behavior. Our overall goal is for rtfMRI to advance personalized assessment and intervention approaches that enhance resilience and reduce morbidity by correcting maladaptive patterns of brain function in those with brain disorders.

    Dynamic data shapers optimize performance in Dynamic Binary Optimization (DBO) environment

    Processor hardware has been architected with the assumption that most data access patterns would be linearly spatial in nature. But most applications involve algorithms that are designed with optimal efficiency in mind, which results in non-spatial, multi-dimensional data access. Moreover, this data view or access pattern changes dynamically in different program phases. This results in a mismatch between the processor hardware's view of data and the algorithmic view of data, leading to significant memory access bottlenecks. This variation in data views is especially pronounced in applications involving large datasets, leading to significantly increased latency and user response times. Previous attempts to tackle this problem were primarily targeted at execution-time optimization. We present a dynamic technique, piggybacked on classical dynamic binary optimization (DBO), to shape the data view for each program phase differently, reducing program execution time along with access energy. Our implementation rearranges non-adjacent data into a contiguous data view. It uses wrappers to replace irregular data access patterns with a spatially local data view. HDTrans, a runtime dynamic binary optimization framework, has been used to perform runtime instrumentation and dynamic data optimization to achieve this goal. This scheme not only ensures a reduced program execution time, but also results in lower energy use. Some of the commonly used benchmarks from the SPEC 2006 suite were profiled to determine irregular data accesses from procedures that contributed heavily to the overall execution time. Wrappers built to replace these accesses with spatially adjacent data led to a significant improvement in the total execution time. On average, a 20% reduction in time was achieved along with a 5% reduction in energy.
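The core data-shaping idea can be sketched independently of the HDTrans machinery: elements touched by a strided, non-spatial access pattern are copied into a contiguous "data view" so the hot loop then walks memory linearly. The function names and the column-walk example below are illustrative assumptions, not the paper's actual wrappers.

```python
# Illustrative sketch (not the HDTrans implementation) of data shaping:
# gather the elements hit by an irregular access pattern into a
# contiguous view, then run the kernel over that spatially local copy.

def build_dataview(data, indices):
    """Copy the irregularly accessed elements into a contiguous list."""
    return [data[i] for i in indices]

def hot_loop(view):
    """The wrapped kernel now iterates over a spatially local view."""
    return sum(view)

# Column walk of a 4x4 row-major matrix: stride-4 accesses 0, 4, 8, 12.
flat = list(range(16))
col0 = build_dataview(flat, range(0, 16, 4))
print(col0)            # [0, 4, 8, 12]
print(hot_loop(col0))  # 24
```

The copy has a one-time cost; the technique pays off when the shaped view is reused across many iterations of the phase, which is why the paper targets procedures that dominate execution time.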
