93 research outputs found

    A quick search method for audio signals based on a piecewise linear representation of feature trajectories

    Full text link
    This paper presents a new method for a quick similarity-based search through long unlabeled audio streams to detect and locate audio clips provided by users. The method involves feature-dimension reduction based on a piecewise linear representation of a sequential feature trajectory extracted from a long audio stream. Two techniques enable us to obtain a piecewise linear representation: the dynamic segmentation of feature trajectories and the segment-based Karhunen-L\'{o}eve (KL) transform. The proposed search method guarantees the same search results as the search method without the proposed feature-dimension reduction method in principle. Experiment results indicate significant improvements in search speed. For example the proposed method reduced the total search time to approximately 1/12 that of previous methods and detected queries in approximately 0.3 seconds from a 200-hour audio database.Comment: 20 pages, to appear in IEEE Transactions on Audio, Speech and Language Processin

    Handling Massive N-Gram Datasets Efficiently

    Get PDF
    This paper deals with the two fundamental problems concerning the handling of large n-gram language models: indexing, that is compressing the n-gram strings and associated satellite data without compromising their retrieval speed; and estimation, that is computing the probability distribution of the strings from a large textual source. Regarding the problem of indexing, we describe compressed, exact and lossless data structures that achieve, at the same time, high space reductions and no time degradation with respect to state-of-the-art solutions and related software packages. In particular, we present a compressed trie data structure in which each word following a context of fixed length k, i.e., its preceding k words, is encoded as an integer whose value is proportional to the number of words that follow such context. Since the number of words following a given context is typically very small in natural languages, we lower the space of representation to compression levels that were never achieved before. Despite the significant savings in space, our technique introduces a negligible penalty at query time. Regarding the problem of estimation, we present a novel algorithm for estimating modified Kneser-Ney language models, that have emerged as the de-facto choice for language modeling in both academia and industry, thanks to their relatively low perplexity performance. Estimating such models from large textual sources poses the challenge of devising algorithms that make a parsimonious use of the disk. The state-of-the-art algorithm uses three sorting steps in external memory: we show an improved construction that requires only one sorting step thanks to exploiting the properties of the extracted n-gram strings. With an extensive experimental analysis performed on billions of n-grams, we show an average improvement of 4.5X on the total running time of the state-of-the-art approach.Comment: Published in ACM Transactions on Information Systems (TOIS), February 2019, Article No: 2

    A User Oriented Image Retrieval System using Halftoning BBTC

    Get PDF
    The objective of this paper is to develop a system for content based image retrieval (CBIR) by accomplishing the benefits of low complexity Ordered Dither Block Truncation Coding based on half toning technique for the generation of image content descriptor. In the encoding step ODBTC compresses an image block into corresponding quantizes and bitmap image. Two image features are proposed to index an image namely co-occurrence features and bitmap patterns which are generated using ODBTC encoded data streams without performing the decoding process. The CCF and BPF of an image are simply derived from the two quantizes and bitmap respectively by including visual codebooks. The proposed system based on block truncation coding image retrieval method is not only convenient for an image compression but it also satisfy the demands of users by offering effective descriptor to index images in CBIR system

    Break-Resilient Codes for Forensic 3D Fingerprinting

    Full text link
    3D printing brings about a revolution in consumption and distribution of goods, but poses a significant risk to public safety. Any individual with internet access and a commodity printer can now produce untraceable firearms, keys, and dangerous counterfeit products. To aid government authorities in combating these new security threats, objects are often tagged with identifying information. This information, also known as fingerprints, is written into the object using various bit embedding techniques, such as varying the width of the molten thermoplastic layers. Yet, due to the adversarial nature of the problem, it is important to devise tamper resilient fingerprinting techniques, so that the fingerprint could be extracted even if the object was damaged. While fingerprinting various forms of digital media (such as videos, images, etc.) has been studied extensively in the past, 3D printing is a relatively new medium which is exposed to different types of adversarial physical tampering that do not exist in the digital world. This paper focuses on one such type of adversarial tampering, where the adversary breaks the object to at most a certain number of parts. This gives rise to a new adversarial coding problem, which is formulated and investigated herein. We survey the existing technology, present an abstract problem definition, provide lower bounds for the required redundancy, and construct a code which attains it up to asymptotically small factors. Notably, the problem bears some resemblance to the torn paper channel, which was recently studied for applications in DNA storage

    Cryptanalysis of the Fuzzy Vault for Fingerprints: Vulnerabilities and Countermeasures

    Get PDF
    Das Fuzzy Vault ist ein beliebter Ansatz, um die Minutien eines menschlichen Fingerabdrucks in einer Sicherheitsanwendung geschützt zu speichern. In dieser Arbeit werden verschiedene Implementationen des Fuzzy Vault für Fingerabdrücke in verschiedenen Angriffsszenarien untersucht. Unsere Untersuchungen und Analysen bestätigen deutlich, dass die größte Schwäche von Implementationen des Fingerabdruck Fuzzy Vaults seine hohe Anfälligkeit gegen False-Accept Angriffe ist. Als Gegenmaßnahme könnten mehrere Finger oder sogar mehrere biometrische Merkmale eines Menschen gleichzeitig verwendet werden. Allerdings besitzen traditionelle Fuzzy Vault Konstruktionen eine wesentliche Schwäche: den Korrelationsangriff. Es ist bekannt, dass das Runden von Minutien auf ein starres System, diese Schwäche beheben. Ausgehend davon schlagen wir eine Implementation vor. Würden nun Parameter traditioneller Konstruktionen übernommen, so würden wir einen signifikanten Verlust an Verifikations-Leistung hinnehmen müssen. In einem Training wird daher eine gute Parameterkonfiguration neu bestimmt. Um den Authentifizierungsaufwand praktikabel zu machen, verwenden wir einen randomisierten Dekodierer und zeigen, dass die erreichbaren Raten vergleichbar mit den Raten einer traditionellen Konstruktion sind. Wir folgern, dass das Fuzzy Vault ein denkbarer Ansatz bleibt, um die schwierige Aufgabe ein kryptographisch sicheres biometrisches Kryptosystem in Zukunft zu implementieren.The fuzzy fingerprint vault is a popular approach to protect a fingerprint's minutiae as a building block of a security application. In this thesis simulations of several attack scenarios are conducted against implementations of the fuzzy fingerprint vault from the literature. Our investigations clearly confirm that the weakest link in the fuzzy fingerprint vault is its high vulnerability to false-accept attacks. Therefore, multi-finger or even multi-biometric cryptosystems should be conceived. But there remains a risk that cannot be resolved by using more biometric information of an individual if features are protected using a traditional fuzzy vault construction: The correlation attack remains a weakness of such constructions. It is known that quantizing minutiae to a rigid system while filling the whole space with chaff makes correlation obsolete. Based on this approach, we propose an implementation. If parameters were adopted from a traditional fuzzy fingerprint vault implementation, we would experience a significant loss in authentication performance. Therefore, we perform a training to determine reasonable parameters for our implementation. Furthermore, to make authentication practical, the decoding procedure is proposed to be randomized. By running a performance evaluation on a dataset generally used, we find that achieving resistance against the correlation attack does not have to be at the cost of authentication performance. Finally, we conclude that fuzzy vault remains a possible construction for helping in solving the challenging task of implementing a cryptographically secure multi-biometric cryptosystem in future

    Framework for privacy-aware content distribution in peer-to- peer networks with copyright protection

    Get PDF
    The use of peer-to-peer (P2P) networks for multimedia distribution has spread out globally in recent years. This mass popularity is primarily driven by the efficient distribution of content, also giving rise to piracy and copyright infringement as well as privacy concerns. An end user (buyer) of a P2P content distribution system does not want to reveal his/her identity during a transaction with a content owner (merchant), whereas the merchant does not want the buyer to further redistribute the content illegally. Therefore, there is a strong need for content distribution mechanisms over P2P networks that do not pose security and privacy threats to copyright holders and end users, respectively. However, the current systems being developed to provide copyright and privacy protection to merchants and end users employ cryptographic mechanisms, which incur high computational and communication costs, making these systems impractical for the distribution of big files, such as music albums or movies.El uso de soluciones de igual a igual (peer-to-peer, P2P) para la distribución multimedia se ha extendido mundialmente en los últimos años. La amplia popularidad de este paradigma se debe, principalmente, a la distribución eficiente de los contenidos, pero también da lugar a la piratería, a la violación del copyright y a problemas de privacidad. Un usuario final (comprador) de un sistema de distribución de contenidos P2P no quiere revelar su identidad durante una transacción con un propietario de contenidos (comerciante), mientras que el comerciante no quiere que el comprador pueda redistribuir ilegalmente el contenido más adelante. Por lo tanto, existe una fuerte necesidad de mecanismos de distribución de contenidos por medio de redes P2P que no supongan un riesgo de seguridad y privacidad a los titulares de derechos y los usuarios finales, respectivamente. Sin embargo, los sistemas actuales que se desarrollan con el propósito de proteger el copyright y la privacidad de los comerciantes y los usuarios finales emplean mecanismos de cifrado que implican unas cargas computacionales y de comunicaciones muy elevadas que convierten a estos sistemas en poco prácticos para distribuir archivos de gran tamaño, tales como álbumes de música o películas.L'ús de solucions d'igual a igual (peer-to-peer, P2P) per a la distribució multimèdia s'ha estès mundialment els darrers anys. L'àmplia popularitat d'aquest paradigma es deu, principalment, a la distribució eficient dels continguts, però també dóna lloc a la pirateria, a la violació del copyright i a problemes de privadesa. Un usuari final (comprador) d'un sistema de distribució de continguts P2P no vol revelar la seva identitat durant una transacció amb un propietari de continguts (comerciant), mentre que el comerciant no vol que el comprador pugui redistribuir il·legalment el contingut més endavant. Per tant, hi ha una gran necessitat de mecanismes de distribució de continguts per mitjà de xarxes P2P que no comportin un risc de seguretat i privadesa als titulars de drets i els usuaris finals, respectivament. Tanmateix, els sistemes actuals que es desenvolupen amb el propòsit de protegir el copyright i la privadesa dels comerciants i els usuaris finals fan servir mecanismes d'encriptació que impliquen unes càrregues computacionals i de comunicacions molt elevades que fan aquests sistemes poc pràctics per a distribuir arxius de grans dimensions, com ara àlbums de música o pel·lícules

    Privacy in Biometric Systems

    Get PDF
    Biometrics are physiological and/or behavioral characteristics of a person that have been used to provide an automatic proof of identity in a growing list of applications including crime/terrorism fighting, forensics, access and border control, securing e-/m-commerce transactions and service entitlements. In recent years, a great deal of research into a variety of new and traditional biometrics has widened the scope of investigations beyond improving accuracy into mechanisms that deal with serious concerns raised about the potential misuse of collected biometric data. Despite the long list of biometrics’ benefits, privacy concerns have become widely shared due to the fact that every time the biometric of a person is checked, a trace is left that could reveal personal and confidential information. In fact, biometric-based recognition has an inherent privacy problem as it relies on capturing, analyzing, and storing personal data about us as individuals. For example, biometric systems deal with data related to the way we look (face, iris), the way we walk (gait), the way we talk (speaker recognition), the way we write (handwriting), the way we type on a keyboard (keystroke), the way we read (eye movement), and many more. Privacy has become a serious concern for the public as biometric systems are increasingly deployed in many applications ranging from accessing our account on a Smartphone or computer to border control and national biometric cards on a very large scale. For example, the Unique Identification Authority of India (UIDAI) has issued 56 million biometric cards as of January 2014 [1], where each biometric card holds templates of the 10 fingers, the two irises and the face. An essential factor behind the growing popularity of biometrics in recent years is the fact that biometric sensors have become a lot cheaper as well as easier to install and handle. CCTV cameras are installed nearly everywhere and almost all Smartphones are equipped with a camera, microphone, fingerprint scanner, and probably very soon, an iris scanner
    corecore