229 research outputs found

    Homomorphic Encryption for Speaker Recognition: Protection of Biometric Templates and Vendor Model Parameters

    Full text link
    Data privacy is crucial when dealing with biometric data. Accounting for the latest European data privacy regulation and payment service directive, biometric template protection is essential for any commercial application. Ensuring unlinkability across biometric service operators, irreversibility of leaked encrypted templates, and renewability of e.g., voice models following the i-vector paradigm, biometric voice-based systems are prepared for the latest EU data privacy legislation. Employing Paillier cryptosystems, Euclidean and cosine comparators are known to ensure data privacy demands, without loss of discrimination nor calibration performance. Bridging gaps from template protection to speaker recognition, two architectures are proposed for the two-covariance comparator, serving as a generative model in this study. The first architecture preserves privacy of biometric data capture subjects. In the second architecture, model parameters of the comparator are encrypted as well, such that biometric service providers can supply the same comparison modules employing different key pairs to multiple biometric service operators. An experimental proof-of-concept and complexity analysis is carried out on the data from the 2013-2014 NIST i-vector machine learning challenge

    Ontology-based Access Control in Open Scenarios: Applications to Social Networks and the Cloud

    Get PDF
    La integració d'Internet a la societat actual ha fet possible compartir fàcilment grans quantitats d'informació electrònica i recursos informàtics (que inclouen maquinari, serveis informàtics, etc.) en entorns distribuïts oberts. Aquests entorns serveixen de plataforma comuna per a usuaris heterogenis (per exemple, empreses, individus, etc.) on es proporciona allotjament d'aplicacions i sistemes d'usuari personalitzades; i on s'ofereix un accés als recursos compartits des de qualsevol lloc i amb menys esforços administratius. El resultat és un entorn que permet a individus i empreses augmentar significativament la seva productivitat. Com ja s'ha dit, l'intercanvi de recursos en entorns oberts proporciona importants avantatges per als diferents usuaris, però, també augmenta significativament les amenaces a la seva privacitat. Les dades electròniques compartides poden ser explotades per tercers (per exemple, entitats conegudes com "Data Brokers"). Més concretament, aquestes organitzacions poden agregar la informació compartida i inferir certes característiques personals sensibles dels usuaris, la qual cosa pot afectar la seva privacitat. Una manera de del.liar aquest problema consisteix a controlar l'accés dels usuaris als recursos potencialment sensibles. En concret, la gestió de control d'accés regula l'accés als recursos compartits d'acord amb les credencials dels usuaris, el tipus de recurs i les preferències de privacitat dels propietaris dels recursos/dades. La gestió eficient de control d'accés és crucial en entorns grans i dinàmics. D'altra banda, per tal de proposar una solució viable i escalable, cal eliminar la gestió manual de regles i restriccions (en la qual, la majoria de les solucions disponibles depenen), atès que aquesta constitueix una pesada càrrega per a usuaris i administradors . Finalment, la gestió del control d'accés ha de ser intuïtiu per als usuaris finals, que en general no tenen grans coneixements tècnics.La integración de Internet en la sociedad actual ha hecho posible compartir fácilmente grandes cantidades de información electrónica y recursos informáticos (que incluyen hardware, servicios informáticos, etc.) en entornos distribuidos abiertos. Estos entornos sirven de plataforma común para usuarios heterogéneos (por ejemplo, empresas, individuos, etc.) donde se proporciona alojamiento de aplicaciones y sistemas de usuario personalizadas; y donde se ofrece un acceso ubicuo y con menos esfuerzos administrativos a los recursos compartidos. El resultado es un entorno que permite a individuos y empresas aumentar significativamente su productividad. Como ya se ha dicho, el intercambio de recursos en entornos abiertos proporciona importantes ventajas para los distintos usuarios, no obstante, también aumenta significativamente las amenazas a su privacidad. Los datos electrónicos compartidos pueden ser explotados por terceros (por ejemplo, entidades conocidas como “Data Brokers”). Más concretamente, estas organizaciones pueden agregar la información compartida e inferir ciertas características personales sensibles de los usuarios, lo cual puede afectar a su privacidad. Una manera de paliar este problema consiste en controlar el acceso de los usuarios a los recursos potencialmente sensibles. En concreto, la gestión de control de acceso regula el acceso a los recursos compartidos de acuerdo con las credenciales de los usuarios, el tipo de recurso y las preferencias de privacidad de los propietarios de los recursos/datos. La gestión eficiente de control de acceso es crucial en entornos grandes y dinámicos. Por otra parte, con el fin de proponer una solución viable y escalable, es necesario eliminar la gestión manual de reglas y restricciones (en la cual, la mayoría de las soluciones disponibles dependen), dado que ésta constituye una pesada carga para usuarios y administradores. Por último, la gestión del control de acceso debe ser intuitivo para los usuarios finales, que por lo general carecen de grandes conocimientos técnicos.Thanks to the advent of the Internet, it is now possible to easily share vast amounts of electronic information and computer resources (which include hardware, computer services, etc.) in open distributed environments. These environments serve as a common platform for heterogeneous users (e.g., corporate, individuals etc.) by hosting customized user applications and systems, providing ubiquitous access to the shared resources and requiring less administrative efforts; as a result, they enable users and companies to increase their productivity. Unfortunately, sharing of resources in open environments has significantly increased the privacy threats to the users. Indeed, shared electronic data may be exploited by third parties, such as Data Brokers, which may aggregate, infer and redistribute (sensitive) personal features, thus potentially impairing the privacy of the individuals. A way to palliate this problem consists on controlling the access of users over the potentially sensitive resources. Specifically, access control management regulates the access to the shared resources according to the credentials of the users, the type of resource and the privacy preferences of the resource/data owners. The efficient management of access control is crucial in large and dynamic environments such as the ones described above. Moreover, in order to propose a feasible and scalable solution, we need to get rid of manual management of rules/constraints (in which most available solutions rely) that constitutes a serious burden for the users and the administrators. Finally, access control management should be intuitive for the end users, who usually lack technical expertise, and they may find access control mechanism more difficult to understand and rigid to apply due to its complex configuration settings

    Content-Based Access Control

    Get PDF
    In conventional database, the most popular access control model specifies policies explicitly for each role of every user against each data object manually. Nowadays, in large-scale content-centric data sharing, conventional approaches could be impractical due to exponential explosion of the data growth and the sensitivity of data objects. What's more, conventional database access control policy will not be functional when the semantic content of data is expected to play a role in access decisions. Users are often over-privileged, and ex post facto auditing is enforced to detect misuse of the privileges. Unfortunately, it is usually difficult to reverse the damage, as (large amount of) data has been disclosed already. In this dissertation, we first introduce Content-Based Access Control (CBAC), an innovative access control model for content-centric information sharing. As a complement to conventional access control models, the CBAC model makes access control decisions based on the content similarity between user credentials and data content automatically. In CBAC, each user is allowed by a metarule to access "a subset" of the designated data objects of a content-centric database, while the boundary of the subset is dynamically determined by the textual content of data objects. We then present an enforcement mechanism for CBAC that exploits Oracles Virtual Private Database (VPD) to implement a row-wise access control and to prevent data objects from being abused by unnecessary access admission. To further improve the performance of the proposed approach, we introduce a content-based blocking mechanism to improve the efficiency of CBAC enforcement to further reveal a more relevant part of the data objects comparing with only using the user credentials and data content. We also utilized several tagging mechanisms for more accurate textual content matching for short text snippets (e.g. short VarChar attributes) to extract topics other than pure word occurrences to represent the content of data. In the tagging mechanism, the similarity of content is calculated not purely dependent on the word occurrences but the semantic topics underneath the text content. Experimental results show that CBAC makes accurate access control decisions with a small overhead

    Interaction Testing, Fault Location, and Anonymous Attribute-Based Authorization

    Get PDF
    abstract: This dissertation studies three classes of combinatorial arrays with practical applications in testing, measurement, and security. Covering arrays are widely studied in software and hardware testing to indicate the presence of faulty interactions. Locating arrays extend covering arrays to achieve identification of the interactions causing a fault by requiring additional conditions on how interactions are covered in rows. This dissertation introduces a new class, the anonymizing arrays, to guarantee a degree of anonymity by bounding the probability a particular row is identified by the interaction presented. Similarities among these arrays lead to common algorithmic techniques for their construction which this dissertation explores. Differences arising from their application domains lead to the unique features of each class, requiring tailoring the techniques to the specifics of each problem. One contribution of this work is a conditional expectation algorithm to build covering arrays via an intermediate combinatorial object. Conditional expectation efficiently finds intermediate-sized arrays that are particularly useful as ingredients for additional recursive algorithms. A cut-and-paste method creates large arrays from small ingredients. Performing transformations on the copies makes further improvements by reducing redundancy in the composed arrays and leads to fewer rows. This work contains the first algorithm for constructing locating arrays for general values of dd and tt. A randomized computational search algorithmic framework verifies if a candidate array is (dˉ,t)(\bar{d},t)-locating by partitioning the search space and performs random resampling if a candidate fails. Algorithmic parameters determine which columns to resample and when to add additional rows to the candidate array. Additionally, analysis is conducted on the performance of the algorithmic parameters to provide guidance on how to tune parameters to prioritize speed, accuracy, or a combination of both. This work proposes anonymizing arrays as a class related to covering arrays with a higher coverage requirement and constraints. The algorithms for covering and locating arrays are tailored to anonymizing array construction. An additional property, homogeneity, is introduced to meet the needs of attribute-based authorization. Two metrics, local and global homogeneity, are designed to compare anonymizing arrays with the same parameters. Finally, a post-optimization approach reduces the homogeneity of an anonymizing array.Dissertation/ThesisDoctoral Dissertation Computer Science 201

    A structural classification of protein-protein interactions for detection of convergently evolved motifs and for prediction of protein binding sites on sequence level

    Get PDF
    BACKGROUND: A long-standing challenge in the post-genomic era of Bioinformatics is the prediction of protein-protein interactions, and ultimately the prediction of protein functions. The problem is intrinsically harder, when only amino acid sequences are available, but a solution is more universally applicable. So far, the problem of uncovering protein-protein interactions has been addressed in a variety of ways, both experimentally and computationally. MOTIVATION: The central problem is: How can protein complexes with solved threedimensional structure be utilized to identify and classify protein binding sites and how can knowledge be inferred from this classification such that protein interactions can be predicted for proteins without solved structure? The underlying hypothesis is that protein binding sites are often restricted to a small number of residues, which additionally often are well-conserved in order to maintain an interaction. Therefore, the signal-to-noise ratio in binding sites is expected to be higher than in other parts of the surface. This enables binding site detection in unknown proteins, when homology based annotation transfer fails. APPROACH: The problem is addressed by first investigating how geometrical aspects of domain-domain associations can lead to a rigorous structural classification of the multitude of protein interface types. The interface types are explored with respect to two aspects: First, how do interface types with one-sided homology reveal convergently evolved motifs? Second, how can sequential descriptors for local structural features be derived from the interface type classification? Then, the use of sequential representations for binding sites in order to predict protein interactions is investigated. The underlying algorithms are based on machine learning techniques, in particular Hidden Markov Models. RESULTS: This work includes a novel approach to a comprehensive geometrical classification of domain interfaces. Alternative structural domain associations are found for 40% of all family-family interactions. Evaluation of the classification algorithm on a hand-curated set of interfaces yielded a precision of 83% and a recall of 95%. For the first time, a systematic screen of convergently evolved motifs in 102.000 protein-protein interactions with structural information is derived. With respect to this dataset, all cases related to viral mimicry of human interface bindings are identified. Finally, a library of 740 motif descriptors for binding site recognition - encoded as Hidden Markov Models - is generated and cross-validated. Tests for the significance of motifs are provided. The usefulness of descriptors for protein-ligand binding sites is demonstrated for the case of "ATP-binding", where a precision of 89% is achieved, thus outperforming comparable motifs from PROSITE. In particular, a novel descriptor for a P-loop variant has been used to identify ATP-binding sites in 60 protein sequences that have not been annotated before by existing motif databases

    Optimization of Access Control Policies

    Get PDF
    Organizations undertake complex and costly projects to model high-quality Access Control Policies (ACPs). Once built, these policies must be maintained and managed in an ongoing process to keep their quality high. Insufficient maintenance leads to inaccurate authorization decisions and increases the policies’ administrative effort and susceptibility to errors. While the initial modeling of ACPs has received significant research interest, their optimization is not yet covered as broadly. This work provides a theoretical foundation for ACP quality and its optimization. Furthermore, it analyzes how existing research addresses optimization of ACPs with regard to six crucial optimization dimensions. It presents a structured literature survey tracing these optimization dimensions, the contributed research artifact and data requirements. Building on this literature catalogue, this work elaborates on inaccuracies for user permission assignments, data availability, minimal perturbation and recommendation-based optimization

    Extending the Exposure Score of Web Browsers by Incorporating CVSS

    Get PDF
    When browsing the Internet, HTTP headers enable both clients and servers send extra data in their requests or responses such as the User-Agent string. This string contains information related to the sender’s device, browser, and operating system. Yet its content differs from one browser to another. Despite the privacy and security risks of User-Agent strings, very few works have tackled this problem. Our previous work proposed giving Internet browsers exposure relative scores to aid users to choose less intrusive ones. Thus, the objective of this work is to extend our previous work through: first, conducting a user study to identify its limitations. Second, extending the exposure score via incorporating data from the NVD. Third, providing a full implementation, instead of a limited prototype. The proposed system: assigns scores to users’ browsers upon visiting our website. It also suggests alternative safe browsers, and finally it allows updating the back-end database with a click of a button. We applied our method to a data set of more than 52 thousand unique browsers. Our performance and validation analysis show that our solution is accurate and efficient. The source code and data set are publicly available here [4].</p

    Central monitoring system for ambient assisted living

    Get PDF
    Smart homes for aged care enable the elderly to stay in their own homes longer. By means of various types of ambient and wearable sensors information is gathered on people living in smart homes for aged care. This information is then processed to determine the activities of daily living (ADL) and provide vital information to carers. Many examples of smart homes for aged care can be found in literature, however, little or no evidence can be found with respect to interoperability of various sensors and devices along with associated functions. One key element with respect to interoperability is the central monitoring system in a smart home. This thesis analyses and presents key functions and requirements of a central monitoring system. The outcomes of this thesis may benefit developers of smart homes for aged care

    Robust recognition and exploratory analysis of crystal structures using machine learning

    Get PDF
    In den Materialwissenschaften läuten Künstliche-Intelligenz Methoden einen Paradigmenwechsel in Richtung Big-data zentrierter Forschung ein. Datenbanken mit Millionen von Einträgen, sowie hochauflösende Experimente, z.B. Elektronenmikroskopie, enthalten eine Fülle wachsender Information. Um diese ungenützten, wertvollen Daten für die Entdeckung verborgener Muster und Physik zu nutzen, müssen automatische analytische Methoden entwickelt werden. Die Kristallstruktur-Klassifizierung ist essentiell für die Charakterisierung eines Materials. Vorhandene Daten bieten vielfältige atomare Strukturen, enthalten jedoch oft Defekte und sind unvollständig. Eine geeignete Methode sollte diesbezüglich robust sein und gleichzeitig viele Systeme klassifizieren können, was für verfügbare Methoden nicht zutrifft. In dieser Arbeit entwickeln wir ARISE, eine Methode, die auf Bayesian deep learning basiert und mehr als 100 Strukturklassen robust und ohne festzulegende Schwellwerte klassifiziert. Die einfach erweiterbare Strukturauswahl ist breit gefächert und umfasst nicht nur Bulk-, sondern auch zwei- und ein-dimensionale Systeme. Für die lokale Untersuchung von großen, polykristallinen Systemen, führen wir die strided pattern matching Methode ein. Obwohl nur auf perfekte Strukturen trainiert, kann ARISE stark gestörte mono- und polykristalline Systeme synthetischen als auch experimentellen Ursprungs charakterisieren. Das Model basiert auf Bayesian deep learning und ist somit probabilistisch, was die systematische Berechnung von Unsicherheiten erlaubt, welche mit der Kristallordnung von metallischen Nanopartikeln in Elektronentomographie-Experimenten korrelieren. Die Anwendung von unüberwachtem Lernen auf interne Darstellungen des neuronalen Netzes enthüllt Korngrenzen und nicht ersichtliche Regionen, die über interpretierbare geometrische Eigenschaften verknüpft sind. Diese Arbeit ermöglicht die Analyse atomarer Strukturen mit starken Rauschquellen auf bisher nicht mögliche Weise.In materials science, artificial-intelligence tools are driving a paradigm shift towards big data-centric research. Large computational databases with millions of entries and high-resolution experiments such as electron microscopy contain large and growing amount of information. To leverage this under-utilized - yet very valuable - data, automatic analytical methods need to be developed. The classification of the crystal structure of a material is essential for its characterization. The available data is structurally diverse but often defective and incomplete. A suitable method should therefore be robust with respect to sources of inaccuracy, while being able to treat multiple systems. Available methods do not fulfill both criteria at the same time. In this work, we introduce ARISE, a Bayesian-deep-learning based framework that can treat more than 100 structural classes in robust fashion, without any predefined threshold. The selection of structural classes, which can be easily extended on demand, encompasses a wide range of materials, in particular, not only bulk but also two- and one-dimensional systems. For the local study of large, polycrystalline samples, we extend ARISE by introducing so-called strided pattern matching. While being trained on ideal structures only, ARISE correctly characterizes strongly perturbed single- and polycrystalline systems, from both synthetic and experimental resources. The probabilistic nature of the Bayesian-deep-learning model allows to obtain principled uncertainty estimates which are found to be correlated with crystalline order of metallic nanoparticles in electron-tomography experiments. Applying unsupervised learning to the internal neural-network representations reveals grain boundaries and (unapparent) structural regions sharing easily interpretable geometrical properties. This work enables the hitherto hindered analysis of noisy atomic structural data
    corecore