165 research outputs found

    Searching by approximate personal-name matching

    Get PDF
    We discuss the design, building and evaluation of a method to access theinformation of a person, using his name as a search key, even if it has deformations. We present a similarity function, the DEA function, based on the probabilities of the edit operations accordingly to the involved letters and their position, and using a variable threshold. The efficacy of DEA is quantitatively evaluated, without human relevance judgments, very superior to the efficacy of known methods. A very efficient approximate search technique for the DEA function is also presented based on a compacted trie-tree structure.Postprint (published version

    Funciones de comparación de carácteres para APNM: la distancia DEA

    Get PDF
    A typical application of the ASM (Approximate String Matching) is the matching of personal names, as for example to search people in the DB of an Information System. Through the years, several similarity functions have been proposed:phonetic codes, simple edit distance, n-gram distances, etc.A typical application of the ASM (Approximate String Matching) is the matching of personal names, as for example to search people in the DB of an Information System. Through the years, several similarity functions have been proposed: phonetic codes, simple edit distance, n-gram distances, etc. In this report a function is presented, DEA, having substantially better efficacy than existing ones, and mainly oriented to spanish surnames. The DEA distance is an edit distance, with costs based on the probabilities of the operations, characters and positions. The distance threshold is defined as a function of the lenght of the string. The efficacy of DEA is evaluated objectively, without human relevance judgements.Postprint (published version

    Network Analysis with Stochastic Grammars

    Get PDF
    Digital forensics requires significant manual effort to identify items of evidentiary interest from the ever-increasing volume of data in modern computing systems. One of the tasks digital forensic examiners conduct is mentally extracting and constructing insights from unstructured sequences of events. This research assists examiners with the association and individualization analysis processes that make up this task with the development of a Stochastic Context -Free Grammars (SCFG) knowledge representation for digital forensics analysis of computer network traffic. SCFG is leveraged to provide context to the low-level data collected as evidence and to build behavior profiles. Upon discovering patterns, the analyst can begin the association or individualization process to answer criminal investigative questions. Three contributions resulted from this research. First , domain characteristics suitable for SCFG representation were identified and a step -by- step approach to adapt SCFG to novel domains was developed. Second, a novel iterative graph-based method of identifying similarities in context-free grammars was developed to compare behavior patterns represented as grammars. Finally, the SCFG capabilities were demonstrated in performing association and individualization in reducing the suspect pool and reducing the volume of evidence to examine in a computer network traffic analysis use case

    Dependencies revisited for improving data quality

    Get PDF

    The exact minimum number of triangles in graphs of given order and size

    Get PDF
    What is the minimum number of triangles in a graph of given order and size? Motivated by earlier results of Mantel and Tur\'an, Rademacher solved the first non-trivial case of this problem in 1941. The problem was revived by Erd\H{o}s in 1955; it is now known as the Erd\H{o}s-Rademacher problem. After attracting much attention, it was solved asymptotically in a major breakthrough by Razborov in 2008. In this paper, we provide an exact solution for all large graphs whose edge density is bounded away from~11, which in this range confirms a conjecture of Lov\'asz and Simonovits from 1975. Furthermore, we give a description of the extremal graphs.Comment: Published in Forum of Mathematics, Pi, Volume 8, e8 (2020

    Topics in Programming Languages, a Philosophical Analysis through the case of Prolog

    Get PDF
    [EN]Programming languages seldom find proper anchorage in philosophy of logic, language and science. is more, philosophy of language seems to be restricted to natural languages and linguistics, and even philosophy of logic is rarely framed into programming languages topics. The logic programming paradigm and Prolog are, thus, the most adequate paradigm and programming language to work on this subject, combining natural language processing and linguistics, logic programming and constriction methodology on both algorithms and procedures, on an overall philosophizing declarative status. Not only this, but the dimension of the Fifth Generation Computer system related to strong Al wherein Prolog took a major role. and its historical frame in the very crucial dialectic between procedural and declarative paradigms, structuralist and empiricist biases, serves, in exemplar form, to treat straight ahead philosophy of logic, language and science in the contemporaneous age as well. In recounting Prolog's philosophical, mechanical and algorithmic harbingers, the opportunity is open to various routes. We herein shall exemplify some: - the mechanical-computational background explored by Pascal, Leibniz, Boole, Jacquard, Babbage, Konrad Zuse, until reaching to the ACE (Alan Turing) and EDVAC (von Neumann), offering the backbone in computer architecture, and the work of Turing, Church, Gödel, Kleene, von Neumann, Shannon, and others on computability, in parallel lines, throughly studied in detail, permit us to interpret ahead the evolving realm of programming languages. The proper line from lambda-calculus, to the Algol-family, the declarative and procedural split with the C language and Prolog, and the ensuing branching and programming languages explosion and further delimitation, are thereupon inspected as to relate them with the proper syntax, semantics and philosophical élan of logic programming and Prolog

    Finding Differences in Privilege Protection and their Origin in Role-Based Access Control Implementations

    Get PDF
    Les applications Web sont très courantes, et ont des besoins de sécurité. L’un d’eux est le contrôle d’accès. Le contrôle d’accès s’assure que la politique de sécurité est respectée. Cette politique définit l’accès légitime aux données et aux opérations de l’application. Les applications Web utilisent régulièrement le contrôle d’accès à base de rôles (en anglais, « Role-Based Access Control » ou RBAC). Les politiques de sécurité RBAC permettent aux développeurs de définir des rôles et d’assigner des utilisateurs à ces rôles. De plus, l’assignation des privilèges d’accès se fait au niveau des rôles. Les applications Web évoluent durant leur maintenance et des changements du code source peuvent affecter leur sécurité de manière inattendue. Pour éviter que ces changements engendrent des régressions et des vulnérabilités, les développeurs doivent revalider l’implémentation RBAC de leur application. Ces revalidations peuvent exiger des ressources considérables. De plus, la tâche est compliquée par l’éloignement possible entre le changement et son impact sur la sécurité (e.g. dans des procédures ou fichiers différents). Pour s’attaquer à cette problématique, nous proposons des analyses statiques de programmes autour de la protection garantie des privilèges. Nous générons automatiquement des modèles de protection des privilèges. Pour ce faire, nous utilisons l’analyse de flux par traversement de patron (en anglais, « Pattern Traversal Flow Analysis » ou PTFA) à partir du code source de l’application. En comparant les modèles PTFA de différentes versions, nous déterminons les impacts des changements de code sur la protection des privilèges. Nous appelons ces impacts de sécurité des différences de protection garantie (en anglais, « Definite Protection Difference » ou DPD). En plus de trouver les DPD entre deux versions, nous établissons une classification des différences reposant sur la théorie des ensembles.----------ABSTRACT : Web applications are commonplace, and have security needs. One of these is access control. Access control enforces a security policy that allows and restricts access to information and operations. Web applications often use Role-Based Access Control (RBAC) to restrict operations and protect security-sensitive information and resources. RBAC allows developers to assign users to various roles, and assign privileges to the roles. Web applications undergo maintenance and evolution. Their security may be affected by source code changes between releases. Because these changes may impact security in unexpected ways, developers need to revalidate their RBAC implementation to prevent regressions and vulnerabilities. This may be resource-intensive. This task is complicated by the fact that the code change and its security impact may be distant (e.g. in different functions or files). To address this issue, we propose static program analyses of definite privilege protection. We automatically generate privilege protection models from the source code using Pattern Traversal Flow Analysis (PTFA). Using differences between versions and PTFA models, we determine privilege-level security impacts of code changes using definite protection differences (DPDs) and apply a set-theoretic classification to them. We also compute explanatory counter-examples for DPDs in PTFA models. In addition, we shorten them using graph transformations in order to facilitate their understanding. We define protection-impacting changes (PICs), changed code during evolution that impact privilege protection. We do so using graph reachability and differencing of two versions’ PTFA models. We also identify a superset of source code changes that contain root causes of DPDs by reverting these changes. We survey the distribution of DPDs and their classification over 147 release pairs of Word-Press, spanning from 2.0 to 4.5.1. We found that code changes caused no DPDs in 82 (56%) release pairs. The remaining 65 (44%) release pairs are security-affected. For these release pairs, only 0.30% of code is affected by DPDs on average. We also found that the most common change categories are complete gains (� 41%), complete losses (� 18%) and substitution (� 20%)

    Foundations of secure computation

    Get PDF
    Issued as Workshop proceedings and Final report, Project no. G-36-61
    • …
    corecore