565 research outputs found

    Knowledge analysis in code mapping graphs

    Get PDF
    Integrated Master's dissertation in Informatics Engineering. Code maps, which can be represented as graphs, are used to describe an entire software system. When we inspect a code map, we are able to observe the software from a new perspective and therefore understand it better. For example, we can analyse the software's behaviour and the dependencies between the various elements of the system. In this thesis, we study a code mapping graph that contains version control data from a set of projects stored in Git repositories. This graph holds several kinds of information about the system, including code metrics such as complexity as well as the developers' development data. Since the organizational structure of the developers who build the system can lead to quality problems in the software, our study focused on developer-related problems, using mainly the developers' development data. After exploring these developer-related problems, we group the developers' data into their corresponding teams and analyse the detected problems (knowledge loss and fragmentation) again, this time from the team's point of view. To that end, we developed a program that detects these problems and displays them to the user so that they can be identified almost instantly, which greatly facilitates the management of a software project.
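
    The team-level checks described above can be illustrated with a short sketch. The Python below is a hypothetical example, not the program developed in the thesis: the input format, function names, and thresholds are assumptions. It aggregates per-file authorship from version-control records and flags files whose knowledge is fragmented across many authors or concentrated in authors who have left the team.

```python
from collections import Counter, defaultdict

def knowledge_report(commits, active_authors, frag_threshold=4, loss_threshold=0.5):
    """commits: iterable of (file_path, author) pairs taken from version control.
    active_authors: set of authors still on the team.
    Returns (fragmented, at_risk): files owned by many distinct authors, and
    files whose work was mostly done by authors who have left."""
    per_file = defaultdict(Counter)
    for path, author in commits:
        per_file[path][author] += 1

    fragmented, at_risk = [], []
    for path, authors in per_file.items():
        # Knowledge fragmentation: ownership spread over many distinct authors.
        if len(authors) >= frag_threshold:
            fragmented.append(path)
        # Knowledge loss: most of the work was done by authors no longer active.
        total = sum(authors.values())
        gone = sum(n for a, n in authors.items() if a not in active_authors)
        if gone / total > loss_threshold:
            at_risk.append(path)
    return fragmented, at_risk
```

    A real implementation would read the (file, author) pairs from the Git history that the code mapping graph was built from.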

    Code deobfuscation by program synthesis-aided simplification of mixed boolean-arithmetic expressions

    Get PDF
    Bachelor's thesis in Mathematics, Facultat de Matemàtiques, Universitat de Barcelona, year: 2020, advisors: Raúl Roca Cánovas, Antoni Benseny and Mario Reyes de los Mozos. This project studies the theoretical background of Mixed Boolean-Arithmetic (MBA) expressions as well as their practical applicability within the field of code obfuscation, a technique used both by malware threats and by software protection in order to complicate the process of reverse engineering (parts of) a program. An MBA expression is composed of integer arithmetic operators, e.g. $(+, -, *)$, and bitwise operators, e.g. $(\wedge, \vee, \oplus, \neg)$. MBA expressions can be leveraged to obfuscate the data flow of code by iteratively applying rewrite rules and function identities that complicate (obfuscate) the initial expression while preserving its semantic behavior. This possibility is motivated by the fact that operators from these different domains do not interact well together: we have no rules (distributivity, factorization, ...) or general theory to deal with this mixing of operators. Current deobfuscation techniques that address the simplification of this type of data-flow obfuscation are limited by being strongly tied to syntactic complexity. We explore novel program synthesis approaches to simplifying MBA expressions that reason on the semantics of the obfuscated expressions instead of their syntax, discussing their applicability as well as their limits. We present our own tool, r2syntia, which integrates Syntia, an open-source program synthesis tool, into the reverse engineering framework radare2 in order to retrieve the semantics of obfuscated code from its input/output behavior. Finally, we provide some improvement ideas and potential areas for future work.
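
    To make the idea of an MBA rewrite concrete, the sketch below uses the classic identity x + y = (x ⊕ y) + 2·(x ∧ y) and checks it over random 32-bit inputs, the same input/output view of semantics that synthesis-based simplifiers reason about. The identity is a standard textbook example, not a rule taken from the thesis or from r2syntia.

```python
import random

MASK = 0xFFFFFFFF  # model 32-bit machine integers

def obfuscated_add(x, y):
    # Classic MBA rewrite of x + y: mix a bitwise XOR with an arithmetic term.
    return ((x ^ y) + 2 * (x & y)) & MASK

# Check semantic equivalence on random inputs -- the same input/output view of
# program semantics that synthesis-based simplifiers such as Syntia reason over.
for _ in range(10_000):
    x, y = random.getrandbits(32), random.getrandbits(32)
    assert obfuscated_add(x, y) == (x + y) & MASK
print("x + y == (x ^ y) + 2*(x & y) on all sampled 32-bit inputs")
```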

    Natural language software registry (second edition)

    Get PDF

    Automatically Detecting the Resonance of Terrorist Movement Frames on the Web

    Get PDF
    The ever-increasing use of the internet by terrorist groups as a platform for the dissemination of radical, violent ideologies is well documented. The internet has, in this way, become a breeding ground for potential lone-wolf terrorists; that is, individuals who commit acts of terror inspired by the ideological rhetoric emitted by terrorist organizations. These individuals are characterized by their lack of formal affiliation with terror organizations, making them difficult to intercept with traditional intelligence techniques. The radicalization of individuals on the internet poses a considerable threat to law enforcement and national security officials. This new medium of radicalization, however, also presents new opportunities for the interdiction of lone-wolf terrorism. This dissertation is an account of the development and evaluation of an information technology (IT) framework for detecting potentially radicalized individuals on social media sites and Web fora. Unifying Collective Action Framing Theory (CAFT) and a radicalization model of lone-wolf terrorism, this dissertation analyzes a corpus of propaganda documents produced by several, radically different, terror organizations. This analysis provides the building blocks to define a knowledge model of terrorist ideological framing that is implemented as a Semantic Web ontology. Using several techniques for ontology-guided information extraction, the resultant ontology can be accurately populated from textual data sources. This dissertation subsequently defines several techniques that leverage the populated ontological representation to automatically identify individuals who are potentially radicalized to one or more terrorist ideologies, based on their postings on social media and other Web fora. The dissertation also discusses how the ontology can be queried using intuitive structured query languages to infer triggering events in the news. The prototype system is evaluated in the context of classification and is shown to provide state-of-the-art results. The main outputs of this research are (1) an ontological model of terrorist ideologies, (2) an information extraction framework capable of identifying and extracting terrorist ideologies from text, (3) a classification methodology for classifying Web content as resonating the ideology of one or more terrorist groups, and (4) a methodology for rapidly identifying news content of relevance to one or more terrorist groups.
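
    As a rough illustration of frame-resonance scoring only (not the dissertation's ontology-driven pipeline), the sketch below scores a post against per-group term sets. The group names and terms are placeholders; the actual system derives its concepts from the populated ontology and uses ontology-guided extraction rather than flat keyword matching.

```python
import re

# Placeholder frame lexicons; the dissertation derives these concepts from a
# populated Semantic Web ontology, not from a hand-written term list.
FRAME_TERMS = {
    "group_A": {"frame term 1", "frame term 2", "frame term 3"},
    "group_B": {"frame term 4", "frame term 5"},
}

def resonance_scores(post):
    """Return, for each group, the fraction of its frame terms the post mentions."""
    text = post.lower()
    scores = {}
    for group, terms in FRAME_TERMS.items():
        hits = sum(1 for t in terms if re.search(r"\b" + re.escape(t) + r"\b", text))
        scores[group] = hits / len(terms)
    return scores
```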

    Adaptive Big Data Pipeline

    Get PDF
    Over the past three decades, data has evolved from being a simple software by-product into one of companies' most important assets, used to understand their customers and foresee trends. Deep learning has demonstrated that big volumes of clean data generally provide more flexibility and accuracy when modeling a phenomenon. However, handling ever-increasing data volumes entails new challenges: the lack of expertise to select the appropriate big data tools for the processing pipelines, as well as the speed at which engineers can take such pipelines into production reliably, leveraging the cloud. We introduce a system called Adaptive Big Data Pipelines: a platform to automate data pipeline creation. It provides an interface to capture the data sources, transformations, destinations, and execution schedule. The system builds up the cloud infrastructure, schedules and fine-tunes the transformations, and creates the data lineage graph. This system has been tested on data sets of 50 gigabytes, processing them in just a few minutes without user intervention.
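
    The interface described above can be pictured with a minimal sketch; the class and field names here are assumptions for illustration, not the actual Adaptive Big Data Pipelines API. It captures sources, transformations, destinations, and a schedule, and derives a simple data-lineage graph from the declared transformations.

```python
from dataclasses import dataclass

@dataclass
class PipelineSpec:
    sources: list            # input datasets
    transformations: list    # (name, input_datasets, output_dataset) triples
    destinations: list       # datasets to publish
    schedule: str = "daily"  # execution schedule

def lineage_graph(spec):
    """Map each dataset to the datasets produced directly from it."""
    graph = {}
    for _name, inputs, output in spec.transformations:
        for src in inputs:
            graph.setdefault(src, []).append(output)
    return graph

spec = PipelineSpec(
    sources=["raw_events"],
    transformations=[("clean", ["raw_events"], "events_clean"),
                     ("aggregate", ["events_clean"], "daily_stats")],
    destinations=["daily_stats"],
)
print(lineage_graph(spec))  # {'raw_events': ['events_clean'], 'events_clean': ['daily_stats']}
```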

    Expanding Social Network Modeling Software and Agent Models for Diffusion Processes

    Get PDF
    In an increasingly digitally interconnected world, the study of social networks and their dynamics is burgeoning. Anthropologically, the ubiquity of online social networks has had striking implications for the condition of large portions of humanity. This technology has facilitated content creation of virtually all sorts, information sharing on an unprecedented scale, and connections and communities among people with similar interests and skills. The first part of my research is a social network evolution and visualization engine. Built on top of existing technologies, my software is designed to provide abstractions over the underlying libraries, drive real-time network evolution based on user-defined parameters, and optionally visualize that evolution at each step of the process. It provides a low-maintenance interface for the creation of networks and update schemes for a wide array of experimental contexts, an engine to drive network evolution, and a visualization platform that gives the researcher real-time feedback about different aspects of the network, as well as fine-grained debugging tools. Using this platform, we investigated the opinion dynamics of networks in which multiple agent "archetypes" interact. We modeled agents' archetypes with respect to two attributes: their preference over their friends' opinion profiles, and their tendency to change their opinion over time. We extended the current state of agent modeling in opinion diffusion by providing a unified 2D trajectory/preference space for agents that incorporates most of the common models in the literature. We investigated six agent archetypes from this space, and examined the behavior of the network as a whole and of the individual agents in a variety of contexts. In another branch of work using our software, we developed a network of agents who must carry out both economic and social activities during a pandemic. Agents' decisions about what actions to take (self-protective measures like masking, social distancing, or waiting to run errands) are based on several factors, including perception of risk (obtained from news reports, social connections, etc.) and economic need. We show with preliminary testing that this platform can execute standard pandemic models successfully once the economic and social dimensions are incorporated, and that this paradigm may provide useful insight into effective agent-level response policies that can be used in concert with the top-down approaches that comprise most recent pandemic-response research. We have investigated the implications of varying behavior profiles within a network of agents, and how those behavioral compositions in turn affect the overall climate of the network, and this software will continue to facilitate similar research in the future.
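
    A minimal sketch of the kind of archetype-driven opinion update described above is given below, assuming each agent carries a (preference, change-rate) pair. The update rule and parameter names are illustrative simplifications, not the actual model implemented in the engine.

```python
def step(opinions, neighbours, archetypes):
    """One synchronous opinion-diffusion update.
    opinions:   {agent: opinion value in [0, 1]}
    neighbours: {agent: list of neighbouring agents}
    archetypes: {agent: (preference, rate)} -- `preference` is the friend
                opinion profile the agent is drawn toward, `rate` its tendency
                to change opinion over time."""
    new = {}
    for agent, opinion in opinions.items():
        friends = [opinions[n] for n in neighbours.get(agent, [])]
        if not friends:
            new[agent] = opinion
            continue
        preference, rate = archetypes[agent]
        # Move toward the friend opinion closest to the agent's preferred profile.
        target = min(friends, key=lambda p: abs(p - preference))
        new[agent] = (1 - rate) * opinion + rate * target
    return new
```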

    On the Security of Software Systems and Services

    Get PDF
    This work investigates new methods for addressing the security issues and threats that arise from the composition of software. This task has been carried out through the formal modelling of both the software composition scenarios and the security properties, i.e., policies, to be guaranteed. Our research spans three different modalities of software composition which are of major interest for some of the most sensitive aspects of the modern information society: mobile applications, trust-based composition, and service orchestration. Mobile applications are programs designed to be deployable on remote platforms. Basically, they are the main channel for the distribution and commercialisation of software for mobile devices, e.g., smartphones and tablets. Here we study the security threats that affect the application providers and the hosting platforms. In particular, we present a programming framework for the development of applications with static and dynamic security support. We also implemented an enforcement mechanism for applying fine-grained security controls to the execution of possibly malicious applications. In addition to security, trust represents a pragmatic and intuitive way of managing the interactions among systems. Currently, trust is one of the main factors that human beings take into account when deciding whether or not to accept a transaction. In our work we investigate the possibility of defining a fully integrated environment for security policies and trust, including a runtime monitor. Finally, Service-Oriented Computing (SOC) is the leading technology for business applications distributed over a network. The security issues related to service networks are many and multi-faceted. We mainly deal with the static verification of secure composition plans of web services. Moreover, we introduce the synthesis of dynamic security checks for protecting services against illegal invocations.
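
    The runtime enforcement idea mentioned above can be sketched as a reference monitor that vets every security-relevant action against a policy before letting it run. The class below is a toy illustration under that assumption, not the thesis' actual enforcement mechanism; action names and the example policy are made up.

```python
class PolicyViolation(Exception):
    pass

class Monitor:
    """Toy reference monitor: every security-relevant action is checked against
    a policy (a predicate over the action and the execution trace so far)
    before it is allowed to run."""
    def __init__(self, policy):
        self.policy = policy
        self.trace = []

    def perform(self, action, effect):
        if not self.policy(action, self.trace):
            raise PolicyViolation(f"blocked: {action}")
        self.trace.append(action)
        return effect()

# Example policy: forbid sending data over the network after private data was read.
def no_leak(action, trace):
    return not (action == "net.send" and "read.private" in trace)

m = Monitor(no_leak)
m.perform("read.private", lambda: "secret")   # allowed
# m.perform("net.send", lambda: None)         # would raise PolicyViolation
```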

    A theoretically motivated tool for automatically generating command aliases

    No full text
    A useful approach towards improving interface design is to incorporate known HCI theory in design tools. As a step toward this, we have created a tool incorporating several known psychological results (e.g., alias-generation rules and the keystroke model). The tool, implemented as simple additions to a spreadsheet developed for psychology, helps create theoretically motivated aliases for command-line interfaces, and could be further extended to other interface types. It was used to semi-automatically generate a set of aliases for the interface to a cognitive modelling system. These aliases reduce typing time by approximately 50%. Command frequency data, necessary for computing time savings and useful for arbitrating alias clashes, can be difficult to obtain. We found that expert users can quickly provide useful and reasonably consistent estimates, and that the time-savings predictions were robust across their estimates and when compared with a uniform command frequency distribution.
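
    A minimal sketch of this kind of computation is shown below, under the assumption of a simple prefix-truncation rule and a keystroke-level-model constant of roughly 0.2 s per keystroke for a skilled typist; the actual tool combines several published alias-generation rules with expert-provided frequency estimates.

```python
def truncation_aliases(commands, length=2):
    """Give each command a short prefix alias that does not clash with the
    aliases already assigned (one simple alias-generation rule)."""
    aliases = {}
    for cmd in sorted(commands):
        n = length
        while cmd[:n] in aliases.values() and n < len(cmd):
            n += 1
        aliases[cmd] = cmd[:n]
    return aliases

def typing_time_saved(frequencies, aliases, sec_per_keystroke=0.2):
    """Estimate seconds saved per session: keystrokes avoided times a
    keystroke-level-model constant (~0.2 s per keystroke, skilled typist)."""
    return sum(freq * (len(cmd) - len(aliases[cmd])) * sec_per_keystroke
               for cmd, freq in frequencies.items())

cmds = {"delete": 12, "describe": 30, "display": 55}   # command -> uses per session
aliases = truncation_aliases(cmds)                      # {'delete': 'de', 'describe': 'des', 'display': 'di'}
print(typing_time_saved(cmds, aliases))
```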

    HCI models, theories, and frameworks: Toward a multidisciplinary science

    Get PDF
    Motivation: The movement of body and limbs is inescapable in human-computer interaction (HCI). Whether browsing the web or intensively entering and editing text in a document, our arms, wrists, and fingers are at work on the keyboard, mouse, and desktop. Our head, neck, and eyes move about, attending to feedback marking our progress. This chapter is motivated by the need to match the movement limits, capabilities, and potential of humans with input devices and interaction techniques on computing systems. Our focus is on models of human movement relevant to human-computer interaction. Some of the models discussed emerged from basic research in experimental psychology, whereas others emerged from, and were motivated by, the specific need in HCI to model the interaction between users and physical devices, such as mice and keyboards. As much as we focus on specific models of human movement and user interaction with devices, this chapter is also about models in general. We will say a lot about the nature of models, what they are, and why they are important tools for the research and development of human-computer interfaces. Overview (Models and Modeling): By its very nature, a model is a simplification of reality. However, a model is useful only if it helps in designing, evaluating, or otherwise providing a basis for understanding the behaviour of a complex artifact such as a computer system. It is convenient to think of models as lying on a continuum, with analogy and metaphor at one end and mathematical equations at the other. Most models lie somewhere in between. Toward the metaphoric end are descriptive models; toward the mathematical end are predictive models. These two categories are our particular focus in this chapter, and we shall visit a few examples of each. Two models will be presented in detail and in case studies: Fitts' model of the information-processing capability of the human motor system and Guiard's model of bimanual control. Fitts' model is a mathematical expression emerging from the rigors of probability theory. It is a predictive model at the mathematical end of the continuum, to be sure, yet when applied as a model of human movement it has characteristics of a metaphor. Guiard's model emerged from a detailed analysis of how humans use their hands in everyday tasks, such as writing, drawing, playing a sport, or manipulating objects. It is a descriptive model, lacking in mathematical rigor but rich in expressive power.
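
    For reference, the predictive model named above, Fitts' law, is most often written in HCI in its Shannon formulation (a general fact about the model, not a detail quoted from this chapter):

```latex
% Fitts' law, Shannon formulation: predicted movement time MT to a target of
% width W at distance (amplitude) D, with empirically fitted constants a and b.
MT = a + b \log_2\!\left(\frac{D}{W} + 1\right)
% The logarithmic term is the index of difficulty, ID, measured in bits.
```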