5,668 research outputs found

    An interactive human centered data science approach towards crime pattern analysis

    Traditional machine learning systems lack a pathway for a human to integrate their domain knowledge into the underlying machine learning algorithms. Using such systems in domains where decisions can have serious consequences (e.g. medical decision-making and crime analysis) requires the incorporation of human experts' domain knowledge. The challenge, however, is how to combine domain expert knowledge with machine learning algorithms effectively to develop models that support better decision making. In crime analysis, the key challenge is to identify plausible linkages in unstructured crime reports for hypothesis formulation. Crime analysts painstakingly perform time-consuming searches of many different structured and unstructured databases to collate these associations without any proper visualization. To tackle these challenges and facilitate crime analysis, in this paper we examine unstructured crime reports through text mining to extract plausible associations. Specifically, we present an associative-questioning-based search model to elicit multi-level associations among crime entities. We coupled this model with partition clustering to develop an interactive, human-assisted knowledge discovery and data mining scheme. The proposed human-centered knowledge discovery and data mining scheme for crime text mining is able to extract plausible associations between crimes, identify crime patterns, group similar crimes, and elicit the co-offender network and a suspect list based on spatial-temporal and behavioral similarity. These similarities are quantified by calculating Cosine, Jaccard, and Euclidean distances. Additionally, each suspect is ranked by a similarity score in the plausible suspect list. These associations are then visualized by creating a two-dimensional re-configurable crime cluster space along with a bipartite knowledge graph. 
The proposed scheme also addresses the grand challenge of integrating effective human interaction with machine learning algorithms through a visualization feedback loop. It allows analysts to feed in their domain knowledge, including the choice of similarity functions for identifying associations, dynamic feature selection for interactive clustering of crimes, and the assignment of weights to each component of the crime pattern to rank suspects for an unsolved crime. We demonstrate the proposed scheme through a case study using an anonymized burglary dataset. The scheme is found to facilitate human reasoning and analytic discourse for intelligence analysis.
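
The similarity measures and weighted suspect ranking described in this abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record fields (`location`, `time_profile`, `mo`), the conversion of Euclidean distance into a similarity, and the weighted-average scoring rule are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard_similarity(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pattern_score(unsolved, candidate, weights):
    """Weighted average of per-component similarities (hypothetical rule)."""
    # Assumption: map distance into (0, 1] so that closer means more similar.
    spatial = 1.0 / (1.0 + euclidean_distance(unsolved["location"], candidate["location"]))
    temporal = cosine_similarity(unsolved["time_profile"], candidate["time_profile"])
    behavioral = jaccard_similarity(unsolved["mo"], candidate["mo"])
    weighted = (
        weights["spatial"] * spatial
        + weights["temporal"] * temporal
        + weights["behavioral"] * behavioral
    )
    return weighted / sum(weights.values())


def rank_suspects(unsolved, suspects, weights):
    """Return (suspect, score) pairs sorted from most to least similar."""
    scored = [(name, pattern_score(unsolved, crime, weights)) for name, crime in suspects.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In this sketch the analyst-supplied `weights` dictionary plays the role of the domain knowledge fed back into the ranking: raising the `behavioral` weight, for instance, favors suspects whose past modus operandi overlaps with the unsolved crime.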

    BlogForever D3.3: Development of the Digital Rights Management Policy

    This report presents a set of recommended practices and approaches that a future BlogForever repository can use to develop a digital rights management policy. The report outlines core legal aspects of digital rights that might need consideration in developing policies, and the challenges involved, in particular in relation to web archives and blog archives. These issues are discussed in the context of the digital information life cycle and the steps that might be taken within the workflow of the BlogForever platform to facilitate the gathering and management of digital rights information. Further, reports on interviews with experts in the field highlight current perspectives on rights management and provide empirical support for the recommendations put forward.

    Criminal Network Mining and Analysis for Forensic Investigations

    Criminal network analysis tools are widely used by law enforcement, mainly in cases of organized crime. The data required by the majority of these tools are police records and databases. In many cases, forensically collected data contains valuable information about the suspect’s social network. This information is normally obtained by manual inspection of the collected documents using forensic tools’ queries and other basic search features. The information is then manually entered into the police database. There are no known tools that provide methods to automatically extract social networks from raw documents on behalf of the investigator, add them to a knowledge base, and then analyze them. In this thesis, we propose a method that is capable of performing these tasks. For our proposed system, we claim three distinct contributions to cyber forensics investigations. First, we construct the social network of one or multiple suspects from documents in a file system. Secondly, we provide an analysis of the interactions and structures of these social networks and the communities comprising them. Thirdly, potential evidence and leads are identified by extracting conceptual links between members of the social network across the document set. Finally, the proposed method is implemented and experimental results are obtained to demonstrate the feasibility of the approach.

    Beyond the Prediction Paradigm: Challenges for AI in the Struggle Against Organized Crime

    In the future, audiological rehabilitation of adults with hearing loss will be more available, personalized and thorough due to the possibilities offered by the internet. By using the internet as a platform it is also possible to perform the process of rehabilitation in a cost-effective way. With tailored online rehabilitation programs containing topics such as communication strategies, hearing tactics and how to handle hearing aids, it might be possible to foster behavioral changes that will positively affect hearing aid users. Four studies were carried out in this thesis. The first study investigated internet usage among adults with hearing loss. In the second study the administration format, online vs. paper-and-pencil, of four standardized questionnaires was evaluated. Finally, two randomized controlled trials were performed evaluating the efficacy of online rehabilitation programs including professional guidance by an audiologist. The programs lasted over five weeks and were designed for experienced adult hearing-aid users. The effects of the online programs were compared with the effects of a control group. It can be concluded that the use of computers and the internet overall is at least at the same level for people with hearing loss as for the general age-matched population in Sweden. Furthermore, for three of the four included questionnaires, the participants’ scores remained the same across formats. It is however recommended that the administration format remain consistent across assessment points. Finally, results from the two concluding intervention studies provide preliminary evidence that the internet can be used to deliver education and rehabilitation to experienced hearing aid users who report residual hearing problems and that their problems are reduced by the intervention; however, the content and design of the online rehabilitation program require further investigation.

    Complex network tools to enable identification of a criminal community

    Retrieving criminal ties and mining evidence from an organised crime incident, for example money laundering, has been a difficult task for crime investigators due to the involvement of different groups of people and their complex relationships. Extracting the criminal associations from enormous amounts of raw data and representing them explicitly is tedious and time consuming. A study of the complex networks literature reveals that graph-based detection methods have not, as yet, been used for money laundering detection. In this research, I explore the use of complex network analysis to identify the money laundering criminals’ communication associations, that is, the important people who communicate between known criminals and the reliance of the known criminals on the other individuals in a communication path. For this purpose, I use the publicly available Enron email database, which happens to contain the communications of 10 criminals who were convicted of a money laundering crime. I show that my new shortest paths network search algorithm (SPNSA), which combines shortest paths and network centrality measures, is better able to isolate and identify criminals’ connections than existing community detection algorithms and k-neighbourhood detection. The SPNSA is validated using three different investigative scenarios, and in each scenario the criminal network graphs formed are small and sparse and hence suitable for further investigation. My research starts with isolating emails with ‘BCC’ recipients, with a minimum of two recipients bcc-ed. ‘BCC’ recipients are inherently secretive and the email connections imply a trust relationship between sender and ‘BCC’ recipients. There are no studies on the usage of only those emails that have ‘BCC’ recipients to form a trust network, which leads me to analyse the ‘BCC’ email group separately. SPNSA is able to identify the group of criminals and their active intermediaries in this ‘BCC’ trust network. 
Corroborating this information with published information about the crimes that led to the collapse of Enron yields the discovery of persons of interest who were hidden between criminals and could have contributed to the money laundering activity. For validation, larger email datasets that comprise all ‘BCC’ and ‘TO/CC’ email transactions are used. In comparison with existing community detection algorithms, SPNSA is found to perform much better with regard to isolating the sub-networks that contain criminals. I have adapted the betweenness centrality measure to develop a reliance measure. This measure calculates the reliance of a criminal on an intermediate node and ranks the importance level of each intermediate node based on this reliability value. Both SPNSA and the reliance measure could be used as primary investigation tools to investigate connections between criminals in a complex network.
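
The general idea of searching shortest paths between known individuals and scoring the intermediaries on those paths can be sketched in Python as follows. This is not SPNSA or the thesis's reliance measure, whose exact definitions are not given in the abstract. It is a simplified, betweenness-style illustration under stated assumptions: the graph is unweighted and undirected, it is stored as a plain adjacency dictionary, and the function names and scoring rule are hypothetical.

```python
from collections import defaultdict, deque
from itertools import combinations


def all_shortest_paths(graph, source, target):
    """Return every shortest path from source to target in an unweighted graph."""
    dist = {source: 0}
    parents = defaultdict(list)
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
            # Record every predecessor that lies on some shortest path.
            if dist[neighbour] == dist[node] + 1:
                parents[neighbour].append(node)
    if target not in dist:
        return []
    # Walk the predecessor lists back from target to source.
    paths, stack = [], [[target]]
    while stack:
        partial = stack.pop()
        if partial[-1] == source:
            paths.append(partial[::-1])
        else:
            for parent in parents[partial[-1]]:
                stack.append(partial + [parent])
    return paths


def intermediary_scores(graph, known):
    """Share of shortest paths between known nodes that pass through each other node."""
    counts = defaultdict(int)
    total = 0
    for a, b in combinations(sorted(known), 2):
        for path in all_shortest_paths(graph, a, b):
            total += 1
            for node in path[1:-1]:
                if node not in known:
                    counts[node] += 1
    return {node: count / total for node, count in counts.items()} if total else {}
```

Nodes that never appear on a shortest path between two known individuals receive no score at all, which is one simple way to obtain the kind of small, sparse sub-network the abstract describes.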

    Web Tracking: Mechanisms, Implications, and Defenses

    This article surveys the existing literature on the methods currently used by web services to track users online, as well as their purposes, implications, and possible user defenses. A significant majority of the reviewed articles and web resources are from the years 2012-2014. Privacy seems to be the Achilles' heel of today's web. Web services make continuous efforts to obtain as much information as they can about the things we search, the sites we visit, the people we contact, and the products we buy. Tracking is usually performed for commercial purposes. We present 5 main groups of methods used for user tracking, which are based on sessions, client storage, client cache, fingerprinting, or yet other approaches. A special focus is placed on mechanisms that use web caches, operational caches, and fingerprinting, as they are usually very rich in terms of using various creative methodologies. We also show how users can be identified on the web and associated with their real names, e-mail addresses, phone numbers, or even street addresses. We show why tracking is being used and its possible implications for users (price discrimination, assessing financial credibility, determining insurance coverage, government surveillance, and identity theft). For each of the tracking methods, we present possible defenses. Apart from describing the methods and tools used for keeping personal data from being tracked, we also present several tools that were used for research purposes. Their main goal is to discover how and by which entity users are being tracked on their desktop computers or smartphones, provide this information to the users, and visualize it in an accessible and easy-to-follow way. Finally, we present the currently proposed future approaches to tracking the user and show that they can potentially pose significant threats to users' privacy. Comment: 29 pages, 212 references

    Exploration of documents concerning Foundlings in Fafe along XIX Century

    Integrated master's dissertation in Informatics Engineering. The abandonment of children and newborns is a problem in our society. In the last few decades, the introduction of contraceptive methods, the development of social programs and family planning were fundamental to controlling unwanted pregnancies and supporting families in need. But these developments were not enough to solve the abandonment epidemic. Anonymous abandonment has a dangerous aspect. In order to preserve the family's identity, a child is usually left in a public place at night. Since children and newborns are one of the most vulnerable groups in our society, the time between the abandonment and the assistance of the child is potentially deadly. The establishment of public institutions in the past, such as the foundling wheel, was extremely important as a strategy to save lives. These institutions supported the abandoned children while simultaneously providing a safer abandonment process, without compromising the anonymity of the family. The focus of the Master’s Project discussed in this dissertation is the analysis and processing of nineteenth-century documents concerning the Foundling Wheel of Fafe. The analysis of sample documents is the initial step in the development of an ontology. The ontology has a fundamental role in the organization and structure of the information contained in these historical documents. The identification of concepts and the relationships between them culminates in a structured knowledge repository. Another important component is the development of a digital platform, where users are able to access the content stored in the knowledge repository and explore the digital archive, which incorporates the digitized versions of documents and books from these historical institutions. The development of this project is important for several reasons. Most directly, the implementation of a knowledge repository and a digital platform preserves information. 
These documents are mostly unique records and, due to their age and advanced state of degradation, the substitution of physical access by digital access reduces the wear and tear associated with each consultation. Additionally, the digital archive facilitates the dissemination of valuable information. Research groups or the general public are able to use the platform as a tool to discover the past, by performing biographic, cultural or socio-economic studies of documents dating to the nineteenth century.
The abandonment of children and newborns is a scourge of society. In recent decades, the introduction of contraceptive methods and social programs was essential to the development of family planning. Despite these advances, these programs did not solve the problem of the abandonment of children and newborns. Socio-economic problems are the main factor explaining abandonment. The process of abandoning children has a dangerous aggravating aspect. To protect the family's identity, it usually takes place in public places and at night. Since children and newborns are among the most vulnerable groups in society, the time between a child's abandonment and rescue can be too long and fatal. The Casa da Roda (foundling wheel house) was an institution introduced to make anonymous abandonment safer. The focus of the Master's Project discussed in this dissertation is the analysis and processing of nineteenth-century documents concerning the Casa da Roda de Fafe, preserved by the Arquivo Municipal de Fafe. Document analysis is the starting point for developing an ontology. The ontology plays a fundamental role in organizing and structuring the information contained in the historical documents. Developing a knowledge base consists of identifying the concepts and relationships present in the documents. 
Another fundamental component of this project is the development of a digital platform that lets users access the knowledge base. Users can search, explore, and add information to the knowledge base. The development of this project is important. Most immediately, implementing a digital platform safeguards and preserves the information contained in the documents. These documents are the only existing records with this content, and many are in an advanced state of degradation. Replacing physical access with digital access reduces the wear associated with each consultation. The digital platform also makes it possible to disseminate the information contained in the document collection. Researchers or the general public can use this tool to carry out biographical, cultural, and social studies of this historical archive.