11,060 research outputs found

    Assessing and Remedying Coverage for a Given Dataset

    Data analysis impacts virtually every aspect of our society today. Often, this analysis is performed on an existing dataset, possibly collected through a process that the data scientists had limited control over. The existing data analyzed may not include the complete universe, but it is expected to cover the diversity of items in the universe. Lack of adequate coverage in the dataset can result in undesirable outcomes such as biased decisions and algorithmic racism, as well as creating vulnerabilities such as opening up room for adversarial attacks. In this paper, we assess the coverage of a given dataset over multiple categorical attributes. We first provide efficient techniques for traversing the combinatorial explosion of value combinations to identify any regions of attribute space not adequately covered by the data. Then, we determine the least amount of additional data that must be obtained to resolve this lack of adequate coverage. We confirm the value of our proposal through both theoretical analyses and comprehensive experiments on real data. Comment: in ICDE 201
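
To make the notion of coverage over categorical attributes concrete, here is a minimal brute-force sketch. It is not the paper's efficient traversal technique: it simply enumerates every value combination and counts matching rows. The function name `uncovered_combinations`, the `threshold` parameter, and the toy rows are all hypothetical, and each attribute's domain is assumed to be the set of values observed in the data.

```python
from collections import Counter
from itertools import product


def uncovered_combinations(rows, attributes, threshold):
    """Return value combinations that occur fewer than `threshold` times.

    Brute force: enumerates the full cross product of attribute values,
    so it is only practical for a handful of small attributes.
    """
    # Assumption: the domain of each attribute is the set of observed values.
    domains = [sorted({row[a] for row in rows}) for a in attributes]
    # Count how often each full value combination appears in the data.
    counts = Counter(tuple(row[a] for a in attributes) for row in rows)
    return [combo for combo in product(*domains) if counts[combo] < threshold]


# Hypothetical toy dataset with two categorical attributes.
rows = [
    {"gender": "F", "age": "young"},
    {"gender": "F", "age": "old"},
    {"gender": "M", "age": "young"},
    {"gender": "M", "age": "young"},
]
gaps = uncovered_combinations(rows, ["gender", "age"], threshold=1)
print(gaps)
```

The cost grows with the product of the domain sizes, which is the combinatorial explosion the abstract refers to.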

    Does Child Abuse Cause Crime?

    Child maltreatment, which includes both child abuse and child neglect, is a major social problem. This paper focuses on measuring the effects of child maltreatment on crime using data from the National Longitudinal Study of Adolescent Health (Add Health). We focus on crime because it is one of the most socially costly potential outcomes of maltreatment, and because the proposed mechanisms linking maltreatment and crime are relatively well elucidated in the literature. Our work addresses many limitations of the existing literature on child maltreatment. First, we use a large national sample, and investigate different types of abuse in a similar framework. Second, we pay careful attention to identifying the causal impact of abuse, by using a variety of statistical methods that make differing assumptions. These methods include: Ordinary Least Squares (OLS), propensity score matching estimators, and twin fixed effects. Finally, we examine the extent to which the effects of maltreatment vary with socio-economic status (SES), gender, and the severity of the maltreatment. We find that maltreatment approximately doubles the probability of engaging in many types of crime. Low SES children are both more likely to be mistreated and suffer more damaging effects. Boys are at greater risk than girls, at least in terms of increased propensity to commit crime. Sexual abuse appears to have the largest negative effects, perhaps justifying the emphasis on this type of abuse in the literature. Finally, the probability of engaging in crime increases with the experience of multiple forms of maltreatment as well as the experience of Child Protective Services (CPS) investigation. Working Paper 06-3

    Faculty Research in Progress, 2018-2019

    The production of scholarly research continues to be one of the primary missions of the ILR School. During a typical academic year, ILR faculty members published or had accepted for publication over 25 books, edited volumes, and monographs, 170 articles and chapters in edited volumes, and numerous book reviews. In addition, a large number of manuscripts were submitted for publication, presented at professional association meetings, or circulated in working paper form. Our faculty's research continues to find its way into the very best industrial relations, social science and statistics journals.

    An interactive human centered data science approach towards crime pattern analysis

    Traditional machine learning systems lack a pathway for humans to integrate their domain knowledge into the underlying machine learning algorithms. Using such systems in domains where decisions can have serious consequences (e.g. medical decision-making and crime analysis) requires incorporating human experts' domain knowledge. The challenge, however, is how to combine domain expert knowledge with machine learning algorithms effectively to develop models for better decision making. In crime analysis, the key challenge is to identify plausible linkages in unstructured crime reports for hypothesis formulation. Crime analysts painstakingly perform time-consuming searches of many different structured and unstructured databases to collate these associations without any proper visualization. To tackle these challenges and facilitate crime analysis, in this paper we examine unstructured crime reports through text mining to extract plausible associations. Specifically, we present an associative-questioning-based search model to elicit multi-level associations among crime entities. We couple this model with partition clustering to develop an interactive, human-assisted knowledge discovery and data mining scheme. The proposed human-centered scheme for crime text mining is able to extract plausible associations between crimes, identify crime patterns, group similar crimes, and elicit co-offender networks and suspect lists based on spatial-temporal and behavioral similarity. These similarities are quantified by calculating Cosine, Jaccard, and Euclidean distances. Additionally, each suspect is ranked by a similarity score in the plausible suspect list. These associations are then visualized by creating a two-dimensional re-configurable crime cluster space along with a bipartite knowledge graph. The proposed scheme also examines the grand challenge of integrating effective human interaction with machine learning algorithms through a visualization feedback loop. It allows analysts to feed in their domain knowledge, including choosing similarity functions for identifying associations, dynamic feature selection for interactive clustering of crimes, and assigning weights to each component of the crime pattern to rank suspects for an unsolved crime. We demonstrate the proposed scheme through a case study using the Anonymized burglary dataset. The scheme is found to facilitate human reasoning and analytic discourse for intelligence analysis.
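
For readers unfamiliar with the three measures named in this abstract, the following is a minimal sketch of cosine similarity, Jaccard similarity, and Euclidean distance in plain Python. It uses the standard textbook definitions; how the authors encode crime reports as vectors or sets is not stated here, so the function names and the example inputs are purely illustrative assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # Convention chosen here for zero vectors.
    return dot / (norm_u * norm_v)


def jaccard_similarity(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # Convention chosen here: two empty sets are identical.
    return len(a & b) / len(a | b)


def euclidean_distance(u, v):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical inputs: term-count vectors and keyword sets for two reports.
print(cosine_similarity([1, 2, 0], [2, 4, 0]))
print(jaccard_similarity({"window", "night"}, {"night", "garage"}))
print(euclidean_distance([0, 0], [3, 4]))
```

Cosine and Jaccard are similarities (higher means more alike), while Euclidean is a distance (lower means more alike), so a ranking that mixes them needs to put them on a common orientation first.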

    Securing Public Places with PCA Based Recognition of Criminal Faces Detected from Surveillance CCTV Footage

    This paper aims to ensure the safety of people in public places by using the existing CCTV systems deployed there for surveillance to identify suspicious faces that are captured and notify officials. Existing video surveillance systems capture data through CCTV cameras and store it in their database. After an unexpected incident has taken place, these databases are used to recognize the culprit. Instead, the proposed system tracks the live video and extracts frames from it at fixed time intervals. These frames are then used to detect faces and compare them, using a feature extraction technique, with the criminal faces already stored in a suspicious faces database. If the comparison is successful, an alarm is generated to alert officials to the presence of a criminal at that place. Various face detection algorithms and recognition techniques are used to identify the suspicious face in the crowd and enhance the safety of public places.
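
The title names PCA as the recognition technique, but the abstract gives no implementation details. As a rough sketch of how PCA-based matching is commonly done (project vectors onto principal components, then find the nearest stored projection), the NumPy code below uses random synthetic vectors in place of face images. All names (`fit_pca`, `project`, `match`), the `max_distance` threshold, and the data are hypothetical and are not taken from the paper.

```python
import numpy as np


def fit_pca(samples, n_components):
    """Return the mean vector and the top principal axes of `samples` (rows)."""
    mean = samples.mean(axis=0)
    centered = samples - mean
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def project(x, mean, components):
    """Project one vector (or a matrix of row vectors) into PCA space."""
    return (x - mean) @ components.T


def match(probe, gallery_proj, mean, components, max_distance):
    """Return the index of the nearest gallery entry, or None if too far."""
    p = project(probe, mean, components)
    distances = np.linalg.norm(gallery_proj - p, axis=1)
    best = int(np.argmin(distances))
    return best if distances[best] <= max_distance else None


# Synthetic stand-in for flattened face images: 5 vectors of length 16.
rng = np.random.default_rng(0)
gallery = rng.random((5, 16))
mean, components = fit_pca(gallery, n_components=4)
gallery_proj = project(gallery, mean, components)

# A slightly perturbed copy of entry 2 plays the role of a new frame.
probe = gallery[2] + 1e-3 * rng.standard_normal(16)
result = match(probe, gallery_proj, mean, components, max_distance=0.1)
print(result)
```

The threshold controls the trade-off between missed matches and false alarms and would need tuning on real data.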