
    Predicting Crime Using Spatial Features

    Our study aims to build a machine learning model for crime prediction using geospatial features for different categories of crime. A reverse geocoding technique is applied to retrieve OpenStreetMap (OSM) spatial data. The study also proposes extracting hotpoints from crime hotspot areas found by Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN). A spatial distance feature is then computed from the positions of the hotpoints for various types of crime, and this value is used as a feature for classifiers. We test the engineered features on crime data from the Royal Canadian Mounted Police in Halifax, NS. We observed a significant performance improvement in crime prediction using the newly generated spatial features.
    Comment: Paper accepted to the 31st Canadian Conference in Artificial Intelligence, 201
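The hotpoint distance feature described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the clustering step (HDBSCAN) is assumed to have already produced cluster labels, and the function names are illustrative.

```python
import numpy as np

def hotpoints(coords, labels):
    # One "hotpoint" per crime hotspot: the centroid of each cluster
    # produced by a density-based clusterer such as HDBSCAN (label -1 = noise).
    return np.array([coords[labels == k].mean(axis=0)
                     for k in sorted(set(labels.tolist())) if k != -1])

def min_hotpoint_distance(points, hps):
    # Distance from each incident location to its nearest hotpoint;
    # this scalar becomes one engineered feature per crime category.
    d = np.linalg.norm(points[:, None, :] - hps[None, :, :], axis=-1)
    return d.min(axis=1)
```

Repeating the second step once per crime category yields one distance feature per category, which is then appended to the classifier's input.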

    Framework for Identification and Prevention of Direct and Indirect Discrimination using Data mining

    Data mining is the extraction of useful and important information from huge collections of data. There are also negative social perceptions of data mining, among them potential privacy invasion and potential discrimination. Discrimination involves treating people unequally or unfairly on the basis of their belonging to a specific group. Automated data collection and data mining techniques such as classification rule mining have made it easier to make automated decisions, such as loan granting/denial, insurance premium computation, etc. If the training data sets are biased with regard to discriminatory (sensitive) attributes such as age, gender, race, or religion, discriminatory decisions may ensue. For this reason, antidiscrimination techniques, including discrimination discovery, identification, and prevention, have been introduced in data mining. Discrimination may be of two types, direct or indirect. Direct discrimination occurs when decisions are taken on the basis of sensitive attributes. Indirect discrimination occurs when decisions are made on the basis of non-sensitive attributes that are strongly correlated with biased sensitive ones. In this paper, we deal with discrimination prevention in data mining and propose new methods applicable to direct or indirect discrimination prevention, individually or both at the same time. We discuss how to clean training data sets and transformed data sets in such a way that direct and/or indirect discriminatory decision rules are converted to legitimate (non-discriminatory) classification rules. We also propose new measures and metrics to analyse the utility of the proposed approaches, and we compare these approaches.
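As a minimal illustration of how discrimination discovery over classification rules works, the sketch below uses the extended lift (elift) measure common in this literature: the confidence of a rule whose premise includes a sensitive itemset is compared against the confidence of the same rule without it. The function names and threshold interpretation are illustrative assumptions, not taken from the paper.

```python
def confidence(n_premise_and_outcome, n_premise):
    # conf(X -> C): fraction of records matching premise X that also match C.
    return n_premise_and_outcome / n_premise

def elift(conf_with_sensitive, conf_without_sensitive):
    # Extended lift of rule A,B -> C relative to its base rule B -> C:
    # how much adding the sensitive itemset A raises the confidence.
    # Values well above 1 flag potentially discriminatory rules.
    return conf_with_sensitive / conf_without_sensitive
```

A rule whose elift exceeds a chosen threshold would then be targeted by the prevention step, which perturbs the training data until the cleaned data no longer supports the rule.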

    Fairness-Aware Ranking in Search & Recommendation Systems with Application to LinkedIn Talent Search

    We present a framework for quantifying and mitigating algorithmic bias in mechanisms designed for ranking individuals, typically used as part of web-scale search and recommendation systems. We first propose complementary measures to quantify bias with respect to protected attributes such as gender and age. We then present algorithms for computing fairness-aware re-ranking of results. For a given search or recommendation task, our algorithms seek to achieve a desired distribution of top ranked results with respect to one or more protected attributes. We show that such a framework can be tailored to achieve fairness criteria such as equality of opportunity and demographic parity depending on the choice of the desired distribution. We evaluate the proposed algorithms via extensive simulations over different parameter choices, and study the effect of fairness-aware ranking on both bias and utility measures. We finally present the online A/B testing results from applying our framework towards representative ranking in LinkedIn Talent Search, and discuss the lessons learned in practice. Our approach resulted in a tremendous improvement in the fairness metrics (nearly a threefold increase in the number of search queries with representative results) without affecting the business metrics, which paved the way for deployment to 100% of LinkedIn Recruiter users worldwide. Ours is the first large-scale deployed framework for ensuring fairness in the hiring domain, with potential positive impact for more than 630M LinkedIn members.
    Comment: This paper has been accepted for publication at ACM KDD 201
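The core re-ranking idea can be sketched as a greedy procedure in the spirit of the paper's deterministic algorithms: at every rank, pick the highest-scored remaining candidate from whichever attribute group is furthest below its target share in the prefix built so far. This is a simplified sketch under assumed names and tie-breaking, not the deployed LinkedIn implementation.

```python
from collections import defaultdict

def fairness_rerank(items, target, k):
    # items: list of (score, group); target: group -> desired share of the top-k.
    # Greedily fills each rank with the best-scored candidate from the group
    # that is furthest below its target share in the prefix built so far.
    pools = defaultdict(list)
    for score, group in sorted(items, reverse=True):
        pools[group].append((score, group))   # each pool stays score-descending
    counts = defaultdict(int)
    ranked = []
    for i in range(1, k + 1):                 # assumes k <= len(items)
        best = max((g for g in pools if pools[g]),
                   key=lambda g: (target.get(g, 0.0) * i - counts[g],
                                  pools[g][0][0]))
        ranked.append(pools[best].pop(0))
        counts[best] += 1
    return ranked
```

With equal 50/50 targets, the output alternates between groups while preserving score order within each group, so every top-k prefix approximately matches the desired distribution.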

    Multilevel Anti Discrimination and Privacy Preservation Correlativity

    With fast-growing technology, most organizations need to reveal crucial data, including sensitive information that discloses an individual's identity, during business analytics and service provision. To limit access to such sensitive data, various privacy preservation techniques are applied based on the assumed level of priority. Multilevel privacy-preserved, discrimination-free data transmission deals with the correlation of discrimination prevention and privacy preservation. By applying appropriate privacy preservation techniques, it can be shown that discrimination prevention can be accomplished along with secure transmission of data to different levels of users. From a sociological viewpoint, discrimination is the unfair treatment of an individual or group based on their membership in a particular category, so the decision attribute that leads to discrimination needs to be hidden or transformed. A unified discrimination prevention method is available that avoids both direct and indirect discrimination simultaneously. Although discriminatory biases are eliminated, this results in substantial data loss, which lowers data transmission efficiency. Data quality is largely preserved since an encryption technique is included. The proposed system is dynamic in nature and can be implemented in any organization. The experimental evaluation leads us to conclude that the proposed work is efficient for data transmission without discrimination and with maximum privacy preservation.
    DOI: 10.17762/ijritcc2321-8169.15070