Predicting Crime Using Spatial Features
Our study aims to build a machine learning model for crime prediction using geospatial features for different categories of crime. The reverse geocoding technique is applied to retrieve OpenStreetMap (OSM) spatial data. This study also proposes finding hotpoints extracted from crime hotspot areas identified by Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN). A spatial distance feature is then computed based on the positions of the hotpoints for various types of crime, and this value is used as a feature for classifiers. We test the engineered features on crime data from the Royal Canadian Mounted Police of Halifax, NS. We observed a significant performance improvement in crime prediction using the newly generated spatial features.
Comment: Paper accepted to the 31st Canadian Conference in Artificial Intelligence, 201
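The abstract's distance feature can be illustrated with a minimal sketch: given hotpoints (cluster centres that HDBSCAN would produce) per crime type, compute an incident's distance to the nearest hotpoint of each type. The coordinates, crime types, and the haversine distance choice below are illustrative assumptions, not details from the paper.

```python
import math

# Hypothetical hotpoints (e.g. HDBSCAN cluster centres), keyed by crime type.
# All coordinates and categories here are made up for illustration.
HOTPOINTS = {
    "assault": [(44.6488, -63.5752), (44.6700, -63.6100)],
    "theft":   [(44.6510, -63.5900)],
}

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 6371.0 * 2 * math.asin(math.sqrt(a))

def spatial_distance_features(point):
    """Distance from `point` to the nearest hotpoint of each crime type."""
    return {crime: min(haversine_km(point, hp) for hp in hps)
            for crime, hps in HOTPOINTS.items()}

feats = spatial_distance_features((44.6488, -63.5752))
```

In the paper's setup, one such distance value per crime category would be appended to each record before training the classifiers.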
Framework for Identification and Prevention of Direct and Indirect Discrimination using Data mining
Extraction of useful and important information from huge collections of data is known as data mining. There are also negative social perceptions about data mining, among them potential privacy invasion and potential discrimination. Discrimination involves treating people unequally or unfairly on the basis of their belonging to a specific group. Automated data collection and data mining techniques such as classification rule mining have made it easier to make automated decisions, such as loan granting/denial and insurance premium computation. If the training data sets are biased with regard to discriminatory (sensitive) attributes such as age, gender, race, or religion, discriminatory decisions may ensue. For this reason, anti-discrimination techniques including discrimination discovery, identification, and prevention have been introduced in data mining. Discrimination may be of two types, either direct or indirect. Direct discrimination occurs where decisions are taken on the basis of sensitive attributes. Indirect discrimination occurs where decisions are made based on non-sensitive attributes that are strongly correlated with biased sensitive ones. In this paper, we deal with discrimination prevention in data mining and propose new methods applicable for direct or indirect discrimination prevention, individually or both at the same time. We discuss how to clean training data sets and transformed data sets in such a way that direct and/or indirect discriminatory decision rules are converted to legitimate (non-discriminatory) classification rules. We also propose new measures and metrics to analyse the utility of the proposed approaches, and we compare these approaches.
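A standard measure in this line of work (not necessarily the exact metric of this paper) is the extended lift, elift, which flags a classification rule "A, B -> C" with sensitive itemset A as alpha-discriminatory when conf(A,B -> C) exceeds conf(B -> C) by a factor of at least alpha. A minimal sketch, with illustrative counts and an assumed alpha threshold:

```python
# Sketch of the elift discrimination measure: elift(A,B -> C) = conf(A,B -> C) / conf(B -> C).
# Counts and the alpha threshold below are illustrative assumptions.

def confidence(n_antecedent_and_c, n_antecedent):
    return n_antecedent_and_c / n_antecedent

def elift(n_ab, n_abc, n_b, n_bc):
    """n_ab: records matching A and B; n_abc: those also matching C;
    n_b: records matching B; n_bc: those also matching C."""
    return confidence(n_abc, n_ab) / confidence(n_bc, n_b)

# e.g. 100 applicants in city B, 40 denied (C); 30 of them belong to
# sensitive group A, of whom 21 were denied.
score = elift(n_ab=30, n_abc=21, n_b=100, n_bc=40)
is_discriminatory = score >= 1.2  # assumed alpha threshold
```

Prevention methods of the kind described in the abstract then perturb the training data until flagged rules fall below the alpha threshold.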
Fairness-Aware Ranking in Search & Recommendation Systems with Application to LinkedIn Talent Search
We present a framework for quantifying and mitigating algorithmic bias in
mechanisms designed for ranking individuals, typically used as part of
web-scale search and recommendation systems. We first propose complementary
measures to quantify bias with respect to protected attributes such as gender
and age. We then present algorithms for computing fairness-aware re-ranking of
results. For a given search or recommendation task, our algorithms seek to
achieve a desired distribution of top ranked results with respect to one or
more protected attributes. We show that such a framework can be tailored to
achieve fairness criteria such as equality of opportunity and demographic
parity depending on the choice of the desired distribution. We evaluate the
proposed algorithms via extensive simulations over different parameter choices,
and study the effect of fairness-aware ranking on both bias and utility
measures. We finally present the online A/B testing results from applying our
framework towards representative ranking in LinkedIn Talent Search, and discuss
the lessons learned in practice. Our approach resulted in tremendous
improvement in the fairness metrics (nearly three fold increase in the number
of search queries with representative results) without affecting the business
metrics, which paved the way for deployment to 100% of LinkedIn Recruiter users
worldwide. Ours is the first large-scale deployed framework for ensuring fairness in the hiring domain, with potential positive impact for more than 630M LinkedIn members.
Comment: This paper has been accepted for publication at ACM KDD 201
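The re-ranking idea in the abstract can be sketched with a simple greedy pass (inspired by, but not identical to, the paper's deterministic re-ranking algorithms): at each rank position, pick the highest-scored remaining candidate from the protected-attribute value that lags furthest behind its desired share so far. The candidate data, attribute values, and targets below are all illustrative.

```python
# Hedged sketch of greedy fairness-aware re-ranking toward a desired
# distribution over a protected attribute. Not the paper's exact algorithm.

def rerank(candidates, desired, k):
    """candidates: list of (score, attr) pairs, sorted by score descending.
    desired: dict mapping attribute value -> target proportion in the top k."""
    remaining = list(candidates)
    counts = {a: 0 for a in desired}
    out = []
    for pos in range(1, k + 1):
        # deficit: how far each attribute value lags its target count at this rank
        def deficit(attr):
            return desired[attr] * pos - counts[attr]
        # prefer the most under-represented attribute, then the highest score
        remaining.sort(key=lambda c: (-deficit(c[1]), -c[0]))
        best = remaining.pop(0)
        counts[best[1]] += 1
        out.append(best)
    return out

ranked = rerank(
    [(0.9, "m"), (0.8, "m"), (0.7, "m"), (0.6, "f"), (0.5, "f")],
    desired={"m": 0.5, "f": 0.5}, k=4)
# ranked interleaves the two groups: m, f, m, f
```

Varying the `desired` distribution is what lets such a framework target criteria like demographic parity or equality of opportunity, as the abstract describes.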
Multilevel Anti Discrimination and Privacy Preservation Correlativity
With today's fast-growing technology, most organizations need to reveal crucial data, including sensitive information that discloses one's identity, during business analytics and service provision. To limit access to such sensitive data, various privacy preservation techniques are applied based on the level of priority assumed. Multilevel privacy-preserved, discrimination-free data transmission deals with the correlation between discrimination prevention and privacy preservation. By applying appropriate privacy preservation techniques, it can be shown that discrimination prevention can be easily accomplished along with secure transmission of data to different levels of users. From a sociological perspective, discrimination is the unfair treatment of an individual or group based on their membership in a particular category. Thus, the decision attribute that leads to discrimination needs to be hidden or transformed. A unified discrimination prevention method is available that avoids both direct and indirect discrimination simultaneously. Although discriminatory biases are eliminated, this results in huge data loss, which reduces data transmission efficiency. Data quality is largely preserved since an encryption technique is included. The proposed system is dynamic in nature and can be implemented in any organization. The experimental evaluation leads us to conclude that the proposed work is efficient for data transmission without discrimination and with maximum privacy preservation.
DOI: 10.17762/ijritcc2321-8169.15070
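The multilevel release of sensitive attributes described above can be sketched as follows. This is an illustrative sketch only, not the paper's method: the attribute names, privilege levels, and masking rules are all assumptions.

```python
# Illustrative sketch: release a record with sensitive attributes suppressed
# or generalized according to the requester's privilege level.

SENSITIVE = {"gender", "race", "religion"}  # assumed sensitive attributes

def release(record, level):
    """level 0: drop sensitive attrs entirely;
    level 1: replace them with a generalized placeholder;
    level 2: full access."""
    out = {}
    for attr, value in record.items():
        if attr in SENSITIVE and level < 2:
            if level == 0:
                continue  # suppress the attribute for the lowest level
            out[attr] = "*"  # generalized/masked value for mid level
        else:
            out[attr] = value
    return out

rec = {"name": "A. Smith", "gender": "F", "income": 52000}
```

In a full pipeline of the kind the abstract outlines, the released view would additionally be encrypted before transmission to each user level.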