De facto anonymised microdata file on income tax statistics 1998
With the de facto anonymised Income Tax Statistics 1998 (FAST 98), German official statistics is publishing microdata from the field of fiscal statistics for the first time. The scientific community can use these data at their own workplace to analyse politically relevant questions on the fiscal and transfer system on the basis of "real" assessment data, subject to the conditions of article 16 subsection 6 of the Law on Statistics for Federal Purposes. Individual data can be passed on to the scientific community only in a de facto anonymised form, and this form may impair the possibilities for scientific analysis. So that anonymised data can nevertheless be used by the scientific community, anonymisation must meet two equally important challenges: firstly, it must guarantee adequate protection of the individual items of data, and secondly, it must preserve the analysis potential of the anonymised data as far as possible. In order to strike the right balance between these two goals, the Statistical Offices involved potential scientific users in the anonymisation work as part of a research project. In addition to the anonymisation concept, the article entitled "De facto anonymised microdata file on income tax statistics 1998" explains the framework conditions of the project and demonstrates the analysis possibilities of income tax statistics. Keywords: microdata, de facto anonymization, income tax statistic
Collecting a corpus of Dutch SMS
In this paper we present the first freely available corpus of Dutch text messages containing data originating from the Netherlands and Flanders. This corpus has been collected in the framework of the SoNaR project and constitutes a viable part of this 500-million-word corpus. About 53,000 text messages were collected on a large scale, based on voluntary donations. These messages will be distributed as such. In this paper we focus on the data collection processes involved and, after studying the effect of media coverage, we show that free publicity in newspapers and on social media networks in particular results in more contributions. All SMS are provided with metadata. Looking at the composition of the corpus, it becomes clear that a small number of people contributed a large amount of data; in total, 272 people contributed to the corpus over three months. The number of women contributing to the corpus is larger than the number of men, but male contributors submitted larger amounts of data. This corpus will be of paramount importance for sociolinguistic research and normalisation studies
User's Privacy in Recommendation Systems Applying Online Social Network Data, A Survey and Taxonomy
Recommender systems have become an integral part of many social networks and
extract knowledge from a user's personal and sensitive data both explicitly,
with the user's knowledge, and implicitly. This trend has created major privacy
concerns as users are mostly unaware of what data and how much data is being
used and how securely it is used. In this context, several studies have sought
to address privacy concerns arising from the use of online social network data and from
recommender systems. This paper surveys the main privacy concerns, measurements
and privacy-preserving techniques used in large-scale online social networks
and recommender systems. It is based on historical works on security,
privacy-preserving, statistical modeling, and datasets to provide an overview
of the technical difficulties and problems associated with privacy preserving
in online social networks. Comment: 26 pages, IET book chapter on big data recommender system
‘Is Anthropology Legal?’
In May 2018, the European Union (EU) introduced the General Data Protection Regulation (GDPR) with the aim of increasing transparency in data processing and enhancing the rights of data subjects. Within anthropology, concerns have been raised about how the new legislation will affect ethnographic fieldwork and whether the laws contradict the discipline’s core tenets. To address these questions, the School of Oriental and African Studies (SOAS) at the University of London hosted an event on 25 May 2018 entitled ‘Is Anthropology Legal?’, bringing together researchers and data managers to begin a dialogue about the future of anthropological work in the context of the GDPR. In this article, I report and reflect on the event and on the possible implications for anthropological research within this climate of increasing governance
Open University Learning Analytics dataset
Learning Analytics focuses on the collection and analysis of learners’ data to improve their learning experience by providing informed guidance and to optimise learning materials. To support the research in this area we have developed a dataset, containing data from courses presented at the Open University (OU). What makes the dataset unique is the fact that it contains demographic data together with aggregated clickstream data of students’ interactions in the Virtual Learning Environment (VLE). This enables the analysis of student behaviour, represented by their actions. The dataset contains the information about 22 courses, 32,593 students, their assessment results, and logs of their interactions with the VLE represented by daily summaries of student clicks (10,655,280 entries). The dataset is freely available at https://analyse.kmi.open.ac.uk/open_dataset under a CC-BY 4.0 license
Storage, Use and Access to the Scottish Guthrie Card Collection: Ethical, Legal, and Social Issues
- …
