32 research outputs found

Exploring Societal Computing based on the Example of Privacy

Author: Sheth Swapneel
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2014

Data privacy when using online systems like Facebook and Amazon has become an increasingly popular topic in the last few years. This thesis consists of the following four projects that aim to address the issues of privacy and software engineering.
First, only a little is known about how users and developers perceive privacy and which concrete measures would mitigate their privacy concerns. To investigate privacy requirements, we conducted an online survey with closed and open questions and collected 408 valid responses. Our results show that users often reduce privacy to security, with data sharing and data breaches being their biggest concerns. Users are more concerned about the content of their documents and their personal data such as location than about their interaction data. Unlike users, developers clearly prefer technical measures like data anonymization and think that privacy laws and policies are less effective. We also observed interesting differences between people from different geographies. For example, people from Europe are more concerned about data breaches than people from North America. People from Asia/Pacific and Europe believe that content and metadata are more critical for privacy than people from North America. Our results contribute to developing a user-driven privacy framework that is based on empirical evidence in addition to the legal, technical, and commercial perspectives.
Second, a related challenge is to make privacy more understandable in complex systems that may have a variety of user interface options, which may change often. As social network platforms have evolved, the ability for users to control how and with whom information is being shared introduces challenges concerning the configuration and comprehension of privacy settings. To address these concerns, our crowdsourced approach simplifies the understanding of privacy settings by using data collected from 512 users over a 17-month period to generate visualizations that allow users to compare their personal settings to an arbitrary subset of individuals of their choosing. To validate our approach, we conducted an online survey with closed and open questions and collected 59 valid responses, after which we conducted follow-up interviews with 10 respondents. Our results showed that 70% of respondents found visualizations using crowdsourced data useful for understanding privacy settings, and 80% preferred a crowdsourced tool for configuring their privacy settings over current privacy controls.
Third, as software evolves over time, bugs that breach users' privacy might be introduced. Further, there might be system-wide policy changes that could change users' settings to be more or less private than before. We present a novel technique that can be used by end-users for detecting changes in privacy, i.e., regression testing for privacy. Using a social approach for detecting privacy bugs, we present two prototype tools. Our evaluation shows the feasibility and utility of our approach for detecting privacy bugs. We highlight two interesting case studies on the bugs that were discovered using our tools. To the best of our knowledge, this is the first technique that leverages regression testing for detecting privacy bugs from an end-user perspective.
Fourth, approaches to addressing these privacy concerns typically require substantial extra computational resources, which might be beneficial where privacy is concerned, but may have significant negative impact with respect to Green Computing and sustainability, another major societal concern. Spending more computation time results in spending more energy and other resources that make the software system less sustainable. Ideally, what we would like are techniques for designing software systems that address these privacy concerns but which are also sustainable - systems where privacy could be achieved "for free", i.e., without having to spend extra computational effort. We describe how privacy can indeed be achieved for free - an accidental and beneficial side effect of doing some existing computation - in web applications and online systems that have access to user data. We show the feasibility, sustainability, and utility of our approach and what types of privacy threats it can mitigate.
Finally, we generalize the problem of privacy and its tradeoffs. As Social Computing has increasingly captivated the general public, it has become a popular research area for computer scientists. Social Computing research focuses on online social behavior and using artifacts derived from it for providing recommendations and other useful community knowledge. Unfortunately, some of that behavior and knowledge incur societal costs, particularly with regard to Privacy, which is viewed quite differently by different populations as well as regulated differently in different locales. But clever technical solutions to those challenges may impose additional societal costs, e.g., by consuming substantial resources at odds with Green Computing, another major area of societal concern. We propose a new crosscutting research area, Societal Computing, that focuses on the technical tradeoffs among computational models and application domains that raise significant societal issues. We highlight some of the relevant research topics and open problems that we foresee in Societal Computing. We feel that these topics, and Societal Computing in general, need to gain prominence as they will provide useful avenues of research leading to increasing benefits for society as a whole.

Columbia University Academic Commons

Societal Computing

Author: Sheth Swapneel
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2013

As Social Computing has increasingly captivated the general public, it has become a popular research area for computer scientists. Social Computing research focuses on online social behavior and using artifacts derived from it for providing recommendations and other useful community knowledge. Unfortunately, some of that behavior and knowledge incur societal costs, particularly with regard to Privacy, which is viewed quite differently by different populations as well as regulated differently in different locales. But clever technical solutions to those challenges may impose additional societal costs, e.g., by consuming substantial resources at odds with Green Computing, another major area of societal concern. We propose a new crosscutting research area, Societal Computing, that focuses on the technical tradeoffs among computational models and application domains that raise significant societal issues. We highlight some of the relevant research topics and open problems that we foresee in Societal Computing. We feel that these topics, and Societal Computing in general, need to gain prominence as they will provide useful avenues of research leading to increasing benefits for society as a whole. This thesis will consist of the following four projects that aim to address the issues of Societal Computing.
First, privacy in the context of ubiquitous social computing systems has become a major concern for society at large. As the number of online social computing systems that collect user data grows, concerns with privacy are further exacerbated. Examples of such online systems include social networks, recommender systems, and so on. Approaches to addressing these privacy concerns typically require substantial extra computational resources, which might be beneficial where privacy is concerned, but may have significant negative impact with respect to Green Computing and sustainability, another major societal concern. Spending more computation time results in spending more energy and other resources that make the software system less sustainable. Ideally, what we would like are techniques for designing software systems that address these privacy concerns but which are also sustainable - systems where privacy could be achieved “for free,” i.e., without having to spend extra computational effort. We describe how privacy can indeed be achieved for free - an accidental and beneficial side effect of doing some existing computation - in web applications and online systems that have access to user data. We show the feasibility, sustainability, and utility of our approach and what types of privacy threats it can mitigate.
Second, we aim to understand what the expectations and needs of end-users and software developers are with respect to privacy in social systems. Some questions that we want to answer are: Do end-users care about privacy? What aspects of privacy are the most important to end-users? Do we need different privacy mechanisms for technical vs. non-technical users? Should we customize privacy settings and systems based on the geographic location of the users? We have created a large-scale user study using an online questionnaire to gather privacy requirements from a variety of stakeholders. We also plan to conduct follow-up semistructured interviews. This user study will help us answer these questions.
Third, a related challenge is to make privacy more understandable in complex systems that may have a variety of user interface options, which may change often. Our approach is to use crowdsourcing to find out how other users deal with privacy and what settings are commonly used, in order to give users feedback on aspects like how public/private their settings are, what common settings are typically used by others, where a given user’s settings differ from those of a trusted group of friends, etc. We have a large dataset of privacy settings for over 500 users on Facebook and we plan to create a user study that will use the data to make privacy settings more understandable.
Finally, end-users of such systems find it increasingly hard to understand complex privacy settings. As software evolves over time, bugs that breach users’ privacy might be introduced. Further, there might be system-wide policy changes that could change users’ settings to be more or less private than before. We present a novel technique that can be used by end-users for detecting changes in privacy, i.e., regression testing for privacy. Using a social approach for detecting privacy bugs, we present two prototype tools. Our evaluation shows the feasibility and utility of our approach for detecting privacy bugs. We highlight two interesting case studies on the bugs that were discovered using our tools. To the best of our knowledge, this is the first technique that leverages regression testing for detecting privacy bugs from an end-user perspective.
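
The idea of regression testing for privacy mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' tooling: the audience levels, the setting names, and the function `diff_privacy_settings` are all hypothetical, and the sketch covers only the comparison of two snapshots of one user's settings, not the social aspect of the approach.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical audience levels, ordered from most private to most public.
AUDIENCE_ORDER = ["only_me", "friends", "friends_of_friends", "public"]

Change = Tuple[str, Optional[str], Optional[str], str]


def diff_privacy_settings(before: Dict[str, str], after: Dict[str, str]) -> List[Change]:
    """Compare two snapshots of privacy settings and report what changed.

    Each change is (setting, old_audience, new_audience, direction), where
    direction is "added", "removed", "more_public", or "more_private".
    """
    changes: List[Change] = []
    for setting in sorted(before.keys() | after.keys()):
        old = before.get(setting)
        new = after.get(setting)
        if old == new:
            continue  # unchanged settings are not reported
        if old is None:
            direction = "added"
        elif new is None:
            direction = "removed"
        elif AUDIENCE_ORDER.index(new) > AUDIENCE_ORDER.index(old):
            direction = "more_public"
        else:
            direction = "more_private"
        changes.append((setting, old, new, direction))
    return changes


# Snapshot taken before and after a (hypothetical) platform update.
before_update = {"photos": "friends", "email": "only_me"}
after_update = {"photos": "public", "email": "only_me"}
for change in diff_privacy_settings(before_update, after_update):
    print(change)
```

A change flagged as `more_public` that the user did not make is the kind of candidate privacy regression such a check would surface. Audience values outside `AUDIENCE_ORDER` raise a `ValueError` in this sketch.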

Crossref

Columbia University Academic Commons

CPU Torrent -- CPU Cycle Offloading to Reduce User Wait Time and Provider Resource Requirements

Author: Kaiser Gail E.; Sheth Swapneel Kalpesh
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2008

Developers of novel scientific computing systems are often eager to make their algorithms and databases available for community use, but their own computational resources may be inadequate to fulfill external user demand -- yet the system's footprint is far too large for prospective user organizations to download and run locally. Some heavyweight systems have become part of designated "centers" providing remote access to supercomputers and/or clusters supported by substantial government funding; others use virtual supercomputers dispersed across grids formed by massive numbers of volunteer Internet-connected computers. But public funds are limited and not all systems are amenable to huge-scale divisibility into independent computation units. We have identified a class of scientific computing systems where "utility" sub-jobs can be offloaded to any of several alternative providers thereby freeing up local cycles for the main proprietary jobs, implemented a proof-of-concept framework enabling such deployments, and analyzed its expected throughput and response-time impact on a real-world bioinformatics system (Columbia's PredictProtein) whose present users endure long wait queues

Columbia University Academic Commons

The Tradeoffs of Societal Computing

Author: Kaiser Gail E.; Sheth Swapneel Kalpesh
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2010

Columbia University Academic Commons

Towards Using Cached Data Mining for Large Scale Recommender Systems

Author: Kaiser Gail E.; Sheth Swapneel Kalpesh
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2010

Recommender systems are becoming increasingly popular. As these systems become commonplace and the number of users increases, it will become important for these systems to be able to cope with a large and diverse set of users whose recommendation needs may be very different from each other. In particular, large scale recommender systems will need to ensure that users' requests for recommendations can be answered with low response times and high throughput. In this paper, we explore how to use caches and cached data mining to improve the performance of recommender systems by improving throughput and reducing response time for providing recommendations. We describe the structure of our cache, which can be viewed as a prefetch cache that prefetches all types of supported recommendations, and how it is used in our recommender system. We also describe the results of our simulation experiments to measure the efficacy of our cache
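
The prefetch-cache idea described in this abstract can be sketched roughly in Python as follows. This is an illustration only, not the paper's design: `PrefetchCache`, `toy_recommender`, and the recommendation types are hypothetical names. It assumes that the first request for a user triggers computation of every supported recommendation type.

```python
from typing import Callable, Dict, List, Tuple


class PrefetchCache:
    """Toy prefetch cache: a miss for one user fills in all supported
    recommendation types for that user, so later requests are hits."""

    def __init__(self, kinds: List[str], compute: Callable[[str, str], List[str]]):
        self.kinds = kinds          # supported recommendation types
        self.compute = compute      # the (expensive) recommender call
        self.store: Dict[Tuple[str, str], List[str]] = {}
        self.hits = 0
        self.misses = 0

    def prefetch(self, user: str) -> None:
        # Compute and store every supported recommendation type for this user.
        for kind in self.kinds:
            self.store[(user, kind)] = self.compute(user, kind)

    def get(self, user: str, kind: str) -> List[str]:
        if kind not in self.kinds:
            raise ValueError(f"unsupported recommendation type: {kind}")
        if (user, kind) in self.store:
            self.hits += 1
        else:
            self.misses += 1
            self.prefetch(user)
        return self.store[(user, kind)]


def toy_recommender(user: str, kind: str) -> List[str]:
    # Stand-in for a real data-mining step (hypothetical).
    return [f"{kind}-for-{user}-{i}" for i in range(3)]


cache = PrefetchCache(["similar_items", "popular"], toy_recommender)
cache.get("alice", "popular")        # miss: prefetches both types for alice
cache.get("alice", "similar_items")  # hit: already prefetched
print(cache.hits, cache.misses)
```

The tradeoff in this sketch is extra computation on the first request in exchange for lower response times on later ones. Eviction and invalidation are omitted.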

Columbia University Academic Commons

Us and Them - A Study of Privacy Requirements Across North America, Asia, and Europe

Author: Kaiser Gail E.; Sheth Swapneel Kalpesh
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2013

Data privacy when using online systems like Facebook and Amazon has become an increasingly popular topic in the last few years. However, only a little is known about how users and developers perceive privacy and which concrete measures would mitigate privacy concerns. To investigate privacy requirements, we conducted an online survey with closed and open questions and collected 408 valid responses. Our results show that users often reduce privacy to security, with data sharing and data breaches being their biggest concerns. Users are more concerned about the content of their documents and personal data such as location than about their interaction data. Unlike users, developers clearly prefer technical measures like data anonymization and think that privacy laws and policies are less effective. We also observed interesting differences between people from different geographies. For example, people from Europe are more concerned about data breaches than people from North America. People from Asia/Pacific and Europe believe that content and metadata are more critical for privacy than people from North America. Our results contribute to developing a user-driven privacy framework that is based on empirical evidence in addition to the legal, technical, and commercial perspectives.

Columbia University Academic Commons

A Gameful Approach to Teaching Software Design and Software Testing - Assignments and Quests

Author: Bell Jonathan Schaffer; Kaiser Gail E.; Sheth Swapneel
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2013

Introductory CS classes typically do not focus on software testing. Many students’ mental model when they start learning programming is that “if it compiles and runs without crashing, it must work fine.” Despite numerous attempts to introduce testing early in CS programs and many known benefits to inculcating good testing habits early in one’s programming life, students remain averse to software testing because their interest in it is low. To address this problem, we used an internally developed research system called HALO (“Highly Addictive sociaLly Optimized Software Engineering”). Our previous work describes early prototypes of HALO; in this paper, we describe how we used it for the CS2 class and the feedback from real users. HALO uses game-like elements and motifs from popular games like World of Warcraft to make the whole software engineering process, and in particular the software testing process, more engaging and social. HALO is not a game; it leverages game mechanics and applies them to the software development process. For example, in HALO, students are given a number of “quests” that they need to complete. These quests are used to disguise standard software testing techniques like white-box and black-box testing, unit testing, and boundary value analysis. Upon completing these quests, the students get social rewards in the form of achievements, titles, and experience points. They can see how they are doing compared to other students in the class. While the students think that they are competing just for points and achievements, the primary benefit of such a system is that the students’ code gets tested a lot better than it normally would have.

Columbia University Academic Commons

Effectiveness of Teaching Metamorphic Testing, Part II

Author: Kaiser Gail E.; Mishra Kunal S.; Sheth Swapneel Kalpesh
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2013

We study the ability of students in a senior/graduate software engineering course to understand and apply metamorphic testing, a relatively recently invented advance in software testing research that complements conventional approaches such as equivalence partitioning and boundary analysis. We previously reported our investigation of the fall 2011 offering of the Columbia University course COMS W4156 Advanced Software Engineering, and here report on the fall 2012 offering and contrast it to the previous year. Our main findings are: 1) Although the students in the second offering did not do very well on the newly added individual assignment specifically focused on metamorphic testing, thereafter they were better able to find metamorphic properties for their team projects than the students from the previous year who did not have that preliminary homework and, perhaps most significantly, did not have the solution set for that homework. 2) Students in the second offering did reasonably well using the relatively novel metamorphic testing technique vs. traditional black box testing techniques in their projects (such comparison data is not available for the first offering). 3) Finally, in both semesters, the majority of the student teams were able to apply metamorphic testing to their team projects after only minimal instruction, which would imply that metamorphic testing is a viable strategy for student testers
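
For readers unfamiliar with metamorphic testing, the technique this abstract refers to, here is a minimal Python sketch. It is not taken from the course materials: `average`, `buggy_average`, and the two checker functions are hypothetical examples. Instead of comparing an output to a known expected value, a metamorphic test checks that a relation holds between the outputs of related inputs.

```python
import math
from typing import Callable, List

Func = Callable[[List[float]], float]


def average(values: List[float]) -> float:
    return sum(values) / len(values)


def buggy_average(values: List[float]) -> float:
    # Deliberate bug for illustration: the first element is ignored.
    return sum(values[1:]) / len(values)


def holds_under_reordering(func: Func, values: List[float]) -> bool:
    """Metamorphic relation 1: reordering the input must not change the result."""
    reordered = list(reversed(values))
    return math.isclose(func(values), func(reordered), rel_tol=1e-9, abs_tol=1e-12)


def holds_under_scaling(func: Func, values: List[float], factor: float) -> bool:
    """Metamorphic relation 2: scaling every input by k must scale the result by k."""
    scaled = [v * factor for v in values]
    return math.isclose(func(scaled), factor * func(values), rel_tol=1e-9, abs_tol=1e-12)


data = [3.0, 1.5, 4.0, 2.5]
print(holds_under_reordering(average, data))        # relation holds
print(holds_under_scaling(average, data, 2.0))      # relation holds
print(holds_under_reordering(buggy_average, data))  # relation violated: bug exposed
```

The scaling relation alone would not expose the bug in `buggy_average`, because the faulty function still scales linearly. This is why several metamorphic properties are usually checked together.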

CiteSeerX

Columbia University Academic Commons

Money for Nothing and Privacy for Free?

Author: Kaiser Gail E.; Malkin Tal G.; Sheth Swapneel Kalpesh
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2011

Privacy in the context of ubiquitous social computing systems has become a major concern for society at large. As the number of online social computing systems that collect user data grows, this privacy threat is further exacerbated. There has been some work (both recent and older) on addressing these privacy concerns. These approaches typically require extra computational resources, which might be beneficial where privacy is concerned, but is a poor fit for Green Computing and sustainability. Spending more computation time results in spending more energy and more resources that make the software system less sustainable. Ideally, what we would like are techniques for designing software systems that address these privacy concerns but which are also sustainable - systems where privacy could be achieved "for free," i.e., without having to spend extra computational effort. In this paper, we describe how privacy can be achieved for free - an accidental and beneficial side effect of doing some existing computation - and what types of privacy threats it can mitigate. More precisely, we describe a "Privacy for Free" design pattern and show its feasibility, sustainability, and utility in building complex social computing systems.

Columbia University Academic Commons

A Large-Scale, Longitudinal Study of Player Achievements in World of Warcraft

Author: Bell Jonathan Schaffer; Kaiser Gail E.; Sheth Swapneel Kalpesh
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2011

We present a survey of usage of the popular Massively Multiplayer Online Role Playing Game, World of Warcraft. By mining publicly available data, we collected a dataset consisting of the player history for approximately six million characters, with partial data for another six million characters. This paper focuses on player achievement data in particular, exposing trends in play from this highly successful game. From this data, we present several findings on players' play styles. We correlate achievements with motivations based upon a previously defined motivation model, and then classify players based on the categories of achievements that they pursued. Experiments show that players who fall within each of these buckets can play differently, and that as players progress through game content, their play style evolves as well.

Columbia University Academic Commons