Generating What-If Scenarios for Time Series Data
Time series data has become a ubiquitous and important data source in many application domains. Most companies and organizations strongly rely on this data for critical tasks like decision-making, planning, predictions, and analytics in general. While all these tasks generally focus on actual data representing organization and business processes, it is also desirable to apply them to alternative scenarios in order to prepare for developments that diverge from expectations or assess the robustness of current strategies. When it comes to the construction of such what-if scenarios, existing tools either focus on scalar data or they address highly specific scenarios. In this work, we propose a generally applicable and easy-to-use method for the generation of what-if scenarios on time series data. Our approach extracts descriptive features of a data set and allows the construction of an alternate version by means of filtering and modification of these features
Rankers, Rankees, & Rankings: Peeking into the Pandora's Box from a Socio-Technical Perspective
Algorithmic rankers have a profound impact on our increasingly data-driven
society. From leisurely activities like the movies that we watch, the
restaurants that we patronize; to highly consequential decisions, like making
educational and occupational choices or getting hired by companies -- these are
all driven by sophisticated yet mostly inaccessible rankers. A small change to
how these algorithms process the rankees (i.e., the data items that are ranked)
can have profound consequences. For example, a change in rankings can lead to
deterioration of the prestige of a university or have drastic consequences on a
job candidate who missed out being in the list of the preferred top-k for an
organization. This paper is a call to action to the human-centered data science
research community to develop principled methods, measures, and metrics for
studying the interactions among the socio-technical context of use,
technological innovations, and the resulting consequences of algorithmic
rankings on multiple stakeholders. Given the spate of new legislations on
algorithmic accountability, it is imperative that researchers from social
science, human-computer interaction, and data science work in unison for
demystifying how rankings are produced, who has agency to change them, and what
metrics of socio-technical impact one must use for informing the context of
use. Comment: Accepted for Interrogating Human-Centered Data Science workshop at
CHI'2
Transparency, Fairness, Data Protection, Neutrality: Data Management Challenges in the Face of New Regulation
The data revolution continues to transform every sector of science, industry and government. Due to the incredible impact of data-driven technology on society, we are becoming increasingly aware of the imperative to use data and algorithms responsibly, in accordance with laws and ethical norms. In this article we discuss three recent regulatory frameworks: the European Union's General Data Protection Regulation (GDPR), the New York City Automated Decisions Systems (ADS) Law, and the Net Neutrality principle, that aim to protect the rights of individuals who are impacted by data collection and analysis. These frameworks are prominent examples of a global trend: Governments are starting to recognize the need to regulate data-driven algorithmic technology. Our goal in this paper is to bring these regulatory frameworks to the attention of the data management community, and to underscore the technical challenges they raise and which we, as a community, are well-equipped to address. The main takeaway of this article is that legal and ethical norms cannot be incorporated into data-driven systems as an afterthought. Rather, we must think in terms of responsibility by design, viewing it as a systems requirement
iFair: Learning Individually Fair Data Representations for Algorithmic Decision Making
People are rated and ranked, towards algorithmic decision making in an
increasing number of applications, typically based on machine learning.
Research on how to incorporate fairness into such tasks has prevalently pursued
the paradigm of group fairness: giving adequate success rates to specifically
protected groups. In contrast, the alternative paradigm of individual fairness
has received relatively little attention, and this paper advances this less
explored direction. The paper introduces a method for probabilistically mapping
user records into a low-rank representation that reconciles individual fairness
and the utility of classifiers and rankings in downstream applications. Our
notion of individual fairness requires that users who are similar in all
task-relevant attributes such as job qualification, and disregarding all
potentially discriminating attributes such as gender, should have similar
outcomes. We demonstrate the versatility of our method by applying it to
classification and learning-to-rank tasks on a variety of real-world datasets.
Our experiments show substantial improvements over the best prior work for this
setting. Comment: Accepted at ICDE 2019. Please cite the ICDE 2019 proceedings versio
HydroShare – A Case Study of the Application of Modern Software Engineering to a Large Distributed Federally-Funded Scientific Software Development Project
HydroShare is an online collaborative system under development to support the open sharing of hydrologic data, analytical tools, and computer models. With HydroShare, scientists can easily discover, access, and analyze hydrologic data and thereby enhance the production and reproducibility of hydrologic scientific results. HydroShare also takes advantage of emerging social media functionality to enable users to enhance information about and collaboration around hydrologic data and models. HydroShare is being developed by an interdisciplinary collaborative team of domain scientists, university software developers, and professional software engineers from ten institutions located across the United States. While the combination of non–co-located, diverse stakeholders presents communication and management challenges, the interdisciplinary nature of the team is integral to the project’s goal of improving scientific software development and capabilities in academia. This chapter describes the challenges faced and lessons learned with the development of HydroShare, as well as the approach to software development that the HydroShare team adopted on the basis of the lessons learned. The chapter closes with recommendations for the application of modern software engineering techniques to large, collaborative, scientific software development projects, similar to the National Science Foundation (NSF)–funded HydroShare, in order to promote the successful application of the approach described herein by other teams for other projects