A Review and Characterization of Progressive Visual Analytics
Progressive Visual Analytics (PVA) has gained increasing attention over the past years.
It brings the user into the loop during otherwise long-running and non-transparent computations
by producing intermediate partial results. These partial results can be shown to the user
for early and continuous interaction with the emerging end result even while it is still being
computed. Yet as clear-cut as this fundamental idea seems, the existing body of literature puts forth
various interpretations and instantiations that have created a research domain of competing terms,
various definitions, as well as long lists of practical requirements and design guidelines spread across
different scientific communities. This makes it more and more difficult to get a succinct understanding
of PVA’s principal concepts, let alone an overview of this increasingly diverging field. The review and
discussion of PVA presented in this paper address these issues and provide (1) a literature collection
on this topic, (2) a conceptual characterization of PVA, as well as (3) a consolidated set of practical
recommendations for implementing and using PVA-based visual analytics solutions.
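The core idea of producing intermediate partial results during a long-running computation can be illustrated with a small sketch. It is not taken from the paper: the function name `progressive_mean`, the `chunk_size` parameter, and the choice of a running mean as the computation are all assumptions made for illustration. A generator emits an updated estimate after every chunk, so a consumer (for example a visualization) could refresh before the full input has been processed.

```python
from typing import Iterable, Iterator, Tuple


def progressive_mean(
    values: Iterable[float], chunk_size: int = 2
) -> Iterator[Tuple[int, float]]:
    """Yield (items_seen, running_mean) after every chunk of input.

    Hypothetical helper: each yielded pair is a partial result that
    approximates the final mean and can be displayed immediately.
    """
    total = 0.0
    count = 0
    for value in values:
        total += value
        count += 1
        if count % chunk_size == 0:
            yield count, total / count
    # Emit the final result if the last chunk was incomplete.
    if count and count % chunk_size != 0:
        yield count, total / count


# Collect every partial result; a real consumer would redraw on each one.
partials = list(progressive_mean([2.0, 4.0, 6.0, 8.0, 10.0], chunk_size=2))
```

Each element of `partials` pairs the number of items processed so far with the estimate at that point, and the last element holds the exact mean of the whole input.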
At the Nexus of Neoliberalism, Mass Incarceration, and Scientific Racism: the Conflation of Blackness with Risk in the 21st century
This paper examines how the systems of power of neoliberalism, scientific racism, and mass incarceration intersect to construct and uphold the image of “black criminality” and “blackness as a risk” to society. Risk assessments used to determine prison sentencing exemplify this phenomenon. Histories of deliberate associations between blackness and criminality--through science, media, political rhetoric, and economic systems--create a field in which risk assessment is widely regarded as a useful and scientifically neutral tool in mass incarceration. Particular scientific, economic, and carceral circumstances culminating in the 21st century collude to elevate risk assessments into one aspect of a big data apparatus endowed with the capacity to predict and control future behaviors. The paper suggests future directions for scientific research to promote racial justice in the context of mass incarceration.
Launching the Grand Challenges for Ocean Conservation
The ten most pressing Grand Challenges in Oceans Conservation were identified at the Oceans Big Think and described in a detailed working document: (1) A Blue Revolution for Oceans: Reengineering Aquaculture for Sustainability; (2) Ending and Recovering from Marine Debris; (3) Transparency and Traceability from Sea to Shore: Ending Overfishing; (4) Protecting Critical Ocean Habitats: New Tools for Marine Protection; (5) Engineering Ecological Resilience in Near Shore and Coastal Areas; (6) Reducing the Ecological Footprint of Fishing through Smarter Gear; (7) Arresting the Alien Invasion: Combating Invasive Species; (8) Combatting the Effects of Ocean Acidification; (9) Ending Marine Wildlife Trafficking; (10) Reviving Dead Zones: Combating Ocean Deoxygenation and Nutrient Runoff.
Stability and sensitivity of Learning Analytics based prediction models
Learning analytics seek to enhance the learning processes through systematic measurements of learning-related data and to provide informative feedback to learners and educators. Track data from Learning Management Systems (LMS) constitute a main data source for learning analytics. This empirical contribution provides an application of Buckingham Shum and Deakin Crick’s theoretical framework of dispositional learning analytics: an infrastructure that combines learning dispositions data with data extracted from computer-assisted, formative assessments and LMSs. In two cohorts of a large introductory quantitative methods module, 2049 students were enrolled in a module based on principles of blended learning, combining face-to-face Problem-Based Learning sessions with e-tutorials. We investigated the predictive power of learning dispositions, outcomes of continuous formative assessments and other system-generated data in modelling student performance and their potential to generate informative feedback. Using a dynamic, longitudinal perspective, computer-assisted formative assessments seem to be the best predictor for detecting underperforming students and academic performance, while basic LMS data did not substantially predict learning. If timely feedback is crucial, both use-intensity-related track data from e-tutorial systems, and learning dispositions, are valuable sources for feedback generation.
Speculative Approximations for Terascale Analytics
Model calibration is a major challenge faced by the plethora of statistical
analytics packages that are increasingly used in Big Data applications.
Identifying the optimal model parameters is a time-consuming process that has
to be executed from scratch for every dataset/model combination even by
experienced data scientists. We argue that the incapacity to evaluate multiple
parameter configurations simultaneously and the lack of support to quickly
identify sub-optimal configurations are the principal causes. In this paper, we
develop two database-inspired techniques for efficient model calibration.
Speculative parameter testing applies advanced parallel multi-query processing
methods to evaluate several configurations concurrently. The number of
configurations is determined adaptively at runtime, while the configurations
themselves are extracted from a distribution that is continuously learned
following a Bayesian process. Online aggregation is applied to identify
sub-optimal configurations early in the processing by incrementally sampling
the training dataset and estimating the objective function corresponding to
each configuration. We design concurrent online aggregation estimators and
define halting conditions to accurately and timely stop the execution. We apply
the proposed techniques to distributed gradient descent optimization -- batch
and incremental -- for support vector machines and logistic regression models.
We implement the resulting solutions in GLADE PF-OLA -- a state-of-the-art Big
Data analytics system -- and evaluate their performance over terascale-size
synthetic and real datasets. The results confirm that as many as 32
configurations can be evaluated concurrently almost as fast as one, while
sub-optimal configurations are detected accurately in as little as a
fraction of the time.
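The combination of concurrent configuration testing and online aggregation described in this abstract can be sketched as follows. This is a minimal single-process illustration, not the GLADE PF-OLA implementation: the function `prune_configs`, the normal-approximation confidence intervals, the halting rule, and the toy squared-error loss are all assumptions. The sketch also omits the Bayesian sampling of configurations and the adaptive choice of how many to run. Each sampled batch is scanned once and shared by all active configurations, and a configuration is halted as soon as the lower bound of its estimated loss exceeds the best upper bound.

```python
import math
import random
from typing import Callable, Dict, Sequence, Set, Tuple


def prune_configs(
    data: Sequence[float],
    configs: Dict[str, float],
    loss: Callable[[float, float], float],
    batch_size: int = 50,
    z: float = 1.96,
) -> Tuple[Set[str], int]:
    """Return (surviving config names, number of samples read).

    Hypothetical helper; assumes `data` is already in random order so
    that every prefix is a uniform sample of the full dataset.
    """
    # Running count, sum, and sum of squares of the loss per configuration.
    stats = {name: [0, 0.0, 0.0] for name in configs}
    active = set(configs)
    seen = 0
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        seen += len(batch)
        # One shared pass over the batch serves every active configuration.
        for x in batch:
            for name in active:
                value = loss(configs[name], x)
                s = stats[name]
                s[0] += 1
                s[1] += value
                s[2] += value * value
        # Confidence interval for the mean loss of each active configuration.
        bounds = {}
        for name in active:
            n, total, total_sq = stats[name]
            mean = total / n
            variance = max(total_sq / n - mean * mean, 0.0)
            half_width = z * math.sqrt(variance / n)
            bounds[name] = (mean - half_width, mean + half_width)
        # Halt configurations that are confidently worse than the best one.
        best_upper = min(upper for _, upper in bounds.values())
        active = {name for name in active if bounds[name][0] <= best_upper}
        if len(active) == 1:
            break
    return active, seen


# Toy example: pick the location parameter that best fits noisy data.
rng = random.Random(0)
data = [rng.gauss(5.0, 1.0) for _ in range(1000)]
configs = {"a": 5.0, "b": 0.0, "c": 9.0}
survivors, samples_used = prune_configs(
    data, configs, lambda theta, x: (theta - x) ** 2
)
```

In this toy run the poorly fitting configurations are discarded well before the full dataset is read, which is the effect the halting conditions in the abstract aim for.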