DROP: Dimensionality Reduction Optimization for Time Series
Dimensionality reduction is a critical step in scaling machine learning
pipelines. Principal component analysis (PCA) is a standard tool for
dimensionality reduction, but performing PCA over a full dataset can be
prohibitively expensive. As a result, theoretical work has studied the
effectiveness of iterative, stochastic PCA methods that operate over data
samples. However, stochastic PCA methods typically run either for a
predetermined number of iterations or until the solution converges, frequently
sampling too many or too few datapoints to yield end-to-end runtime
improvements. We show how accounting for downstream analytics operations during
DR via PCA allows stochastic methods to efficiently terminate after operating
over small (e.g., 1%) subsamples of input data, reducing whole workload
runtime. Leveraging this, we propose DROP, a DR optimizer that enables speedups
of up to 5x over Singular-Value-Decomposition-based PCA techniques, and exceeds
conventional approaches such as FFT and PAA by up to 16x in end-to-end workloads.
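The termination idea described above can be illustrated with an Oja-style stochastic power iteration that periodically checks whether the estimated direction has stopped moving, instead of running for a fixed iteration count. This is a minimal sketch of the general technique only; the batch size, learning rate, and tolerance are illustrative assumptions, not DROP's actual optimizer logic.

```python
import numpy as np

def stochastic_pca_top_component(X, batch=32, lr=0.01, tol=1e-4,
                                 check_every=50, max_iters=2000):
    """Estimate the top principal component from small random subsamples,
    terminating once the estimated direction stops moving."""
    rng = np.random.default_rng(0)
    n, d = X.shape
    w = rng.normal(size=d)
    w /= np.linalg.norm(w)
    w_ref = w.copy()
    for t in range(1, max_iters + 1):
        idx = rng.integers(0, n, size=batch)     # operate on a small subsample
        S = X[idx]
        w = w + lr * (S.T @ (S @ w)) / batch     # Oja-style update toward the top eigenvector
        w /= np.linalg.norm(w)
        if t % check_every == 0:                 # data-driven termination check
            if 1.0 - abs(w @ w_ref) < tol:
                break                            # direction has stabilised: stop sampling
            w_ref = w.copy()
    return w

# Correlated 2-D data whose leading component lies along (1, 1)/sqrt(2).
rng = np.random.default_rng(1)
z = rng.normal(size=5000)
X = np.column_stack([z + 0.1 * rng.normal(size=5000),
                     z + 0.1 * rng.normal(size=5000)])
w = stochastic_pca_top_component(X)
print(abs(w @ np.array([1.0, 1.0])) / np.sqrt(2))  # alignment with the true component, close to 1
```

Each update touches only `batch` rows, so the method can finish after seeing a small fraction of the dataset when the direction converges early.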
Time Series Management Systems: A Survey
The amount of time series data collected increases as more monitoring and
automation systems are deployed. These deployments range in scale from an
Internet of things (IoT) device located in a household to enormous distributed
Cyber-Physical Systems (CPSs) producing large volumes of data at high velocity.
To store and analyze these vast amounts of data, specialized Time Series
Management Systems (TSMSs) have been developed to overcome the limitations of
general-purpose Database Management Systems (DBMSs) for time series
management. In this paper, we present a thorough analysis and classification of
TSMSs developed through academic or industrial research and documented through
publications. Our classification is organized into categories based on the
architectures observed during our analysis. In addition, we provide an overview
of each system with a focus on the motivational use case that drove the
development of the system, the functionality the system implements for storing
and querying time series, the components the system is composed of, and the
capabilities of each system with regard to Stream Processing and Approximate
Query Processing (AQP). Last, we provide a summary of research directions
proposed by other researchers in the field and present our vision for a next
generation TSMS.
Comment: 20 pages, 15 figures, 2 tables; accepted for publication in IEEE TKDE.
K2P²: A photometry pipeline for the K2 mission
With the loss of a second reaction wheel, resulting in the inability to point
continuously and stably at the same field of view, the NASA Kepler satellite
recently entered a new mode of observation known as the K2 mission. The data
from this redesigned mission present a specific challenge; the targets
systematically drift in position on a ~6-hour time scale, inducing a
significant instrumental signal in the photometric time series; this greatly
impacts the ability to detect planetary signals and perform asteroseismic
analysis. Here we detail our version of a reduction pipeline for K2 target
pixel data, which automatically: defines masks for all targets in a given
frame; extracts the target's flux- and position time series; corrects the time
series based on the apparent movement on the CCD (either in 1D or 2D) combined
with the correction of instrumental and/or planetary signals via the KASOC
filter (Handberg & Lund 2014), thus rendering the time series ready for
asteroseismic analysis; computes power spectra for all targets, and identifies
potential contaminations between targets. From a test of our pipeline on a
sample of targets from K2 Campaign 0, the recovery of data for multiple
targets increases the number of potential light curves by a factor .
Our pipeline could be applied to the upcoming TESS (Ricker et al. 2014) and
PLATO 2.0 (Rauer et al. 2013) missions.
Comment: 14 pages, 20 figures; accepted for publication in The Astrophysical Journal (ApJ).
Sensing gestures for business intelligence
The combination of sensor data with analytic techniques is growing in popularity among both practitioners and researchers as the Internet of Things (IoT) offers new opportunities and insights. Organisations are trying to use sensor technologies to derive intelligence and gain a competitive edge in their industries. Obtaining data from sensors might not pose much of a problem; however, subsequently using that data to meet an organisation's decision-making needs can be more problematic. Understanding how sensor data analytics can be undertaken is the first step to deriving business intelligence from front-line retail environments. This paper explores the use of the Microsoft Kinect sensor to provide intelligence by identifying and sensing gestures to better understand customer behaviour in the retail space.
A chemical survey of exoplanets with ARIEL
Thousands of exoplanets have now been discovered with a huge range of masses, sizes and orbits: from rocky Earth-like planets to large gas giants grazing the surface of their host star. However, the essential nature of these exoplanets remains largely mysterious: there is no known, discernible pattern linking the presence, size, or orbital parameters of a planet to the nature of its parent star. We have little idea whether the chemistry of a planet is linked to its formation environment, or whether the type of host star drives the physics and chemistry of the planet's birth and evolution. ARIEL was conceived to observe a large number (~1000) of transiting planets for statistical understanding, including gas giants, Neptunes, super-Earths and Earth-size planets around a range of host star types, using transit spectroscopy in the 1.25–7.8 μm spectral range and multiple narrow-band photometry in the optical. ARIEL will focus on warm and hot planets to take advantage of their well-mixed atmospheres, which should show minimal condensation and sequestration of high-Z materials compared to their colder Solar System siblings. Such warm and hot atmospheres are expected to be more representative of the planetary bulk composition. Observations of these warm/hot exoplanets, and in particular of their elemental composition (especially C, O, N, S, Si), will allow the understanding of the early stages of planetary and atmospheric formation during the nebular phase and the following few million years. ARIEL will thus provide a representative picture of the chemical nature of the exoplanets and relate this directly to the type and chemical environment of the host star. ARIEL is designed as a dedicated survey mission for combined-light spectroscopy, capable of observing a large and well-defined planet sample within its 4-year mission lifetime.
Transit, eclipse and phase-curve spectroscopy methods, whereby the signals from the star and planet are differentiated using knowledge of the planetary ephemerides, allow us to measure atmospheric signals from the planet at levels of 10–100 parts per million (ppm) relative to the star and, given the bright nature of the targets, also allow more sophisticated techniques, such as eclipse mapping, to give a deeper insight into the nature of the atmosphere. These types of observations require a stable payload and satellite platform with broad, instantaneous wavelength coverage to detect many molecular species, probe the thermal structure, identify clouds and monitor the stellar activity. The proposed wavelength range covers all the expected major atmospheric gases, e.g. H2O, CO2, CH4, NH3, HCN and H2S, through to the more exotic metallic compounds, such as TiO, VO, and condensed species. Simulations of ARIEL performance in conducting exoplanet surveys have been performed, using conservative estimates of mission performance and a full model of all significant noise sources in the measurement, together with a list of potential ARIEL targets that incorporates the latest available exoplanet statistics. The conclusion at the end of the Phase A study is that ARIEL, in line with the stated mission objectives, will be able to observe about 1000 exoplanets depending on the details of the adopted survey strategy, thus confirming the feasibility of the main science objectives.
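The 10–100 ppm signal levels quoted above can be sanity-checked with back-of-envelope transit-depth arithmetic: a transit blocks a fraction (Rp/Rs)² of the stellar disc, and an atmosphere a few scale heights thick adds an annulus of extra blocked area. The radii and scale heights below are illustrative round numbers, not ARIEL target parameters.

```python
# Rough transit-depth arithmetic behind the ppm signal levels quoted above.
R_sun, R_jup, R_earth = 6.957e8, 7.149e7, 6.371e6   # radii in metres

def transit_depth_ppm(r_planet, r_star):
    """Primary-transit depth: fraction of the stellar disc blocked, in ppm."""
    return (r_planet / r_star) ** 2 * 1e6

def atmosphere_signal_ppm(r_planet, r_star, scale_height, n=5):
    """Extra depth from an annulus n atmospheric scale heights thick, in ppm."""
    return 2.0 * r_planet * n * scale_height / r_star ** 2 * 1e6

hot_jupiter = transit_depth_ppm(R_jup, R_sun)                    # ~1% of the star: ~10,000 ppm
se_atmo = atmosphere_signal_ppm(2 * R_earth, 0.5 * R_sun, 50e3)  # super-Earth atmosphere: tens of ppm
print(round(hot_jupiter), round(se_atmo))
```

The second number shows why the 10–100 ppm regime matters: while a hot Jupiter's whole-disc transit is a ~1% effect, the atmospheric annulus of a warm super-Earth around a small star contributes only a few tens of ppm.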