Clustering of Global Magnetospheric Observations
The use of supervised methods in space science has demonstrated powerful
capability in classification tasks, but unsupervised methods have been less
utilized for the clustering of spacecraft observations. We use a combination of
unsupervised methods, namely principal component analysis, self-organizing maps,
and hierarchical agglomerative clustering, to predict whether THEMIS and
MMS observations occurred in the magnetosphere, magnetosheath, or the solar
wind. The resulting predictions are validated visually by analyzing the
distribution of predictions and studying individual time series. Particular
nodes in the self-organizing map are studied to see what data they represent.
The capability of deeper hierarchical analysis using this model is briefly
explored. Finally, the changes in region prediction can be used to infer
magnetopause and bow shock crossings, which can act as an additional method of
validation, and are saved for their utility in solar wind validation,
understanding magnetopause processes, and the potential to develop a bow shock
model.
Comment: 36 pages, 22 figures
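The idea of inferring boundary crossings from changes in region prediction can be illustrated with a minimal sketch. This is not the authors' implementation: the label strings, the `BOUNDARIES` mapping, and the function `find_crossings` are hypothetical names introduced here, and the sketch assumes the predictions arrive as one region label per timestamp.

```python
from typing import Hashable, List, Sequence, Tuple

# Hypothetical label strings for the three regions named in the abstract.
MAGNETOSPHERE = "magnetosphere"
MAGNETOSHEATH = "magnetosheath"
SOLAR_WIND = "solar_wind"

# Assumed mapping from a pair of adjacent regions to the boundary between them.
BOUNDARIES = {
    frozenset({MAGNETOSPHERE, MAGNETOSHEATH}): "magnetopause",
    frozenset({MAGNETOSHEATH, SOLAR_WIND}): "bow_shock",
}


def find_crossings(
    times: Sequence[Hashable], labels: Sequence[str]
) -> List[Tuple[Hashable, str, str, str]]:
    """Return (time, boundary, from_region, to_region) for each label change."""
    if len(times) != len(labels):
        raise ValueError("times and labels must have the same length")
    crossings = []
    for i in range(1, len(labels)):
        prev, curr = labels[i - 1], labels[i]
        if prev == curr:
            continue  # same region, no crossing
        # A direct magnetosphere <-> solar wind jump has no single boundary
        # in this mapping, so it is reported as "unclassified".
        boundary = BOUNDARIES.get(frozenset({prev, curr}), "unclassified")
        crossings.append((times[i], boundary, prev, curr))
    return crossings
```

In practice, noisy predictions would likely need smoothing before this step, since a single misclassified sample would otherwise register as two spurious crossings.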
Managing Uncertainty: A Case for Probabilistic Grid Scheduling
Grid technology is evolving into a global, service-orientated
architecture, a universal platform for delivering future high-demand
computational services. Strong adoption of the Grid and the utility computing
concept is leading to an increasing number of Grid installations running a wide
range of applications of different size and complexity. In this paper we
address the problem of delivering deadline/economy-based scheduling in a
heterogeneous application environment using statistical properties of
historical job executions and their associated metadata. This approach is motivated
by a study of six months of computational load generated by Grid applications in a
multi-purpose Grid cluster serving a community of twenty e-Science projects.
The observed job statistics, resource utilisation and user behaviour are
discussed in the context of management approaches and models most suitable for
supporting a probabilistic and autonomous scheduling architecture.
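As a rough illustration of scheduling from statistical properties of historical executions, the sketch below estimates the probability that a job meets a deadline from its past runtimes. It is an assumption-laden toy, not the scheduler described in the paper: the function name `prob_meets_deadline` and the use of a plain empirical fraction are choices made here for illustration.

```python
from typing import Sequence


def prob_meets_deadline(past_runtimes: Sequence[float], deadline: float) -> float:
    """Empirical fraction of past runs that finished within the deadline.

    Assumes past runtimes are representative of the next run and are
    expressed in the same time unit as the deadline.
    """
    if not past_runtimes:
        raise ValueError("need at least one historical runtime")
    within = sum(1 for runtime in past_runtimes if runtime <= deadline)
    return within / len(past_runtimes)
```

A scheduler could compare this estimate against a required confidence level before accepting a deadline-bound job. With few historical samples the estimate is coarse.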
Capturing Evolution Genes for Time Series Data
The modeling of time series is becoming increasingly critical in a wide
variety of applications. Overall, data evolves by following different patterns,
which are generally caused by different user behaviors. Given a time series, we
define the evolution gene to capture the latent user behaviors and to describe
how the behaviors lead to the generation of time series. In particular, we
propose a uniform framework that recognizes different evolution genes of
segments by learning a classifier, and adopt an adversarial generator to
implement the evolution gene by estimating the segments' distribution.
Experimental results based on a synthetic dataset and five real-world datasets
show that our approach not only achieves good prediction results (e.g.,
+10.56% on average in terms of F1), but also provides explanations of
the results.
Comment: a preprint version. arXiv admin note: text overlap with
arXiv:1703.10155 by other authors
MemeSequencer: sparse matching for embedding image macros
[Proceeding of]: The Web Conference 2018 (WWW2018), April 23 - 27, 2018, Lyon, France
The analysis of the creation, mutation, and propagation of social media content on the Internet is an essential problem in computational social science, affecting areas ranging from marketing to political mobilization. A first step towards understanding the evolution of images online is the analysis of rapidly modifying and propagating memetic imagery or "memes". However, a pitfall in proceeding with such an investigation is the current inability to produce a robust semantic space for such imagery, capable of understanding differences in Image Macros. In this study, we provide a first step in the systematic study of image evolution on the Internet, by proposing an algorithm based on sparse representations and deep learning to decouple various types of content in such images and produce a rich semantic embedding. We demonstrate the benefits of our approach on a variety of tasks pertaining to memes and Image Macros, such as image clustering, image retrieval, topic prediction and virality prediction, surpassing the existing methods on each. In addition to its utility on quantitative tasks, our method opens up the possibility of obtaining the first large-scale understanding of the evolution and propagation of memetic imagery.