Adaptive Evolutionary Clustering
In many practical applications of clustering, the objects to be clustered
evolve over time, and a clustering result is desired at each time step. In such
applications, evolutionary clustering typically outperforms traditional static
clustering by producing clustering results that reflect long-term trends while
being robust to short-term variations. Several evolutionary clustering
algorithms have recently been proposed, often by adding a temporal smoothness
penalty to the cost function of a static clustering method. In this paper, we
introduce a different approach to evolutionary clustering: accurately tracking
the time-varying proximities between objects and then applying static
clustering. We present an evolutionary clustering framework that adaptively
estimates the optimal smoothing parameter using shrinkage estimation, a
statistical approach that improves a naive estimate using additional
information. The proposed framework can be used to extend a variety of static
clustering algorithms, including hierarchical, k-means, and spectral
clustering, into evolutionary clustering algorithms. Experiments on synthetic
and real data sets indicate that the proposed framework outperforms static
clustering and existing evolutionary clustering algorithms in many scenarios.
Comment: To appear in Data Mining and Knowledge Discovery, MATLAB toolbox
available at http://tbayes.eecs.umich.edu/xukevin/affec
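The idea of tracking proximities over time and then clustering statically can be sketched roughly as follows. This is only an illustrative sketch, not the paper's framework or its MATLAB toolbox: the function names `smooth_proximities` and `spectral_bipartition` are hypothetical, the smoothing parameter `alpha` is fixed by hand (the paper estimates it adaptively via shrinkage, which is not reproduced here), and a simple two-way spectral split stands in for whichever static clustering algorithm is being extended.

```python
import numpy as np


def smooth_proximities(proximities, alpha):
    """Exponentially smooth a time series of proximity matrices.

    ASSUMPTION: alpha is a fixed forgetting factor in [0, 1). The paper
    chooses the smoothing parameter adaptively; that step is omitted here.
    """
    smoothed = []
    prev = None
    for w in proximities:
        w = np.asarray(w, dtype=float)
        # First time step has no history, so use the raw proximities.
        prev = w if prev is None else alpha * prev + (1.0 - alpha) * w
        smoothed.append(prev)
    return smoothed


def spectral_bipartition(w):
    """Static two-way spectral clustering of a symmetric similarity matrix.

    Uses the sign of the Fiedler vector of the unnormalized graph Laplacian.
    This is a stand-in for any static clustering method.
    """
    w = np.asarray(w, dtype=float)
    laplacian = np.diag(w.sum(axis=1)) - w
    # eigh returns eigenvalues in ascending order; column 1 is the Fiedler vector.
    _, eigenvectors = np.linalg.eigh(laplacian)
    return (eigenvectors[:, 1] > 0).astype(int)


def evolutionary_clustering(proximities, alpha):
    """Cluster each time step using the smoothed proximities."""
    return [spectral_bipartition(w) for w in smooth_proximities(proximities, alpha)]
```

With `alpha = 0`, the sketch reduces to static clustering of each snapshot on its own; larger values give more weight to past proximities.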
Time-varying Learning and Content Analytics via Sparse Factor Analysis
We propose SPARFA-Trace, a new machine learning-based framework for
time-varying learning and content analytics for education applications. We
develop a novel message passing-based, blind, approximate Kalman filter for
sparse factor analysis (SPARFA) that jointly (i) traces learner concept
knowledge over time, (ii) analyzes learner concept knowledge state transitions
(induced by interacting with learning resources, such as textbook sections,
lecture videos, etc., or the forgetting effect), and (iii) estimates the content
organization and intrinsic difficulty of the assessment questions. These
quantities are estimated solely from binary-valued (correct/incorrect) graded
learner response data and a summary of the specific actions each learner
performs (e.g., answering a question or studying a learning resource) at each
time instance. Experimental results on two online course datasets demonstrate
that SPARFA-Trace is capable of tracing each learner's concept knowledge
evolution over time, as well as analyzing the quality and content organization
of learning resources, the question-concept associations, and the question
intrinsic difficulties. Moreover, we show that SPARFA-Trace achieves comparable
or better performance in predicting unobserved learner responses than existing
collaborative filtering and knowledge tracing approaches for personalized
education.
Dynamic modeling of mean-reverting spreads for statistical arbitrage
Statistical arbitrage strategies, such as pairs trading and its
generalizations, rely on the construction of mean-reverting spreads enjoying a
certain degree of predictability. Gaussian linear state-space processes have
recently been proposed as a model for such spreads under the assumption that
the observed process is a noisy realization of some hidden states. Real-time
estimation of the unobserved spread process can reveal temporary market
inefficiencies which can then be exploited to generate excess returns. Building
on previous work, we embrace the state-space framework for modeling spread
processes and extend this methodology along three different directions. First,
we introduce time-dependency in the model parameters, which allows for quick
adaptation to changes in the data generating process. Second, we provide an
on-line estimation algorithm that can be constantly run in real-time. Being
computationally fast, the algorithm is particularly suitable for building
aggressive trading strategies based on high-frequency data and may be used as a
monitoring device for mean-reversion. Finally, our framework naturally provides
informative uncertainty measures of all the estimated parameters. Experimental
results based on Monte Carlo simulations and historical equity data are
discussed, including a co-integration relationship involving two
exchange-traded funds.
Comment: 34 pages, 6 figures. Submitte
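As a rough illustration of the state-space view described in this abstract, the sketch below runs a scalar Kalman filter on an observed spread that is treated as a noisy reading of a hidden mean-reverting state. Everything specific here is an assumption made for the example: the hidden state follows x_t = a + b * x_{t-1} + noise with fixed `a`, `b`, state noise variance `q`, and observation noise variance `r`, and the function name `kalman_filter_spread` is hypothetical. The paper's time-varying parameters and on-line parameter estimation are not reproduced.

```python
def kalman_filter_spread(observations, a, b, q, r, x0=0.0, p0=1.0):
    """Filter a noisy spread series with a scalar linear Gaussian model.

    ASSUMED model (parameters fixed and known, unlike in the paper):
        hidden state:  x_t = a + b * x_{t-1} + eps_t,  Var(eps_t) = q
        observation:   y_t = x_t + omega_t,            Var(omega_t) = r
    Returns the filtered state means and variances at each time step.
    """
    x, p = x0, p0
    means, variances = [], []
    for y in observations:
        # Predict the hidden spread one step ahead.
        x_pred = a + b * x
        p_pred = b * b * p + q
        # Correct the prediction using the new observation.
        gain = p_pred / (p_pred + r)
        x = x_pred + gain * (y - x_pred)
        p = (1.0 - gain) * p_pred
        means.append(x)
        variances.append(p)
    return means, variances
```

With |b| < 1 the assumed state reverts toward the long-run level a / (1 - b), and the filtered variances give a simple uncertainty measure for the hidden spread at each step.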