Streaming classification with emerging new class by class matrix sketching
National Research Foundation (NRF) Singapor
Online Product Quantization
Approximate nearest neighbor (ANN) search has achieved great success in many
tasks. However, existing popular methods for ANN search, such as hashing and
quantization methods, are designed for static databases only. They do not cope
well with databases whose data distribution evolves dynamically, because
retraining the model on the new database is computationally expensive. In this
paper, we address the problem by developing an online product quantization
(online PQ) model that incrementally updates the quantization codebook to
accommodate the incoming streaming data. Moreover, to further reduce the
large-scale computation required by the online PQ update, we design two budget
constraints under which the model updates only part of the PQ codebook instead
of all of it. We derive a loss bound which guarantees the performance of our
online PQ model. Furthermore, we develop an online PQ model over a sliding
window with both data insertion and deletion supported, to reflect the
real-time behaviour of the data. The experiments demonstrate that our online PQ
model is both time-efficient and effective for ANN search in dynamic large-scale
databases compared with baseline methods, and that partial PQ codebook updates
further reduce the update cost.
Comment: To appear in IEEE Transactions on Knowledge and Data Engineering
(DOI: 10.1109/TKDE.2018.2817526)
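The incremental codebook update described in the abstract can be illustrated with a minimal sketch. It assumes a running-mean update of the nearest codeword in each subspace, which is a common online clustering rule and not necessarily the exact update derived in the paper. The class name `OnlinePQ` and all parameter names are hypothetical, and the budget constraints and sliding window are not modelled.

```python
import random


class OnlinePQ:
    """Toy online product quantizer (illustrative assumption, not the paper's code)."""

    def __init__(self, dim, num_subspaces, num_codewords, seed=0):
        if dim % num_subspaces != 0:
            raise ValueError("dim must be divisible by num_subspaces")
        self.m = num_subspaces
        self.sub_dim = dim // num_subspaces
        rng = random.Random(seed)
        # One codebook per subspace, each holding num_codewords sub-vectors.
        self.codebooks = [
            [[rng.gauss(0.0, 1.0) for _ in range(self.sub_dim)]
             for _ in range(num_codewords)]
            for _ in range(self.m)
        ]
        # Number of points assigned so far to each codeword.
        self.counts = [[0] * num_codewords for _ in range(self.m)]

    def _split(self, x):
        # Cut the vector into m contiguous sub-vectors.
        return [x[i * self.sub_dim:(i + 1) * self.sub_dim] for i in range(self.m)]

    def encode(self, x):
        # For each subspace, pick the index of the nearest codeword.
        codes = []
        for book, sub in zip(self.codebooks, self._split(x)):
            dists = [sum((a - b) ** 2 for a, b in zip(sub, c)) for c in book]
            codes.append(dists.index(min(dists)))
        return codes

    def update(self, x):
        # Encode x, then move each chosen codeword toward x with a running mean.
        codes = self.encode(x)
        for s, (k, sub) in enumerate(zip(codes, self._split(x))):
            self.counts[s][k] += 1
            n = self.counts[s][k]
            codeword = self.codebooks[s][k]
            for j in range(self.sub_dim):
                codeword[j] += (sub[j] - codeword[j]) / n
        return codes


pq = OnlinePQ(dim=4, num_subspaces=2, num_codewords=3)
codes = pq.update([1.0, 2.0, 3.0, 4.0])
```

Only the codewords selected for the incoming point are touched, so the cost of one update is independent of the database size. Restricting updates to a subset of subspaces would be one way to mimic a partial codebook update.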
Spiking Neural Networks Through the Lens of Streaming Algorithms
We initiate the study of biological neural networks from the perspective of
streaming algorithms. Like computers, human brains suffer from memory
limitations which pose a significant obstacle when processing large scale and
dynamically changing data. In computer science, these challenges are captured
by the well-known streaming model, which can be traced back to Munro and
Paterson '78 and has had significant impact in theory and beyond. In the
classical streaming setting, one must compute some function of a stream of
updates, given restricted single-pass access
to the stream. The primary complexity measure is the space used by the
algorithm.
We take the first steps towards understanding the connection between
streaming and neural algorithms. On the upper bound side, we design neural
algorithms based on known streaming algorithms for fundamental tasks, including
distinct elements, approximate median, heavy hitters, and more. The number of
neurons in our neural solutions almost matches the space bounds of the
corresponding streaming algorithms. As a general algorithmic primitive, we show
how to implement the important streaming technique of linear sketching
efficiently in spiking neural networks. On the lower bound side, we give a
generic reduction, showing that any space-efficient spiking neural network can
be simulated by a space-efficient streaming algorithm. This reduction lets us
translate streaming-space lower bounds into nearly matching neural-space lower
bounds, establishing a close connection between these two models.
Comment: To appear in DISC'20, shorten abstrac
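For readers unfamiliar with linear sketching, the following is a minimal conventional (non-neural) sketch of the idea, using a Count-Min table as the example. It is not the spiking-network construction from the paper. The class name `CountMinSketch`, the hash family, and the parameters are illustrative assumptions. The point shown is linearity: the table for a combined stream equals the entrywise sum of the tables for its parts.

```python
import random


class CountMinSketch:
    """Small Count-Min sketch over integer items (illustrative assumption)."""

    P = (1 << 61) - 1  # Mersenne prime used as the hash modulus

    def __init__(self, width, depth, seed=0):
        rng = random.Random(seed)
        self.width = width
        # One (a, b) pair per row defines the hash h(x) = ((a*x + b) mod P) mod width.
        self.hashes = [(rng.randrange(1, self.P), rng.randrange(self.P))
                       for _ in range(depth)]
        self.table = [[0] * width for _ in range(depth)]

    def _col(self, row, item):
        a, b = self.hashes[row]
        return ((a * item + b) % self.P) % self.width

    def update(self, item, delta=1):
        # Single-pass update: add delta to one counter per row.
        for row in range(len(self.table)):
            self.table[row][self._col(row, item)] += delta

    def estimate(self, item):
        # With non-negative updates, every row overestimates; take the minimum.
        return min(self.table[row][self._col(row, item)]
                   for row in range(len(self.table)))

    def merge(self, other):
        # Linearity: sketches built with the same hashes add entrywise.
        if self.width != other.width or self.hashes != other.hashes:
            raise ValueError("sketches must share width and hash functions")
        out = CountMinSketch(self.width, len(self.table))
        out.hashes = list(self.hashes)
        out.table = [[x + y for x, y in zip(r1, r2)]
                     for r1, r2 in zip(self.table, other.table)]
        return out


left = CountMinSketch(width=16, depth=3, seed=1)
right = CountMinSketch(width=16, depth=3, seed=1)
for item in [7, 7, 3]:
    left.update(item)
for item in [7, 5]:
    right.update(item)
merged = left.merge(right)
```

The space used is `width * depth` counters regardless of stream length, which is the complexity measure the streaming model focuses on.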
Learning with Attributed Networks: Algorithms and Applications
abstract: Attributes, which delineate the properties of data, and connections, which describe the dependencies among data, are two essential components for characterizing most real-world phenomena. The synergy between these two principal elements yields a unique data representation: attributed networks. In many cases, people are inundated with vast amounts of data that can be structured into attributed networks, and their use has been attractive to researchers and practitioners in different disciplines. For example, in social media, users interact with each other and also post personalized content; in scientific collaboration, researchers cooperate and are distinguished from their peers by their unique research interests; in complex disease studies, rich gene expression data complement gene-regulatory networks. Clearly, attributed networks are ubiquitous and form a critical component of modern information infrastructure. Gaining deep insights from such networks requires a fundamental understanding of their unique characteristics and an awareness of the related computational challenges.
My dissertation research aims to develop a suite of novel learning algorithms to understand, characterize, and gain actionable insights from attributed networks, to benefit high-impact real-world applications. In the first part of this dissertation, I mainly focus on developing learning algorithms for attributed networks in a static environment at two different levels: (i) attribute level - by designing feature selection algorithms to find high-quality features that are tightly correlated with the network topology; and (ii) node level - by presenting network embedding algorithms that learn discriminative node embeddings by preserving node proximity w.r.t. network topology structure and node attribute similarity. As change is an essential characteristic of attributed networks and the results of learning algorithms become stale over time, in the second part of this dissertation, I propose a family of online algorithms for attributed networks in a dynamic environment that continuously update the learning results on the fly. Moreover, learning algorithms are more desirable when they are application-aware, that is, developed with a clear understanding of the application domains and their unique intents. As such, in the third part of this dissertation, I am also committed to advancing real-world applications on attributed networks by incorporating the objectives of external tasks into the learning process.
Dissertation/Thesis Doctoral Dissertation Computer Science 201