Scalable algorithms for latent variable models in machine learning
Latent variable modeling (LVM) is a popular approach in many machine learning applications, such as recommender systems and topic modeling, due to its ability to succinctly represent data even in the presence of many missing entries. Existing learning methods for LVMs, while attractive, are infeasible for the large-scale datasets required in modern big data applications. In addition, such applications often come with various types of side information, such as the text descriptions of items and the social network among users in a recommender system. In this thesis, we present scalable learning algorithms for a wide range of latent variable models, such as low-rank matrix factorization and latent Dirichlet allocation. We also develop simple but effective techniques to extend existing LVMs to exploit various types of side information and make better predictions in many machine learning applications, such as recommender systems, multi-label learning, and high-dimensional time-series prediction. In addition, we propose a novel approach to the maximum inner product search problem to accelerate the prediction phase of many latent variable models.
Computer Science
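The abstract above mentions low-rank matrix factorization learned from data with many missing entries. As an illustration only (not the thesis's actual algorithm), a minimal alternating-least-squares sketch for factorizing a partially observed matrix might look like this; the function name and all parameters are hypothetical:

```python
import numpy as np

def als_factorize(R, mask, rank=2, reg=0.1, iters=20):
    """Illustrative alternating least squares for low-rank matrix
    factorization with missing entries.
    R: m x n matrix of values; mask: 1 where an entry is observed."""
    m, n = R.shape
    rng = np.random.default_rng(0)
    U = rng.standard_normal((m, rank)) * 0.1
    V = rng.standard_normal((n, rank)) * 0.1
    I = reg * np.eye(rank)  # ridge term keeps each solve well-posed
    for _ in range(iters):
        # Fix V; each row of U is a small ridge regression over the
        # observed entries of the corresponding row of R.
        for i in range(m):
            obs = mask[i] > 0
            Vo = V[obs]
            U[i] = np.linalg.solve(Vo.T @ Vo + I, Vo.T @ R[i, obs])
        # Fix U; symmetric update for each row of V.
        for j in range(n):
            obs = mask[:, j] > 0
            Uo = U[obs]
            V[j] = np.linalg.solve(Uo.T @ Uo + I, Uo.T @ R[obs, j])
    return U, V
```

Each subproblem is a tiny `rank x rank` linear solve, which is why this style of update scales to large sparse matrices when parallelized over rows and columns.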
CP violation in charmed hadron decays into neutral kaons
We find a new CP-violating effect in charmed hadron decays into neutral kaons, induced by the interference between the Cabibbo-favored and doubly Cabibbo-suppressed amplitudes with neutral-kaon mixing. It is estimated to be much larger than the direct CP asymmetry, but has been missed in the literature. To reveal this new CP-violation effect, we propose a new observable: the difference between the CP asymmetries of two charmed-hadron decay modes into neutral kaons. Once the new effect is determined by experiments, the direct CP asymmetry can then be extracted and used to search for new physics.
Comment: 6 pages, 3 figures. Contribution to the proceedings of The 15th International Conference on Flavor Physics & CP Violation, 5-9 June 2017, Prague, Czech Republic
All that Matters does not Matter: The Politics of Dehyphenation in Wayson Choy's All That Matters
Implications on the first observation of charm CPV at LHCb
Very recently, the LHCb Collaboration observed CP violation (CPV) in the charm sector for the first time. This result is consistent with our prediction obtained in the factorization-assisted topological-amplitude (FAT) approach in [PRD 86, 036012 (2012)]. It implies that the current understanding of the penguin dynamics in charm decays in the Standard Model is reasonable. Motivated by the success of the FAT approach, we further suggest measuring the next potential mode predicted to reveal CPV of the same order.
Comment: 10 pages
Large-scale Multi-label Learning with Missing Labels
The multi-label classification problem has generated significant interest in
recent years. However, existing approaches do not adequately address two key
challenges: (a) the ability to tackle problems with a large number (say
millions) of labels, and (b) the ability to handle data with missing labels. In
this paper, we directly address both these problems by studying the multi-label
problem in a generic empirical risk minimization (ERM) framework. Our
framework, despite being simple, is surprisingly able to encompass several
recent label-compression based methods which can be derived as special cases of
our method. To optimize the ERM problem, we develop techniques that exploit the
structure of specific loss functions - such as the squared loss function - to
offer efficient algorithms. We further show that our learning framework admits
formal excess risk bounds even in the presence of missing labels. Our risk
bounds are tight and demonstrate better generalization performance for low-rank
promoting trace-norm regularization when compared to (rank insensitive)
Frobenius norm regularization. Finally, we present extensive empirical results
on a variety of benchmark datasets and show that our methods perform
significantly better than existing label-compression based methods and can
scale up to very large datasets such as the Wikipedia dataset.
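The core idea above is to minimize an empirical risk over only the observed label entries, with a low-rank parameterization of the label predictor. As a hedged sketch (not the paper's actual algorithm; the function name, hyperparameters, and plain gradient descent are all illustrative simplifications), squared-loss ERM with missing labels and a factored weight matrix W = U @ V.T could look like:

```python
import numpy as np

def lowrank_multilabel_erm(X, Y, mask, rank=2, reg=1e-3, lr=0.05, steps=1000):
    """Illustrative gradient descent for low-rank multi-label ERM.
    Minimizes squared loss over OBSERVED label entries only (mask = 1
    where a label is known), with W factored as U @ V.T to promote
    low rank. X: n x d features, Y: n x L labels."""
    n, d = X.shape
    L = Y.shape[1]
    rng = np.random.default_rng(0)
    U = rng.standard_normal((d, rank)) * 0.1
    V = rng.standard_normal((L, rank)) * 0.1
    for _ in range(steps):
        P = X @ U @ V.T              # predicted scores, n x L
        G = (P - Y) * mask           # residual restricted to observed labels
        gU = X.T @ G @ V / n + reg * U
        gV = G.T @ (X @ U) / n + reg * V
        U -= lr * gU
        V -= lr * gV
    return U, V
```

Restricting the residual to the mask is what lets the framework handle missing labels, and the explicit rank-`k` factorization plays the role that trace-norm (or other low-rank-promoting) regularization plays in the formal analysis.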