Feature Generation by Convolutional Neural Network for Click-Through Rate Prediction
Click-Through Rate prediction is an important task in recommender systems,
which aims to estimate the probability that a user will click on a given item.
Recently, many deep models have been proposed to learn low-order and high-order
feature interactions from original features. However, since useful interactions
are always sparse, it is difficult for DNN to learn them effectively under a
large number of parameters. In real scenarios, artificial features are able to
improve the performance of deep models (such as Wide & Deep Learning), but
feature engineering is expensive and requires domain knowledge, making it
impractical in different scenarios. Therefore, it is necessary to augment
the feature space automatically. In this paper, we propose a novel Feature
Generation by Convolutional Neural Network (FGCNN) model with two components:
Feature Generation and Deep Classifier. Feature Generation leverages the
strength of CNN to generate local patterns and recombine them to generate new
features. Deep Classifier adopts the structure of IPNN to learn interactions
from the augmented feature space. Experimental results on three large-scale
datasets show that FGCNN significantly outperforms nine state-of-the-art
models. Moreover, when applying some state-of-the-art models as Deep
Classifier, better performance is always achieved, showing the great
compatibility of our FGCNN model. This work explores a novel direction for CTR
predictions: it is quite useful to reduce the learning difficulties of DNN by
automatically identifying important features.
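The Feature Generation component described in this abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function name `generate_features`, the single 1-D kernel, the tanh activations, and the pooling width are all assumptions chosen for brevity. It convolves the embedding matrix along the field axis to pick up local patterns, max-pools neighbouring fields, and recombines the pooled patterns through a dense layer into new feature embeddings that are appended to the original ones.

```python
import numpy as np

def generate_features(emb, kernel, w_recombine, pool=2):
    """Hypothetical sketch of CNN-based feature generation.

    emb:         (n_fields, k) embedding matrix of one sample.
    kernel:      (h,) convolution weights applied along the field axis.
    w_recombine: (pooled_fields * k, n_new * k) dense recombination weights.
    Returns the original embeddings with n_new generated rows appended.
    """
    n_fields, k = emb.shape
    h = kernel.shape[0]
    # Zero-pad along the field axis so the convolution keeps n_fields rows.
    pad_top = (h - 1) // 2
    pad_bottom = h - 1 - pad_top
    padded = np.pad(emb, ((pad_top, pad_bottom), (0, 0)))
    # Convolution over neighbouring fields captures local patterns.
    conv = np.stack([
        np.tanh((padded[i:i + h] * kernel[:, None]).sum(axis=0))
        for i in range(n_fields)
    ])
    # Max pooling over groups of `pool` adjacent fields.
    pooled_fields = n_fields // pool
    pooled = conv[:pooled_fields * pool].reshape(pooled_fields, pool, k).max(axis=1)
    # Recombination: a dense layer mixes all local patterns into new features.
    new = np.tanh(pooled.reshape(-1) @ w_recombine).reshape(-1, k)
    # Augmented feature space: original embeddings plus generated ones.
    return np.concatenate([emb, new], axis=0)

rng = np.random.default_rng(0)
emb = rng.normal(size=(6, 4))            # 6 fields, embedding size 4
kernel = rng.normal(size=3)              # kernel height 3
w_recombine = rng.normal(size=(12, 8))   # 3 pooled fields * 4 -> 2 new features * 4
augmented = generate_features(emb, kernel, w_recombine)
```

In this sketch the augmented matrix would then be passed to whichever deep classifier is used, such as the IPNN-style classifier the abstract mentions.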
Matrix factorizations and link homology II
To a presentation of an oriented link as the closure of a braid we assign a
complex of bigraded vector spaces. The Euler characteristic of this complex
(and of its triply-graded cohomology groups) is the HOMFLYPT polynomial of the
link. We show that the dimension of each cohomology group is a link invariant.
Comment: 37 pages, 20 figures; version 2 corrects an inaccuracy in the proof
of Proposition
Integration of molecular network data reconstructs Gene Ontology.
Motivation: Recently, a shift was made from using Gene Ontology (GO) to evaluate molecular network data to using these data to construct and evaluate GO. Dutkowski et al. provide the first evidence that a large part of GO can be reconstructed solely from topologies of molecular networks. Motivated by this work, we develop a novel data integration framework that integrates multiple types of molecular network data to reconstruct and update GO. We ask how much of GO can be recovered by integrating various molecular interaction data.
Results: We introduce a computational framework for integration of various biological networks using penalized non-negative matrix tri-factorization (PNMTF). It takes all network data in a matrix form and performs simultaneous clustering of genes and GO terms, inducing new relations between genes and GO terms (annotations) and between GO terms themselves. To improve the accuracy of our predicted relations, we extend the integration methodology to include additional topological information represented as the similarity in wiring around non-interacting genes. Surprisingly, by integrating topologies of baker’s yeast protein–protein interaction, genetic interaction (GI) and co-expression networks, our method reports as related 96% of GO terms that are directly related in GO. The inclusion of the wiring similarity of non-interacting genes contributes 6% to this large GO term association capture. Furthermore, we use our method to infer new relationships between GO terms solely from the topologies of these networks and validate 44% of our predictions in the literature. In addition, our integration method reproduces 48% of cellular component, 41% of molecular function and 41% of biological process GO terms, outperforming the previous method in the former two domains of GO. Finally, we predict new GO annotations of yeast genes and validate our predictions through GI profiling.
Availability and implementation: Supplementary Tables of new GO term associations and predicted gene annotations are available at http://bio-nets.doc.ic.ac.uk/GO-Reconstruction/.
Contact: [email protected]
Supplementary information: Supplementary data are available at Bioinformatics online.
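The core operation named in this abstract, non-negative matrix tri-factorization, can be illustrated with a short sketch. The code below is plain, unpenalized tri-factorization with standard multiplicative updates; it omits the network-derived penalty terms that make the authors' method PNMTF, and the function name `nmtf`, the rank choices, and the random relation matrix are assumptions for illustration only. It factorizes a non-negative relation matrix R (for example genes by GO terms) as R ≈ G1 S G2ᵀ, where the rows of G1 and G2 can be read as soft cluster memberships.

```python
import numpy as np

def nmtf(R, k1, k2, n_iter=200, eps=1e-9, seed=0):
    """Unpenalized non-negative tri-factorization R ~ G1 @ S @ G2.T.

    Hypothetical sketch: multiplicative updates for the squared
    Frobenius error, without the penalty terms of PNMTF.
    """
    rng = np.random.default_rng(seed)
    n, m = R.shape
    # Strictly positive initialization keeps all factors non-negative.
    G1 = rng.random((n, k1)) + 0.1
    S = rng.random((k1, k2)) + 0.1
    G2 = rng.random((m, k2)) + 0.1
    errors = []
    for _ in range(n_iter):
        G1 *= (R @ G2 @ S.T) / (G1 @ S @ G2.T @ G2 @ S.T + eps)
        G2 *= (R.T @ G1 @ S) / (G2 @ S.T @ G1.T @ G1 @ S + eps)
        S *= (G1.T @ R @ G2) / (G1.T @ G1 @ S @ G2.T @ G2 + eps)
        errors.append(np.linalg.norm(R - G1 @ S @ G2.T) ** 2)
    return G1, S, G2, errors

rng = np.random.default_rng(1)
R = rng.random((8, 6))                 # toy non-negative relation matrix
G1, S, G2, errors = nmtf(R, k1=3, k2=2)
row_clusters = G1.argmax(axis=1)       # hard cluster label per row entity
col_clusters = G2.argmax(axis=1)       # hard cluster label per column entity
reconstructed = G1 @ S @ G2.T          # large entries suggest candidate relations
```

Under these assumptions, entries of the reconstructed matrix that are large where R is zero would be the candidate new relations, which is the sense in which the factorization induces new associations.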
Supervised cross-modal factor analysis for multiple modal data classification
In this paper we study the problem of learning from multi-modal data for
the purpose of document classification. In this problem, each document is composed
of two different modalities of data, i.e., an image and a text. Cross-modal factor
analysis (CFA) has been proposed to project the two modalities to
a shared data space, so that the classification of an image or a text can be
performed directly in this space. A disadvantage of CFA is that it ignores
the supervision information. In this paper, we improve CFA by incorporating the
supervision information to represent and classify both the image and text modalities of
documents. We project both image and text data to a shared data space by factor
analysis, and then train a class label predictor in the shared space to use the
class label information. The factor analysis parameters and the predictor
parameters are learned jointly by solving a single objective function. With
this objective function, we minimize both the distance between the projections of
the image and text of the same document and the classification error of the
projections, measured by the hinge loss function. The objective function is optimized
by an alternating optimization strategy in an iterative algorithm. Experiments on
two different multi-modal document data sets show the advantage of the
proposed algorithm over other CFA methods.
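The joint objective described in this abstract can be written down in a few lines. The sketch below is one plausible reading, not the paper's exact formulation: it assumes binary labels in {-1, +1}, a linear predictor (w, b) shared by both modalities, linear projections U and V, and a trade-off weight `c`; all of these names are hypothetical, and any regularization terms are omitted.

```python
import numpy as np

def scfa_objective(X, Y, U, V, w, b, labels, c=1.0):
    """Hypothetical supervised-CFA objective value.

    X: (n, dx) image features, Y: (n, dy) text features.
    U: (dx, d), V: (dy, d) projections into the shared space.
    w: (d,), b: scalar linear predictor; labels: (n,) in {-1, +1}.
    """
    px, py = X @ U, Y @ V
    # Distance between the two projections of the same document.
    distance = np.sum((px - py) ** 2)
    # Hinge loss of the shared predictor on both projections.
    hinge_x = np.maximum(0.0, 1.0 - labels * (px @ w + b)).sum()
    hinge_y = np.maximum(0.0, 1.0 - labels * (py @ w + b)).sum()
    return distance + c * (hinge_x + hinge_y)

X = np.eye(2)
Y = np.eye(2)
U = np.eye(2)
V = np.eye(2)
labels = np.array([1.0, -1.0])
perfect = scfa_objective(X, Y, U, V, np.array([2.0, -2.0]), 0.0, labels)
untrained = scfa_objective(X, Y, U, V, np.zeros(2), 0.0, labels)
```

An alternating scheme of the kind the abstract mentions would hold (w, b) fixed while updating U and V, then hold the projections fixed while updating the predictor, repeating until this value stops decreasing.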