Representation Learning on Unstructured Data
Representation learning, which transforms real-world data such as graphs, images, and texts into representations that can be effectively processed by machine learning algorithms, has become a new focus in the machine learning community. Traditional machine learning algorithms usually model hand-crafted feature representations manually extracted from the raw data, and the performance of the model depends heavily on the quality of the data representation. However, feature engineering is laborious, often inaccurate, and generalizes poorly. Thus the weakness of many current learning algorithms is not how well they can model the data, but how good their input data representations are. In this thesis, we apply learning algorithms to both representing and modeling graph data in two different applications. In the first work, we develop representations of the nodes and then apply a well-known VG kernel to these representations. In the second work, we show the power of the captured representations by jointly optimizing the node representations and the model. The results of both works show significant improvements over traditional machine learning methods.
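As a rough illustration of the second setting (a sketch, not code from the thesis), the snippet below jointly optimizes free node embeddings and a linear classifier on a toy graph, with a Laplacian smoothness penalty standing in for the graph structure; the graph, dimensions, and loss are illustrative assumptions.

```python
# Minimal sketch: jointly optimizing node embeddings and a linear classifier
# on a toy graph, with a Laplacian smoothness penalty so connected nodes get
# similar representations. All sizes and the loss are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, d, c = 20, 8, 2                      # nodes, embedding dim, classes
A = (rng.random((n, n)) < 0.15).astype(float)
A = np.triu(A, 1); A = A + A.T          # symmetric adjacency, no self-loops
L = np.diag(A.sum(1)) - A               # combinatorial graph Laplacian
y = rng.integers(0, c, size=n)          # toy node labels
Y = np.eye(c)[y]                        # one-hot targets

Z = 0.1 * rng.standard_normal((n, d))   # node embeddings (learned)
W = 0.1 * rng.standard_normal((d, c))   # linear classifier (learned)
lr, lam = 0.1, 0.05

for step in range(500):
    logits = Z @ W
    P = np.exp(logits - logits.max(1, keepdims=True))
    P /= P.sum(1, keepdims=True)        # softmax predictions
    G = (P - Y) / n                     # gradient of cross-entropy wrt logits
    grad_W = Z.T @ G
    grad_Z = G @ W.T + 2 * lam * (L @ Z)  # classification + smoothness terms
    W -= lr * grad_W
    Z -= lr * grad_Z                    # embeddings and model updated jointly

print("final training accuracy:", (P.argmax(1) == y).mean())
```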
Learning parametric dictionaries for graph signals
In sparse signal representation, the choice of a dictionary often involves a
tradeoff between two desirable properties -- the ability to adapt to specific
signal data and a fast implementation of the dictionary. To sparsely represent
signals residing on weighted graphs, an additional design challenge is to
incorporate the intrinsic geometric structure of the irregular data domain into
the atoms of the dictionary. In this work, we propose a parametric dictionary
learning algorithm to design data-adapted, structured dictionaries that
sparsely represent graph signals. In particular, we model graph signals as
combinations of overlapping local patterns. We impose the constraint that each
dictionary is a concatenation of subdictionaries, with each subdictionary being
a polynomial of the graph Laplacian matrix, representing a single pattern
translated to different areas of the graph. The learning algorithm adapts the
patterns to a training set of graph signals. Experimental results on both
synthetic and real datasets demonstrate that the dictionaries learned by the
proposed algorithm are competitive with and often better than unstructured
dictionaries learned by state-of-the-art numerical learning algorithms in terms
of sparse approximation of graph signals. In contrast to the unstructured
dictionaries, however, the dictionaries learned by the proposed algorithm
feature localized atoms and can be implemented in a computationally efficient
manner in signal processing tasks such as compression, denoising, and
classification.
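A compact sketch of the dictionary structure described above, under assumed (random) polynomial coefficients rather than learned ones: each subdictionary is a low-order polynomial of the graph Laplacian, the full dictionary is their concatenation, and a toy graph signal is then sparsely approximated with a greedy selection step.

```python
# Sketch of the structured dictionary (assumed coefficients, not the paper's
# learned ones): D = [p_1(L) | ... | p_S(L)] with p_s(L) a polynomial of the
# graph Laplacian L, followed by a greedy (OMP-style) sparse approximation.
import numpy as np

rng = np.random.default_rng(1)
n, S, K = 30, 2, 3                       # nodes, subdictionaries, polynomial degree
A = (rng.random((n, n)) < 0.1).astype(float)
A = np.triu(A, 1); A = A + A.T
L = np.diag(A.sum(1)) - A                # graph Laplacian

def poly_dictionary(L, alpha):
    """Build D = [p_1(L) | ... | p_S(L)] with p_s(L) = sum_k alpha[s, k] L^k."""
    n = L.shape[0]
    powers = [np.eye(n)]
    for _ in range(alpha.shape[1] - 1):
        powers.append(powers[-1] @ L)
    subs = [sum(a_k * P for a_k, P in zip(a_s, powers)) for a_s in alpha]
    return np.hstack(subs)               # n x (S*n) structured dictionary

alpha = rng.standard_normal((S, K + 1))  # stand-in for learned coefficients
D = poly_dictionary(L, alpha)
D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12

# Greedy sparse approximation of one toy graph signal with 5 atoms.
x = rng.standard_normal(n)
residual, support = x.copy(), []
for _ in range(5):
    support.append(int(np.argmax(np.abs(D.T @ residual))))
    coeffs, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
    residual = x - D[:, support] @ coeffs

print("relative approximation error:", np.linalg.norm(residual) / np.linalg.norm(x))
```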
Multi-level 3D CNN for Learning Multi-scale Spatial Features
3D object recognition accuracy can be improved by learning the multi-scale
spatial features from 3D spatial geometric representations of objects such as
point clouds, 3D models, surfaces, and RGB-D data. Current deep learning
approaches learn such features either using structured data representations
(voxel grids and octrees) or from unstructured representations (graphs and
point clouds). Learning features from such structured representations is
limited by the restriction on resolution and tree depth, while unstructured
representations create a challenge due to non-uniformity among data samples.
In this paper, we propose an end-to-end multi-level learning approach on a
multi-level voxel grid to overcome these drawbacks. To demonstrate the utility
of the proposed multi-level learning, we use a multi-level voxel representation
of 3D objects to perform object recognition. The multi-level voxel
representation consists of a coarse voxel grid that contains volumetric
information of the 3D object. In addition, each voxel in the coarse grid that
contains a portion of the object boundary is subdivided into multiple
fine-level voxel grids. The performance of our multi-level learning algorithm
for object recognition is comparable to dense voxel representations while using
significantly lower memory.
Comment: CVPR 2019 workshop on Deep Learning for Geometric Shape Understanding
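The sketch below illustrates the two-level voxel representation with assumed grid sizes; the CNN itself is omitted and, for simplicity, every occupied coarse voxel is refined rather than only those containing the object boundary.

```python
# Rough sketch (assumed sizes, not the paper's implementation) of the
# multi-level voxel representation: a coarse occupancy grid over the whole
# object, plus a fine occupancy grid inside each occupied coarse voxel.
import numpy as np

rng = np.random.default_rng(2)
points = rng.random((2000, 3))           # toy point cloud in the unit cube
COARSE, FINE = 8, 4                      # coarse grid 8^3, fine grids 4^3

# Coarse level: which coarse voxel each point falls into.
coarse_idx = np.clip((points * COARSE).astype(int), 0, COARSE - 1)
coarse_grid = np.zeros((COARSE,) * 3, dtype=np.uint8)
coarse_grid[tuple(coarse_idx.T)] = 1

# Fine level: subdivide each occupied coarse voxel into a FINE^3 occupancy
# grid of the points that fall inside it.
fine_grids = {}
for voxel in map(tuple, np.unique(coarse_idx, axis=0)):
    inside = np.all(coarse_idx == voxel, axis=1)
    local = points[inside] * COARSE - np.array(voxel)   # coords within voxel, in [0, 1)
    fine_idx = np.clip((local * FINE).astype(int), 0, FINE - 1)
    fine = np.zeros((FINE,) * 3, dtype=np.uint8)
    fine[tuple(fine_idx.T)] = 1
    fine_grids[voxel] = fine

print("occupied coarse voxels:", int(coarse_grid.sum()), "refined:", len(fine_grids))
```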
Learning over Knowledge-Base Embeddings for Recommendation
State-of-the-art recommendation algorithms -- especially the collaborative
filtering (CF) based approaches with shallow or deep models -- usually work
with various unstructured information sources for recommendation, such as
textual reviews, visual images, and various implicit or explicit feedback.
Though structured knowledge bases were considered in content-based approaches,
they have been largely neglected recently due to the availability of vast
amounts of data and the learning power of many complex models.
However, structured knowledge bases exhibit unique advantages in personalized
recommendation systems. When the explicit knowledge about users and items is
considered for recommendation, the system could provide highly customized
recommendations based on users' historical behaviors. A great challenge for
using knowledge bases for recommendation is how to integrate large-scale
structured and unstructured data, while taking advantage of collaborative
filtering for highly accurate performance. Recent achievements in knowledge
base embedding shed light on this problem, making it possible to learn
user and item representations while preserving the structure of their
relationship with external knowledge. In this work, we propose to reason over
knowledge base embeddings for personalized recommendation. Specifically, we
propose a knowledge base representation learning approach to embed
heterogeneous entities for recommendation. Experimental results on a
real-world dataset verify the superior performance of our approach compared
with state-of-the-art baselines.
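To make the embedding idea concrete, here is a small TransE-style sketch; the translation-based model, sizes, and training loop are assumptions for illustration, not the paper's exact method. Users, items, and a single "interact" relation are embedded so that user + relation lands near interacted items, and recommendation scores come from distances in that space.

```python
# Minimal TransE-style sketch (assumed model, not the paper's exact method):
# embed users, items, and one "interact" relation so that user + relation is
# close to interacted items, then rank items for a user by that distance.
import numpy as np

rng = np.random.default_rng(3)
n_users, n_items, d = 5, 12, 16
interactions = [(u, rng.integers(0, n_items)) for u in range(n_users) for _ in range(4)]

U = 0.1 * rng.standard_normal((n_users, d))   # user embeddings
V = 0.1 * rng.standard_normal((n_items, d))   # item embeddings
r = 0.1 * rng.standard_normal(d)              # embedding of the "interact" relation
lr, margin = 0.05, 1.0

for epoch in range(200):
    for u, i in interactions:
        j = rng.integers(0, n_items)          # random negative item
        pos = U[u] + r - V[i]
        neg = U[u] + r - V[j]
        # Margin ranking loss on squared distances: only update when violated.
        if pos @ pos + margin > neg @ neg:
            U[u] -= lr * 2 * (pos - neg)
            r    -= lr * 2 * (pos - neg)
            V[i] -= lr * 2 * (-pos)
            V[j] -= lr * 2 * (neg)

scores = -np.sum((U[0] + r - V) ** 2, axis=1)  # higher score = closer item
print("top items for user 0:", np.argsort(-scores)[:3])
```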
Model Debiasing via Gradient-based Explanation on Representation
Machine learning systems produce biased results towards certain demographic
groups, known as the fairness problem. Recent approaches to tackle this problem
learn a latent code (i.e., representation) through disentangled representation
learning and then discard the latent code dimensions correlated with sensitive
attributes (e.g., gender). Nevertheless, these approaches may suffer from
incomplete disentanglement and overlook proxy attributes (proxies for sensitive
attributes) when processing real-world data, especially for unstructured data,
causing performance degradation in fairness and loss of useful information for
downstream tasks. In this paper, we propose a novel fairness framework that
performs debiasing with regard to both sensitive attributes and proxy
attributes, which boosts the prediction performance of downstream task models
without complete disentanglement. The main idea is to, first, leverage
gradient-based explanation to find two model focuses, 1) one focus for
predicting sensitive attributes and 2) the other focus for predicting
downstream task labels, and second, use them to perturb the latent code that
guides the training of downstream task models towards fairness and utility
goals. We show empirically that our framework works with both disentangled and
non-disentangled representation learning methods and achieves better
fairness-accuracy trade-off on unstructured and structured datasets than
previous state-of-the-art approaches.
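A short PyTorch sketch of the perturbation step, with assumed linear heads and step sizes (not the paper's exact procedure): gradients of a sensitive-attribute head and a task head with respect to the latent code give two "focus" directions, and the code is nudged away from the sensitive focus and along the task focus before training the downstream model.

```python
# Rough sketch of the gradient-based perturbation idea (assumed heads and
# step sizes): gradients of a sensitive-attribute predictor and a task
# predictor w.r.t. the latent code give two "focus" directions used to
# perturb the code.
import torch
import torch.nn as nn

torch.manual_seed(0)
d = 32
z = torch.randn(8, d, requires_grad=True)       # toy latent codes from an encoder
sensitive_head = nn.Linear(d, 1)                # predicts the sensitive attribute
task_head = nn.Linear(d, 1)                     # predicts the downstream label

# Gradient-based explanations: how each latent dimension drives each prediction.
sens_grad = torch.autograd.grad(sensitive_head(z).sum(), z, retain_graph=True)[0]
task_grad = torch.autograd.grad(task_head(z).sum(), z)[0]

alpha, beta = 0.5, 0.5                          # assumed perturbation strengths
z_debiased = z.detach() - alpha * sens_grad + beta * task_grad

# z_debiased would then guide the training of the downstream task model.
print(z_debiased.shape)
```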