A Divide-and-Conquer Solver for Kernel Support Vector Machines
The kernel support vector machine (SVM) is one of the most widely used
classification methods; however, the amount of computation required becomes the
bottleneck when facing millions of samples. In this paper, we propose and
analyze a novel divide-and-conquer solver for kernel SVMs (DC-SVM). In the
division step, we partition the kernel SVM problem into smaller subproblems by
clustering the data, so that each subproblem can be solved independently and
efficiently. We show theoretically that the support vectors identified by the
subproblem solution are likely to be support vectors of the entire kernel SVM
problem, provided that the problem is partitioned appropriately by kernel
clustering. In the conquer step, the local solutions from the subproblems are
used to initialize a global coordinate descent solver, which converges quickly
as suggested by our analysis. By extending this idea, we develop a multilevel
Divide-and-Conquer SVM algorithm with adaptive clustering and an early prediction
strategy, which outperforms state-of-the-art methods in terms of training
speed, testing accuracy, and memory usage. For example, on the covtype
dataset with half a million samples, DC-SVM is seven times faster than LIBSVM at
obtaining the exact SVM solution (to within relative error), which
achieves 96.15% prediction accuracy. Moreover, with our proposed early
prediction strategy, DC-SVM achieves about 96% accuracy in only 12 minutes,
more than 100 times faster than LIBSVM.
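As a rough illustration of the divide-and-conquer idea, the sketch below (not the authors' implementation) clusters the data with k-means, solves a kernel SVM on each cluster, and then refits a global SVM on the union of the local support vectors; the synthetic dataset, RBF kernel, and all parameters are illustrative assumptions.

```python
# Hypothetical sketch of the DC-SVM divide/conquer steps using scikit-learn.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)

# Divide: partition samples into k clusters, solve a kernel SVM per cluster.
k = 4
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
sv_idx = []
for c in range(k):
    idx = np.where(labels == c)[0]
    if len(np.unique(y[idx])) < 2:      # skip single-class clusters
        continue
    local = SVC(kernel="rbf", C=1.0).fit(X[idx], y[idx])
    sv_idx.extend(idx[local.support_])  # map local support vectors back

# Conquer (simplified): refit a global SVM on the candidate support
# vectors collected from the subproblems.
sv_idx = np.array(sorted(set(sv_idx)))
global_svm = SVC(kernel="rbf", C=1.0).fit(X[sv_idx], y[sv_idx])
acc = global_svm.score(X, y)            # training-set accuracy, illustration only
print(f"global accuracy: {acc:.2f}")
```

The paper's actual conquer step warm-starts a global coordinate descent solver from the local solutions rather than simply refitting, which this toy omits.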
Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks
Graph convolutional networks (GCNs) have been successfully applied to many
graph-based applications; however, training a large-scale GCN remains
challenging. Current SGD-based algorithms suffer from either a high
computational cost that grows exponentially with the number of GCN layers, or a
large space requirement for keeping the entire graph and the embedding of each
node in memory. In this paper, we propose Cluster-GCN, a novel GCN algorithm
that is suitable for SGD-based training by exploiting the graph clustering
structure. Cluster-GCN works as follows: at each step, it samples a block
of nodes associated with a dense subgraph identified by a graph clustering
algorithm, and restricts the neighborhood search to this subgraph. This
simple but effective strategy leads to significantly improved memory and
computational efficiency while achieving test accuracy comparable to that of
previous algorithms. To test the scalability of our algorithm, we create a
new Amazon2M dataset with 2 million nodes and 61 million edges, more than
5 times larger than the previous largest publicly available dataset (Reddit).
For training a 3-layer GCN on this data, Cluster-GCN is faster than the
previous state-of-the-art VR-GCN (1523 seconds vs. 1961 seconds) while using
much less memory (2.2GB vs. 11.2GB). For training a 4-layer GCN on this
data, our algorithm finishes in around 36 minutes, while all existing GCN
training algorithms fail to train due to out-of-memory issues. Furthermore,
Cluster-GCN allows us to train much deeper GCNs without much time or memory
overhead, which leads to improved prediction accuracy: using a 5-layer
Cluster-GCN, we achieve a state-of-the-art test F1 score of 99.36 on the PPI
dataset, while the previous best result was 98.71 by [16]. Our code is
publicly available at
https://github.com/google-research/google-research/tree/master/cluster_gcn.
Comment: In Proceedings of the 25th ACM SIGKDD International Conference on
Knowledge Discovery & Data Mining (KDD'19).
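The subgraph-batching idea above can be sketched in a few lines of numpy (a toy illustration, not the released code): each training step restricts the GCN propagation to one sampled cluster's subgraph, so memory scales with the block rather than the full graph. The toy graph, fixed cluster assignment, and single ReLU layer are assumptions.

```python
# Toy sketch of one Cluster-GCN step: propagate only within a sampled block.
import numpy as np

rng = np.random.default_rng(0)
n, d, h = 8, 5, 3
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.maximum(A, A.T)                  # make the toy graph undirected
X = rng.standard_normal((n, d))         # node features
W = rng.standard_normal((d, h))         # GCN layer weights
# Cluster assignment would come from a graph clustering algorithm (e.g. METIS);
# here it is hard-coded for illustration.
clusters = [np.array([0, 1, 2, 3]), np.array([4, 5, 6, 7])]

def cluster_gcn_step(block):
    # Restrict adjacency and features to the sampled block only.
    A_b = A[np.ix_(block, block)] + np.eye(len(block))  # add self-loops
    deg = A_b.sum(axis=1)
    A_hat = A_b / deg[:, None]                          # row-normalize
    return np.maximum(A_hat @ X[block] @ W, 0.0)        # ReLU(A_hat X W)

H = cluster_gcn_step(clusters[0])
print(H.shape)
```

Because the neighborhood search never leaves the block, no embeddings for out-of-block nodes need to be kept in memory, which is the source of the memory savings the abstract reports.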
The Transformation of Trust in China’s Alternative Food Networks: Disruption, Reconstruction, and Development
Food safety issues in China have received much scholarly attention, yet few studies have systematically examined this matter through the lens of trust. More importantly, little is known about the transformation of different types of trust in the dynamic process of food production, provision, and consumption. We consider trust as an evolving interdependent relationship between different actors. We used the Beijing County Fair, a prominent ecological farmers’ market in China, as an example to examine the transformation of trust in China’s alternative food networks. We argue that although there has been a disruption of institutional trust among the general public since 2008, when the melamine-tainted milk scandal broke out, reconstruction of individual trust and development of organizational trust have been observed, along with the emergence and increasing popularity of alternative food networks. Based on more than six months of fieldwork on the emerging ecological agriculture sector in 13 provinces across China, as well as monitoring of online discussions and posts, we analyze how various social factors—including but not limited to direct and indirect reciprocity, information, endogenous institutions, and altruism—have simultaneously contributed to the transformation of trust in China’s alternative food networks. The findings not only complement current social theories of trust, but also highlight an important yet understudied phenomenon whereby informal social mechanisms have been partially substituting for formal institutions and gradually building trust against the backdrop of the food safety crisis in China.
Scaling Up Dataset Distillation to ImageNet-1K with Constant Memory
Dataset distillation methods aim to compress a large dataset into a small set
of synthetic samples, such that models trained on them achieve performance
competitive with regular training on the entire dataset. Among
recently proposed methods, Matching Training Trajectories (MTT) achieves
state-of-the-art performance on CIFAR-10/100, but has difficulty scaling
to the ImageNet-1K dataset due to the large memory required when performing
unrolled gradient computation through back-propagation. Surprisingly, we show
that there exists a procedure to exactly calculate the gradient of the
trajectory matching loss with a constant GPU memory requirement (independent of
the number of unrolled steps). With this finding, the proposed memory-efficient
trajectory matching method can easily scale to ImageNet-1K with a 6x memory
reduction while introducing only around 2% runtime overhead relative to the
original MTT.
Further, we find that assigning soft labels to synthetic images is crucial for
performance when scaling to a larger number of categories (e.g., 1,000) and
propose a novel soft-label version of trajectory matching that facilitates
better alignment of model training trajectories on large datasets. The proposed
algorithm not only surpasses previous SOTA on ImageNet-1K under extremely low
IPCs (Images Per Class), but also for the first time enables us to scale up to
50 IPCs on ImageNet-1K. Our method (TESLA) achieves 27.9% testing accuracy, a
remarkable +18.2% margin over prior art.
Comment: ICLR 2023 submission link: https://openreview.net/forum?id=dN70O8pmW
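A minimal numpy sketch of the trajectory-matching loss that TESLA differentiates with constant memory (the constant-memory gradient computation itself is not reproduced here): a linear-regression student is unrolled for N SGD steps on synthetic data and compared against two expert checkpoints. All shapes, checkpoints, and hyperparameters are stand-ins.

```python
# Toy trajectory-matching (MTT-style) loss for a linear-regression student.
import numpy as np

rng = np.random.default_rng(0)
d, m = 4, 10
X_syn = rng.standard_normal((m, d))   # synthetic samples (flattened "images")
y_syn = rng.standard_normal(m)
theta_t = rng.standard_normal(d)      # stand-in expert checkpoint at step t
theta_tm = rng.standard_normal(d)     # stand-in expert checkpoint at step t+M

# Unroll N student SGD steps from theta_t on the synthetic set. Naive
# back-propagation through this loop stores every intermediate theta, which
# is the memory bottleneck that TESLA's per-step gradient accumulation avoids.
theta, lr, N = theta_t.copy(), 0.1, 5
for _ in range(N):
    grad = X_syn.T @ (X_syn @ theta - y_syn) / m
    theta = theta - lr * grad

# Matching loss: distance to the later expert checkpoint, normalized by
# how far the expert itself moved between the two checkpoints.
loss = np.sum((theta - theta_tm) ** 2) / np.sum((theta_t - theta_tm) ** 2)
print("matching loss:", loss)
```

Minimizing this loss with respect to X_syn (and, in the soft-label variant, y_syn) is what drives the synthetic set toward reproducing the expert's training trajectory.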
Automatic Engineering of Long Prompts
Large language models (LLMs) have demonstrated remarkable capabilities in
solving complex open-domain tasks, guided by comprehensive instructions and
demonstrations provided in the form of prompts. However, these prompts can be
lengthy, often comprising hundreds of lines and thousands of tokens, and their
design often requires considerable human effort. Recent research has explored
automatic prompt engineering for short prompts, typically consisting of one or
a few sentences. However, the automatic design of long prompts remains a
challenging problem due to its immense search space. In this paper, we
investigate the performance of greedy algorithms and genetic algorithms for
automatic long prompt engineering. We demonstrate that a simple greedy approach
with beam search outperforms other methods in terms of search efficiency.
Moreover, we introduce two novel techniques that utilize search history to
enhance the effectiveness of LLM-based mutation in our search algorithm. Our
results show that the proposed automatic long prompt engineering algorithm
achieves an average accuracy gain of 9.2% on eight tasks in Big Bench Hard,
highlighting the significance of automating prompt design to fully harness the
capabilities of LLMs.
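The greedy beam search over sentence-level prompt edits described above can be sketched as follows; `mutate` and `score` are placeholders (the paper uses an LLM to propose sentence rewrites and held-out task accuracy as the score), so this is a hypothetical skeleton rather than the authors' algorithm.

```python
# Skeleton of greedy beam search over sentence-level edits to a long prompt.

def mutate(prompt, i):
    # Placeholder mutation: in practice an LLM rewrites sentence i.
    new = list(prompt)
    new[i] = new[i].rstrip(".") + " (rewritten)."
    return new

def score(prompt):
    # Placeholder score: in practice, accuracy on a validation task set.
    return sum(s.endswith("(rewritten).") for s in prompt)

def beam_search(prompt, beam_width=2, steps=3):
    beam = [(score(prompt), prompt)]
    for _ in range(steps):
        candidates = []
        for _, p in beam:
            for i in range(len(p)):         # greedily try editing each sentence
                q = mutate(p, i)
                candidates.append((score(q), q))
        candidates.sort(key=lambda t: t[0], reverse=True)
        beam = candidates[:beam_width]      # keep only the top prompts
    return beam[0][1]

best = beam_search(["Do the task.", "Think step by step.", "Answer concisely."])
print(best)
```

Keeping a small beam instead of a single greedy candidate is what the paper finds gives the best trade-off between search efficiency and final prompt quality.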