8,157 research outputs found

    PTE: Predictive Text Embedding through Large-scale Heterogeneous Text Networks

    Unsupervised text embedding methods, such as Skip-gram and Paragraph Vector, have attracted increasing attention due to their simplicity, scalability, and effectiveness. However, compared to sophisticated deep learning architectures such as convolutional neural networks, these methods usually yield inferior results when applied to particular machine learning tasks. One possible reason is that these text embedding methods learn the representation of text in a fully unsupervised way, without leveraging the labeled information available for the task. Although the low dimensional representations learned are applicable to many different tasks, they are not particularly tuned for any task. In this paper, we fill this gap by proposing a semi-supervised representation learning method for text data, which we call predictive text embedding (PTE). Predictive text embedding utilizes both labeled and unlabeled data to learn the embedding of text. The labeled information and different levels of word co-occurrence information are first represented as a large-scale heterogeneous text network, which is then embedded into a low dimensional space through a principled and efficient algorithm. This low dimensional embedding not only preserves the semantic closeness of words and documents, but also has strong predictive power for the particular task. Compared to recent supervised approaches based on convolutional neural networks, predictive text embedding is comparable or more effective, much more efficient, and has fewer parameters to tune. Comment: KDD 201
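The network construction step described in this abstract can be illustrated with a minimal Python sketch. It assumes the "different levels of word co-occurrence" are word-word (within a sliding window) and word-document edges, and that labeled information becomes word-label edges. The function name `build_text_networks`, the `window` parameter, and the toy data layout are hypothetical and chosen for illustration only; the embedding step itself is not shown.

```python
from collections import Counter


def build_text_networks(docs, labels, window=2):
    """Build three weighted edge sets from tokenized documents.

    docs   : list of token lists
    labels : list of the same length; a class label, or None if unlabeled
    window : how many following tokens count as co-occurring (assumed)
    """
    word_word = Counter()   # (word, word) -> co-occurrence count
    word_doc = Counter()    # (word, doc_id) -> term frequency
    word_label = Counter()  # (word, label) -> count over labeled docs only

    for doc_id, (tokens, label) in enumerate(zip(docs, labels)):
        for i, word in enumerate(tokens):
            word_doc[(word, doc_id)] += 1
            if label is not None:
                word_label[(word, label)] += 1
            # Pair the word with the next `window` tokens; sort so the
            # edge is undirected.
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                word_word[tuple(sorted((word, tokens[j])))] += 1

    return word_word, word_doc, word_label


if __name__ == "__main__":
    docs = [["good", "movie"], ["bad", "movie", "plot"]]
    labels = ["pos", None]  # the second document is unlabeled
    ww, wd, wl = build_text_networks(docs, labels)
    print(dict(ww))
    print(dict(wl))
```

Unlabeled documents still contribute word-word and word-document edges, which is how a semi-supervised method can draw on both kinds of data.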

    Trade Coefficients and the Role of Elasticity in a Spatial CGE Model Based on the Armington Assumption

    The Armington Assumption in the context of multi-regional CGE models is commonly interpreted as follows: the same commodities with different origins are imperfect substitutes for each other. In this paper, a static spatial CGE model that is compatible with this assumption and explicitly considers the transport sector and regional price differentials is formulated. Trade coefficients, which are derived endogenously from the optimization behavior of firms and households, are shown to take the form of a potential function. To investigate how the elasticity of substitution affects equilibrium solutions, a simpler version of the model that incorporates three regions and two sectors (besides the transport sector) is introduced. Results indicate: (1) if commodities produced in different regions are perfect substitutes, regional economies will be either autarkic or completely symmetric and (2) if they are imperfect substitutes, the impact of elasticity on the price equilibrium system as well as trade coefficients will be nonlinear and sometimes very sensitive. Keywords: Armington Assumption, Spatial CGE, Elasticity of substitution, Trade coefficient, Econometric model
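For orientation, the textbook CES form of an Armington trade share is sketched below. This is the standard result, not the paper's own derivation, and every symbol is an assumption introduced here: $q_{rs}$ is the quantity shipped from region $r$ to region $s$, $p_{rs}$ the delivered price (including transport cost), $\beta_r$ a share parameter, and $\sigma$ the elasticity of substitution.

```latex
% CES aggregate of regional varieties consumed in region s (assumed form)
Q_s = \Bigl( \sum_{r} \beta_r^{1/\sigma}\, q_{rs}^{(\sigma-1)/\sigma} \Bigr)^{\sigma/(\sigma-1)}

% Cost minimization gives the demand for each origin and the price index
q_{rs} = \beta_r \Bigl( \frac{p_{rs}}{P_s} \Bigr)^{-\sigma} Q_s ,
\qquad
P_s = \Bigl( \sum_{k} \beta_k\, p_{ks}^{\,1-\sigma} \Bigr)^{1/(1-\sigma)}

% Value share of origin r in region s's purchases (a trade coefficient)
t_{rs} = \frac{p_{rs}\, q_{rs}}{P_s\, Q_s}
       = \frac{\beta_r\, p_{rs}^{\,1-\sigma}}{\sum_{k} \beta_k\, p_{ks}^{\,1-\sigma}}
```

In this form the shares depend on prices through the exponent $1-\sigma$, so for large $\sigma$ a small price differential shifts the shares sharply, which is in line with the nonlinear sensitivity the abstract reports.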

    Socially engaged art tourism, in-migrants micro-entrepreneurship, and peripheral island revitalization


    Towards combining deep learning and statistical relational learning for reasoning on graphs

    This thesis centers on the analysis of graph-structured data, a ubiquitous data format in the real world. Reasoning within graph-structured data has long been a fundamental problem in machine learning, with applications spanning from node classification to link prediction.
There are two principal approaches to reasoning within graph-structured data: statistical relational learning and deep learning. Statistical relational learning techniques construct probabilistic graphical models based on observed data, excelling at capturing intricate dependencies of available evidence while accommodating prior knowledge, such as logic rules. Notable methods include Markov logic networks (MLNs) and conditional random fields (CRFs). In contrast, deep learning models learn meaningful representations from observed data and use these representations to rapidly comprehend and reason over the data. Graph neural networks (GNNs) have emerged as prominent tools in the realm of deep learning, achieving state-of-the-art results across a spectrum of tasks. Statistical relational learning and deep learning offer distinct perspectives on reasoning. Intuitively, combining these paradigms promises to create a more robust framework that inherits expressive power, efficiency, and the ability to model joint dependencies while simultaneously acquiring representations for more effective reasoning. In pursuit of this vision, this thesis explores this combination, developing methods that integrate deep learning and statistical relational learning. Specifically, deep learning enhances the efficiency of learning and inference within statistical relational learning, while statistical relational learning, in turn, refines the predictions generated by deep learning to improve accuracy. This integrated paradigm is applied across a diverse range of reasoning tasks on graphs. Empirical results demonstrate the effectiveness of this paradigm, encouraging further exploration to yield more robust reasoning frameworks.

    GraphVite: A High-Performance CPU-GPU Hybrid System for Node Embedding

    Learning continuous representations of nodes has recently attracted growing interest in both academia and industry, due to their simplicity and effectiveness in a variety of applications. Most existing node embedding algorithms and systems are capable of processing networks with hundreds of thousands or a few million nodes. However, how to scale them to networks that have tens of millions or even hundreds of millions of nodes remains a challenging problem. In this paper, we propose GraphVite, a high-performance CPU-GPU hybrid system for training node embeddings, by co-optimizing the algorithm and the system. On the CPU end, augmented edge samples are generated in parallel by random walks in an online fashion on the network, and serve as the training data. On the GPU end, a novel parallel negative sampling is proposed to leverage multiple GPUs to train node embeddings simultaneously, without much data transfer and synchronization. Moreover, an efficient collaboration strategy is proposed to further reduce the synchronization cost between CPUs and GPUs. Experiments on multiple real-world networks show that GraphVite is highly efficient. It takes only about one minute for a network with 1 million nodes and 5 million edges on a single machine with 4 GPUs, and takes around 20 hours for a network with 66 million nodes and 1.8 billion edges. Compared to the current fastest system, GraphVite is about 50 times faster without any sacrifice on performance. Comment: accepted at WWW 201
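The two sampling ideas named in this abstract, edge samples generated from random walks and negative sampling, can be sketched in single-threaded Python as follows. This is not GraphVite's implementation, which is a parallel CPU-GPU system; the function names, the `window` parameter, and the uniform negative sampler are assumptions made for illustration.

```python
import random


def augmented_edge_samples(adj, walk_length, window, rng):
    """Yield (source, target) pairs taken from one random walk per node.

    adj         : dict mapping node -> list of neighbor nodes
    walk_length : maximum number of nodes in each walk
    window      : nodes up to this many steps ahead are paired (assumed)
    rng         : random.Random instance, passed in for reproducibility
    """
    for start in adj:
        walk = [start]
        while len(walk) < walk_length:
            neighbors = adj.get(walk[-1], [])
            if not neighbors:  # dead end: stop the walk early
                break
            walk.append(rng.choice(neighbors))
        # Every node is paired with the nodes that follow it in the walk,
        # so pairs beyond direct neighbors "augment" the original edges.
        for i, source in enumerate(walk):
            for target in walk[i + 1 : i + 1 + window]:
                yield (source, target)


def negative_samples(nodes, k, rng):
    """Draw k nodes uniformly at random as negatives (a simplification)."""
    return [rng.choice(nodes) for _ in range(k)]


if __name__ == "__main__":
    rng = random.Random(0)
    adj = {0: [1], 1: [2], 2: [0]}  # a directed 3-cycle
    for pair in augmented_edge_samples(adj, walk_length=3, window=2, rng=rng):
        print(pair, negative_samples(list(adj), 2, rng))
```

In a real system the positive pairs and their negatives would feed a gradient update on the embedding vectors; that step is omitted here.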

    Application of the Input-Output Decomposition Technique to China's Regional Economies

    Structural decomposition techniques based on input-output tables have become a widely used tool for analyzing long-term economic growth. However, due to limitations of data, such techniques have never been applied to China's regional economies. Fortunately, in 2003, China's Interregional Input-Output Table for 1987 and Multi-regional Input-Output Table for 1997 were published, making decomposition analysis of China's regional economies possible. This paper first estimates the interregional input-output table in constant prices by using an alternative approach, the Grid-Search method, and then applies the standard input-output decomposition technique to China's regional economies for 1987-97. Based on the decomposition results, the contributions to output growth of different factors are summarized at the regional and industrial level. Furthermore, interdependence between China's regional economies is measured and explained by aggregating the decomposition factors into the intraregional multiplier-related effect, the feedback-related effect, and the spillover-related effect. Finally, the performance of China's industrial and regional development policies implemented in the 1990s is briefly discussed based on the analytical results of the paper.
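As a rough illustration of what an input-output decomposition computes, the sketch below splits output growth between two years into a technology effect and a final-demand effect, using the Leontief relation x = L f with L = (I - A)^-1 and the common two-polar average form. It is a two-sector toy in plain Python with made-up numbers. It does not reproduce the paper's interregional factors (intraregional multiplier, feedback, and spillover effects), and all function names are hypothetical.

```python
def leontief_inverse_2x2(A):
    """Return (I - A)^-1 for a 2x2 input-coefficient matrix A."""
    (a, b), (c, d) = A
    m00, m01, m10, m11 = 1 - a, -b, -c, 1 - d
    det = m00 * m11 - m01 * m10
    return [[m11 / det, -m01 / det], [-m10 / det, m00 / det]]


def matvec(M, v):
    """Multiply matrix M by vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def decompose_output_growth(A0, f0, A1, f1):
    """Split x1 - x0 into technology and final-demand effects.

    Uses the identity
        L1 f1 - L0 f0 = 0.5 (L1 - L0)(f0 + f1) + 0.5 (L0 + L1)(f1 - f0),
    which holds exactly, so the two effects sum to the total change.
    """
    L0, L1 = leontief_inverse_2x2(A0), leontief_inverse_2x2(A1)
    n = 2
    technology = [
        0.5 * sum((L1[i][j] - L0[i][j]) * (f0[j] + f1[j]) for j in range(n))
        for i in range(n)
    ]
    demand = [
        0.5 * sum((L0[i][j] + L1[i][j]) * (f1[j] - f0[j]) for j in range(n))
        for i in range(n)
    ]
    x0, x1 = matvec(L0, f0), matvec(L1, f1)
    total = [x1[i] - x0[i] for i in range(n)]
    return technology, demand, total


if __name__ == "__main__":
    # Illustrative coefficients and final demand, not real data.
    A0, f0 = [[0.2, 0.3], [0.1, 0.4]], [100.0, 50.0]
    A1, f1 = [[0.25, 0.3], [0.1, 0.35]], [130.0, 70.0]
    tech, dem, total = decompose_output_growth(A0, f0, A1, f1)
    print("technology effect:", tech)
    print("final-demand effect:", dem)
    print("total output change:", total)
```

The averaging of base-year and end-year weights is one conventional choice among several; other weightings give different splits of the same total.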