103 research outputs found

    Sequential Foreign Investments, Regional Technology Platforms and the Evolution of Japanese Multinationals in East Asia

    Get PDF
    In this paper, we investigate the firm-level mechanisms that underlie the sequential foreign direct investment (FDI) decisions of multinational corporations (MNCs). To understand inter-firm heterogeneity in the sequential FDI behaviors of MNCs, we develop a firm capability-based model of sequential FDI decisions. In the setting of Japanese electronics MNCs in East Asia, we empirically examine how prior investments in firm capabilities affect sequential investments into existing production bases in response to major environmental changes. In our empirical investigation, which is based on descriptive statistical analysis, panel data regression analysis, and field studies, we find supporting evidence for our main argument that sequential FDIs are firm-specific, evolutionary processes in which prior investments in firm capabilities influence future sequential FDI behaviors.

    Intellectual Property Regimes, Innovative Capabilities, and Patenting in Korea

    Get PDF
    In this paper, we empirically investigate whether and to what extent major changes in intellectual property rights (IPR) contributed to the subsequent upgrading of innovative capabilities and patenting in Korea. We find that major IPR changes in Korea in the 1980s led to a large increase in patenting, supporting the friendly-court hypothesis. In particular, the trend of substance patent applications by local residents suggests that the IPR change in Korea encouraged local firms to focus more on developing innovative capabilities and to patent more actively. Based on the Korean experience, we offer an insight into the recent debate on the relationship between IPR and economic development in developing countries.

    Search Behavior and Catch-up of Firms in Emerging Markets

    Get PDF
    This study investigates catch-up in the form of knowledge creation by firms in emerging markets, stressing two distinct types of organizational search behavior: horizontal search and vertical search. Based on an empirical analysis of 204 Chinese firms, the study provides new theoretical insights and practical implications by emphasizing that, in order to catch up, firms in emerging markets should adopt idiosyncratic search strategies different from those of firms in more advanced countries. The regression results show that, due to their underdeveloped absorptive capacity, firms in emerging markets should avoid searching across diverse knowledge fields, as established large firms in advanced countries are encouraged to do, if they are to innovate successfully. Our findings also suggest that searching for recent and emerging knowledge helps firms in emerging markets overcome their learning-curve disadvantage in the process of catch-up.

    GraNNDis: Efficient Unified Distributed Training Framework for Deep GNNs on Large Clusters

    Full text link
    Graph neural networks (GNNs) are one of the most rapidly growing fields within deep learning. As the datasets and models used for GNNs grow, it becomes nearly impossible to keep the whole network in GPU memory. Among numerous attempts to address this problem, distributed training is a popular approach, but existing distributed approaches suffer from poor scalability, mainly due to slow external-server communication. In this paper, we propose GraNNDis, an efficient distributed framework for training GNNs on large graphs and deep layers. GraNNDis introduces three new techniques. First, shared preloading provides a training structure for a cluster of multi-GPU servers: essential vertex dependencies are preloaded server-wise to reduce low-bandwidth external-server communication. Second, because shared preloading alone is limited by the neighbor explosion problem, expansion-aware sampling prunes vertex dependencies that span server boundaries. Third, cooperative batching creates a unified framework for full-graph and mini-batch training, significantly reducing redundant memory usage in mini-batch training. Together, these techniques let GraNNDis trade off between full-graph and mini-batch training, especially when the entire graph does not fit into GPU memory. In experiments on a multi-server/multi-GPU cluster, GraNNDis provides superior speedup over state-of-the-art distributed GNN training frameworks.
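    As a rough illustration of the shared-preloading idea (not GraNNDis's actual API), the sketch below partitions a toy graph across two servers and computes each server's halo set: the externally owned vertices whose features would be preloaded once per server rather than fetched per layer over the slow external interconnect. The graph, server assignment, and names are all hypothetical.

```python
# Toy graph as an adjacency list over vertices 0..7.
adj = {
    0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 7],
    4: [0, 5], 5: [4, 6], 6: [5, 7], 7: [3, 6],
}

# Assign each vertex to one of two "servers" (each holding several GPUs).
server_of = {v: 0 if v < 4 else 1 for v in adj}

def local_and_halo(server_id):
    """Return the server's own vertices and the external vertices they read."""
    local = {v for v, s in server_of.items() if s == server_id}
    halo = {u for v in local for u in adj[v] if u not in local}
    return local, halo

for s in (0, 1):
    local, halo = local_and_halo(s)
    # With shared preloading, `halo` features cross the external interconnect
    # once and are cached server-wide, instead of once per layer per step.
    print(f"server {s}: local={sorted(local)}, preloaded halo={sorted(halo)}")
```

    Expansion-aware sampling would then shrink each halo set further, since deeper GNN layers would otherwise pull in neighbors-of-neighbors across the server boundary.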

    Learning and innovation: Exploitation and exploration trade-offs

    Get PDF
    This paper examines the relationship between learning and innovation outcomes, focusing on the trade-off between exploitation and exploration in learning and innovation. The study identifies two types of learning and two outcomes of innovation, and finds that exploitation and exploration in learning have opposing associations with innovation rates and impact: while exploitative, localized learning is positively associated with innovation rates but negatively associated with impact, exploratory learning-by-experimentation shows the opposite relationship. We test our hypotheses on panel data covering 103 companies in the global pharmaceutical industry over a seven-year period. The results support the existence of the exploitation-exploration trade-off.

    AGAThA: Fast and Efficient GPU Acceleration of Guided Sequence Alignment for Long Read Mapping

    Full text link
    With advances in genome sequencing technology, the lengths of deoxyribonucleic acid (DNA) sequencing results are rapidly increasing, at lower prices than ever. However, the longer lengths come at the cost of a heavy computational burden for aligning them; for example, aligning sequences to a human reference genome can take tens or even hundreds of hours. The current de facto standard approach for alignment is based on the guided dynamic programming method. Although this takes a long time and could potentially benefit from high-throughput graphics processing units (GPUs), existing GPU-accelerated approaches often compromise the algorithm's structure because of the GPU-unfriendly nature of its computational pattern. Such compromises are not tolerable in the field, because sequence alignment is part of complicated bioinformatics analysis pipelines. In these circumstances, we propose AGAThA, an exact and efficient GPU-based acceleration of guided sequence alignment. We diagnose and address the aspects of the algorithm that are unfriendly to GPUs: strided/redundant memory accesses and workload imbalances that are difficult to predict. In experiments on modern GPUs, AGAThA achieves an 18.8× speedup against the CPU-based baseline, 9.6× against the best GPU-based baseline, and 3.6× against GPU-based algorithms with different heuristics. (Comment: Published at PPoPP 2024)
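    To make the computational pattern concrete, here is a minimal CPU sketch of banded ("guided") dynamic-programming alignment, the kind of pattern AGAThA accelerates: the band restricts |i - j|, and cells on the same anti-diagonal are independent, which is what a GPU kernel would parallelize. The scoring values and band width are illustrative, not the paper's.

```python
NEG = float("-inf")

def banded_align(query, ref, band=3, match=2, mismatch=-1, gap=-2):
    """Global alignment score restricted to the band |i - j| <= band."""
    n, m = len(query), len(ref)
    H = [[NEG] * (m + 1) for _ in range(n + 1)]
    H[0][0] = 0
    for j in range(1, min(m, band) + 1):       # first row, inside the band
        H[0][j] = j * gap
    for i in range(1, n + 1):
        if i <= band:                          # first column, inside the band
            H[i][0] = i * gap
        for j in range(max(1, i - band), min(m, i + band) + 1):
            score = match if query[i - 1] == ref[j - 1] else mismatch
            # Cells outside the band stay -inf and never win the max().
            H[i][j] = max(H[i - 1][j - 1] + score,  # diagonal: (mis)match
                          H[i - 1][j] + gap,        # vertical: gap in ref
                          H[i][j - 1] + gap)        # horizontal: gap in query
    return H[n][m]

print(banded_align("GATTACA", "GATGACA"))  # 11 with these toy scores
```

    The irregular row extents (max/min band clipping) and the data-dependent branch are exactly the strided accesses and workload imbalance the paper identifies as GPU-unfriendly.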

    Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System

    Full text link
    The recent huge advance of large language models (LLMs) has been driven mainly by the increase in the number of parameters. This has led to substantial memory capacity requirements, necessitating the use of dozens of GPUs just to meet the capacity. One popular solution is storage-offloaded training, which uses host memory and storage as an extended memory hierarchy. However, this comes at the cost of a storage bandwidth bottleneck, because storage devices have orders of magnitude lower bandwidth than GPU device memory. Our work, Smart-Infinity, addresses the storage bandwidth bottleneck of storage-offloaded LLM training using near-storage processing devices on a real system. The main component of Smart-Infinity is SmartUpdate, which performs parameter updates on custom near-storage accelerators; we identify that moving parameter updates to the storage side removes most of the storage traffic. In addition, we propose an efficient data-transfer handler structure to address the system integration issues of Smart-Infinity: the handler overlaps data transfers with fixed memory consumption by reusing the device buffer. Lastly, we propose accelerator-assisted gradient compression/decompression to enhance the scalability of Smart-Infinity. When scaling to multiple near-storage processing devices, write traffic on the shared channel becomes the bottleneck; to alleviate this, we compress gradients on the GPU and decompress them on the accelerators, providing further acceleration from the reduced traffic. As a result, Smart-Infinity achieves a significant speedup compared to the baseline. Notably, Smart-Infinity is a ready-to-use approach fully integrated into PyTorch on a real system, and we will open-source it to facilitate its use. (Comment: Published at HPCA 2024, Best Paper Award Honorable Mention)
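    The sketch below caricatures the SmartUpdate idea under stated assumptions: the optimizer state lives next to the (simulated) storage-resident parameters, so only gradients cross the host link each step and updated parameters never round-trip back. The class name and the SGD-with-momentum update rule are hypothetical stand-ins, not Smart-Infinity's actual interface.

```python
import numpy as np

class NearStorageUpdater:
    """Simulates an accelerator sitting next to storage-resident parameters."""
    def __init__(self, params, lr=0.01, momentum=0.9):
        self.params = params                 # "storage-resident" parameters
        self.velocity = np.zeros_like(params)  # optimizer state stays here too
        self.lr, self.momentum = lr, momentum

    def apply_gradient(self, grad):
        # The update runs near storage: only `grad` crosses the interconnect,
        # removing parameter read/write traffic from the host link.
        self.velocity = self.momentum * self.velocity - self.lr * grad
        self.params += self.velocity

rng = np.random.default_rng(0)
updater = NearStorageUpdater(params=rng.normal(size=1024).astype(np.float32))

for step in range(10):
    # In Smart-Infinity, gradients are additionally compressed on the GPU and
    # decompressed on the accelerator to relieve the shared write channel.
    grad = rng.normal(size=1024).astype(np.float32)
    updater.apply_gradient(grad)

print("param norm after 10 steps:", np.linalg.norm(updater.params))
```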