7 research outputs found

    A Multi-Dimensional Range Partitioning Technique for Parallel Joins in MapReduce

    Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Electrical and Computer Engineering, August 2014. 이상구.
    Joins are fundamental operations for many data analysis tasks, but are not directly supported by the MapReduce framework. This is because 1) the framework is basically designed to process a single input data set, and 2) MapReduce's key-equality based data grouping method makes it difficult to support complex join conditions. As a result, a large number of MapReduce-based join algorithms have been proposed. As in traditional shared-nothing systems, one of the major issues in join algorithms using MapReduce is the handling of data skew. We propose a new skew handling method, called Multi-Dimensional Range Partitioning (MDRP), and show that the proposed method outperforms traditional skew handling methods, namely the range-based and randomized methods. Specifically, the proposed method has the following advantages: 1) Compared with the range-based method, it considers the number of output tuples at each machine, which leads to better handling of join product skew. 2) Compared with the randomized method, it exploits given join conditions before the actual join begins, so that unnecessary input duplication can be reduced. The MDRP method can be used to support advanced join operations such as theta-joins and multi-way joins. With extensive experiments using real and synthetic data sets, we evaluate the effectiveness of the proposed algorithm.
    Contents:
    Abstract.
    I. Introduction: 1.1 Motivation; 1.2 Contribution; 1.3 Outline.
    II. Backgrounds and Related Work: 2.1 MapReduce; 2.2 Join Algorithms in MapReduce (2.2.1 Two-Way Join Algorithms; 2.2.2 Multi-Way Join Algorithms); 2.3 Data Skew in Join Algorithms; 2.4 Skew Handling Approaches in MapReduce (2.4.1 Hash-Based Approach; 2.4.2 Range-Based Approach; 2.4.3 Randomized Approach).
    III. Our Approach: 3.1 Multi-Dimensional Range Partitioning (3.1.1 Creation of a Partitioning Matrix; 3.1.2 Identifying and Chopping of Heavy Cells; 3.1.3 Assigning Cells to Reducers; 3.1.4 Join Processing using the Partitioning Matrix); 3.2 Theoretical Analysis; 3.3 Complex Join Conditions; 3.4 Experiments (3.4.1 Scalar Skew Experiments; 3.4.2 Zipf's Distribution; 3.4.3 Non-Equijoin Experiments; 3.4.4 Scalability Experiments); 3.5 Discussion (3.5.1 Sampling; 3.5.2 Memory-Awareness; 3.5.3 Handling of Heavy Cells; 3.5.4 Existing Histograms); 3.6 Summary.
    IV. Extensions: 4.1 Joining Multiple Relations in a MapReduce Job (4.1.1 Example: SPARQL Basic Graph Pattern; 4.1.2 Example: Matrix Chain Multiplication; 4.1.3 Single-Key Join and Multiple-Key Join Queries); 4.2 Skew Handling for Multi-Way Joins (4.2.1 Skew Handling for SK-Join Queries; 4.2.2 Skew Handling for MK-Join Queries); 4.3 Combinations of SK-Join and MK-Join (4.3.1 Complex Queries; 4.3.2 Iteration-Based Algorithms; 4.3.3 Replication-Based Algorithms; 4.3.4 Iteration-Based vs. Replication-Based); 4.4 Join-Key Selection Algorithms for Complex Queries (4.4.1 Greedy Key Selection; 4.4.2 Multiple Key Selection; 4.4.3 Hybrid Key Selection); 4.5 Experiments (4.5.1 SK-Join Experiments; 4.5.2 MK-Join Experiments; 4.5.3 Analysis of TV Watching Logs); 4.6 Summary.
    V. Applications: 5.1 Algorithms for SPARQL Basic Graph Pattern (5.1.1 MR-Selection; 5.1.2 MR-Join; 5.1.3 Performance Evaluation; 5.1.4 Discussion); 5.2 Algorithms for Matrix Chain Multiplication (5.2.1 Serial Two-Way Join (S2); 5.2.2 Parallel M-Way Join (P2, PM); 5.2.3 Serial Two-Way vs. Parallel M-Way; 5.2.4 Performance Evaluation; 5.2.5 Discussion; 5.2.6 Extension: Embedded MapReduce).
    VI. Conclusion.
    Bibliography. Index. Abstract (in Korean).
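    The join product skew mentioned in the abstract above can be illustrated with a small sketch. This is not the thesis's MDRP algorithm; it only shows why balancing input tuples across reducers does not balance output tuples for an equi-join. The helper `reducer_loads`, the key names, and the toy data are hypothetical and chosen purely for illustration.

```python
from collections import Counter


def reducer_loads(r_keys, s_keys, assignment, num_reducers):
    """Return (input_load, output_load) per reducer for an equi-join of R and S.

    assignment maps each join key to a reducer index (hypothetical layout).
    A key with r tuples in R and s tuples in S contributes r + s input
    tuples and r * s output tuples to its reducer.
    """
    r_count, s_count = Counter(r_keys), Counter(s_keys)
    input_load = [0] * num_reducers
    output_load = [0] * num_reducers
    for key in set(r_count) | set(s_count):
        reducer = assignment[key]
        input_load[reducer] += r_count[key] + s_count[key]
        output_load[reducer] += r_count[key] * s_count[key]
    return input_load, output_load


# Toy data (assumed): key "a" is lopsided between R and S, key "b" is even.
r_keys = ["a"] * 1 + ["b"] * 5
s_keys = ["a"] * 9 + ["b"] * 5
assignment = {"a": 0, "b": 1}

input_load, output_load = reducer_loads(r_keys, s_keys, assignment, 2)
print(input_load)   # [10, 10]
print(output_load)  # [9, 25]
```

    Both reducers receive 10 input tuples, yet the second produces 25 output tuples against 9 for the first. A partitioner that looks only at input sizes would treat this assignment as balanced, which is the gap an output-aware method is meant to close.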

    Politics of Agricultural Upgrading: A comparative Analysis on Malaysian and Indonesian Palm Oil Industry

    No full text
    Thesis (Master's) -- Seoul National University Graduate School: Department of Political Science, August 2012. 백창재.
    The traditional development literature emphasizes structural transformation and prescribes shrinking the agricultural sector. Agriculture, however, is not necessarily inferior to manufacturing in its contribution to national economic performance. If agriculture appropriates more value added through industrial upgrading, it can be a source of growth. Under what conditions can a developing country achieve agricultural upgrading? The purpose of this thesis is to explain the success or failure of agricultural upgrading. To that end, it applies a global commodity/value chain perspective to a comparative analysis of the divergent upgrading experiences of the Malaysian and Indonesian palm oil industries. Oil palm cultivation and crude palm oil exports to the industrial countries began in the 1930s in British Malaya (Malaysia) and the Netherlands East Indies (Indonesia). The two countries' paths diverged after the 1950s. Malaysia has successfully integrated forward into high value-added functions such as the production of industrial intermediates and consumer goods, while Indonesia has remained bound to low value-added functions such as the production of crude palm oil. In sum, Malaysia has accomplished agricultural upgrading, but Indonesia has not.
    Global commodity/value chain theory claims that upgrading opportunities vary chiefly with the organizational traits and governance structure of a commodity chain. Hence, one should first specify these in order to analyse the agricultural upgrading in question. Discussions of the agricultural chain can be summarized as a debate between a single chain theory and a segmented chain theory. The former conceptualizes the agricultural commodity chain as a single chain driven by international traders; upgrading therefore depends on the supplier's capacity to satisfy the demands of the international trader, the driving actor in the chain. The latter conceptualizes the chain as a segmented one in which each segment is driven by a different actor, e.g. the production segment by the state, the trade segment by international traders, the manufacturing segment by international manufacturers, and the retailing segment by international retailers. On this view, upgrading depends on forward integration into higher value-added activities such as processing, trading, and manufacturing. The single chain theory is inappropriate because it is built on inconsistent assumptions and is implicitly modeled on African nations after the breakdown of the urban-bias coalition. The segmented chain theory is much more suitable because of its coherent analytical framework of power relations, its incorporation of the role of the state, and its recognition of the possibility of functional upgrading. Nevertheless, the segmented chain theory is not sufficient to explain the dynamics of agricultural upgrading, since it takes a Weberian approach that uses the consequences of policies as principal variables but overlooks their formation. The theory should therefore be supplemented with a theory of policy formation.
    In this thesis, an alternative analytical framework is elaborated by combining the segmented chain theory with political coalition theory. First, the composition of the political coalition determines the coalition's urban or rural bias and affects the nature of the relationship between the state and local capital. Second, the bias of the coalition translates into agricultural policy. Third, agricultural policy and the nature of the state-capital relationship influence agricultural upgrading.
    According to this framework, Malaysia achieved upgrading in the palm oil industry because it has had a rural-bias coalition and its state has been linked with local capital universally and impersonally. The state has preferred to improve the income of its rural Malay constituency through forward integration of the palm oil industry. It has offered incentives for the localization of palm oil production, processing, and trading, and even the manufacturing of related products. As local capital responded to these incentives, the Malaysian palm oil industry achieved upgrading. Indonesia, by contrast, has not accomplished upgrading in its palm oil industry. Indonesia has had an urban-bias coalition, and the state has been connected to local capital through particularistic patron-client networks. The state simultaneously pursued contradictory preferences: diverting palm oil from export to domestic urban consumers, promoting palm oil exports to earn foreign exchange, and protecting domestic refineries connected to powerful political figures. As the state strove to coordinate these preferences and thereby ensure the survival of the authoritarian regime, it was bound to increasing palm oil production quantitatively rather than upgrading qualitatively. In sum, the most important antecedent variable for explaining agricultural upgrading is the nature of the political coalition: rural-bias or urban-bias.
    The theoretical implication of the thesis is that it supplements the theory of agricultural upgrading by combining segmented agricultural chain theory with the political coalition approach. The empirical implication is a refinement of the conditions under which agricultural upgrading can be achieved. In relation to the latter, this thesis carries different implications from the traditional development literature, which suggests radical industrialization by squeezing the agricultural sector, and from the neoliberal literature, which prescribes laissez-faire, no more than a 'do-nothing' policy. Agricultural upgrading through state action can contribute to the development of tropical developing countries, but only with a rural-bias coalition.
    Contents:
    Chapter 1. Introduction: Section 1. Problem Statement; Section 2. Research Subject.
    Chapter 2. Theoretical Review: Section 1. Existing Research (1. Single chain theory: a convention-theory approach; 2. Segmented chain theory: a political-economy approach); Section 2. Critical Review (1. Single chain theory; 2. Segmented chain theory); Section 3. Alternative Analytical Framework (1. Analytical framework; 2. Preliminary analysis).
    Chapter 3. The Palm Oil Chain: Section 1. History of the Palm Oil Chain; Section 2. Structural Characteristics of the Palm Oil Chain (1. Production stage I: cultivation; 2. Production stage II: initial processing; 3. Trade stage; 4. Processing and manufacturing stage).
    Chapter 4. Comparative Case Study: Section 1. Malaysia: Functional Upgrading (1. Before the 1970s: output expansion; 2. After the 1970s: functional upgrading); Section 2. Indonesia: Extensive Expansion Locked into Output Growth (1. Before the 1970s: stagnation; 2. After the 1970s: extensive expansion locked into output growth); Section 3. Summary and Comparison.
    Chapter 5. Discussion and Conclusion.

    Varieties of Production Globalization 2: A Comparative Analysis of Automobile Industries in the US, Germany, and Japan

    No full text
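    This study empirically analyzes the process and patterns of production globalization, building on the theoretical discussion of value chain theory. It tests whether production globalization and the reorganization of production systems are converging on modular production as the most efficient form, as value chain theory predicts, and assesses the existence and influence of non-functional variables at work in the reorganization of production systems. To this end, the study selects the automobile industries of the United States, Germany, and Japan as cases and analyzes and compares how their production systems were reorganized. The comparative empirical analysis finds that the specific patterns and outcomes of production system reorganization differ by country. The claim of value chain theorists that, as technology advances, production will converge on the best practice of modular production based on codified knowledge does not match reality; at least in the automobile industry, it has not yet been realized. Under the pressures of globalization and competition, the automobile industries of the United States, Germany, and Japan are each pursuing different strategies of production globalization. This study argues that actors' choices under institutional constraints that differ by country led to different forms of production system reorganization. Path-dependent institutional factors such as inter-firm relations constrain the choices of lead firms; those choices shape different paths of production system reorganization, which in turn result in varieties of globalization. In short, globalization is not a transition to the most efficient system, and its dynamics cannot be explained by a reductionist, functionalist approach centered on technological factors.
    Since the 1990s, many industries in the advanced industrial countries have reorganized their production regimes by creating new production networks on a global scale in order to meet the challenges of globalization. The degree and mode of this so-called globalization of production differ among industries and among countries. Yet global value chain theorists predict that the readjustment of national production regimes will converge toward a single best practice, i.e., modularity. This study examines the validity of this proposition by analyzing and comparing the automobile industries of the U.S., Germany, and Japan, and argues that the path dependence of existing economic institutions produces varieties of production globalization.
    This research was supported by a National Research Foundation of Korea grant funded by the Korean government (Ministry of Education, Science and Technology) in 2008 (KRF-2008-B00005).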

    Varieties of Production Globalization II: A Comparative Analysis of the Automobile Industries of the US, Germany, and Japan

    No full text
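    This study empirically analyzes the process and patterns of production globalization, building on the theoretical discussion of value chain theory. It tests whether production globalization and the reorganization of production systems are converging on modular production as the most efficient form, as value chain theory predicts, and assesses the existence and influence of non-functional variables at work in the reorganization of production systems. To this end, the study selects the automobile industries of the United States, Germany, and Japan as cases and analyzes and compares how their production systems were reorganized. The comparative empirical analysis finds that the specific patterns and outcomes of production system reorganization differ by country. The claim of value chain theorists that, as technology advances, production will converge on the best practice of modular production based on codified knowledge does not match reality; at least in the automobile industry, it has not yet been realized. Under the pressures of globalization and competition, the automobile industries of the United States, Germany, and Japan are each pursuing different strategies of production globalization. This study argues that actors' choices under institutional constraints that differ by country led to different forms of production system reorganization. Path-dependent institutional factors such as inter-firm relations constrain the choices of lead firms; those choices shape different paths of production system reorganization, which in turn result in varieties of globalization. In short, globalization is not a transition to the most efficient system, and its dynamics cannot be explained by a reductionist, functionalist approach centered on technological factors.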