8 research outputs found

    Determining the Characteristic Vocabulary for a Specialized Dictionary using Word2vec and a Directed Crawler

    Specialized dictionaries are used to understand concepts in specific domains, especially where those concepts are not part of the general vocabulary or have meanings that differ from ordinary language. The first step in creating a specialized dictionary is detecting the characteristic vocabulary of the domain in question. Classical methods for detecting this vocabulary involve gathering a domain corpus, calculating statistics on the terms found there, and then comparing these statistics to those of a background or general-language corpus. Terms that are found significantly more often in the specialized corpus than in the background corpus are candidates for the characteristic vocabulary of the domain. Here we present two tools, a directed crawler and a distributional semantics package, that can be used together, circumventing the need for a background corpus. Both tools are available on the web.
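
The classical corpus-comparison method described in the abstract above can be illustrated with a minimal sketch. This is not the authors' directed crawler or distributional semantics tooling; it only shows the background-corpus comparison those tools are meant to avoid. The function names, the add-one smoothing, and the use of a plain frequency ratio as the score are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def characteristic_terms(domain_text, background_text, top_n=5):
    """Rank domain terms by smoothed relative-frequency ratio.

    Hypothetical helper: the score is P(term | domain) / P(term | background)
    with add-one smoothing, so terms absent from the background corpus
    do not cause a division by zero.
    """
    domain = Counter(tokenize(domain_text))
    background = Counter(tokenize(background_text))
    domain_total = sum(domain.values())
    background_total = sum(background.values())
    vocab_size = len(set(domain) | set(background))

    scores = {}
    for term, count in domain.items():
        p_domain = (count + 1) / (domain_total + vocab_size)
        p_background = (background[term] + 1) / (background_total + vocab_size)
        scores[term] = p_domain / p_background

    # Highest ratio first: terms over-represented in the domain corpus.
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


if __name__ == "__main__":
    domain_text = "the enzyme binds the substrate and the enzyme cleaves the substrate"
    background_text = "the cat sat on the mat and the dog sat on the rug"
    print(characteristic_terms(domain_text, background_text, top_n=2))
```

In practice a significance statistic such as a log-likelihood ratio would usually replace the plain ratio, and both corpora would be far larger than these toy strings.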

    Open Directory Project based universal taxonomy for Personalization of Online (Re)sources

    Content personalization reflects the ability to classify content into (predefined) thematic units or information domains. Content nodes in a single thematic unit are related to a greater or lesser extent. An existing connection between two available content nodes assumes that the user will be interested in both resources (but not necessarily to the same extent). Such a connection (and its value) can be established through automatic content classification and labeling. One approach to classifying content nodes is to use a predefined classification taxonomy. With such a taxonomy it is possible to automatically classify and label existing content nodes, as well as to create additional descriptors for future use in content personalization and recommendation systems. For these purposes, existing web directories can be used to create a universal, purely content-based classification taxonomy. This work analyzes the Open Directory Project (ODP) web directory and proposes a novel use of its structure and content as the basis for such a classification taxonomy. The goal of a unified classification taxonomy is to allow content personalization from heterogeneous sources. In this work we focus on the overall quality of ODP as the basis for such a taxonomy and on the use of its hierarchical structure for automatic labeling. Because of the structure of the data in ODP, different grouping schemes are devised and tested to find the optimal combination of content and structure for the proposed classification taxonomy and for the automatic labeling processes. The results provide an in-depth analysis of ODP and of ODP-based content classification and automatic labeling models. Although the use of ODP is well documented, this question has not been answered to date.

    the case of Mexico

    Thesis (Master) -- KDI School: Master of Public Policy, 2019

    Partisan influence on public policy has been studied cross-nationally at the macro level by examining variables such as GDP and level of democracy. Political scientists such as Tufte (1992) have argued that parties do matter in OECD countries, as leftist parties are inclined to spend more on social policy and equality, while Huber et al. (2008) countered that partisanship does not matter in social policy spending in Latin American countries. For Latin American OECD countries, the parties-do-matter and parties-do-not-matter hypotheses have been continuously debated, as both seemed to apply. Because Mexico underwent a major partisan shift from a non-left party to a left party in the 100 years after the Mexican Revolution, this paper takes a micro-level, country-based approach to partisan influence on public policy, using Mexico as the case. Given this sudden shift, the paper asks whether partisan influence on shaping and implementing public policy can be identified by examining both incumbents’ speeches. Unlike traditional methods of identifying partisan influence, the research introduces a new methodological approach that uses stenographic records of presidential speeches and discourses as a data set. For policy categorization, policies were aligned with the social, economic, and environmental pillars of the Sustainable Development Goals (SDGs). By sorting the left and non-left presidential speeches and discourses into the 17 SDG categories through computer-based unsupervised-learning text analysis, the study finds that the left incumbent in Mexico tends to put more emphasis on social policies than the non-left party at the rhetorical level. This is the first such analysis to use a new methodological tool for scrutinizing a political party’s public policy direction aligned with the SDGs, based on presidential speeches and discourses as a data set.

    I. Introduction II. Sustainable Development Goals III. Theoretical Framework IV. Hypothesis Development V. Methodology VI. Empirical Analysis VII. Conclusion

    Yoomin LEE