116 research outputs found

    Computational Generalization in Taxonomies Applied to: (1) Analyze Tendencies of Research and (2) Extend User Audiences

    D.F. and B.M. acknowledge continuing support by the Academic Fund Program at the NRU HSE (grant 19-04-019 in 2018–2019) and by the DECAN Lab NRU HSE, in the framework of a subsidy granted to the HSE by the Government of the Russian Federation for the implementation of the Russian Academic Excellence Project “5-100”. S.N. acknowledges the support by FCT/MCTES, NOVA LINCS (UID/CEC/04516/2019). We define a most specific generalization of a fuzzy set of topics assigned to leaves of the rooted tree of a domain taxonomy. This generalization lifts the set to its “head subject” node in the higher ranks of the taxonomy tree. The head subject is supposed to “tightly” cover the query set, possibly bringing in some errors referred to as “gaps” and “offshoots”. Our method, ParGenFS, globally minimizes a penalty function combining the numbers of head subjects, gaps, and offshoots, differently weighted. Two applications are considered: (1) analysis of tendencies of research in Data Science; (2) audience extending for programmatic targeted advertising online. The former involves a taxonomy of Data Science derived from the celebrated ACM Computing Classification System 2012. Based on a collection of research papers published by Springer 1998–2017, and applying in-house methods for text analysis and fuzzy clustering, we derive fuzzy clusters of leaf topics in learning, retrieval and clustering. The head subjects of these clusters inform us of some general tendencies of the research. The latter involves the publicly available IAB Tech Lab Content Taxonomy. Each of about 25 million users is assigned a fuzzy profile within this taxonomy, which is generalized offline using ParGenFS. Our experiments show that these head subjects at least double the size of targeted audiences without losing quality.

    Using Taxonomy Tree to Generalize a Fuzzy Thematic Cluster

    D.F. and B.M. acknowledge continuing support by the Academic Fund Program at the National Research University Higher School of Economics (grant 19-04-019 in 2018-2019) and by the International Decision Choice and Analysis Laboratory (DECAN) NRU HSE, in the framework of a subsidy granted to the HSE by the Government of the Russian Federation for the implementation of the Russian Academic Excellence Project “5-100”. S.N. acknowledges the support by FCT/MCTES, NOVA LINCS (UID/CEC/04516/2019). This paper presents an algorithm, ParGenFS, for generalizing, or 'lifting', a fuzzy set of topics to higher ranks of a hierarchical taxonomy of a research domain. The algorithm ParGenFS finds a globally optimal generalization of the topic set to minimize a penalty function, by balancing the number of introduced 'head subjects' and related errors, the 'gaps' and 'offshoots', differently weighted. This leads to a generalization of the topic set in the taxonomy. The usefulness of the method is illustrated on a set of 17685 abstracts of research papers on Data Science published in Springer journals for the past 20 years. We extracted a taxonomy of Data Science from the international Association for Computing Machinery Computing Classification System 2012 (ACM-CCS). We find fuzzy clusters of leaf topics over the text collection, lift them in the taxonomy, and interpret found head subjects to comment on the tendencies of current research.
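
The lifting idea described in the two abstracts above can be illustrated with a small bottom-up recursion. This is a simplified sketch, not the authors' ParGenFS: it assumes crisp (0/1) memberships instead of fuzzy ones, counts each head subject as 1, each gap as `lam`, and each offshoot as `gamma`, and treats a gap as a maximal query-free node under a head subject. The names `Node`, `lift`, `lam`, and `gamma` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)


def lift(node, query, lam, gamma):
    """Return (penalty, heads, offshoots, gaps, has_query) for node's subtree.

    penalty, heads and offshoots describe the cheapest lifting of the subtree
    when no ancestor of `node` is a head subject. `gaps` lists the maximal
    query-free nodes that WOULD become gaps if `node` were made a head subject.
    """
    if not node.children:  # leaf
        if node.name in query:
            # An uncovered query leaf is kept as an offshoot.
            return gamma, [], [node.name], [], True
        return 0.0, [], [], [node.name], False

    results = [lift(child, query, lam, gamma) for child in node.children]
    if not any(r[4] for r in results):
        # No query leaves below: the whole subtree is one potential gap.
        return 0.0, [], [], [node.name], False

    gaps = [g for r in results for g in r[3]]
    cost_head = 1.0 + lam * len(gaps)          # make `node` a head subject
    cost_no_head = sum(r[0] for r in results)  # leave decisions to children
    if cost_head < cost_no_head:
        return cost_head, [node.name], [], gaps, True
    heads = [h for r in results for h in r[1]]
    offshoots = [o for r in results for o in r[2]]
    return cost_no_head, heads, offshoots, gaps, True


# Toy taxonomy (assumed for illustration only).
tree = Node("root", [
    Node("A", [Node("a1"), Node("a2"), Node("a3")]),
    Node("B", [Node("b1"), Node("b2"), Node("b3")]),
])
penalty, heads, offshoots, _, _ = lift(tree, {"a1", "a2", "a3", "b1"}, lam=0.5, gamma=0.9)
print(penalty, heads, offshoots)
```

In this toy run, node `A` is fully covered by the query set and becomes a head subject, while `b1` is cheaper to keep as an offshoot than to lift to `B` at the cost of two gaps. Changing the weights changes the trade-off, which is the role of the "differently weighted" terms in the penalty.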

    Reflexive Space. A Constructionist Model of the Russian Reflexive Marker

    This study examines the structure of the Russian Reflexive Marker (-ся/-сь) and offers a usage-based model building on Construction Grammar and a probabilistic view of linguistic structure. Traditionally, reflexive verbs are accounted for relative to non-reflexive verbs. These accounts assume that linguistic structures emerge as pairs. Furthermore, these accounts assume directionality where the semantics and structure of a reflexive verb can be derived from the non-reflexive verb. However, this directionality does not necessarily hold diachronically. Additionally, the semantics and the patterns associated with a particular reflexive verb are not always shared with the non-reflexive verb. Thus, a model is proposed that can accommodate the traditional pairs as well as the possible deviations without postulating different systems. A random sample of 2000 instances marked with the Reflexive Marker was extracted from the Russian National Corpus and the sample used in this study contains 819 unique reflexive verbs. This study moves away from the traditional pair account and introduces the concept of Neighbor Verb. A neighbor verb exists for a reflexive verb if they share the same phonological form excluding the Reflexive Marker. It is claimed here that the Reflexive Marker constitutes a system in Russian and the relation between the reflexive and neighbor verbs constitutes a cross-paradigmatic relation. Furthermore, the relation between the reflexive and the neighbor verb is argued to be of symbolic connectivity rather than directionality. Effectively, the relation holding between particular instantiations can vary. The theoretical basis of the present study builds on this assumption. Several new variables are examined in order to systematically model variability of this symbolic connectivity, specifically the degree and strength of connectivity between items. In usage-based models, the lexicon does not constitute an unstructured list of items. 
Instead, items are assumed to be interconnected in a network. This interconnectedness is defined as Neighborhood in this study. Additionally, each verb carves its own niche within the Neighborhood and this interconnectedness is modeled through rhyme verbs constituting the degree of connectivity of a particular verb in the lexicon. The second component of the degree of connectivity concerns the status of a particular verb relative to its rhyme verbs. The connectivity within the neighborhood of a particular verb varies and this variability is quantified by using the Levenshtein distance. The second property of the lexical network is the strength of connectivity between items. Frequency of use has been one of the primary variables in functional linguistics used to probe this. In addition, a new variable called Constructional Entropy is introduced in this study building on information theory. It is a quantification of the amount of information carried by a particular reflexive verb in one or more argument constructions. The results of the lexical connectivity indicate that the reflexive verbs have statistically greater neighborhood distances than the neighbor verbs. This distributional property can be used to motivate the traditional observation that the reflexive verbs tend to have idiosyncratic properties. A set of argument constructions, generalizations over usage patterns, are proposed for the reflexive verbs in this study. In addition to the variables associated with the lexical connectivity, a number of variables proposed in the literature are explored and used as predictors in the model. The second part of this study introduces the use of a machine learning algorithm called Random Forests. The performance of the model indicates that it is capable, up to a degree, of disambiguating the proposed argument construction types of the Russian Reflexive Marker. Additionally, a global ranking of the predictors used in the model is offered. 
Finally, most construction grammars assume that argument constructions form a network structure. A new method is proposed that establishes a generalization over the argument constructions, referred to as Linking Construction. In sum, this study explores the structural properties of the Russian Reflexive Marker and a new model is set forth that can accommodate both the traditional pairs and potential deviations from them in a principled manner.
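
Two of the quantities named in this abstract can be sketched briefly. The Levenshtein distance below is the standard edit distance. The entropy function is an assumption: the abstract does not give the exact definition of Constructional Entropy, so plain Shannon entropy over a verb's frequency counts across argument constructions is used as a stand-in. The function names and the sample counts are hypothetical.

```python
import math


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def shannon_entropy(counts) -> float:
    """Entropy in bits of the distribution given by non-negative counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


print(levenshtein("kitten", "sitting"))  # 3
# A verb seen equally often in two constructions carries 1 bit.
print(shannon_entropy([5, 5]))  # 1.0
```

Under this reading, a verb attested in a single construction has zero entropy, and entropy grows as its uses spread more evenly over several constructions.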

    From Mind to Text

    From Mind to Text: Continuities and Breaks Between Cognitive, Aesthetic and Textualist Approaches to Literature explores the historical context of theory formation and of its contemporary status, including an overview of debates about theory’s role in literary studies provided both by representatives of theory itself and by those who distance themselves from it.

    Design and Instantiation of an Interactive Multidimensional Ontology for Game Design Elements – a Design and Behavioral Approach

    While games and play are commonly perceived as leisure tools, focus on the strategic implementation of isolated gameful elements outside of games has risen in recent years under the term gamification. Given their ease of implementation and impact in competitive games, a small set of game design elements, namely points, badges, and leaderboards, initially dominated research and practice. However, these elements reflect only a small group of components that game designers use to achieve positive outcomes in their systems. Current research has shifted towards focusing on the game design process instead of the isolated implementation of single elements under the term gameful design. But the problem of a tendency toward a monocultural selection of prominent design elements persists in game and gameful design, preventing the method from reaching its full potential. This dissertation addresses this problem by designing and developing a digital, interactive game design element ontology that scholars and practitioners can use to make more informed and inspired decisions in creating gameful solutions to their problems. The first part of this work is concerned with the collation and development of the digital ontology. First, two datasets were collated from game design and gamification literature (game design elements and playing motivations). Next, four explorative studies were conducted to add user-relevant metadata and connect their items into an ontological structure. The first two studies use card sorting to assess game theory frameworks regarding their suitability as foundational categories for the game design element dataset and to gain an overview of different viewpoints from which categorizations can be derived. The second set of studies builds on an explorative method of matching dataset entries via their descriptive keywords to arrive at a connected graph. 
The first of these studies connects items of the playing motivations dataset with themselves, while the second connects them with an additional dataset of human needs. The first part closes with the documentation of the design and development of the tool Kubun, reporting on the outcome of its evaluation via iterative expert interviews and a field study. The results suggest that the tool serves its preset goals of affording intuitive browsing for dedicated searches and serendipitous findings. While the first part of this work reports on the top-down development process of the ontology and related navigation tool, the second part presents in-depth research on specific learning-oriented game design elements to complement the overall research goal through a complementary bottom-up approach. Therein, two studies on learning-oriented game design elements are reported regarding their effect on performance, long-term learning outcome, and knowledge transfer. The studies are conducted with a game dedicated to teaching correct waste sorting. The first study focuses on a reward-based game design element in terms of its motivational effect on perfect play. The second study evaluates two learning-enhancing game design elements, repeat and look-up, in terms of their contribution to a long-term learning outcome. The comprehensive insights gained through the in-depth research manifest in the design of a module dedicated to reporting research outcomes in the ontology. The dissertation concludes with a discussion on the studies’ varying limitations and an outlook on pathways for future research.

    The changing medium of instruction policies of state-schools in recently formed states: a comparative analysis

    In this thesis I analyse changes to the medium of instruction (MOI) policies of primary and secondary schools of new states which gained independence after the end of the Second World War up to 2015. In it I view MOI policies as drivers of linguistic state building, with decisions to use additional languages for teaching and learning being evaluated in terms of the threat that they may pose to the status of official languages and established patterns of social opportunity and status associated with knowledge of them. I develop and use an expanded version of Bourdieu’s theory of the national linguistic market as a conceptual framework to capture the interaction of factors both inside and outside of the state which may influence MOI policy decisions. The existing comparative literature consists mainly of descriptive studies of individual states or geographical regions. My study is distinctive because it focuses on new states and uses a large, longitudinal, sample to provide a global perspective on the choices that new states have made about MOI policies in primary and secondary schools and how these policies have changed. My methodological approach is distinct, using qualitative comparative analysis (QCA), an approach which is currently underutilized in comparative education research, particularly in studies with a temporal component. I develop a novel MOI typology, identifying four distinctive models: Purist (only the state language(s) are used); Pragmatic (community languages are used in primary schooling); Accommodating (high status community languages are used in secondary school); and Opportunistic (new, high status, languages are introduced as MOI). 
I argue that, whilst Bourdieu’s concept of linguistic markets provides a powerful basis for understanding MOI policy decisions, the interaction of national (internal) linguistic markets with the international (external) linguistic market needs to be considered to fully understand patterns of MOI policy change over time.