
    Doctor of Philosophy

    Dissertation. The explosion of structured Web data (e.g., online databases, Wikipedia infoboxes) creates many opportunities for integrating and querying these data that go far beyond the simple search capabilities provided by search engines. Although much work has been devoted to data integration in the database community, the Web brings new challenges: the Web scale (e.g., the large and growing volume of data) and the heterogeneity of Web data. Because the volume of data is so large, scalable techniques that require little or no manual intervention and that are robust to noisy data are needed. In this dissertation, we propose a new and effective approach for matching Web-form interfaces and for matching multilingual Wikipedia infoboxes. As a further step toward solving these problems, we propose a general prudent schema-matching framework that matches a large number of schemas effectively. Our comprehensive experiments for Web-form interfaces and Wikipedia infoboxes show that the framework can enable on-the-fly, automatic integration of large collections of structured Web data. Another problem we address in this dissertation is schema discovery. While existing integration approaches assume that the relevant data sources and their schemas have been identified in advance, schemas are not always available for structured Web data. Approaches exist that exploit information in Wikipedia to discover the entity types and their associated schemas. However, due to inconsistencies, sparseness, and noise from the community contribution, these approaches are error prone and require substantial human intervention. Given the schema heterogeneity in Wikipedia infoboxes, we developed a new approach that uses the structured information available in infoboxes to cluster similar infoboxes and infer the schemata for entity types. Our approach is unsupervised and resilient to the unpredictable skew in the entity class distribution.
Our experiments, using over one hundred thousand infoboxes extracted from Wikipedia, indicate that our approach is effective and produces accurate schemata for Wikipedia entities.

    Austronesian and other languages of the Pacific and South-east Asia : an annotated catalogue of theses and dissertations


    Implementing a pedagogical improvement proposal on listening skill through the support of audiovisual material and cooperative learning: watching the film Shrek as a helping audiovisual tool to teach adjectives and descriptions on 2nd year students of Compulsory Secondary Education

    Final Project for the University Master's Degree in Teaching for Compulsory Secondary Education and Baccalaureate, Vocational Training and Language Teaching. Code SAP419. Academic year: 2018/2019. With increasing globalisation and the development of the Internet, English has become an indispensable language for communication and for access to jobs. Spain has adapted to this reality by incorporating structural and legislative changes in its education system. Specifically, Organic Law 8/2013 for the Improvement of Educational Quality (LOMCE) proposes the creation of multilingual centres with the aim of fostering students' future employability and their inclusion in a globalised society. These changes are reflected in the classroom in the use of new technologies as well as in the emergence of new educational methodologies and audiovisual resources. This Master's final dissertation evaluates the results of a research project implemented with two groups of 2nd-year Compulsory Secondary Education (ESO) students, with the goal of improving English listening skills through the use of audiovisual material and cooperative work. The sample was composed of 48 students from a secondary school in Castellón de la Plana. The analysis concludes that the use of audiovisual media enhances English listening skills. Likewise, the study stresses that students like to work in the classroom with innovative and collaborative methodologies.

    Child Labor

    In recent years, there has been an astonishing proliferation of empirical work on child labor. An Econlit search of keywords "child lab*r" reveals a total of 6 peer reviewed journal articles between 1980 and 1990, 65 between 1990 and 2000, and 143 in the first five years of the present decade. The purpose of this essay is to provide a detailed overview of the state of the recent empirical literature on why and how children work as well as the consequences of that work. Section 1 defines terms commonly used in the study of child time allocation and provides a descriptive overview of how children spend their time in low income countries today. Section 2 reviews the case for attention to the most common types of work in which children participate, focusing on that work's impact on schooling and health, as well as the externalities associated with it. Section 3 considers the literature on the determinants of child time allocation such as the influence of local labor markets, family interactions, the net return to schooling, and poverty. Section 5 discusses the limited evidence on different policy options aimed at influencing child labor. Section 6 concludes by emphasizing important questions requiring additional research, such as child and parental agency, the effectiveness of child labor policies, and the determinants of participation in the "worst forms" of child labor.

    How State Capacity Matters: A Study of the Cooptation and Coercion of Religious Organizations in Southeast Asia and Beyond

    This dissertation examines the complex relationship between state capacity, authoritarian regimes and religious organizations in Southeast Asia and beyond. Through an interdisciplinary synthesis of secondary literatures in Comparative Politics, Sociology, and Religious Studies, complemented by archival research conducted at Stanford University’s Hoover Institution, this dissertation argues that relative state capacity endowment shapes the strategies that authoritarian regime elites employ against domestic religious organizations as a means of ensuring regime survival. Through typological theory-building and a comparative case-study methodology, I argue that state capacity, imagined in terms of both bureaucratic/administrative and coercive components, influences whether authoritarian regime elites decide to pursue policies of cooptation (bribery, patronage, and political appointments) or coercion (incarceration, threats, violence) vis-à-vis religious organizations. Comparative case-study analysis of the relationship between authoritarian regimes and religious organizations in Burma, Thailand, Malaysia, Cambodia, Laos, Vietnam, the Democratic Republic of the Congo, Poland, and Nicaragua reveals clear variations in regime elite strategies across time and space. My findings demonstrate that authoritarian regime elites in states with strong bureaucratic/administrative capacity and strong coercive capacity have relied on cooptation as their preferred strategy for containing threats posed by religious organizations, while regime elites in states with weak bureaucratic/administrative capacity and strong coercive capacity have instead tended to employ violence against these groups. Finally, regime elites in states with weak bureaucratic/administrative capacity and weak coercive capacity have cycled, unsuccessfully, between policies of cooptation and coercion in the hopes of containing powerful domestic religious organizations.
The comparative analysis in this dissertation provides a nuanced explanation for how authoritarian regime elites leverage state resources to counter threats posed by symbolically powerful religious groups and contributes a new mid-range theory of state-society relations with implications for authoritarian regimes far beyond the region.

    Ecological modernisation of industrial estates in Viet Nam

    This research provides insights into environmental policy-making and management in Viet Nam, with special reference to industrial estates. It analyses the reasons behind the weaknesses of environmental policy-making and management in dealing with contemporary industrialisation and recommends a model for the greening of industrial estates. This contributes to the efforts to reconcile economic and environmental goals in the pursuit of a national development strategy in a transitional economy. The research contributes to the theory of Ecological Modernisation, which has been subject to many critiques, notably on its Eurocentric basis. The value and applicability of Ecological Modernisation Theory for developing or industrialising economies is therefore often questioned. This research shows the relevance of Ecological Modernisation Theory for the developing and industrialising economy of Viet Nam.

    Dynamic conceptions of input, output and interaction: Vietnamese EFL lecturers learning second language acquisition theory

    Although research into language teacher learning and cognition and teaching innovations oriented to communicative tasks has been abundant, little has addressed EFL teachers’ learning and conceiving of SLA principles underlying task-based language teaching. The study reported in the present thesis aims to fill this gap, specifically investigating teachers’ learning and conceiving of the notions of rich comprehensible language input, and authentic output and interaction, referred to as ‘SLA facilitating conditions’. The study explores three issues: teachers’ conceptions of the SLA facilitating conditions based on their practices in the tertiary English classroom; teachers’ perceptions of implementing the conditions, including factors affecting the implementation; and teachers’ perceived learning or change as a result of the process. Data for the study were obtained from six Vietnamese EFL lecturers who voluntarily participated in two short professional development workshops focusing on language input, and output and interaction. The data collection process was cumulative, beginning with pre-workshop interviews, followed by collection of lesson plans, lesson-based interviews, reflective writing, observation of lesson recordings, and a questionnaire. Analysis and interpretation followed a process of triangulation, and drew on the author’s knowledge of the context and the teachers’ backgrounds. The results showed that the six teachers held contextualised conceptions of language input, and output and interaction. Although they believed that these conditions are important for language learning, their conceptions based on their implementation of the conditions reflected a synthetic product-oriented view of language learning and teaching. 
The teachers demonstrated an accommodation of the notion of comprehensible input into their existing pedagogical understanding, and revealed a conception of language output oriented to accuracy and fluency of specific target language items. Tasks and activities for interaction were mainly to provide students with contexts to use the target language items meaningfully rather than to communicate meaning. Most teachers delayed communicative tasks until their students were acquainted with the language content of the day. Such conceptions and practices had a connection with both conceptual/experiential and contextual factors, namely their prior training and experience, time limitations, syllabus, and students’ characteristics. The study also showed that although the teachers’ perceptions of the feasibility of promoting rich language input and authentic output and interaction were neutral, they thought promoting these conditions was relevant to students’ learning and congruent with their pre-existing beliefs about teaching English, and this granted them a sense of agency. The teachers also reported that they became more aware of input, output, and interaction in teaching, and more confident and purposeful in their actions, and some reported a widened view of English language teaching. The study confirms that teacher learning and cognition is conceptually and contextually conditioned (Borg, 2006). In this respect, it provides a model of how EFL teachers’ learning of SLA is constrained by prior pedagogical beliefs and contextual conditions. In conjunction with previous research, the study provided evidence to suggest that communicative and task-based language teaching would appear to run counter to existing beliefs about teaching and practical conditions in Asian EFL situations. This lends support to a more flexible organic approach to employing tasks, perhaps considering the extent to which and in what ways communicative tasks are pedagogically useful to the EFL classroom.
An implication is that for any new approaches like task-based language teaching to be incorporated into teachers’ existing repertoire, teachers’ conceptions of language input and interaction, and the conceptual and practical constraints influencing their thinking and practice should be considered and addressed. In a broader sense, approaches to teacher education and development should take a constructivist perspective on teacher learning, taking into account the local context of teaching and teachers’ existing cognition.

    Random Sequence Perception Amongst Finance and Accounting Personnel: Can We Measure Illusion Of Control, A Type I Error, or Illusion Of Chaos, A Type II Error?

    The purpose of this dissertation was to determine if finance and accounting personnel could distinguish between random and non-random time-series strings and to determine what types of errors they would make. These individuals, averaging 13 years of experience, were unable to distinguish non-random patterns from random strings in an assessment composed of statistical process control (SPC) charts. Respondents scored no better than guessing, which was also assessed with a series of true-false questions. Neither over-alternation (oscillation) nor under-alternation (trend) strategies were able to predict type I or type II error rates, i.e. illusion of control or illusion of chaos. Latent class analysis methods within partial least squares structural equation modeling (PLS-SEM) were successful in uncovering segments or groups of respondents with large explained variance and significant path models. Relationships between desirability of control, personal fear of invalidity, and error rates were more varied than expected. Yet, some segments tended toward illusion of control while others tended toward illusion of chaos. Similar effects were also observed when substituting a true-false guessing assessment for the SPC assessment, with some loss of explained variance and weaker path coefficients. Respondents also provided their perceptions and thoughts of randomness for both SPC and true-false assessments.

    Wiktionary: The Metalexicographic and the Natural Language Processing Perspective

    Dictionaries are the main reference works for our understanding of language. They are used by humans and likewise by computational methods. So far, the compilation of dictionaries has almost exclusively been the profession of expert lexicographers. The ease of collaboration on the Web and the rising initiatives of collecting open-licensed knowledge, such as in Wikipedia, have given rise to a new type of dictionary that is voluntarily created by large communities of Web users. This collaborative construction approach presents a new paradigm for lexicography that poses new research questions to dictionary research on the one hand and provides a very valuable knowledge source for natural language processing applications on the other hand. The subject of our research is Wiktionary, which is currently the largest collaboratively constructed dictionary project. In the first part of this thesis, we study Wiktionary from the metalexicographic perspective. Metalexicography is the scientific study of lexicography including the analysis and criticism of dictionaries and lexicographic processes. To this end, we discuss three contributions related to this area of research: (i) We first provide a detailed analysis of Wiktionary and its various language editions and dictionary structures. (ii) We then analyze the collaborative construction process of Wiktionary. Our results show that the traditional phases of the lexicographic process do not apply well to Wiktionary, which is why we propose a novel process description that is based on the frequent and continual revision and discussion of the dictionary articles and the lexicographic instructions. (iii) We perform a large-scale quantitative comparison of Wiktionary and a number of other dictionaries regarding the covered languages, lexical entries, word senses, pragmatic labels, lexical relations, and translations.
We conclude the metalexicographic perspective by finding that the collaborative Wiktionary is not an appropriate replacement for expert-built dictionaries due to its inconsistencies, quality flaws, one-size-fits-all approach, and strong dependence on expert-built dictionaries. However, Wiktionary's rapid and continual growth, its high coverage of languages, newly coined words, domain-specific vocabulary and non-standard language varieties, as well as the kind of evidence based on the authors' intuition provide promising opportunities for both lexicography and natural language processing. In particular, we find that Wiktionary and expert-built wordnets and thesauri contain largely complementary entries. In the second part of the thesis, we study Wiktionary from the natural language processing perspective with the aim of making its linguistic knowledge available for computational applications. Such applications require vast amounts of high-quality structured data. Expert-built resources have been found to suffer from insufficient coverage and high construction and maintenance cost, whereas fully automatic extraction from corpora or the Web often yields resources of limited quality. Collaboratively built encyclopedias present a viable solution, but do not adequately cover the linguistically oriented knowledge found in dictionaries. That is why we propose extracting linguistic knowledge from Wiktionary, which we achieve by the following three main contributions: (i) We propose the novel multilingual ontology OntoWiktionary that is created by extracting and harmonizing the weakly structured dictionary articles in Wiktionary. A particular challenge in this process is the ambiguity of semantic relations and translations, which we resolve by automatic word sense disambiguation methods. (ii) We automatically align Wiktionary with WordNet 3.0 at the word sense level.
The largely complementary information from the two dictionaries yields an aligned resource with higher coverage and an enriched representation of word senses. (iii) We represent Wiktionary according to the ISO standard Lexical Markup Framework, which we adapt to the peculiarities of collaborative dictionaries. This standardized representation is of great importance for fostering the interoperability of resources and hence the dissemination of Wiktionary-based research. To this end, our work presents a foundational step towards the large-scale integrated resource UBY, which facilitates unified access to a number of standardized dictionaries by means of a shared web interface for human users and an application programming interface for natural language processing applications. A user can, in particular, switch between and combine information from Wiktionary and other dictionaries without completely changing the software. Our final resource and the accompanying datasets and software are publicly available and can be employed for multiple different natural language processing applications. It particularly fills the gap between the small expert-built wordnets and the large amount of encyclopedic knowledge from Wikipedia. We provide a survey of previous works utilizing Wiktionary, and we exemplify the usefulness of our work in two case studies on measuring verb similarity and detecting cross-lingual marketing blunders, which make use of our Wiktionary-based resource and the results of our metalexicographic study. We conclude the thesis by emphasizing the usefulness of collaborative dictionaries when combined with expert-built resources, which bears much unused potential.