
    Wide-coverage parsing for Turkish

    Wide-coverage parsing is an area that attracts much attention in natural language processing research. This is because it is the first step to many other applications in natural language understanding, such as question answering. Supervised learning using human-labelled data is currently the best-performing method, so there is great demand for annotated data. However, human annotation is very expensive, and the amount of annotated data is always far less than is needed to train well-performing parsers. This is the motivation behind making the best use of the available data. Turkish presents a challenge both because syntactically annotated Turkish data is relatively scarce and because Turkish is highly agglutinative, hence unusually sparse at the whole-word level. The METU-Sabancı Treebank is a dependency treebank of 5620 sentences with surface dependency relations and morphological analyses for words. We show that including even the crudest forms of morphological information extracted from the data boosts the performance of both generative and discriminative parsers, contrary to received opinion concerning English. We induce word-based and morpheme-based CCG grammars from the Turkish dependency treebank. We use these grammars to train a state-of-the-art CCG parser that predicts long-distance dependencies in addition to the ones that other parsers are capable of predicting. We also use the correct CCG categories as simple features in a graph-based dependency parser and show that this improves the parsing results. We show that a morpheme-based CCG lexicon for Turkish is able to solve many problems, such as conflicts of semantic scope, recovering long-range dependencies, and obtaining smoother statistics from the models. CCG handles linguistic phenomena such as local and long-range dependencies more naturally and effectively than other linguistic theories, while potentially supporting semantic interpretation in parallel.
Using morphological information and a morpheme-cluster-based lexicon improves performance both quantitatively and qualitatively for Turkish. We also provide an improved version of the treebank, which will be released by kind permission of METU and Sabancı.

    CSS Minification via Constraint Solving

    Minification is a widely-accepted technique which aims at reducing the size of the code transmitted over the web. We study the problem of minifying Cascading Style Sheets (CSS) --- the de facto language for styling web documents. Traditionally, CSS minifiers focus on simple syntactic transformations (e.g. shortening colour names). In this paper, we propose a new minification method based on merging similar rules in a CSS file. We consider safe transformations of CSS files, which preserve the semantics of the CSS file. The semantics of CSS files are sensitive to the ordering of rules in the file. To automatically identify a rule-merging opportunity that best minimises file size, we reduce the rule-merging problem to a problem on CSS-graphs, i.e., node-weighted bipartite graphs with a dependency ordering on the edges, where weights capture the number of characters (e.g. in a selector or in a property declaration). Roughly speaking, the corresponding CSS-graph problem concerns minimising the total weight of a sequence of bicliques (complete bipartite subgraphs) that covers the CSS-graph and respects the edge order. We provide the first full formalisation of CSS3 selectors and reduce dependency detection to satisfiability of quantifier-free integer linear arithmetic, for which highly-optimised SMT-solvers are available. To solve the above NP-hard graph optimisation problem, we show how Max-SAT solvers can be effectively employed. We have implemented our algorithms using Max-SAT and SMT-solvers as backends, and tested them against approximately 70 real-world examples (including the top 20 most popular websites). In our benchmarks, our tool yields larger savings than six well-known minifiers (which do not perform rule-merging, but support many other optimisations). Our experiments also suggest that better savings can be achieved in combination with one of these six minifiers.
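The character savings from rule merging can be sketched with a toy example (an assumed simplification, not the paper's algorithm): two rules with identical declaration blocks collapse into one rule with a grouped selector, eliminating one copy of the duplicated body. The real method must also verify, via the SMT-based dependency check, that the rule ordering permits the merge; that check is omitted here.

```python
# Toy sketch of rule merging (assumed simplification of the abstract's
# idea). A rule is (selectors, declarations); its serialised size is the
# character count of "sel1,sel2{decl1;decl2}".
def rule_size(selectors, decls):
    return len(",".join(selectors)) + len("{" + ";".join(decls) + "}")

# Two rules with identical declaration blocks (hypothetical example).
r1 = (["h1"],     ["color:red", "margin:0"])
r2 = ([".title"], ["color:red", "margin:0"])

before = rule_size(*r1) + rule_size(*r2)  # two separate rules
merged = (r1[0] + r2[0], r1[1])           # "h1,.title{color:red;margin:0}"
after  = rule_size(*merged)

print(before, after)  # 48 29 -- the merge saves 19 characters
```

Choosing which merges to apply, and in what order, so that total size is minimised while all ordering dependencies are respected is exactly the NP-hard CSS-graph optimisation the abstract delegates to a Max-SAT solver.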

    Handling metadata in the scope of coreference detection in data collections


    Chi-Thinking: Chiasmus and Cognition

    The treatise proposes that chiasmus is a dominant instrument that conducts processes and products of human thought. The proposition grows out of work in cognitive semantics and cognitive rhetoric. These disciplines establish that conceptualization traces to embodied image-schematic knowledge. The Introduction sets out how this knowledge gathers from perceptions, experiences, and memories of the body's commonplace engagements in space. With these ideas as suppositional foundation, the treatise contends that chiastic instrumentation is a function of a corporeal mind steeped in elementary, nonverbal spatial forms or gestalts. It shows that chiasmus is a space shape that lends itself to cognition via its simple but unique architecture and, critically, that architecture's particular meaning affordances. We profile some chiastic meanings over others based on local conditions. Chiastic iconicity ('lending') devolves from LINE CROSSING in 2-D and PATH CROSSING in 3-D space and from other image schemas (e.g., BALANCE, PART-TO-WHOLE) that naturally syndicate with CROSSING. Profiling and iconicity are cognitive activities. The spatio-physical and the visual aspects of cross diagonalization are discussed under the Chapter Two heading 'X-ness.' Prior to this technical discussion, Chapter One surveys the exceptional versatility and universality of chiasmus across verbal spectra, from radio and television advertisements to the literary arts. The purposes of this opening section are to establish that chiasticity merits more than its customary status as mere rhetorical figure or dispensable stylistic device and to give a foretaste of the complexity, yet automaticity, of chi-thinking. The treatise's first half describes the complexity, diversity, and structural inheritance of chiasmus. The second half treats individual chiasma, everything from the most mundane instantiations to the sublime and virtuosic.
Chapter Three details the cognitive dimensions of the macro chiasm, which are appreciable in the micro. It builds on the argument that chiasmus secures two cognitive essentials: association and dissociation. Chapter Four, advantaged by Kenneth Burke's "psychology of form," elects chiasmus as an instrument of inordinate form and then explores the issue of Betweenity, i.e., how chiasma, like crisscrosses, direct notice to an intermediate region. The study ends on the premise that chiasmus executes form-meaning pairings with which humans are highly fluent.

    A complex systems approach to education in Switzerland

    The insights gained from the study of complex systems in biological, social, and engineered systems enable us not only to observe and understand, but also to actively design systems capable of successfully coping with complex and dynamically changing situations. The methods and mindset required for this approach have been applied to educational systems with their diverse levels of scale and complexity. Based on the general case made by Yaneer Bar-Yam, this paper applies the complex systems approach to the educational system in Switzerland. It confirms that the complex systems approach is valid. Indeed, many recommendations made for the general case have already been implemented in the Swiss education system. To address existing problems and difficulties, further steps are recommended. This paper contributes to the further establishment of the complex systems approach by shedding light on an area which concerns us all, which is a frequent topic of discussion and dispute among politicians and the public, where billions of dollars have been spent without achieving the desired results, and where it is difficult to directly derive consequences from actions taken. The analysis of the education system's different levels, their complexity, and their scale will clarify how such a dynamic system should be approached and how it can be guided towards the desired performance.