A functional perspective on machine learning via programmable induction and abduction
We present a programming language for machine learning
based on the concepts of ‘induction’ and ‘abduction’ as encountered in
Peirce’s logic of science. We consider the desirable features such a language
must have, and we identify the ‘abductive decoupling’ of parameters
as a key general enabler of these features. Both an idealised abductive
calculus and its implementation as a PPX extension of OCaml are
presented, along with several simple examples.
Operators in the lexicon: on the negative logic of natural language
LEI Universiteit Leiden, Theoretical and Experimental Linguistics
30th International Conference on Information Modelling and Knowledge Bases
Information modelling is becoming an increasingly important topic for researchers, designers, and users of information systems. The amount and complexity of information itself, the number of abstraction levels of information, and the size of databases and knowledge bases are continuously growing. Conceptual modelling is one of the sub-areas of information modelling. The aim of this conference is to bring together experts from different areas of computer science and other disciplines who have a common interest in understanding and solving problems in information modelling and knowledge bases, as well as in applying the results of research to practice. We also aim to recognize and study new areas in modelling and knowledge bases to which more attention should be paid. Therefore philosophy and logic, cognitive science, knowledge management, linguistics, and management science are relevant areas, too. In the conference, there will be three categories of presentations: full papers, short papers, and position papers.
Eesti keele üldvaldkonna tekstide laia kattuvusega automaatne sündmusanalüüs (Broad-coverage automatic event analysis of general-domain Estonian texts)
Due to large-scale digitalisation and a switch from traditional written communication to digital written communication, vast amounts of human language text are becoming machine-readable. Machine-readability has the potential to ease the human effort of searching and organising large text collections, allowing applications such as automatic text summarisation and question answering. However, current tools for automatic text analysis do not reach the level of text understanding required to make these applications generic. It is hypothesised that automatic analysis of events in texts brings us closer to this goal, as many texts can be interpreted as stories or narratives that are decomposable into events.
This thesis explores event analysis as a broad-coverage, general-domain automatic language analysis problem in Estonian, and provides an investigation that starts from time-oriented event analysis and moves towards generic event analysis. We adapt the TimeML framework to Estonian, and create an automatic temporal expression tagger and a news corpus manually annotated for temporal semantics (event mentions, temporal expressions, and temporal relations) for the language; we analyse the consistency of human annotation of event mentions and temporal relations; and, finally, we provide a preliminary study on event coreference resolution in Estonian news.
The work also makes suggestions on how future research can improve Estonian event and temporal semantic annotation. The language resources developed in this work will allow future experimentation with end-user applications (such as automatic answering of temporal questions) and provide a basis for developing automatic semantic analysis tools.
Methodological Tools for Linguistic Description and Typology.
International audience
Scientific Conjectures and the Growth of Knowledge
A collective understanding that traces a debate between ‘what is science?’ and
‘what is a science about?’ has an extraction to the notion of scientific knowledge.
The debate undertakes the pursuit of science that hardly extravagance the dogma
of pseudo-science. Scientific conjectures invoke science as an intellectual activity
poured by experiences and repetition of the objects that look independent of any
idealist views (beliefs in a consensus on mind-dependent reality). The realistic
machinery employs in an empiricist exposition of the objective phenomenon by
synchronizing the general method to make observational predictions that cover all
the phenomena of the particular entity without any exception. The formation of science
encloses several epistemological purviews and a succession of conjectures cum
refutation that a newer theorem could reinstate. My attempt is to advocate a holistic
plea of scientific conjectures that outruns the restricted regulation of experience or
testable hypothesis to render the validity of a chain of logical reasoning (deductive
or inductive) of basic scientific statements. The milieu of scientific intensification
integrates speculation that loads efficiency towards a new experimental dimension
where the reality is not itself objective or observer-relative; in fact, the observed
phenomenon divulges in the constructive progression of preferred methods of falsifiability
and uncertainty.
Issues in Spanish Verbal Inflection: A Distributed Morphology Approach
This dissertation analyzes various issues in the morphology of Spanish’s seven simple verb forms in a syntax-centric morphological framework known as Distributed Morphology (DM). In the extant DM literature, scholars have primarily analyzed verbal inflection as a linear arrangement of morphemes (e.g., Madrid Servín, 2005; Oltra-Massuet and Arregi, 2005). However, failing to account for the interpretation of a given verbal form is problematic. A focus on the semantics of each verbal form is required to understand how several seemingly disparate forms, such as the future and the subjunctive or the conditional and the imperfect subjunctive, are related to each other and what this relationship reveals about their structure. Thus, a major claim made in this dissertation is that a fairly robust understanding of the semantics of each of the seven verbal forms considered is required (i) to link the structure of these verbal forms to their meanings, (ii) to account for contrasts that are not currently accounted for in the literature, and (iii) to make connections between forms that would not otherwise be obvious. Additionally, for the future and conditional forms in particular, it is argued that the historical analysis, which consists of an infinitive followed by a form of the verb haber ‘have’, is superior to proposed reanalysis-based approaches. This historically informed approach demonstrates that we cannot dismiss historical analyses wholesale. Throughout the dissertation, I also demonstrate that the morphosyntax of these seven simple Spanish verbal forms can be accounted for with less conceptual machinery than previously argued for in several DM analyses while covering more empirical ground. Specifically, it is argued that the employment of lexical diacritics and morphological readjustment rules, among other analytical devices, is unnecessary for the analysis of Spanish verbs.
In addition to these broad concerns, the dissertation proposes several novel solutions to data that have proven recalcitrant in prior analyses, thus making an important contribution to the theoretical literature on Spanish verbal morphology.
Time, events and temporal relations: an empirical model for temporal processing of Italian texts
The aim of this work is the elaboration of a computational model for the identification of temporal relations in text/discourse, to be used as a component in more complex systems for Open-Domain Question Answering, Information Extraction and Summarization. More specifically, the thesis concentrates on the relationships between the various elements which signal temporal relations in Italian texts/discourses, on their roles, and on how they can be exploited.
Time is a pervasive element of human life. It is the primary element thanks to which we are able to observe, describe and reason about what surrounds us and the world. The absence of a correct identification of the temporal ordering of what is narrated and/or described may result in poor comprehension, which can lead to misunderstanding. Normally, texts/discourses present situations standing in a particular temporal ordering. Whether these situations precede one another, overlap, or are included one within the other is inferred during the general process of reading and understanding. Nevertheless, to perform this seemingly easy task, we take into account a complex set of information involving different linguistic entities and sources of knowledge. A wide variety of devices is used in natural languages to convey temporal information. Verb tense, temporal prepositions, subordinate conjunctions, and adjectival phrases are some of the most obvious. Nevertheless, even these obvious devices have different degrees of temporal transparency, which may not be as obvious as a quick and superficial analysis suggests.
One of the main shortcomings of previous research on temporal relations is that it concentrated only on a particular discourse segment, namely narrative discourse, disregarding the fact that a text/discourse is composed of different types of discourse segments and relations. A good theory or framework for temporal analysis must take all of them into account. In this work, we have concentrated on the elaboration of a framework which can be applied to all text/discourse segments, without paying too much attention to their type, since we claim that temporal relations can be recovered in every kind of discourse segment and not only in narrative ones.
The model we propose is obtained by combining theoretical assumptions and empirical data, collected by means of two tests submitted to a total of 35 subjects with different backgrounds. The main results we have obtained from these empirical studies are: (i.) a general evaluation of the difficulty of the task of recovering temporal relations; (ii.) information on the level of granularity of temporal relations; (iii.) a saliency-based order of application of the linguistic devices used to express the temporal relations between two eventualities; (iv.) the proposal of tense temporal polysemy, as a device to identify the set of preferences which can assign unique values to possibly multiple temporal relations. On the basis of the empirical data, we propose to enlarge the set of classical fine-grained interval relations (Allen, 1983) by also including coarse-grained temporal relations (Freksa, 1992). Moreover, there could be cases in which we are not able to state reliably whether a temporal relation exists or what the particular relation between two entities is. To overcome this issue we have adopted the proposal by Mani (2007), which allows the system to have differentiated levels of temporal representation on the basis of the temporal granularity associated with each discourse segment. The lack of an annotated corpus for eventualities, temporal expressions and temporal relations in Italian is the biggest shortcoming of this work and has prevented the implementation of the model and its evaluation. Nevertheless, we have been able to conduct a series of experiments for the validation of procedures for the further realization of a working prototype. In addition to this, we have been able to implement and validate a working prototype for the spotting of temporal expressions in texts/discourses.
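For readers unfamiliar with the fine-grained interval relations of Allen (1983) mentioned above, the following is a minimal illustrative sketch in Python, not the thesis's model. It classifies two intervals into one of the thirteen Allen relations. The function name `allen_relation`, the tuple representation of an interval, and the relation labels are assumptions made for this example only.

```python
from typing import Tuple

# Hypothetical representation: an interval is (start, end) with start < end.
Interval = Tuple[float, float]


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the Allen (1983) relation that holds from interval a to interval b."""
    (a_start, a_end), (b_start, b_end) = a, b
    if a_start >= a_end or b_start >= b_end:
        raise ValueError("intervals must satisfy start < end")
    # Disjoint intervals with a gap between them.
    if a_end < b_start:
        return "before"
    if b_end < a_start:
        return "after"
    # Intervals that touch at exactly one endpoint.
    if a_end == b_start:
        return "meets"
    if b_end == a_start:
        return "met-by"
    # Shared endpoints.
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"
    # Strict containment.
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    # Only partial overlap remains.
    return "overlaps" if a_start < b_start else "overlapped-by"
```

For example, `allen_relation((1, 4), (4, 6))` yields `"meets"`. A coarse-grained scheme in the spirit of Freksa (1992) would group several of these labels together when the endpoints are only partially known; that grouping is not shown here.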
The Quantum Monadology
The modern theory of functional programming languages uses monads for
encoding computational side-effects and side-contexts, beyond bare-bone program
logic. Even though quantum computing is intrinsically side-effectful (as in
quantum measurement) and context-dependent (as on mixed ancillary states),
little of this monadic paradigm has previously been brought to bear on quantum
programming languages.
Here we systematically analyze the (co)monads on categories of parameterized
module spectra which are induced by Grothendieck's "motivic yoga of operations"
-- for the present purpose specialized to HC-modules and further to set-indexed
complex vector spaces. Interpreting an indexed vector space as a collection of
alternative possible quantum state spaces parameterized by quantum measurement
results, as familiar from Proto-Quipper-semantics, we find that these
(co)monads provide a comprehensive natural language for functional quantum
programming with classical control and with "dynamic lifting" of quantum
measurement results back into classical contexts.
We close by indicating a domain-specific quantum programming language (QS)
expressing these monadic quantum effects in transparent do-notation, embeddable
into the recently constructed Linear Homotopy Type Theory (LHoTT) which
interprets into parameterized module spectra. Once embedded into LHoTT, this
should make for formally verifiable universal quantum programming with linear
quantum types, classical control, dynamic lifting, and notably also with
topological effects.
Comment: 120 pages, various figures