994 research outputs found

Nodalida 2005 - proceedings of the 15th NODALIDA conference

Publication venue: University of Joensuu

Foundation, Implementation and Evaluation of the MorphoSaurus System: Subword Indexing, Lexical Learning and Word Sense Disambiguation for Medical Cross-Language Information Retrieval

Author: Markó Kornél Géza
Publication date: 05/03/2009

In everyday medical practice, which involves a great deal of documentation and search work, the majority of textually encoded information is by now available electronically. The development of powerful methods for efficient retrieval is therefore of primary importance. Judged from the perspective of medical sublanguage, common text retrieval systems lack morphological functionality (inflection, derivation, and compounding), lexical-semantic functionality, and the ability to analyze large document collections across languages. This doctoral thesis covers the theoretical foundations of the MorphoSaurus system (an acronym for morpheme thesaurus). Its methodological core is a thesaurus organized around morphemes of medical expert and lay language, whose entries are linked across languages by semantic relations. Building on this, a procedure is presented that segments (complex) words into morphemes, which are then replaced by language-independent, concept-class-like symbols. The resulting representation is the basis for cross-language, morpheme-oriented text retrieval. In addition to the core technology, a method for the automatic acquisition of lexicon entries is presented, by which existing morpheme lexicons are extended to further languages. Taking cross-language phenomena into account then leads to a novel procedure for resolving semantic ambiguities. The performance of morpheme-oriented text retrieval is tested empirically in extensive, standardized evaluations and compared with common approaches.

Digitale Bibliothek Thüringen

From Frequency to Meaning: Vector Space Models of Semantics

Author: Pantel Patrick
Turney Peter D.
Publication venue: 'AI Access Foundation'
Publication date: 01/01/2010

Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term-document, word-context, and pair-pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and we take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field.

arXiv.org e-Print Archive

CiteSeerX

NRC Publications Archive

Crossref

Spoken content retrieval: A survey of techniques and technologies

Author: Ani Nenkova
C A. Nenkova
K. Mckeown
Kathleen Mckeown
Publication venue: 'Now Publishers'
Publication date: 01/01/2012

Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition, and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight on how these fields are integrated to support research and development, thus addressing the core challenges of SCR.

CiteSeerX

Crossref

Irish Universities

DCU Online Research Access Service

A Task-based Evaluation of French Morphological Resources and Tools

Author: Bernhard Delphine
Cartoni Bruno
Tribout Delphine
Publication venue: Stanford Calif.: CSLI Publications
Publication date: 01/01/2011

Morphology is a key component for many Language Technology applications. However, morphological relations, especially those relying on the derivation and compounding processes, are often addressed in a superficial manner. In this article, we focus on assessing the relevance of deep and motivated morphological knowledge in Natural Language Processing applications. We first describe an annotation experiment whose goal is to evaluate the role of morphology for one task, namely Question Answering (QA). We then highlight the kind of linguistic knowledge that is necessary for this particular task and propose a qualitative analysis of morphological phenomena in order to identify the morphological processes that are most relevant. Based on this study, we perform an intrinsic evaluation of existing tools and resources for French morphology, in order to quantify their coverage. Our conclusions provide helpful insights for using and building appropriate morphological resources and tools that could have a significant impact on the application performance.

Hal-Diderot

International Conference on Modern Greek Dialects and Linguistic Theory 9, Leonidio, Tsakonia, Greece, 4-5 June 2021 : Abstracts

Author: Janse Mark
Joseph Brian D.
Kisilier Maxim
Ralli Angela
Publication venue: University of Patras
Publication date: 01/01/2021

Ghent University Academic Bibliography

Ti and ki in Pharasiot Greek

Author: Bagriacik Metin
Sampanis Konstantinos
Publication venue: University of Patras
Publication date: 01/01/2021

Ghent University Academic Bibliography

Vowel variation in the Mišótika Cappadocian of Mandra (Larisa)

Author: Janse Mark
Papazachariou Dimitris
Vassalou Nikoleta
Publication venue: University of Patras
Publication date: 01/01/2021

Ghent University Academic Bibliography

Gender, definiteness and word order in Ulağaç Cappadocian

Author: Daveloose Eline
Janse Mark
Publication venue: University of Patras
Publication date: 01/01/2021

Of all the Cappadocian dialects, Ulağaç Cappadocian is considered the most ‘corrupt’ by Dawkins: “Nowhere is the vocabulary so filled with Turkish words or the syntax so Turkish” (1916: 18). Kesisoglou singles out the following as being characteristic: the loss of grammatical gender distinctions and the resulting neuterisation of nouns, including the generalized use of the neuter article do, pl. da (1951: 4). In the case of transitive clauses this results in potential ambiguity, as nominative and accusative NPs are not distinguished morphologically. Kesisoglou quotes the following example: itó do néka do ándra-t páasen do do xorjó, which could either mean ‘that woman led her husband to the village’ or ‘that woman, her husband led her to the village’ (1951: 49). To disambiguate such cases, the article is often omitted under the second interpretation according to Kesisoglou (ibid.): itó do néka ándra-t páasen do do xorjó. Likewise, itó do peí vavá-t çórsen do ‘that child, its father saw it’ vs. itó do peí do vavá-t çórsen do ‘that child saw its father’ (ibid.). This suggests that the article is omitted in the case of subject NPs, but not in the case of object NPs (Janse 2019: 100). Upon closer scrutiny, however, it turns out that the article can only be omitted if the noun is historically masculine or feminine, but not neuter. In this paper, I investigate the use of the article in transitive clauses containing two overt NPs in connection with the word order and information structure of these clauses as a means of distinguishing subject from object NP.

Ghent University Academic Bibliography

Subject contact relatives in Asia Minor Greek

Author: Bagriacik Metin
Publication venue: University of Patras
Publication date: 01/01/2021

Ghent University Academic Bibliography