VIT – Venice Italian Treebank: Syntactic and Quantitative Features
Proceedings of the Sixth International Workshop on Treebanks and
Linguistic Theories.
Editors: Koenraad De Smedt, Jan Hajič and Sandra Kübler.
NEALT Proceedings Series, Vol. 1 (2007), 43-54.
© 2007 The editors and contributors.
Published by
Northern European Association for Language
Technology (NEALT)
http://omilia.uio.no/nealt .
Electronically published at
Tartu University Library (Estonia)
http://hdl.handle.net/10062/4476
Towards Automatic Dialogue Understanding
In this paper we present work carried out to scale up GETARUNS, a system for text understanding, and to port it to dialogue understanding. The current goal is to automatically extract argumentative information in order to build argumentative structure. The long-term goal is to use argumentative structure to produce automatic summaries of spoken dialogues.
Very much like other deep linguistic processing systems (see Allen et al., 2007), our system is a generic text/dialogue understanding system that can be used in connection with an ontology (WordNet) and other similar repositories of commonsense knowledge. Word sense disambiguation takes place at the level of semantic interpretation and is represented in the Discourse Model. We present the adjustments we made in order to cope with transcribed spoken dialogues like those produced in the ICSI Berkeley project. The low-level component is organized according to LFG theory; at this level, the system performs pronominal binding, quantifier raising and temporal interpretation. The high-level component is where the Discourse Model is created from the Logical Form. For longer sentences the system switches from the top-down to the bottom-up system. In case of failure it backs off to the partial system, which produces a very lean and shallow semantics with no inference rules.
In a final section, we present a preliminary evaluation of the system on two tasks: automatic argumentative labelling and another frequently addressed task, referential vs. non-referential pronominal detection. The results obtained fare much better than those reported in similar experiments with machine learning approaches.
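The back-off behaviour described in this abstract can be illustrated with a minimal sketch. It is not the GETARUNS implementation: the three parser functions are hypothetical stand-ins, and the length threshold of 20 tokens is an assumption chosen only for illustration, since the abstract does not say what counts as a "longer" sentence.

```python
from typing import Callable, List, Optional, Tuple

# A parser returns a semantic representation, or None when it fails.
Parser = Callable[[List[str]], Optional[str]]

MAX_TOP_DOWN_TOKENS = 20  # hypothetical threshold, not taken from the paper


def analyse(
    tokens: List[str],
    top_down: Parser,
    bottom_up: Parser,
    partial: Parser,
    max_top_down_tokens: int = MAX_TOP_DOWN_TOKENS,
) -> Tuple[str, Optional[str]]:
    """Pick the deep parser by sentence length, then back off on failure."""
    if len(tokens) <= max_top_down_tokens:
        name, parser = "top-down", top_down
    else:
        name, parser = "bottom-up", bottom_up
    result = parser(tokens)
    if result is not None:
        return name, result
    # The deep analysis failed: fall back to the shallow partial system.
    return "partial", partial(tokens)


# Hypothetical stub parsers, used only to exercise the control flow.
def top_down_stub(tokens: List[str]) -> Optional[str]:
    # Pretend that disfluencies ("uh") make the deep parse fail.
    return None if "uh" in tokens else "deep(" + " ".join(tokens) + ")"


def bottom_up_stub(tokens: List[str]) -> Optional[str]:
    return "deep(" + " ".join(tokens) + ")"


def partial_stub(tokens: List[str]) -> Optional[str]:
    return "shallow(" + " ".join(tokens) + ")"


short = analyse("we should meet on Monday".split(),
                top_down_stub, bottom_up_stub, partial_stub)
disfluent = analyse("uh we should uh meet".split(),
                    top_down_stub, bottom_up_stub, partial_stub)
long_turn = analyse(["w"] * 25, top_down_stub, bottom_up_stub, partial_stub)
```

Returning the name of the component that produced the analysis makes it easy to count how often each level of the cascade is used on a dialogue corpus.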
Deep Linguistic Processing with GETARUNS for Spoken Dialogue Understanding
In this paper we present work carried out to scale up GETARUNS, a system for text understanding, and to port it
to dialogue understanding. The current goal is to automatically extract argumentative information in
order to build argumentative structure. The long-term goal is to use argumentative structure to produce automatic
summaries of spoken dialogues. Very much like other deep linguistic processing systems, our system is a generic
text/dialogue understanding system that can be used in connection with an ontology (WordNet) and other similar
repositories of commonsense knowledge. We present the adjustments we made in order to cope with transcribed
spoken dialogues like those produced in the ICSI Berkeley project. In a final section we present a preliminary evaluation of
the system on two tasks: automatic argumentative labeling and another frequently addressed task, referential vs.
non-referential pronominal detection. The results obtained fare much better than those reported in similar experiments with
machine learning approaches.
English/Veneto Resource Poor Machine Translation with STILVEN
The paper reports ongoing work on the
implementation of a system for automatic translation
from English to Veneto and vice versa. The system has
no parallel texts to work on because such manual
translations are almost nonexistent. The
project is called STILVEN and is financed by the
Regional Authorities of the Veneto Region in Italy. After
the first year of activities, we managed to produce a
prototype which handles Venetian questions that have
a structure very close to English. We present
problems related to Veneto, the basic ideas, their
implementation and the results obtained.
Advanced age, time to treatment and long-term mortality: single centre data from the FAST-STEMI network
Background. Optimized techniques and wider access to mechanical reperfusion have significantly improved the outcomes of patients with ST-segment elevation myocardial infarction (STEMI). However, suboptimal results have been observed in certain higher-risk subsets of patients, such as those of advanced age, in whom the benefits of primary percutaneous coronary intervention (PCI) are more debated. We evaluated the impact of systematic primary PCI and an optimized STEMI network on long-term prognosis in a single-centre experience.
Methods. We analysed STEMI patients included in the FAST-STEMI network between 2016 and 2019. Ischemia duration was defined as the time from symptom onset to coronary reopening (pain-to-balloon, PTB). The primary study endpoint (PE) was a composite of mortality and recurrent MI at long-term follow-up. Individual outcome endpoints were also assessed.
Results. We included 253 patients undergoing primary PCI and discharged alive. Mean age was 67.2 ± 12.5 years; 75.1% were male and 19.8% were diabetic. At a median follow-up of 581 [307–922] days, the primary endpoint occurred in 24 patients (7.9%), of whom 5.5% died. The occurrence of a cardiovascular event was significantly associated with advanced age (p < 0.001), renal failure (p = 0.03), lower ejection fraction at discharge (p = 0.04) and longer in-hospital stay (p = 0.01). The median PTB was 198 minutes [IQR: 125–340 min] and was significantly longer among patients experiencing the PE (p = 0.01). A linear relationship was observed between age and PTB (r = 0.13, p = 0.009). However, both age ≥ 75 years and PTB above the median emerged as independent predictors of the primary endpoint (age: HR [95% CI] = 5.56 [2.26–13.7], p < 0.001; PTB: HR [95% CI] = 3.59 [1.39–9.3], p = 0.01). Similar results were observed for overall mortality.
Conclusion. The present study shows that, among STEMI patients undergoing primary PCI in a single centre, the duration of ischemia and advanced age are independently associated with long-term mortality and recurrent myocardial infarction. However, a longer time to reperfusion was observed among elderly patients.
BOIN NO MUSEIKA in Japanese: desonorization, devocalization or vowel elision?
Museika is a phenomenon that mainly affects the Japanese high vowels [i] and [ɯ], and is found essentially in the informal register of Tokyo speech. From a phonological point of view, when these vowels occur between two [-voiced] obstruent consonants, or morpheme-finally after a [-voiced] consonant, they undergo "devoicing" and are represented phonetically as [i̥] and [ɯ̥]. In this work we propose a graded, three-level differentiation of "devoicing": depending on whether or not formant traces are present in the spectrum, we distinguish "desonorization" (desonorizzazione) from "devocalization" (devocalizzazione), and we also identify the cases of true vowel elision.
The experiment we carried out is a comparative study that contrasts lexical loans from English, called "gairaigo" in Japanese, with pre-existing terms in standard Japanese, in order to verify whether the phonological rule of museika is automatically extended to these new lexical items. It is worth noting that applying the rule in any case alters the original pronunciation of gairaigo words, which must comply fully with Japanese phonology in order to be introduced into the language. The words were pronounced by male and female speakers within a carrier sentence. Spectrographic analysis of the words affected by museika allowed us to distinguish clearly between different types or degrees of "devoicing" of Japanese vocoids. The analyses show that the acoustic and articulatory nature of the consonant preceding the high vowels is fundamental to the realization of museika, but that other important factors also interact with this phenomenon.
Enriching the Venice Italian Treebank with dependency and grammatical relations
In this paper we propose a rule-based approach to extract dependency and grammatical relations from the Venice Italian Treebank (VIT) (Delmonte et al., 2007), which is annotated with bracketed tree structures. To our knowledge, the only available dependency-annotated corpus for Italian is the Turin University Treebank (Lesmo et al., 2002), which has 25,000 tokens and is about 1/10 the size of VIT. As manual corpus annotation is expensive and time-consuming, we decided to exploit an existing constituency-based treebank, the VIT, to derive dependency structures with less effort. After describing the procedure to extract heads and dependents, based on a head percolation table for Italian, we introduce the rules adopted to add grammatical relation labels. For this purpose, we manually relabeled all non-canonical arguments, which are very frequent in Italian, and then automatically labeled the remaining complements or arguments following syntactic restrictions based on the position of the constituents with respect to parent and sibling nodes. The final section of the paper describes the evaluation, carried out in two steps, one for dependency relations and one for grammatical roles. Since the results are promising, we plan to use the dependency treebank to train a dependency-based parser and eventually a semantic role labelling system.
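As an illustration of the head-percolation step mentioned in this abstract, the following is a minimal sketch of how head-dependent pairs can be read off a constituency tree. The `Node` class, the toy `HEAD_TABLE` and its labels are hypothetical and chosen only for the example; they are not the VIT tagset or the authors' actual table, and grammatical relation labelling is not covered.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    label: str                      # constituent label or POS tag
    word: str = ""                  # filled in only for leaves
    children: List["Node"] = field(default_factory=list)


# Hypothetical head table: parent label -> (scan direction, child-label priority).
HEAD_TABLE = {
    "S": ("right", ["VP", "NP"]),
    "VP": ("left", ["V", "VP"]),
    "NP": ("right", ["N", "NP"]),
    "PP": ("left", ["P"]),
}


def head_child(node: Node) -> Node:
    """Select the head child of a constituent using the percolation table."""
    direction, priorities = HEAD_TABLE.get(node.label, ("left", []))
    kids = node.children if direction == "left" else list(reversed(node.children))
    for label in priorities:
        for kid in kids:
            if kid.label == label:
                return kid
    return kids[0]  # default: first child in scan direction


def extract(node: Node, deps: List[Tuple[str, str]]) -> str:
    """Return the lexical head of `node`; append (head, dependent) pairs to `deps`."""
    if not node.children:
        return node.word
    head_kid = head_child(node)
    head_word = extract(head_kid, deps)
    for kid in node.children:
        if kid is not head_kid:
            # The lexical head of each non-head child depends on this head.
            deps.append((head_word, extract(kid, deps)))
    return head_word


# Toy tree for "Maria legge un libro" ("Maria reads a book").
tree = Node("S", children=[
    Node("NP", children=[Node("N", "Maria")]),
    Node("VP", children=[
        Node("V", "legge"),
        Node("NP", children=[Node("D", "un"), Node("N", "libro")]),
    ]),
])

deps: List[Tuple[str, str]] = []
root = extract(tree, deps)
```

For brevity the sketch identifies tokens by word form; a real conversion would use token positions so that repeated words stay distinct.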
VIT: Venice Italian Treebank: syntactic-semantic and quantitative features
In this article we describe VIT, the (Syntactic) Treebank of Italian of the University of Venice (Venice Italian Treebank), comprising 320,000 words and created by the Laboratory of Computational Linguistics of the Department of Language Sciences. We focus on the syntactic-semantic features of the treebank, which are partly tied to the tagset adopted, partly due to the linguistic theory of reference and, as with any treebank, partly tied to the chosen language, Italian. Using examples also drawn from treebanks available for other languages, we show what the differences are and the theoretical and practical motivations behind the choices made. Finally, we devote part of our presentation to the quantitative analysis of our treebank's data, comparing them with those of the other treebanks. In general, we try to show that automatically learning a grammar or a parser from a treebank cannot yield the same results from one treebank to another, and that this process depends on substantial factors such as the linguistic framework adopted for the structural description and, ultimately, the language described.