ATLAS: A flexible and extensible architecture for linguistic annotation
We describe a formal model for annotating linguistic artifacts, from which we
derive an application programming interface (API) to a suite of tools for
manipulating these annotations. The abstract logical model provides for a range
of storage formats and promotes the reuse of tools that interact through this
API. We focus first on ``Annotation Graphs,'' a graph model for annotations on
linear signals (such as text and speech) indexed by intervals, for which
efficient database storage and querying techniques are applicable. We note how
a wide range of existing annotated corpora can be mapped to this annotation
graph model. This model is then generalized to encompass a wider variety of
linguistic ``signals,'' including both naturally occurring phenomena (as
recorded in images, video, multi-modal interactions, etc.), as well as the
derived resources that are increasingly important to the engineering of natural
language processing systems (such as word lists, dictionaries, aligned
bilingual corpora, etc.). We conclude with a review of the current efforts
towards implementing key pieces of this architecture.
Comment: 8 pages, 9 figures
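The interval-indexed annotation-graph model described in this abstract can be illustrated with a small data structure: nodes anchored at offsets on a linear signal, and labeled arcs spanning the intervals between them. The class and method names below (Node, Arc, AnnotationGraph, overlapping) are illustrative assumptions, not the paper's actual API; this is a minimal sketch of the idea, not the ATLAS implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass(frozen=True)
class Node:
    id: str
    offset: float  # anchor on the linear signal (seconds of speech, char index of text)

@dataclass(frozen=True)
class Arc:
    src: str    # id of the start node
    dst: str    # id of the end node
    layer: str  # annotation type, e.g. "word" or "phone"
    label: str

@dataclass
class AnnotationGraph:
    nodes: Dict[str, Node] = field(default_factory=dict)
    arcs: List[Arc] = field(default_factory=list)

    def add_node(self, node_id: str, offset: float) -> None:
        self.nodes[node_id] = Node(node_id, offset)

    def add_arc(self, src: str, dst: str, layer: str, label: str) -> None:
        self.arcs.append(Arc(src, dst, layer, label))

    def overlapping(self, start: float, end: float) -> List[Arc]:
        """Arcs whose interval intersects [start, end) -- the kind of
        interval query the abstract notes maps well onto databases."""
        return [a for a in self.arcs
                if self.nodes[a.src].offset < end
                and self.nodes[a.dst].offset > start]

# Two word annotations over a short speech signal, offsets in seconds.
g = AnnotationGraph()
g.add_node("n0", 0.0)
g.add_node("n1", 0.4)
g.add_node("n2", 0.9)
g.add_arc("n0", "n1", "word", "hello")
g.add_arc("n1", "n2", "word", "world")
hits = [a.label for a in g.overlapping(0.5, 1.0)]  # only "world" overlaps
```

Because every arc is indexed by the offsets of its endpoint nodes, the overlap query reduces to two comparisons per arc, which is what makes efficient database storage and interval querying applicable.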
Cross-Platform Text Mining and Natural Language Processing Interoperability - Proceedings of the LREC2016 conference
No abstract available
Learning Transfers over Several Programming Languages
Large language models (LLMs) have recently become remarkably good at
improving developer productivity for high-resource programming languages. These
models use two kinds of data: large amounts of unlabeled code samples for
pretraining and relatively smaller amounts of labeled code samples for
fine-tuning or in-context learning. Unfortunately, many programming languages
are low-resource, lacking labeled samples for most tasks and often even lacking
unlabeled samples. Therefore, users of low-resource languages (e.g., legacy or
new languages) miss out on the benefits of LLMs. Cross-lingual transfer
learning uses data from a source language to improve model performance on a
target language. It has been well-studied for natural languages, but has
received little attention for programming languages. This paper reports
extensive experiments on four tasks using a transformer-based LLM and 11 to 41
programming languages to explore the following questions. First, how well does
cross-lingual transfer work for a given task across different language pairs?
Second, given a task and a target language, how best to choose a source
language? Third, which characteristics of a language pair are predictive of
transfer performance? And fourth, how does that depend on the given task?
Comment: 16 pages, 5 figures, 5 tables
E(nhanced)-research and the future role and tasks of research libraries
Presentation at the University of Tartu Library as part of the German-Estonian academic week Academica, 4 November 2008
Web services for distributed and interoperable hydro-information systems
Web services support the integration and interoperability of Web-based
applications and enable machine-to-machine interaction. The concepts of web
services and an open distributed architecture were applied to the development
of T-DSS, a prototype customised for web-based hydro-information systems.
T-DSS provides mapping services, database-related services, and access to
remote components, with special emphasis placed on output flexibility
(e.g. multilingualism); SOAP web services are mainly used for communication.
The remote components are represented above all by remote data and mapping
services (e.g. meteorological predictions) and by modelling and analytical
systems (currently HEC-HMS, MODFLOW and additional utilities), which support
decision making in water management.
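The SOAP communication this abstract describes can be sketched by constructing a request envelope. The service namespace, the getForecast operation, and its parameters below are hypothetical illustrations (the abstract does not name T-DSS's operations); the language parameter stands in for the multilingual output flexibility it mentions. No network call is made here.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SVC_NS = "http://example.org/tdss"  # hypothetical service namespace

def build_request(station: str, lang: str) -> bytes:
    """Build a SOAP 1.1 envelope for a hypothetical getForecast operation.

    A client would POST these bytes to the service endpoint with a
    SOAPAction header; here we only construct and serialize the message.
    """
    env = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(env, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, f"{{{SVC_NS}}}getForecast")
    ET.SubElement(op, f"{{{SVC_NS}}}stationId").text = station
    ET.SubElement(op, f"{{{SVC_NS}}}language").text = lang  # e.g. output in German
    return ET.tostring(env, xml_declaration=True, encoding="utf-8")

req = build_request("ST-042", "de")
root = ET.fromstring(req)  # round-trip: the envelope parses back cleanly
```

Exchanging self-describing XML envelopes like this, rather than binary protocols, is what lets the remote data, mapping, and modelling components interoperate across platforms.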
AXMEDIS 2007 Conference Proceedings
The AXMEDIS International Conference series, established in 2005, focuses on research, development, and applications in the cross-media domain, exploring innovative technologies to meet the challenges of the sector. AXMEDIS 2007 deals with all subjects and topics related to cross-media and digital-media content production, processing, management, standards, representation, sharing, interoperability, protection, and rights management. It addresses the latest developments and future trends of these technologies and their applications, and their impact and exploitation within academic, business, and industrial communities
Selected proceedings of the 50th Linguistic Symposium on Romance Languages
Synopsis:
This volume presents a selection of the revised and peer-reviewed proceedings articles of the 50th Linguistic Symposium on Romance Languages (LSRL 50), which was hosted virtually by the faculty and students of the University of Texas at Austin. With contributions from rising and senior scholars from Europe and the Americas, the volume demonstrates the breadth of research in contemporary Romance linguistics, with articles that apply corpus-based and laboratory methods, as well as theory, to explore the structure, use, and development of the Romance languages. The articles cover a wide range of fields, including morphosyntax, semantics, language variation and change, sociophonetics, historical linguistics, language acquisition, and computational linguistics. In an introductory article, the editors document the sudden transition of LSRL 50 to a virtual format and acknowledge those who helped them ensure the continuity of this annual scholarly meeting