104 research outputs found
Proceedings of the COLING 2004 Post Conference Workshop on Multilingual Linguistic Ressources MLR2004
International audienceIn an ever expanding information society, most information systems are now facing the "multilingual challenge". Multilingual language resources play an essential role in modern information systems. Such resources need to provide information on many languages in a common framework and should be (re)usable in many applications (for automatic or human use). Many centres have been involved in national and international projects dedicated to building har- monised language resources and creating expertise in the maintenance and further development of standardised linguistic data. These resources include dictionaries, lexicons, thesauri, word-nets, and annotated corpora developed along the lines of best practices and recommendations. However, since the late 90's, most efforts in scaling up these resources remain the responsibility of the local authorities, usually, with very low funding (if any) and few opportunities for academic recognition of this work. Hence, it is not surprising that many of the resource holders and developers have become reluctant to give free access to the latest versions of their resources, and their actual status is therefore currently rather unclear. The goal of this workshop is to study problems involved in the development, management and reuse of lexical resources in a multilingual context. Moreover, this workshop provides a forum for reviewing the present state of language resources. The workshop is meant to bring to the international community qualitative and quantitative information about the most recent developments in the area of linguistic resources and their use in applications. The impressive number of submissions (38) to this workshop and in other workshops and conferences dedicated to similar topics proves that dealing with multilingual linguistic ressources has become a very hot problem in the Natural Language Processing community. To cope with the number of submissions, the workshop organising committee decided to accept 16 papers from 10 countries based on the reviewers' recommendations. Six of these papers will be presented in a poster session. The papers constitute a representative selection of current trends in research on Multilingual Language Resources, such as multilingual aligned corpora, bilingual and multilingual lexicons, and multilingual speech resources. The papers also represent a characteristic set of approaches to the development of multilingual language resources, such as automatic extraction of information from corpora, combination and re-use of existing resources, online collaborative development of multilingual lexicons, and use of the Web as a multilingual language resource. The development and management of multilingual language resources is a long-term activity in which collaboration among researchers is essential. We hope that this workshop will gather many researchers involved in such developments and will give them the opportunity to discuss, exchange, compare their approaches and strengthen their collaborations in the field. The organisation of this workshop would have been impossible without the hard work of the program committee who managed to provide accurate reviews on time, on a rather tight schedule. We would also like to thank the Coling 2004 organising committee that made this workshop possible. Finally, we hope that this workshop will yield fruitful results for all participants
Dynamic Network Notation: A Graphical Modeling Language to Support the Visualization and Management of Network Effects in Service Platforms
Service platforms have moved into the center of interest in both academic research and the IT industry due to their economic and technical impact. These multitenant platforms provide own or third party software as metered, on-demand services. Corresponding service offers exhibit network effects. The present work introduces a graphical modeling language to support service platform design with focus on the exploitation of these network effects
Proceedings of JAC 2010. Journées Automates Cellulaires
The second Symposium on Cellular Automata “Journ´ees Automates Cellulaires” (JAC 2010) took place in Turku, Finland, on December 15-17, 2010. The first two conference days were held in the Educarium building of the University of Turku, while the talks of the third day were given onboard passenger ferry boats in the beautiful Turku archipelago, along the route Turku–Mariehamn–Turku. The conference was organized by FUNDIM, the Fundamentals of Computing and Discrete Mathematics research center at the mathematics department of the University of Turku.
The program of the conference included 17 submitted papers that were selected by the international program committee, based on three peer reviews of each paper. These papers form the core of these proceedings. I want to thank the members of the program committee and the external referees for the excellent work that have done in choosing the papers to be presented in the conference. In addition to the submitted papers, the program of JAC 2010 included four distinguished invited speakers: Michel Coornaert (Universit´e de Strasbourg, France), Bruno Durand (Universit´e de Provence, Marseille, France), Dora Giammarresi (Universit` a di Roma Tor Vergata, Italy) and Martin Kutrib (Universit¨at Gie_en, Germany). I sincerely thank the invited speakers for accepting our invitation to come and give a plenary talk in the conference. The invited talk by Bruno Durand was eventually given by his co-author Alexander Shen, and I thank him for accepting to make the presentation with a short notice. Abstracts or extended abstracts of the invited presentations appear in the first part of this volume.
The program also included several informal presentations describing very recent developments and ongoing research projects. I wish to thank all the speakers for their contribution to the success of the symposium. I also would like to thank the sponsors and our collaborators: the Finnish Academy of Science and Letters, the French National Research Agency project EMC (ANR-09-BLAN-0164), Turku Centre for Computer Science, the University of Turku, and Centro Hotel. Finally, I sincerely thank the members of the local organizing committee for making the conference possible.
These proceedings are published both in an electronic format and in print. The electronic proceedings are available on the electronic repository HAL, managed by several French research agencies. The printed version is published in the general publications series of TUCS, Turku Centre for Computer Science. We thank both HAL and TUCS for accepting to publish the proceedings.Siirretty Doriast
Recommended from our members
Extraction of chemical structures and reactions from the literature
The ever increasing quantity of chemical literature necessitates
the creation of automated techniques for extracting relevant information.
This work focuses on two aspects: the conversion of chemical names to
computer readable structure representations and the extraction of chemical
reactions from text.
Chemical names are a common way of communicating chemical structure
information. OPSIN (Open Parser for Systematic IUPAC Nomenclature), an
open source, freely available algorithm for converting chemical names to
structures was developed. OPSIN employs a regular grammar to direct
tokenisation and parsing leading to the generation of an XML parse tree.
Nomenclature operations are applied successively to the tree with many
requiring the manipulation of an in-memory connection table representation
of the structure under construction. Areas of nomenclature supported are
described with attention being drawn to difficulties that may be
encountered in name to structure conversion. Results on sets of generated
names and names extracted from patents are presented. On generated names,
recall of between 96.2% and 99.0% was achieved with a lower bound of 97.9%
on precision with all results either being comparable or superior to the
tested commercial solutions. On the patent names OPSIN s recall was 2-10%
higher than the tested solutions when the patent names were processed as
found in the patents. The uses of OPSIN as a web service and as a tool for
identifying chemical names in text are shown to demonstrate the direct
utility of this algorithm.
A software system for extracting chemical reactions from the text of
chemical patents was developed. The system relies on the output of
ChemicalTagger, a tool for tagging words and identifying phrases of
importance in experimental chemistry text. Improvements to this tool
required to facilitate this task are documented. The structure of chemical
entities are where possible determined using OPSIN in conjunction with a
dictionary of name to structure relationships. Extracted reactions are
atom mapped to confirm that they are chemically consistent. 424,621 atom
mapped reactions were extracted from 65,034 organic chemistry USPTO
patents. On a sample of 100 of these extracted reactions chemical entities
were identified with 96.4% recall and 88.9% precision. Quantities could be
associated with reagents in 98.8% of cases and 64.9% of cases for products
whilst the correct role was assigned to chemical entities in 91.8% of
cases. Qualitatively the system captured the essence of the reaction in
95% of cases. This system is expected to be useful in the creation of
searchable databases of reactions from chemical patents and in
facilitating analysis of the properties of large populations of reactions
- …