Ontology learning from Italian legal texts
The paper reports on the methodology and preliminary results of a case study in automatically extracting ontological knowledge from Italian legislative texts. We use a fully-implemented ontology learning system (T2K) that includes a battery of tools for Natural Language Processing (NLP), statistical text analysis and machine learning. Tools are dynamically integrated to provide an incremental representation of the content of vast repositories of unstructured documents. Evaluated results, however preliminary, show the great potential of NLP-powered incremental systems like T2K for accurate large-scale semi-automatic extraction of legal ontologies.
From Text to Knowledge and Back: Term Extraction and Semantic Annotation of Domain-specific Document Collections (Dal testo alla conoscenza e ritorno: estrazione terminologica e annotazione semantica di basi documentali di dominio)
The paper focuses on the automatic extraction of domain knowledge from Italian legal texts and presents a fully-implemented ontology learning system (T2K, Text-2-Knowledge) that includes a battery of tools for Natural Language Processing, statistical text analysis and machine learning. Evaluated results show the considerable potential of systems like T2K, exploiting an incremental interleaving of NLP and machine learning techniques for accurate large-scale semi-automatic extraction and structuring of domain-specific knowledge.
Acquiring Legal Ontologies from Domain-specific Texts
The paper reports on the methodology and preliminary results of a case study in automatically extracting ontological knowledge from Italian legislative texts in the environmental domain. We use a fully-implemented ontology learning system (T2K) that includes a battery of tools for Natural Language Processing (NLP), statistical text analysis and machine learning. Tools are dynamically integrated to provide an incremental representation of the content of vast repositories of unstructured documents. Evaluated results, however preliminary, are very encouraging, showing the great potential of NLP-powered incremental systems like T2K for accurate large-scale semi-automatic extraction of legal ontologies.
An iterative approach for lexicon characterization in juridical context
In the juridical context, knowledge management applications play a central role. To improve the effectiveness of document management procedures, techniques for the automatic comprehension of textual content are required. In this work, a methodology for the semi-automatic derivation of knowledge from document collections is proposed. To extract relevant information from document text, a process integrating both statistical and lexical approaches is applied. Moreover, we propose a system for evaluating the quality of the extracted domain-specific lexicon. The system is applied to a heterogeneous corpus of documents from the Italian juridical domain.
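The combination of statistical and lexical approaches described above can be illustrated with a minimal sketch: score candidate terms by how much more frequent they are in the domain corpus than in a reference corpus (the statistical step), after discarding stopwords and very short tokens (the lexical step). This is an illustrative stand-in, not the paper's actual method; the function name and scoring formula are assumptions.

```python
from collections import Counter
import math

def extract_domain_terms(domain_docs, reference_docs, stopwords, top_k=5):
    """Toy domain-lexicon extraction (hypothetical, not the paper's system).

    Statistical step: weight each candidate by its domain probability times
    the log-ratio of domain vs. (smoothed) reference probability.
    Lexical step: drop stopwords and tokens shorter than 3 characters.
    """
    def freqs(docs):
        counts = Counter()
        for doc in docs:
            counts.update(word.lower() for word in doc.split())
        return counts

    dom, ref = freqs(domain_docs), freqs(reference_docs)
    total_dom, total_ref = sum(dom.values()), sum(ref.values())
    scores = {}
    for term, n in dom.items():
        if term in stopwords or len(term) < 3:          # lexical filter
            continue
        p_dom = n / total_dom
        p_ref = (ref[term] + 1) / (total_ref + len(ref))  # add-one smoothing
        scores[term] = p_dom * math.log(p_dom / p_ref)    # statistical weight
    ranked = sorted(scores.items(), key=lambda kv: -kv[1])
    return [term for term, _ in ranked[:top_k]]

# Example: a term frequent in the domain corpus but absent from the
# reference corpus ranks at the top of the extracted lexicon.
legal = ["the decree regulates waste disposal", "the decree amends the statute"]
general = ["the cat sat on the mat", "the weather is nice"]
print(extract_domain_terms(legal, general, {"the", "on", "is"}))
```

A real system would work on lemmatized multi-word terms and a much larger reference corpus, but the contrast-and-filter structure is the same.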
Ontology population for open-source intelligence: A GATE-based solution
Open-Source INTelligence is intelligence based on publicly available sources such as news sites, blogs, forums, etc. The Web is the primary source of information, but once data are crawled, they need to be interpreted and structured. Ontologies may play a crucial role in this process, but because of the vast amount of documents available, automatic mechanisms for their population are needed, starting from the crawled text. This paper presents an approach for the automatic population of predefined ontologies with data extracted from text and discusses the design and realization of a pipeline based on the General Architecture for Text Engineering system, which is interesting for both researchers and practitioners in the field. Some experimental results that are encouraging in terms of correctly extracted ontology instances are also reported. Furthermore, the paper describes an alternative approach and provides additional experiments for one of the phases of our pipeline, which requires the use of predefined dictionaries for relevant entities. Through this variant, the manual workload required in the phase was reduced while still obtaining promising results.
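The dictionary-based phase mentioned above resembles a gazetteer lookup followed by instance creation, which can be sketched in a few lines. This is a toy stand-in for the GATE pipeline's gazetteer and ontology-population stages, not its actual API; the class names and surface forms are invented for illustration.

```python
def populate_ontology(texts, gazetteer):
    """Hypothetical gazetteer-style ontology population.

    Scan each text for known surface forms and record one instance per
    match under the ontology class the gazetteer maps that form to.
    A real GATE pipeline would add tokenization, sentence splitting and
    NE transducers before this step.
    """
    ontology = {cls: set() for cls in set(gazetteer.values())}
    for text in texts:
        lowered = text.lower()
        for surface, cls in gazetteer.items():
            if surface in lowered:                # naive substring match
                ontology[cls].add(surface)
    return ontology

# Example: two dictionary entries populate two ontology classes.
gazetteer = {"rome": "Location", "acme corp": "Organization"}
print(populate_ontology(["Acme Corp opened an office in Rome"], gazetteer))
```

Replacing the hand-built `gazetteer` dictionary with automatically derived entity lists corresponds to the paper's variant for reducing manual workload in this phase.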
Web 2.0, language resources and standards to automatically build a multilingual named entity lexicon
This paper proposes to advance the current state of the art in automatic Language Resource (LR) building by taking into consideration three elements: (i) the knowledge available in existing LRs, (ii) the vast amount of information available through the collaborative paradigm that has emerged from the Web 2.0 and (iii) the use of standards to improve interoperability. We present a case study in which a set of LRs for different languages (WordNet for English and Spanish and Parole-Simple-Clips for Italian) are extended with Named Entities (NE) by exploiting Wikipedia and the aforementioned LRs. The practical result is a multilingual NE lexicon connected to these LRs and to two ontologies: SUMO and SIMPLE. Furthermore, the paper addresses interoperability, an important problem that currently affects the field of Computational Linguistics, by making use of the ISO LMF standard to encode this lexicon. The different steps of the procedure (mapping, disambiguation, extraction, NE identification and postprocessing) are comprehensively explained and evaluated. The resulting resource contains 974,567, 137,583 and 125,806 NEs for English, Spanish and Italian respectively. Finally, in order to check the usefulness of the constructed resource, we apply it in a state-of-the-art Question Answering system and evaluate its impact; the NE lexicon improves the system's accuracy by 28.1%. Compared to previous approaches to building NE repositories, the current proposal represents a step forward in terms of automation, language independence, number of NEs acquired and richness of the information represented.
Singling out Legal Knowledge from World Knowledge. An NLP-based Approach
1. Introduction - 2. Background and Motivation - 3. The Term Extraction Approach - 4. Experiments and Results - 5. Evaluation - 5.1. General Evaluation Criteria - 5.2. Discussion of Results - 6. Conclusions