Exploring the State of the Art in Legal QA Systems
Question answering (QA) systems are designed to generate answers to questions
asked in natural language. They use natural language processing to understand
questions and search through information sources to find relevant answers. QA
has practical applications in customer service, education, research, and
cross-lingual communication, but such systems still face challenges such as
improving natural language understanding and handling complex and ambiguous
questions. Answering questions in the legal domain is particularly difficult,
primarily due to the intricate nature and diverse range of legal document
systems. Providing an accurate answer to a legal query typically requires
specialized domain knowledge, which makes the task challenging even for human
experts. At this time, there is a lack of surveys that discuss legal question
answering. To address this problem, we provide a comprehensive survey that
reviews 14 benchmark datasets for question answering in the legal field and
examines the state-of-the-art deep learning models for Legal Question
Answering. We cover the different architectures and
techniques used in these studies and the performance and limitations of these
models. Moreover, we have established a public GitHub repository where we
regularly upload the most recent articles, open data, and source code. The
repository is available at:
\url{https://github.com/abdoelsayed2016/Legal-Question-Answering-Review}
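The retrieval step shared by many of the surveyed QA systems can be illustrated with a minimal sketch: candidate passages are scored against the question and the best match is returned. This is a deliberately simplified, hypothetical illustration of the search-then-answer idea; real systems use neural retrievers and reader models rather than term overlap.

```python
# Minimal sketch of lexical passage retrieval for QA (illustrative only;
# the surveyed systems use far stronger neural retrievers and readers).

def retrieve_passage(question: str, passages: list[str]) -> str:
    """Return the passage sharing the most terms with the question."""
    q_terms = set(question.lower().split())

    def overlap(passage: str) -> int:
        # Score = number of question terms that also appear in the passage.
        return len(q_terms & set(passage.lower().split()))

    return max(passages, key=overlap)
```

A reader model would then extract or generate the final answer from the retrieved passage; here the retrieval stage alone shows why legal QA is hard, since statutory language rarely shares surface terms with lay questions.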
Engineering Agile Big-Data Systems
To be effective, data-intensive systems require extensive ongoing customisation to reflect changing user requirements, organisational policies, and the structure and interpretation of the data they hold. Manual customisation is expensive, time-consuming, and error-prone. In large complex systems, the value of the data can be such that exhaustive testing is necessary before any new feature can be added to the existing design. In most cases, the precise details of requirements, policies and data will change during the lifetime of the system, forcing a choice between expensive modification and continued operation with an inefficient design. Engineering Agile Big-Data Systems outlines an approach to dealing with these problems in software and data engineering, describing a methodology for aligning these processes throughout product lifecycles. It discusses tools which can be used to achieve these goals, and, in a number of case studies, shows how the tools and methodology have been used to improve a variety of academic and business systems.
Automatic rule verification for digital building permits
Master's dissertation in Building Information Modelling (BIM A+)
The construction sector is facing major changes in client and market requirements, pushing towards
the digital transformation and a data driven industry. Governments have taken an active part in this
change by supporting the digitalization of processes such as the one for building permits by introducing
the use of building information models (BIM). The research on the digitalization of the building permit
has shown great advancements in regarding the rule extraction in interpretable ways and the automation
of the verification; however, the conciliation between the building model semantic definitions and the
concepts defined in the regulations is still in discussion. Moreover, the validation of the correctness of
the information included in building models regarding the regulation definitions is important to
guarantee the quality along the digital building permit process.
This dissertation aims to propose a hybrid workflow to check the information extracted explicitly from
the BIM model and the information implicitly derived from relationships between elements by following
the provisions contained in the regulations in the context of Portugal. Based on a context and
literature review, a reengineered process was proposed, and Python code was developed using the
IfcOpenShell library to support the automation of the verification process, traditionally carried out by
technicians in building permit offices. The elements developed in this work were demonstrated in a
case study, showing that hybrid validation can help to detect modelling errors and improve the
certainty of the correctness of information during the initial submission of models for a building permit
process.
The results indicate that automated validation of the model against regulation
definitions can improve the degree of certainty about the quality of the information
contained in the Building Information Model; moreover, methods that produce results
from implicit information can extend the capabilities of the IFC schema. However, the scripts developed
in this work are still under active review and development and have limited applicability to
certain IFC classes.
Erasmus Mundus Joint Master Degree Programme – ERASMUS
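The explicit part of the hybrid check described above can be sketched as a small rule function: a property read from the model is validated against a regulation threshold, with a missing property reported as a modelling error. The property name, the 770 mm threshold, and the function itself are illustrative assumptions, not the dissertation's actual code; the commented IfcOpenShell calls show how such properties could be extracted in practice.

```python
# Hypothetical sketch of an explicit rule check in a hybrid BIM validation
# workflow. The rule key and minimum width are illustrative assumptions.

def check_min_width(properties: dict, rule_key: str = "Width",
                    min_width_mm: float = 770.0):
    """Return (passes, message) for a minimum-width regulation rule."""
    value = properties.get(rule_key)
    if value is None:
        # Absent data is itself a finding: the model is incomplete.
        return False, f"missing property '{rule_key}' (modelling error)"
    if value < min_width_mm:
        return False, f"{rule_key}={value} mm is below the {min_width_mm} mm minimum"
    return True, "ok"

# With IfcOpenShell (not run here), door properties could be fed in like:
#   import ifcopenshell
#   model = ifcopenshell.open("model.ifc")
#   for door in model.by_type("IfcDoor"):
#       ok, msg = check_min_width({"Width": door.OverallWidth})
```

The implicit half of the workflow would derive values not stored directly (e.g. clearances from element relationships) and feed them through the same kind of rule function.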
Large Language Models and Knowledge Graphs: Opportunities and Challenges
Large Language Models (LLMs) have taken Knowledge Representation -- and the
world -- by storm. This inflection point marks a shift from explicit knowledge
representation to a renewed focus on the hybrid representation of both explicit
knowledge and parametric knowledge. In this position paper, we will discuss
some of the common debate points within the community on LLMs (parametric
knowledge) and Knowledge Graphs (explicit knowledge) and speculate on
opportunities and visions that the renewed focus brings, as well as related
research topics and challenges.
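The hybrid of explicit and parametric knowledge discussed in this position paper can be caricatured in a few lines: a query is answered from an explicit knowledge graph (triples) when possible, and only falls back to a parametric model otherwise. The triples and the stubbed fallback are illustrative assumptions, not any system from the paper.

```python
# Hypothetical sketch of hybrid explicit/parametric answering.
# The triple store and the fallback stub are illustrative only.

TRIPLES = {
    ("Karlsruhe", "locatedIn"): "Germany",
    ("SEMANTiCS", "heldIn"): "Karlsruhe",
}

def parametric_guess(subject: str, relation: str):
    # Stand-in for an LLM's parametric knowledge; returns no answer here.
    return None

def hybrid_answer(subject: str, relation: str):
    """Prefer explicit KG facts; fall back to parametric knowledge."""
    fact = TRIPLES.get((subject, relation))
    if fact is not None:
        return fact, "explicit"
    return parametric_guess(subject, relation), "parametric"
```

The design choice this sketch highlights is provenance: an answer tagged "explicit" is traceable to a curated fact, while a "parametric" answer is not, which is one of the debate points the paper raises.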
Web observations: analysing Web data through automated data extraction
In this thesis, a generic architecture for Web observations is introduced. Beginning with fundamental data aspects and technologies for building Web observations, requirements and architectural designs are outlined. Because Web observations are basic tools for collecting information from any Web resource, legal perspectives are discussed to give an understanding of recent regulations, e.g. the General Data Protection Regulation (GDPR). The general idea of Web observatories, their concepts, and experiments are presented to identify the best solution for collecting, and based thereon visualising, data from any kind of Web resource. With the help of several Web observation scenarios, data sets were collected, analysed, and eventually published in a machine-readable or visual form for users to interpret. The main research goal was to create a Web observation based on an architecture able to collect information from any given Web resource, in order to make sense of a broad amount of yet untapped information sources. To find this generally applicable architectural structure, several research projects with different designs were conducted; the container-based building-block architecture emerged from these initial designs as the most flexible structure. These considerations resulted in a flexible and easily adaptable architecture that can collect data from all kinds of Web resources. With such broad Web data collections, users can gain a more comprehensive understanding of real-life problems and of the efficiency and profitability of services, as well as valuable information on changes to a Web resource.
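The core loop of a Web observation, taking a snapshot of a resource so later observations can detect change, can be sketched with the standard library alone. This is a minimal, hypothetical illustration of the idea, not the thesis's container-based architecture: it records a page title plus a content hash per observation.

```python
# Minimal sketch of a Web observation snapshot: title + content hash,
# so successive observations of the same resource can detect changes.
from dataclasses import dataclass
from datetime import datetime, timezone
import hashlib
from html.parser import HTMLParser

class _TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""
    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""
    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
    def handle_data(self, data):
        if self._in_title:
            self.title += data

@dataclass
class Observation:
    title: str
    content_hash: str
    observed_at: str

def observe(html_text: str) -> Observation:
    """Build a snapshot of a fetched Web resource."""
    parser = _TitleParser()
    parser.feed(html_text)
    digest = hashlib.sha256(html_text.encode("utf-8")).hexdigest()
    return Observation(parser.title.strip(), digest,
                       datetime.now(timezone.utc).isoformat())
```

In a real deployment the HTML would come from a scheduled fetch, and comparing `content_hash` values across runs flags when the observed resource has changed.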
Semantic Systems. The Power of AI and Knowledge Graphs
This open access book constitutes the refereed proceedings of the 15th International Conference on Semantic Systems, SEMANTiCS 2019, held in Karlsruhe, Germany, in September 2019. The 20 full papers and 8 short papers presented in this volume were carefully reviewed and selected from 88 submissions. They cover topics such as: web semantics and linked (open) data; machine learning and deep learning techniques; semantic information management and knowledge integration; terminology, thesaurus and ontology management; data mining and knowledge discovery; semantics in blockchain and distributed ledger technologies.
JURI SAYS: An Automatic Judgement Prediction System for the European Court of Human Rights
In this paper we present the web platform JURI SAYS, which automatically predicts decisions of the European Court of Human Rights based on communicated cases. These are published by the court early in the proceedings and are often available many years before the final decision is made; our system therefore predicts future judgements of the court. The platform is available at jurisays.com and shows the predictions compared to the actual decisions of the court. It is updated automatically every month with predictions for the new cases. Additionally, the system highlights the sentences and paragraphs that are most important for the prediction (i.e. violation vs. no violation of human rights).
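The highlighting step described above, surfacing the sentences most influential for the prediction, can be sketched under the assumption of a linear classifier with per-word weights: each sentence is scored by the summed magnitude of its words' weights and the top-scoring sentences are shown. The weight values and function below are illustrative assumptions, not the JURI SAYS implementation.

```python
# Hypothetical sketch of sentence highlighting: rank sentences by the
# summed absolute weight their words carry in a linear
# violation / no-violation classifier. Weights here are made up.

def highlight_sentences(sentences: list[str],
                        word_weights: dict[str, float],
                        top_k: int = 2) -> list[str]:
    """Return the top_k sentences most influential for the prediction."""
    def score(sentence: str) -> float:
        return sum(abs(word_weights.get(w.lower(), 0.0))
                   for w in sentence.split())
    return sorted(sentences, key=score, reverse=True)[:top_k]
```

Neural systems typically replace the fixed weights with attention or attribution scores, but the presentation to the user, a ranked highlight of passages, is the same.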
Data integration services for biomedical applications
Doctoral Programme in Informatics (MAP-i)
In recent decades, the field of biomedical science has fostered
unprecedented scientific advances. Research is stimulated by the
constant evolution of information technology, delivering novel and
diverse bioinformatics tools. Nevertheless, the proliferation of new and
disconnected solutions has resulted in massive amounts of resources
spread over heterogeneous and distributed platforms. Distinct
data types and formats are generated and stored in miscellaneous
repositories, posing data interoperability challenges and delays in
discoveries. Data sharing and integrated access to these resources
are key features for successful knowledge extraction.
In this context, this thesis makes contributions towards accelerating
the semantic integration, linkage and reuse of biomedical resources.
The first contribution addresses the connection of distributed and
heterogeneous registries. The proposed methodology creates a
holistic view over the different registries, supporting semantic
data representation, integrated access and querying. The second
contribution addresses the integration of heterogeneous information
across scientific research, aiming to enable adequate data-sharing
services. The third contribution presents a modular architecture to
support the extraction and integration of textual information, enabling
the full exploitation of curated data. The last contribution lies
in providing a platform to accelerate the deployment of enhanced
semantic information systems. All the proposed solutions were
deployed and validated in the scope of rare diseases.
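The "holistic view over the different registries" described in the first contribution can be sketched as a schema-mapping layer: records from heterogeneous registries are mapped onto one common schema so a single query spans all sources. The registry names, field names, and mappings below are illustrative assumptions, not the thesis's actual data model.

```python
# Hypothetical sketch of integrated access over heterogeneous registries.
# Registry names, fields, and mappings are illustrative only.

MAPPINGS = {
    # local field name -> common schema field name
    "registry_a": {"id": "patient_id", "diagnosis": "disease", "nation": "country"},
    "registry_b": {"pid": "patient_id", "condition": "disease", "country": "country"},
}

def to_common_schema(record: dict, source: str) -> dict:
    """Rename a registry-local record's fields to the common schema."""
    mapping = MAPPINGS[source]
    return {common: record[local] for local, common in mapping.items()}

def integrated_query(records_by_source: dict, disease: str) -> list[dict]:
    """Query all registries at once through the common schema."""
    results = []
    for source, records in records_by_source.items():
        for record in records:
            unified = to_common_schema(record, source)
            if unified["disease"] == disease:
                results.append(unified)
    return results
```

A semantic-web realisation of the same idea would express the mappings as an ontology and the query in SPARQL, but the mapping-then-query structure is identical.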