Thematic Annotation: extracting concepts out of documents
Contrary to standard approaches to topic annotation, the technique used in
this work does not centrally rely on some sort of -- possibly statistical --
keyword extraction. In fact, the proposed annotation algorithm uses a large
scale semantic database -- the EDR Electronic Dictionary -- that provides a
concept hierarchy based on hyponym and hypernym relations. This concept
hierarchy is used to generate a synthetic representation of the document by
aggregating the words present in topically homogeneous document segments into a
set of concepts best preserving the document's content.
This new extraction technique uses an unexplored approach to topic selection.
Instead of using semantic similarity measures based on a semantic resource, the
latter is processed to extract the part of the conceptual hierarchy relevant to
the document content. Then this conceptual hierarchy is searched to extract the
most relevant set of concepts to represent the topics discussed in the
document. Notice that this algorithm is able to extract generic concepts that
are not directly present in the document.
Comment: Technical report EPFL/LIA. 81 pages, 16 figures
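The aggregation idea described above can be illustrated with a minimal sketch. It is not the report's algorithm: the hierarchy `HYPERNYM`, the function names, and the `min_cover` threshold are all assumptions made for illustration, and a toy child-to-parent dictionary stands in for the EDR Electronic Dictionary. Each word is replaced by the most specific ancestor concept that covers at least `min_cover` distinct words of the segment.

```python
from collections import Counter

# Hypothetical toy hierarchy (child -> parent); the EDR dictionary is far larger.
HYPERNYM = {
    "dog": "mammal", "cat": "mammal", "mammal": "animal",
    "sparrow": "bird", "bird": "animal", "animal": "entity",
    "car": "vehicle", "truck": "vehicle", "vehicle": "entity",
}

def ancestors(word):
    """Return the chain from a word up to the root, word first."""
    chain = [word]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain

def extract_concepts(words, min_cover=2):
    """Map each word to its most specific ancestor covering >= min_cover words."""
    unique = set(words)
    cover = Counter()
    for w in unique:
        for concept in ancestors(w)[1:]:  # proper ancestors only
            cover[concept] += 1
    chosen = set()
    for w in unique:
        for concept in ancestors(w)[1:]:
            if cover[concept] >= min_cover:
                chosen.add(concept)
                break
        else:
            chosen.add(w)  # no shared ancestor: keep the word itself
    return chosen
```

In this sketch, a segment containing "dog", "cat", "car", and "truck" is summarized by the concepts "mammal" and "vehicle", neither of which occurs in the segment itself, which mirrors the abstract's remark about generic concepts.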
AI-assisted patent prior art searching - feasibility study
This study seeks to understand the feasibility, technical complexities and effectiveness of using artificial intelligence (AI) solutions to improve operational processes of registering IP rights. The Intellectual Property Office commissioned Cardiff University to undertake this research. The research was funded through the BEIS Regulators' Pioneer Fund (RPF). The RPF was set up to help address barriers to innovation in the UK economy.
- The Cases of Japan, South Korea, and China -
Thesis (Master's) -- Seoul National University Graduate School: Graduate School of Public Administration, Department of Public Administration (Policy Studies major), 2022.2. 구민교.
Although the international community faces challenges with the rise of protectionism, the World Trade Organization (WTO) has contributed to the expansion and stabilization of the world economy as the center of the international trade system. Among the key institutional pillars of the WTO, the Dispute Settlement Mechanism (DSM) has attracted major scholarly attention in contemporary research on trade organizations. This study instead focuses on the least studied pillar, the Trade Policy Review Mechanism (TPRM), another key function that safeguards against protectionism. The TPRM imposes peer pressure, a form of social criticism related to 'naming and shaming' rather than coercive sanctions, which can help raise awareness of member states' trade practices and policies and increase responsibility and transparency. Despite its significance, it has received little attention from trade scholars, mainly because of the semantic complexity of the review reports and the vast amount of text.
To overcome these limitations, this study analyzed TPR reports using information extraction (IE) techniques. A total of 18 TPR reports on the three East Asian trading partners (Japan, Korea, and China) were analyzed with the Rapid Automatic Keyword Extraction (RAKE) and TextRank algorithms, and the major trade issues of the three countries were extracted from the results. In the second phase, for an in-depth understanding and rich interpretation of these issues, a qualitative case study was conducted in accordance with the stages of peer pressure formation. The results show in depth the major trade policy patterns of Japan, Korea, and China, the influence of the TPRs, and the form and outcome of the peer pressure that arose in the process, and aim to suggest a new direction for the international trade community.
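As a rough illustration of the RAKE step named above, the following is a minimal sketch and not the thesis's pipeline. The stopword list, function names, and sample sentence are assumptions. RAKE splits text into candidate phrases at stopwords and punctuation, scores each word as degree divided by frequency, and scores a phrase as the sum of its word scores.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption; real RAKE lists are much longer).
STOPWORDS = {"a", "an", "and", "the", "of", "on", "in", "to", "for",
             "is", "are", "by", "with"}

def candidate_phrases(text):
    """Split text into runs of non-stopwords, breaking at punctuation and stopwords."""
    phrases = []
    for fragment in re.split(r"[^\w\s-]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in STOPWORDS:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases

def rake(text):
    """Score each candidate phrase as the sum of deg(w) / freq(w) over its words."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)  # co-occurrences in the phrase, self included
    scores = {p: sum(degree[w] / freq[w] for w in p) for p in set(phrases)}
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

Calling `rake("Tariffs on rice imports and the rice market.")` ranks the two-word phrases containing "rice" above the single word "tariffs", because words that appear in longer phrases accumulate a higher degree.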
Chapter 1. Introduction
1.1 Study Background
1.2 Purpose of Research
Chapter 2. Theoretical Background and Literature Review
2.1 Transparency in Trade Environment
2.2 Trade Policy Review Mechanism
2.3 Relationship Between Three East Asian States
Chapter 3. Research Design
3.1 Conceptual Framework
3.2 Peer pressure mechanism
3.3 Data and Methodology
Chapter 4. Result of Text Mining
4.1 Analysis of Japan's Text Mining Results
4.2 Analysis of South Korea's Text Mining Results
4.3 Analysis of China's Text Mining Results
Chapter 5. Case Study on the Trade Policy: Stages of Peer Pressure Formation
5.1 Japan's Trade Issue: Change in Position towards Regional Economic Integration
5.2 Korea's Trade Issue: Moratorium on Rice Tariffication
5.3 China's Trade Issue: The Government's Market Intervention via SOEs
Chapter 6. Conclusion and Implications
References
Abstract in Korean
Constructing a Knowledge Graph for Vietnamese Legal Cases with Heterogeneous Graphs
This paper presents a knowledge graph construction method for legal case
documents and related laws, aiming to organize legal information efficiently
and enhance various downstream tasks. Our approach consists of three main
steps: data crawling, information extraction, and knowledge graph deployment.
First, the data crawler collects a large corpus of legal case documents and
related laws from various sources, providing a rich database for further
processing. Next, the information extraction step employs natural language
processing techniques to extract entities such as courts, cases, domains, and
laws, as well as their relationships from the unstructured text. Finally, the
knowledge graph is deployed, connecting these entities based on their extracted
relationships, creating a heterogeneous graph that effectively represents legal
information and caters to users such as lawyers, judges, and scholars. The
established baseline model leverages unsupervised learning methods, and by
incorporating the knowledge graph, it demonstrates the ability to identify
relevant laws for a given legal case. This approach opens up opportunities for
various applications in the legal domain, such as legal case analysis, legal
recommendation, and decision support.
Comment: ISAILD@KSE 202
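A heterogeneous graph of the kind described above can be sketched as typed nodes joined by typed edges. This is an illustrative assumption, not the paper's implementation: the class `HeteroGraph`, the relation names `in_domain` and `governed_by`, and the plain graph traversal in `laws_via_domain` are all hypothetical, and the traversal stands in for the paper's unsupervised baseline, whose details the abstract does not give.

```python
from collections import defaultdict

class HeteroGraph:
    """Minimal heterogeneous graph: typed nodes and typed, directed edges."""

    def __init__(self):
        self.node_type = {}            # node id -> node type (e.g. "case", "law")
        self.edges = defaultdict(set)  # (source id, relation) -> set of target ids

    def add_node(self, node_id, node_type):
        self.node_type[node_id] = node_type

    def add_edge(self, src, relation, dst):
        if src not in self.node_type or dst not in self.node_type:
            raise KeyError("add both nodes before linking them")
        self.edges[(src, relation)].add(dst)

    def neighbors(self, src, relation):
        """Targets reachable from src over one edge of the given relation."""
        return set(self.edges.get((src, relation), set()))

def laws_via_domain(graph, case_id):
    """Laws linked to any domain that the given case belongs to."""
    laws = set()
    for domain in graph.neighbors(case_id, "in_domain"):
        laws |= graph.neighbors(domain, "governed_by")
    return laws

# Hypothetical example data.
g = HeteroGraph()
g.add_node("case_1", "case")
g.add_node("court_A", "court")
g.add_node("civil", "domain")
g.add_node("law_X", "law")
g.add_edge("case_1", "heard_by", "court_A")
g.add_edge("case_1", "in_domain", "civil")
g.add_edge("civil", "governed_by", "law_X")
```

With this data, `laws_via_domain(g, "case_1")` follows the case to its domain and then to the laws attached to that domain. Keeping the relation name in the edge key is what lets one graph hold courts, cases, domains, and laws side by side.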
Exploring the State of the Art in Legal QA Systems
Answering questions related to the legal domain is a complex task, primarily
due to the intricate nature and diverse range of legal document systems.
Providing an accurate answer to a legal query typically necessitates
specialized knowledge in the relevant domain, which makes this task all the
more challenging, even for human experts. Question answering (QA) systems are
designed to generate answers to questions asked in human languages. They use
natural language processing to understand questions and search through
information to find relevant answers. QA has various practical applications,
including customer service, education, research, and cross-lingual
communication. However, QA systems face challenges such as improving natural
language understanding and handling complex and ambiguous questions. At this
time, there is a lack of surveys that discuss legal question answering. To
address this problem, we provide a comprehensive survey that reviews 14
benchmark datasets for question answering in the legal field and presents a
comprehensive review of state-of-the-art deep learning models for legal
question answering. We cover the different architectures and
techniques used in these studies and the performance and limitations of these
models. Moreover, we have established a public GitHub repository where we
regularly upload the most recent articles, open data, and source code. The
repository is available at:
https://github.com/abdoelsayed2016/Legal-Question-Answering-Review
Keywords at Work: Investigating Keyword Extraction in Social Media Applications
This dissertation examines a long-standing problem in Natural Language Processing (NLP) -- keyword extraction -- from a new angle. We investigate how keyword extraction can be formulated on social media data, such as emails, product reviews, student discussions, and student statements of purpose. We design novel graph-based features for supervised and unsupervised keyword extraction from emails, and use the resulting system with success to uncover patterns in a new dataset -- student statements of purpose. Furthermore, the system is used with new features on the problem of usage expression extraction from product reviews, where we obtain interesting insights. When used on student discussions, the system uncovers new and exciting patterns.
While each of the above problems is conceptually distinct, they share two key common elements -- keywords and social data. Social data can be messy, hard-to-interpret, and not easily amenable to existing NLP resources. We show that our system is robust enough in the face of such challenges to discover useful and important patterns. We also show that the problem definition of keyword extraction itself can be expanded to accommodate new and challenging research questions and datasets.
PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
https://deepblue.lib.umich.edu/bitstream/2027.42/145929/1/lahiri_1.pd
Summarization of COVID-19 news documents deep learning-based using transformer architecture
Following online news about the spread of coronavirus disease 2019 (COVID-19) is challenging because it takes a long time to extract valuable information from the news. Deep learning has had a significant impact on NLP research. However, the deep learning models used in several studies, especially for document summarization, still have deficiencies. For example, the output for long texts is often incorrect. Other outputs are redundant or repeat characters, so the resulting sentences are poorly organized and the recall obtained is low. This study aims to summarize COVID-19 news documents using a deep learning model. We propose a transformer as the base language model, with architectural modifications, to improve results significantly in document summarization. We build a transformer-based model whose encoder and decoder can be repeated several times, and we compare layer modifications based on their scores. In the experiments, the proposed model performs well on ROUGE-1 and ROUGE-2, with scores of 0.58 and 0.42, respectively, and a training time of 11,438 seconds. The proposed model was effective in improving performance in abstractive document summarization.
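For readers unfamiliar with the metrics reported above, ROUGE-N measures n-gram overlap between a candidate summary and a reference summary. The following is a minimal sketch under simplifying assumptions (whitespace tokenization, lowercasing, one reference, no stemming); the function names are hypothetical, and the abstract does not say whether its scores are recall, precision, or F1, so all three are returned.

```python
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams in a token list as a multiset."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(reference, candidate, n):
    """Return ROUGE-N recall, precision, and F1 for one reference/candidate pair."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    overlap = sum((ref & cand).values())  # clipped n-gram matches
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}
```

For the reference "the cat sat on the mat" and the candidate "the cat lay on the mat", five of six unigrams and three of five bigrams match, so ROUGE-2 is lower than ROUGE-1, the same ordering as the 0.58 and 0.42 reported in the abstract.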