1,522 research outputs found
A Machine Learning Approach For Opinion Holder Extraction In Arabic Language
Opinion mining aims at extracting useful subjective information from reliable
amounts of text. Opinion mining holder recognition is a task that has not been
considered yet in Arabic Language. This task essentially requires deep
understanding of clauses structures. Unfortunately, the lack of a robust,
publicly available, Arabic parser further complicates the research. This paper
presents a leading research for the opinion holder extraction in Arabic news
independent from any lexical parsers. We investigate constructing a
comprehensive feature set to compensate the lack of parsing structural
outcomes. The proposed feature set is tuned from English previous works coupled
with our proposed semantic field and named entities features. Our feature
analysis is based on Conditional Random Fields (CRF) and semi-supervised
pattern recognition techniques. Different research models are evaluated via
cross-validation experiments achieving 54.03 F-measure. We publicly release our
own research outcome corpus and lexicon for opinion mining community to
encourage further research
Strategies Used in Translation of Scientific Texts to Cope with Lexical Gaps (Case of Biomass Gasification and Pyrolysis Book)
Lexical gap in translation is deeply debated during the history of translation studies. Many theories have been put forward to explain the possible strategies of filling lexical gaps. This descriptive study aims to investigate some of these strategies used in the translation of a special technical book, Biomass Gasification & Pyrolysis (Practical Design and Theory), this book was a suitable option, since it has modern topics, some of which have not been discussed or even existed in the target language (Persian). In this study, seventy new terms which have not been employed in Persian (in that field) were selected and examined, the qualitative and quantitative analysis of the words indicated that loan word, loan translation, loanblend were the most prominent strategies to cope with new lexicons; in addition, it also showed that loan translation had the highest rate of usage (68.5%) among other techniques and in scientific contexts it is widely preferred
Undergraduates’ interest towards learning genetics concepts through integrated stemproblem based learning approach
Scientific and innovative society can be produced by giving priorities in Science, Technology, Engineering, and Mathematics (STEM) as emphasized by Malaysian Higher Education Blueprint (2015-2025). STEM need to be implemented at higher education because universities need to produce competent graduates to support economy growth and sustainable development. Learning STEM through Problem Based Learning might allow the undergraduates to become more enthusiastic when problem-based instruction is incorporated with STEM by implementing teamwork and problem-solving techniques to engage the first-year undergraduates fully with the learning. This study was conducted to investigate whether Integrated STEM Problem Based Learning module could enhance and retain the interest towards genetics concepts among first-year undergraduates. Topics in genetics was considered difficult not only to teach but also to learn. In this research, to overcome the genetic concepts learning difficulties, genetic related topics were chosen to introduce STEM through problem-based learning approach, which might help first-year undergraduates to acquire deep genetic content knowledge. This is very vital for the first-year undergraduates, as the knowledge gained in their first semester will be applied in the upcoming courses in their entire undergraduates’ programs of study. A Pre-Experimental research design with one group-posttest design was applied. A total of 50 participants who are first-year undergraduates from Faculty of Biology from one of the public universities in Malaysia were involved. The Genetics Interest Questionnaire used to study if the STEM Problem Based Learning module could enhance and retain the interest towards genetics concepts. The research has proven that Integrated STEM through problem-based learning approach could enhance and retains the interest in learning genetics concepts among first-year undergraduates
A review of sentiment analysis research in Arabic language
Sentiment analysis is a task of natural language processing which has
recently attracted increasing attention. However, sentiment analysis research
has mainly been carried out for the English language. Although Arabic is
ramping up as one of the most used languages on the Internet, only a few
studies have focused on Arabic sentiment analysis so far. In this paper, we
carry out an in-depth qualitative study of the most important research works in
this context by presenting limits and strengths of existing approaches. In
particular, we survey both approaches that leverage machine translation or
transfer learning to adapt English resources to Arabic and approaches that stem
directly from the Arabic language
A Survey on Semantic Processing Techniques
Semantic processing is a fundamental research domain in computational
linguistics. In the era of powerful pre-trained language models and large
language models, the advancement of research in this domain appears to be
decelerating. However, the study of semantics is multi-dimensional in
linguistics. The research depth and breadth of computational semantic
processing can be largely improved with new technologies. In this survey, we
analyzed five semantic processing tasks, e.g., word sense disambiguation,
anaphora resolution, named entity recognition, concept extraction, and
subjectivity detection. We study relevant theoretical research in these fields,
advanced methods, and downstream applications. We connect the surveyed tasks
with downstream applications because this may inspire future scholars to fuse
these low-level semantic processing tasks with high-level natural language
processing tasks. The review of theoretical research may also inspire new tasks
and technologies in the semantic processing domain. Finally, we compare the
different semantic processing techniques and summarize their technical trends,
application trends, and future directions.Comment: Published at Information Fusion, Volume 101, 2024, 101988, ISSN
1566-2535. The equal contribution mark is missed in the published version due
to the publication policies. Please contact Prof. Erik Cambria for detail
One Model to Rule them all: Multitask and Multilingual Modelling for Lexical Analysis
When learning a new skill, you take advantage of your preexisting skills and
knowledge. For instance, if you are a skilled violinist, you will likely have
an easier time learning to play cello. Similarly, when learning a new language
you take advantage of the languages you already speak. For instance, if your
native language is Norwegian and you decide to learn Dutch, the lexical overlap
between these two languages will likely benefit your rate of language
acquisition. This thesis deals with the intersection of learning multiple tasks
and learning multiple languages in the context of Natural Language Processing
(NLP), which can be defined as the study of computational processing of human
language. Although these two types of learning may seem different on the
surface, we will see that they share many similarities.
The traditional approach in NLP is to consider a single task for a single
language at a time. However, recent advances allow for broadening this
approach, by considering data for multiple tasks and languages simultaneously.
This is an important approach to explore further as the key to improving the
reliability of NLP, especially for low-resource languages, is to take advantage
of all relevant data whenever possible. In doing so, the hope is that in the
long term, low-resource languages can benefit from the advances made in NLP
which are currently to a large extent reserved for high-resource languages.
This, in turn, may then have positive consequences for, e.g., language
preservation, as speakers of minority languages will have a lower degree of
pressure to using high-resource languages. In the short term, answering the
specific research questions posed should be of use to NLP researchers working
towards the same goal.Comment: PhD thesis, University of Groninge
Models to represent linguistic linked data
As the interest of the Semantic Web and computational linguistics communities in linguistic linked data (LLD) keeps increasing and the number of contributions that dwell on LLD rapidly grows, scholars (and linguists in particular) interested in the development of LLD resources sometimes find it difficult to determine which mechanism is suitable for their needs and which challenges have already been addressed. This review seeks to present the state of the art on the models, ontologies and their extensions to represent language resources as LLD by focusing on the nature of the linguistic content they aim to encode. Four basic groups of models are distinguished in this work: models to represent the main elements of lexical resources (group 1), vocabularies developed as extensions to models in group 1 and ontologies that provide more granularity on specific levels of linguistic analysis (group 2), catalogues of linguistic data categories (group 3) and other models such as corpora models or service-oriented ones (group 4). Contributions encompassed in these four groups are described, highlighting their reuse by the community and the modelling challenges that are still to be faced
Design of a Controlled Language for Critical Infrastructures Protection
We describe a project for the construction of controlled language for critical infrastructures protection (CIP). This project originates
from the need to coordinate and categorize the communications on CIP at the European level. These communications can be physically
represented by official documents, reports on incidents, informal communications and plain e-mail. We explore the application of
traditional library science tools for the construction of controlled languages in order to achieve our goal. Our starting point is an
analogous work done during the sixties in the field of nuclear science known as the Euratom Thesaurus.JRC.G.6-Security technology assessmen
- …