Machine Translation using Semantic Web Technologies: A Survey
A large number of machine translation approaches have recently been developed
to facilitate the fluid migration of content across languages. However, the
literature suggests that many obstacles must still be dealt with to achieve
better automatic translations. One of these obstacles is lexical and syntactic
ambiguity. A promising way of overcoming this problem is using Semantic Web
technologies. This article presents the results of a systematic review of
machine translation approaches that rely on Semantic Web technologies for
translating texts. Overall, our survey suggests that while Semantic Web
technologies can enhance the quality of machine translation outputs for various
problems, the combination of the two is still in its infancy.
Comment: 23 pages, 2 figures, 4 tables
Cross-language Citation Recommendation via Hierarchical Representation Learning on Heterogeneous Graph
While the volume of scholarly publications has increased at a frenetic pace,
accessing and consuming useful candidate papers in very large digital
libraries has become an essential and challenging task for scholars.
Unfortunately, because of the language barrier, some scientists (especially
junior ones or graduate students who do not master other languages) cannot
efficiently locate publications hosted in a foreign-language repository. In
this study, we propose a novel solution, cross-language citation recommendation
via Hierarchical Representation Learning on Heterogeneous Graph (HRLHG), to
address this new problem. HRLHG learns a representation function that maps
publications from multilingual repositories to a low-dimensional joint
embedding space built from various kinds of vertices and relations in a
heterogeneous graph. By leveraging both global (task-specific) and local
(task-independent) information, as well as a novel supervised hierarchical
random walk algorithm,
the proposed method can optimize the publication representations by maximizing
the likelihood of locating the important cross-language neighborhoods on the
graph. Experimental results show that the proposed method not only outperforms
state-of-the-art baseline models but also improves the interpretability of the
representation model for the cross-language citation recommendation task.
Comment: The 41st International ACM SIGIR Conference on Research & Development
in Information Retrieval (SIGIR 2018), 635--64
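The supervised hierarchical random walk the abstract mentions can be illustrated with a minimal sketch: a random walk on a heterogeneous graph whose next step is biased by learned per-edge-type weights. The toy graph, node names, and weight values below are entirely hypothetical, not taken from the paper.

```python
import random

# Toy heterogeneous graph: nodes are papers/authors, edges carry a type.
graph = {
    "paper_en": [("cites", "paper_zh"), ("written_by", "author_a")],
    "paper_zh": [("cites", "paper_en"), ("written_by", "author_b")],
    "author_a": [("writes", "paper_en")],
    "author_b": [("writes", "paper_zh")],
}

# Hypothetical learned weights favouring cross-language citation edges.
edge_weights = {"cites": 0.8, "written_by": 0.1, "writes": 0.1}

def weighted_walk(start, length, rng=random.Random(0)):
    """Random walk whose transition choice is biased by edge-type weights."""
    walk = [start]
    node = start
    for _ in range(length):
        edges = graph[node]
        weights = [edge_weights[etype] for etype, _ in edges]
        _, node = rng.choices(edges, weights=weights, k=1)[0]
        walk.append(node)
    return walk
```

Walks generated this way would then feed a skip-gram-style objective to produce node embeddings; the supervision in the paper adjusts the edge-type weights toward walks that reach cross-language neighbors.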
AppTechMiner: Mining Applications and Techniques from Scientific Articles
This paper presents AppTechMiner, a rule-based information extraction
framework that automatically constructs a knowledge base of all application
areas and problem solving techniques. Techniques include tools, methods,
datasets or evaluation metrics. We also categorize individual research articles
based on their application areas and the techniques proposed/improved in the
article. Our system achieves high average precision (~82%) and recall (~84%) in
knowledge base creation. It also performs well in application and technique
assignment to an individual article (average accuracy ~66%). Finally, we
present two use cases: a trivial information retrieval system and an extensive
temporal analysis of the usage of techniques and application areas. At present,
we demonstrate the framework for the domain of computational linguistics, but
it can easily be generalized to any other field of research.
Comment: JCDL 2017, 6th International Workshop on Mining Scientific
Publications. arXiv admin note: substantial text overlap with
arXiv:1608.0638
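To make the idea of rule-based extraction concrete, here is a minimal sketch of pattern-based technique/application extraction from abstract sentences. The regex patterns are illustrative assumptions, not AppTechMiner's actual rules.

```python
import re

# Hypothetical surface patterns: a technique tends to follow cue words like
# "using"/"based on", and an application tends to follow "for".
TECHNIQUE_PAT = re.compile(
    r"(?:using|based on|with)\s+([A-Za-z -]+?)(?:\s+(?:for|to)\b|[.,])")
APPLICATION_PAT = re.compile(r"(?:for|applied to)\s+([a-z -]+?)[.,]")

def extract(sentence):
    """Return candidate technique and application phrases from one sentence."""
    return {
        "techniques": TECHNIQUE_PAT.findall(sentence),
        "applications": APPLICATION_PAT.findall(sentence),
    }
```

For example, `extract("We propose a model using conditional random fields for named entity recognition.")` yields "conditional random fields" as a technique and "named entity recognition" as an application.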
Cross-lingual Entity Alignment via Joint Attribute-Preserving Embedding
Entity alignment is the task of finding entities in two knowledge bases (KBs)
that represent the same real-world object. When facing KBs in different natural
languages, conventional cross-lingual entity alignment methods rely on machine
translation to eliminate the language barriers. These approaches often suffer
from the uneven quality of translations between languages. While recent
embedding-based techniques encode entities and relationships in KBs and do not
need machine translation for cross-lingual entity alignment, a significant
number of attributes remain largely unexplored. In this paper, we propose a
joint attribute-preserving embedding model for cross-lingual entity alignment.
It jointly embeds the structures of two KBs into a unified vector space and
further refines it by leveraging attribute correlations in the KBs. Our
experimental results on real-world datasets show that this approach
significantly outperforms the state-of-the-art embedding approaches for
cross-lingual entity alignment and could be complemented with methods based on
machine translation.
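Embedding-based alignment methods in this family typically score triples with a translation-style objective, where head + relation should land near the tail. The sketch below shows a TransE-like distance score with toy vectors, as an assumption about the general approach rather than the paper's exact model.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4
h = rng.normal(size=dim)   # head entity embedding (e.g. from KB 1)
r = rng.normal(size=dim)   # relation embedding
t = h + r                  # a tail entity that fits the triple perfectly

def score(h, r, t):
    """Lower score = more plausible triple: L2 distance of h + r from t."""
    return float(np.linalg.norm(h + r - t))
```

Embedding both KBs into one such space lets cross-lingual counterparts be matched by nearest-neighbor search; the paper's contribution is to refine that space with attribute correlations.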
Towards an Arabic-English Machine-Translation Based on Semantic Web
Communication tools make the world like a small village, and as a consequence
people can contact others who come from different societies or speak different
languages. This communication cannot happen effectively without Machine
Translation, because such tools can be used anytime and anywhere. A number of
studies have developed Machine Translation between English and many other
languages, but Arabic has not yet been adequately considered. We therefore aim
to lay out a roadmap for our proposed translation system, which provides
enhanced Arabic-English translation based on the Semantic Web.
Comment: 6 pages, 4 figures, Conference paper
Logician: A Unified End-to-End Neural Approach for Open-Domain Information Extraction
In this paper, we consider the problem of open information extraction (OIE)
for extracting entity and relation level intermediate structures from sentences
in open-domain. We focus on four types of valuable intermediate structures
(Relation, Attribute, Description, and Concept), and propose a unified
knowledge expression form, SAOKE, to express them. We publicly release a data
set which contains more than forty thousand sentences and the corresponding
facts in the SAOKE format labeled by crowd-sourcing. To our knowledge, this is
the largest publicly available human labeled data set for open information
extraction tasks. Using this labeled SAOKE data set, we train an end-to-end
neural model using the sequence-to-sequence paradigm, called Logician, to
transform sentences into facts. Unlike existing algorithms, which generally
extract each fact in isolation without considering other possible facts,
Logician performs a global optimization over all possible involved facts in
each sentence, in which facts not only compete with each other to attract the
attention of words but also cooperate to share words. An experimental study on
various types of open-domain relation extraction tasks shows that Logician
consistently outperforms other state-of-the-art algorithms. The experiments
verify the soundness of the SAOKE format, the value of the SAOKE data set, the
effectiveness of the proposed Logician model, and the feasibility of applying
the end-to-end learning paradigm on supervised data sets to the challenging
task of open information extraction.
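A minimal sketch of how SAOKE-style facts might be represented and serialized into a single target string for a sequence-to-sequence decoder. The field names and serialization format here are our assumptions, not the released dataset's exact schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fact:
    """One extracted fact: subject, predicate, and one or more objects."""
    subject: str
    predicate: str
    objects: tuple

def facts_to_targets(facts):
    """Serialize all facts of a sentence into one seq2seq target string,
    so the decoder is trained on the full fact set jointly rather than
    on each fact in isolation."""
    return " ; ".join(
        f"({f.subject}, {f.predicate}, {', '.join(f.objects)})" for f in facts
    )
```

Serializing all facts into one target is what allows a seq2seq model to optimize over the whole fact set of a sentence at once, in the spirit of Logician's global optimization.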
Sentiment/Subjectivity Analysis Survey for Languages other than English
Subjective and sentiment analysis have gained considerable attention
recently. Most of the resources and systems built so far target English.
The need for designing systems for other languages is increasing. This paper
surveys different ways used for building systems for subjective and sentiment
analysis for languages other than English. Such systems fall into three
categories. The first (and best-performing) category is language-specific
systems. The second involves reusing or transferring sentiment resources from
English to the target language. The third is based on language-independent
methods. The paper also devotes a separate section to Arabic sentiment
analysis.
Comment: This is an accepted version in Social Network Analysis and Mining
journal. The final publication will be available at Springer via
http://dx.doi.org/10.1007/s13278-016-0381-
Decoding-History-Based Adaptive Control of Attention for Neural Machine Translation
Attention-based sequence-to-sequence models have proved successful in Neural
Machine Translation (NMT). However, attention that ignores the decoding
history, i.e., the past information in the decoder and the attention
mechanism, often causes much repetition. To address this problem, we
propose the decoding-history-based Adaptive Control of Attention (ACA) for the
NMT model. ACA learns to control the attention by keeping track of the decoding
history and the current information with a memory vector, so that the model can
take the translated contents and the current information into consideration.
Experiments on Chinese-English and English-Vietnamese translation
have demonstrated that our model significantly outperforms the
strong baselines. The analysis shows that our model is capable of generating
translation with less repetition and higher accuracy. The code will be
available at https://github.com/lancopk
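The repetition-suppressing effect of tracking decoding history can be illustrated with a toy sketch: positions already attended to accumulate in a memory vector and are penalized on later steps. The additive penalty below is our simplification for illustration, not the paper's exact ACA formulation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def controlled_attention(scores, memory, penalty=1.0):
    """Attention weights with a penalty on heavily revisited positions."""
    return softmax(scores - penalty * memory)

scores = np.array([2.0, 1.0, 0.5])  # toy alignment scores for 3 positions
memory = np.zeros(3)

w1 = controlled_attention(scores, memory)  # first decoding step
memory += w1                               # record decoding history
w2 = controlled_attention(scores, memory)  # next step: position 0 is damped
```

After one step, the most-attended position (index 0) has accumulated the most history, so its weight drops on the next step, which is the mechanism that discourages repeated translation of the same source words.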
Filling Knowledge Gaps in a Broad-Coverage Machine Translation System
Knowledge-based machine translation (KBMT) techniques yield high quality in
domains with detailed semantic models, limited vocabulary, and controlled input
grammar. Scaling up along these dimensions means acquiring large knowledge
resources. It also means behaving reasonably when definitive knowledge is not
yet available. This paper describes how we can fill various KBMT knowledge
gaps, often using robust statistical techniques. We describe quantitative and
qualitative results from JAPANGLOSS, a broad-coverage Japanese-English MT
system.
Comment: 7 pages, Compressed and uuencoded postscript. To appear: IJCAI-9
Human Translation Vs Machine Translation: the Practitioner Phenomenology
The paper explores the current phenomenon of human translation versus machine translation. Human translation (HT), by definition, is translation performed by a human translator rather than a machine. It is the oldest form of translation, relying on pure human intelligence to convert one way of saying things into another. Translation is necessary for the spread of information, knowledge, and ideas, and is essential for effective and empathetic communication between different cultures; it is therefore critical for social harmony and peace. Only a human translator can convey subtle nuances, because a machine translator performs direct word-for-word translation; machines are not yet advanced enough to render these nuances accurately. There are different translation techniques, diverse theories about translation, and eight different types of translation services, including technical translation, judicial translation, and certified translation.