Recommended from our members
Cross-Lingual and Low-Resource Sentiment Analysis
Identifying sentiment in a low-resource language is essential for understanding opinions internationally and for responding to the urgent needs of locals affected by disaster incidents in different world regions. While tools and resources for recognizing sentiment in high-resource languages are plentiful, determining the most effective methods for achieving this task in a low-resource language which lacks annotated data is still an open research question. Most existing approaches for cross-lingual sentiment analysis to date have relied on high-resource machine translation systems, large amounts of parallel data, or resources only available for Indo-European languages.
This work presents methods, resources, and strategies for identifying sentiment cross-lingually in a low-resource language. We introduce a cross-lingual sentiment model which can be trained on a high-resource language and applied directly to a low-resource language. The model offers the feature of lexicalizing the training data using a bilingual dictionary, but can perform well without any translation into the target language.
Through an extensive experimental analysis, evaluated on 17 target languages, we show that the model performs well with bilingual word vectors pre-trained on an appropriate translation corpus. We compare in-genre and in-domain parallel corpora, out-of-domain parallel corpora, in-domain comparable corpora, and monolingual corpora, and show that a relatively small, in-domain parallel corpus works best as a transfer medium if it is available. We describe the conditions under which other resources and embedding generation methods are successful, and these include our strategies for leveraging in-domain comparable corpora for cross-lingual sentiment analysis.
To enhance the ability of the cross-lingual model to identify sentiment in the target language, we present new feature representations for sentiment analysis that are incorporated in the cross-lingual model: bilingual sentiment embeddings that are used to create bilingual sentiment scores, and a method for updating the sentiment embeddings during training by lexicalization of the target language. This feature configuration works best for the largest number of target languages in both untargeted and targeted cross-lingual sentiment experiments.
The cross-lingual model is studied further by evaluating the role of the source language, which has traditionally been assumed to be English. We build cross-lingual models using 15 source languages, including two non-European and non-Indo-European source languages: Arabic and Chinese. We show that language families play an important role in the performance of the model, as does the morphological complexity of the source language.
In the last part of the work, we focus on sentiment analysis towards targets. We study Arabic as a representative morphologically complex language and develop models and morphological representation features for identifying entity targets and sentiment expressed towards them in Arabic open-domain text. Finally, we adapt our cross-lingual sentiment models for the detection of sentiment towards targets. Through cross-lingual experiments on Arabic and English, we demonstrate that our findings regarding resources, features, and language also hold true for the transfer of targeted sentiment.
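The direct-transfer idea in this abstract (train on a high-resource source language, apply unchanged to the target language through a shared bilingual vector space) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the two-dimensional vectors are hand-made toy values, Spanish merely stands in for a target language, and the nearest-centroid classifier is a simple substitute rather than the model the thesis describes.

```python
# Toy bilingual embedding space: hypothetical, hand-made 2-d vectors in which
# translation pairs lie close together. Real bilingual vectors would be
# pre-trained on a parallel or comparable corpus.
BILINGUAL_VECTORS = {
    "good": (1.0, 0.1), "great": (0.9, 0.2),    # source language (English)
    "bad": (-1.0, 0.1), "awful": (-0.9, 0.0),
    "bueno": (0.95, 0.15), "malo": (-0.95, 0.05),  # stand-in target language
}


def embed(tokens):
    """Average the vectors of known tokens; unknown tokens are ignored."""
    known = [BILINGUAL_VECTORS[t] for t in tokens if t in BILINGUAL_VECTORS]
    if not known:
        return (0.0, 0.0)
    return tuple(sum(dim) / len(known) for dim in zip(*known))


def train_centroids(examples):
    """Compute one centroid per label from (tokens, label) source-language pairs."""
    grouped = {}
    for tokens, label in examples:
        grouped.setdefault(label, []).append(embed(tokens))
    return {
        label: tuple(sum(dim) / len(vecs) for dim in zip(*vecs))
        for label, vecs in grouped.items()
    }


def predict(tokens, centroids):
    """Return the label whose centroid is nearest (squared Euclidean distance)."""
    vec = embed(tokens)
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(vec, centroids[label])),
    )


# Train on source-language examples only, then classify target-language text.
centroids = train_centroids([
    (["good", "great"], "pos"),
    (["bad", "awful"], "neg"),
])
target_prediction = predict(["bueno"], centroids)
```

No target-language labels are used at any point; the transfer relies entirely on the shared vector space, which is why the choice of corpus for pre-training the bilingual vectors matters so much in the experiments summarized above.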
Automatic annotation of error types for grammatical error correction
Grammatical Error Correction (GEC) is the task of automatically detecting and correcting grammatical errors in text. Although previous work has focused on developing systems that target specific error types, the current state of the art uses machine translation to correct all error types simultaneously. A significant disadvantage of this approach is that machine translation does not produce annotated output and so error type information is lost. This means we can only evaluate a system in terms of overall performance and cannot carry out a more detailed analysis of different aspects of system performance.
In this thesis, I develop a system to automatically annotate parallel original and corrected sentence pairs with explicit edits and error types. In particular, I first extend the Damerau-Levenshtein alignment algorithm to make use of linguistic information when aligning parallel sentences, and supplement this alignment with a set of merging rules to handle multi-token edits. The output from this algorithm surpasses other edit extraction approaches in terms of approximating human edit annotations and is the current state of the art. Having extracted the edits, I next classify them according to a new rule-based error type framework that depends only on automatically obtained linguistic properties of the data, such as part-of-speech tags. This framework was inspired by existing frameworks, and human judges rated the appropriateness of the predicted error types as ‘Good’ (85%) or ‘Acceptable’ (10%) in a random sample of 200 edits. The whole system is called the ERRor ANnotation Toolkit (ERRANT) and is the first toolkit capable of automatically annotating parallel sentences with error types.
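For readers unfamiliar with the alignment algorithm named above, the following is a minimal sketch of the standard restricted Damerau-Levenshtein distance (the "optimal string alignment" variant) applied to token sequences. It uses uniform edit costs and exact token matching; it is not the linguistically informed extension developed in the thesis, and the function name `osa_distance` is chosen here for illustration only.

```python
def osa_distance(source, target):
    """Restricted Damerau-Levenshtein distance between two token lists.

    Allowed edits, each with cost 1: insertion, deletion, substitution,
    and transposition of two adjacent tokens.
    """
    rows, cols = len(source) + 1, len(target) + 1
    dist = [[0] * cols for _ in range(rows)]

    # Transforming a prefix to or from the empty sequence costs its length.
    for i in range(rows):
        dist[i][0] = i
    for j in range(cols):
        dist[0][j] = j

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # match or substitution
            )
            # Adjacent transposition, e.g. "apple an" -> "an apple".
            if (
                i > 1 and j > 1
                and source[i - 1] == target[j - 2]
                and source[i - 2] == target[j - 1]
            ):
                dist[i][j] = min(dist[i][j], dist[i - 2][j - 2] + 1)

    return dist[-1][-1]


original = "I have apple an".split()
corrected = "I have an apple".split()
word_order_cost = osa_distance(original, corrected)
```

Counting a swap of adjacent tokens as one edit, rather than two substitutions, is what distinguishes this from plain Levenshtein distance and makes it a natural fit for word-order errors in sentence pairs.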
I demonstrate the value of ERRANT by applying it to the system output produced by the participants of the CoNLL-2014 shared task, and carry out a detailed error type analysis of system performance for the first time. I also develop a simple language-model-based approach to GEC that does not require annotated training data, and show how it can be improved using ERRANT error types.
The Making of a Muslim Reformer: Muḥammad al-Ghazālī (1917-1996) and Islam in Postcolonial Egypt, 1947-1967
This is an intellectual biography of the classically trained Egyptian Muslim scholar, Muḥammad al-Ghazālī (1917-1996). A one-time leading intellectual of Egypt’s influential Islamic organization, the Muslim Brotherhood, Ghazālī was a popular author with a vast public following. Although his ideas have shaped the trajectories of various Islamic groups that emerged in Egypt during the 1970s “Islamic Revival,” he remains understudied. Through an analysis of his writings, this study presents a novel account of modern Islamic political thought, arguing that its sources extend well beyond what the secondary literature, as well as Muslims today, portray as the mainstays of the Islamic tradition—that is, the Qur’ān, the Sunna (Prophetic traditions), and fiqh (Islamic jurisprudence). In contrast, it places Sufism and Islamic philosophy, or more specifically Islamic philosophical ethics, at the heart of Ghazālī’s modern-day political critiques. Additionally, it moves beyond the scholarly narrative that depicts contemporary Islamic political thought as simply Islamic reformulations of concepts and categories derived from modern Western social thought. By examining Ghazālī’s considerable interest in Euro-American self-help, spiritualism, and psychical research, it shows how his engagement with these new forms of religion was mediated by Islamic theological concepts, which he deployed to not only make sense of his interlocutors’ claims, but also correct and build upon their work. In highlighting the corrective and productive impulse behind his engagement with Euro-American thought, it demonstrates that Ghazālī was not merely an assimilator of Western ideas, but rather a contributor to a global project of rethinking the human potential.
Sacred Orientation: The Qibla As Ritual, Metaphor, And Identity Marker In Early Islam
Scholars of early Islam often take for granted the title of this study—that facing the qibla (i.e. the geographic direction of worship) is an important Islamic ritual and that Muḥammad’s turn toward the Kaʿba after facing Jerusalem for prayer marked the identity of his nascent community. This postulate is rarely questioned, but the mechanisms by which the qibla expressed and inscribed a collective Islamic identity remain largely unexplored. Rather, study of Islam’s sacred direction tends to focus on either historical reconstruction of Islamic origins or on the science of qibla-calculation. The former seeks to question or establish the location of the original qibla, while the latter examines the mathematics, astronomy, and cartography used to ascertain the direction of prayer with growing precision from around the Muslim oikumene. This dissertation probes, instead, the discursive and ritual processes through which qibla-rhetoric and qibla-practice fostered a sense of group belonging and marked boundaries between Islam and other religious communities (mainly Christians and Jews). Through four interlocking projects—spanning Islam’s emergence in Late Antiquity through the Early Middle Ages—this study explicates the subtle ways in which the qibla served as a potent and durable symbol in the construction of Islamic collective identity.
Chapter 1 considers the Qurʾān’s presentation of the qibla (Q Baqara 2:142-150) as part of the late antique discourse around liturgical orientation and group identity in the Near East. Chapter 2 explores the semantic usage of the term “People of the Qibla” (ahl al-qibla) to express a kind of “big-tent” view of Islamic community, and traces its earliest recorded usage to Iraq in the late Umayyad period. Chapter 3 studies scholarly (and often polemical) discussions of abrogation (naskh) among Muslims, Christians, and Jews in the tenth century, where a change in the qibla became a metaphor for divine election of one people over others. The final chapter takes up the interpretive challenge of supposedly misaligned mosques and what they may tell us about the formative period of Islam. This study concludes by reflecting on the challenges of examining collective identity in premodern societies, and we propose three lenses for doing so that can benefit scholars of early Islam: namely, that we study identity as imagined, identity as a process, and identity as inexhaustible.
India in the Persian World of Letters
This study traces the development of philology (the analysis of literary language) in the Persian tradition in India, concentrating on its socio-political ramifications. The most influential Indo-Persian philologist of the eighteenth century was Sirāj al-Dīn ʿAlī Ḳhān (d. 1756), whose pen-name was Ārzū. Besides being a respected poet, Ārzū was a rigorous theoretician of language whose intellectual legacy was side-lined by colonialism. His conception of language accounted for literary innovation and historical change in part to theorize the tāzah-goʾī [literally, “fresh-speaking”] movement in Persian literary culture. Although later scholarship has tended to frame this debate in anachronistically nationalist terms (Iranian native speakers versus Indian imitators), the primary sources show that contemporary concerns had less to do with geography than with the question of how to assess innovative “fresh-speaking” poetry, a situation analogous to the Quarrel of the Ancients and the Moderns in early modern Europe. Ārzū used historical reasoning to argue that as a cosmopolitan language Persian could not be the property of one nation or be subject to one narrow kind of interpretation. Ārzū also shaped attitudes about reḳhtah, the Persianized form of vernacular poetry that would later be renamed and reconceptualized as Urdu, helping the vernacular to gain acceptance in elite literary circles in northern India. This study puts to rest the persistent misconception that Indians started writing the vernacular because they were ashamed of their poor grasp of Persian at the twilight of the Mughal Empire.
From madness to eternity: Psychiatry and Sufi healing in the postmodern world
Problem: Academic study of religious healing has recognised its symbolic aspects, but has tended to frame practice as ritual, knowledge as belief. In contrast, studies of scientific psychiatry recognise that discipline as grounded in intellectual tradition and naturalistic empiricism. This asymmetry can be addressed if: (a) psychiatry is recognised as a form of “religious healing”; (b) religious healing can be shown to have an intellectual tradition which, although not naturalistic, is grounded in experience. Such an analysis may help to reveal why globalisation has meant the worldwide spread not only of modern scientific medicine, but of religious healing. An especially useful form of religious healing to contrast with scientific medicine is Sufi healing, as practised by the Naqshbandi-Haqqani order, which has become remarkable for its spread in the “West” and its adaptation to vernacular cultures. / Research questions: (1) How is knowledge generated and transmitted in the Naqshbandi-Haqqani order? (2) How is healing understood and done in the Order? (3) How does the Order find a role in the modern world, and in the West in particular? / Methods: Anthropological analysis of psychiatry as religious healing; review of previous studies of Sufi healing and the Naqshbandi-Haqqani order; ethnographic participant observation in the Naqshbandi-Haqqani order, with a special focus on healing. Ethnography was done at many sites, over a period of 11 years. / Findings: (1) Knowledge is generated by means of the individual’s contact with Shaykh Nazim, who, in turn, is said to be in contact with the Prophet. Knowledge is therefore personalised, situational, and ever-changing. Purification of the nafs (psyche, soul) is held to increase the capacity for knowledge. (2) Discourse in the Naqshbandi-Haqqani order centres around healing of the soul, which is held to be a salvific and intellectual exercise. Activities and intellectual disciplines are subsumed into soul-healing. Healing techniques are eclectic and universally applied, ultimately under the perceived direction of Shaykh Nazim. (3) The Order attracts followers through charisma and personal contacts; adapts to local vernaculars; creates alternative social networks; makes everyday activities part of soul-healing; provides low-cost personalised healing; and reflects postmodern concerns and ecumenism. / Implications: Healing that reflects pre-modern, religious models of the intellect, and a medical science that is not merely naturalistic, has encompassed scientific narratives and gained adherents in the postmodern world.
Islamic Ethics and the Genome Question
Islamic Ethics and the Genome Question is one of the first academic works to examine the field of genomics from an Islamic perspective. The contributions in the volume also accommodate and interact with critical insights from outside the Islamic tradition. Readership: Researchers and students specializing in ethics, bioethics and Islamic studies. Additionally, this volume will be a source of important information for geneticists, genomicists and social scientists who are interested in the ethical discourse about genomics in the Muslim world.