7 research outputs found

    English and Malay cross-lingual sentiment lexicon acquisition and analysis

    Sentiment analysis finds opinions, sentiments or emotions in user-generated content. Most efforts focus on the English language, for which a large number of sources and tools for sentiment analysis are available. The objective of this paper is to introduce a cross-lingual sentiment lexicon acquisition method for the Malay and English languages and to test it on a set of news test collections. Several part-of-speech tags are experimented with, using the Word Score Summation technique to classify the sentiment of the news articles. The method records an experimental accuracy of up to 50% and works better for verbs and negations in both the English and Malay news articles.
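A minimal sketch of how a word-score summation classifier might work, assuming each token is looked up in a polarity lexicon and the scores are summed. The toy lexicon, the negation list, and the names `word_score_summation` and `classify` are hypothetical illustrations; the abstract does not describe the paper's actual lexicon or negation handling.

```python
from typing import Dict, Iterable, Set

# Hypothetical toy lexicon (word -> polarity score); not taken from the paper.
LEXICON: Dict[str, float] = {"good": 1.0, "improve": 1.0, "bad": -1.0, "fail": -1.0}
# Hypothetical negation cues; flipping the next scored word is an assumption.
NEGATIONS: Set[str] = {"not", "no", "never"}


def word_score_summation(tokens: Iterable[str],
                         lexicon: Dict[str, float] = LEXICON,
                         negations: Set[str] = NEGATIONS) -> float:
    """Sum the lexicon scores of the tokens, flipping the sign after a negation."""
    total = 0.0
    negate = False
    for tok in tokens:
        word = tok.lower()
        if word in negations:
            negate = True
            continue
        score = lexicon.get(word, 0.0)
        if score != 0.0:
            total += -score if negate else score
            negate = False  # a negation applies to one scored word only
    return total


def classify(tokens: Iterable[str]) -> str:
    """Map the summed score to a sentiment label."""
    total = word_score_summation(tokens)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"
```

For example, `classify("the economy did not improve".split())` sums to a negative score because the negation flips the sign of "improve". Restricting the lookup to selected part-of-speech tags, as the abstract describes, would amount to filtering the tokens before calling the function.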

    Feature-based similarity method for aligning the Malay and English news documents

    A corpus-based translation approach can be used to obtain reliable translation knowledge in addition to the use of dictionaries or machine translation. However, the availability of such corpora is very limited, especially for low-resource languages. Many works have been reported on the alignment of multilingual documents, especially among the European languages, but fewer focus on languages with limited linguistic resources. One of the challenges is to align the available multilingual documents to create comparable corpora for these kinds of languages. This article describes an alignment method that utilizes the statistical features of the documents, such as the documents' titles, the texts of the contents, and the named entities present in each document. The method focuses on English and Malay news documents, in which the Malay language is considered a low-resource language. Source and target documents were compared in pairs. Accuracy, precision, and recall measurements were used in evaluating the results, with three relevance scales (Same story, Shared aspect and Unrelated) used to assess the alignment pairs. The results indicate that the method performed well in aligning the news documents, with an accuracy of 96% and an average precision of 81%.
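A rough sketch of pairwise feature-based scoring, assuming each document is reduced to three token sets (title, content, named entities) that are already in a shared vocabulary, for instance after dictionary lookup. The Jaccard measure, the weights, and the names `jaccard` and `alignment_score` are illustrative assumptions; the abstract does not specify the similarity measure or weighting the article uses.

```python
from typing import Dict, Iterable, Tuple


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard overlap of two token collections; 0.0 when both are empty."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def alignment_score(src: Dict[str, Iterable[str]],
                    tgt: Dict[str, Iterable[str]],
                    weights: Tuple[float, float, float] = (0.3, 0.4, 0.3)) -> float:
    """Weighted sum of per-feature similarities for one source/target pair.

    The weights are hypothetical and would need tuning on real data.
    """
    features = ("title", "content", "entities")
    return sum(w * jaccard(src[f], tgt[f]) for w, f in zip(weights, features))


# Hypothetical documents, already tokenised into a shared vocabulary.
doc_en = {"title": ["flood", "kelantan"],
          "content": ["flood", "river", "evacuate", "village"],
          "entities": ["Kelantan"]}
doc_related = {"title": ["flood", "kelantan", "worsens"],
               "content": ["flood", "river", "village", "rain"],
               "entities": ["Kelantan"]}
doc_unrelated = {"title": ["football", "final"],
                 "content": ["match", "goal", "team"],
                 "entities": ["Selangor"]}
```

A pair would be proposed as aligned when its score exceeds a chosen threshold; here `alignment_score(doc_en, doc_related)` is higher than `alignment_score(doc_en, doc_unrelated)`, which is zero because no feature overlaps.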

    A review on building bilingual comparable corpora for resource-limited languages

    Information retrieval tasks on certain Asian languages face the problem of limited knowledge resources, such as bilingual and multilingual dictionaries and corpora. Thus, there is a need to create multilingual resources for these languages. One way is to automatically align documents by estimating the likelihood that two documents, not necessarily in the same language, are related to each other. Multilingual corpora can then be automatically developed from these aligned documents. Numerous approaches for document alignment have been developed to date. In this paper, we give an overview of the progress made in bilingual and multilingual document alignment within the last 5 years. In addition, we discuss the current progress made in developing bilingual comparable corpora, especially for the Malay language, which is one of the resource-limited languages in Asia.

    A framework for English and Malay cross-lingual document alignment method

    Issues of information divide in multilingual information retrieval are usually solved by translating users' queries to a language that the users understand. However, dictionaries and other translation knowledge for some of the Asian languages are scarce. The objective of this study was to automatically align English and Malay news documents into a comparable corpus, which could serve as a translation resource to improve query translation in cross-lingual information retrieval. This study proposes a direct alignment framework that utilizes the similarity of the textual features of each document, while attempting a novel approach of using the similarity of the documents' sentiment to improve the effectiveness of the alignment method. The proposed sentiment-based approach outperformed existing alignment methods and improved the effectiveness in differentiating related and unrelated documents. These aligned comparable documents can further be utilised in translation research for English and Malay cross-lingual information retrieval tasks.

    Embedded gateway services for Internet of Things applications in ubiquitous healthcare

    The continuous advancement in computer and communication technologies has made personalized healthcare monitoring a rapidly growing area of interest. New features and services are envisaged, raising users' expectations of healthcare services. The emergence of the Internet of Things brings people closer to connecting the physical world to the Internet. In this paper, we present embedded services that are part of a ubiquitous healthcare system that allows automated and intelligent monitoring. The system uses IP connectivity and the Internet for end-to-end communication, from each 6LoWPAN sensor node to the web user interface on the Internet. The proposed algorithm in the Gateway performs multithreaded processing on the gathered medical signals for conversion to real data, feature extraction and wireless display. The user interface at the server allows users to access and view the medical data from mobile and portable devices. The ubiquitous system explores possibilities for connecting the Internet with things and people for health services.

    A review on the cross-lingual information retrieval

    Information retrieval involves finding required information in a collection of information or in a database. The collection is not necessarily in one language only, as information is not limited to a single language. The simplest way to search for information is to look at every item in the collection; when the need arises to translate between the languages used, the techniques and methods developed for cross-lingual retrieval systems come into play. This article reviews recent research on topics in cross-lingual information retrieval and its role in current research directions in the wider area of information retrieval.