44 research outputs found

    Cross-lingual Information Retrieval State-of-the-Art

    Information retrieval involves finding required information in a collection of information or in databases. The information or database need not be in a single language; in other words, language should not limit the finding of information. Information is searched for by examining every item in the collection, and when translation is needed, the techniques and methods developed for cross-lingual retrieval systems are used. This paper reviews recent research on topics in cross-lingual information retrieval and its role in current research directions, which include new models and paradigms in the wide area of information retrieval.

    English and Malay cross-lingual sentiment lexicon acquisition and analysis

    Sentiment analysis finds opinions, sentiments or emotions in user-generated content. Most efforts focus on the English language, for which a large number of resources and tools for sentiment analysis are available. The objective of this paper is to introduce a cross-lingual sentiment lexicon acquisition method for the Malay and English languages and to test it on a set of news test collections. Several part-of-speech tags are tested with the Word Score Summation technique in order to classify the sentiment of the news articles. The method achieves an experimental accuracy of up to 50% and works better for verbs and negations in both the English and Malay news articles.
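
    The abstract does not spell out how Word Score Summation is computed. The following is a minimal sketch of one plausible reading: lexicon polarity scores are summed over the tokens whose part-of-speech tag is in a chosen set, and the sign of the sum gives the label. The lexicon entries, tag names, example sentence and function names are all hypothetical.

```python
# Hypothetical toy lexicon: word -> polarity score (positive > 0, negative < 0).
# A real cross-lingual lexicon would hold English and Malay entries.
LEXICON = {"good": 1.0, "improve": 1.0, "bad": -1.0, "fail": -1.0}


def word_score_summation(tagged_tokens, lexicon, allowed_tags):
    """Sum the lexicon scores of tokens whose POS tag is in allowed_tags.

    Words missing from the lexicon contribute 0.
    """
    return sum(
        lexicon.get(word.lower(), 0.0)
        for word, tag in tagged_tokens
        if tag in allowed_tags
    )


def classify(score):
    """Map a summed score to a sentiment label by its sign."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Assumed input: (word, POS tag) pairs produced by some external tagger.
tagged = [
    ("Exports", "NOUN"),
    ("improve", "VERB"),
    ("despite", "ADP"),
    ("bad", "ADJ"),
    ("weather", "NOUN"),
]

verb_only = word_score_summation(tagged, LEXICON, {"VERB"})
verb_and_adj = word_score_summation(tagged, LEXICON, {"VERB", "ADJ"})
print(classify(verb_only), classify(verb_and_adj))
```

    Restricting the sum to different tag sets changes the label for the same sentence, which is the kind of comparison across part-of-speech tags that the abstract describes.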

    Deep learning mango fruits recognition based on tensorflow lite

    Agricultural images such as fruits and vegetables have previously been recognised and classified using image analysis and computer vision techniques. Mangoes are currently classified manually, whereby mango sellers must laboriously identify mangoes by hand. This is time-consuming and tedious. In this work, TensorFlow Lite was used as a transfer learning tool. Transfer learning is a fast approach to resolving classification problems effectively using small datasets. This work involves six categories: four mango types (Harum Manis, Langra, Dasheri and Sindhri), a category for other types of mangoes, and a non-mango category. Each category dataset comprises 100 images and is split 70/30 between the training and testing sets, respectively. This work was undertaken with a mobile-based application that can be used to distinguish various types of mangoes based on the proposed transfer learning method. The results of the experiment show that the adopted transfer learning approach can achieve an accuracy of 95% for mango recognition. A preliminary user acceptance survey was also carried out to investigate the user’s requirements, the effectiveness of the proposed functionalities, and the ease of use of the proposed interfaces, with promising results.
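
    The 70/30 split of 100 images per category can be sketched as follows. This only illustrates the data partitioning, not the transfer learning itself; the category keys, file names, seed and function name are hypothetical, and the abstract does not say whether the split was randomised.

```python
import random


def split_per_category(items_by_category, train_fraction=0.7, seed=0):
    """Shuffle each category independently and split it into train/test lists."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = {}, {}
    for category, items in items_by_category.items():
        shuffled = list(items)
        rng.shuffle(shuffled)
        cut = int(round(len(shuffled) * train_fraction))
        train[category] = shuffled[:cut]
        test[category] = shuffled[cut:]
    return train, test


# Six categories as listed in the abstract; file names are placeholders.
categories = ["harum_manis", "langra", "dasheri", "sindhri", "other_mango", "non_mango"]
dataset = {c: [f"{c}_{i:03d}.jpg" for i in range(100)] for c in categories}

train_set, test_set = split_per_category(dataset)
for c in categories:
    print(c, len(train_set[c]), len(test_set[c]))
```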

    A colour-based building recognition using support vector machine

    Many applications apply image recognition to help humans recognise objects using only digital images. A content-based building recognition system could solve the problem of relying on text alone as search input. In this paper, a building recognition system using colour histograms is proposed for recognising buildings in Ipoh city, Perak, Malaysia. The colour features of each building image are extracted. A feature vector combining the mean, standard deviation, variance, skewness and kurtosis of the gray level is formed to represent each building image. These feature values are then used to train the system with a supervised learning algorithm, the Support Vector Machine (SVM). Lastly, the accuracy of the recognition system is evaluated using 10-fold cross validation. The evaluation results show that the building recognition system is well trained and able to recognise the building images effectively, with a low misclassification rate.
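
    The five statistics named in the abstract can be computed as in the sketch below. It assumes population (not sample) moments, non-excess kurtosis, and a single flat list of intensity values; whether the paper computes them per colour channel or on a histogram is not stated, and the function name is hypothetical.

```python
import math


def gray_level_moments(pixels):
    """Return [mean, std, variance, skewness, kurtosis] of intensity values.

    Uses population moments; kurtosis is the plain fourth standardised
    moment (a normal distribution would give 3, not 0).
    """
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    std = math.sqrt(variance)
    if std == 0:
        # A constant image has no spread; higher moments are undefined,
        # so this sketch reports them as 0.
        return [mean, 0.0, 0.0, 0.0, 0.0]
    skewness = sum((p - mean) ** 3 for p in pixels) / (n * std ** 3)
    kurtosis = sum((p - mean) ** 4 for p in pixels) / (n * std ** 4)
    return [mean, std, variance, skewness, kurtosis]


# Toy "image": half black, half white pixels.
features = gray_level_moments([0, 0, 255, 255])
print(features)
```

    A vector like this, one per image, is the kind of input a supervised classifier such as an SVM would be trained on.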

    Durian recognition based on multiple features and linear discriminant analysis

    Many fruit recognition approaches today are designed to classify different types of fruit, but little effort has been devoted to content-based fruit recognition that focuses specifically on durian species. Durian, known as the king of tropical fruits, shares several characteristics across species: the skin has almost the same colour, from green to yellowish brown, with only slightly different shapes and thorn patterns. It is therefore hard to differentiate species with current methods. It would be valuable to have an automated content-based recognition framework that can represent and recognise a durian species given a durian image as input. This work therefore aims to contribute a new representation method based on multiple features for effective durian recognition. Two features, based on shape and texture, are considered in this work. Simple shape signatures, which include area, perimeter, and circularity, are used to determine the shape of the durian fruit and its base, while the texture of the fruit is constructed based on Local Binary Pattern. We extracted these features from 240 durian images and trained the proposed method using several classifiers. Based on 10-fold cross validation, the Logistic Regression, Gaussian Naïve Bayesian, and Linear Discriminant Analysis classifiers performed equally well, each achieving 100% accuracy, precision, recall, and F1-score. We further tested the proposed algorithm on a larger dataset consisting of 42337 fruit images (64 categories). Experimental results on this larger and more general dataset show that the proposed multiple features trained on the Linear Discriminant Analysis classifier are able to achieve 72.38% accuracy, 73% precision, 72% recall, and 72% F1-score.
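
    The shape signature mentioned in the abstract can be illustrated with a short sketch. It assumes the common definition of circularity, 4πA/P², which equals 1 for a perfect circle and decreases for less compact shapes; the abstract does not state which definition the authors use, and the function names are hypothetical.

```python
import math


def circularity(area, perimeter):
    """Common compactness measure: 4 * pi * area / perimeter**2."""
    return 4.0 * math.pi * area / (perimeter ** 2)


def shape_signature(area, perimeter):
    """Bundle area, perimeter and circularity into one feature list."""
    return [area, perimeter, circularity(area, perimeter)]


# Sanity check on two ideal shapes rather than real fruit contours.
radius = 10.0
circle = shape_signature(math.pi * radius ** 2, 2.0 * math.pi * radius)
square = shape_signature(100.0, 40.0)  # side length 10

print(circle[2], square[2])
```

    In practice the area and perimeter would come from a segmented fruit contour, and the resulting values would be combined with the texture features before classification.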

    Semantics representation in a sentence with concept relational model (CRM)

    Semantics, or meaning, in a sentence is currently represented using conceptual graphs. Conceptual graphs define concepts and conceptual relations loosely. This causes ambiguity, because a word can be classified as either a concept or a relation. Ambiguity disrupts the process of recognizing graph similarity, making interaction between multiple graphs difficult. Relational flow is also altered in conceptual graphs when additional linguistic information is input. This inconsistency of relational flow is caused by the bipartite structure of conceptual graphs, which only allows connections between concepts and relations but never between relations themselves. To overcome the problem of ambiguity, the concept relational model (CRM) described in this article strictly organizes word classes into three main categories: concept, relation and attribute. To do so, CRM begins by tagging the words in the text and proceeds by classifying them according to a predefined mapping. In addition, CRM maintains the consistency of the relational flow by also allowing connections between multiple relations. CRM then applies a set of canonical graphs to these newly classified components to represent semantics. The overall result is better accuracy in text engineering tasks such as relation extraction.

    Feature-based similarity method for aligning the Malay and English news documents

    A corpus-based translation approach can be used to obtain reliable translation knowledge in addition to dictionaries or machine translation. However, the availability of such corpora is very limited, especially for low-resource languages. Many works have been reported on the alignment of multilingual documents, especially among the European languages, but fewer focus on languages with limited linguistic resources. One of the challenges is to align the available multilingual documents in order to create comparable corpora for these languages. This article describes an alignment method that utilizes statistical features of the documents, such as the documents’ titles, the text of the contents, and the named entities present in each document. The method focuses on English and Malay news documents, in which the Malay language is considered a low-resource language. Source and target documents were compared in pairs. Accuracy, precision, and recall measurements were used to evaluate the results, with three relevance scales (Same story, Shared aspect and Unrelated) used to assess the alignment pairs. The results indicate that the method performed well in aligning the news documents, with an accuracy of 96% and an average precision of 81%.

    Factors contributing to collaborative Game-Based Learning (CGBL) effectiveness

    In response to the Covid-19 pandemic, the teaching and learning landscape in Malaysian schools has changed. The Ministry of Education has introduced Teaching and Learning at Home to take over from the previous methods. Conventional teaching methods are unfitting during the ‘new norm’. Therefore, teachers need to diversify their instructional strategies and search for various resources in the digital environment; learning in this mode should create a fun digital learning environment. Digital Game-based Learning (DGBL) is a teaching aid that is capable of promoting enjoyment in learning. This article focuses on DGBL as a learning method in a collaborative environment, called Collaborative Game-based Learning (CGBL). There is a shortage of insight into the factors that support DGBL’s efficiency in the digital environment, specifically in CGBL in educational settings. This article employs a systematic Thematic Review (TR) approach to synthesise the literature published from 2016 until 2021 on CGBL in the digital environment. A keyword search was conducted, followed by a filtering process using inclusion criteria, on the Scopus, Lens, and Mendeley databases. The author identified 65 peer-reviewed journal papers. Only 34 articles were retained for review after the inclusion and exclusion processes. A TR of these articles identified 95 initial codes, later grouped into 32 codes, and created ten categories from three themes. The TR results indicate that the factors contributing to CGBL effectiveness are learning environment, learning motivation and learning strategies. This work offers insight to various parties considering the implementation of CGBL in Teaching and Learning at Home as one of the appropriate alternative resources and methods.

    Acoustic feature analysis for wet and dry road surface classification using two-stream CNN

    Road surface wetness affects road safety and is one of the main reasons for weather-related accidents. The study of road surface classification is not only vital for future driverless vehicles but also important to the development of current vehicle active safety systems. In recent years, studies on road surface wetness classification using acoustic signals have been on the rise. Detecting road surface wetness from acoustic signals involves analysing the changes in the signal over the time and frequency domains caused by the interaction of the tyre and the wet road surface in order to determine suitable features. In this paper, two single-stream CNN architectures are investigated. The first architecture uses MFCCs and the other uses temporal and spectral features as the input for road surface wetness detection. A two-stream CNN architecture that merges the MFCCs and spectral feature sets by concatenating the outputs of the two streams is proposed to further improve the classification performance of road surface wetness detection. Acoustic signals of wet and dry road surface conditions were recorded with two microphones mounted on two different cars in a controlled environment. Experiments and comparative performance evaluations of the single-stream architectures and the two-stream architecture were performed. The results show that the accuracy of the proposed two-stream CNN architecture is significantly higher than that of the single-stream CNNs for road surface wetness detection.