100 research outputs found
ANNOTATED DISJUNCT FOR MACHINE TRANSLATION
Most information on the Internet is available in English, yet most people in the
world are not English speakers. A reliable Machine Translation (MT) tool would
therefore be of great benefit to them. There are many approaches to developing MT
systems, among them the direct, rule-based/transfer, interlingua, and statistical
approaches. This thesis focuses on developing an MT system for less-resourced
languages, i.e., languages that lack an available grammar formalism, parser, and
corpus, such as some languages in Southeast Asia. The absence of bilingual corpora
motivates the use of direct or transfer approaches. Moreover, the unavailability of
a grammar formalism and parser for the target languages motivates a hybrid of the
direct and transfer approaches, referred to as the hybrid transfer approach. This
approach uses the Annotated Disjunct (ADJ) method. The method, based on the Link
Grammar (LG) formalism, can theoretically handle one-to-one, many-to-one, and
many-to-many word translations. It consists of a transfer rules module that maps
source words in a source sentence (SS) to target words in the correct position in a
target sentence (TS). The developed transfer rules are demonstrated on
English → Indonesian translation tasks. An experimental evaluation measures the
performance of the developed system against available English-Indonesian MT
systems. The developed ADJ-based MT system translated simple, compound, and
complex English sentences in the present, present continuous, present perfect, past,
past perfect, and future tenses with better precision than the other systems,
achieving an accuracy of 71.17% on the Subjective Sentence Error Rate metric.
Computer Program Software for Determining Formal Symmetry of Evolution Equations
The existence of a formal symmetry of an evolution equation is one of the criteria, due to Sokolov and Shabat, for the complete integrability or solvability of evolution equations.
Many evolution equations, such as the soliton (solitary-wave) Korteweg-de Vries (KdV) equation, have recently been found to have various kinds of explicit integrals or solutions. Such evolution equations admit infinitely many symmetries or admit a recursion operator. In this paper we introduce the definition of the formal symmetry. A formal symmetry is an approximation of the recursion operator, which gives a convenient way of characterizing equations that admit infinitely many symmetries.
In this research, we developed a program for computing the formal symmetries of evolution equations. To verify the correctness of the program, we applied it to several test evolution equations that have been proved to be formally completely integrable.
The resulting program can compute formal symmetries of arbitrary finite order (up to order 18) for the test equations, which verifies the correctness of the program.
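For orientation, the standard KdV example can be written out explicitly. Sign and scaling conventions differ between authors, so the normalization below is one common choice and not necessarily the one used in the paper. Here F_* denotes the linearization (Fréchet derivative) of the right-hand side, R the KdV recursion operator, and L a formal series whose truncations play the role of formal symmetries.

```latex
% KdV in one common normalization, and its linearization
u_t = F[u] = u_{xxx} + 6\,u\,u_x ,
\qquad
F_* = D_x^3 + 6\,u\,D_x + 6\,u_x .

% Recursion operator and its defining relation
R = D_x^2 + 4\,u + 2\,u_x\,D_x^{-1} ,
\qquad
R_t = [\,F_* ,\, R\,] .

% A formal symmetry is a formal series that satisfies the same
% relation only up to a prescribed order in D_x
L = \sum_{k \le m} l_k\, D_x^{k} ,
\qquad
L_t - [\,F_* ,\, L\,] = 0 \quad \text{(up to the prescribed order)} .
```

As a quick check on the conventions, applying R to the translation symmetry u_x gives u_{xxx} + 4 u u_x + 2 u_x u = u_{xxx} + 6 u u_x, which is the KdV flow itself.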
Arsitektur Jaringan Neural Berbasis Simpul Ram Untuk Pengenalan Huruf
Every handwritten letter differs depending on who writes it. Similarly, letters printed from a computer differ depending on the selected font and the type of printer. A method for recognizing letters is therefore needed, and neural networks are one such method. Examples of neural network methods are Adaline, Madaline, and Backpropagation. The disadvantage of these methods is that their interconnection weights require many training iterations, so the computation time is long. This study uses a neural network based on RAM nodes, which has a considerably shorter computation time because it does not involve weight vectors in its processing. With an input letter pattern given as a 64 x 48 pixel binary image, and using Turbo C++ version 1.0, we obtain a recognition time of less than 2 seconds. Had another method been used, for example Backpropagation, it could have taken on the order of minutes or even hours.
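The abstract does not describe the network layout, so the following is only a minimal sketch of a generic RAM-node (n-tuple, WiSARD-style) classifier, written in Python, not in the Turbo C++ of the original work. The class name `RamDiscriminator`, the tuple size, and the tiny 3 x 3 patterns in the usage note are illustrative assumptions.

```python
import random


class RamDiscriminator:
    """One discriminator per class: a bank of RAM nodes addressed by n-tuples of input bits."""

    def __init__(self, input_size, tuple_size, seed=0):
        rng = random.Random(seed)
        order = list(range(input_size))
        rng.shuffle(order)  # fixed pseudo-random assignment of pixels to RAM nodes
        self.tuples = [order[i:i + tuple_size] for i in range(0, input_size, tuple_size)]
        self.rams = [set() for _ in self.tuples]  # each RAM remembers the addresses it has seen

    def _addresses(self, bits):
        for positions in self.tuples:
            yield tuple(bits[p] for p in positions)

    def train(self, bits):
        # Training is a single write per node: no weights and no iterations.
        for ram, address in zip(self.rams, self._addresses(bits)):
            ram.add(address)

    def score(self, bits):
        # Number of RAM nodes that recognise their part of the pattern.
        return sum(address in ram for ram, address in zip(self.rams, self._addresses(bits)))


def classify(discriminators, bits):
    """Return the label whose discriminator responds most strongly."""
    return max(discriminators, key=lambda label: discriminators[label].score(bits))
```

A usage example on two hypothetical 3 x 3 binary patterns: create `{"T": RamDiscriminator(9, 3), "L": RamDiscriminator(9, 3)}`, call `train` once per pattern, then call `classify` on a new pattern. Because training only writes table entries, its cost is linear in the number of pixels, which is consistent with the short computation time the abstract reports.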
Point of Interest (POI) Recommendation System using Implicit Feedback Based on K-Means+ Clustering and User-Based Collaborative Filtering
Recommendation systems always involve huge volumes of data, which causes scalability issues that not only increase processing time but also reduce accuracy. In addition, the type of data used greatly affects the quality of the recommendations. Recommendation systems commonly use two types of data: implicit (binary) ratings and explicit (scalar) ratings. Binary ratings produce lower accuracy when they are not handled properly. This research therefore proposes optimized K-Means+ clustering combined with user-based collaborative filtering. The K-Means clustering is optimized by selecting the value of K using the Davies-Bouldin Index (DBI) method. The experimental results show that selecting K in this way produces better clustering than the Elbow Method. K-Means+ with User-Based Collaborative Filtering (UBCF) yields a precision of 8.6% and an F-measure of 7.2%. The proposed method was compared with the DBSCAN algorithm combined with UBCF and improved precision by 1%. This result indicates that K-Means+ with UBCF can handle implicit feedback datasets and improve precision.
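To make the K selection step concrete, here is a minimal pure-Python sketch of the Davies-Bouldin Index, where a lower value indicates more compact and better separated clusters. The helper `best_labeling` and the assumption that candidate clusterings are supplied as label lists are illustrative; the abstract does not say how the authors implemented this step. The sketch assumes at least two clusters with distinct centroids and Python 3.8+ for `math.dist`.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return [sum(coords) / len(points) for coords in zip(*points)]


def davies_bouldin(points, labels):
    """Davies-Bouldin Index of a clustering; lower is better."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    centers = {label: centroid(members) for label, members in clusters.items()}
    # Scatter: mean distance of a cluster's members to its centroid.
    scatter = {
        label: sum(math.dist(p, centers[label]) for p in members) / len(members)
        for label, members in clusters.items()
    }
    total = 0.0
    for i in clusters:
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


def best_labeling(points, labelings):
    """Given {k: labels} for several candidate K values, return the K with the lowest DBI."""
    return min(labelings, key=lambda k: davies_bouldin(points, labelings[k]))
```

In use, one would run K-Means once per candidate K, collect the resulting label lists in a dictionary keyed by K, and pass that dictionary to `best_labeling`.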
Perbandingan Performa Relational, Document-Oriented dan Graph Database Pada Struktur Data Directed Acyclic Graph
Abstract. A Directed Acyclic Graph (DAG) is a directed graph without cycles, commonly found in social network and genealogy data. Each database type performs differently depending on the structure of the data it handles, so a suitable database type should be evaluated and chosen specifically for DAG data. This study compares the performance of a relational database (PostgreSQL), a document-oriented database (MongoDB), and a graph database (Neo4j) on a DAG dataset. The performance test was implemented with Node.js running on Windows 10, using a dataset of 3910 nodes, for single write synchronous (SWS) and single read (SR) operations. The access time of PostgreSQL is 0.64 ms for SWS and 0.32 ms for SR, that of MongoDB is 0.64 ms for SWS and 4.59 ms for SR, and that of Neo4j is 9.92 ms for SWS and 8.92 ms for SR. Hence the relational database (PostgreSQL) performs best across the SWS and SR operations compared with the document-oriented database (MongoDB) and the graph database (Neo4j).
Keywords: database performance, directed acyclic graph, relational database, document-oriented database, graph database
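The abstract does not describe the schema used, so the following sketch only illustrates one common way to hold a DAG in a relational database: an edge table queried with a recursive common table expression. It uses Python's built-in `sqlite3` as a stand-in for PostgreSQL, and the table and function names are hypothetical.

```python
import sqlite3


def build_dag(edges):
    """Create an in-memory edge table; each row is one directed parent -> child edge."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE edge ("
        " parent TEXT NOT NULL,"
        " child TEXT NOT NULL,"
        " PRIMARY KEY (parent, child))"
    )
    # Each INSERT corresponds to a single write of one edge.
    conn.executemany("INSERT INTO edge VALUES (?, ?)", edges)
    conn.commit()
    return conn


def descendants(conn, node):
    """Return every node reachable from `node`, using a recursive CTE."""
    rows = conn.execute(
        """
        WITH RECURSIVE reach(node) AS (
            SELECT child FROM edge WHERE parent = ?
            UNION
            SELECT edge.child FROM edge JOIN reach ON edge.parent = reach.node
        )
        SELECT node FROM reach ORDER BY node
        """,
        (node,),
    )
    return [row[0] for row in rows]
```

For example, `descendants(build_dag([("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]), "a")` walks the diamond-shaped graph from its root. The `UNION` (rather than `UNION ALL`) removes duplicates, so a node reachable by two paths is reported once.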
Impact of Matrix Factorization and Regularization Hyperparameter on a Recommender System for Movies
Recommendation systems are developed to match consumers with products that meet their varied needs and tastes, in order to enhance user satisfaction and loyalty. Personalized recommendation systems have grown in popularity in recent years and are applied in many areas, including movies, songs, books, news, friend recommendations on social media, travel products, and other products in general. Collaborative filtering methods are widely used in recommendation systems and are divided into neighborhood-based and model-based methods. In this study, we implement matrix factorization, a model-based method that learns latent factors for each user and item and uses them to make rating predictions. The model is trained using stochastic gradient descent with additional tricks and with optimization of the regularization hyperparameter. Finally, neighborhood-based collaborative filtering and matrix factorization with different values of the regularization hyperparameter are compared. Our results show that matrix factorization with the lowest regularization hyperparameter outperformed the other methods in terms of RMSE. The functions used in this study are available in GraphLab, and the MovieLens 100k data set was used to build the recommendation systems.
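The following is a minimal pure-Python sketch of matrix factorization trained by stochastic gradient descent with an L2 regularization term. It is not the GraphLab implementation used in the study; the function names, the learning rate, the number of factors, and the default regularization value are illustrative assumptions, and the "additional tricks" mentioned in the abstract (which are not specified) are omitted.

```python
import math
import random


def train_mf(ratings, n_users, n_items, k=2, lr=0.02, reg=0.05, epochs=300, seed=0):
    """Learn user factors P and item factors Q from (user, item, rating) triples."""
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    data = list(ratings)
    for _ in range(epochs):
        rng.shuffle(data)  # visit ratings in a fresh random order each epoch
        for u, i, r in data:
            err = r - sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error plus reg * (|p|^2 + |q|^2).
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def rmse(ratings, P, Q):
    """Root-mean-square error of the predicted ratings."""
    squared = sum(
        (r - sum(a * b for a, b in zip(P[u], Q[i]))) ** 2 for u, i, r in ratings
    )
    return math.sqrt(squared / len(ratings))
```

To reproduce the kind of comparison the abstract describes, one would call `train_mf` with several values of `reg` and compare `rmse` on held-out ratings, not on the training ratings themselves.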
Stemming Influence on Similarity Detection of Abstract Written in Indonesia
This paper discusses the effect of stemming with the Nazief and Adriani algorithm on similarity detection results for abstracts written in Indonesian. Similarity detection on the contents of publication abstracts can serve as an early indication of whether plagiarism has occurred in a piece of writing. Text processing usually includes a pre-processing stage, one step of which is stemming: reducing each word to its root form in order to improve the search process. The stemmed output is converted into a set of word n-grams, and Fingerprint Matching is then applied to measure the similarity between texts. Based on the F1-score, which balances precision and recall, detection that applies both stemming and stopword removal performs best at detecting similarity between texts, with an average of 42%. This is higher than similarity detection using stemming alone (31%) or without any text pre-processing (34%), when bigrams are used.
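The pipeline described above (optional stopword removal, stemming, word n-grams, set comparison) can be sketched as follows. This is an illustration only: the Dice coefficient over n-gram sets stands in for the paper's Fingerprint Matching step, whose details the abstract does not give, and the stemmer is passed in as a function because the Nazief and Adriani algorithm is not reproduced here.

```python
def word_ngrams(text, n=2, stopwords=frozenset(), stem=lambda word: word):
    """Return the set of word n-grams after stopword removal and stemming."""
    tokens = [stem(word) for word in text.lower().split() if word not in stopwords]
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def similarity(a, b, **options):
    """Dice coefficient between the n-gram sets of two texts (0.0 to 1.0)."""
    grams_a = word_ngrams(a, **options)
    grams_b = word_ngrams(b, **options)
    if not grams_a or not grams_b:
        return 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))
```

As a usage example with a toy lookup-table stemmer (a hypothetical stand-in for a real Indonesian stemmer), `similarity("dia membaca buku", "dia baca buku", stem=lambda w: {"membaca": "baca"}.get(w, w))` treats the two sentences as identical at the bigram level, whereas without the `stem` argument they share no bigrams.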
User Curiosity Factor in Determining Serendipity of Recommender System
A recommender system (RS) is designed to help e-commerce users by recommending useful items from a huge selection. An RS prevents users from being flooded with information that is irrelevant to them. Unlike information retrieval (IR) systems, the goal of an RS is to present users with information that is accurate and, preferably, useful to them. Too much focus on accuracy in an RS may lead to an overspecialization problem, which decreases its effectiveness. The trend in RS research is therefore to focus on beyond-accuracy measures, such as serendipity. Serendipity can be described as an unexpected discovery that is useful. Since the concept of a recommendation system is still evolving, formalizing the definition of serendipity in a recommendation system is very challenging. One known subjective factor of serendipity is curiosity. While some researchers have already addressed the curiosity factor, the relationship between the various serendipity components as perceived by users and the users' curiosity levels has yet to be researched. This paper presents a method for determining a user curiosity model by considering the variation of rated items, and then validates its relation to the serendipity components using existing user feedback data. The findings show that the curiosity model is related to some, but not all, user-perceived values of serendipity. Moreover, it also has a positive effect on broadening user preferences.
The Social Engagement to Agricultural Issues using Social Network Analysis
Twitter is a micro-blogging social media platform that emphasizes speed of communication. In the 4.0 era, the government also promotes the distribution of information through social media to reach all parts of the community. In previous research, Social Network Analysis (SNA) was used to examine the relationships between actors in a work environment, or as a basis for identifying technology adoption in decision making, but no one has used SNA to examine trends in people's responses to agricultural information. This study aims to see the extent to which information about agriculture reaches the community, as well as the community's willingness to take part in agricultural development. This article also identifies the actors who took part in disseminating information. Data were collected from Drone Emprit Academic from November 13 to 20, 2020, and were limited to 3000 nodes. The SNA measurements are expressed as values of Degree Centrality, Betweenness Centrality, Closeness Centrality, and Eigenvector Centrality. @AdrianiLaksmi has the highest Eigenvector Centrality and Degree Centrality: this account plays the greatest role in disseminating information and has many followers among the other accounts that spread the same information. The @RamliRizal account ranks highest in Betweenness Centrality, its information being the most frequently referenced, and the @baigmac account has the highest Closeness Centrality because it was the fastest to retweet the initial information.
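Two of the four measures named above can be computed with a few lines of standard-library Python. The sketch below makes simplifying assumptions that the abstract does not state: the network is treated as undirected, the graph is given as a dictionary mapping each account to the set of its neighbours, and closeness is normalized over the nodes reachable from each source. Betweenness and eigenvector centrality are omitted for brevity.

```python
from collections import deque


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


def closeness_centrality(graph):
    """Reachable-node count divided by the sum of shortest-path distances."""
    result = {}
    for source in graph:
        distance = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search gives shortest paths in an unweighted graph
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        total = sum(distance.values())
        result[source] = (len(distance) - 1) / total if total else 0.0
    return result
```

On a small star-shaped graph such as `{"hub": {"x", "y", "z"}, "x": {"hub"}, "y": {"hub"}, "z": {"hub"}}` (hypothetical account names), the hub scores highest on both measures, which mirrors how a single account can dominate the spread of information.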
- …