2,354 research outputs found

    Towards a Universal Wordnet by Learning from Combined Evidence

    Lexical databases are invaluable sources of knowledge about words and their meanings, with numerous applications in areas like NLP, IR, and AI. We propose a methodology for the automatic construction of a large-scale multilingual lexical database where words of many languages are hierarchically organized in terms of their meanings and their semantic relations to other words. This resource is bootstrapped from WordNet, a well-known English-language resource. Our approach extends WordNet with around 1.5 million meaning links for 800,000 words in over 200 languages, drawing on evidence extracted from a variety of resources including existing (monolingual) wordnets, (mostly bilingual) translation dictionaries, and parallel corpora. Graph-based scoring functions and statistical learning techniques are used to iteratively integrate this information and build an output graph. Experiments show that this wordnet has a high level of precision and coverage, and that it can be useful in applied tasks such as cross-lingual text classification

    AgroSupportAnalytics: big data recommender system for agricultural farmer complaints in Egypt

    The world’s agricultural needs are growing in step with its population. Farmers play a vital role in society by helping to meet our basic food needs, so we need to support them in keeping up their work, even in difficult times such as the coronavirus disease (COVID-19) outbreak, which has brought strict regulations such as lockdowns, curfews, and social distancing procedures. In this article, we propose the development of a recommender system that assists in giving advice, support, and solutions for farmers’ agriculture-related complaints (or queries). The proposed system is based on the latent semantic analysis (LSA) approach to find the key semantic features of words used in agricultural complaints and their solutions. Further, it proposes to use the support vector machine (SVM) algorithm with Hadoop to classify the large agriculture dataset over the Map/Reduce framework. The results show that a semantic-based classification system and filtering methods can improve the recommender system. Our proposed system outperformed the existing interest recommendation models with an accuracy of 87%

    The Today Tendency of Sentiment Classification

    Sentiment classification has been studied for many years because it contributes to many areas of everyday life, such as political activities, commodity production, and commercial activities. Many approaches to sentiment analysis have been developed over that time, including machine learning approaches and lexicon-based approaches. The current tendencies in sentiment classification are as follows: (1) processing many big data sets with shorter execution times; (2) achieving high accuracy; (3) integrating flexibly and easily into many small machines or many different approaches. We present each category in more detail

    Bridging SMT and TM with translation recommendation

    We propose a translation recommendation framework to integrate Statistical Machine Translation (SMT) output with Translation Memory (TM) systems. The framework recommends SMT outputs to a TM user when it predicts that SMT outputs are more suitable for post-editing than the hits provided by the TM. We describe an implementation of this framework using an SVM binary classifier. We exploit methods to fine-tune the classifier and investigate a variety of features of different types. We rely on automatic MT evaluation metrics to approximate human judgements in our experiments. Experimental results show that our system can achieve 0.85 precision at 0.89 recall, excluding exact matches. Furthermore, it is possible for the end-user to achieve a desired balance between precision and recall by adjusting confidence levels
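
The precision/recall trade-off mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each segment has a classifier confidence score and a gold label (1 if the SMT output really was the better post-editing candidate). The function name `precision_recall`, the scores, and the labels are all hypothetical and are not taken from the paper. Raising the threshold makes recommendations rarer but more reliable.

```python
from typing import List, Tuple


def precision_recall(
    scores: List[float], labels: List[int], threshold: float
) -> Tuple[float, float]:
    """Precision and recall when SMT output is recommended for score >= threshold.

    Convention (an assumption of this sketch): precision is 1.0 when nothing
    is recommended, and recall is 0.0 when there are no positive labels.
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical confidence scores and gold labels, for illustration only.
scores = [0.95, 0.80, 0.70, 0.55, 0.40, 0.20]
labels = [1, 1, 0, 1, 0, 0]

if __name__ == "__main__":
    for threshold in (0.3, 0.6, 0.9):
        p, r = precision_recall(scores, labels, threshold)
        print(f"threshold={threshold:.1f} precision={p:.2f} recall={r:.2f}")
```

On this toy data, moving the threshold from 0.3 to 0.9 raises precision while lowering recall, which is the kind of balance an end-user could tune.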