24 research outputs found

Image Based Model for Document Search and Re-ranking

Author: Bhise R. H. (Rajesh)
Mhatre R. G. (Rutuja)
Publication venue: 'Arunai Publications Private Limited'
Publication date: 01/08/2016

Traditional Web search engines do not use the images in Web pages when searching for documents relevant to a given query. Instead, they typically compute a measure of agreement between the keywords provided by the user and only the text portion of each Web page. This project investigates whether the image content appearing in a Web page can be used to enhance the semantic description of the page and thereby improve the performance of a keyword-based search engine. A Web-scalable system is presented that uses a pure text-based search engine to find an initial set of candidate documents for a given query. The candidate set is then re-ranked using visual information extracted from the images contained in the pages. The resulting system maintains the computational efficiency of traditional text-based search engines, with only a small additional storage cost needed for the precomputed visual information.

Neliti

Overview of Random Forest Methodology and Practical Guidance with Emphasis on Computational Biology and Bioinformatics

Author: Boulesteix Anne-Laure
Janitza Silke
Kruppa Jochen
König Inke R.
Publication date: 25/07/2012

The Random Forest (RF) algorithm by Leo Breiman has become a standard data analysis tool in bioinformatics. It has shown excellent performance in settings where the number of variables is much larger than the number of observations, can cope with complex interaction structures as well as highly correlated variables, and returns measures of variable importance. This paper synthesizes ten years of RF development with emphasis on applications to bioinformatics and computational biology. Special attention is given to practical aspects such as the selection of parameters, available RF implementations, and important pitfalls and biases of RF and its variable importance measures (VIMs). The paper surveys recent developments of the methodology relevant to bioinformatics as well as some representative examples of RF applications in this context and possible directions for future research.

Crossref

Open Access LMU

Users' Traces for Enhancing Arabic Facebook Search

Author: Badache Ismail
Barhoumi Amira
Bayoudhi Amine
Dahou Abdelghani
ElSahar Hady
Greiffenstern Sandra
Mikolov Tomas
Mohamed
Westerveld Thijs
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 17/09/2019

This paper proposes an approach to Facebook search in Arabic that exploits several users' traces (e.g. comments, shares, reactions) left on Facebook posts to estimate their social importance. Our goal is to show how these social traces (signals) can play a vital role in improving Arabic Facebook search. Firstly, we identify the polarities (positive or negative) carried by the textual signals (e.g. comments) and the non-textual ones (e.g. the reactions love and sad) for a given Facebook post. The polarity of each comment on a given Facebook post is estimated using a neural sentiment model for the Arabic language. Secondly, we group signals according to their complementarity using feature selection algorithms. Thirdly, we apply learning to rank (LTR) algorithms to re-rank Facebook search results based on the selected groups of signals. Finally, experiments are carried out on 13,500 Facebook posts collected from 45 topics in the Arabic language. Experimental results reveal that Random Forests combined with ReliefFAttributeEval (RLF) was the most effective LTR approach for this task.

Crossref

HAL AMU

Towards Effective Network Intrusion Detection: A Hybrid Model Integrating Gini Index and GBDT with PSO

Author: Jianjun Cheng
Longjie Li
Shenshen Bai
Xiaoyun Chen
Yang Yu
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2018

To protect computing systems from malicious attacks, network intrusion detection systems have become an important part of the security infrastructure. Recently, hybrid models that integrate several machine learning techniques have attracted increasing attention from researchers. In this paper, a novel hybrid model is proposed for detecting network intrusions effectively. In the proposed model, the Gini index is used to select the optimal subset of features, the gradient boosted decision tree (GBDT) algorithm is adopted to detect network attacks, and the particle swarm optimization (PSO) algorithm is used to optimize the parameters of GBDT. The performance of the proposed model is experimentally evaluated in terms of accuracy, detection rate, precision, F1-score, and false alarm rate using the NSL-KDD dataset. Experimental results show that the proposed model is superior to the compared methods.
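
The feature-selection step mentioned in this abstract can be illustrated with a minimal sketch. It assumes categorical features and ranks them by the weighted Gini impurity of the class labels after splitting on each feature (lower means more informative). The abstract does not give the paper's exact procedure, so the function names, the toy records, and the ranking rule below are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter, defaultdict


def gini_impurity(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(feature_values, labels):
    """Impurity of the labels after grouping samples by a categorical feature."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    n = len(labels)
    return sum(len(g) / n * gini_impurity(g) for g in groups.values())


def rank_features(rows, labels):
    """Return (feature, weighted Gini) pairs, most informative feature first."""
    scores = {
        name: weighted_gini([row[name] for row in rows], labels)
        for name in rows[0]
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical toy records; the field names are assumptions for illustration.
rows = [
    {"protocol": "tcp", "flag": "SF"},
    {"protocol": "tcp", "flag": "S0"},
    {"protocol": "udp", "flag": "SF"},
    {"protocol": "udp", "flag": "S0"},
]
labels = ["normal", "attack", "normal", "attack"]

ranking = rank_features(rows, labels)
print(ranking)
```

In this toy data the `flag` feature separates the two classes perfectly, so it ranks ahead of `protocol`. A selection step would then keep the top-ranked features before training the classifier.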

Crossref

Directory of Open Access Journals

Personalized News Event Retrieval for Small Talk in Social Dialog Systems

Author: Bechberger Lucas
Federico Marcello
Schmidt Maria
Waibel Alex

This paper presents the NewsTeller system, which retrieves a news event based on a user query and the user's general interests. It can be used by a social dialog system to initiate news-related small talk. The NewsTeller system is implemented as a pipeline with four stages: after collecting a large set of potentially relevant news events, a classifier is used to filter out malformed events. The remaining events are then ranked according to a relevance value predicted by a regressor. In a final step, a short summary of the highest-ranked event is generated and returned to the user. Both the classifier and the regressor were evaluated on hand-labeled data sets. In addition, a user study was conducted to further validate the system. Evaluation results indicate that the proposed approach performs significantly better than a random baseline.

Archivio della ricerca - Fondazione Bruno Kessler

Pegasus: a comprehensive annotation and prediction tool for detection of driver gene fusions in cancer

Publication venue: BioMed Central

Springer - Publisher Connector

A cross-benchmark comparison of 87 learning to rank methods

Author: Niek Tax
Sander Bockting
Djoerd Hiemstra
Publication venue: Elsevier
Publication date: 01/01/2015

Learning to rank is an increasingly important scientific field that comprises the use of machine learning for the ranking task. New learning to rank methods are generally evaluated on benchmark test collections. However, comparison of learning to rank methods based on evaluation results is hindered by the absence of a standard set of evaluation benchmark collections. In this paper we propose a way to compare learning to rank methods based on a sparse set of evaluation results on a set of benchmark datasets. Our comparison methodology consists of two components: (1) Normalized Winning Number, which gives insight into the ranking accuracy of the learning to rank method, and (2) Ideal Winning Number, which gives insight into the degree of certainty concerning its ranking accuracy. Evaluation results of 87 learning to rank methods on 20 well-known benchmark datasets are collected through a structured literature search. ListNet, SmoothRank, FenchelRank, FSMRank, LRUF and LARF are Pareto optimal learning to rank methods in the Normalized Winning Number and Ideal Winning Number dimensions, listed in increasing order of Normalized Winning Number and decreasing order of Ideal Winning Number.
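
The two measures named in this abstract can be illustrated with a short sketch on sparse results. It assumes the usual reading of these terms: a method's winning number counts the (other method, dataset) pairs in which it scores strictly higher, the Ideal Winning Number counts the pairs in which both methods were evaluated, and the Normalized Winning Number is their ratio. The abstract itself does not spell out these formulas, and the scores below are made-up toy values, not results from the paper.

```python
from typing import Dict, Optional, Tuple

# method -> dataset -> score; None marks "not evaluated on this dataset".
Results = Dict[str, Dict[str, Optional[float]]]


def winning_numbers(results: Results, method: str) -> Tuple[int, int, float]:
    """Return (winning number, ideal winning number, normalized winning number)."""
    wins = 0
    ideal = 0
    for other, other_scores in results.items():
        if other == method:
            continue
        for dataset, score in results[method].items():
            rival = other_scores.get(dataset)
            if score is None or rival is None:
                continue  # comparison impossible: one method was not evaluated
            ideal += 1
            if score > rival:
                wins += 1
    normalized = wins / ideal if ideal else 0.0
    return wins, ideal, normalized


# Hypothetical toy scores (e.g. NDCG@10) for three methods on three datasets.
toy: Results = {
    "A": {"d1": 0.50, "d2": 0.40, "d3": None},
    "B": {"d1": 0.45, "d2": 0.42, "d3": 0.30},
    "C": {"d1": 0.48, "d2": None, "d3": 0.35},
}

for name in toy:
    print(name, winning_numbers(toy, name))
```

Here method B has the highest Ideal Winning Number because it was evaluated on every dataset, yet the lowest Normalized Winning Number. This is the kind of trade-off between accuracy and certainty that the abstract describes along its two dimensions.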

University of Twente Research Information