36 research outputs found

Document Style Recognition Using Shallow Statistical Analysis

Author: Braslavski P.
Браславский П. И.
Publication date: 01/01/2004

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

User-Centered Comparison of Web Search Tools

Author: Braslavski P.
Shishkin A.
Браславский П. И.
Publication date: 01/01/2005

This study explores a user-centered approach to the comparative evaluation of the Web search tool ProThes against the popular all-purpose search engines Yandex and Google. An original research design was developed. Data were collected from 12 volunteers who performed 48 search tasks in total. The main outcomes are: (1) the search strategy supported by ProThes can be quite effective for focused Web search, and (2) ProThes’ interface and system performance must be improved. The research was supported in part by the Russian Fund of Basic Research, grant # 03-07-90342.

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

Automatic Geotagging of Russian Web Sites

Author: Braslavski P.
Maslov M.
Pyalling A.
Браславский П. И.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2006

The poster describes a fast, simple, yet accurate method for associating large amounts of web resources stored in a search engine database with geographic locations. The method uses location-by-IP data, domain names, and content-related features: ZIP and area codes. The novelty of the approach lies in building a location-by-IP database using the continuous IP blocks method. Another contribution is domain name analysis. The method uses search engine infrastructure and makes it possible to effectively associate large amounts of search engine data with geography on a regular basis. Experiments were run on the Yandex search engine index; the evaluation confirmed the efficacy of the approach. ACM Special Interest Group on Hypertext, Hypermedia, and We

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

ProThes: Thesaurus-based Meta-Search Engine for a Specific Application Domain

Author: Alshanski G.
Braslavski P.
Shishkin A.
Браславский П. И.
Publication venue: WWW2004 : NY USA
Publication date: 01/01/2004

In this poster we introduce ProThes, a pilot meta-search engine (MSE) for a specific application domain. ProThes combines three approaches: meta-search, a graphical user interface (GUI) for query specification, and thesaurus-based query techniques. ProThes attempts to employ domain-specific knowledge, which is represented by both a conceptual thesaurus and results ranking heuristics. Since the knowledge representation is separated from the MSE core, adjusting the system to a specific domain is trouble-free. The thesaurus allows for manual query building and automatic query techniques. This poster outlines the overall system architecture, thesaurus representation format, and query operations. ProThes is implemented on the J2EE platform as a Web service. The project was supported in part by the Russian Fund of Basic Research, grant # 03-07-90342.

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

A Large-Scale Community Questions Classification Accounting for Category Similarity: An Exploratory?

Author: Braslavski P.
Lezina G.
Браславский П. И.
Лезина Г.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

The paper reports on a large-scale topical categorization of questions from the Russian community question answering (CQA) service Otvety@Mail.Ru. We used a data set containing all the questions (more than 11 million) asked by Otvety@Mail.Ru users in 2012. This is the first study on question categorization dealing with non-English data of this size. The study focuses on adjusting the category structure in order to get more robust classification results. We investigate several approaches to measuring similarity between categories: the share of identical questions, language models, and user activity. The results show that the proposed approach is promising. Funding: Russian Foundation for Basic Research (RFBR), grant 14-07-00589.

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

What and How Do People Ask in Russian on Social Question Answering Services? (Что и как спрашивают в социальных вопросно-ответных сервисах по-русски?)

Author: Braslavski P.
Mukhin M.
Браславский П. И.
Мухин М. Ю.
Publication venue: Издательство РГГУ
Publication date: 01/01/2012

In our study we surveyed different approaches to the study of questions in traditional linguistics, question answering (QA), and, recently, community question answering (CQA). We adapted a functional-semantic classification scheme for CQA data and manually labeled 2,000 questions in Russian originating from the Otvety@Mail.Ru CQA service. About half of them are purely conversational and do not aim at obtaining actual information. In the subset of meaningful questions the major classes are requests for recommendations, or how-questions, and fact-seeking questions. The data demonstrate a variety of interrogative sentences as well as a host of formally non-interrogative expressions with the meaning of questions and requests. The observations can be of interest both for linguistics and for practical applications.

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

Experiment on Style-Dependent Document Ranking

Author: Braslavski P.
Tselischev A.
Браславский П. И.
Publication venue: [no publisher stated] (б. и.)
Publication date: 01/01/2005

The paper reports on experiments aimed at incorporating style-dependent parameters into ranking schemata in information retrieval tasks. We use the ROMIP Web collection and the ROMIP-2003 ad-hoc track results in the analysis. Factor analysis techniques were used to extract factors that reflect stylistic properties of documents. The obtained style-dependent parameters and the ranks derived from them are compared. A simple schema for rank aggregation is proposed. Evaluation of the results shows only moderate improvement of relevance ranking.

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

Towards Automatic Evaluation of Health-Related CQA Data

Author: Beloborodov A.
Braslavski P.
Driker M.
Белобородов А. В.
Браславский П. И.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

The paper reports on the evaluation of Russian community question answering (CQA) data in the health domain. About 1,500 question-answer pairs were manually evaluated by medical professionals; in addition, an automatic evaluation based on reference disease-medicine pairs was performed. Although the results of the manual and automatic evaluation do not fully match, we find the method still promising and propose several improvements. Automatic processing can be used to dynamically monitor the quality of the CQA content and to compare different data sources. Moreover, the approach can be useful for symptomatic surveillance and health education campaigns. This work is partially supported by the Russian Foundation for Basic Research, project #14-07-00589 “Data Analysis and User Modelling in Narrow-Domain Social Media”. We also thank the assessors who volunteered for the evaluation and Mail.Ru for granting us access to the data.

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

English → Russian MT evaluation campaign

Author: Beloborodov A.
Braslavski P.
Khalilov M.
Sharoff S.
Белобородов А.
Браславский П. И.
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2013

This paper presents the settings and the results of the ROMIP 2013 MT shared task for the English→Russian language direction. The quality of the generated translations was assessed using automatic metrics and human evaluation. We also discuss ways to reduce human evaluation effort by using pairwise sentence comparisons by human judges to simulate sort operations.

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

Learning to Predict Closed Questions on Stack Overflow

Author: Braslavski P.
Kuznetsov A.
Lezina G.
Браславский П. И.
Кузнецов А.
Лезина Г.
Publication venue: Казанский университет
Publication date: 01/01/2013

The paper deals with the problem of predicting whether a user’s question will be closed by a moderator on Stack Overflow, a popular question answering service devoted to software programming. The task, along with data and evaluation metrics, was offered as an open machine learning competition on the Kaggle platform. To solve this problem, we employed a wide range of classification features related to users, their interactions, and post content. Classification was carried out using several machine learning methods. According to the results of the experiment, the most important features are characteristics of the user and topical features of the question. The best results were obtained using Vowpal Wabbit – an implementation of online learning based on stochastic gradient descent. Our results are among the best in the overall ranking, although they were obtained after the official competition was over.

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin