Text Extraction and Web Searching in a Non-Latin Language
Recent studies of queries submitted to Internet Search Engines have shown that
non-English queries and unclassifiable queries have nearly tripled during the
last decade. Most search engines were originally engineered for English. They
do not take full account of inflectional semantics or, for example, of diacritics and
the use of capitals, which are common features of languages other than English.
The literature concludes that searching using non-English and non-Latin based
queries results in lower success and requires additional user effort to achieve
acceptable precision.
The primary aim of this research study is to develop an evaluation methodology
for identifying the shortcomings and measuring the effectiveness of
search engines with non-English queries. It also proposes a number of solutions
for the existing situation. A Greek query log is analyzed considering the morphological
features of the Greek language. A text extraction experiment also
revealed some problems related to the encoding and the morphological and
grammatical differences among semantically equivalent Greek terms. A first
stopword list for Greek based on a domain independent collection has been
produced and its application in Web searching has been studied. The effect of
lemmatization of query terms and the factors influencing text based image retrieval
in Greek are also studied. Finally, an instructional strategy is presented
for teaching non-English students how to effectively utilize search engines.
The evaluation of the capabilities of the search engines showed that international
and nationwide search engines ignore most of the linguistic idiosyncrasies
of Greek and other complex European languages. There is a lack of
freely available non-English resources to work with (test corpus, linguistic resources,
etc.). The research showed that the application of standard IR techniques,
such as stopword removal, stemming, lemmatization and query expansion,
in Greek Web searching increases precision.
BlogForever D5.2: Implementation of Case Studies
This document presents the internal and external testing results for the BlogForever case studies. The evaluation of the BlogForever implementation process is tabulated under the most relevant themes and aspects obtained within the testing processes. The case studies provide relevant feedback for the sustainability of the platform in terms of potential users' needs and relevant information on the possible long-term impact.
Network research by data graph management for capacity development and knowledge building in sustainable sanitation
The Millennium Development Goals (MDG) set clear targets for 2015, and sanitation turns out to be by far the largest of all the MDG targets, affecting about 40% of the global population. The objective of the Sustainable Sanitation Alliance (SuSanA) is to show how Sustainable Sanitation projects should be planned with the participation of stakeholders through capacity development activities. Developing the capacity of societies to collaboratively learn through change and uncertainty is fundamental for sustainability science. The aim of this contribution is to analyze the role of graph database management (GDM) in improving capacity development and knowledge building in the Sustainable Sanitation framework. We provide a theoretical model with four features of network research: link analysis, social network, pattern recognition, and keyword search, which we illustrate with some examples. Network research allows us to observe how the information in Sustainable Sanitation is scattered properly through the structure and also to detect the emergencies, objections and other characteristics of the network. Peer Reviewed
Discovering Mathematical Objects of Interest -- A Study of Mathematical Notations
Mathematical notation, i.e., the writing system used to communicate concepts
in mathematics, encodes valuable information for a variety of information
search and retrieval systems. Yet, mathematical notations remain mostly
unutilized by today's systems. In this paper, we present the first in-depth
study on the distributions of mathematical notation in two large scientific
corpora: the open access arXiv (2.5B mathematical objects) and the mathematical
reviewing service for pure and applied mathematics zbMATH (61M mathematical
objects). Our study lays a foundation for future research projects on
mathematical information retrieval for large scientific corpora. Further, we
demonstrate the relevance of our results to a variety of use-cases. For
example, to assist semantic extraction systems, to improve scientific search
engines, and to facilitate specialized math recommendation systems. The
contributions of our presented research are as follows: (1) we present the
first distributional analysis of mathematical formulae on arXiv and zbMATH; (2)
we retrieve relevant mathematical objects for given textual search queries
(e.g., linking $P_{n}^{(\alpha, \beta)}(x)$ with `Jacobi
polynomial'); (3) we extend zbMATH's search engine by providing relevant
mathematical formulae; and (4) we exemplify the applicability of the results by
presenting auto-completion for math inputs as the first contribution to math
recommendation systems. To expedite future research projects, we have made
available our source code and data. Comment: Proceedings of The Web Conference 2020 (WWW'20), April 20--24, 2020,
Taipei, Taiwan
Search Engine Optimization
This Special Issue book focuses on the theory and practice of search engine optimization (SEO). It is intended for anyone who publishes content online and it includes five peer-reviewed papers from various researchers. More specifically, the book includes theoretical and case study contributions which review and synthesize important aspects, including, but not limited to, the following themes: theory of SEO, different types of SEO, SEO criteria evaluation, search engine algorithms, social media and SEO, and SEO applications in various industries, as well as SEO on media websites. The book aims to give a better understanding of the importance of SEO in the current state of the Internet and online information search. Even though SEO is widely used by marketing practitioners, there is a relatively small amount of academic research that systematically attempts to capture this phenomenon and its impact across different industries. Thus, this collection of studies offers useful insights, as well as a valuable resource that intends to open the door for future SEO-related research