9,368 research outputs found
Fuzzy Content Mining for Targeted Advertisement
Content-targeted advertising system is becoming an increasingly important part of the funding source of free web services. Highly efficient content analysis is the pivotal key of such a system. This project aims to establish a content analysis engine involving fuzzy logic that is able to automatically analyze real user-posted Web documents such as blog entries. Based on the analysis result, the system matches and retrieves the most appropriate Web advertisements. The focus and complexity is on how to better estimate and acquire the keywords that represent a given Web document. Fuzzy Web mining concept will be applied to synthetically consider multiple factors of Web content. A Fuzzy Ranking System is established based on certain fuzzy (and some crisp) rules, fuzzy sets, and membership functions to get the best candidate keywords. Once it is has obtained the keywords, the system will retrieve corresponding advertisements from certain providers through Web services as matched advertisements, similarly to retrieving a products list from Amazon.com. In 87% of the cases, the results of this system can match the accuracy of the Google Adwords system. Furthermore, this expandable system will also be a solid base for further research and development on this topic
Structured Knowledge Representation for Image Retrieval
We propose a structured approach to the problem of retrieval of images by
content and present a description logic that has been devised for the semantic
indexing and retrieval of images containing complex objects. As other
approaches do, we start from low-level features extracted with image analysis
to detect and characterize regions in an image. However, in contrast with
feature-based approaches, we provide a syntax to describe segmented regions as
basic objects and complex objects as compositions of basic ones. Then we
introduce a companion extensional semantics for defining reasoning services,
such as retrieval, classification, and subsumption. These services can be used
for both exact and approximate matching, using similarity measures. Using our
logical approach as a formal specification, we implemented a complete
client-server image retrieval system, which allows a user to pose both queries
by sketch and queries by example. A set of experiments has been carried out on
a testbed of images to assess the retrieval capabilities of the system in
comparison with expert users ranking. Results are presented adopting a
well-established measure of quality borrowed from textual information
retrieval
Implementation of an efficient Fuzzy Logic based Information Retrieval System
This paper exemplifies the implementation of an efficient Information
Retrieval (IR) System to compute the similarity between a dataset and a query
using Fuzzy Logic. TREC dataset has been used for the same purpose. The dataset
is parsed to generate keywords index which is used for the similarity
comparison with the user query. Each query is assigned a score value based on
its fuzzy similarity with the index keywords. The relevant documents are
retrieved based on the score value. The performance and accuracy of the
proposed fuzzy similarity model is compared with Cosine similarity model using
Precision-Recall curves. The results prove the dominance of Fuzzy Similarity
based IR system.Comment: arXiv admin note: substantial text overlap with
http://ntz-develop.blogspot.in/ ,
http://www.micsymposium.org/mics2012/submissions/mics2012_submission_8.pdf ,
http://www.slideshare.net/JeffreyStricklandPhD/predictive-modeling-and-analytics-selectchapters-41304405
by other author
Improving average ranking precision in user searches for biomedical research datasets
Availability of research datasets is keystone for health and life science
study reproducibility and scientific progress. Due to the heterogeneity and
complexity of these data, a main challenge to be overcome by research data
management systems is to provide users with the best answers for their search
queries. In the context of the 2016 bioCADDIE Dataset Retrieval Challenge, we
investigate a novel ranking pipeline to improve the search of datasets used in
biomedical experiments. Our system comprises a query expansion model based on
word embeddings, a similarity measure algorithm that takes into consideration
the relevance of the query terms, and a dataset categorisation method that
boosts the rank of datasets matching query constraints. The system was
evaluated using a corpus with 800k datasets and 21 annotated user queries. Our
system provides competitive results when compared to the other challenge
participants. In the official run, it achieved the highest infAP among the
participants, being +22.3% higher than the median infAP of the participant's
best submissions. Overall, it is ranked at top 2 if an aggregated metric using
the best official measures per participant is considered. The query expansion
method showed positive impact on the system's performance increasing our
baseline up to +5.0% and +3.4% for the infAP and infNDCG metrics, respectively.
Our similarity measure algorithm seems to be robust, in particular compared to
Divergence From Randomness framework, having smaller performance variations
under different training conditions. Finally, the result categorization did not
have significant impact on the system's performance. We believe that our
solution could be used to enhance biomedical dataset management systems. In
particular, the use of data driven query expansion methods could be an
alternative to the complexity of biomedical terminologies
Term-Specific Eigenvector-Centrality in Multi-Relation Networks
Fuzzy matching and ranking are two information retrieval techniques widely used in web search. Their application to structured data, however, remains an open problem. This article investigates how eigenvector-centrality can be used for approximate matching in multi-relation graphs, that is, graphs where connections of many different types may exist. Based on an extension of the PageRank matrix, eigenvectors representing the distribution of a term after propagating term weights between related data items are computed. The result is an index which takes the document structure into account and can be used with standard document retrieval techniques. As the scheme takes the shape of an index transformation, all necessary calculations are performed during index tim
Fuzzy rule based profiling approach for enterprise information seeking and retrieval
With the exponential growth of information available on the Internet and various organisational intranets there is a need for profile based information seeking and retrieval (IS&R) systems. These systems should be able to support users with their context-aware information needs. This paper presents a new approach for enterprise IS&R systems using fuzzy logic to develop task, user and document profiles to model user information seeking behaviour. Relevance feedback was captured from real users engaged in IS&R tasks. The feedback was used to develop a linear regression model for predicting document relevancy based on implicit relevance indicators. Fuzzy relevance profiles were created using Term Frequency and Inverse Document Frequency (TF/IDF) analysis for the successful user queries. Fuzzy rule based summarisation was used to integrate the three profiles into a unified index reflecting the semantic weight of the query terms related to the task, user and document. The unified index was used to select the most relevant documents and experts related to the query topic. The overall performance of the system was evaluated based on standard precision and recall metrics which show significant improvements in retrieving relevant documents in response to user queries
Personalized Fuzzy Text Search Using Interest Prediction and Word Vectorization
In this paper we study the personalized text search problem. The keyword
based search method in conventional algorithms has a low efficiency in
understanding users' intention since the semantic meaning, user profile, user
interests are not always considered. Firstly, we propose a novel text search
algorithm using a inverse filtering mechanism that is very efficient for label
based item search. Secondly, we adopt the Bayesian network to implement the
user interest prediction for an improved personalized search. According to user
input, it searches the related items using keyword information, predicted user
interest. Thirdly, the word vectorization is used to discover potential targets
according to the semantic meaning. Experimental results show that the proposed
search engine has an improved efficiency and accuracy and it can operate on
embedded devices with very limited computational resources
- …