371 research outputs found
Image Recommendation Based on Keyword Relevance Using Absorbing Markov Chain and Image Features
Image recommendation is an important feature of search engines, as a tremendous number of images is available online. It is necessary to retrieve relevant images to meet the user's requirements. In this paper, we present an algorithm, Image Recommendation with Absorbing Markov Chain (IRAbMC), to retrieve relevant images for a user's input query. Images are ranked by calculating the keyword relevance probability between keywords annotated in the log and the keywords of the user's input query. Keyword relevance is computed using an absorbing Markov chain. Images are then reranked using image visual features. Experimental results show that the IRAbMC algorithm outperforms the Markovian Semantic Indexing (MSI) method, with an improved relevance score of the retrieved ranked images.
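The absorption mechanics behind this kind of keyword ranking can be illustrated with a toy example. The sketch below is not the IRAbMC algorithm itself; the transition values are assumed for illustration, and it only shows how absorption probabilities fall out of the canonical form of an absorbing Markov chain:

```python
import numpy as np

# Toy absorbing Markov chain over keywords. Transient states are candidate
# keywords; absorbing states stand in for query keywords. The probability of
# being absorbed at a query keyword serves as a keyword-relevance score.
# Canonical form of the transition matrix:
#   P = [[Q, R],
#        [0, I]]
# Q: transient -> transient, R: transient -> absorbing (values assumed).
Q = np.array([[0.2, 0.3],
              [0.1, 0.4]])          # transitions among 2 candidate keywords
R = np.array([[0.5, 0.0],
              [0.2, 0.3]])          # transitions into 2 query keywords

# Fundamental matrix N = (I - Q)^{-1}; absorption probabilities B = N @ R.
N = np.linalg.inv(np.eye(2) - Q)
B = N @ R

# Each row of B sums to 1: a walk from a transient state is eventually absorbed.
print(B.round(3))
```

Row `i` of `B` then gives the relevance of each query keyword to candidate keyword `i`, which is the kind of score a ranking step can sort on.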
IRAbMC: Image Recommendation with Absorbing Markov Chain
Image recommendation is an important feature for search engines, as a tremendous number of images is available online. It is necessary to retrieve relevant images to meet the user's requirements. In this paper, we present an algorithm, Image Recommendation with Absorbing Markov Chain (IRAbMC), to retrieve relevant images for a user's input query. Images are ranked by calculating the keyword relevance probability between keywords annotated in the log and the keywords of the user's input query. An absorbing Markov chain is used to calculate keyword relevance. Experimental results show that the IRAbMC algorithm outperforms the Markovian Semantic Indexing (MSI) method, with an improved relevance score of the retrieved ranked images.
Adaptive image retrieval using a graph model for semantic feature integration
The variety of features available to represent multimedia data constitutes a rich pool of information. However, the plethora of data poses a challenge in terms of feature selection and integration for effective retrieval. Moreover, to further improve effectiveness, the retrieval model should ideally incorporate context-dependent feature representations to allow for retrieval on a higher semantic level. In this paper, we present a retrieval model and learning framework for the purpose of interactive information retrieval. We describe how semantic relations between multimedia objects can be learnt from user interaction and then integrated with visual and textual features into a unified framework. The framework models both feature similarities and semantic relations in a single graph. Querying in this model is implemented using the theory of random walks. In addition, we present ideas for implementing short-term learning from relevance feedback. Systematic experimental results validate the effectiveness of the proposed approach for image retrieval. However, the model is not restricted to the image domain and could easily be employed for retrieving other multimedia data (and even a combination of different domains, e.g. images, audio, and text documents).
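Querying a unified similarity graph with random walks can be sketched as follows. The graph, mixing weights, and restart parameter are illustrative assumptions rather than the paper's actual model; the sketch uses a random walk with restart (personalized PageRank) started from a query node:

```python
import numpy as np

# Hypothetical unified graph over 4 media objects: edge weights combine a
# visual-similarity component and a learnt semantic-relation component.
visual = np.array([[0.0, 0.8, 0.1, 0.0],
                   [0.8, 0.0, 0.2, 0.1],
                   [0.1, 0.2, 0.0, 0.7],
                   [0.0, 0.1, 0.7, 0.0]])
semantic = np.array([[0.0, 0.0, 0.5, 0.0],
                     [0.0, 0.0, 0.0, 0.0],
                     [0.5, 0.0, 0.0, 0.0],
                     [0.0, 0.0, 0.0, 0.0]])
W = 0.6 * visual + 0.4 * semantic       # assumed mixing weights

P = W / W.sum(axis=1, keepdims=True)    # row-stochastic transition matrix
alpha, query = 0.15, 0                  # restart probability, query node
r = np.full(4, 0.25)                    # start from the uniform distribution
e = np.zeros(4)
e[query] = 1.0                          # restart vector concentrated on query

for _ in range(100):                    # power iteration to the fixed point
    r = alpha * e + (1 - alpha) * (P.T @ r)

ranking = np.argsort(-r)                # objects ranked by relevance to query
print(ranking)
```

The stationary vector `r` scores every object by how often a restarting walk from the query visits it, which directly yields a retrieval ranking.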
Application of the Markov Chain Method in a Health Portal Recommendation System
This study produced a recommendation system that can effectively recommend items on a health portal. Toward this aim, a transaction log that records users’ traversal activities on the Medical College of Wisconsin’s HealthLink, a health portal with a subject directory, was utilized and investigated. This study proposed a mixed method that combined the transaction log analysis method, the Markov chain analysis method, and the inferential analysis method. The transaction log analysis method was applied to extract users’ traversal activities from the log. The Markov chain analysis method was adopted to model users’ traversal activities and then generate recommendation lists for topics, articles, and Q&A items on the health portal. The inferential analysis method was applied to test whether there are any correlations between recommendation lists generated by the proposed recommendation system and recommendation lists ranked by experts. The topics selected for this study, Infections, the Heart, and Cancer, were the three most viewed topics in the portal. The findings of this study revealed the consistency between the recommendation lists generated from the proposed system and the lists ranked by experts. At the topic level, two topic recommendation lists generated from the proposed system were consistent with the lists ranked by experts, while one topic recommendation list was highly consistent with the list ranked by experts. At the article level, one article recommendation list generated from the proposed system was consistent with the list ranked by experts, while 14 article recommendation lists were highly consistent with the lists ranked by experts. At the Q&A item level, three Q&A item recommendation lists generated from the proposed system were consistent with the lists ranked by experts, while 12 Q&A item recommendation lists were highly consistent with the lists ranked by experts.
The findings demonstrated the significance of users’ traversal data extracted from the transaction log. The methodology applied in this study offers a systematic approach to generating recommendation systems for other similar portals. The outcomes of this study can facilitate users’ navigation and provide a new method for building a recommendation system that recommends items at three levels: the topic level, the article level, and the Q&A item level.
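The Markov chain step can be sketched minimally as follows, with hypothetical sessions and item names standing in for the HealthLink data: transitions between consecutively viewed items are counted from the log, and each item's recommendation list is ranked by estimated transition probability:

```python
from collections import Counter, defaultdict

# Hypothetical sessions reconstructed from a transaction log (item names are
# illustrative, not the actual HealthLink directory).
sessions = [
    ["infections", "heart", "cancer"],
    ["infections", "cancer"],
    ["heart", "cancer", "infections"],
    ["infections", "heart"],
]

# Count first-order transitions between consecutively viewed items.
transitions = defaultdict(Counter)
for s in sessions:
    for cur, nxt in zip(s, s[1:]):
        transitions[cur][nxt] += 1

def recommend(item, k=2):
    """Top-k next items for `item`, ranked by estimated transition probability."""
    counts = transitions[item]
    total = sum(counts.values())
    return [(nxt, c / total) for nxt, c in counts.most_common(k)]

print(recommend("infections"))   # "heart" ranks above "cancer" in this toy log
```

The same counting scheme applies unchanged at each granularity (topics, articles, Q&A items); only the state space of the chain differs.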
Learning on Graphs with Partially Absorbing Random Walks: Theory and Practice
Learning on graphs has been studied for decades, with abundant models proposed, yet many of their behaviors and relations remain unclear. This thesis fills this gap by introducing a novel second-order Markov chain, called partially absorbing random walks (ParWalk). Different from an ordinary random walk, ParWalk is absorbed at its current state with some probability and otherwise follows a random outgoing edge. The partial absorption yields an absorption probability between any two vertices, which turns out to encompass various popular models including PageRank, hitting times, label propagation, and regularized Laplacian kernels. The unified treatment reveals the distinguishing characteristics of these models arising from different contexts, allows comparing them, and enables transferring findings from one paradigm to another.
The key to learning on graphs is capitalizing on the cluster structure of the underlying graph. The absorption probabilities of ParWalk turn out to be highly effective in capturing this cluster structure. Given a query vertex in a cluster, we show that when the absorbing capacity of each vertex on the graph is small, the probabilities of ParWalk being absorbed at the query have small variations in regions of high conductance (within clusters) but large gaps in regions of low conductance (between clusters). And the less absorbent the vertices of the cluster are, the better the absorption probabilities can represent the local cluster. Our theory induces principles for designing reliable similarity measures and provides justification for a number of popular ones, such as hitting times and the pseudo-inverse of the graph Laplacian. Furthermore, it reveals new important properties of these measures. For example, we are the first to show that hitting times are better at retrieving sparse clusters, while the pseudo-inverse of the graph Laplacian is better for dense ones.
The theoretical insights instilled by ParWalk guide us in developing robust algorithms for various applications, including local clustering, semi-supervised learning, and ranking. For local clustering, we propose a new method for salient object segmentation. By taking a noisy saliency map as the probability distribution of query vertices, we compute the absorption probabilities of ParWalk to the queries, producing a high-quality refined saliency map from which the objects can be easily segmented. For semi-supervised learning, we propose a new algorithm for label propagation. The algorithm is justified by our theoretical analysis and guaranteed to be superior to many existing ones. For ranking, we design a new similarity measure using ParWalk, which combines the strengths of both hitting times and the pseudo-inverse of the graph Laplacian. The hybrid similarity measure adapts well to complex data of diverse density and thus performs well overall. For all these learning tasks, our methods achieve substantial improvements over the state of the art on extensive benchmark datasets.
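A toy numerical sketch of partially absorbing random walks, assuming the commonly used closed form A = (Λ + L)⁻¹Λ for the absorption-probability matrix, where Λ holds the per-vertex absorbing capacities and L is the graph Laplacian; the graph and the capacity value are illustrative, not from the thesis:

```python
import numpy as np

# Toy undirected graph: three mutually connected vertices plus one pendant
# vertex, so the walk sees one denser region and one sparser one.
W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(W.sum(axis=1)) - W     # graph Laplacian L = D - W

lam = 0.1                          # small uniform absorbing capacity (assumed)
Lam = lam * np.eye(4)              # Lambda = diag(lambda_i)

# A[i, j]: probability that a walk started at i is eventually absorbed at j.
A = np.linalg.inv(Lam + L) @ Lam

print(A.round(3))
print(A.sum(axis=1))   # each row sums to 1: the walk is absorbed somewhere
```

Shrinking `lam` makes the rows of `A` flatten out within well-connected regions while gaps across sparse cuts persist, which is the cluster-capturing behavior the abstract describes.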
Reverse Engineering Static Content and Dynamic Behaviour of E-Commerce Websites for Fun and Profit
Nowadays, electronic commerce websites are one of the main transaction tools between online merchants and consumers or businesses.
These e-commerce websites rely heavily on summarizing and analyzing the behavior of customers, making an effort to influence user actions towards the optimization of success metrics such as CTR (Click-through Rate), CPC (Cost per Conversion), Basket and Lifetime Value, and User Engagement. Knowledge extraction from existing e-commerce website datasets, using data mining and machine learning techniques, has been greatly influencing Internet marketing activities. When faced with a new e-commerce website, the machine learning practitioner starts a web mining process by collecting historical and real-time data from the website and analyzing/transforming this data in order to extract information about the website's structure and content and its users' behavior. Only after this process are data scientists able to build relevant models and algorithms to enhance marketing activities. This process is expensive in resources and time, since it always depends on the condition in which the data is presented to the data scientist: data of higher quality (i.e., complete data) makes the data scientist's work easier and faster. On the other hand, in most cases, data scientists must resort to tracking domain-specific events throughout a user's visit to the website in order to discover the users' behavior; for this, it is necessary to modify the pages themselves, which raises the risk of not capturing all the relevant information by failing to enable tracking mechanisms on certain pages.
For example, we may not know a priori that a visit to a Delivery Conditions page is relevant to predicting a user's willingness to buy, and would therefore not enable tracking on those pages. Within this problem context, the proposed solution consists of a methodology capable of extracting and combining information about an e-commerce website through a web mining process, covering both the structure and the content of the website's pages, relying mostly on identifying dynamic content and semantic information in predefined locations, and complemented with the capability of extracting, from users' access logs, more accurate models to predict users' future behavior. This allows for the creation of a data model representing an e-commerce website and its archetypical users that can be useful, for example, in simulation systems.
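The "semantic information in predefined locations" step can be sketched with the standard library alone; the page markup and the choice of <meta> tags as the predefined location are hypothetical, standing in for whatever locations the methodology would target:

```python
from html.parser import HTMLParser

# Hypothetical sketch: pulling semantic information from a predefined
# location in an e-commerce page (here, <meta> tags in the head), the kind
# of static content the web-mining step would extract per page.
class MetaExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # Record every <meta name="..." content="..."> pair encountered.
        if tag == "meta":
            attrs = dict(attrs)
            if "name" in attrs and "content" in attrs:
                self.meta[attrs["name"]] = attrs["content"]

page = ('<html><head><meta name="product" content="shoes">'
        '<meta name="price" content="49.90"></head><body></body></html>')
p = MetaExtractor()
p.feed(page)
print(p.meta)   # -> {'product': 'shoes', 'price': '49.90'}
```

Extracted key/value pairs like these, gathered across pages, are the raw material that the later log-based behavioral models would be joined against.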