
    Neurala klickmodeller med latenta variabler för webbsöksystem

    User click modeling in web search is most commonly done with probabilistic graphical models. Given the success of machine learning techniques in other fields of research, it is interesting to evaluate how they can be applied to click modeling. In this thesis, modeling is done using recurrent neural networks trained on a distributed representation of the state-of-the-art user browsing model (UBM). It is further evaluated how extending this representation with a set of latent variables that are easily derivable from click logs affects the model's prediction performance. Results show that a model using the original representation does not perform very well. However, the inclusion of simple variables can drastically increase performance on the click prediction task, where the model manages to outperform the two chosen baseline models, which themselves already perform well. It also leads to increased performance on the relevance prediction task, although the improvement is not as drastic. It can be argued that the relevance prediction task is not a fair comparison with the baseline models, since they need considerably larger amounts of data to learn the respective probabilities. It is nevertheless favorable that the neural models manage to perform quite well using smaller amounts of data. It would be interesting to see how well such models would perform when trained on far greater data quantities than were used in this project, and to tailor the model for the use of LSTM, which could supposedly increase performance even more. Evaluating representations other than the one used would also be of interest, as this representation did not perform remarkably on its own.

    A comparative study of the conventional item-based collaborative filtering and the Slope One algorithms for recommender systems

    Recommender systems are an important research topic in today's society as the amount of data increases across the globe. For a commercial system to give its users good, personalized recommendations on what data may be of interest to them, it must be able to give recommendations quickly and scale well as data increases. The purpose of this study is to evaluate two such algorithm families with this in mind. Both families tested are classified as item-based collaborative filtering but work very differently. It is therefore of interest to see how their complexity affects their performance, accuracy, and scalability. The Slope One family is much simpler to implement and proves to be as efficient as, if not more efficient than, the conventional item-based family. Both families require a precomputation stage before recommendations can be given; this is the stage where Slope One suffers in comparison to the conventional item-based family. The algorithms are tested using Lenskit, on data provided by GroupLens and their MovieLens project.
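To make the two stages mentioned in the abstract concrete, here is a minimal sketch of weighted Slope One, one commonly described member of the Slope One family. It is not the Lenskit implementation used in the study, and the abstract does not say which variant was evaluated. The function names and the toy ratings are assumptions made purely for illustration. The precomputation stage builds, for every item pair, the average rating difference and the number of users who rated both items; prediction then combines those differences with the target user's own ratings.

```python
from collections import defaultdict


def precompute_deviations(ratings):
    """Precomputation stage (illustrative helper, not a Lenskit API).

    ratings: {user: {item: rating}}
    Returns (dev, freq) where dev[i][j] is the mean of (r_ui - r_uj) over
    users who rated both i and j, and freq[i][j] is the number of such users.
    """
    diff_sum = defaultdict(lambda: defaultdict(float))
    freq = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for i, r_i in user_ratings.items():
            for j, r_j in user_ratings.items():
                if i != j:
                    diff_sum[i][j] += r_i - r_j
                    freq[i][j] += 1
    dev = {
        i: {j: diff_sum[i][j] / freq[i][j] for j in diff_sum[i]}
        for i in diff_sum
    }
    return dev, freq


def predict(user_ratings, target, dev, freq):
    """Weighted Slope One prediction of `target` for one user.

    Each item j the user has rated contributes (dev[target][j] + r_j),
    weighted by how many users co-rated target and j.
    Returns None when no co-rated item exists.
    """
    numerator = 0.0
    denominator = 0
    for j, r_j in user_ratings.items():
        if j == target:
            continue
        n = freq.get(target, {}).get(j, 0)
        if n:
            numerator += (dev[target][j] + r_j) * n
            denominator += n
    return numerator / denominator if denominator else None


# Toy data, invented for this example only.
ratings = {
    "John": {"A": 5, "B": 3, "C": 2},
    "Mark": {"A": 3, "B": 4},
    "Lucy": {"B": 2, "C": 5},
}

dev, freq = precompute_deviations(ratings)
lucy_a = predict(ratings["Lucy"], "A", dev, freq)
print(round(lucy_a, 2))  # prints 4.33
```

In this sketch the precomputation loops over every pair of items each user has rated, so its cost grows with the square of the number of ratings per user, while a single prediction only touches the items the target user has already rated. That asymmetry is one plausible reading of why the precomputation stage is where Slope One is reported to suffer.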