169 research outputs found

    Recurrent Neural Networks with Top-k Gains for Session-based Recommendations

    RNNs have been shown to be excellent models for sequential data and in particular for data that is generated by users in a session-based manner. The use of RNNs provides impressive performance benefits over classical methods in session-based recommendations. In this work we introduce novel ranking loss functions tailored to RNNs in the recommendation setting. The improved performance of these losses over alternatives, along with further tricks and refinements described in this work, allows for an overall improvement of up to 35% in terms of MRR and Recall@20 over previous session-based RNN solutions and up to 53% over classical collaborative filtering approaches. Unlike data augmentation-based improvements, our method does not increase training times significantly. We further demonstrate the performance gain of the RNN over baselines in an online A/B test. Comment: CIKM'18, authors' version.

    Improved and Robust Controversy Detection in General Web Pages Using Semantic Approaches under Large Scale Conditions

    Detecting controversy in general web pages is a daunting task, but increasingly essential to efficiently moderate discussions and effectively filter problematic content. Unfortunately, controversies occur across many topics and domains, with great changes over time. This paper investigates neural classifiers as a more robust methodology for controversy detection in general web pages. Current models have often cast controversy detection on general web pages as Wikipedia linking or exact lexical matching tasks. The diverse and changing nature of controversies suggests that semantic approaches are better able to detect controversy. We train neural networks that can capture semantic information from texts using weak-signal data. By leveraging the semantic properties of word embeddings, we robustly improve on existing controversy detection methods. To evaluate model stability over time and on unseen topics, we assess model performance under varying training conditions to test cross-temporal, cross-topic, and cross-domain performance and annotator congruence. In doing so, we demonstrate that weak-signal-based neural approaches are closer to human estimates of controversy and are more robust to the inherent variability of controversies.

    How Consistent is Relevance Feedback in Exploratory Search?

    Search activities involving knowledge acquisition, investigation and synthesis are collectively known as exploratory search. Exploratory search is challenging for users, who may be unable to formulate search queries, have ill-defined search goals or may even struggle to understand search results. To ameliorate these difficulties, reinforcement learning-based information retrieval systems were developed to provide adaptive support to users. Reinforcement learning is used to build a model of user intent based on relevance feedback provided by the user. But how reliable is relevance feedback in this context? To answer this question, we developed a novel permutation-based metric for scoring the consistency of relevance feedback. We used this metric to perform a retrospective analysis of interaction data from lookup and exploratory search experiments. Our analysis shows that for lookup search, relevance judgments are highly consistent, supporting previous findings that relevance feedback improves retrieval performance. For exploratory search, however, the distribution of consistency scores shows considerable inconsistency. Peer reviewed.

    Engineering a Simplified 0-Bit Consistent Weighted Sampling

    The Min-Hashing approach to sketching has become an important tool in data analysis, information retrieval, and classification. To apply it to real-valued datasets, the ICWS algorithm has become a seminal approach that is widely used and provides state-of-the-art performance for this problem space. However, ICWS suffers a computational burden as the sketch size K increases. We develop a new Simplified approach to the ICWS algorithm that enables us to obtain speedups of over 20x compared to the standard algorithm. The veracity of our approach is demonstrated empirically on multiple datasets and scenarios, showing that our new Simplified CWS obtains the same quality of results while being an order of magnitude faster.

    Stock Price Movement Prediction Using Historical Prices and Sparse Tweets

    Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, August 2021. U Kang. Given historical stock prices and sparse tweets mentioning the stocks to predict, how can we precisely predict stock price movement? Many market analysts strive to use a large amount of information for prediction. However, they confront more noise when utilizing larger data for prediction. Thus, existing methods use only historical prices, or those along with a small amount of refined data such as news articles or tweets mentioning target stocks. However, they have the following limitations: 1) using only historical prices gives low performance since they carry insufficient information, 2) news articles lack timeliness compared to social media for predicting stock price movement, and 3) the previous methods using tweets do not handle stocks without tweets mentioning them. In this paper, we propose GLT (Stock Price Movement Prediction using Global and Local Trends of Tweets), an accurate stock price movement prediction method that works without tweets mentioning target stocks. GLT pre-trains both stock and tweet representations in a self-supervised way. Then, GLT generates global and local tweet trends, which represent global public opinion and the local trends related to target stocks, respectively. The trend vectors are combined to accurately predict stock price movement. Experimental results show that GLT achieves state-of-the-art accuracy for stock price movement prediction. Contents: I. Introduction; II. Related Work (2.1 Stock Price Movement Prediction, 2.2 Attentive LSTM); III. Proposed Method (3.1 Overview, 3.2 Self-supervised Pre-training for Representing Tweets and Stocks, 3.3 Global Tweet Trend, 3.4 Local Tweet Trend, 3.5 Stock Movement Prediction); IV. Experiment (4.1 Experiment Setting, 4.2 Classification Performance, 4.3 Ablation Study, 4.4 Hyperparameter Robustness); V. Conclusion; References; Abstract in Korean.