803 research outputs found

    Scaling real-time event detection to massive streams

    In today’s world the internet and social media are omnipresent and information is accessible to everyone. This has shifted the advantage from those who have access to information to those who access it first. Identifying new events as they emerge is of substantial value to financial institutions that consider real-time information in their decision-making processes, as well as to journalists who report on breaking news and governmental agencies that collect information and respond to emergencies. First Story Detection is the task of identifying those documents in a stream of documents that talk about new events first. This seemingly simple task is non-trivial because the computational effort increases with every processed document. Standard approaches to First Story Detection determine a document’s novelty by comparing it to previously seen documents. This results in the highest reported accuracy, but even the currently fastest system only scales to 10% of the Twitter stream. In this thesis, we propose a new algorithm family, called memory-based methods, able to scale to the full Twitter stream on a single core. Our memory-based method computes a document’s novelty up to two orders of magnitude faster than state-of-the-art systems without sacrificing accuracy. This thesis additionally provides original work on the impact of processing unbounded data streams on detection accuracy. Our experiments reveal for the first time that the novelty scores of state-of-the-art comparison-based and memory-based methods decay over time. We show how to counteract the discovered novelty decay and increase detection accuracy. Additionally, we show that memory-based methods are applicable beyond First Story Detection by building the first real-time rumour detection system on social media streams.
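    The comparison-based baseline that the abstract describes can be sketched roughly as follows. This is a minimal illustration, not the thesis's method: the function names (`cosine`, `novelty_scores`, `first_stories`), the raw term-count representation, and the `threshold` value are all assumptions made for the example. It shows why the cost grows with the stream: every new document is compared against all documents seen so far.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def novelty_scores(stream):
    """Score each document as 1 minus its similarity to the nearest earlier one.

    Hypothetical helper: whitespace tokenisation and raw counts stand in for
    whatever representation a real system would use.
    """
    seen = []    # vectors of all previously processed documents
    scores = []
    for text in stream:
        vec = Counter(text.lower().split())
        # Work here is linear in the number of documents seen so far.
        nearest = max((cosine(vec, old) for old in seen), default=0.0)
        scores.append(1.0 - nearest)
        seen.append(vec)
    return scores


def first_stories(stream, threshold=0.5):
    """Return indices of documents whose novelty reaches the assumed threshold."""
    return [i for i, s in enumerate(novelty_scores(stream)) if s >= threshold]
```

    For example, in the stream `["earthquake hits city", "earthquake hits city centre", "team wins cup final"]` the second document closely matches the first, so under these assumptions only the first and third are flagged as first stories.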

    Distance, Time and Terms in First Story Detection

    First Story Detection (FSD) is an important application of online novelty detection within Natural Language Processing (NLP). Given a stream of documents, or stories, about news events in chronological order, the goal of FSD is to identify the very first story for each event. While a variety of NLP techniques have been applied to the task, FSD remains challenging because it is still not clear what the most crucial factor is in defining “story novelty”. Given these challenges, the thesis addressed in this dissertation is that the notion of novelty in FSD is multi-dimensional. To address this, the work presented has adopted a three-dimensional analysis of the relative qualities of FSD systems and gone on to propose a specific method that we argue significantly improves understanding and performance of FSD. FSD is of course not a new problem type; therefore, our first dimension of analysis consists of a systematic study of detection models for first story detection and the distances that are used in the detection models for defining novelty. This analysis presents a tripartite categorisation of the detection models based on the end points of the distance calculation. The study also considers issues of document representation explicitly, and shows that even in a world driven by distributed representations, the nearest neighbour detection model with TF-IDF document representations still achieves state-of-the-art performance for FSD. We provide analysis of this important result and suggest potential causes and consequences. Events are introduced and change at a slow rate relative to the frequency at which words come in and out of usage on a document-by-document basis. Therefore we argue that the second dimension of analysis should focus on the temporal aspects of FSD. Here we are concerned not only with the temporal nature of the detection process, e.g., the time/history window over the stories in the data stream, but also with the processes behind the representational updates that underpin FSD. Through a systematic investigation of static representations, and also dynamic representations with both low and high update frequencies, we show that while a dynamic model unsurprisingly outperforms static models, the dynamic model in fact stops improving and stays steady once the update frequency exceeds a threshold. Our third dimension of analysis moves across to the particulars of lexical content, and critically the effect of terms in the definition of story novelty. We provide a specific analysis of how terms are represented for FSD, including the distinction between static and dynamic document representations, and the effect of out-of-vocabulary terms and the specificity of a word in the calculation of the distance. Our investigation showed that term distributional similarity, rather than the scale of common terms across the background and target corpora, is the most important factor in selecting background corpora for document representations in FSD. More crucially, in this work the simple idea of new terms emerged as a vital factor in defining novelty for the first story.

    Time of your hate: The challenge of time in hate speech detection on social media

    The availability of large annotated corpora from social media and the development of powerful classification approaches have contributed in an unprecedented way to tackling the challenge of monitoring users' opinions and sentiments in online social platforms across time. Such linguistic data are strongly affected by events and topic discourse, and this aspect is crucial when detecting phenomena such as hate speech, especially from a diachronic perspective. We address this challenge by focusing on a real case study: the "Contro l'odio" platform for monitoring hate speech against immigrants in the Italian Twittersphere. We explored the temporal robustness of a BERT model for Italian (AlBERTo), the current benchmark in non-diachronic detection settings. We tested different training strategies to evaluate how the classification performance is affected by adding more data temporally distant from the test set and hence potentially different in terms of topic and language use. Our analysis points out the limits that a supervised classification model encounters on data that are heavily influenced by events. Our results show that AlBERTo is highly sensitive to the temporal distance of the fine-tuning set. However, with an adequate time window, the performance increases, while requiring less annotated data than a traditional classifier.

    ENTERPRISE & INDUSTRY magazine 2010 September, N0.11



    The brain as a generative model: information-theoretic surprise in learning and action

    Our environment is rich with statistical regularities, such as a sudden cold gust of wind indicating a potential change in weather. A combination of theoretical work and empirical evidence suggests that humans embed this information in an internal representation of the world. This generative model is used to perform probabilistic inference, which may be approximated through surprise minimization. This process rests on current beliefs enabling predictions, with expectation violation amounting to surprise. Through repeated interaction with the world, beliefs become more accurate and grow more certain over time. Perception and learning may be accounted for by minimizing surprise of current observations, while action is proposed to minimize expected surprise of future events. This framework thus shows promise as a common formulation for different brain functions. The work presented here adopts information-theoretic quantities of surprise to investigate both perceptual learning and action. We recorded electroencephalography (EEG) of participants in a somatosensory roving-stimulus paradigm and performed trial-by-trial modeling of cortical dynamics. Bayesian model selection suggests that early processing in somatosensory cortices encodes confidence-corrected surprise and, subsequently, Bayesian surprise. This suggests that the somatosensory system signals the surprise of observations and updates a probabilistic model that learns transition probabilities. We also extended this framework to include audition and vision in a multi-modal roving-stimulus study. Next, we studied action by investigating sensitivity to expected Bayesian surprise. Interestingly, this quantity is also known as information gain and arises as an incentive to reduce uncertainty in the active inference framework, which can correspond to surprise minimization. In comparing active inference to a classical reinforcement learning model on the two-step decision-making task, we provided initial evidence that active inference better accounts for human model-based behaviour. This appeared to relate to participants’ sensitivity to expected Bayesian surprise and contributed to explaining exploration behaviour not accounted for by the reinforcement learning model. Overall, our findings provide evidence for information-theoretic surprise as a model for perceptual learning signals while also guiding human action.

    Red Pilled - The Allure of Digital Hate

    Hate is being reinvented. Over the last two decades, online platforms have been used to repackage racist, sexist and xenophobic ideologies into new sociotechnical forms. Digital hate is ancient but novel, deploying the Internet to boost its allure and broaden its appeal. To understand the logic of hate, the author investigates four objects: 8chan, the cesspool of the Internet; QAnon, the popular meta-conspiracy; Parler, a social media site; and Gab, the "platform for the people." Drawing together powerful human stories with insights from media studies, psychology, political science, and race and cultural studies, he portrays how digital hate infiltrates hearts and minds.