
    An analytical model for the propagation of social influence

    Studying the propagation of social influence is critical in the analysis of online social networks. While most existing work focuses on the expected number of users influenced, the detailed probability distribution of users influenced is also significant. However, determining the probability distribution of the final influence propagation state is difficult. Monte-Carlo simulations may be used, but are computationally expensive. In this paper, we develop an analytical model for the influence propagation process in online social networks based on discrete-time Markov chains, and deduce a closed-form equation for the n-step transition probability matrix. We show that given any initial state, the probability distribution of the final influence propagation state may be easily obtained from a matrix product. This provides a powerful tool to further understand social influence propagation.
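    As a minimal illustration of the matrix-product idea (a hypothetical three-state chain, not the paper's model), the distribution after n steps follows directly from the n-step transition matrix:

```python
# A minimal sketch: distribution over states after n steps of a
# discrete-time Markov chain, given transition matrix P and initial
# distribution pi0. The chain below is a toy assumption.
import numpy as np

P = np.array([
    [0.7, 0.2, 0.1],   # transition probabilities out of state 0
    [0.0, 0.6, 0.4],   # out of state 1
    [0.0, 0.0, 1.0],   # state 2 is absorbing (a final propagation state)
])
pi0 = np.array([1.0, 0.0, 0.0])  # start deterministically in state 0

n = 10
Pn = np.linalg.matrix_power(P, n)  # n-step transition probability matrix
pi_n = pi0 @ Pn                    # distribution after n steps
print(pi_n)
```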

    User Modeling and User Profiling: A Comprehensive Survey

    The integration of artificial intelligence (AI) into daily life, particularly through information retrieval and recommender systems, has necessitated advanced user modeling and profiling techniques to deliver personalized experiences. These techniques aim to construct accurate user representations based on the rich amounts of data generated through interactions with these systems. This paper presents a comprehensive survey of the current state, evolution, and future directions of user modeling and profiling research. We provide a historical overview, tracing the development from early stereotype models to the latest deep learning techniques, and propose a novel taxonomy that encompasses all active topics in this research area, including recent trends. Our survey highlights the paradigm shifts towards more sophisticated user profiling methods, emphasizing implicit data collection, multi-behavior modeling, and the integration of graph data structures. We also address the critical need for privacy-preserving techniques and the push towards explainability and fairness in user modeling approaches. By examining the definitions of core terminology, we aim to clarify ambiguities and foster a clearer understanding of the field by proposing two novel encyclopedic definitions of the main terms. Furthermore, we explore the application of user modeling in various domains, such as fake news detection, cybersecurity, and personalized education. This survey serves as a comprehensive resource for researchers and practitioners, offering insights into the evolution of user modeling and profiling and guiding the development of more personalized, ethical, and effective AI systems.

    An automatic participant detection framework for event tracking on twitter

    Topic Detection and Tracking (TDT) on Twitter emulates how humans identify developments in events from a stream of tweets, but while event participants are important for humans to understand what happens during events, machines have no knowledge of them. Our evaluation on football matches and basketball games shows that identifying event participants from tweets is a difficult problem exacerbated by Twitter’s noise and bias. As a result, traditional Named Entity Recognition (NER) approaches struggle to identify participants from the pre-event Twitter stream. To overcome these challenges, we describe Automatic Participant Detection (APD) to detect an event’s participants before the event starts and improve the machine understanding of events. We propose a six-step framework to identify participants and present our implementation, which combines information from Twitter’s pre-event stream and Wikipedia. In spite of the difficulties associated with Twitter and NER in the challenging context of events, our approach manages to restrict noise and consistently detects the majority of the participants. By empowering machines with some of the knowledge that humans have about events, APD lays the foundation for improved TDT.
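    As a rough illustration of the general idea (hypothetical code, not the paper's six-step framework), one plausible step is to rank terms from the pre-event stream by frequency and keep those that resolve against a list of Wikipedia titles:

```python
# A minimal sketch: rank candidate participants by mention frequency in
# pre-event tweets, keeping only candidates resolvable to Wikipedia titles.
# Tokenization and thresholds here are illustrative assumptions.
from collections import Counter

def detect_participants(pre_event_tweets, wikipedia_titles, min_mentions=2):
    """Return terms mentioned at least `min_mentions` times that also
    resolve to an entry in `wikipedia_titles`."""
    counts = Counter()
    for tweet in pre_event_tweets:
        for token in tweet.lower().split():
            counts[token.strip("#@,.!?")] += 1
    resolved = {t.lower() for t in wikipedia_titles}
    return [term for term, c in counts.most_common()
            if c >= min_mentions and term in resolved]

tweets = ["Messi warming up #Barcelona", "Barcelona vs Madrid tonight",
          "Can Messi score again?"]
print(detect_participants(tweets, ["Messi", "Barcelona", "Madrid"]))
```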

    Security information sharing via Twitter: 'Heartbleed' as a case study

    The current paper outlines an exploratory case study in which we examined the extent to which specific communities of Twitter users engaged with the debate about the security threat known as ‘Heartbleed’ in the first few days after this threat was exposed. The case study explored which professional groups appeared to lead the debate about Heartbleed, the nature of the communication (tweets and retweets), and evidence about behaviour change. Using keywords from the Twitter user profiles, six occupational groups were identified, each of which was likely to have a direct interest in learning about Heartbleed (including legal, financial, entrepreneurial, press, and IT professionals). The groups participated to different degrees in the debate about Heartbleed. This exploratory case study provides an insight into information sharing, potential communities of influence, and points for future research in the absence of a voice of authority in the field of cybersecurity.
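    A minimal sketch of the kind of profile-keyword matching described above, with hypothetical keyword lists and group names:

```python
# A minimal sketch: assign Twitter users to occupational groups by
# matching keywords in their profile bios. Groups and keywords below
# are illustrative assumptions, not the study's actual lists.
GROUP_KEYWORDS = {
    "legal": ["lawyer", "solicitor", "attorney"],
    "financial": ["banker", "accountant", "finance"],
    "it": ["developer", "sysadmin", "engineer"],
}

def classify_profile(bio):
    """Return every occupational group whose keywords appear in the bio."""
    bio = bio.lower()
    return [group for group, words in GROUP_KEYWORDS.items()
            if any(word in bio for word in words)]

print(classify_profile("Software developer and part-time accountant"))
# -> ['financial', 'it']
```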

    ReSpecTX: Programming Interaction Made Easy

    In this paper we present the ReSpecTX language, toolchain, and standard library as a first step of a path aimed at closing the gap between coordination languages – mostly a prerogative of the academic realm until now – and their industrial counterparts. Since the limited adoption of coordination languages within the industrial realm is also due to the lack of suitable toolchains and libraries of reusable mechanisms, ReSpecTX equips a core coordination language (ReSpecT) with tools and features commonly found in mainstream programming languages. In particular, ReSpecTX makes it possible to provide a reference library of reusable and composable interaction patterns.

    Machine learning for managing structured and semi-structured data

    As the digitalization of private, commercial, and public sectors advances rapidly, an increasing amount of data is becoming available. In order to gain insights or knowledge from these enormous amounts of raw data, a deep analysis is essential. The immense volume requires highly automated processes with minimal manual interaction. In recent years, machine learning methods have taken on a central role in this task. In addition to the individual data points, their interrelationships often play a decisive role, e.g. whether two patients are related to each other or whether they are treated by the same physician. Hence, relational learning is an important branch of research, which studies how to harness this explicitly available structural information between different data points. Recently, graph neural networks have gained importance. These can be considered an extension of convolutional neural networks from regular grids to general (irregular) graphs. Knowledge graphs play an essential role in representing facts about entities in a machine-readable way. While great efforts are made to store as many facts as possible in these graphs, they often remain incomplete, i.e., true facts are missing. Manual verification and expansion of the graphs is becoming increasingly difficult due to the large volume of data and must therefore be assisted or substituted by automated procedures which predict missing facts. The field of knowledge graph completion can be roughly divided into two categories: Link Prediction and Entity Alignment. In Link Prediction, machine learning models are trained to predict unknown facts between entities based on the known facts. Entity Alignment aims at identifying shared entities between graphs in order to link several such knowledge graphs based on some provided seed alignment pairs. In this thesis, we present important advances in the field of knowledge graph completion. For Entity Alignment, we show how to reduce the number of required seed alignments while maintaining performance by novel active learning techniques. We also discuss the power of textual features and show that graph-neural-network-based methods have difficulties with noisy alignment data. For Link Prediction, we demonstrate how to improve the prediction for unknown entities at training time by exploiting additional metadata on individual statements, often available in modern graphs. Supported with results from a large-scale experimental study, we present an analysis of the effect of individual components of machine learning models, e.g., the interaction function or loss criterion, on the task of link prediction. We also introduce a software library that simplifies the implementation and study of such components and makes them accessible to a wide research community, ranging from relational learning researchers to applied fields, such as life sciences. Finally, we propose a novel metric for evaluating ranking results, as used for both completion tasks. It allows for easier interpretation and comparison, especially in cases with different numbers of ranking candidates, as encountered in the de-facto standard evaluation protocols for both tasks.
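    As a small illustration of one of the model components studied here (the interaction function), the sketch below shows TransE-style translational scoring for link prediction; the choice of function and the toy vectors are illustrative assumptions, not the thesis's specific models:

```python
# A minimal sketch of one common interaction function for link prediction:
# TransE models a plausible triple as h + r ≈ t, so we negate the distance
# between h + r and t to obtain a plausibility score. Toy vectors only.
import numpy as np

def transe_score(h, r, t):
    """Score a triple (head, relation, tail); higher means more plausible."""
    return -np.linalg.norm(h + r - t)

rng = np.random.default_rng(0)
head, relation, tail = (rng.normal(size=50) for _ in range(3))
print(transe_score(head, relation, tail))
```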

    Measuring and Analysing the Chain of Implicit Trust: A Study of Third-party Resources Loading

    The web is a tangled mass of interconnected services, whereby websites import a range of external resources from various third-party domains. The latter can also load further resources hosted on other domains. For each website, this creates a dependency chain underpinned by a form of implicit trust between the first party and transitively connected third parties. The chain can only be loosely controlled, as first-party websites often have little, if any, visibility on where these resources are loaded from. This article performs a large-scale study of dependency chains in the web and finds that around 50% of first-party websites render content that they do not directly load. Although the majority (84.91%) of websites have short dependency chains (below three levels), we find websites with dependency chains exceeding 30 levels. Using VirusTotal, we show that 1.2% of these third parties are classified as suspicious; although seemingly small, this limited set of suspicious third parties has remarkable reach into the wider ecosystem. We find that 73% of websites under study load resources from suspicious third parties, and 24.8% of first-party webpages contain at least three third parties classified as suspicious in their dependency chain. By running sandboxed experiments, we observe a range of activities, with the majority of suspicious JavaScript code downloading malware.
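    A minimal sketch of how dependency-chain depth can be measured from observed loader-to-loaded edges; the edge data and helper below are hypothetical, not the article's measurement pipeline:

```python
# A minimal sketch: compute the longest dependency chain reachable from a
# first-party root, given (loader, loaded) edges observed during a crawl.
def chain_depth(edges, root):
    """Longest dependency chain reachable from `root`, counted in nodes."""
    children = {}
    for loader, loaded in edges:
        children.setdefault(loader, []).append(loaded)

    def depth(node, seen):
        if node in seen:   # guard against cycles in the load graph
            return 0
        seen = seen | {node}
        return 1 + max((depth(c, seen) for c in children.get(node, [])),
                       default=0)

    return depth(root, frozenset())

edges = [("first-party.com", "cdn.example"),
         ("cdn.example", "ads.example"),
         ("ads.example", "tracker.example")]
print(chain_depth(edges, "first-party.com"))  # -> 4 (a three-level chain)
```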

    Annual Report, 2013-2014

    Beginning in 2004/2005, issued in online format only.

    Applications in security and evasions in machine learning : a survey

    In recent years, machine learning (ML) has become an important means of providing security and privacy in various applications. ML is used to address serious issues such as real-time attack detection, data leakage vulnerability assessments and many more. ML extensively supports the demanding requirements of the current security and privacy landscape across a range of areas such as real-time decision-making, big data processing, reduced cycle time for learning, cost-efficiency and error-free processing. Therefore, in this paper, we review state-of-the-art approaches where ML can be applied effectively to fulfill current real-world security requirements. We examine different security applications where ML models play an essential role and compare their accuracy results along several dimensions. Analyzing ML algorithms in security applications provides a blueprint for this interdisciplinary research area. Even with the use of current sophisticated technology and tools, attackers can evade ML models by mounting adversarial attacks. This raises the need to assess the vulnerability of ML models to adversarial attacks at development time. Accordingly, we also analyze the different types of adversarial attacks on ML models. To give a proper visualization of security properties, we present the threat model and defense strategies against adversarial attack methods. Moreover, we categorize adversarial attacks based on the attacker's knowledge of the model and identify the points in the model at which such attacks may be committed. Finally, we also investigate different properties of adversarial attacks.
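    As a concrete illustration of one well-known evasion technique discussed in such surveys, here is a minimal sketch of the fast gradient sign method (FGSM) against a hand-rolled logistic-regression model; the weights, input, and epsilon are illustrative assumptions:

```python
# A minimal FGSM sketch: perturb an input in the direction that increases
# the model's loss, flipping its confidence. Toy model and data only.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.array([1.5, -2.0, 0.5])   # hypothetical trained weights
b = 0.1
x = np.array([0.2, -0.4, 0.7])   # a benign input
y = 1.0                          # its true label

# For logistic regression with cross-entropy loss, the gradient of the
# loss with respect to the input is (p - y) * w.
p = sigmoid(w @ x + b)
grad_x = (p - y) * w

eps = 0.3
x_adv = x + eps * np.sign(grad_x)  # FGSM perturbation

print("clean prediction:", sigmoid(w @ x + b))
print("adversarial prediction:", sigmoid(w @ x_adv + b))
```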

    Sentiment Analysis for Social Media

    Sentiment analysis is a branch of natural language processing concerned with the study of the intensity of the emotions expressed in a piece of text. The automated analysis of the multitude of messages delivered through social media is one of the hottest research fields, both in academia and in industry, due to its extremely high potential applicability in many different domains. This Special Issue describes both technological contributions to the field, mostly based on deep learning techniques, and specific applications in areas like health insurance, gender classification, recommender systems, and cyber aggression detection.
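    A minimal lexicon-based scoring sketch, the simplest member of the family of techniques covered here; the tiny lexicon is an illustrative assumption:

```python
# A minimal sketch: score sentiment by summing lexicon weights of the
# words in a text; the sign of the sum gives the polarity. Toy lexicon.
LEXICON = {"love": 2, "great": 1, "good": 1, "bad": -1, "awful": -2}

def sentiment_score(text):
    """Sum lexicon weights over the words in `text`."""
    return sum(LEXICON.get(word.strip(".,!?"), 0)
               for word in text.lower().split())

print(sentiment_score("I love this, it's great!"))  # -> 3
print(sentiment_score("Awful service, bad food."))  # -> -3
```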