22,193 research outputs found

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    After addressing the state of the art during the first year of CHORUS and establishing the existing landscape in multimedia search engines, we identified and analyzed gaps within the European research effort during our second year. In this period we focused on three directions: technological issues, user-centred issues and use-cases, and socio-economic and legal aspects. These were assessed by two central studies: first, a concerted vision of the functional breakdown of a generic multimedia search engine, and second, representative use-case descriptions with a related discussion of the requirements for technological challenges. Both studies were carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations at international conferences, and surveys addressed to EU project coordinators as well as national initiative coordinators. Based on the feedback obtained, we identified two types of gaps: core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal challenges.

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on the information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective, we take inventory of the impact and legal consequences of these technical advances and point out future directions of research.

    A rule dynamics approach to event detection in Twitter with its application to sports and politics

    The increasing popularity of Twitter as a social network tool for opinion expression as well as information retrieval has resulted in the need to derive computational means to detect and track relevant topics/events in the network. The application of topic detection and tracking methods to tweets enables users to extract newsworthy content from the vast and somewhat chaotic Twitter stream. In this paper, we apply our technique, named Transaction-based Rule Change Mining, to extract newsworthy hashtag keywords present in tweets from two different domains, namely sports (The English FA Cup 2012) and politics (US Presidential Elections 2012 and Super Tuesday 2012). Noting the peculiar nature of event dynamics in these two domains, we apply different time-windows and update rates to each of the datasets in order to study their impact on performance. The performance effectiveness results reveal that our approach is able to accurately detect and track newsworthy content. In addition, the results show that adapting the time-window yields better performance, especially on the sports dataset, which can be attributed to the usually shorter duration of football events.

    Alexandria: Extensible Framework for Rapid Exploration of Social Media

    The Alexandria system under development at IBM Research provides an extensible framework and platform for supporting a variety of big-data analytics and visualizations. The system is currently focused on enabling rapid exploration of text-based social media data. The system provides tools to help with constructing "domain models" (i.e., families of keywords and extractors to enable focus on tweets and other social media documents relevant to a project), to rapidly extract and segment the relevant social media and its authors, to apply further analytics (such as finding trends and anomalous terms), and to visualize the results. The system architecture is centered around a variety of REST-based service APIs to enable flexible orchestration of the system capabilities; these are especially useful to support knowledge-worker driven iterative exploration of social phenomena. The architecture also enables rapid integration of Alexandria capabilities with other social media analytics systems, as has been demonstrated through an integration with IBM Research's SystemG. This paper describes a prototypical usage scenario for Alexandria, along with the architecture and key underlying analytics. Comment: 8 pages

    Event detection on streams of short texts for decision-making

    The objective of this thesis is to design an event detection system for social networks to assist people in charge of decision-making in industrial contexts. The event detection system must be able to detect both targeted, domain-specific events and general events. In particular, we are interested in the application of this system to supply chains, and more specifically those related to raw materials. The challenge is to build such a detection system, but also to determine which events can potentially influence raw materials supply chains. This synthesis summarizes the different stages of research conducted to address these problems.

    Architecture of an event detection system. First, we introduce the different building blocks of an event detection system. These systems are classically composed of a data filtering and cleaning step, which ensures the quality of the data processed by the rest of the system. Then, these data are embedded in such a way that they can be clustered by similarity. Once these data clusters are created, they are analyzed to determine whether the documents constituting them discuss an event or not. Finally, the evolution of these events is tracked over time. In this thesis, we proposed to study the problems specific to each of these steps.

    Textual representation of documents from social networks. We compared different text representation models in the context of our event detection system. We also compared the performance of our event detection system to the First Story Detection (FSD) algorithm, an algorithm with the same objectives. We first demonstrated that our proposed system performs better than FSD, but also that recent neural network architectures (transformers) perform better than TF-IDF in our context, contrary to what was shown in the context of FSD. We then proposed to combine different textual representations in order to jointly exploit their strengths.

    Event detection, monitoring, and evaluation. We proposed approaches for the cluster analysis component as well as for tracking the evolution of events. In particular, we use the entropy and user diversity introduced in ... to evaluate the clusters. We then track their evolution over time by comparing clusters at different times, in order to create chains of clusters. Finally, we studied how to evaluate event detection systems in contexts where only a small amount of human-annotated data is available. We proposed a method to automatically evaluate event detection systems by exploiting partially annotated data.

    Application to the commodities context. In order to specify the types of events to monitor, we conducted a historical study of events that have impacted the price of raw materials. In particular, we focused on phosphate, a strategic raw material. We studied the different factors having an influence and proposed a reproducible method that can be applied to other raw materials or other fields. Finally, we drew up a list of elements to monitor to enable experts to anticipate price variations.

    A hierarchical topic modelling approach for tweet clustering

    While social media platforms such as Twitter can provide rich and up-to-date information for a wide range of applications, manually digesting such large volumes of data is difficult and costly. Therefore it is important to automatically infer coherent and discriminative topics from tweets. Conventional topic models and document clustering approaches fail to achieve good results due to the noisy and sparse nature of tweets. In this paper, we explore various ways of tackling this challenge and finally propose a two-stage hierarchical topic modelling system that is efficient and effective in alleviating the data sparsity problem. We present an extensive evaluation on two datasets, and report that our proposed system achieves the best performance in both document clustering and topic coherence.

    A survey of data mining techniques for social media analysis

    Social networks have gained remarkable attention in the last decade. Accessing social network sites such as Twitter, Facebook, LinkedIn and Google+ through the internet and Web 2.0 technologies has become more affordable. People are becoming more interested in and reliant on social networks for information, news and the opinions of other users on diverse subject matters. The heavy reliance on social network sites causes them to generate massive data characterised by three computational issues, namely size, noise and dynamism. These issues often make social network data very complex to analyse manually, which makes computational means of analysis pertinent. Data mining provides a wide range of techniques for detecting useful knowledge, such as trends, patterns and rules, from massive datasets [44]. Data mining techniques are used for information retrieval, statistical modelling and machine learning. These techniques employ data pre-processing, data analysis, and data interpretation processes in the course of data analysis. This survey discusses different data mining techniques used in mining diverse aspects of the social network over the decades, from the historical techniques to the up-to-date models, including our novel technique named TRCM. All the techniques covered in this survey are listed in Table 1, including the tools employed as well as the names of their authors.

    Cumulative causation in the formation of a technological innovation system: The case of biofuels in the Netherlands

    Despite its worldwide success, the innovation systems approach is often criticized for being theoretically underdeveloped. This article aims to contribute to the conceptual and methodical basis of the (technological) innovation systems approach. We propose an alteration that improves the analysis of dynamics, especially with respect to emerging innovation systems. We do this by expanding on the technological innovation systems and system functions literature, and by employing the method of 'event history analysis'. By mapping events, the interactions between system functions and their development over time can be analysed. Based on this it becomes possible to identify forms of positive feedback, i.e. cumulative causation. As an illustration of the approach, we assess the biofuels innovation system in The Netherlands as it evolved from 1990 to 2005.