9,392 research outputs found

    EGO: a personalised multimedia management tool

    Get PDF
    The problems of Content-Based Image Retrieval (CBIR) sys- tems can be attributed to the semantic gap between the low-level data representation and the high-level concepts the user associates with images, on the one hand, and the time-varying and often vague nature of the underlying information need, on the other. These problems can be addressed by improving the interaction between the user and the system. In this paper, we sketch the development of CBIR interfaces, and introduce our view on how to solve some of the problems of the studied interfaces. To address the semantic gap and long-term multifaceted information needs, we propose a "retrieval in context" system. EGO is a tool for the management of image collections, supporting the user through personalisation and adaptation. We will describe how it learns from the user's personal organisation, allowing it to recommend relevant images to the user. The recommendation algorithm is detailed, which is based on relevance feedback techniques

    Automatic Irony Detection using Feature Fusion and Ensemble Classifier

    Get PDF
    With the advent of micro-blogging sites, users are pioneer in expressing their sentiments and emotions on global issues through text. Automatic detection and classification of sentiments like sarcastic or ironic content in microblogging reviews is a challenging task. It requires a system that manages some kind of knowledge to interpret the sentiment expressed in text. The available approaches are quite limited in their capabilities and scope to detect ironic utterances present in the text. In this regards, the paper propose feature fusion to provide knowledge to the system by alternative sets of features obtained using linguistic and content based text features. The proposed work extracts five sets of linguistic features and fuses with features selected using two stages of a feature selection method. In order to demonstrate the effectiveness of the proposed method, we conduct extensive experimentation by selecting different feature subsets. The performances of the proposed method are evaluated using Support Vector Machine (SVM), Logistic Regression (LR), Random Forest (RF), Decision Tree (DT) and ensemble classifiers. The experimental result shows the proposed approach significantly out-performs the conventional methods

    An interactive human centered data science approach towards crime pattern analysis

    Get PDF
    The traditional machine learning systems lack a pathway for a human to integrate their domain knowledge into the underlying machine learning algorithms. The utilization of such systems, for domains where decisions can have serious consequences (e.g. medical decision-making and crime analysis), requires the incorporation of human experts' domain knowledge. The challenge, however, is how to effectively incorporate domain expert knowledge with machine learning algorithms to develop effective models for better decision making. In crime analysis, the key challenge is to identify plausible linkages in unstructured crime reports for the hypothesis formulation. Crime analysts painstakingly perform time-consuming searches of many different structured and unstructured databases to collate these associations without any proper visualization. To tackle these challenges and aiming towards facilitating the crime analysis, in this paper, we examine unstructured crime reports through text mining to extract plausible associations. Specifically, we present associative questioning based searching model to elicit multi-level associations among crime entities. We coupled this model with partition clustering to develop an interactive, human-assisted knowledge discovery and data mining scheme. The proposed human-centered knowledge discovery and data mining scheme for crime text mining is able to extract plausible associations between crimes, identifying crime pattern, grouping similar crimes, eliciting co-offender network and suspect list based on spatial-temporal and behavioral similarity. These similarities are quantified through calculating Cosine, Jacquard, and Euclidean distances. Additionally, each suspect is also ranked by a similarity score in the plausible suspect list. These associations are then visualized through creating a two-dimensional re-configurable crime cluster space along with a bipartite knowledge graph. This proposed scheme also inspects the grand challenge of integrating effective human interaction with the machine learning algorithms through a visualization feedback loop. It allows the analyst to feed his/her domain knowledge including choosing of similarity functions for identifying associations, dynamic feature selection for interactive clustering of crimes and assigning weights to each component of the crime pattern to rank suspects for an unsolved crime. We demonstrate the proposed scheme through a case study using the Anonymized burglary dataset. The scheme is found to facilitate human reasoning and analytic discourse for intelligence analysis

    Exploiting multimedia in creating and analysing multimedia Web archives

    No full text
    The data contained on the web and the social web are inherently multimedia and consist of a mixture of textual, visual and audio modalities. Community memories embodied on the web and social web contain a rich mixture of data from these modalities. In many ways, the web is the greatest resource ever created by human-kind. However, due to the dynamic and distributed nature of the web, its content changes, appears and disappears on a daily basis. Web archiving provides a way of capturing snapshots of (parts of) the web for preservation and future analysis. This paper provides an overview of techniques we have developed within the context of the EU funded ARCOMEM (ARchiving COmmunity MEMories) project to allow multimedia web content to be leveraged during the archival process and for post-archival analysis. Through a set of use cases, we explore several practical applications of multimedia analytics within the realm of web archiving, web archive analysis and multimedia data on the web in general

    Visualising the structure of document search results: A comparison of graph theoretic approaches

    Get PDF
    This is the post-print of the article - Copyright @ 2010 Sage PublicationsPrevious work has shown that distance-similarity visualisation or ‘spatialisation’ can provide a potentially useful context in which to browse the results of a query search, enabling the user to adopt a simple local foraging or ‘cluster growing’ strategy to navigate through the retrieved document set. However, faithfully mapping feature-space models to visual space can be problematic owing to their inherent high dimensionality and non-linearity. Conventional linear approaches to dimension reduction tend to fail at this kind of task, sacrificing local structural in order to preserve a globally optimal mapping. In this paper the clustering performance of a recently proposed algorithm called isometric feature mapping (Isomap), which deals with non-linearity by transforming dissimilarities into geodesic distances, is compared to that of non-metric multidimensional scaling (MDS). Various graph pruning methods, for geodesic distance estimation, are also compared. Results show that Isomap is significantly better at preserving local structural detail than MDS, suggesting it is better suited to cluster growing and other semantic navigation tasks. Moreover, it is shown that applying a minimum-cost graph pruning criterion can provide a parameter-free alternative to the traditional K-neighbour method, resulting in spatial clustering that is equivalent to or better than that achieved using an optimal-K criterion

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    Get PDF
    After addressing the state-of-the-art during the first year of Chorus and establishing the existing landscape in multimedia search engines, we have identified and analyzed gaps within European research effort during our second year. In this period we focused on three directions, notably technological issues, user-centred issues and use-cases and socio- economic and legal aspects. These were assessed by two central studies: firstly, a concerted vision of functional breakdown of generic multimedia search engine, and secondly, a representative use-cases descriptions with the related discussion on requirement for technological challenges. Both studies have been carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations in international conferences, and surveys addressed to EU projects coordinators as well as National initiatives coordinators. Based on the obtained feedback we identified two types of gaps, namely core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research challenges, but have impact on innovation progress. New socio-economic trends are presented as well as emerging legal challenges

    Tweet-tales: moods of socio-economic crisis?

    Get PDF
    The widespread adoption of highly interactive social media like Twitter, Facebook and other platforms allow users to communicate moods and opinions to their social network. Those platforms represent an unprecedented source of information about human habits and socio-economic interactions. Several new studies have started to exploit the potential of these big data as fingerprints of economic and social interactions. The present analysis aims at exploring the informative power of indicators derived from social media activity, with the aim to trace some preliminary guidelines to investigate the eventual correspondence between social media indices and available labour market indicators at a territorial level. The study is based on a large dataset of about 4 million Italian-language tweets collected from October 2014 to December 2015, filtered by a set of specific keywords related to the labour market. With techniques from machine learning and user’s geolocalization, we were able to subset the tweets on specific topics in all Italian provinces. The corpus of tweets is then analyzed with linguistic tools and hierarchical clustering analysis. A comparison with traditional economic indicators suggests a strong need for further cleaning procedures, which are then developed in detail. As data from social networks are easy to obtain, this represents a very first attempt to evaluate their informative power in the Italian context, which is of potentially high importance in economic and social research
    corecore