
    Learning Multimodal Latent Attributes

    The rapid development of social media sharing has created a huge demand for automatic media classification and annotation techniques. Attribute learning has emerged as a promising paradigm for bridging the semantic gap and addressing data sparsity by transferring attribute knowledge in object recognition and relatively simple action classification. In this paper, we address the task of attribute learning for understanding multimedia data with sparse and incomplete labels. In particular, we focus on videos of social group activities, which are particularly challenging and topical examples of this task because of their multi-modal content and their complex and unstructured nature relative to the density of annotations. To solve this problem, we (1) introduce the concept of a semi-latent attribute space, expressing user-defined and latent attributes in a unified framework, and (2) propose a novel scalable probabilistic topic model for learning multi-modal semi-latent attributes, which dramatically reduces the need for an exhaustive, accurate attribute ontology and expensive annotation effort. We show that our framework is able to exploit latent attributes to outperform contemporary approaches on a variety of realistic multimedia sparse-data learning tasks, including multi-task learning, learning with label noise, N-shot transfer learning and, importantly, zero-shot learning.
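    To make the idea of a semi-latent attribute space concrete, the sketch below concatenates user-defined attribute values with latent topics from an off-the-shelf LDA topic model and performs zero-shot classification by matching videos to class attribute signatures. It is a minimal illustration on synthetic toy data, not the authors' model; the feature dimensions, attribute names and class signatures are all hypothetical.

```python
# Minimal illustrative sketch (not the paper's model): build a "semi-latent"
# attribute space by concatenating user-defined attribute values with latent
# topics from a standard LDA topic model, then do zero-shot classification by
# nearest class attribute signature. All data below is synthetic/hypothetical.

import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(0)

# Hypothetical multimodal bag-of-words counts (e.g. visual + audio codewords)
# for 40 training videos and 5 test videos over a 200-codeword vocabulary.
X_train = rng.poisson(1.0, size=(40, 200))
X_test = rng.poisson(1.0, size=(5, 200))

# Hypothetical user-defined attributes ("outdoor", "music", "crowd").
# For test videos these would normally be *predicted* by attribute
# classifiers; random values stand in for that step here.
A_train = rng.integers(0, 2, size=(40, 3)).astype(float)
A_test = rng.integers(0, 2, size=(5, 3)).astype(float)

# Latent attributes: per-video topic proportions from an LDA model.
lda = LatentDirichletAllocation(n_components=8, random_state=0)
Z_train = lda.fit_transform(X_train)
Z_test = lda.transform(X_test)

# Semi-latent attribute space = user-defined block + latent block.
S_train = np.hstack([A_train, Z_train])   # would feed downstream training
S_test = np.hstack([A_test, Z_test])

# Zero-shot step: unseen classes are described only by attribute signatures
# (prototypes in the same space); each test video gets the nearest signature.
signatures = {
    "birthday_party": np.concatenate([[1.0, 1.0, 1.0], np.full(8, 1 / 8)]),
    "parade":         np.concatenate([[1.0, 0.0, 1.0], np.full(8, 1 / 8)]),
}
names = list(signatures)
protos = np.stack([signatures[n] for n in names])
pred = [names[np.argmin(np.linalg.norm(protos - s, axis=1))] for s in S_test]
print(pred)
```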

    Social Media Text Mining Framework for Drug Abuse: An Opioid Crisis Case Analysis

    Social media is considered a promising and viable source of data for gaining insights into various disease conditions, patients' attitudes and behaviors, and medications. The daily use of social media provides new opportunities for analyzing several aspects of communication. Social media, as a big data source, can be used to recognize communication and behavioral themes of problematic use of prescription drugs. Mining and analyzing such media have challenges and limitations with respect to topic deduction and data quality. There is a need for a structured approach to efficiently and effectively analyze social media content related to drug abuse in a manner that can mitigate the challenges surrounding the use of this data source. Following a design science research methodology, the research aims at developing and evaluating a framework for mining and analyzing social media content related to drug abuse in a manner that mitigates challenges and limitations related to topic deduction and data quality. The framework consists of four phases: Topic Discovery and Detection; Data Collection; Data Preparation and Quality; and Analysis and Results. The topic discovery and detection phase consists of a topic expansion stage for the drug-abuse-related topics that address the research domain and objectives. The topic expansion is based on different terms related to keywords, categories, and characteristics of the topic of interest and the objective of monitoring. To formalize the process and supporting artifacts, we create an ontology for drug abuse that captures the different categories that exist in the topic expansion and the literature. The data collection phase is characterized by the date range, social media platforms, search keywords, and a set of inclusion/exclusion criteria. The data preparation and quality phase is mainly concerned with obtaining high-quality data to mitigate problems with data veracity. In this phase, we pre-process the collected data and then evaluate its quality, with respect to the terms and objectives of the research topic phase, using a data quality evaluation matrix. Finally, in the data analysis phase, the researcher can choose the suitable analysis approach. We used a combination of unsupervised and supervised machine learning approaches, including opinion and content analysis modeling. We demonstrate and evaluate the applicability of the proposed framework to identify common concerns toward the opioid crisis from two perspectives: the addicted users' perspective and the public's (non-addicted users') perspective. In both cases, data is collected from Twitter using Crimson Hexagon, a social media analytics tool for data collection and analysis. Natural language processing is used for data preparation and pre-processing. Different data visualization techniques, such as word clouds and clustering visualizations, are used to form a deeper understanding of the relationships among the identified themes for the selected communities. The results help in understanding the concerns of the public and of opioid addicts towards the opioid crisis in the United States. Results of this study could help in understanding aspects of the problem and provide key input when it comes to defining and implementing innovative solutions/strategies to face the opioid epidemic. From a theoretical perspective, this study highlights the importance of developing and adapting text mining techniques to social media for drug abuse.
    This study proposes a social media text mining framework for drug abuse research that leads to good-quality datasets. Emphasis is placed on developing methods for improving the discovery and identification of topics in social media domains characterized by a plethora of highly diverse terms and the lack of a commonly shared dictionary/language within the community, as in the opioid and drug abuse case. From a practical perspective, automatically analyzing social media users' posts using machine learning tools can help in understanding the public themes and topics present in recent discussions among users of social media networks. This could help in developing proper mitigation strategies; for example, insights gained from the discussion topics could make opioid media campaigns more effective in preventing opioid misuse. Finally, the study helps address some elements of the U.S. Department of Health and Human Services (HHS) five-point strategy by providing a systematic approach that could support conducting better research on addiction and drug abuse and strengthening public health data reporting and collection using social media data.
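    As a rough illustration of how the four phases fit together, the sketch below runs a seed-keyword expansion, a stubbed collection step, simple NLP cleaning with a crude quality filter, and unsupervised theme discovery on a handful of made-up posts. It is only a minimal sketch of the general workflow, not the study's pipeline (which used Crimson Hexagon for collection and an ontology for topic expansion); all keywords, example posts and model settings are hypothetical.

```python
# Minimal sketch of the framework's four phases on toy data (not the study's
# actual pipeline). All keywords and example posts are hypothetical.

import re
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import NMF

# Phase 1: topic discovery and detection -- expand seed terms for the
# drug-abuse domain (ontology-guided in the study).
seed_terms = ["opioid", "oxycodone", "fentanyl", "withdrawal", "overdose"]

# Phase 2: data collection -- stand-in for posts retrieved by keyword search
# within a date range and inclusion/exclusion criteria.
posts = [
    "Day 3 of withdrawal, the cravings are unbearable",
    "New CDC guidance on opioid prescriptions announced today",
    "Lost a friend to a fentanyl overdose last year",
    "Doctor switched me from oxycodone, the pain is back",
]

# Phase 3: data preparation and quality -- lower-case, strip non-letters,
# and drop posts matching none of the seed terms (a crude quality filter).
def clean(text):
    return re.sub(r"[^a-z\s]", " ", text.lower())

prepared = [clean(p) for p in posts]
prepared = [p for p in prepared if any(t in p for t in seed_terms)]

# Phase 4: analysis -- unsupervised theme discovery (an NMF topic model here;
# the study combined unsupervised and supervised approaches).
vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(prepared)
nmf = NMF(n_components=2, init="nndsvda", random_state=0)
nmf.fit(X)
terms = vec.get_feature_names_out()
for k, comp in enumerate(nmf.components_):
    top = [terms[i] for i in comp.argsort()[::-1][:4]]
    print(f"theme {k}: {top}")
```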

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    After addressing the state of the art during the first year of CHORUS and establishing the existing landscape in multimedia search engines, we identified and analyzed gaps within the European research effort during our second year. In this period we focused on three directions, notably technological issues, user-centred issues and use-cases, and socio-economic and legal aspects. These were assessed through two central studies: firstly, a concerted vision of the functional breakdown of a generic multimedia search engine, and secondly, representative use-case descriptions with the related discussion of requirements and technological challenges. Both studies were carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations at international conferences, and surveys addressed to EU project coordinators as well as national initiative coordinators. Based on the feedback obtained, we identified two types of gaps, namely core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal challenges.

    Design issues for agent-based resource locator systems

    While knowledge is viewed by many as an asset, it is often difficult to locate particular items within a large electronic corpus. This paper presents an agent-based framework for the location of resources to resolve a specific query, and considers the associated design issues. Aspects of the work presented complement current research into both expertise finders and recommender systems. The essential issues for the proposed design are scalability, together with the ability to learn and adapt to changing resources. As knowledge is often implicit within electronic resources, and therefore difficult to locate, we have proposed the use of ontologies to extract the semantics and infer meaning, in order to obtain the required results. We explore the use of communities of practice, applying ontology-based networks and e-mail message exchanges to aid the resource discovery process.
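    As a rough, self-contained illustration of the ontology-assisted lookup described above, the sketch below expands a query through a toy ontology and ranks resources by term overlap. It is not the paper's architecture (there are no agents or e-mail mining here); the ontology, resource corpus and scoring function are hypothetical placeholders.

```python
# Toy sketch of ontology-assisted resource location (not the paper's system):
# expand query terms via a small concept ontology, then rank resources by
# overlap with the expanded query. All data below is made up.

from collections import Counter

# Toy ontology: concept -> semantically related terms.
ontology = {
    "machine learning": {"classification", "clustering", "neural networks"},
    "knowledge management": {"ontology", "expertise", "communities of practice"},
}

# Toy resource corpus: id -> descriptive terms (e.g. extracted from documents
# or e-mail exchanges within a community of practice).
resources = {
    "doc-1": {"expertise", "finder", "e-mail", "communities of practice"},
    "doc-2": {"clustering", "recommender", "classification"},
    "doc-3": {"legacy", "payroll", "cobol"},
}

def expand(query_terms):
    """Expand query terms with ontology concepts they name or belong to."""
    expanded = set(query_terms)
    for concept, related in ontology.items():
        if concept in expanded or expanded & related:
            expanded |= related | {concept}
    return expanded

def locate(query_terms, top_n=2):
    """Rank resources by overlap with the expanded query."""
    expanded = expand(query_terms)
    scores = Counter({rid: len(expanded & terms) for rid, terms in resources.items()})
    return [rid for rid, score in scores.most_common(top_n) if score > 0]

print(locate({"expertise"}))  # -> ['doc-1'] under this toy data
```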

    Introduction to the special issue on cross-language algorithms and applications

    With the increasingly global nature of our everyday interactions, the need for multilingual technologies to support efficient and effective information access and communication cannot be overemphasized. Computational modeling of language has been the focus of Natural Language Processing, a subdiscipline of Artificial Intelligence. One of the current challenges for this discipline is to design methodologies and algorithms that are cross-language in order to create multilingual technologies rapidly. The goal of this JAIR special issue on Cross-Language Algorithms and Applications (CLAA) is to present leading research in this area, with emphasis on developing unifying themes that could lead to the development of the science of multi- and cross-lingualism. In this introduction, we provide the reader with the motivation for this special issue and summarize the contributions of the papers that have been included. The selected papers cover a broad range of cross-lingual technologies including machine translation, domain and language adaptation for sentiment analysis, cross-language lexical resources, dependency parsing, information retrieval and knowledge representation. We anticipate that this special issue will serve as an invaluable resource for researchers interested in topics of cross-lingual natural language processing.

    Applied Evaluative Informetrics: Part 1

    This manuscript is a preprint version of Part 1 (General Introduction and Synopsis) of the book Applied Evaluative Informetrics, to be published by Springer in the summer of 2017. This book presents an introduction to the field of applied evaluative informetrics, and is written for interested scholars and students from all domains of science and scholarship. It sketches the field's history, recent achievements, and its potential and limits. It explains the notion of multi-dimensional research performance, and discusses the pros and cons of 28 citation-, patent-, reputation- and altmetrics-based indicators. In addition, it presents quantitative research assessment as an evaluation science, and focuses on the role of extra-informetric factors in the development of indicators, and on the policy context of their application. It also discusses the way forward, both for users and for developers of informetric tools.

    Medical data processing and analysis for remote health and activities monitoring

    Recent developments in sensor technology, wearable computing, the Internet of Things (IoT), and wireless communication have given rise to research in ubiquitous healthcare and remote monitoring of human health and activities. Health monitoring systems involve processing and analysis of data retrieved from smartphones, smart watches, smart bracelets, as well as various sensors and wearable devices. Such systems enable continuous monitoring of patients' psychological and health conditions by sensing and transmitting measurements such as heart rate, electrocardiogram, body temperature, respiratory rate, chest sounds, or blood pressure. Pervasive healthcare, as a relevant application domain in this context, aims at revolutionizing the delivery of medical services through a medical assistive environment and facilitates the independent living of patients. In this chapter, we discuss (1) data collection, fusion, ownership and privacy issues; (2) models, technologies and solutions for medical data processing and analysis; (3) big medical data analytics for remote health monitoring; (4) research challenges and opportunities in medical data analytics; and (5) examples of case studies and practical solutions.
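    To give a concrete flavour of the kind of processing such systems perform, the sketch below smooths a stream of wearable heart-rate samples and flags readings outside a configurable alert range. It is a minimal, hypothetical example of one pipeline step, not the chapter's system; the sample values, window size and thresholds are assumptions.

```python
# Minimal sketch of one remote-monitoring processing step: smooth a heart-rate
# stream from a wearable sensor and flag values outside an alert range.
# Sample data and thresholds below are hypothetical.

from statistics import mean

heart_rate_bpm = [72, 75, 74, 180, 178, 176, 73, 71, 70, 69]  # raw sensor stream

def moving_average(samples, window=3):
    """Simple smoothing to suppress single-sample sensor glitches."""
    return [mean(samples[max(0, i - window + 1): i + 1]) for i in range(len(samples))]

def alerts(samples, low=50, high=120):
    """Return (index, smoothed value) pairs outside the configured range."""
    smoothed = moving_average(samples)
    return [(i, round(v, 1)) for i, v in enumerate(smoothed) if not low <= v <= high]

# Sustained elevated readings trigger alerts; isolated noise is damped.
print(alerts(heart_rate_bpm))
```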

    City networks in cyberspace and time: using Google hyperlinks to measure global economic and environmental crises

    Geographers and social scientists have long been interested in ranking and classifying the cities of the world. The cutting edge of this research is characterized by a recognition of the crucial importance of information and, specifically, ICTs to cities' positions in the current Knowledge Economy. This chapter builds on recent “cyberspace” analyses of the global urban system by arguing for, and demonstrating empirically, the value of Web search engine data as a means of understanding cities as situated within, and constituted by, flows of digital information. To this end, we show how the Google search engine can be used to specify a dynamic, informational classification of North American cities based on both the production and the consumption of Web information about two prominent current issues of global scope: the global financial crisis and global climate change.
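    The sketch below illustrates the general idea of an informational classification in miniature: given per-city counts of Web pages referencing each issue, it normalises the counts into issue profiles and groups cities with similar profiles. It is not the chapter's method or data; the cities, counts and clustering choice are all hypothetical stand-ins.

```python
# Toy sketch of an "informational classification" of cities (not the chapter's
# data or method): normalise per-city page counts for each issue into profiles
# and cluster cities with similar profiles. All counts below are made up.

import numpy as np
from sklearn.cluster import KMeans

cities = ["New York", "Toronto", "Houston", "Calgary"]
issues = ["financial crisis", "climate change"]

# Hypothetical page counts (rows = cities, columns = issues), e.g. the number
# of results returned by queries combining a city name with an issue term.
counts = np.array([
    [9200, 4100],
    [2100, 1800],
    [1500,  900],
    [ 600, 1700],
], dtype=float)

# Normalise each city's row so the classification reflects the *mix* of
# issue-related information rather than overall city size.
profiles = counts / counts.sum(axis=1, keepdims=True)

# Group cities with similar informational profiles.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(profiles)
for city, label, profile in zip(cities, labels, profiles):
    print(f"{city}: cluster {label}, profile {profile.round(2)}")
```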