530 research outputs found

    Explainable pattern modelling and summarization in sensor equipped smart homes of elderly

    Get PDF
    In the next several decades, the proportion of the elderly population is expected to increase significantly. This has led to various efforts to help live them independently for longer periods of time. Smart homes equipped with sensors provide a potential solution by capturing various behavioral and physiological patterns of the residents. In this work, we develop techniques to model and detect changes in these patterns. The focus is on methods that are explainable in nature and allow for generating natural language descriptions. We propose a comprehensive change description framework that can detect unusual changes in the sensor parameters and describe the data leading to those changes in natural language. An approach that models and detects variations in physiological and behavioral routines of the elderly forms one part of the change description framework. The second part comes from a natural language generation system in which we identify important health-relevant features from the sensor parameters. Throughout this dissertation, we validate the developed techniques using both synthetic and real data obtained from the homes of the elderly living in sensor-equipped facilities. Using multiple real data retrospective case studies, we show that our methods are able to detect variations in the sensor data that are correlated with important health events in the elderly as recorded in their Electronic Health Records.Includes bibliographical reference

    Data Mining Techniques to Understand Textual Data

    Get PDF
    More than ever, information delivery online and storage heavily rely on text. Billions of texts are produced every day in the form of documents, news, logs, search queries, ad keywords, tags, tweets, messenger conversations, social network posts, etc. Text understanding is a fundamental and essential task involving broad research topics, and contributes to many applications in the areas text summarization, search engine, recommendation systems, online advertising, conversational bot and so on. However, understanding text for computers is never a trivial task, especially for noisy and ambiguous text such as logs, search queries. This dissertation mainly focuses on textual understanding tasks derived from the two domains, i.e., disaster management and IT service management that mainly utilizing textual data as an information carrier. Improving situation awareness in disaster management and alleviating human efforts involved in IT service management dictates more intelligent and efficient solutions to understand the textual data acting as the main information carrier in the two domains. From the perspective of data mining, four directions are identified: (1) Intelligently generate a storyline summarizing the evolution of a hurricane from relevant online corpus; (2) Automatically recommending resolutions according to the textual symptom description in a ticket; (3) Gradually adapting the resolution recommendation system for time correlated features derived from text; (4) Efficiently learning distributed representation for short and lousy ticket symptom descriptions and resolutions. Provided with different types of textual data, data mining techniques proposed in those four research directions successfully address our tasks to understand and extract valuable knowledge from those textual data. My dissertation will address the research topics outlined above. Concretely, I will focus on designing and developing data mining methodologies to better understand textual information, including (1) a storyline generation method for efficient summarization of natural hurricanes based on crawled online corpus; (2) a recommendation framework for automated ticket resolution in IT service management; (3) an adaptive recommendation system on time-varying temporal correlated features derived from text; (4) a deep neural ranking model not only successfully recommending resolutions but also efficiently outputting distributed representation for ticket descriptions and resolutions

    Mining Social Media for Newsgathering: A Review

    Get PDF
    Social media is becoming an increasingly important data source for learning about breaking news and for following the latest developments of ongoing news. This is in part possible thanks to the existence of mobile devices, which allows anyone with access to the Internet to post updates from anywhere, leading in turn to a growing presence of citizen journalism. Consequently, social media has become a go-to resource for journalists during the process of newsgathering. Use of social media for newsgathering is however challenging, and suitable tools are needed in order to facilitate access to useful information for reporting. In this paper, we provide an overview of research in data mining and natural language processing for mining social media for newsgathering. We discuss five different areas that researchers have worked on to mitigate the challenges inherent to social media newsgathering: news discovery, curation of news, validation and verification of content, newsgathering dashboards, and other tasks. We outline the progress made so far in the field, summarise the current challenges as well as discuss future directions in the use of computational journalism to assist with social media newsgathering. This review is relevant to computer scientists researching news in social media as well as for interdisciplinary researchers interested in the intersection of computer science and journalism.Comment: Accepted for publication in Online Social Networks and Medi

    A visual analytics platform for competitive intelligence

    Get PDF
    Silva, D., & Bação, F. (2023). MapIntel: A visual analytics platform for competitive intelligence. Expert Systems, [e13445]. https://doi.org/https://www.authorea.com/doi/full/10.22541/au.166785335.50477185, https://doi.org/10.1111/exsy.13445 --- Funding Information: This work was supported by the (research grant under the DSAIPA/DS/0116/2019 project). Fundação para a Ciência e Tecnologia of Ministério da Ciência e Tecnologia e Ensino SuperiorCompetitive Intelligence allows an organization to keep up with market trends and foresee business opportunities. This practice is mainly performed by analysts scanning for any piece of valuable information in a myriad of dispersed and unstructured sources. Here we present MapIntel, a system for acquiring intelligence from vast collections of text data by representing each document as a multidimensional vector that captures its own semantics. The system is designed to handle complex Natural Language queries and visual exploration of the corpus, potentially aiding overburdened analysts in finding meaningful insights to help decision-making. The system searching module uses a retriever and re-ranker engine that first finds the closest neighbours to the query embedding and then sifts the results through a cross-encoder model that identifies the most relevant documents. The browsing or visualization module also leverages the embeddings by projecting them onto two dimensions while preserving the multidimensional landscape, resulting in a map where semantically related documents form topical clusters which we capture using topic modelling. This map aims at promoting a fast overview of the corpus while allowing a more detailed exploration and interactive information encountering process. We evaluate the system and its components on the 20 newsgroups data set, using the semantic document labels provided, and demonstrate the superiority of Transformer-based components. Finally, we present a prototype of the system in Python and show how some of its features can be used to acquire intelligence from a news article corpus we collected during a period of 8 months.preprintauthorsversionepub_ahead_of_prin

    Mapping the Current Landscape of Research Library Engagement with Emerging Technologies in Research and Learning: Final Report

    Get PDF
    The generation, dissemination, and analysis of digital information is a significant driver, and consequence, of technological change. As data and information stewards in physical and virtual space, research libraries are thoroughly entangled in the challenges presented by the Fourth Industrial Revolution:1 a societal shift powered not by steam or electricity, but by data, and characterized by a fusion of the physical and digital worlds.2 Organizing, structuring, preserving, and providing access to growing volumes of the digital data generated and required by research and industry will become a critically important function. As partners with the community of researchers and scholars, research libraries are also recognizing and adapting to the consequences of technological change in the practices of scholarship and scholarly communication. Technologies that have emerged or become ubiquitous within the last decade have accelerated information production and have catalyzed profound changes in the ways scholars, students, and the general public create and engage with information. The production of an unprecedented volume and diversity of digital artifacts, the proliferation of machine learning (ML) technologies,3 and the emergence of data as the “world’s most valuable resource,”4 among other trends, present compelling opportunities for research libraries to contribute in new and significant ways to the research and learning enterprise. Librarians are all too familiar with predictions of the research library’s demise in an era when researchers have so much information at their fingertips. A growing body of evidence provides a resounding counterpoint: that the skills, experience, and values of librarians, and the persistence of libraries as an institution, will become more important than ever as researchers contend with the data deluge and the ephemerality and fragility of much digital content. This report identifies strategic opportunities for research libraries to adopt and engage with emerging technologies,5 with a roughly fiveyear time horizon. It considers the ways in which research library values and professional expertise inform and shape this engagement, the ways library and library worker roles will be reconceptualized, and the implication of a range of technologies on how the library fulfills its mission. The report builds on a literature review covering the last five years of published scholarship, primarily North American information science literature, and interviews with a dozen library field experts, completed in fall 2019. It begins with a discussion of four cross-cutting opportunities that permeate many or all aspects of research library services. Next, specific opportunities are identified in each of five core research library service areas: facilitating information discovery, stewarding the scholarly and cultural record, advancing digital scholarship, furthering student learning and success, and creating learning and collaboration spaces. Each section identifies key technologies shaping user behaviors and library services, and highlights exemplary initiatives. Underlying much of the discussion in this report is the idea that “digital transformation is increasingly about change management”6 —that adoption of or engagement with emerging technologies must be part of a broader strategy for organizational change, for “moving emerging work from the periphery to the core,”7 and a broader shift in conceptualizing the research library and its services. Above all, libraries are benefitting from the ways in which emerging technologies offer opportunities to center users and move from a centralized and often siloed service model to embedded, collaborative engagement with the research and learning enterprise

    COSPO/CENDI Industry Day Conference

    Get PDF
    The conference's objective was to provide a forum where government information managers and industry information technology experts could have an open exchange and discuss their respective needs and compare them to the available, or soon to be available, solutions. Technical summaries and points of contact are provided for the following sessions: secure products, protocols, and encryption; information providers; electronic document management and publishing; information indexing, discovery, and retrieval (IIDR); automated language translators; IIDR - natural language capabilities; IIDR - advanced technologies; IIDR - distributed heterogeneous and large database support; and communications - speed, bandwidth, and wireless
    corecore