38 research outputs found
Understanding Human Actions in Video
Understanding human behavior is crucial for any autonomous system which interacts with humans. For example, assistive robots need to know when a person is signaling for help, and autonomous vehicles need to know when a person is waiting to cross the street. However, identifying human actions in video is a challenging and unsolved problem. In this work, we address several of the key challenges in human action recognition. To enable better representations of video sequences, we develop novel deep learning architectures which improve representations both at the level of instantaneous motion as well as at the level of long-term context. In addition, to reduce reliance on fixed action vocabularies, we develop a compositional representation of actions which allows novel action descriptions to be represented as a sequence of sub-actions. Finally, we address the issue of data collection for human action understanding by creating a large-scale video dataset, consisting of 70 million videos collected from internet video sharing sites and their matched descriptions. We demonstrate that these contributions improve the generalization performance of human action recognition systems on several benchmark datasets.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/162887/1/stroud_1.pd
Social media mining under the COVID-19 context: Progress, challenges, and opportunities
Social media platforms allow users worldwide to create and share information, forging vast sensing networks that
allow information on certain topics to be collected, stored, mined, and analyzed in a rapid manner. During the
COVID-19 pandemic, extensive social media mining efforts have been undertaken to tackle COVID-19 challenges
from various perspectives. This review summarizes the progress of social media data mining studies in the
COVID-19 contexts and categorizes them into six major domains, including early warning and detection, human
mobility monitoring, communication and information conveying, public attitudes and emotions, infodemic and
misinformation, and hatred and violence. We further document essential features of publicly available COVID-19
related social media data archives that will benefit research communities in conducting replicable and repro�ducible studies. In addition, we discuss seven challenges in social media analytics associated with their potential
impacts on derived COVID-19 findings, followed by our visions for the possible paths forward in regard to social
media-based COVID-19 investigations. This review serves as a valuable reference that recaps social media mining
efforts in COVID-19 related studies and provides future directions along which the information harnessed from
social media can be used to address public health emergencies
Recommended from our members
Continually improving grounded natural language understanding through human-robot dialog
As robots become ubiquitous in homes and workplaces such as hospitals and factories, they must be able to communicate with humans. Several kinds of knowledge are required to understand and respond to a human's natural language commands and questions. If a person requests an assistant robot to take me to Alice's office, the robot must know that Alice is a person who owns some unique office, and that take me means it should navigate there. Similarly, if a person requests bring me the heavy, green mug, the robot must have accurate mental models of the physical concepts heavy, green, and mug. To avoid forcing humans to use key phrases or words robots already know, this thesis focuses on helping robots understanding new language constructs through interactions with humans and with the world around them. To understand a command in natural language, a robot must first convert that command to an internal representation that it can reason with. Semantic parsing is a method for performing this conversion, and the target representation is often semantic forms represented as predicate logic with lambda calculus. Traditional semantic parsing relies on hand-crafted resources from a human expert: an ontology of concepts, a lexicon connecting language to those concepts, and training examples of language with abstract meanings. One thrust of this thesis is to perform semantic parsing with sparse initial data. We use the conversations between a robot and human users to induce pairs of natural language utterances with the target semantic forms a robot discovers through its questions, reducing the annotation effort of creating training examples for parsing. We use this data to build more dialog-capable robots in new domains with much less expert human effort (Thomason et al., 2015; Padmakumar et al., 2017). Meanings of many language concepts are bound to the physical world. Understanding object properties and categories, such as heavy, green, and mug requires interacting with and perceiving the physical world. Embodied robots can use manipulation capabilities, such as pushing, picking up, and dropping objects to gather sensory data about them. This data can be used to understand non-visual concepts like heavy and empty (e.g. get the empty carton of milk from the fridge), and assist with concepts that have both visual and non-visual expression (e.g. tall things look big and also exert force sooner than short things when pressed down on). A second thrust of this thesis focuses on strategies for learning these concepts using multi-modal sensory information. We use human-in-the-loop learning to get labels between concept words and actual objects in the environment (Thomason et al., 2016, 2017). We also explore ways to tease out polysemy and synonymy in concept words (Thomason and Mooney, 2017) such as light, which can refer to a weight or a color, the latter sense being synonymous with pale. Additionally, pushing, picking up, and dropping objects to gather sensory information is prohibitively time-consuming, so we investigate strategies for using linguistic information and human input to expedite exploration when learning a new concept (Thomason et al., 2018). Finally, we build an integrated agent with both parsing and perception capabilities that learns from conversations with users to improve both components over time. We demonstrate that parser learning from conversations (Thomason et al., 2015) can be combined with multi-modal perception (Thomason et al., 2016) using predicate-object labels gathered through opportunistic active learning (Thomason et al., 2017) during those conversations to improve performance for understanding natural language commands from humans. Human users also qualitatively rate this integrated learning agent as more usable after it has improved from conversation-based learning.Computer Science
Big data analytics: a predictive analysis applied to cybersecurity in a financial organization
Project Work presented as partial requirement for obtaining the Master’s degree in Information Management, with a specialization in Knowledge Management and Business IntelligenceWith the generalization of the internet access, cyber attacks have registered an alarming growth in frequency and severity of damages, along with the awareness of organizations with heavy investments in cybersecurity, such as in the financial sector. This work is focused on an organization’s financial service that operates on the international markets in the payment systems industry. The objective was to develop a predictive framework solution responsible for threat detection to support the security team to open investigations on intrusive server requests, over the exponentially growing log events collected by the SIEM from the Apache Web Servers for the financial service.
A Big Data framework, using Hadoop and Spark, was developed to perform classification tasks over the financial service requests, using Neural Networks, Logistic Regression, SVM, and Random Forests algorithms, while handling the training of the imbalance dataset through BEV. The main conclusions over the analysis conducted, registered the best scoring performances for the Random Forests classifier using all the preprocessed features available. Using the all the available worker nodes with a balanced configuration of the Spark executors, the most performant elapsed times for loading and preprocessing of the data were achieved using the column-oriented ORC with native format, while the row-oriented CSV format performed the best for the training of the classifiers.Com a generalização do acesso à internet, os ciberataques registaram um crescimento alarmante em frequência e severidade de danos causados, a par da consciencialização das organizações, com elevados investimentos em cibersegurança, como no setor financeiro. Este trabalho focou-se no serviço financeiro de uma organização que opera nos mercados internacionais da indústria de sistemas de pagamento. O objetivo consistiu no desenvolvimento uma solução preditiva responsável pela detecção de ameaças, por forma a dar suporte à equipa de segurança na abertura de investigações sobre pedidos intrusivos no servidor, relativamente aos exponencialmente crescentes eventos de log coletados pelo SIEM, referentes aos Apache Web Servers, para o serviço financeiro.
Uma solução de Big Data, usando Hadoop e Spark, foi desenvolvida com o objectivo de executar tarefas de classificação sobre os pedidos do serviço financeiros, usando os algoritmos Neural Networks, Logistic Regression, SVM e Random Forests, solucionando os problemas associados ao treino de um dataset desequilibrado atravĂ©s de BEV. As principais conclusões sobre as análises realizadas registaram os melhores resultados de classificação usando o algoritmo Random Forests com todas as variáveis prĂ©-processadas disponĂveis. Usando todos os nĂłs do cluster e uma configuração balanceada dos executores do Spark, os melhores tempos para carregar e prĂ©-processar os dados foram obtidos usando o formato colunar ORC nativo, enquanto o formato CSV, orientado a linhas, apresentou os melhores tempos para o treino dos classificadores
Exploring Supervised Techniques for Automated Recognition of Intention Classes from Portuguese Free Texts on Agriculture
Technical and scientific knowledge is vast and complex, particularly in interdisciplinary fields such as sustainable agriculture, which is available in several interrelated, geographically dispersed and interdisciplinary online textual information sources. In this context, it is essential to support people with computational mechanisms that allow them to retrieve and interpret information in an appropriate way, as communication in these software systems is typically asynchronous and textual. User’s intention recognition and analysis in textual documents results in benefits for better information retrieval. However, intentions are expressed implicitly in texts in natural language and the specificities of the domain and cultural aspects of language make it difficult to process and analyze the text by computer systems. This requires the study of methods for the automatic recognition of intention classes in text. In this article, we conduct extensive experimental analyses on techniques based on language models and machine learning to detect instances of intention classes in texts about sustainable agriculture written in Portuguese. In our methodology, we perform a morphological analysis of the sentences and evaluate four Word Embeddings techniques (Word2Vec, Wang2Vec, FastText and Glove) combined with four machine learning techniques (Support Vector Machine, Artificial Neural Network, Random Forest and Transfer Learning). The results obtained by applying the techniques proposed in a database with textual information on sustainable agriculture indicate promising possibilities in the recognition of intentions in free texts  in Portuguese language on sustainable agriculture
Recommended from our members
Dialog as a vehicle for lifelong learning of grounded language understanding systems
Natural language interfaces have the potential to make various forms of technology, including mobile phones and computers as well as robots or other machines such as ATMs and self-checkout counters, more accessible and less intimidating to users who are unfamiliar or uncomfortable with other types of interfaces. In particular, natural language understanding systems on physical robots face a number of challenges, including the need to ground language in perception, the ability to adapt to changes in the environment and novel uses of language, and to deal with uncertainty in understanding. To effectively handle these challenges, such systems need to perform lifelong learning - continually updating the scope and predictions of the model with user interactions. In this thesis, we discuss ways in which dialog interaction with users can be used to improve grounded natural language understanding systems, motivated by service robot applications.
We focus on two types of queries that can be used in such dialog systems – active learning queries to elicit knowledge about the environment that can be used to improve perceptual models, and clarification questions that confirm the system’s hypotheses, or elicit specific information required to complete a task. Our goal is to build a system that can learn how to interact with users balancing a quick completion of tasks desired by the user with asking additional active learning questions viiito improve the underlying grounded language understanding components.
We present work on jointly improving semantic parsers from and learning a dialog policy for clarification dialogs, that improve a robot’s ability to understand natural language commands. We introduce the framework of opportunistic active learning, where a robot introduces opportunistic queries, that may not be immediately relevant, into an interaction in the hope of improving performance in future interactions. We demonstrate the usefulness of this framework in learning to ground natural language descriptions of objects, and learn a dialog policy for such interactions. We also learn dialog policies that balance task completion, opportunistic active learning, and attribute-based clarification questions. Finally, we attempt to expand this framework to different types of underlying models of grounded language understanding.Computer Science