317 research outputs found

    Discovering Social Events through Online Attention

    Twitter is a major social media platform in which users send and read messages (“tweets”) of up to 140 characters. In recent years this communication medium has been used by those affected by crises to organize demonstrations or find relief. Because traffic on this media platform is extremely heavy, with hundreds of millions of tweets sent every day, it is difficult to differentiate between times of turmoil and times of typical discussion. In this work we present a new approach to addressing this problem. We first assess several possible “thermostats” of activity on social media for their effectiveness in finding important time periods. We compare methods commonly found in the literature with a method from economics. By combining methods from computational social science with methods from economics, we introduce an approach that can effectively locate crisis events in the mountains of data generated on Twitter. We demonstrate the strength of this method by using it to locate the social events relating to the Occupy Wall Street movement protests at the end of 2011. The article is published at http://journals.plos.org/plosone/article?id=10.1371/journal.pone.010200

    Multi-Rater Consensus Learning for Modeling Multiple Sparse Ratings of Affective Behaviour

    The use of multiple raters to label datasets is an established practice in affective computing. The principal goal is to reduce unwanted subjective bias in the labelling process. Unfortunately, this leads to the key problem of identifying a ground truth for training the affect recognition system. This problem becomes more relevant in a sparsely-crossed annotation where each rater only labels a portion of the full dataset to ensure a manageable workload per rater. In this paper, we introduce a Multi-Rater Consensus Learning (MRCL) method which learns a representative affect recognition model that accounts for each rater’s agreement with the other raters. MRCL combines a multitask learning (MTL) regularizer and a consensus loss. Unlike standard MTL, this approach allows the model to learn to predict each rater’s label while explicitly accounting for the consensus among raters. We evaluated our approach on two different datasets based on spontaneous affective body movement expressions for pain behaviour detection and laughter type recognition respectively. The two naturalistic datasets were chosen for the different forms of labelling (different in affect, observation stimuli, and raters) that they together offer for evaluating our approach. Empirical results demonstrate that MRCL is effective for modelling affect from datasets with sparsely-crossed multi-rater annotation

    Integration Of Multi-Sensory Earth Observations For Characterization Of Air Quality Events

    In order to characterize air quality events, such as dust storms or smoke events from fires, a wide variety of Earth observations are needed from satellites, surface monitors and models. Traditionally, the burden of data access and processing was placed on the data user. These challenges of finding, accessing and merging data are overcome through the principles of Service Oriented Architecture. This thesis describes the collaborative, service-oriented approach now available for air quality event analysis, where datasets are turned into services that can be accessed by tools through standard queries. This thesis extends AQ event evidence to include photos, videos and personal observations gathered from social media websites such as Flickr, Twitter and YouTube. In this thesis, the service-oriented approach is demonstrated using two case studies. The first explains the benefits of data reuse in real-time event analysis focusing on the 2009 Southern California Smoke event. The second case study highlights post-event analysis for EPA’s Exceptional Event Rule. The thesis concludes with a first attempt to quantify the benefits of data reuse by identifying all of the different user requirements for Earth observation data. We found that the real-time and post-event analysis had 68 unique Earth observation requirements making it an ideal example for illustrating the benefits of service oriented architecture for air quality analysis. While this thesis focuses on the air quality domain, the tools and methods can be applied to any area that needs distributed data

    Addressing training data sparsity and interpretability challenges in AI based cellular networks

    To meet the diverse and stringent communication requirements for emerging network use cases, zero-touch artificial intelligence (AI) based deep automation in cellular networks is envisioned. However, the full potential of AI in cellular networks remains hindered by two key challenges: (i) training data is not as freely available in cellular networks as in other fields where AI has made a profound impact and (ii) current AI models tend to have black box behavior, making operators reluctant to entrust the operation of multibillion mission critical networks to a black box AI engine, which allows little insight into, or discovery of, relationships between the configuration and optimization parameters and key performance indicators. This dissertation systematically addresses and proposes solutions to these two key problems faced by emerging networks. A framework towards addressing the training data sparsity challenge in cellular networks is developed that can assist network operators and researchers in choosing the optimal data enrichment technique for different network scenarios, based on the available information. The framework spans classical interpolation techniques, such as inverse distance weighting and kriging; more advanced ML-based methods, such as transfer learning and generative adversarial networks; several new techniques, such as matrix completion theory and leveraging different types of network geometries; and simulators and testbeds, among others. The proposed framework will lead to more accurate ML models that rely on a sufficient amount of representative training data. Moreover, solutions are proposed to address the data sparsity challenge specifically in Minimization of Drive Tests (MDT) based automation approaches. MDT allows coverage to be estimated at the base station by exploiting measurement reports gathered by the user equipment without the need for drive tests. 
Thus, MDT is a key enabling feature for data and artificial intelligence driven autonomous operation and optimization in current and emerging cellular networks. However, to date, the utility of the MDT feature remains thwarted by issues such as sparsity of user reports and user positioning inaccuracy. For the first time, this dissertation reveals the existence of an optimal bin width for coverage estimation in the presence of inaccurate user positioning, scarcity of user reports and quantization error. The presented framework can enable network operators to configure the bin size, for a given positioning accuracy and user density, that results in the most accurate MDT based coverage estimation. The lack of interpretability in AI-enabled networks is addressed by proposing a first-of-its-kind neural network architecture leveraging analytical modeling, domain knowledge, big data and machine learning to turn black box machine learning models into more interpretable models. The proposed approach combines analytical modeling and domain knowledge to custom design machine learning models with the aim of moving towards interpretable machine learning models that not only require less training time, but can also deal with issues such as sparsity of training data and determination of model hyperparameters. The approach is tested using both simulated data and real data and results show that the proposed approach outperforms existing mathematical models, while also remaining interpretable when compared with black-box ML models. Thus, the proposed approach can be used to derive better mathematical models of complex systems. The findings from this dissertation can help solve the challenges in emerging AI-based cellular networks and thus aid in their design, operation and optimization

    Risk Clusters, Hotspots, and Spatial Intelligence: Risk Terrain Modeling as an Algorithm for Police Resource Allocation Strategies

    The study reported here follows the suggestion by Caplan et al. (Justice Q, 2010) that risk terrain modeling (RTM) be developed by doing more work to elaborate, operationalize, and test variables that would provide added value to its application in police operations. Building on the ideas presented by Caplan et al., we address three important issues related to RTM that set it apart from current approaches to spatial crime analysis. First, we address the selection criteria used in determining which risk layers to include in risk terrain models. Second, we compare the “best model” risk terrain derived from our analysis to the traditional hotspot density mapping technique by considering both the statistical power and overall usefulness of each approach. Third, we test for “risk clusters” in risk terrain maps to determine how they can be used to target police resources in a way that improves upon the current practice of using density maps of past crime in determining future locations of crime occurrence. This paper concludes with an in-depth exploration of how one might develop strategies for incorporating risk terrains into police decision-making. RTM can be developed to the point where it may be more readily adopted by police crime analysts and enable police to be more effectively proactive and identify areas with the greatest probability of becoming locations for crime in the future. The targeting of police interventions that emerges would be based on a sound understanding of geographic attributes and qualities of space that connect to crime outcomes and would not be the result of identifying individuals from specific groups or characteristics of people as likely candidates for crime, a tactic that has led police agencies to be accused of profiling. In addition, place-based interventions may offer a more efficient method of impacting crime than efforts focused on individuals

    Human behavioural ecology, anthropogenic impact and subsistence change at the Teouma Lapita site, central Vanuatu, 3000-2500 BP

    This thesis investigates early human palaeoecological interaction at the Teouma Lapita site on Efate Island, central Vanuatu, and how it changed during a period of cultural transition between 3000-2500 BP. Here I take a quantified approach through an evolutionary ecological theoretical framework using optimal foraging models (Prey Choice, Patch Choice and Central Place Foraging) to generate predictions of optimal economic behaviour in response to temporal variation in prey abundances. These optimal foraging models (OFM), which typically focus on foraging cultures, had to be adjusted to the broad spectrum Lapita mixed economy, which combined foraging within marine and terrestrial resource patches with horticulture, incorporating pig husbandry and plant cultivation. To this end mammal, bird and reptile vertebrate taxa were divided into three broad resource patches: the coastal, the terrestrial and the domestic patch. Alternative social theoretical perspectives, such as costly signalling theory, were also built into the models. OFM predictions were then tested using multiple zooarchaeological datasets to demonstrate changes in foraging efficiency and mobility between resource patches as a result of human induced resource depression. Datasets used include measures of prey diversity, relative abundance, demography, skeletal element representation, and butchery intensity. The results indicate that Lapita foragers focused initially on high-ranked fruit bat and large-bodied sea turtle resources in concentrated and predictable proximal locations which yielded high post-encounter return rates. Giant tortoise exploitation in distant resource patches gained in importance over time as these proximal resource patches became depleted. Domestic patch resources were established and pig abundances increased very quickly, but pigs initially had high infant mortality rates due to nutritional deficiency and/or selective culling to reduce associated labour costs. 
Pigs were closely managed and regulated for a range of purposes which included daily household meat consumption as well as ritualistic feasting events. Faunal abundances peaked during the later post-cemetery period as Lapita settlement and foraging intensified, which had a huge impact on the terrestrial and coastal resources due in part to direct foraging and forest clearance. An ecological tipping point followed which saw the disappearance of crocodile and a number of fruit bat and bird species from the record. As encounter rates of high-ranked taxa declined so did foraging efficiency, and the transition from Lapita to post-Lapita culture saw a dramatic change in subsistence patterns. Tortoise and sea turtle nesting populations were devastated, with giant tortoises becoming extinct around the transition from Lapita to early Erueti. Rat demography changed and the large New Guinea Spiny rat declined, likely as a result of human predation, as settlement intensity appears to have peaked by the end of the Lapita period. Pig production also declined, likely in response to ecological and social developments, and a switch to hunting feral pigs may have occurred. These subsistence changes and declines in foraging efficiency appear to have been associated with changes in settlement patterns which conform to the ideal free distribution model as well as declines in social stratification