5 research outputs found

    Ligação de dados espaço-temporais e textuais de mídias sociais com locais de interesse visitados / Linking space-time and textual data from social media with visited sites of interest

    Social media data are generated constantly and are a relevant source of up-to-date information. However, social media include semi-structured data (e.g. optional geographic coordinates, place indication) and unstructured data (e.g. post texts) whose correct semantics need to be made explicit. One way to do this is through semantic annotations that link relevant portions of data (e.g. an entire tweet, mentions of named entities) to descriptions with well-defined semantics in databases (e.g. OpenStreetMap points of interest) or knowledge bases (e.g. DBpedia, LinkedGeoData). This work sets out to study, select, develop and test algorithms for linking social media data that have geographic coordinates and text (e.g. georeferenced tweets) to visited places of interest described in databases or knowledge bases with well-defined semantics. This task can be difficult in some situations, because coordinates usually have limited precision and there may be many places of interest near the post's coordinates, and even places of interest with practically overlapping coordinates (e.g. in a multi-storey building). Accordingly, an algorithm for linking movement data was implemented that considers two distinct radii: a smaller one within which proximity alone suffices to link with confidence, and a larger one within which a mention of the visited place in the text is also required to increase the reliability of the link. In both cases, ambiguities are resolved by choosing the place of interest whose label is lexically most similar to the place mentioned in the post text.
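The two-radius linking rule described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the radius values, the dictionary layout of posts and places, the substring test for a "mention", and the use of `difflib.SequenceMatcher` against the whole post text as the lexical similarity measure are all assumptions made for the example.

```python
import math
from difflib import SequenceMatcher

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def similarity(label, text):
    """Lexical similarity in [0, 1]; a stand-in for the measure used in the work."""
    return SequenceMatcher(None, label.lower(), text.lower()).ratio()

def link_post(post, pois, inner_m=30.0, outer_m=150.0):
    """Return the place of interest linked to a post, or None.

    inner_m and outer_m are hypothetical radii: inside inner_m proximity
    alone is enough; between inner_m and outer_m the label must also
    appear in the post text (assumed here to be a substring match).
    """
    text = post["text"].lower()
    candidates = []
    for poi in pois:
        d = haversine_m(post["lat"], post["lon"], poi["lat"], poi["lon"])
        mentioned = poi["label"].lower() in text
        if d <= inner_m or (d <= outer_m and mentioned):
            candidates.append(poi)
    if not candidates:
        return None
    # Resolve ambiguity by picking the lexically most similar label.
    return max(candidates, key=lambda poi: similarity(poi["label"], post["text"]))

# Illustrative, made-up places and posts.
pois = [
    {"label": "Cafe Central", "lat": -27.5999, "lon": -48.5200},
    {"label": "Livraria Sul", "lat": -27.6001, "lon": -48.5200},
    {"label": "Museu Norte", "lat": -27.5991, "lon": -48.5200},
]
near_post = {"lat": -27.6000, "lon": -48.5200, "text": "Coffee at Cafe Central"}
far_post = {"lat": -27.5985, "lon": -48.5200, "text": "Visiting Museu Norte today"}
silent_post = {"lat": -27.5985, "lon": -48.5200, "text": "nice day"}

near_link = link_post(near_post, pois)
far_link = link_post(far_post, pois)
silent_link = link_post(silent_post, pois)
```

In this sketch the first post falls within the inner radius of two places and the tie is broken by label similarity; the second post is linked only because it names a place inside the outer radius; the third is left unlinked.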


    Multiple types of passively collected location data (PCLD) have emerged during the past 20 years. Their capability for travel demand analysis has also been studied and demonstrated. Unlike traditional surveys, whose samples are designed efficiently and carefully, PCLD features a non-probabilistic sample of dramatically larger size. However, PCLD barely contains any ground truth about either the human subjects involved or the movements they produce. The imputation of such missing information has been evaluated for years, including origin and destination, travel mode, trip purpose, etc. This research intends to advance the utilization of PCLD by imputing sociodemographic information, which can help to create a panorama of the large volume of travel behaviors observed and to further develop a rational weighting procedure for PCLD. The Conditional Inference Tree model has been employed to address these problems because of its ability to avoid biased variable selection and overfitting.

    Activity Recognition for a Smartphone Based Travel Survey Based on Cross-User History Data


    Situation inference and context recognition for intelligent mobile sensing applications

    The usage of smart devices is an integral element in our daily life. With the richness of data streaming from sensors embedded in these smart devices, the applications of ubiquitous computing are limitless for future intelligent systems. Situation inference is a non-trivial issue in the domain of ubiquitous computing research due to the challenges of mobile sensing in unrestricted environments. There are various advantages to having robust and intelligent situation inference from data streamed by mobile sensors. For instance, we would be able to gain a deeper understanding of human behaviours in certain situations via a mobile sensing paradigm. It can then be used to recommend resources or actions for enhanced cognitive augmentation, such as improved productivity and better human decision making. Sensor data can be streamed continuously from heterogeneous sources with different frequencies in a pervasive sensing environment (e.g., smart home). It is difficult and time-consuming to build a model that is capable of recognising multiple activities. These activities can be performed simultaneously with different granularities. We investigate the separability aspect of multiple activities in time-series data and develop OPTWIN as a technique to determine the optimal time window size to be used in a segmentation process. As a result, this novel technique reduces the need for sensitivity analysis, which is an inherently time-consuming task. To achieve an effective outcome, OPTWIN leverages multi-objective optimisation by minimising the impurity (the number of overlapped windows of human activity labels on one label space over time series data) while maximising class separability. The next issue is to effectively model and recognise multiple activities based on the user's contexts. Hence, an intelligent system should address the problem of multi-activity and context recognition prior to the situation inference process in mobile sensing applications.
The performance of simultaneous recognition of human activities and contexts can be easily affected by the choice of modelling approach used to build an intelligent model. We investigate the associations of these activities and contexts at multiple levels of mobile sensing perspectives to reveal the dependency property in the multi-context recognition problem. We design a Mobile Context Recognition System, which incorporates a Context-based Activity Recognition (CBAR) modelling approach to produce effective outcomes from both multi-stage and multi-target inference processes to recognise human activities and their contexts simultaneously. In our empirical evaluation on real-world datasets, the CBAR modelling approach has significantly improved the overall accuracy of simultaneous inference on transportation mode and human activity of mobile users. The accuracy of activity and context recognition can also be influenced progressively by how reliable user annotations are. Essentially, reliable user annotation is required for activity and context recognition. These annotations are usually acquired during data capture in the world. We investigate the need to reduce user burden effectively during mobile sensor data collection, through experience sampling of these annotations in-the-wild. To this end, we design CoAct-nnotate --- a technique that aims to improve the sampling of human activities and contexts by providing accurate annotation prediction and facilitating interactive user feedback acquisition for ubiquitous sensing. CoAct-nnotate incorporates a novel multi-view multi-instance learning mechanism to perform more accurate annotation prediction. It also includes a progressive learning process (i.e., model retraining based on co-training and active learning) to improve its predictive performance over time. Moving beyond context recognition of mobile users, human activities can be related to essential tasks that the users perform in daily life.
However, the boundaries between the types of tasks are inherently difficult to establish, as they can be defined differently from the individuals' perspectives. Consequently, we investigate the implication of contextual signals for user tasks in mobile sensing applications. To define the boundary of tasks and hence recognise them, we incorporate this situation inference process (i.e., task recognition) into the proposed Intelligent Task Recognition (ITR) framework to learn users' Cyber-Physical-Social activities from their mobile sensing data. By recognising the engaged tasks accurately at a given time via mobile sensing, an intelligent system can then offer proactive support to its user to progress and complete their tasks. Finally, for robust and effective learning of mobile sensing data from heterogeneous sources (e.g., Internet-of-Things in a mobile crowdsensing scenario), we investigate the utility of sensor data in provisioning their storage and design QDaS --- an application agnostic framework for quality-driven data summarisation. This allows an effective data summarisation by performing density-based clustering on multivariate time series data from a selected source (i.e., data provider). Thus, the source selection process is determined by the measure of data quality. This framework allows intelligent systems to retain comparable predictive results by learning effectively on the compact representations of mobile sensing data, while having a higher space saving ratio. This thesis contains novel contributions in terms of the techniques that can be employed for mobile situation inference and context recognition, especially in the domain of ubiquitous computing and intelligent assistive technologies. This research implements and extends the capabilities of machine learning techniques to solve real-world problems on multi-context recognition, mobile data summarisation and situation inference from mobile sensing.
We firmly believe that the contributions in this research will help future studies move forward in building more intelligent systems and applications.
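To make the window-size trade-off behind OPTWIN in the abstract above more concrete, here is a small sketch of one of its two objectives. It is not the OPTWIN algorithm: impurity is simplified here (as an assumption) to the fraction of non-overlapping windows that contain more than one activity label, the class separability objective is left out entirely, and the label sequence and candidate sizes are invented for illustration.

```python
def window_impurity(labels, size):
    """Fraction of non-overlapping windows that mix more than one label.

    A simplified stand-in for the impurity measure named in the abstract.
    """
    windows = [labels[i:i + size] for i in range(0, len(labels), size)]
    mixed = sum(1 for w in windows if len(set(w)) > 1)
    return mixed / len(windows)

# Hypothetical per-sample activity labels: 6 walking, 6 sitting, 6 walking.
labels = ["walk"] * 6 + ["sit"] * 6 + ["walk"] * 6

# Evaluate a few candidate window sizes (in samples).
impurities = {size: window_impurity(labels, size) for size in (2, 3, 4, 6, 9)}
```

Impurity alone always favours the smallest windows, since a one-sample window can never mix labels. That is presumably why the abstract pairs it with a second objective, class separability, which tends to benefit from longer windows.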