184 research outputs found

    Social relationship classification based on interaction data from smartphones.

    Wireless communications and mobile computing have fundamentally changed the way people interact and communicate with each other. The proliferation of powerful, programmable mobile devices, smartphones in particular, offers an unprecedented opportunity to continuously monitor the physical, social and geographical activities of their users. Much recent research on smartphone-based sensing leverages the rich sensing, positioning and short-range radio capabilities of smartphones to identify the context of user activities and ambient environmental conditions. Mobile applications for personal behavior tracking and physical wellness monitoring have also been developed. Despite that, most existing work in mobile sensing has neglected the role of the smartphone as the command center of the user's communications with the outside world.
As mobile users contact their friends via phone, SMS, email, instant messaging, and online social-networking applications, these multi-modal communication activities are as important as physical activities in profiling one's life. They also hold the key to understanding the user's social relationships with other people of interest. In this thesis, we propose to use the unique multi-modal interaction data from smartphones to classify social relationships. To jump-start our study, we generate artificial interaction data and build a social interaction matrix to model the interaction between people. This also helps us test a wide range of data mining techniques for this type of problem. We then carry out a social interaction data collection campaign with a group of real users to obtain real-life multi-modal communication data, e.g., phone calls, email, online social networks (Facebook and Renren), and physical location/proximity. After applying different classification algorithms to the social interaction matrix, we find that SVM outperforms the KNN and decision tree algorithms, with a classification accuracy of 82.4% (the accuracies of KNN and decision tree are 79.9% and 77.6%, respectively). We also compare the data from different interaction channels and find that online social network and location/proximity data contribute more to the overall classification accuracy. Additionally, with dimensionality reduction algorithms, the social interaction matrix can be embedded from a 65-dimensional into a 9-dimensional space while preserving high classification accuracy, and we obtain the principal interaction features as a by-product. Finally, we use CUR decomposition to select 13 of the 65 features in the social interaction matrix. The classification accuracy drops from 82.4% to 77.7% after CUR decomposition.
However, it can help determine the right sensor sampling frequency and thus improve the energy efficiency of social data collection.
    Detailed summary in vernacular field only. Sun, Deyi. Thesis (M.Phil.)--Chinese University of Hong Kong, 2012. Includes bibliographical references (leaves 90-96). Abstracts also in Chinese.
    Chapter 1 --- Introduction
    Chapter 2 --- Research Background
    Chapter 2.1 --- Related work of social relationship analysis
    Chapter 2.1.1 --- Community detection in social network
    Chapter 2.1.2 --- Social influence analysis
    Chapter 2.1.3 --- Modeling social interaction data
    Chapter 2.1.4 --- Social relationship prediction
    Chapter 2.2 --- Classification methodologies
    Chapter 2.2.1 --- Algorithms for social relationship classification
    Chapter 2.2.2 --- Algorithms for dimensionality reduction
    Chapter 3 --- Problem Formulation of Relationship Classification
    Chapter 3.1 --- Multi-modal data in smartphones
    Chapter 3.2 --- Formulation of relationship classification problem
    Chapter 3.3 --- Refinement of feature definition and energy efficiency
    Chapter 3.4 --- Chapter summary
    Chapter 4 --- Social Interaction Data Acquisition
    Chapter 4.1 --- Social interaction data collection campaign overview
    Chapter 4.2 --- Format of raw interaction data
    Chapter 4.3 --- Building social interaction matrix with real-life interaction data
    Chapter 4.4 --- Chapter summary
    Chapter 5 --- Statistical Analysis of Social Interaction Data
    Chapter 5.1 --- Coverage of social interaction data
    Chapter 5.2 --- Social relationships statistics
    Chapter 5.3 --- Social relationship interaction patterns
    Chapter 5.4 --- Chapter summary
    Chapter 6 --- Automatic Social Relationship Classification Based on Smartphone Interaction Data
    Chapter 6.1 --- Comparison of different classification algorithms
    Chapter 6.2 --- Advantages of multi-modal interaction data
    Chapter 6.3 --- Comparison of interaction data in different communication channels
    Chapter 6.4 --- Dimensionality reduction on social interaction data
    Chapter 6.5 --- Discussions in deploying social relationship classification application
    Chapter 6.5.1 --- Considerations of user privacy
    Chapter 6.5.2 --- Saving smartphone resources
    Chapter 6.6 --- Chapter summary
    Chapter 7 --- Conclusion and Future Work
    Bibliography
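
The classifier comparison described in this abstract operates on a social interaction matrix, where each row describes one user-contact pair and each column is an interaction feature. As a rough illustration of that setup, the sketch below runs a k-nearest-neighbour classifier (one of the three algorithms the abstract names) over a tiny matrix. The three features, the relationship labels, and all values are hypothetical and are not taken from the thesis, which uses 65 features.

```python
import math
from collections import Counter

# Hypothetical toy data: one row per user-contact pair, with assumed features
# (calls per week, emails per week, co-location hours per week).
TRAIN = [
    ([9.0, 1.0, 20.0], "family"),
    ([8.0, 0.0, 25.0], "family"),
    ([2.0, 12.0, 30.0], "colleague"),
    ([1.0, 15.0, 35.0], "colleague"),
    ([4.0, 2.0, 3.0], "friend"),
    ([5.0, 3.0, 2.0], "friend"),
]


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k=3):
    """Return the majority label among the k training rows closest to query."""
    nearest = sorted(train, key=lambda row: euclidean(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(train, k=1):
    """Classify each row using all other rows and report the hit rate."""
    hits = 0
    for i, (features, label) in enumerate(train):
        rest = train[:i] + train[i + 1:]
        hits += knn_predict(rest, features, k) == label
    return hits / len(train)


prediction = knn_predict(TRAIN, [8.5, 0.5, 22.0], k=3)
accuracy = leave_one_out_accuracy(TRAIN, k=1)
```

Leave-one-out evaluation is used here only because the toy matrix is so small; the abstract does not state which evaluation protocol the thesis follows.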

    Wearable Wireless Devices

    No abstract available

    Smart workplaces: a system proposal for stress management

    Over recent decades, workplaces have become a primary source of many health issues, leading to mental health problems such as stress, depression, and anxiety. Environmental factors, among others, have been shown to cause stress, illness, and loss of productivity. With the arrival of new technologies, especially in the field of smart workplaces, most studies have focused on building energy efficiency models and human thermal comfort, while little work has addressed occupants' stress recognition and overall well-being. This study therefore proposes a stress management solution: an interactive system that adapts environmental conditions to user preferences by measuring environmental and biological characteristics in real time, thereby helping to prevent stress and helping users cope with stress when it occurs. The secondary objective is to design and evaluate one part of the system, the mobile application prototype, through usability testing. The proposed system follows a user-centered design approach and uses several usability methods to identify users' needs, behavior, and expectations. The applied methods, including User Research, Card Sorting, and Expert Review, allowed us to evaluate the design according to heuristic analysis, improving the usability of the interfaces and the user experience. The study presents the research results, the interface design, and the usability tests. According to the User Research results, temperature and noise are the most common environmental stressors among users, causing stress and uncomfortable working conditions, and users prefer physical activities over digital solutions for coping with stress. Additionally, the System Usability Scale (SUS) results rated the system's usability as "excellent" and "acceptable", with a final score of 88 out of 100.
It is expected that these conclusions can contribute to future research in the field of smart workplaces and their interaction with the people who occupy them.
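
The abstract reports a System Usability Scale result of 88 out of 100. For readers unfamiliar with how such a figure arises, the sketch below applies the standard SUS scoring rule: ten items rated 1 to 5, odd-numbered items contribute (rating - 1), even-numbered items contribute (5 - rating), and the sum is multiplied by 2.5. The participant ratings are hypothetical and are not data from the study.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for one participant's ten 1-5 ratings."""
    if len(responses) != 10 or any(r < 1 or r > 5 for r in responses):
        raise ValueError("SUS needs ten ratings between 1 and 5")
    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd-numbered items are positively worded.
            total += rating - 1
        else:
            # Even-numbered items are negatively worded.
            total += 5 - rating
    return total * 2.5


# Hypothetical ratings from three participants (illustration only).
participants = [
    [5, 1, 5, 2, 4, 1, 5, 1, 4, 2],
    [4, 2, 5, 1, 5, 2, 4, 1, 5, 1],
    [5, 1, 4, 2, 5, 1, 5, 2, 4, 1],
]

# A study-level SUS result is usually the mean of the per-participant scores.
mean_score = sum(sus_score(p) for p in participants) / len(participants)
```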

    Brain activity on encoding different textures EEG signal acquisition with ExoAtlet®

    Powered exoskeletons play a crucial role in rehabilitation, improving the quality of life of those who need them. They thereby make a major contribution to patients' integration into society, giving them more autonomy and freedom. Despite these positive outcomes, a thorough description of the brain correlates of exoskeleton control is still needed. For instance, perceiving different pavement textures while wearing an exoskeleton is likely to change cerebral activity, which could affect both sensory encoding and Brain-Computer Interface (BCI) control. The main goal of this work is therefore to describe the brain's response to differently textured pavements while using the ExoAtlet® powered exoskeleton. To measure, process, analyze and classify the impact of different textures on neurophysiological rhythms, 4-minute electroencephalogram (EEG) signals were recorded with a 16-channel cap (actiCAP by Brain Products). Each of the three experimental subjects was instructed to walk in place on four types of pavement (flat, carpet, foam, and rubber circles), with and without the exoskeleton, for a total of eight experimental conditions. A counterbalanced design was applied, and informed consent was obtained from participants (Committee for Health Sciences of the Universidade Católica Portuguesa - 99/2022). Additionally, four machine learning methods, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Linear Discriminant Analysis (LDA), and Artificial Neural Network (ANN), were selected to analyze three distinct classification problems. The study found changes in the delta frequency band at electrodes C3 and C4, and, when comparing classifier performance, LDA achieved the best accuracy across the three classification problems involving all subjects.
This work concludes that the results are consistent with the hypothesis that sensory processing of pavement textures during exoskeleton control induces neural changes and delta-band variations at the C3 and C4 electrodes. Additionally, LDA demonstrated the best performance across the three subject-independent classification problems.
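
The finding above concerns power in the delta frequency band at individual electrodes. A minimal sketch of how band power can be estimated from a single channel is given below. It uses a direct discrete Fourier transform from the standard library rather than whatever pipeline the authors used, and the 0.5-4 Hz delta and 8-13 Hz alpha limits, the 64 Hz sampling rate, and the synthetic signal are all assumptions made for illustration.

```python
import cmath
import math


def dft_power(signal):
    """Return |X_k|^2 for k = 0..N/2 using a direct (slow) DFT."""
    n = len(signal)
    powers = []
    for k in range(n // 2 + 1):
        x_k = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                  for t, s in enumerate(signal))
        powers.append(abs(x_k) ** 2)
    return powers


def band_power(signal, fs, low_hz, high_hz):
    """Sum spectral power over bins whose frequency lies in [low_hz, high_hz]."""
    n = len(signal)
    powers = dft_power(signal)
    return sum(p for k, p in enumerate(powers)
               if low_hz <= k * fs / n <= high_hz)


# Synthetic stand-in for one EEG channel: a 2 Hz (delta) component plus a
# weaker 10 Hz (alpha) component, sampled at an assumed 64 Hz for 2 seconds.
FS = 64
signal = [2.0 * math.sin(2 * math.pi * 2 * t / FS)
          + 1.0 * math.sin(2 * math.pi * 10 * t / FS)
          for t in range(2 * FS)]

delta = band_power(signal, FS, 0.5, 4.0)   # assumed delta band limits
alpha = band_power(signal, FS, 8.0, 13.0)  # assumed alpha band limits
```

Because the delta component has twice the amplitude of the alpha component, its band power comes out about four times larger. Per-channel band powers of this kind are one plausible input to classifiers such as LDA, though the abstract does not specify the features used.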

    A Survey of Location Prediction on Twitter

    Locations, e.g., countries, states, cities, and points of interest, are central to news, emergency events, and people's daily lives. Automatic identification of locations associated with or mentioned in documents has been explored for decades. As one of the most popular online social network platforms, Twitter has attracted a large number of users who send millions of tweets on a daily basis. Due to the worldwide coverage of its users and the real-time freshness of tweets, location prediction on Twitter has gained significant attention in recent years. Research efforts have addressed the new challenges and opportunities brought by the noisy, short, and context-rich nature of tweets. In this survey, we aim to offer an overall picture of location prediction on Twitter. Specifically, we concentrate on the prediction of user home locations, tweet locations, and mentioned locations. We first define the three tasks and review the evaluation metrics. By summarizing the Twitter network, tweet content, and tweet context as potential inputs, we then structurally highlight how the problems depend on these inputs. Each dependency is illustrated by a comprehensive review of the corresponding strategies adopted in state-of-the-art approaches. In addition, we briefly review two related problems, i.e., semantic location prediction and point-of-interest recommendation. Finally, we list future research directions.
    Comment: Accepted to TKDE. 30 pages, 1 figure

    Finding the online cry for help : automatic text classification for suicide prevention

    Successful prevention of suicide, a serious public health concern worldwide, hinges on the adequate detection of suicide risk. While online platforms are increasingly used for expressing suicidal thoughts, manually monitoring for such signals of distress is practically infeasible, given the information overload suicide prevention workers are confronted with. In this thesis, the automatic detection of suicide-related messages is studied. It presents the first classification-based approach to online suicidality detection, and focuses on Dutch user-generated content. In order to evaluate the viability of such a machine learning approach, we developed a gold standard corpus, consisting of message board and blog posts. These were manually labeled according to a newly developed annotation scheme, grounded in suicide prevention practice. The scheme provides for the annotation of a post's relevance to suicide, and the subject and severity of a suicide threat, if any. This allowed us to derive two tasks: the detection of suicide-related posts, and of severe, high-risk content. In a series of experiments, we sought to determine how well these tasks can be carried out automatically, and which information sources and techniques contribute to classification performance. The experimental results show that both types of messages can be detected with high precision. Therefore, the amount of noise generated by the system is minimal, even on very large datasets, making it usable in a real-world prevention setting. Recall is high for the relevance task, but at around 60%, it is considerably lower for severity. This is mainly attributable to implicit references to suicide, which often go undetected. We found a variety of information sources to be informative for both tasks, including token and character ngram bags-of-words, features based on LSA topic models, polarity lexicons and named entity recognition, and suicide-related terms extracted from a background corpus. 
To improve classification performance, the models were optimized using feature selection, hyperparameter optimization, or a combination of both. A distributed genetic algorithm approach proved successful in finding good solutions for this complex search problem, and resulted in more robust models. Experiments with cascaded classification of the severity task did not reveal performance benefits over direct classification (in terms of F1-score), but its structure allows the use of slower, memory-based learning algorithms that considerably improved recall. At the end of this thesis, we address a problem typical of user-generated content: noise in the form of misspellings, phonetic transcriptions and other deviations from the linguistic norm. We developed an automatic text normalization system, using a cascaded statistical machine translation approach, and applied it to normalize the data for the suicidality detection tasks. Subsequent experiments revealed that, compared to the original data, normalized data resulted in fewer and more informative features, and improved classification performance. This extrinsic evaluation demonstrates the utility of automatic normalization for suicidality detection and, more generally, for text classification on user-generated content.
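
Among the information sources this abstract lists are token and character n-gram bags-of-words. The sketch below shows one simple way such features can be extracted with the Python standard library. The function names, the `c:`/`t:` prefixes, and the padding convention are illustrative assumptions, not the thesis's implementation.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return overlapping character n-grams of the lowercased, space-padded text."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def token_ngrams(text, n=1):
    """Return overlapping n-grams over whitespace-separated, lowercased tokens."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_ngrams(text, char_n=3, token_n=1):
    """Count character and token n-grams in one bag, prefixed by feature type."""
    bag = Counter(f"c:{g}" for g in char_ngrams(text, char_n))
    bag.update(f"t:{g}" for g in token_ngrams(text, token_n))
    return bag


# Character n-grams overlap even when a word is misspelled, which is one
# reason they can help on noisy user-generated text.
shared = set(char_ngrams("help")) & set(char_ngrams("helpp"))
```

Token n-grams treat "help" and "helpp" as unrelated features, whereas their character trigram sets share " he", "hel" and "elp".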

    Enhancing extremist data classification through textual analysis

    The high volume of extremist material on the Internet has created the need for intelligence gathering via the Web and real-time monitoring of potential websites for evidence of extremist activities. However, manual classification of such content is difficult and time-consuming. In response to this challenge, the work reported here developed several classification frameworks. Each framework provides a basis of text representation whose output is fed into a machine learning algorithm. The bases of text representation are Sentiment-rule, Posit-textual analysis with word-level features, and an extension of Posit analysis, known as Extended-Posit, which adopts character-level as well as word-level data. Gaps identified in these techniques created avenues for further improvement, especially in handling larger datasets with better classification accuracy. Consequently, a novel basis of text representation known as the Composite-based method was developed. This is a computational framework that combines the sentiment and syntactic features of the textual content of a Web page. These techniques are then applied to a dataset that had been manually classified, and the resulting features are fed into a machine learning algorithm to measure how well each page can be classified into its appropriate class. The classifiers considered are both Neural Network (RNN and MLP) and Machine Learning classifiers (such as J48, Random Forest and KNN). In addition, feature selection and model optimisation were evaluated to determine the cost of creating a machine learning model. Considering the results obtained from each framework, composite features are preferable to purely syntactic or purely sentiment features, offering improved classification accuracy when used with machine learning algorithms.
Furthermore, the extension of Posit analysis to include both word-level and character-level data outperformed word-level features alone when applied to the assembled textual data. Moreover, the Random Forest classifier outperformed the other classifiers explored. Taking cost into account, feature selection improves classification accuracy and saves more time than hyperparameter tuning (model optimisation).