546 research outputs found

Identification of Online Users' Social Status via Mining User-Generated Data

Author: Zhao Tao
Publication venue: University Goettingen Repository
Publication date: 05/09/2019
Field of study

With the rapid growth of online user-generated data, identifying online users’ social status by mining that data can play a significant role in commercial applications, research and policy-making in many domains. Social status, an abstract concept, refers to the position of a person in relation to others within a society. Its concrete definition depends on the indicator used to measure it. For example, opinion leadership measures individual social status in terms of influence and expertise in an online society, while socioeconomic status characterizes real-life social status based on social and economic factors. As an alternative to traditional surveys, which are time-consuming, expensive and sometimes difficult to conduct, some efforts have been made to identify a specific social status of users from specific user-generated data using classic machine learning methods. However, each combination of social status and data source poses its own challenges, which the classic machine learning methods in existing work fail to address, leading to low identification accuracy. Given the importance of improving identification accuracy, this thesis studies three specific cases of identifying online and offline social status. For each case, it proposes a novel identification method that addresses the specific challenges in order to improve accuracy. The first work aims at identifying users’ online social status in terms of topic-sensitive influence and knowledge authority in social community question answering sites, namely identifying topical opinion leaders who are both influential and expert.
A social community question answering (SCQA) site, an innovative community question answering platform, not only offers traditional question answering (QA) services but also integrates an online social network where users can follow each other. Identifying topical opinion leaders in SCQA has become an important research area due to the significant role these leaders play. However, most previous related work either focuses on using knowledge expertise to find experts for improving the quality of answers, or aims at measuring user influence to identify influential users. In order to identify the true topical opinion leaders, we propose a topical opinion leader identification framework called QALeaderRank, which takes account of both topic-sensitive influence and topical knowledge expertise. In the proposed framework, to measure the topic-sensitive influence of each user, we design a novel influence measure algorithm that exploits both the social and QA features of SCQA, taking into account social network structure, topical similarity and knowledge authority. In addition, we propose three topic-relevant metrics to infer the topical expertise of each user. Extensive experiments, along with an online user study, show that the proposed QALeaderRank achieves significant improvement compared with the state-of-the-art methods. Furthermore, we analyze how users' topic interests change over time and examine the predictability of user topic interest through experiments. The second work focuses on predicting individual socioeconomic status from mobile phone data. Socioeconomic status (SES) is a social and economic attribute of wide concern. Assessing individual SES can assist related organizations in making a variety of policy decisions. The traditional approach suffers from the extremely high cost of collecting large-scale SES-related survey data.
With the ubiquity of smartphones, mobile phone data has become a novel, low-cost data source for predicting individual SES. However, the task of predicting individual SES from mobile phone data also poses some new challenges, including sparse individual records, scarce explicit relationships and limited labeled samples, which prior work, restricted to regional or household-level SES prediction, did not consider. To address these issues, we propose a semi-supervised Hypergraph based Factor Graph Model (HyperFGM) for individual SES prediction. HyperFGM is able to efficiently capture the associations between SES and individual mobile phone records to handle the sparsity of individual records. For the scarce explicit relationships, HyperFGM models implicit high-order relationships among users on the hypergraph structure. In addition, HyperFGM exploits both the limited labeled data and the unlabeled data in a semi-supervised way. Experimental results on a set of anonymized real mobile phone data show that HyperFGM greatly outperforms the baseline methods on individual SES prediction. The third work predicts social media users’ socioeconomic status based on their social media content, which is useful for related organizations and companies in a range of applications, such as economic and social policy-making. Previous work leverages manually defined textual features and platform-based user-level attributes from social media content and feeds them into a machine learning based classifier for SES prediction. However, it ignores some important information in social media content, including the order and the hierarchical structure of social media text as well as the relationships among user-level attributes.
To this end, we propose a novel coupled social media content representation model for individual SES prediction, which not only utilizes a hierarchical neural network to incorporate the order and the hierarchical structure of social media text but also employs a coupled attribute representation method to take into account intra-coupled and inter-coupled interaction relationships among user-level attributes. The experimental results show that the proposed model significantly outperforms other state-of-the-art models on a real dataset, which validates the efficiency and robustness of the proposed model.
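
The abstract above describes measuring topic-sensitive influence from the follow network combined with topical signals. As a rough illustration only, the sketch below uses plain topic-sensitive (personalized) PageRank, a standard technique swapped in here; it is not the QALeaderRank algorithm, whose details the abstract does not give. The function name `topic_weighted_rank`, the `follows` mapping and the `topic_weight` scores are all hypothetical.

```python
from typing import Dict, List


def topic_weighted_rank(
    follows: Dict[str, List[str]],    # assumed input: user -> users they follow
    topic_weight: Dict[str, float],   # assumed input: non-negative topical relevance per user
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Personalized PageRank: influence flows along follow edges, and the
    random jump is biased toward users who are relevant to the topic."""
    users = sorted(
        set(follows)
        | {v for targets in follows.values() for v in targets}
        | set(topic_weight)
    )
    total = sum(topic_weight.get(u, 0.0) for u in users)
    if total > 0:
        # Teleport distribution proportional to topical relevance.
        teleport = {u: topic_weight.get(u, 0.0) / total for u in users}
    else:
        # No topical signal: fall back to a uniform jump.
        teleport = {u: 1.0 / len(users) for u in users}

    rank = dict(teleport)
    for _ in range(iterations):
        nxt = {u: (1.0 - damping) * teleport[u] for u in users}
        for u in users:
            targets = follows.get(u, [])
            if targets:
                # A follower passes an equal share of its score to each followee.
                share = damping * rank[u] / len(targets)
                for v in targets:
                    nxt[v] += share
            else:
                # Users who follow nobody redistribute via the teleport vector.
                for v in users:
                    nxt[v] += damping * rank[u] * teleport[v]
        rank = nxt
    return rank


if __name__ == "__main__":
    scores = topic_weighted_rank(
        {"a": ["c"], "b": ["c"], "c": []},
        {"a": 1.0, "b": 1.0, "c": 2.0},
    )
    print(max(scores, key=scores.get))
```

A fuller model in the spirit of the abstract would also weight each follow edge by topical similarity and knowledge authority; that is left out here because the abstract does not specify how those factors are combined.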

Georg-August-University Göttingen

Design and Implementation of a Web Usage Mining Model Based On FPgrowth and PrefixSpan

Author: Wang Hengshan
Yang Cheng
Zeng Hua
Publication venue: CSUSB ScholarWorks
Publication date: 06/01/2015
Field of study

Web Usage Mining (WUM) integrates the techniques of two popular research fields: Data Mining and the Internet. By analyzing the potential rules hidden in web logs, WUM helps personalize the delivery of web content and improve web design, customer satisfaction and user navigation through pre-fetching and caching. This paper introduces two prevalent data mining algorithms, FPgrowth and PrefixSpan, into WUM and applies them in a real business case. Maximum Forward Path (MFP) is also used in the web usage mining model, alongside PrefixSpan during sequential pattern mining, to reduce the interference of false visits caused by the browser cache and to raise the accuracy of mining frequent traversal paths. The corresponding results are analyzed in detail and their application is discussed.
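
To make the Maximum Forward Path step in the abstract above concrete, here is a minimal sketch of how one click sequence can be split into maximal forward paths: whenever the user returns to a page already on the current path, the path walked so far is emitted and then cut back to the revisited page. The function name `maximal_forward_paths` and the single-session input format are assumptions for illustration; the paper's own implementation is not shown in the abstract.

```python
from typing import Hashable, List, Sequence


def maximal_forward_paths(session: Sequence[Hashable]) -> List[List[Hashable]]:
    """Split one user's ordered page visits into maximal forward paths."""
    paths: List[List[Hashable]] = []
    current: List[Hashable] = []   # pages on the current forward path
    moving_forward = False         # True if the previous click extended the path

    for page in session:
        if page in current:
            # Backward reference: the user went back to an earlier page.
            if moving_forward:
                # The path just before turning back is maximal, so record it.
                paths.append(list(current))
            # Cut the path back to the revisited page.
            current = current[: current.index(page) + 1]
            moving_forward = False
        else:
            # Forward reference: extend the current path.
            current.append(page)
            moving_forward = True

    # A session that ends while moving forward leaves one last maximal path.
    if moving_forward and current:
        paths.append(list(current))
    return paths


if __name__ == "__main__":
    clicks = list("ABCDCBEGHGWAOUOV")
    for path in maximal_forward_paths(clicks):
        print("".join(path))
```

The resulting paths can then be passed to a sequential pattern miner such as PrefixSpan, so that back-button revisits do not appear as separate forward traversals.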

CSUSB ScholarWorks

Alter ego, state of the art on user profiling: an overview of the most relevant organisational and behavioural aspects regarding User Profiling.

Author: Dijk J.A.G.M. van
Ebbers W.E.
Fennis B.M.
Geest T.M. van der
Loorbach N.R.
Pieterson W.J.
Steehouder M.F.
Taal E.
Vries P.W. de
Publication venue: Telematica Instituut
Publication date: 01/01/2005
Field of study

This report gives an overview of the most relevant organisational and behavioural aspects regarding user profiling. It discusses not only the most important aims of user profiling from both an organisation’s and a user’s perspective, but also organisational motives and barriers for user profiling and the most important conditions for its success. Finally, recommendations are made and suggestions for further research are given.

University of Twente Research Information

New Fundamental Technologies in Data Mining

Author
Publication venue: 'IntechOpen'
Publication date: 20/04/2021
Field of study

The progress of data mining technology and its wide public popularity establish a need for a comprehensive text on the subject. The series of books entitled "Data Mining" addresses this need by presenting in-depth descriptions of novel mining algorithms and many useful applications. In addition to helping readers understand each section deeply, the two books present useful hints and strategies for solving problems in the following chapters. The contributing authors have highlighted many future research directions that will foster multi-disciplinary collaborations and hence lead to significant development in the field of data mining.

Directory of Open Access Books (DOAB)

Data Mining

Author
Publication venue: 'IntechOpen'
Publication date: 27/07/2022
Field of study

The availability of big data due to computerization and automation has generated an urgent need for new techniques to analyze and convert big data into useful information and knowledge. Data mining is a promising and leading-edge technology for mining large volumes of data, looking for hidden information, and aiding knowledge discovery. It can be used for characterization, classification, discrimination, anomaly detection, association, clustering, trend or evolution prediction, and much more in fields such as science, medicine, economics, engineering, computers, and even business analytics. This book presents basic concepts, ideas, and research in data mining

Directory of Open Access Books (DOAB)

Proceedings of the 20th BCS HCI Group conference Volume Two

Author: Fields Bob
Healey Patrick
Nickerson Louise Valgerdur
Stockman Tony
Publication venue
Publication date: 30/12/2013
Field of study

Queen Mary Research Online

Archives, Access and Artificial Intelligence: Working with Born-Digital and Digitized Archival Collections

Author
Publication venue: 'Transcript Verlag'
Publication date: 01/01/2022
Field of study

Digital archives are transforming the Humanities and the Sciences. Digitized collections of newspapers and books have pushed scholars to develop new, data-rich methods. Born-digital archives are now better preserved and managed thanks to the development of open-access and commercial software. Digital Humanities have moved from the fringe to the center of academia. Yet, the path from the appraisal of records to their analysis is far from smooth. This book explores crossovers between various disciplines to improve the discoverability, accessibility, and use of born-digital archives and other cultural assets

SSOAR - Social Science Open Access Repository

Archives, Access and Artificial Intelligence

Author
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 05/05/2022
Field of study

Directory of Open Access Books (DOAB)

Methodology for the Implementation of Knowledge Management Systems 2.0 - A Case Study in an Oil and Gas Company

Author: Chalmeta Ricardo
Orenga-Roglá Sergio
Publication venue: AIS Electronic Library (AISeL)
Publication date: 04/06/2019
Field of study

Web 2.0 and Big Data tools can be used to develop knowledge management systems based on facilitating the participation and collaboration of people in order to enhance knowledge. The paper presents a methodology that can help organizations with the use of Web 2.0 and Big Data tools to discover, gather, manage and apply their knowledge by making the process of implementing a knowledge management system faster and simpler. First, an initial version of the methodology was developed and it was then applied to an oil and gas company in order to analyze and refine it. The results obtained show the effectiveness of the methodology, since it helped this company to carry out the implementation quickly and effectively, thereby allowing the company to gain the maximum benefits from existing knowledge

AIS Electronic Library (AISeL)