1,288 research outputs found

    User identification and community exploration via mining big personal data in online platforms

    Mining user-generated big data is vitally important for large online platforms in terms of security, profit improvement, product recommendation and system management. Personal attribute recognition, user behavior prediction, user identification, and community detection are the most critical and interesting issues, and they remain challenging in many real applications in terms of accuracy, efficiency and data security. An online platform with tens of thousands of users is always vulnerable to malicious users who threaten other, innocent users and consume resources unnecessarily, so accurate user identification is urgently required to prevent such malicious attempts. Meanwhile, accurate prediction of user behavior helps large platforms provide satisfactory recommendations and allocate resources efficiently across users. In addition to individual identification, community exploration of the large social networks formed by online databases can help managers understand how a community evolves. Such large-scale and diverse social networks can also be used to validate network theories that were previously developed from synthetic networks or small real networks. In this thesis, we study several specific cases to address key challenges that remain in different types of large online platforms, such as user behavior prediction for cold-start users, privacy protection for user-generated data, and analysis of large-scale, diverse social communities. In the first case, online education, an emerging business, has attracted tens of thousands of users because it provides diverse courses that can meet students' varied demands. Owing to the limitations of public school systems, many students pursue private supplementary tutoring to improve their academic performance. 
Like an online shopping platform, an online education system is a user-product based service in which users usually have to select and purchase the courses that meet their demands. It is therefore important to construct a course recommendation and user behavior prediction system based on user attributes or user-generated data. Item recommendation in current online shopping systems is usually based on the interactions between users and products, since most personal attributes are unnecessary for online shopping services and users often provide false information during registration. As a result, such platforms cannot recommend items by exploiting the similarity of personal attributes among users, such as education level, age, school, or gender. Unlike most online shopping platforms, online education platforms have access to a large number of credible personal attributes, since accurate personal information is important in education services, and user behaviors could be predicted from user attributes alone. Moreover, previous work on learning individual attributes is based primarily on panel survey data, which ensures credibility but lacks efficiency; consequently, most studies include only hundreds or thousands of users. With three years of learning data from more than 200,000 anonymous K-12 students on one of the world's largest online extra-curricular education platforms, we uncover students' online learning behaviors and infer the impact of students' home location, family socioeconomic situation and attended school's reputation/rank on their participation in private tutoring courses and their learning outcomes. Further analysis suggests that this impact may be largely attributed to unequal access to educational resources across cities and to inequality in family socioeconomic status. 
Finally, we study the predictability of students' performance and behaviors using machine learning algorithms with different groups of features, showing that students' online learning performance can be predicted from personal attributes and user-generated data with MAE < 10%. As mentioned above, user attributes on most online platforms are often false, and online platforms are usually vulnerable to malicious users, so it is very important to identify users or verify their attributes. Many studies have used user-generated mobile phone data (which includes sensitive information) to identify diverse user attributes, such as socioeconomic status, age, education level, and profession. Most of these approaches leverage original sensitive user data to build feature-rich models that take private information as input, such as exact locations, app usage and call detail records. However, accessing users' raw mobile phone data may violate increasingly strict private data protection policies and regulations (e.g. GDPR). We observe that appropriate statistical methods offer an effective means of eliminating private information while preserving personal characteristics, thus enabling the identification of user attributes without privacy concerns. In particular, identifying an unfamiliar caller's profession is important for protecting citizens' personal safety and property. Owing to the limited data protection of various popular online services in some countries, such as taxi hailing or takeout ordering, many users nowadays receive an increasing number of phone calls from strangers. The situation may be aggravated when criminals pretend to be service delivery staff, threatening individual users as well as society. Additionally, more and more people suffer from excessive digital marketing and fraudulent phone calls because of personal information leakage. Therefore, real-time identification of unfamiliar callers is urgently needed. 
We explore the feasibility of user identification with privacy-preserved user-generated mobile data, and we develop CPFinder, a system that automatically identifies callers on end devices. The system mainly identifies four categories of users: taxi drivers, delivery and takeout staff, telemarketers and fraudsters, and normal users (other professions). Our evaluation on an anonymized dataset of 1,282 users covering a period of 3 months in Shanghai shows that CPFinder can achieve an accuracy above 75% for multi-class classification and above 92.35% for binary classification. In addition to the mining of personal attributes and behaviors, community mining of large groups of people based on online big data also attracts considerable attention owing to the accessibility of large-scale social networks on online platforms. As one very important branch of social networks, scientific collaboration networks have been studied for decades, since large online publication databases are easy to access and many user attributes are available. Academic collaborations have become regular, and the connections among researchers have become closer, owing to the prosperity of globalized academic communication. It has been found that many computer science conferences are closed communities in terms of the acceptance of newcomers' papers, especially the well-regarded conferences \cite{cabot2018cs}. However, an in-depth study of how conferences differ in closeness and structural features, and of what causes these differences, is still missing. Also, reviewing the strong and weak tie theories, the combination of these two types of ties exerts multifaceted influences in different contexts. More analysis is needed to determine whether the network is closed or has other properties. We envision that social connections play an increasing role in academic society and influence the paper selection process. 
The influences are not restricted to visible links but also extend to weak ties that connect two distant nodes. Previous studies of coauthor networks did not adequately consider the central role of some authors in the publication venues, such as the program committee (PC) chairs of conferences. Such people could influence the evolutionary patterns of coauthor networks owing to their authority, the trust they are given in selecting accepted papers, and their core positions in the community. Thus, in addition to the ratio of newcomers' papers, it would be interesting to quantify PC chair related metrics to measure the closure of a conference from the perspective of established authors' papers. Additionally, analyzing how conferences differ in the evolution of their coauthor networks and in their degree of closeness may reveal how closed communities form. We therefore present several different outcomes that arise from the varied structural characteristics of several typical conferences. In this paper, using the DBLP dataset of computer science publications and a PC chair dataset, we show evidence of the existence of strong and weak ties in coauthor networks, and we confirm that PC chairs' influence is related to tie strength and network structural properties. Several PC chair related metrics based on coauthor networks are introduced to measure the closure and efficiency of a conference. 2021-10-2
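
The abstract above reports prediction quality as a mean absolute error (MAE) below 10%. As a reading aid, here is a minimal sketch of how such a figure is computed, assuming performance is scored on a 0-100 scale so that the MAE reads directly in percentage points. The function name and the sample scores are hypothetical and do not come from the thesis.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up scores on an assumed 0-100 scale, for illustration only.
actual_scores = [82.0, 67.0, 91.0, 74.0]
predicted_scores = [78.0, 70.0, 85.0, 80.0]

mae = mean_absolute_error(actual_scores, predicted_scores)
print(f"MAE = {mae:.2f} points")
```

Under that scale assumption, a model meets the reported bound when `mae` stays below 10.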

    Setting the Future of Digital and Social Media Marketing Research: Perspectives and Research Propositions

    In press. The use of the internet and social media has changed consumer behavior and the ways in which companies conduct their business. Social and digital marketing offers significant opportunities to organizations through lower costs, improved brand awareness and increased sales. However, significant challenges arise from negative electronic word-of-mouth as well as intrusive and irritating online brand presence. This article brings together the collective insight of several leading experts on issues relating to digital and social media marketing. The experts' perspectives offer a detailed narrative on key aspects of this important topic as well as perspectives on more specific issues, including artificial intelligence, augmented reality marketing, digital content management, mobile marketing and advertising, B2B marketing, electronic word of mouth and the ethical issues therein. This research offers a significant and timely contribution to both researchers and practitioners in the form of challenges and opportunities: we highlight the limitations of current research, outline the research gaps and develop questions and propositions that can help advance knowledge within the domain of digital and social marketing. Peer reviewed

    Data-Driven Analysis towards Monitoring Software Evolution by Continuously Understanding Changes in Users’ Needs

    Software products, though always expected to provide satisfactory functionality and be bug-free, often fail to meet the expectations of their users. Thus, software maintenance is inevitable and critical for any software company that wants its products or services to remain profitable. Moreover, owing to the fierce competition in the contemporary software market, as well as the ease of user churn, monitoring and sustaining user satisfaction is a critical criterion for the long-term success of any software product during its evolution. To that end, continuously understanding and meeting users' needs and expectations is key, as it is more efficient and effective to allocate maintenance effort to the issues raised by users. Meanwhile, with the rapid development of internet technologies, the volume of user-generated content has been increasing exponentially. Within such content, customer feedback, whether numeric ratings, recommendations, or textual reviews, has been playing an increasingly critical role in product design by revealing customers' needs. Especially for software products that require constant maintenance and are continuously evolving, understanding users' needs and complaints, as well as how their opinions change over time, is of great importance. Additionally, advances in data mining and machine learning techniques largely reduce the effort required to discover knowledge from such data and, especially, to understand user behavior. 
However, although many studies propose data-driven approaches for feedback analysis, few specifically apply such methods to support software maintenance and evolution. Many studies focus on the text mining perspective of review analysis to elicit users' opinions. Many others focus on detecting and classifying feedback types, e.g., feature requests, bug reports, and expressions of emotion. To enhance the effectiveness of software maintenance and evolution practice, an effective approach is required for assessing the software's perceived user experience and monitoring its changes during evolution. To support software maintenance and evolution practice aimed at enhancing user satisfaction, we propose a data-driven user review analysis approach. This research aims to answer the following research questions: RQ1. How to analyze users' collective expectation and perceived quality in use with data-driven approaches by exploiting sentiment and topics? RQ2. How to monitor user satisfaction over software updates during software evolution using reviews' topics and sentiments? RQ3. How to analyze users' profiles, software types and situational contexts as contexts of use that support the analysis of user satisfaction? To answer RQ1, the thesis proposes a data-driven approach to evaluating user-perceived quality and extracting users' needs via sentiment analysis and topic modeling on a large volume of user review data. Building on that outcome, the answer to RQ2 comprises 1) an approach to monitoring changes in user opinion through software evolution by detecting similar topic pairs and 2) an approach to identifying problematic updates based on anomalies in the review sentiment distribution. 
To answer RQ3, a three-fold analysis is proposed: 1) analysis of situational contexts and ways of interaction, 2) analysis of user profiles and preferences, and 3) analysis of software types and related features. All the above approaches are validated by case studies. This thesis contributes to the examination of data-driven end-user review analysis methods that support software maintenance and evolution. Its main implication is to enrich the existing domain knowledge of software maintenance and evolution by taking advantage of the collective intelligence of end users. In addition, it makes a unique contribution to research on software evolution contexts in several meaningful respects and points to a potential interdisciplinary contribution as well. This thesis also contributes to software maintenance and evolution practice, and to the wider software industry, by proposing an effective series of approaches that address critical issues within it. It helps developers reduce their effort in release planning and other decision-making activities.
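
The abstract above mentions identifying problematic updates from anomalies in the review sentiment distribution but does not say how the anomalies are detected. The sketch below is therefore an illustrative stand-in, not the thesis's method: it assumes each review is already labelled with a version and a sentiment, and it flags versions whose share of negative reviews is a z-score outlier. The function names, the `z_threshold` default and the sample data are all hypothetical.

```python
from statistics import mean, pstdev


def negative_share_by_version(reviews):
    """Map each version to the fraction of its reviews labelled negative."""
    counts = {}
    for version, sentiment in reviews:
        total, negative = counts.get(version, (0, 0))
        counts[version] = (total + 1, negative + int(sentiment == "negative"))
    return {v: negative / total for v, (total, negative) in counts.items()}


def flag_problematic_versions(reviews, z_threshold=1.5):
    """Return versions whose negative share is unusually high (assumed rule)."""
    shares = negative_share_by_version(reviews)
    values = list(shares.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # every version looks the same, nothing stands out
    return sorted(v for v, s in shares.items() if (s - mu) / sigma > z_threshold)


# Made-up data: five reviews per version, one release with mostly negative ones.
reviews = []
for version, negatives in [("1.0", 1), ("1.1", 1), ("1.2", 1), ("1.3", 1), ("1.4", 4)]:
    reviews += [(version, "negative")] * negatives
    reviews += [(version, "positive")] * (5 - negatives)

flagged = flag_problematic_versions(reviews)
print(flagged)
```

A z-score on a single summary statistic is the simplest possible choice here; comparing full sentiment distributions between consecutive versions would be a natural refinement.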

    Mobile Consumers and Applications: Essays on Mobile Marketing


    Determinants of continuance intention and word of mouth for hotel branded mobile app users.

    This study examined the cognitive and affective factors that influence users' post-adoption behavioral intention. Specifically, based on the Expectation Confirmation Model (ECM) (Bhattacherjee, 2001b), the impact of cognitive factors (i.e., perceived usefulness, confirmation of expectations, mobility, personalization and responsiveness) and affective factors (i.e., satisfaction, perceived enjoyment) on hotel branded mobile application (app) users' continuance intention and word of mouth (WOM) was examined. Hospitality firms invest considerable resources in technology solutions aimed at improving the consumer experience. However, for these investments to be profitable, firms must ensure that technology solutions are used continuously and lead to post-adoptive behaviors such as continuance intention and WOM. Data for the study were collected from 550 hotel branded mobile app users. After the data were collected and cleaned, Partial Least Squares Structural Equation Modeling (PLS-SEM) was used to analyze them. The results of the structural model indicated that continuance intention and WOM were directly influenced by satisfaction and perceived enjoyment, with satisfaction exerting the most influence on continuance intention. Conversely, perceived enjoyment was most influential on WOM. All cognitive factors except responsiveness and perceived usefulness were found to influence satisfaction and enjoyment. The results show that contextual factors have a more significant impact than previously established constructs. The results of the study allow hoteliers and hospitality technology consultants to identify the influential factors affecting post-adoptive behaviors. The study extends the literature on post-adoptive behavior and the ECM by including context-specific factors (i.e. perceived mobility, personalization and responsiveness). 
This study contributes to the scarce literature in the lodging industry examining users' evaluations of mobile apps and post-adoptive behaviors. The study adds to the post-adoptive behavior literature by adding WOM as a second outcome alongside continuance intention. The treatment of contextual factors in this study made it possible to show the impact that technology characteristics have on technology post-adoption.