Finding influential ebay buyers for viral marketing - a conceptual model of BuyerRank
User Generated Content (UGC) plays a key role in current and future e-commerce in the era of Web 2.0. As an important form of UGC, the online auction site eBay has enjoyed tremendous growth rates since its inception. Many social networks emerge across different communities on eBay. The notion of viral marketing has been proposed in both literature and practice. However, in order to find the "opinion leader" in the social network, marketers need a sound analytic tool to rank potential buyers. To tackle this issue, this paper proposes BuyerRank, a Social Network Analysis (SNA) model that assists marketers in ranking potential buyers based on their future influence, estimated from their past auction/purchase behaviour on eBay. The paper also provides a detailed state-of-the-art review of recent work on SNA and viral marketing in the light of the Web 2.0 e-commerce context
Knowledge Extraction from Open Data Repository
The explosion of affluent social networks, online communities, and jointly generated information resources has accelerated the convergence of technological and social networks, producing environments that reveal both the framework of the underlying information arrangements and the collective formation of their members. In studying the consequences of these developments, we face the opportunity to analyze the POD repository at unprecedented scale levels and extract useful information from query log data. The aim of this chapter is to improve the performance of a POD repository from a different point of view. Firstly, we propose a novel query recommender system to help users shorten their query sessions. The idea is to find shortcuts to speed up the user interaction with the open data repository and decrease the number of queries submitted. The proposed model, based on pseudo-relevance feedback, formalizes how knowledge mined from query logs can be exploited to help users rapidly satisfy their information need
Exploring enterprises competition: From a perspective of massive recruitment texts mining
Extant research has made limited efforts to conduct competitive intelligence analysis based on recruitment texts. To fill the gap, this study proposes a method for deriving and analyzing competitive relationships, identifying competition paths, and calculating asymmetric competitiveness degrees, from the recruitment texts on e-recruiting websites. Specifically, this study developed a competitive evaluation index system for companies’ skill needs and resource base based on 53,171 job descriptions and 42,641 company profiles published by companies across 8 industries (including 35 industry segments) using automated text processing methods. Furthermore, in order to identify competitive paths and calculate the degree of asymmetric competitiveness, this study proposes a modified bipartite graph approach (i.e., MBGA) for competitive intelligence analysis of recruitment texts based on the competition evaluation index system. Experiments on a real-world dataset of the representative companies clearly validated the effectiveness of the method. Compared to the five state-of-the-art methods, MBGA performs better in disclosing the overall competition and is more accurate in terms of the error rating ratio (i.e., ERR) of the competition
Transforming user data into user value by novel mining techniques for extraction of web content, structure and usage patterns. The Development and Evaluation of New Web Mining Methods that enhance Information Retrieval and improve the Understanding of User's Web Behavior in Websites and Social Blogs.
The rapid growth of the World Wide Web in the last decade has made it the largest publicly accessible data source in the world and one of the most significant and influential information revolutions of modern times. The Web has influenced almost every aspect of human life, activities and fields, causing paradigm shifts and transformational changes in business, governance, and education. Moreover, the rapid evolution of Web 2.0 and the Social Web in the past few years, such as social blogs and friendship networking sites, has dramatically transformed the Web from a raw environment for information consumption to a dynamic and rich platform for information production and sharing worldwide. However, this growth and transformation of the Web has resulted in an uncontrollable explosion and abundance of textual content, creating a serious challenge for any user trying to find and retrieve the relevant information they truly seek on the Web. Finding a relevant Web page in a website easily and efficiently has become very difficult. This has created many challenges for researchers to develop new mining techniques in order to improve the user experience on the Web, as well as for organizations to understand the true informational interests and needs of their customers in order to improve their targeted services accordingly by providing the products, services and information that truly match the requirements of every online customer.
With these challenges in mind, Web mining aims to extract hidden patterns and discover useful knowledge from Web page contents, Web hyperlinks, and Web usage logs. Based on the primary kinds of Web data used in the mining process, Web mining tasks can be categorized into three main types: Web content mining, which extracts knowledge from Web page contents using text mining techniques; Web structure mining, which extracts patterns from the hyperlinks that represent the structure of the website; and Web usage mining, which mines users' Web navigational patterns from Web server logs that record the Web page accesses made by every user, representing the interactions between the users and the Web pages in a website. The main goal of this thesis is to contribute toward addressing the challenges that have resulted from the information explosion and overload on the Web, by proposing and developing novel Web mining-based approaches. Toward achieving this goal, the thesis presents, analyzes, and evaluates three major contributions. First, the development of an integrated Web structure and usage mining approach that recommends a collection of hyperlinks for the surfers of a website to be placed at the homepage of that website. Second, the development of an integrated Web content and usage mining approach to improve the understanding of the user's Web behavior and discover the user group interests in a website. Third, the development of a supervised classification model based on recent Social Web concepts, such as Tag Clouds, in order to improve the retrieval of relevant articles and posts from Web social blogs
Context-aware Services for Mobile Devices: From Architecture Design to Empirical Inference
Currently, mobile devices are aware of user position, which can be provided to mobile apps for the development of tailored services known as Location-Based Services. Further advances on current Location-Based Services (LBS), i.e. using any other information from the user such as gender, music preferences etc., may lead to a transition from a Location-Based environment to a fully developed Context-Aware environment. The current trend towards Context-aware Services (CAS) has been reflected in academic research for more than twenty years, as well as in the progress in Software Development Kits (SDKs) of the main mobile operating systems, where CAS frameworks are currently being used. However, there is no community agreement on modelling context in CAS, and little is known about the architecture of the context management frameworks of the mobile operating systems. Based on previous research in the area of CAS, I establish and analyse a reasoning architecture, the Context Engine (CE), that enables the main steps of designing and implementing context-aware services. The chief utility of CAS is their ability to formulate and encapsulate information, obtain user context through context acquisition tools and distribute it to third-party applications that build personalised services based on the provided information. The CE has the responsibility of selecting the optimal context acquisition tool to solve a concrete problem, which is discussed in this dissertation. Furthermore, this thesis contributes to the development of context inference tools by studying two particular cases. The first case aims at inferring user (semantic) location information based on mobile phone usage data. This first case has been carried out in collaboration with Microsoft Finland, which provides a similar context inference solution to mobile developers through their Software Development Kit (SDK). The second case aims at inferring user information based on social network information, i.e. inferring user information based on his or her connections. Both studies yield positive results and have the potential to be extended to obtain better context acquisition tools and, therefore, better user context
Temporal Information Models for Real-Time Microblog Search
Real-time search in Twitter and other social media services is often biased towards the most recent results due to the “in the moment” nature of topic trends and their ephemeral relevance to users and media in general. However, “in the moment”, it is often difficult to look at all emerging topics and single out the important ones from the rest of the social media chatter. This thesis proposes to leverage external sources to estimate the duration and burstiness of live Twitter topics. It extends preliminary research where it was shown that temporal re-ranking using external sources could indeed improve the accuracy of results. To further explore this topic we pursued three significant novel approaches: (1) multi-source information analysis that explores behavioral dynamics of users, such as Wikipedia live edits and page view streams, to detect topic trends and estimate the topic interest over time; (2) efficient methods for federated query expansion towards the improvement of query meaning; and (3) exploiting multiple sources towards the detection of temporal query intent. It differs from past approaches in the sense that it will work over real-time queries, leveraging live user-generated content. This approach contrasts with previous methods that require an offline preprocessing step
Prediction, evolution and privacy in social and affiliation networks
In the last few years, there has been a growing interest in studying online social and affiliation networks, leading to a new category of inference problems that consider the actor characteristics and their social environments. These problems have a variety of applications, from creating more effective marketing campaigns to designing better personalized services. Predictive statistical models allow learning hidden information automatically in these networks but also bring many privacy concerns. Three of the main challenges that I address in my thesis are understanding 1) how the complex observed and unobserved relationships among actors can help in building better behavior models, and in designing more accurate predictive algorithms, 2) what are the processes that drive the network growth and link formation, and 3) what are the implications of predictive algorithms to the privacy of users who share content online.
The majority of previous work in prediction, evolution and privacy in online social networks has concentrated on the single-mode networks which form around user-user links, such as friendship and email communication. However, single-mode networks often co-exist with two-mode affiliation networks in which users are linked to other entities, such as social groups, online content and events. We study the interplay between these two types of networks and show that analyzing these higher-order interactions can reveal dependencies that are difficult to extract from the pair-wise interactions alone. In particular, we present our contributions to the challenging problems of collective classification, link prediction, network evolution, anonymization and preserving privacy in social and affiliation networks. We evaluate our models on real-world data sets from well-known online social networks, such as Flickr, Facebook, Dogster and LiveJournal
Open Data
Open data is freely usable, reusable, or redistributable by anybody, provided there are safeguards in place that protect the data’s integrity and transparency. This book describes how data retrieved from public open data repositories can improve the learning qualities of digital networking, particularly performance and reliability. Chapters address such topics as knowledge extraction, Open Government Data (OGD), public dashboards, intrusion detection, and artificial intelligence in healthcare
Temporal models for mining, ranking and recommendation in the Web
Due to their first-hand, diverse and evolution-aware reflection of nearly all areas of life, heterogeneous temporal datasets, i.e., the Web, collaborative knowledge bases and social networks, have emerged as gold mines for content analytics of many sorts. In those collections, time plays an essential role in many crucial information retrieval and data mining tasks, ranging from user intent understanding and document ranking to advanced recommendations. There are two semantically close and important constituents when modeling along the time dimension, i.e., entity and event. Time crucially serves as the context for changes driven by happenings and phenomena (events) that relate to people, organizations or places (so-called entities) in our social lives. Thus, determining what users expect, or in other words, resolving the uncertainty confounded by temporal changes, is a compelling task to support consistent user satisfaction.
In this thesis, we address the aforementioned issues and propose temporal models that capture the temporal dynamics of such entities and events to serve for the end tasks. Specifically, we make the following contributions in this thesis:
(1) Query recommendation and document ranking in the Web - we address the issues of suggesting entity-centric queries and of ranking effectiveness around the time period of an associated event. In particular, we propose a multi-criteria optimization framework that facilitates the combination of multiple temporal models to smooth out the abrupt changes when transitioning between event phases for the former, and a probabilistic approach for search result diversification of temporally ambiguous queries for the latter.
(2) Entity relatedness in Wikipedia - we study the long-term dynamics of Wikipedia as a global memory place for high-impact events, specifically the reviving memories of past events. Additionally, we propose a neural network-based approach to measure the temporal relatedness of entities and events. The model engages different latent representations of an entity (i.e., from time, link-based graph and content) and uses the collective attention from user navigation as the supervision.
(3) Graph-based ranking and temporal anchor-text mining in Web Archives - we tackle the problem of discovering important documents along the time-span of Web Archives, leveraging the link graph. Specifically, we combine the problems of relevance, temporal authority, diversity and time in a unified framework. The model accounts for the incomplete link structure and natural time lagging in Web Archives in mining the temporal authority.
(4) Methods for enhancing predictive models at an early stage in social media and the clinical domain - we investigate several methods to control model instability and enrich the contexts of predictive models during the “cold-start” period. We demonstrate their effectiveness for the rumor detection and blood glucose prediction cases, respectively.
Overall, the findings presented in this thesis demonstrate the importance of tracking the temporal dynamics surrounding salient events and entities for IR applications. We show that determining such changes in time-based patterns and trends in prevalent temporal collections can better satisfy user expectations, and boost ranking and recommendation effectiveness over time