4,009 research outputs found

BCS SGAI SMA 2013: the BCS SGAI workshop on social media analysis

Author
Publication venue: M. Jeusfeld
Publication date: 01/01/2013

Portsmouth University Research Portal (Pure)

Data Mining Techniques for Mining Query Logs in Web Search Engines

Author: Al-Hegami Ahmed
Al-Omaisi Hussein
Publication venue: HAL CCSD
Publication date: 01/04/2017

The Web is the biggest repository of documents humans have ever built, and it grows in size every day. Users rely on Web search engines (WSEs) to find information on the Web. By submitting a textual query expressing their information need, WSE users obtain a list of documents that are highly relevant to the query. Moreover, WSEs store a huge amount of user activity in query logs. Query log mining is the set of techniques that aim to extract valuable knowledge from query logs. This knowledge is one of the most widely used means of enhancing the users' search experience. The primary focus of this work is on introducing data mining techniques for mining query logs in Web search engines and showing how search engine applications may benefit from this mining.

Interactive Constrained Boolean Matrix Factorization

Author: Miettinen P.
Mukuze N.
Publication date: 01/01/2016

MPG.PuRe

Latitude, longitude, and beyond: mining mobile objects' behavior

Author: Baratchi Mitra
Publication venue: Centre for Telematics and Information Technology (CTIT)
Publication date: 24/06/2015

Rapid advancements in Micro-Electro-Mechanical Systems (MEMS) and wireless communications have resulted in a surge in data generation. Mobility data is one of the various forms of data that are ubiquitously collected by different location-sensing devices. Extensive knowledge about the behavior of humans and wildlife is buried in raw mobility data. This knowledge can be used to realize numerous viable applications, ranging from wildlife movement analysis to location-based recommendation systems, urban planning, and disaster relief. Against this background, this thesis mainly focuses on providing data analytics for understanding the behavior and interaction of mobile entities (humans and animals). To this end, the main research question to be addressed is: How can behaviors and interactions of mobile entities be determined from mobility data acquired by (mobile) wireless sensor nodes in an accurate and efficient manner? To answer this question, both application requirements and technological constraints are considered in this thesis. On the one hand, application requirements call for accurate data analytics to uncover hidden information about the individual behavior and social interaction of mobile entities, and to deal with the uncertainties in mobility data. Technological constraints, on the other hand, require these data analytics to be efficient in terms of energy consumption and to have a low memory footprint and low processing complexity.

University of Twente Research Information

Data Mining: The Next Generation

Author: Agrawal Rakesh
Bollinger Toni
Clifton Christopher W.
Dzeroski Saso
Freytag Johann-Christoph
Hipp Jochen
Keim Daniel
Kramer Stefan
Kriegel Hans-Peter
Leser Ulf
Liu Bing
Mannila Heikki
Meo Rosa
Morishita Shinichi
Ng Raymond
Pei Jian
Raghavan Prabhakar
Ramakrishnan Raghu
Spiliopoulou Myra
Srivastava Jaideep
Torra Vicenc
Publication venue: Dagstuhl Seminar Proceedings. 04292 - Perspectives Workshop: Data Mining: The Next Generation
Publication date: 01/01/2005

Dagstuhl Research Online Publication Server

Framework based on complex networks to model and mine patient pathways

Author: Gomes Antônio Tadeu Azevedo
Ito Márcia
Rosa Caroline de Oliveira Costa Souza
Vieira Alex Borges
Wehmuth Klaus
Publication date: 25/09/2023

The automatic discovery of a model to represent the history of encounters of a group of patients with the healthcare system, the so-called "pathway of patients", is a new field of research that supports clinical and organisational decisions to improve the quality and efficiency of the treatment provided. The pathways of patients with chronic conditions tend to vary significantly from one person to another, have repetitive tasks, and demand the analysis of multiple perspectives (interventions, diagnoses, medical specialities, among others) influencing the results. Therefore, modelling and mining those pathways is still a challenging task. In this work, we propose a framework comprising: (i) a pathway model based on a multi-aspect graph, (ii) a novel dissimilarity measurement to compare pathways taking the elapsed time into account, and (iii) a mining method based on traditional centrality measures to discover the most relevant steps of the pathways. We evaluated the framework using case studies of pregnancy and diabetes, which revealed its usefulness in finding clusters of similar pathways, representing them in an easy-to-interpret way, and highlighting the most significant patterns according to multiple perspectives. Comment: 35 pages, 11 figures, 2 appendices

arXiv.org e-Print Archive
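
The abstract above mentions mining pathways with traditional centrality measures. As a rough illustration only, the sketch below computes plain degree centrality over steps taken from a few hypothetical pathways. It assumes a simple undirected graph built from consecutive steps, not the multi-aspect graph the authors propose, and the pathway data and the function name `degree_centrality` are invented for this example.

```python
from collections import defaultdict
from typing import Dict, List, Set


def degree_centrality(pathways: List[List[str]]) -> Dict[str, float]:
    """Degree centrality of each step: distinct neighbours / (n - 1).

    Two steps are neighbours if they occur consecutively in any pathway.
    This is a simplification chosen for illustration.
    """
    neighbours: Dict[str, Set[str]] = defaultdict(set)
    for pathway in pathways:
        for step in pathway:
            neighbours[step]  # touching the key registers steps with no edges
        for a, b in zip(pathway, pathway[1:]):
            if a != b:  # ignore immediate repetitions of the same step
                neighbours[a].add(b)
                neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {step: 0.0 for step in neighbours}
    return {step: len(adj) / (n - 1) for step, adj in neighbours.items()}


# Hypothetical pathways (ordered lists of encounter types).
PATHWAYS = [
    ["GP visit", "blood test", "endocrinologist"],
    ["GP visit", "blood test", "nutritionist"],
    ["emergency", "blood test"],
]

scores = degree_centrality(PATHWAYS)
most_central = max(scores, key=scores.get)
```

With this toy data, the step shared by every pathway ends up with the highest score, which is the kind of "most relevant step" a centrality-based ranking is meant to surface.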

Combination of web usage, content and structure information for diverse web mining applications in the tourism context and the context of users with disabilities

Author: Lojo Novo Aizea
Publication date: 27/07/2015

188 p. This PhD focuses on the application of machine learning techniques for behaviour modelling in different types of websites. Using data mining techniques, two aspects which are problematic and difficult to solve have been addressed: getting the system to dynamically adapt to possible changes of user preferences, and extracting the information necessary to ensure the adaptation in a manner that is transparent to the users, without infringing on their privacy. The work combines information of different nature, such as usage information, content information and website structure, and uses appropriate web mining techniques to extract as much knowledge as possible from the websites. The extracted knowledge is used for different purposes, such as adapting websites to the users through proposals of interesting links, so that the users can get the relevant information more easily and comfortably; discovering interests or needs of users accessing the website and informing the service providers about them; or detecting problems during navigation. Systems have been successfully generated for two completely different fields: the field of tourism, working with the website of bidasoa turismo (www.bidasoaturismo.com), and the field of disabled people, working with the discapnet website (www.discapnet.com) from the ONCE/Tecnosite foundation.

Archivo Digital para la Docencia y la Investigación

Multidimensional process discovery

Author: Ribeiro J.T.S.
Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/2013

Repository TU/e

Pure OAI Repository

Recommending Best Products from E-commerce Purchase History and User Click Behavior Data

Author: Xiao Ying
Publication venue: University of Windsor Leddy Library
Publication date: 27/04/2018

In e-commerce collaborative filtering recommendation systems, the main input, the user-item rating matrix, is binary purchase data showing only which items a user has purchased recently. This matrix is usually sparse and does not provide much information about customer purchase history or product clickstream behavior (e.g., clicks, basket placement, and purchases), which could improve the accuracy of product recommendations. Existing e-commerce recommendation systems that use clickstream data include those referred to in this thesis as Kim05Rec, Kim11Rec, and Chen13Rec. Kim05Rec builds a decision tree on click behavior attributes such as search type and visit times, estimates the likelihood of a user putting products into the basket, and uses this information to enrich the user-item rating matrix. If a user clicked a product, Kim11Rec finds the products associated with it in three stages (click, basket, and purchase), uses the lift values from these stages to calculate a score, and then uses the score to make recommendations. Chen13Rec measures the similarity of users based on their category click patterns, such as click sequences, click times, and visit duration, and uses this similarity to enhance the collaborative filtering algorithm. However, the similarity between click sequences in sessions can, to some extent, carry over to purchases; for sessions without purchases in particular, it can be used to predict purchases for those session users. The existing systems have integrated neither this similarity nor the historical purchases, which show more than whether or not a user has purchased a product before. In this thesis, we propose HPCRec (Historical Purchase with Clickstream based Recommendation System) to enrich the ratings matrix in terms of both quantity and quality.
HPCRec first forms a normalized rating matrix with higher-quality ratings from historical purchases, then mines the consequential bond between clicks and purchases using weighted frequencies, where the weights are similarities between sessions; integrating this information improves rating quantity. The experimental results show that HPCRec is more accurate than these existing methods and is also capable of handling infrequent cases that the existing methods cannot.

Scholarship at UWindsor
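
The abstract above notes that Kim11Rec scores associated products using lift values. As a minimal sketch, the code below computes the standard association-rule lift, P(A and B) / (P(A) * P(B)), between a click event and a purchase event across sessions. It does not reproduce Kim11Rec's three-stage scoring or HPCRec itself; the session data, the "event:item" token format, and the function name `lift` are assumptions made for illustration.

```python
from typing import Iterable, Set


def lift(sessions: Iterable[Set[str]], antecedent: str, consequent: str) -> float:
    """Standard lift of antecedent -> consequent over a collection of sessions."""
    session_list = list(sessions)
    n = len(session_list)
    if n == 0:
        raise ValueError("need at least one session")
    count_a = sum(1 for s in session_list if antecedent in s)
    count_b = sum(1 for s in session_list if consequent in s)
    count_ab = sum(1 for s in session_list if antecedent in s and consequent in s)
    if count_a == 0 or count_b == 0:
        return 0.0  # lift is undefined here; 0.0 is an illustrative convention
    return (count_ab / n) / ((count_a / n) * (count_b / n))


# Hypothetical sessions: each set holds "event:item" tokens.
SESSIONS = [
    {"click:phone", "purchase:case"},
    {"click:phone", "purchase:case"},
    {"click:phone"},
    {"click:laptop", "purchase:bag"},
]

phone_to_case = lift(SESSIONS, "click:phone", "purchase:case")
```

A lift above 1 indicates that the two events co-occur more often than they would if they were independent, which is why it is a common basis for ranking associated products.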