Web Usage Mining with Evolutionary Extraction of Temporal Fuzzy Association Rules

Author: Ahmadi Samad
Gongora Mario A.
Hopgood Adrian A.
Matthews Stephen G.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2013

In Web usage mining, fuzzy association rules that have a temporal property can provide useful knowledge about when associations occur. However, there is a problem with traditional temporal fuzzy association rule mining algorithms. Some rules occur at the intersection of fuzzy sets' boundaries where there is less support (lower membership), so the rules are lost. A genetic algorithm (GA)-based solution is described that uses the flexible nature of the 2-tuple linguistic representation to discover rules that occur at the intersection of fuzzy set boundaries. The GA-based approach is enhanced from previous work by including a graph representation and an improved fitness function. A comparison of the GA-based approach with a traditional approach on real-world Web log data discovered rules that were lost with the traditional approach. The GA-based approach is recommended as complementary to existing algorithms, because it discovers extra rules. (C) 2013 Elsevier B.V. All rights reserved
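
The boundary effect described in this abstract can be illustrated with a small sketch. It is not the paper's genetic algorithm. It only assumes triangular membership functions, hypothetical "short" and "medium" fuzzy sets over page-view duration, and a hand-picked lateral shift standing in for the displacement that a 2-tuple representation allows (in the paper, such a displacement would be searched for rather than fixed by hand). All names and values below are illustrative.

```python
def triangular(x, a, b, c):
    """Membership of x in a triangular fuzzy set with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzy_support(values, a, b, c, shift=0.0):
    """Mean membership of `values` in the set after moving it sideways by `shift`."""
    total = sum(triangular(v, a + shift, b + shift, c + shift) for v in values)
    return total / len(values)


# Hypothetical page-view durations (seconds) that cluster between two fuzzy sets.
durations = [28, 30, 31, 29, 32, 30]

short = (0, 15, 45)    # hypothetical "short" set
medium = (15, 45, 75)  # hypothetical "medium" set

# Fixed sets: the data sits on the boundary, so neither set gets high support.
support_short = fuzzy_support(durations, *short)
support_medium = fuzzy_support(durations, *medium)

# Displaced set: moving "medium" left by 15 s centres it on the data.
support_shifted = fuzzy_support(durations, *medium, shift=-15.0)

print(support_short, support_medium, support_shifted)
```

With fixed sets both supports come out at 0.5, which a typical minimum-support threshold could reject, while the displaced set reaches a much higher support on the same data.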

Open Repository and Bibliography - Liège

De Montfort University Open Research Archive

Explore Bristol Research

A Survey on Web Usage Mining

Author: Dr J. Vellingiri
S. Chenthur Pandian
Publication venue: Global Journals Inc. (US)
Publication date: 17/02/2011

Nowadays the World Wide Web has become very popular and interactive for transferring information. The web is huge, diverse and active, which raises issues of scalability, multimedia data and temporal matters. The growth of the web has resulted in a huge amount of information that is now freely offered for user access. The several kinds of data have to be handled and organized so that they can be accessed by many users effectively and efficiently. The use of data mining methods and knowledge discovery on the web is therefore attracting a growing number of researchers. Web usage mining is a kind of data mining method that can be useful in recommending web usage patterns with the help of users' sessions and behavior. Web usage mining includes three processes, namely preprocessing, pattern discovery and pattern analysis. Different techniques already exist for web usage mining, each with its own advantages and disadvantages. This paper presents a survey of some of the existing web usage mining techniques
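
The three processes named in this abstract (preprocessing, pattern discovery, pattern analysis) can be sketched end to end on toy data. This is a minimal illustration under stated assumptions, not a technique taken from the survey: the log records, the 30-minute session timeout, the choice of page-to-page transitions as the pattern type, and the 0.5 support threshold are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical, already-parsed log records: (user, timestamp in seconds, page).
log = [
    ("u1", 0, "/home"), ("u1", 40, "/products"), ("u1", 90, "/cart"),
    ("u2", 10, "/home"), ("u2", 50, "/products"),
    ("u1", 4000, "/home"), ("u1", 4030, "/products"),
]


def sessionize(records, timeout=1800):
    """Preprocessing: split each user's clicks into sessions at gaps longer than `timeout`."""
    by_user = defaultdict(list)
    for user, ts, page in sorted(records, key=lambda r: (r[0], r[1])):
        by_user[user].append((ts, page))
    sessions = []
    for clicks in by_user.values():
        current = [clicks[0][1]]
        for (prev_ts, _), (ts, page) in zip(clicks, clicks[1:]):
            if ts - prev_ts > timeout:
                sessions.append(current)
                current = []
            current.append(page)
        sessions.append(current)
    return sessions


def count_transitions(sessions):
    """Pattern discovery: count how many sessions contain each page-to-page transition."""
    counts = Counter()
    for pages in sessions:
        counts.update(set(zip(pages, pages[1:])))
    return counts


def frequent(counts, n_sessions, min_support=0.5):
    """Pattern analysis: keep transitions whose session support reaches `min_support`."""
    return {t: c / n_sessions for t, c in counts.items() if c / n_sessions >= min_support}


sessions = sessionize(log)
patterns = frequent(count_transitions(sessions), len(sessions))
print(sessions)
print(patterns)
```

On this toy log the long gap in user u1's clicks yields three sessions in total, and only the transition from /home to /products survives the support filter.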

Global Journal of Computer Science and Technology (GJCST)

Overcoming data scarcity of Twitter: using tweets as bootstrap with application to autism-related topic content analysis

Author: Agarwal A.
Autism
Blei D.
Bollen J.
Chang J.
Danial J. T.
Harrington J. W.
Harshavardhan A.
Higashida N.
Himelboim I.
Hutchings C.
Hviid A.
Ishwaran H.
Jacobson J. W.
Jashinsky J.
Jiang L.
Paul M. J.
Paul M. J.
Robinson B.
Russell M. A.
Scanfeld D.
Teh Y. W.
Teh Y. W.
Trembath D.
Verma S.
Warren Z.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2015

Notwithstanding recent work which has demonstrated the potential of using Twitter messages for content-specific data mining and analysis, the depth of such analysis is inherently limited by the scarcity of data imposed by the 140 character tweet limit. In this paper we describe a novel approach for targeted knowledge exploration which uses tweet content analysis as a preliminary step. This step is used to bootstrap more sophisticated data collection from directly related but much richer content sources. In particular we demonstrate that valuable information can be collected by following URLs included in tweets. We automatically extract content from the corresponding web pages and, treating each web page as a document linked to the original tweet, show how a temporal topic model based on a hierarchical Dirichlet process can be used to track the evolution of a complex topic structure of a Twitter community. Using autism-related tweets we demonstrate that our method is capable of capturing a much more meaningful picture of information exchange than user-chosen hashtags. Comment: IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, 201

arXiv.org e-Print Archive

Deakin Research Online

Crossref

Preprocessing and Content/Navigational Pages Identification as Premises for an Extended Web Usage Mining Model Development

Author: Dan Andrei SITAR TAUT
Daniel MICAN

Since its appearance, the internet has seen spectacular growth not only in the number of websites and the volume of information, but also in the number of visitors. This created the need for an overall analysis of both websites and the content they provide. Thus, a new branch of research was developed, namely web mining, that aims to discover useful information and knowledge, based not only on the analysis of websites and content, but also on the way in which users interact with them. The aim of the present paper is to design a database that captures only the relevant data from logs, in a way that allows large sets of temporal data to be stored and managed with common tools in real time. In our work, we rely on different websites or website sections with known architecture and we test several hypotheses from the literature in order to extend the framework to sites with unknown or chaotic structure, for which the type of visited pages is not transparent. In doing this, we start from non-proprietary, preexisting raw server logs. Keywords: Knowledge Management, Web Mining, Data Preprocessing, Decision Trees, Databases

Research Papers in Economics

Design and Development of a Text Mining Application for Clustering Lecturers' Research Titles Using the Shared Nearest Neighbor Method and Euclidean Similarity

Author: Mushlihudin Mushlihudin
Zahrotun Lisna
Publication venue: 'Universitas Ahmad Dahlan, Kampus 3'
Publication date: 30/12/2017

Data mining is the process of extracting hidden information and turning it into knowledge. Types of data mining include web mining, text mining, sequence mining, graph mining, temporal data mining, spatial data mining, distributed data mining and multimedia mining. Document clustering is one of the techniques of text mining. The aim of this research is to build an application that clusters lecturers' research titles using the shared nearest neighbor method. The method used in this research is one of the clustering methods in text mining, namely shared nearest neighbor (SNN) with Euclidean similarity. Testing was carried out using black box testing. The result of this research is a text mining application that is able to cluster lecturers' research titles
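
The shared nearest neighbor idea mentioned in this abstract can be sketched as follows. This is a generic illustration of SNN similarity, not the application described in the paper: the term-count vectors, the vocabulary size and the value of k are hypothetical, and plain Euclidean distance is assumed as the way neighbors are ranked.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_nearest(vectors, k):
    """For each vector, the set of indices of its k nearest other vectors."""
    neighbors = []
    for i, v in enumerate(vectors):
        others = sorted(
            (j for j in range(len(vectors)) if j != i),
            key=lambda j: euclidean(v, vectors[j]),
        )
        neighbors.append(set(others[:k]))
    return neighbors


def snn_similarity(neighbors, i, j):
    """Number of neighbors i and j share, counted only if each lists the other."""
    if i in neighbors[j] and j in neighbors[i]:
        return len(neighbors[i] & neighbors[j])
    return 0


# Hypothetical term-count vectors for six titles over a three-word vocabulary.
titles = [(3, 0, 0), (3, 1, 0), (2, 0, 0), (0, 0, 4), (0, 1, 4), (0, 0, 3)]
nn = k_nearest(titles, k=2)

print(snn_similarity(nn, 0, 1))  # titles from the same group
print(snn_similarity(nn, 0, 3))  # titles from different groups
```

Titles that appear in each other's neighbor lists and share further neighbors get a positive similarity, while titles from different groups get zero. A clustering step would then link pairs whose similarity exceeds a chosen threshold.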

Journal of Education and Learning (EduLearn)

UAD Journal Management System

Online data mining services for dynamic spatial databases I: system architecture and client applications

Author: Capeta Nuno
Cardoso Jorge C. S.
Carvalho Vasco
Costa Manuel
Fonseca Alexandra
Franco Ivan
Henriques Diana
Rosa Paulo
Sousa Inês
Teixeira Luis Miguel Lopes
Publication date: 01/05/2005

This paper describes online data mining services for dynamic spatial databases connected to environmental monitoring networks. These services can use Artificial Neural Networks as data mining techniques to find temporal relations in monitored parameters. The execution of the data mining algorithms is performed at the server side and a distributed processing scheme is used to overcome problems of scalability. To support the discovery of temporal relations, two other families of online services are made available: vectorial and raster visualization services and a sonification service. The use of this system is illustrated by the DM Plus client application and the SNIRH Data Mining Web site. The sonification service is described and illustrated in the part II paper

Repositório Institucional da Universidade Católica Portuguesa

Time-sensitive opinion mining for prediction

Author: Cheung D
Mamoulis N
Tu W
Publication date: 01/01/2015

Users commonly use Web 2.0 platforms to post their opinions and their predictions about future events (e.g., the movement of a stock). Therefore, opinion mining can be used as a tool for predicting future events. Previous work on opinion mining extracts from the text only the polarity of opinions as sentiment indicators. We observe that a typical opinion post also contains temporal references which can improve prediction. This short paper presents our preliminary work on extracting reference time tags and integrating them into an opinion mining model, in order to improve the accuracy of future event prediction. We conduct an experimental evaluation using a collection of microblogs posted by investors to demonstrate the effectiveness of our approach.

CiteSeerX

Association for the Advancement of Artificial Intelligence: AAAI Publications

HKU Scholars Hub

Temporal models for mining, ranking and recommendation in the Web

Author: Nguyen Tu
Publication venue: Hannover : Institutionelles Repositorium der Leibniz Universität Hannover
Publication date: 01/01/2020

Due to their first-hand, diverse and evolution-aware reflection of nearly all areas of life, heterogeneous temporal datasets, i.e., the Web, collaborative knowledge bases and social networks, have emerged as gold mines for content analytics of many sorts. In those collections, time plays an essential role in many crucial information retrieval and data mining tasks, ranging from user intent understanding and document ranking to advanced recommendations. There are two semantically close and important constituents when modeling along the time dimension, i.e., entity and event. Time crucially serves as the context for changes driven by happenings and phenomena (events) that relate to people, organizations or places (so-called entities) in our social lives. Thus, determining what users expect, or in other words, resolving the uncertainty confounded by temporal changes, is a compelling task to support consistent user satisfaction. In this thesis, we address the aforementioned issues and propose temporal models that capture the temporal dynamics of such entities and events to serve the end tasks. Specifically, we make the following contributions in this thesis: (1) Query recommendation and document ranking in the Web - we address the issues of suggesting entity-centric queries and of ranking effectiveness surrounding the happening time period of an associated event. In particular, we propose a multi-criteria optimization framework that facilitates the combination of multiple temporal models to smooth out the abrupt changes when transitioning between event phases for the former and a probabilistic approach for search result diversification of temporally ambiguous queries for the latter. (2) Entity relatedness in Wikipedia - we study the long-term dynamics of Wikipedia as a global memory place for high-impact events, specifically the reviving memories of past events.
Additionally, we propose a neural network-based approach to measure the temporal relatedness of entities and events. The model engages different latent representations of an entity (i.e., from time, link-based graph and content) and uses the collective attention from user navigation as the supervision. (3) Graph-based ranking and temporal anchor-text mining in Web Archives - we tackle the problem of discovering important documents along the time-span of Web Archives, leveraging the link graph. Specifically, we combine the problems of relevance, temporal authority, diversity and time in a unified framework. The model accounts for the incomplete link structure and natural time lagging in Web Archives in mining the temporal authority. (4) Methods for enhancing predictive models at early-stage in social media and clinical domain - we investigate several methods to control model instability and enrich contexts of predictive models at the “cold-start” period. We demonstrate their effectiveness for the rumor detection and blood glucose prediction cases respectively. Overall, the findings presented in this thesis demonstrate the importance of tracking the temporal dynamics surrounding salient events and entities for IR applications. We show that determining such changes in time-based patterns and trends in prevalent temporal collections can better satisfy user expectations, and boost ranking and recommendation effectiveness over time

Institutionelles Repositorium der Leibniz Universität Hannover