Web Mining Research: A Survey
With the huge amount of information available online, the World Wide Web is a
fertile area for data mining research. Web mining research sits at the
crossroads of several research communities, such as databases, information
retrieval and, within AI, especially the sub-areas of machine learning and
natural language processing. However, there is a lot of confusion when
comparing research efforts from different points of view. In this paper,
we survey the research in the area of Web mining, point out some confusion
regarding the usage of the term Web mining and suggest three Web mining
categories. Then we situate some of the research with respect to these three
categories. We also explore the connection between the Web mining categories
and the related agent paradigm. For the survey, we focus on representation
issues, on the process, on the learning algorithm, and on the application of
the recent works as the criteria. We conclude the paper with some research
issues.
Comment: 15 pages
Multi Relational Data Mining Approaches: A Data Mining Technique
The multi-relational data mining (MRDM) approach has developed as an
alternative way of handling structured data such as that held in an RDBMS. It
allows mining to be performed on multiple tables directly. In MRDM, the
patterns span multiple tables (relations) of a relational database. Because
the data are spread over many tables, many problems arise in the practice of
data mining. To deal with this, one either constructs a single table by
propositionalisation or uses a multi-relational data mining algorithm. MRDM
approaches have been successfully applied in the area of bioinformatics. Three
popular pattern-finding techniques, classification, clustering and
association, are frequently used in MRDM. We use the MRDM technique to avoid
expensive join operations and semantic losses. This paper focuses on some of
the application areas of MRDM and future directions, as well as a comparison
of ILP, GM, SSDM and MRDM.
Comment: 10 pages, 1 figure, 3 tables. Published with International Journal of
Computer Applications (IJCA)
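The propositionalisation route mentioned in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical two-table schema (`customers` and `orders`, neither taken from the paper) and flattens the one-to-many relation into one row per customer by aggregating the child rows. The chosen aggregates are an assumption of this example, not the paper's method.

```python
from collections import defaultdict

# Hypothetical parent and child tables (illustrative data, not from the paper).
customers = [
    {"customer_id": 1, "age": 34},
    {"customer_id": 2, "age": 51},
]
orders = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 35.0},
    {"customer_id": 2, "amount": 10.0},
]


def propositionalise(customers, orders):
    """Flatten a one-to-many relation into a single table.

    Each customer row is extended with aggregate features computed
    over that customer's orders.
    """
    # Group order amounts by the foreign key.
    amounts = defaultdict(list)
    for order in orders:
        amounts[order["customer_id"]].append(order["amount"])

    rows = []
    for customer in customers:
        values = amounts.get(customer["customer_id"], [])
        rows.append({
            **customer,
            "n_orders": len(values),
            "total_amount": sum(values),
            "max_amount": max(values, default=0.0),
        })
    return rows


flat_table = propositionalise(customers, orders)
for row in flat_table:
    print(row)
```

The aggregation discards which individual orders belong together, which is one form of the semantic loss that the abstract attributes to single-table approaches.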
An analytical framework for data stream mining techniques based on challenges and requirements
A growing number of applications that generate massive streams of data need
intelligent data processing and online analysis. Real-time surveillance
systems, telecommunication systems, sensor networks and other dynamic
environments are such examples. The pressing need to turn such data into
useful information and knowledge drives the development of systems,
algorithms and frameworks that address streaming challenges. The storage,
querying and mining of such data sets are highly computationally challenging
tasks. Mining data streams is concerned with extracting knowledge structures,
represented as models and patterns, from continuous streams of information.
Generally, the two main challenges are designing fast mining methods for data
streams and promptly detecting changing concepts and data distributions,
which arise from the highly dynamic nature of data streams. The goal of this
article is to analyze and classify the application of diverse data mining
techniques to the different challenges of data stream mining. In this paper,
we present the theoretical foundations of data stream analysis and propose an
analytical framework for data stream mining techniques.
A Survey on Web Multimedia Mining
Modern developments in digital media technologies have made transmitting and
storing large amounts of multi/rich media data (e.g. text, images, music, video
and their combination) more feasible and affordable than ever before. However,
the state-of-the-art techniques to process, mine and manage those rich media
are still in their infancy. Rapid advances in multimedia acquisition and
storage technology have led to a fast-growing amount of data stored in
databases. Useful information can be revealed to users if these multimedia
files are analyzed. Multimedia mining deals with the extraction of implicit
knowledge, multimedia data relationships, or other patterns not explicitly
stored in multimedia files. In retrieval, indexing and classification of
multimedia data, efficient information fusion of the different modalities is
also essential for the system's overall performance. The purpose of this paper
is to provide a systematic overview of multimedia mining. This article also
presents the issues in the application process component for multimedia
mining, followed by the multimedia mining models.
Comment: 13 Pages; The International Journal of Multimedia & Its Applications
(IJMA) Vol.3, No.3, August 201
An Algorithm for Mining High Utility Closed Itemsets and Generators
Traditional association rule mining based on the support-confidence framework
provides an objective measure of the rules that are of interest to users.
However, it does not reflect the utility of the rules. To extract
non-redundant association rules in the support-confidence framework, frequent
closed itemsets and their generators play an important role. To extract
non-redundant association rules among high utility itemsets, high utility
closed itemsets (HUCIs) and their generators should be extracted so that the
traditional support-confidence framework can be applied. However, no efficient
method exists at present for mining HUCIs with their generators. This paper
addresses this issue. A post-processing algorithm, called HUCI-Miner, is
proposed to mine HUCIs with their generators. The proposed algorithm is
implemented using both synthetic and real datasets.
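For readers unfamiliar with the support-confidence framework this abstract builds on, the following is a minimal sketch of the two standard measures. The transactions are made-up illustrative data, and the code is not HUCI-Miner and does not compute utilities.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for transaction in transactions if itemset <= transaction)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent.

    Defined as support(antecedent and consequent) / support(antecedent).
    """
    antecedent = frozenset(antecedent)
    both = antecedent | frozenset(consequent)
    base = support(antecedent, transactions)
    if base == 0:
        raise ValueError("antecedent never occurs in the transactions")
    return support(both, transactions) / base


# Illustrative market-basket data (an assumption of this example).
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
print(rule_support, rule_confidence)
```

Both measures depend only on how often items co-occur. Neither accounts for quantities or profit per item, which is the gap that utility-based mining targets.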
A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques
The amount of text that is generated every day is increasing dramatically.
This tremendous volume of mostly unstructured text cannot be simply processed
and perceived by computers. Therefore, efficient and effective techniques and
algorithms are required to discover useful patterns. Text mining is the task of
extracting meaningful information from text, and it has gained significant
attention in recent years. In this paper, we describe several of the most
fundamental text mining tasks and techniques, including text pre-processing,
classification and clustering. Additionally, we briefly explain text mining in
the biomedical and health care domains.
Comment: some of References format have update
Enabling Edge Cloud Intelligence for Activity Learning in Smart Home
We propose a novel activity learning framework based on Edge Cloud
architecture for the purpose of recognizing and predicting human activities.
Although activity recognition has been widely studied by many researchers, the
temporal features that constitute an activity, which can provide useful
insights for activity models, have not been exploited to their full potential
by mining algorithms. In this paper, we utilize temporal features for activity
recognition and prediction in a single smart home setting. We discover activity
patterns and temporal relations such as the order of activities from real data
to develop a prompting system. Analysis of real data collected from smart homes
was used to validate the proposed method.
Survey of state-of-the-art mixed data clustering algorithms
Mixed data comprises both numeric and categorical features, and mixed
datasets occur frequently in many domains, such as health, finance, and
marketing. Clustering is often applied to mixed datasets to find structures and
to group similar objects for further analysis. However, clustering mixed data
is challenging because it is difficult to directly apply mathematical
operations, such as summation or averaging, to the feature values of these
datasets. In this paper, we present a taxonomy for the study of mixed data
clustering algorithms by identifying five major research themes. We then
present a state-of-the-art review of the research works within each research
theme. We analyze the strengths and weaknesses of these methods with pointers
for future research directions. Lastly, we present an in-depth analysis of the
overall challenges in this field, highlight open research questions and discuss
guidelines to make progress in the field.
Comment: 20 Pages, 2 columns, 6 Tables, 209 References
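The difficulty this abstract describes, that sums and averages are undefined for categorical values, is often sidestepped by using a per-feature dissimilarity instead. Below is a minimal sketch of a Gower-style distance. It is offered as one common illustration and is not a method attributed to the survey. The records, feature names and numeric ranges are all hypothetical.

```python
def gower_distance(a, b, numeric_ranges):
    """Average per-feature dissimilarity between two mixed-type records.

    Numeric features contribute |a - b| divided by the feature's range;
    categorical features contribute 0 on a match and 1 on a mismatch.
    """
    total = 0.0
    for feature, value in a.items():
        if feature in numeric_ranges:
            spread = numeric_ranges[feature]
            # Numeric feature: absolute difference scaled by the range.
            total += abs(value - b[feature]) / spread if spread else 0.0
        else:
            # Categorical feature: simple mismatch indicator.
            total += 0.0 if value == b[feature] else 1.0
    return total / len(a)


# Hypothetical records and ranges (max minus min over an assumed dataset).
record_a = {"age": 30, "income": 40000, "sector": "health"}
record_b = {"age": 50, "income": 60000, "sector": "finance"}
ranges = {"age": 40, "income": 80000}

distance = gower_distance(record_a, record_b, ranges)
print(distance)
```

A pairwise distance like this can feed clustering methods that only need dissimilarities, such as hierarchical clustering or k-medoids, without ever averaging categorical values.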
Literature Review Of Attribute Level And Structure Level Data Linkage Techniques
Data Linkage is an important step that can provide valuable insights for
evidence-based decision making, especially for crucial events. Performing
sensible queries across heterogeneous databases containing millions of records
is a complex task that requires a complete understanding of each contributing
database's schema to define the structure of its information. The key aim is to
approximate the structure and content of the induced data into a concise
synopsis in order to extract and link meaningful data-driven facts. We identify
such problems as four major research issues in Data Linkage: associated costs
in pair-wise matching, record matching overheads, semantic flow of information
restrictions, and single order classification limitations. In this paper, we
give a literature review of research in Data Linkage. The purpose of this
review is to establish a basic understanding of Data Linkage and to discuss
the background of the Data Linkage research domain. In particular, we focus on
the literature related to the recent advancements in Approximate Matching
algorithms at Attribute Level and Structure Level. Their efficiency,
functionality and limitations are critically analysed and open-ended problems
are exposed.
Comment: 20 pages
Knowledge Discovery System For Fiber Reinforced Polymer Matrix Composite Laminate
In this paper, a Knowledge Discovery System (KDS) is proposed and implemented
for extracting knowledge, namely the mean stiffness of a polymer composite
material when fibers are placed at different orientations. The cosine
amplitude method is implemented for retrieving a compatible polymer matrix and
a reinforcement fiber belonging to the predicted fiber class from the polymer
and reinforcement databases, respectively, based on the design requirements.
Fuzzy classification rules that classify fibers into short, medium and long
fiber classes are derived from the fiber length and the computed or derived
critical length of the fiber. The longitudinal and transverse moduli of a
polymer matrix composite consisting of seven layers with different fiber
volume fractions and fiber orientations of 0, 15, 30, 45, 60, 75 and 90
degrees are analyzed through the rule-of-mixtures material design model. The
analysis results are represented in different graphical steps and have been
measured with statistical parameters. The data mining application implemented
here focuses on the mechanical problems of material design and analysis.
Therefore, this system is an expert decision support system for optimizing
material performance when designing light-weight, strong and cost-effective
polymer composite materials.
Comment: International Journal of Computing, Vol. 2, Issue 7, pp. 121-130,
July 2010. (ISSN 2151-9617)
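The rule-of-mixtures model named in this abstract can be sketched in its textbook form for a single unidirectional layer. The longitudinal modulus is the volume-weighted average of the fiber and matrix moduli, and the transverse modulus is the inverse rule of mixtures. The material values below are illustrative assumptions, not data from the paper, and the sketch does not cover the off-axis orientations (15 to 75 degrees) or the seven-layer laminate.

```python
def longitudinal_modulus(e_fiber, e_matrix, v_fiber):
    """Rule of mixtures along the fibers: E_L = Vf*Ef + (1 - Vf)*Em."""
    if not 0.0 <= v_fiber <= 1.0:
        raise ValueError("fiber volume fraction must lie in [0, 1]")
    return v_fiber * e_fiber + (1.0 - v_fiber) * e_matrix


def transverse_modulus(e_fiber, e_matrix, v_fiber):
    """Inverse rule of mixtures across the fibers: 1/E_T = Vf/Ef + (1 - Vf)/Em."""
    if not 0.0 <= v_fiber <= 1.0:
        raise ValueError("fiber volume fraction must lie in [0, 1]")
    return 1.0 / (v_fiber / e_fiber + (1.0 - v_fiber) / e_matrix)


# Illustrative moduli in GPa and volume fraction (assumed values).
E_FIBER = 70.0
E_MATRIX = 3.5
V_FIBER = 0.6

e_l = longitudinal_modulus(E_FIBER, E_MATRIX, V_FIBER)
e_t = transverse_modulus(E_FIBER, E_MATRIX, V_FIBER)
print(f"E_L = {e_l:.2f} GPa, E_T = {e_t:.2f} GPa")
```

In this model the transverse modulus stays close to the matrix modulus even at high fiber fractions, which is why fiber orientation matters so much to the stiffness of a laminate.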