Search CORE

3 research outputs found

A Metapattern-Based Automated Discovery Loop for Integrated Data Mining - Unsupervised Learning of Relational Patterns

Author: Bing Leng
Wei-Min Shen
Publication venue
Publication date: 01/01/1996
Field of study

Metapattern (also known as metaquery) is a new approach for integrated data mining systems. Different from a typical "tool-box" like integration, where components must be picked and chosen by users without much help, metapatterns provide a common representation for inter-component communication as well as a human interface for hypothesis development and search control. One weakness of this approach, however, is that the task of generating fruitful metapatterns is still a heavy burden for human users. In this paper, we describe a metapattern generator and an integrated discovery loop that can automatically generate metapatterns. Experiments in both artificial and real-world databases have shown that this new system goes beyond the existing machine learning technologies, and can discover relational patterns without requiring humans to pre-label the data as positive or negative examples for some given target concepts. With this technology, future data mining systems could discover high-qu..

CiteSeerX

Web-based strategies in the manufacturing industry

Author: Velásquez Luis Alexis
Publication venue
Publication date: 01/01/2000
Field of study

The explosive growth of Internet-based architectures is allowing an efficient access to information resources over geographically dispersed areas. This fact is exerting a major influence on current manufacturing practices. Business activities involving customers, partners, employees and suppliers are being rapidly and efficiently integrated through networked information management environments. Therefore, efforts are required to take advantage of distributed infrastructures that can satisfy information integration and collaborative work strategies in corporate environments. In this research, Internet-based distributed solutions focused on the manufacturing industry are proposed. Three different systems have been developed for the tooling sector, specifically for the company Seco Tools UK Ltd (industrial collaborator). They are summarised as follows. SELTOOL is a Web-based open tool selection system involving the analysis of technical criteria to establish appropriate selection of inserts, toolholders and cutting data for turning, threading and grooving operations. It has been oriented to world-wide Seco customers. SELTOOL provides an interactive and crossed-way of searching for tooling parameters, rather than conventional representation schemes provided by catalogues. Mechanisms were developed to filter, convert and migrate data from different formats to the database (SQL-based) used by SELTOOL.TTS (Tool Trials System) is a Web-based system developed by the author and two other researchers to support Seco sales engineers and technical staff, who would perform tooling trials in geographically dispersed machining centres and benefit from sharing data and results generated by these tests. Through TTS tooling engineers (authorised users) can submit and retrieve highly specific technical tooling data for both milling and turning operations. Moreover, it is possible for tooling engineers to avoid the execution of new tool trials knowing the results of trials carried out in physically distant places, when another engineer had previously executed these trials. The system incorporates encrypted security features suitable for restricted use on the World Wide Web. An urgent need exists for tools to make sense of raw data, extracting useful knowledge from increasingly large collections of data now being constructed and made available from networked information environments. This explosive growth in the availability of information is overwhelming the capabilities of traditional information management systems, to provide efficient ways of detecting anomalies and significant patterns in large sets of data. Inexorably, the tooling industry is generating valuable experimental data. It is a potential and unexplored sector regarding the application of knowledge capturing systems. Hence, to address this issue, a knowledge discovery system called DISKOVER was developed. DISKOVER is an integrated Java-application consisting of five data mining modules, able to be operated through the Internet. Kluster and Q-Fast are two of these modules, entirely developed by the author. Fuzzy-K has been developed by the author in collaboration with another research student in the group at Durham. The final two modules (R-Set and MQG) have been developed by another member of the Durham group. To develop Kluster, a complete clustering methodology was proposed. Kluster is a clustering application able to combine the analysis of quantitative as well as categorical data (conceptual clustering) to establish data classification processes. This module incorporates two original contributions. Specifically, consistent indicators to measure the quality of the final classification and application of optimisation methods to the final groups obtained. Kluster provides the possibility, to users, of introducing case-studies to generate cutting parameters for particular Input requirements. Fuzzy-K is an application having the advantages of hierarchical clustering, while applying fuzzy membership functions to support the generation of similarity measures. The implementation of fuzzy membership functions helped to optimise the grouping of categorical data containing missing or imprecise values. As the tooling database is accessed through the Internet, which is a relatively slow access platform, it was decided to rely on faster Information retrieval mechanisms. Q-fast is an SQL-based exploratory data analysis (EDA) application, Implemented for this purpose

Durham e-Theses

OpenGrey Repository