
    Compiler and runtime support for shared memory parallelization of data mining algorithms

    Abstract. Data mining techniques focus on finding novel and useful patterns or models from large datasets. Because of the volume of the data to be analyzed, the amount of computation involved, and the need for rapid or even interactive analysis, data mining applications require the use of parallel machines. We have been developing compiler and runtime support for developing scalable implementations of data mining algorithms. Our work encompasses shared memory parallelization, distributed memory parallelization, and optimizations for processing disk-resident datasets. In this paper, we focus on compiler and runtime support for shared memory parallelization of data mining algorithms. We have developed a set of parallelization techniques that apply across algorithms for a variety of mining tasks. We describe the interface of the middleware where these techniques are implemented. Then, we present compiler techniques for translating data parallel code to the middleware specification. Finally, we present a brief evaluation of our compiler using apriori association mining and k-means clustering.
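    The abstract above does not reproduce the middleware interface itself. As a rough illustration of the data-parallel, reduction-style structure that shared memory parallelization of a mining algorithm such as k-means typically relies on, the sketch below splits the assignment step across workers and merges per-chunk partial sums and counts; all function names and the use of Python's ThreadPoolExecutor are illustrative assumptions, not the authors' system.

```python
# Hypothetical sketch of a shared-memory, data-parallel k-means iteration.
# This is NOT the paper's middleware API; chunked_assign and kmeans_step
# are illustrative names chosen for this example only.
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def chunked_assign(chunk, centroids):
    """Assign each point in a chunk to its nearest centroid and return
    per-cluster partial sums and counts (a local reduction object)."""
    dists = np.linalg.norm(chunk[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    k, d = centroids.shape
    sums = np.zeros((k, d))
    counts = np.zeros(k, dtype=int)
    for j in range(k):
        mask = labels == j
        sums[j] = chunk[mask].sum(axis=0)
        counts[j] = mask.sum()
    return sums, counts

def kmeans_step(points, centroids, n_workers=4):
    """One data-parallel k-means iteration: split the points into chunks,
    reduce the partial results, then recompute the centroids globally."""
    chunks = np.array_split(points, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(chunked_assign, chunks, [centroids] * n_workers))
    sums = sum(p[0] for p in partials)
    counts = sum(p[1] for p in partials)
    # Keep the old centroid for any empty cluster to avoid division by zero.
    return np.where(counts[:, None] > 0,
                    sums / np.maximum(counts, 1)[:, None],
                    centroids)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.normal(size=(1000, 2))
    cents = pts[:3].copy()
    for _ in range(10):
        cents = kmeans_step(pts, cents)
    print(cents)
```

    A process pool or an OpenMP-style runtime would be the more usual choice for CPU-bound work; threads are used here only to keep the sketch short, relying on NumPy releasing the GIL during the array operations.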

    Classification Techniques in Quantitative Comparative Research: A Meta-Comparison


    EXPLOITING HIGHER ORDER UNCERTAINTY IN IMAGE ANALYSIS

    Soft computing is a group of methodologies that work synergistically to provide flexible information processing capability for handling real-life ambiguous situations. Its aim is to exploit the tolerance for imprecision, uncertainty, approximate reasoning, and partial truth in order to achieve tractability, robustness, and low-cost solutions. Soft computing methodologies (involving fuzzy sets, neural networks, genetic algorithms, and rough sets) have been successfully employed in various image processing tasks, including image segmentation, enhancement and classification, both individually and in combination with other soft computing techniques. This success stems from the fact that soft computing techniques provide powerful tools for describing the uncertainty naturally embedded in images, which can be exploited in various image processing tasks. The main contribution of this thesis is to present tools for handling uncertainty by means of a rough-fuzzy framework for exploiting feature-level uncertainty. The first contribution is the definition of a general framework based on the hybridization of rough and fuzzy sets, along with a new operator called RF-product, as an effective solution to some problems in image analysis. The second and third contributions demonstrate the effectiveness of the proposed framework by presenting a compression method based on vector quantization, together with an analysis of its compression capabilities, and an HSV color image segmentation technique.
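    The RF-product operator and the rough-fuzzy framework are only named in the abstract, so they are not reproduced here. The sketch below is a generic fuzzy-membership take on the HSV color segmentation the last sentence alludes to: pixels are scored by a triangular membership function over hue and thresholded. The membership shape, parameter names and default values are illustrative assumptions, not the thesis' method.

```python
# Hypothetical sketch of fuzzy HSV hue segmentation; the thesis' rough-fuzzy
# RF-product operator is not reproduced here, and all names are illustrative.
import numpy as np

def triangular_membership(x, a, b, c):
    """Triangular fuzzy membership: rises from a to a peak at b, falls to c."""
    x = np.asarray(x, dtype=float)
    left = np.clip((x - a) / max(b - a, 1e-9), 0.0, 1.0)
    right = np.clip((c - x) / max(c - b, 1e-9), 0.0, 1.0)
    return np.minimum(left, right)

def segment_by_hue(hsv_image, hue_center=0.33, hue_width=0.10, threshold=0.5):
    """Label pixels whose fuzzy membership to a hue band exceeds a threshold.
    hsv_image: float array of shape (H, W, 3) with all channels in [0, 1]."""
    hue = hsv_image[..., 0]
    membership = triangular_membership(hue, hue_center - hue_width,
                                       hue_center, hue_center + hue_width)
    return membership > threshold  # boolean mask of pixels in the hue band

if __name__ == "__main__":
    img = np.random.rand(64, 64, 3)   # stand-in for a real HSV image
    mask = segment_by_hue(img)
    print(mask.mean())                # fraction of pixels in the segment
```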

    The 1st Conference of PhD Students in Computer Science


    Front Matter - Soft Computing for Data Mining Applications

    Efficient tools and algorithms for knowledge discovery in large data sets have been devised in recent years. These methods exploit the capability of computers to search huge amounts of data in a fast and effective manner. However, the data to be analyzed is imprecise and afflicted with uncertainty. In the case of heterogeneous data sources such as text, audio and video, the data might moreover be ambiguous and partly conflicting. Besides, patterns and relationships of interest are usually vague and approximate. Thus, in order to make the information mining process more robust, or, so to say, human-like, methods for searching and learning require tolerance towards imprecision, uncertainty and exceptions; they must have approximate reasoning capabilities and be capable of handling partial truth. Properties of this kind are typical of soft computing. Soft computing techniques like Genetic

    ISIPTA'07: Proceedings of the Fifth International Symposium on Imprecise Probability: Theories and Applications


    Criteria of Empirical Significance: Foundations, Relations, Applications

    This dissertation consists of three parts. Part I is a defense of an artificial language methodology in philosophy and a historical and systematic defense of the logical empiricists' application of an artificial language methodology to scientific theories. These defenses provide a justification for the presumptions of a host of criteria of empirical significance, which I analyze, compare, and develop in part II. On the basis of this analysis, in part III I use a variety of criteria to evaluate the scientific status of intelligent design, and further discuss confirmation, reduction, and concept formation.

    Knowledge discovery for moderating collaborative projects

    In today's global market environment, enterprises are increasingly turning towards collaboration in projects to leverage their resources, skills and expertise, and simultaneously address the challenges posed in diverse and competitive markets. Moderators, which are knowledge-based systems, have successfully been used to support collaborative teams by raising awareness of problems or conflicts. However, the functioning of a Moderator is limited to the knowledge it has about the team members. Knowledge acquisition, learning and updating of knowledge are the major challenges for a Moderator's implementation. To address these challenges, a Knowledge discOvery And daTa minINg inteGrated (KOATING) framework is presented for Moderators, enabling them to continuously learn from the operational databases of the company and semi-automatically update the corresponding expert module. The architecture of the Universal Knowledge Moderator (UKM) shows how existing Moderators can be extended to support global manufacturing. A method for designing and developing the knowledge acquisition module of the Moderator, for manual and semi-automatic update of knowledge, is documented using the Unified Modelling Language (UML). UML has been used to explore the static structure and dynamic behaviour, and to describe the system analysis, system design and system development aspects of the proposed KOATING framework. The proof of design is presented using a case study of a collaborative project in the form of a construction project supply chain. It is shown that Moderators can "learn" by extracting various kinds of knowledge from Post Project Reports (PPRs) using different types of text mining techniques. Furthermore, it is also proposed that knowledge discovery integrated Moderators can be used to support and enhance collaboration by identifying appropriate business opportunities and corresponding partners for the creation of a virtual organization. A case study is presented in the context of a UK-based SME. Finally, the thesis concludes by summarizing the work, outlining its novelties and contributions, and recommending future research.
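    As a minimal, hypothetical illustration of the kind of text mining applied to Post Project Reports (and not the KOATING framework or the thesis' actual techniques), the snippet below extracts frequent candidate terms from a PPR passage as raw material for a Moderator's knowledge-acquisition step; the tokenizer, stop-word list and function names are assumptions made for the example.

```python
# Minimal sketch, not the KOATING framework: pull frequent candidate terms
# out of Post Project Report (PPR) text. All names and the stop-word list
# are illustrative.
import re
from collections import Counter

STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "on", "for",
              "was", "were", "is", "are", "with", "by", "that", "this"}

def frequent_terms(ppr_text, top_n=10):
    """Tokenise a report, drop stop words, and return the most common terms."""
    tokens = re.findall(r"[a-z]+", ppr_text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS and len(t) > 2)
    return counts.most_common(top_n)

if __name__ == "__main__":
    sample = ("The supplier delivery was delayed by two weeks, causing a conflict "
              "between the design team and the construction contractor over rework costs.")
    print(frequent_terms(sample, top_n=5))
```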
