
    Improved Decision Tree Methodology for the Attributes of Unknown or Uncertain Characteristics – Construction Project Prospective

    The increasing use of computers leads organizations to accumulate data, creating a need for sophisticated data handling techniques. Many data handling concepts have evolved to support data analysis and knowledge discovery. Data warehousing and data mining techniques play an important role in data analysis for knowledge discovery. These techniques typically address four basic applications: data classification, data clustering, association between data, and finding sequential patterns in the data. Various algorithms for classification on large data sets have proved efficient in classifying variables of known or certain characteristics. However, they are less effective when applied to variables of unknown or uncertain characteristics, or when creating classes by combining multiple correlated variables in real-world settings. The paper presents a methodology that addresses two major issues of data classification using decision trees: 1) classification of variables of unknown or uncertain characteristics, and 2) creating classifications by combining multiple correlated variables.

    Concept analysis-based association mining from linked data: A case in industrial decision making

    Linked data (LD) is a rich format increasingly exploited in knowledge discovery from data (KDD). To that end, LD is typically structured as a graph, but can also fit the multi-relational data mining (MRDM) paradigm, e.g. as multiple types and object properties may be used in the dataset. Formal concept analysis (FCA) has been successfully used as a theoretical framework for KDD in a variety of applications, primarily in clustering and association rule mining (ARM) tasks. As FCA's applicability to LD is limited by its single data table input format, relational concept analysis (RCA) was introduced as an MRDM extension that successfully deals with links in the data, including cyclic ones. While RCA has mainly been adapted for conceptual clustering in the past, we present here an RCA-based ARM method. It exploits the iterative nature of pattern generation to cut cyclic references with a minimal loss of information. The utility of the rules discovered by our method has been validated by an application as decision support in the aluminum die casting industry.
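
As a rough illustration of the single-table FCA setting mentioned in this abstract (not the RCA-based method itself), the sketch below computes the two derivation operators, the closure of an attribute set, and the support and confidence of an association rule. The context, the object names and the attribute names are hypothetical and chosen only for illustration.

```python
# Minimal sketch of single-table formal concept analysis (FCA).
# The formal context below is hypothetical: objects mapped to their attributes.
context = {
    "part_a": {"porosity", "high_temp"},
    "part_b": {"porosity", "high_temp", "thin_wall"},
    "part_c": {"thin_wall"},
    "part_d": {"porosity", "high_temp"},
}


def extent(attrs, ctx):
    """Objects that own every attribute in attrs."""
    attrs = set(attrs)
    return {obj for obj, owned in ctx.items() if attrs <= owned}


def intent(objs, ctx):
    """Attributes shared by every object in objs."""
    objs = list(objs)
    if not objs:
        # By convention, the empty object set shares all attributes.
        return set().union(*ctx.values())
    return set.intersection(*(ctx[o] for o in objs))


def closure(attrs, ctx):
    """Closure of an attribute set: intent of its extent."""
    return intent(extent(attrs, ctx), ctx)


def rule_stats(premise, conclusion, ctx):
    """Support and confidence of the rule premise -> conclusion."""
    covered = extent(premise, ctx)
    both = extent(set(premise) | set(conclusion), ctx)
    support = len(both) / len(ctx)
    confidence = len(both) / len(covered) if covered else 0.0
    return support, confidence


closed = closure({"porosity"}, context)
exact_rule = rule_stats({"porosity"}, {"high_temp"}, context)
partial_rule = rule_stats({"thin_wall"}, {"porosity"}, context)
```

In this toy context every object with `porosity` also has `high_temp`, so the closure of `{"porosity"}` contains both attributes and the corresponding rule holds with confidence 1. RCA extends this idea to several linked tables, which this sketch does not attempt.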

    Discovering medication patterns for high-complexity drug-using diseases through electronic medical records

    An Electronic Medical Record (EMR) is a professional document that contains all data generated during the treatment process. The EMR can utilize various data formats, such as numerical data, text, and images. Mining the information and knowledge hidden in the huge amount of EMR data is an essential requirement for clinical decision support, such as clinical pathway formulation and evidence-based medical research. In this paper, we propose a machine-learning-based framework to mine the hidden medication patterns in EMR text. The framework systematically integrates the Jaccard similarity evaluation, spectral clustering, the modified Latent Dirichlet Allocation and cross-matching among multiple features to find the residuals that describe additional knowledge and clusters hidden in multiple perspectives of highly complex medication patterns. These methods work together, step by step to reveal the underlying medication pattern. We evaluated the method by using real data from EMR text (patients with cirrhotic ascites) from a large hospital in China. The proposed framework outperforms other approaches for medication pattern discovery, especially for this disease with subtle medication treatment variances. The results also revealed little overlap among the discovered patterns; thus, the distinct features of each pattern are well studied through the proposed framework
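
The Jaccard similarity evaluation named in this abstract can be sketched as follows. This is a minimal illustration, assuming each record has already been reduced to a set of medication names; the drug names are hypothetical sample data, and the framework's actual preprocessing is not described here.

```python
# Minimal sketch of pairwise Jaccard similarity between medication sets.
# The records below are hypothetical and serve only as illustration.

def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention: two empty sets are treated as identical
    return len(a & b) / len(a | b)


def similarity_matrix(records):
    """Symmetric matrix of pairwise Jaccard similarities."""
    return [[jaccard(x, y) for y in records] for x in records]


prescriptions = [
    {"furosemide", "spironolactone", "albumin"},
    {"furosemide", "spironolactone"},
    {"albumin", "lactulose"},
]
matrix = similarity_matrix(prescriptions)
```

A matrix of this kind could serve as the affinity input to a spectral clustering step, although how the framework connects the two stages is an assumption here.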

    Sentimental classification analysis of polarity multi-view textual data using data mining techniques

    The data and information available in most community environments are complex in nature. Sentimental data resources may consist of textual data collected from multiple information sources with different representations, usually handled by different analytical models. Data resources with these characteristics can form multi-view polarity textual data. However, knowledge creation from this type of sentimental textual data requires considerable analytical effort and capability. In particular, data mining practices can provide exceptional results in handling textual data formats. Moreover, when the textual data exist in multi-view or unstructured formats, hybrid and integrated analysis with text data mining algorithms is vital to obtaining useful results. The objective of this research is to enhance knowledge discovery from sentimental multi-view textual data, which can be considered an unstructured data format, by classifying polarity documents into two different categories of useful information. This paper discusses a proposed framework with integrated data mining algorithms, which applies the X-means algorithm for clustering and the HotSpot algorithm for association rules. The analysis results show improved accuracy in classifying sentimental multi-view textual data into two categories when the proposed framework is applied to an online polarity user-reviews dataset on given topics.