32 research outputs found

    Electrotextiles

    No full text

    Cluster Analysis in R with Big Data Applications

    No full text
    This chapter discusses several popular clustering functions and open source software packages in R and the feasibility of using them on larger datasets. These include the kmeans() function, the pvclust package, and the DBSCAN (density-based spatial clustering of applications with noise) package, which implement K-means, hierarchical, and density-based clustering, respectively. Dimension reduction methods such as PCA (principal component analysis) and SVD (singular value decomposition), as well as the choice of distance measure, are explored as ways to improve the performance of hierarchical and model-based clustering methods on larger datasets. These methods are illustrated through an application to a dataset of RNA-sequencing expression data for cancer patients obtained from the Cancer Genome Atlas Kidney Clear Cell Carcinoma (TCGA-KIRC) data collection from The Cancer Imaging Archive (TCIA).
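The K-means procedure named in this abstract can be sketched in a few lines. The chapter itself works in R with kmeans(); the Python below is only an illustrative sketch of Lloyd's algorithm, not the chapter's code. The function name, the toy points, and the choice to seed centroids with the first k points are all assumptions made for the example.

```python
import math

def kmeans(points, k, iterations=100):
    """Lloyd's algorithm on a list of equal-length numeric tuples."""
    # Assumption: the first k points seed the centroids (deterministic, for illustration only).
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid (Euclidean distance).
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid in place
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return labels, centroids

# Hypothetical toy data: two well-separated groups in two dimensions.
toy_points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels, centroids = kmeans(toy_points, k=2)
```

Each iteration compares every point with every centroid, so the cost grows with the number of dimensions; that is one reason reducing dimensionality first (for example with PCA) can help on larger datasets.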

    Secure Two-Party Association Rule Mining Based on One-Pass FP-Tree: a one-pass FP-tree method to perform association rule mining without compromising any data privacy between two parties

    No full text
    Data mining, often referred to as a major part of knowledge discovery in databases (KDD), is the process of discovering knowledge for business decision making by utilizing patterns or models that exist in data. This paper proposes a one-pass Frequent Pattern tree (FP-tree) method that performs association rule mining without compromising data privacy between the two parties.
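As a rough illustration of how a tree of transaction prefixes can be built in a single pass and then queried for itemset support, here is a minimal Python sketch. It is not the paper's method: it assumes a fixed lexicographic item order (the classic FP-tree orders items by frequency, which needs an extra counting pass), it omits header links and the two-party privacy protocol entirely, and the class, function names, and transactions are hypothetical.

```python
from collections import defaultdict

class Node:
    """One tree node: how many transactions pass through it, plus its children."""
    def __init__(self):
        self.count = 0
        self.children = defaultdict(Node)

def build_prefix_tree(transactions):
    # Single pass: insert each transaction with its items in lexicographic order.
    root = Node()
    for transaction in transactions:
        node = root
        for item in sorted(set(transaction)):
            node = node.children[item]
            node.count += 1
    return root

def _count(node, remaining):
    total = 0
    for item, child in node.children.items():
        if item == remaining[0]:
            # Matched the next wanted item; finish here or keep matching below.
            total += child.count if len(remaining) == 1 else _count(child, remaining[1:])
        elif item < remaining[0]:
            # The wanted item may still appear deeper on this path.
            total += _count(child, remaining)
        # item > remaining[0]: paths are sorted, so the wanted item cannot appear below.
    return total

def support(root, itemset):
    """Number of transactions containing every item of a non-empty itemset."""
    return _count(root, sorted(set(itemset)))

# Hypothetical transactions for illustration.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
tree = build_prefix_tree(transactions)
# Confidence of the rule bread -> milk is support({bread, milk}) / support({bread}).
confidence = support(tree, ["bread", "milk"]) / support(tree, ["bread"])
```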

    Semi-Automatic Ontology Construction by Exploiting Functional Dependencies and Association Rules

    No full text
    This paper presents a novel semi-automatic approach to constructing conceptual ontologies over structured data by exploiting both the schema and the content of the input dataset. It combines two well-founded database and data mining techniques, namely functional dependency discovery and association rule mining, to support domain experts in the construction of meaningful ontologies, tailored to the analyzed data, by using Description Logic (DL). To this aim, functional dependencies are first discovered to highlight valuable conceptual relationships among attributes of the data schema (i.e., among concepts). The set of discovered correlations supports analysts in the assertion of the TBox ontological statements (i.e., the statements involving shared data conceptualizations and their relationships). Then, the analyst-validated dependencies are exploited to drive the association rule mining process. Association rules represent relevant and hidden correlations among data content, and they are used to provide valuable knowledge at the instance level. Pushing functional dependency constraints into the rule mining process allows analysts to look into and exploit only the most significant data item recurrences in the assertion of the ABox ontological statements (i.e., the statements involving concept instances and their relationships).
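The first step described here, discovering functional dependencies among schema attributes, rests on a simple test: a dependency X -> Y holds when no two rows agree on X but differ on Y. The Python sketch below shows only that test, under stated assumptions: the function name, the attribute names, and the rows are hypothetical, and the paper's actual discovery algorithm and ontology mapping are not reproduced.

```python
def dependency_holds(rows, lhs, rhs):
    """Check whether the attributes in lhs functionally determine those in rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[attribute] for attribute in lhs)
        value = tuple(row[attribute] for attribute in rhs)
        # The first value seen for a key is stored; any later disagreement breaks the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical table for illustration.
rows = [
    {"city": "Turin", "country": "Italy", "customer": "A"},
    {"city": "Turin", "country": "Italy", "customer": "B"},
    {"city": "Lyon", "country": "France", "customer": "C"},
]
city_determines_country = dependency_holds(rows, ["city"], ["country"])
city_determines_customer = dependency_holds(rows, ["city"], ["customer"])
```

In this toy table, city determines country but not customer, so only the first dependency would be a candidate for an analyst to validate as a concept-level relationship.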