
    A cluster based hybrid feature selection approach

    Data collection and storage capacities have increased significantly in the past decades. To cope with the increasing complexity of data, feature selection methods have become an omnipresent preprocessing step in data analysis. In this paper we present a hybrid (filter-wrapper) feature selection method tailored for data classification problems. Our hybrid approach is composed of two stages. In the first stage, a filter clusters features to identify and remove redundancy. In the second stage, a wrapper evaluates different feature subsets produced by the filter, determining the one that produces the best classification performance in terms of accuracy. The effectiveness of our method is demonstrated through an empirical evaluation performed on real-world datasets coming from various sources. Funding: FAPESP (Grants #2011/04247-5 and #2013/18698-4), CNPq (Grant #304137/2013-8
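
The two stages described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' method: the abstract does not name the clustering algorithm, the classifier, or the subset search, so the correlation-based grouping, the leave-one-out 1-nearest-neighbour scorer, the exhaustive search, and all function names below are assumptions made for the example.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0


def cluster_features(X, threshold=0.9):
    """Filter stage (assumed): greedily group features whose absolute
    correlation with a cluster's first member reaches the threshold."""
    columns = list(zip(*X))
    clusters = []
    for j, col in enumerate(columns):
        for cluster in clusters:
            if abs(pearson(col, columns[cluster[0]])) >= threshold:
                cluster.append(j)
                break
        else:
            clusters.append([j])
    return clusters


def loo_1nn_accuracy(X, y, features):
    """Wrapper score (assumed): leave-one-out accuracy of a 1-NN classifier
    restricted to the given feature indices."""
    correct = 0
    for i, row in enumerate(X):
        nearest = min(
            (k for k in range(len(X)) if k != i),
            key=lambda k: sum((row[f] - X[k][f]) ** 2 for f in features),
        )
        correct += y[nearest] == y[i]
    return correct / len(X)


def hybrid_select(X, y, threshold=0.9):
    """Keep one representative per feature cluster, then search all subsets
    of the representatives for the best wrapper accuracy."""
    representatives = [c[0] for c in cluster_features(X, threshold)]
    best_subset, best_acc = None, -1.0
    for size in range(1, len(representatives) + 1):
        for subset in combinations(representatives, size):
            acc = loo_1nn_accuracy(X, y, subset)
            if acc > best_acc:  # strict: ties keep the smaller, earlier subset
                best_subset, best_acc = subset, acc
    return best_subset, best_acc
```

The exhaustive search is exponential in the number of clusters and is only practical for a handful of representatives; a greedy forward search would be the usual substitute on larger problems.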

    A hybrid supervised/unsupervised machine learning approach to solar flare prediction

    We introduce a hybrid approach to solar flare prediction, whereby a supervised regularization method is used to determine feature importance and an unsupervised clustering method is used to make the binary flare/no-flare decision. The approach is validated against NOAA SWPC data
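
One way to read the unsupervised half of this abstract is sketched below. It assumes, as an illustration only, that the supervised stage has already produced one real-valued score per sample; the score list, the function name `two_means_1d`, and the choice of plain two-cluster k-means are hypothetical and are not taken from the abstract. The clustering then replaces a hand-picked threshold for the flare/no-flare decision.

```python
def two_means_1d(scores, iterations=100):
    """Split real-valued scores into two clusters with 1-D k-means (k=2).

    Returns (threshold, labels), where labels[i] is 1 for the higher-scoring
    cluster (read here as "flare") and 0 otherwise.
    """
    low, high = min(scores), max(scores)  # initial cluster centres
    for _ in range(iterations):
        mid = (low + high) / 2
        lower = [s for s in scores if s <= mid]
        upper = [s for s in scores if s > mid]
        if not upper:  # all scores identical: nothing to split
            break
        new_low = sum(lower) / len(lower)
        new_high = sum(upper) / len(upper)
        if (new_low, new_high) == (low, high):  # converged
            break
        low, high = new_low, new_high
    threshold = (low + high) / 2
    labels = [int(s > threshold) for s in scores]
    return threshold, labels


# Hypothetical scores from some regularized regression model.
threshold, labels = two_means_1d([0.05, 0.1, 0.2, 0.8, 0.9, 0.15])
```

With these example scores the two highest values form the "flare" cluster and the threshold falls between the groups, without anyone choosing it by hand.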

    Submodular Load Clustering with Robust Principal Component Analysis

    Traditional load analysis is facing challenges with the new electricity usage patterns due to demand response as well as increasing deployment of distributed generations, including photovoltaics (PV), electric vehicles (EV), and energy storage systems (ESS). At the transmission level, despite irregular load behaviors in different areas, highly aggregated load shapes still share similar characteristics. Load clustering aims to discover such intrinsic patterns and provide useful information to other load applications, such as load forecasting and load modeling. This paper proposes an efficient submodular load clustering method for transmission-level load areas. Robust principal component analysis (R-PCA) first decomposes the annual load profiles into low-rank components and sparse components to extract key features. A novel submodular cluster center selection technique is then applied to determine the optimal cluster centers through a constructed similarity graph. Following the selection results, load areas are efficiently assigned to different clusters for further load analysis and applications. Numerical results obtained from PJM load demonstrate the effectiveness of the proposed approach. Comment: Accepted by 2019 IEEE PES General Meeting, Atlanta, G
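
The center-selection and assignment steps can be illustrated with a small sketch. The abstract does not state which submodular objective the paper uses, so the facility-location function below (a common submodular choice) is an assumption, as are the function names and the toy similarity matrix. The R-PCA feature extraction is omitted; the sketch starts from a precomputed, non-negative similarity matrix.

```python
def greedy_facility_location(similarity, k):
    """Greedily pick k centres maximizing f(S) = sum_i max_{j in S} similarity[i][j].

    `similarity` is a square matrix of non-negative pairwise similarities.
    Greedy selection carries the usual (1 - 1/e) guarantee for monotone
    submodular objectives such as this one.
    """
    n = len(similarity)
    selected = []
    coverage = [0.0] * n  # best similarity of each item to any chosen centre
    for _ in range(min(k, n)):
        best_gain, best_j = -1.0, None
        for j in range(n):
            if j in selected:
                continue
            # Marginal gain of adding j as a centre.
            gain = sum(max(similarity[i][j] - coverage[i], 0.0) for i in range(n))
            if gain > best_gain:
                best_gain, best_j = gain, j
        selected.append(best_j)
        coverage = [max(coverage[i], similarity[i][best_j]) for i in range(n)]
    return selected


def assign_to_centers(similarity, centers):
    """Assign every item to its most similar selected centre."""
    return [max(centers, key=lambda c: similarity[i][c]) for i in range(len(similarity))]


# Toy similarity graph: items 0-2 form one group, items 3-5 another.
sim = [
    [1.0, 0.9, 0.7, 0.1, 0.1, 0.1],
    [0.9, 1.0, 0.9, 0.1, 0.1, 0.1],
    [0.7, 0.9, 1.0, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 1.0, 0.8, 0.5],
    [0.1, 0.1, 0.1, 0.8, 1.0, 0.8],
    [0.1, 0.1, 0.1, 0.5, 0.8, 1.0],
]
centers = greedy_facility_location(sim, 2)
clusters = assign_to_centers(sim, centers)
```

On this toy matrix the greedy pass picks the most central item of each group, and the assignment step then places each remaining item with its nearest center.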

    StackInsights: Cognitive Learning for Hybrid Cloud Readiness

    Hybrid cloud is an integrated cloud computing environment utilizing a mix of public cloud, private cloud, and on-premise traditional IT infrastructures. Workload awareness, defined as a detailed, full-range understanding of each individual workload, is essential in implementing the hybrid cloud. While it is critical to perform an accurate analysis to determine which workloads are appropriate for on-premise deployment versus which workloads can be migrated to a cloud off-premise, the assessment is mainly performed by rule- or policy-based approaches. In this paper, we introduce StackInsights, a novel cognitive system to automatically analyze and predict the cloud readiness of workloads for an enterprise. Our system harnesses the critical metrics across the entire stack: 1) infrastructure metrics, 2) data relevance metrics, and 3) application taxonomy, to identify workloads that have characteristics of a) low sensitivity with respect to business security, criticality and compliance, and b) low response time requirements and access patterns. Since the capture of the data relevance metrics involves an intrusive and in-depth scanning of the content of storage objects, a machine learning model is applied to perform the business relevance classification by learning from the meta-level metrics harnessed across the stack. In contrast to traditional methods, StackInsights significantly reduces the total time for hybrid cloud readiness assessment by orders of magnitude