
    Possibilistic and fuzzy clustering methods for robust analysis of non-precise data

    This work focuses on robust clustering of data affected by imprecision. The imprecision is managed in terms of fuzzy sets. The clustering process is based on the fuzzy and possibilistic approaches. In both approaches the observations are assigned to the clusters by means of membership degrees. In fuzzy clustering the membership degrees express the degrees to which the observations are shared among the clusters, whereas in possibilistic clustering they are degrees of typicality. These two sources of information are complementary: the former helps to discover the best fuzzy partition of the observations, while the latter reflects how well the observations are described by the centroids and is therefore helpful for identifying outliers. First, a fully possibilistic k-means clustering procedure is suggested. Then, to exploit the benefits of both approaches, a joint possibilistic and fuzzy clustering method for fuzzy data is proposed, together with a selection procedure for choosing the parameters of the new clustering method. The effectiveness of the proposal is investigated by means of simulated and real-life data.
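
    As a minimal sketch of the distinction drawn above, the following Python fragment computes both kinds of membership for fixed centroids, using the textbook fuzzy c-means and possibilistic c-means update formulas on crisp data (the fuzzifier m and the scale parameters eta are assumptions of this sketch; the paper's extension to fuzzy, non-precise data is not reproduced here):

        import numpy as np

        def fcm_memberships(X, centers, m=2.0):
            # fuzzy c-means update: degrees of sharing, each row sums to 1
            d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1) + 1e-12
            inv = d2 ** (-1.0 / (m - 1.0))
            return inv / inv.sum(axis=1, keepdims=True)

        def pcm_typicalities(X, centers, eta, m=2.0):
            # possibilistic update: degrees of typicality with no sum-to-1
            # constraint, so an outlier scores low in every cluster
            d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
            return 1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0)))

    Because the typicality degrees are not forced to sum to one across clusters, an outlying observation receives a low degree in every cluster, which is the property that makes the possibilistic side useful for outlier identification.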

    Observer-biased bearing condition monitoring: from fault detection to multi-fault classification

    Bearings are simultaneously a fundamental component and one of the principal causes of failure in rotary machinery. This work focuses on the use of fuzzy clustering for bearing condition monitoring, i.e., fault detection and classification. The output of a clustering algorithm is a data partition (a set of clusters), which is merely a hypothesis on the structure of the data. This hypothesis requires validation by domain experts. In general, clustering algorithms allow only limited use of domain knowledge in the cluster formation process. In this study, a novel method allowing for interactive clustering in bearing fault diagnosis is proposed. The method resorts to shrinkage to generalize an otherwise unbiased clustering algorithm into a biased one. In this way, the method provides a natural and intuitive way to control the cluster formation process, allowing domain knowledge to guide it. The domain expert can select a desirable level of granularity, ranging from fault detection to classification of a variable number of faults, and can select a specific region of the feature space for detailed analysis. Moreover, experimental results under realistic conditions show that the adopted algorithm outperforms the corresponding unbiased algorithm (fuzzy c-means), which is widely used in this type of problem.
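
    The abstract does not give the shrinkage estimator itself, so the following is only a hypothetical illustration of how shrinkage can turn an unbiased centroid update into a biased one that respects expert input (the names anchors and lam are inventions of this sketch, not the paper's notation):

        import numpy as np

        def biased_centroid_update(centers, anchors, lam=0.3):
            # convex shrinkage of each centroid toward an expert-chosen anchor:
            # lam = 0 recovers the unbiased fuzzy c-means update, while larger
            # lam pulls cluster formation toward the expert's regions of interest
            return (1.0 - lam) * np.asarray(centers) + lam * np.asarray(anchors)

    Varying a single parameter of this kind is one natural way to realize the granularity control the paper describes, from coarse fault detection to finer multi-fault classification.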

    Building information modeling (BIM) and green building index (GBI) assessment framework for non-residential new construction building (NRNC)

    The global construction industry has endorsed Building Information Modeling (BIM) and its many advantages. Despite this endorsement, BIM has failed to attract Malaysian companies to its use in green building assessment, especially assessment under the Green Building Index (GBI), and to maintaining GBI certification during building occupancy using BIM features. The main issue in utilizing BIM as a GBI assessment tool is the applicability of BIM tools for the design team to digitalize GBI credits, i.e., the digitization of GBI criteria into the BIM model. This study aims to identify the common components related to the capability of BIM to digitalize and assess GBI criteria; these components include BIM uses and tools as well as GBI criteria and processes. The study applied both quantitative and qualitative approaches to collect data. The quantitative approach used questionnaires distributed to 900 GBI members, i.e., GBI certifiers and facilitators. The survey achieved a response rate of 32% over eight months of data collection, and the results were analyzed using SPSS and SmartPLS. Four model categories were identified, namely BIM uses, BIM tools, GBI criteria, and the GBI certification process, and these categories were used to assess the BIM–GBI framework. The questionnaire results showed that only 16 BIM uses must be included in the BIM execution plan of a GBI project for assessment purposes, and that the BIM tools affect the GBI criteria to different degrees. The capability of BIM to assess GBI is stronger in the design assessment (DA) than in the operation assessment, which supports the suggested BIM–GBI assessment framework. The second data collection was conducted through focus group interviews with BIM and GBI experts, held over two sessions. Results show that the assessment method correlates significantly within the BIM–GBI framework. The following categories were identified for the BIM assessment framework: BIM uses, BIM tools, and control, based on the GBI criteria for scoring and certification. Findings from the BIM and GBI assessment method framework show that GBI credits can be digitalized using different BIM uses and assessed directly or indirectly by BIM tools for each GBI credit in both GBI assessment processes. The qualitative results showed that BIM can help the design team achieve 55% of the points in the design assessment (DA) alone, enough for the building to achieve GBI certification at level 4 (certified rating), while the remaining 45% of GBI credit points can be digitalized in the completion and verification assessment (CVA). The framework guides the design team and facility management in digitalizing and assessing GBI criteria using BIM applications during the DA and CVA stages for new non-residential construction. It also offers insights that enable designers to understand the relationship between BIM and GBI criteria, contributing to BIM integration in Stage 3 and to automated GBI assessment for the Malaysian construction industry.

    Predictive intelligence to the edge through approximate collaborative context reasoning

    We focus on Internet of Things (IoT) environments in which a network of sensing and computing devices is responsible for locally processing contextual data, reasoning, and collaboratively inferring the appearance of a specific phenomenon (event). Pushing processing and knowledge inference to the edge of the IoT network allows the complexity of the event reasoning process to be distributed into many manageable pieces and physically located at the source of the contextual information. This enables a huge amount of rich data streams to be processed in real time that would be prohibitively complex and costly to deliver on a traditional centralized Cloud system. We propose a lightweight, energy-efficient, distributed, adaptive, multiple-context-perspective event reasoning model under uncertainty on each IoT device (sensor/actuator). Each device senses and processes context data and infers events based on different local context perspectives: (i) expert knowledge on event representation, (ii) outlier inference, and (iii) deviation from the locally predicted context. This novel approximate reasoning paradigm is achieved through a contextualized, collaborative, belief-driven clustering process, in which clusters of devices are formed according to their belief in the presence of events. Our distributed and federated intelligence model efficiently identifies localized abnormalities in the contextual data by aggregating local degrees of belief, and it updates and adjusts its knowledge to accommodate contextual data outliers and novelty detection. We provide a comprehensive experimental and comparative assessment of our model over real contextual data against other localized and centralized event detection models, and we show the benefits stemming from its adoption: up to three orders of magnitude less energy consumption and high quality of inference.
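
    A rough, hypothetical sketch of the three local context perspectives and the belief aggregation described above (the weights, scale, function names, and threshold are illustrative assumptions, not the paper's model):

        import numpy as np

        def local_belief(x, expert_rule, outlier_score, x_pred, scale=1.0,
                         w=(0.4, 0.3, 0.3)):
            b_rule = 1.0 if expert_rule(x) else 0.0    # (i) expert knowledge
            b_out = min(1.0, max(0.0, outlier_score))  # (ii) outlier inference
            b_dev = min(1.0, abs(x - x_pred) / scale)  # (iii) prediction deviation
            return w[0] * b_rule + w[1] * b_out + w[2] * b_dev

        def cluster_reports_event(beliefs, threshold=0.6):
            # federated decision: a cluster of devices reports the event once
            # its aggregated (here, mean) degree of belief crosses the threshold
            return float(np.mean(beliefs)) >= threshold

    In this reading, devices exchange only scalar degrees of belief rather than raw context streams, which is where the energy savings of edge-side reasoning come from.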

    A systematic review of data quality issues in knowledge discovery tasks

    The volume of data is growing rapidly because organizations continuously capture data in pursuit of better decision-making. The most fundamental challenge is to explore these large volumes of data and extract useful knowledge for future actions through knowledge discovery tasks; however, much of the data is of poor quality. We present a systematic review of data quality issues in knowledge discovery tasks and a case study applied to the agricultural disease known as coffee rust.

    Two-Stage Fuzzy Multiple Kernel Learning Based on Hilbert-Schmidt Independence Criterion

    Multiple kernel learning (MKL) is a principled approach to kernel combination and selection for a variety of learning tasks, such as classification, clustering, and dimensionality reduction. In this paper, we develop a novel fuzzy multiple kernel learning model based on the Hilbert-Schmidt independence criterion (HSIC) for classification, which we call HSIC-FMKL. In this model, we first propose an HSIC Lasso-based MKL formulation, which not only has a clear statistical interpretation, that minimally redundant kernels with maximal dependence on the output labels are found and combined, but also enables the global optimal solution to be computed efficiently by solving a Lasso optimization problem. Since the traditional support vector machine (SVM) is sensitive to outliers or noise in the dataset, a fuzzy SVM (FSVM) is used to select the prediction hypothesis once the optimal kernel has been obtained. The main advantage of the FSVM is that a fuzzy membership can be associated with each data point, so that different data points can have different effects on the training of the learning machine. We propose a new fuzzy membership function using a heuristic strategy based on the HSIC. The proposed HSIC-FMKL is a two-stage kernel learning approach in which the HSIC is applied in both stages. Extensive experiments on real-world datasets from the UCI benchmark repository and from the application domain of computational biology validate the superiority of the proposed model in terms of prediction accuracy.
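
    For concreteness, the biased empirical HSIC that both stages rely on can be computed from two n x n Gram matrices with the standard estimator below (the HSIC Lasso formulation and the fuzzy membership heuristic themselves are not reproduced here):

        import numpy as np

        def hsic(K, L):
            # HSIC(K, L) = trace(K H L H) / (n - 1)^2 with centering
            # H = I - (1/n) 1 1^T; larger values indicate stronger
            # dependence between the two kernel representations
            n = K.shape[0]
            H = np.eye(n) - np.ones((n, n)) / n
            return np.trace(K @ H @ L @ H) / (n - 1) ** 2

    In an HSIC Lasso-style selection, each candidate kernel is scored by its dependence on the label kernel while redundancy among kernels is penalized, which is what yields the minimally redundant, maximally dependent combination.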

    Dynamic Fuzzy c-Means (dFCM) Clustering and its Application to Calorimetric Data Reconstruction in High Energy Physics

    In high energy physics experiments, calorimetric data reconstruction requires a suitable clustering technique in order to obtain accurate information about shower characteristics such as the position of the shower and the energy deposition. Fuzzy clustering techniques have high potential in this regard, as they assign data points to more than one cluster, thereby acting as a tool to distinguish between overlapping clusters. Fuzzy c-means (FCM) is one such clustering technique that can be applied to calorimetric data reconstruction. However, it has a drawback: it cannot easily identify and distinguish clusters that are not uniformly spread. A version of the FCM algorithm called dynamic fuzzy c-means (dFCM) allows clusters to be generated and eliminated as required, with the ability to resolve non-uniformly distributed clusters. Both the FCM and dFCM algorithms have been studied and successfully applied to simulated data from a sampling tungsten-silicon calorimeter. The FCM technique works reasonably well, and the use of the dFCM technique further improves the performance.
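
    The abstract does not spell out the dFCM generation and elimination rules, so the following Python fragment is only an illustrative guess at such an adaptation step between standard FCM iterations (both thresholds and the spawning rule are assumptions of this sketch):

        import numpy as np

        def dfcm_adapt(X, centers, U, spawn_thr=0.25, kill_thr=1.0):
            # eliminate clusters whose total membership has collapsed
            keep = U.sum(axis=0) >= kill_thr
            centers, U = centers[keep], U[:, keep]
            assert U.shape[1] > 0, "kill_thr removed every cluster"
            # find the calorimeter hit least covered by the surviving partition;
            # if it is not typical of any cluster, spawn a new cluster there
            worst = int(np.argmin(U.max(axis=1)))
            if U[worst].max() < spawn_thr:
                centers = np.vstack([centers, X[worst]])
            return centers

    Letting the number of clusters grow and shrink in this way is what allows overlapping, non-uniformly spread showers to be resolved where plain FCM, with its fixed cluster count, struggles.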