11,015 research outputs found

    Predictive intelligence to the edge through approximate collaborative context reasoning

    Get PDF
    We focus on Internet of Things (IoT) environments where a network of sensing and computing devices are responsible to locally process contextual data, reason and collaboratively infer the appearance of a specific phenomenon (event). Pushing processing and knowledge inference to the edge of the IoT network allows the complexity of the event reasoning process to be distributed into many manageable pieces and to be physically located at the source of the contextual information. This enables a huge amount of rich data streams to be processed in real time that would be prohibitively complex and costly to deliver on a traditional centralized Cloud system. We propose a lightweight, energy-efficient, distributed, adaptive, multiple-context perspective event reasoning model under uncertainty on each IoT device (sensor/actuator). Each device senses and processes context data and infers events based on different local context perspectives: (i) expert knowledge on event representation, (ii) outliers inference, and (iii) deviation from locally predicted context. Such novel approximate reasoning paradigm is achieved through a contextualized, collaborative belief-driven clustering process, where clusters of devices are formed according to their belief on the presence of events. Our distributed and federated intelligence model efficiently identifies any localized abnormality on the contextual data in light of event reasoning through aggregating local degrees of belief, updates, and adjusts its knowledge to contextual data outliers and novelty detection. We provide comprehensive experimental and comparison assessment of our model over real contextual data with other localized and centralized event detection models and show the benefits stemmed from its adoption by achieving up to three orders of magnitude less energy consumption and high quality of inference

    Certainty of outlier and boundary points processing in data mining

    Full text link
    Data certainty is one of the issues in the real-world applications which is caused by unwanted noise in data. Recently, more attentions have been paid to overcome this problem. We proposed a new method based on neutrosophic set (NS) theory to detect boundary and outlier points as challenging points in clustering methods. Generally, firstly, a certainty value is assigned to data points based on the proposed definition in NS. Then, certainty set is presented for the proposed cost function in NS domain by considering a set of main clusters and noise cluster. After that, the proposed cost function is minimized by gradient descent method. Data points are clustered based on their membership degrees. Outlier points are assigned to noise cluster and boundary points are assigned to main clusters with almost same membership degrees. To show the effectiveness of the proposed method, two types of datasets including 3 datasets in Scatter type and 4 datasets in UCI type are used. Results demonstrate that the proposed cost function handles boundary and outlier points with more accurate membership degrees and outperforms existing state of the art clustering methods.Comment: Conference Paper, 6 page

    A systematic review of data quality issues in knowledge discovery tasks

    Get PDF
    Hay un gran crecimiento en el volumen de datos porque las organizaciones capturan permanentemente la cantidad colectiva de datos para lograr un mejor proceso de toma de decisiones. El desafío mas fundamental es la exploración de los grandes volúmenes de datos y la extracción de conocimiento útil para futuras acciones por medio de tareas para el descubrimiento del conocimiento; sin embargo, muchos datos presentan mala calidad. Presentamos una revisión sistemática de los asuntos de calidad de datos en las áreas del descubrimiento de conocimiento y un estudio de caso aplicado a la enfermedad agrícola conocida como la roya del café.Large volume of data is growing because the organizations are continuously capturing the collective amount of data for better decision-making process. The most fundamental challenge is to explore the large volumes of data and extract useful knowledge for future actions through knowledge discovery tasks, nevertheless many data has poor quality. We presented a systematic review of the data quality issues in knowledge discovery tasks and a case study applied to agricultural disease named coffee rust

    On the role of pre and post-processing in environmental data mining

    Get PDF
    The quality of discovered knowledge is highly depending on data quality. Unfortunately real data use to contain noise, uncertainty, errors, redundancies or even irrelevant information. The more complex is the reality to be analyzed, the higher the risk of getting low quality data. Knowledge Discovery from Databases (KDD) offers a global framework to prepare data in the right form to perform correct analyses. On the other hand, the quality of decisions taken upon KDD results, depend not only on the quality of the results themselves, but on the capacity of the system to communicate those results in an understandable form. Environmental systems are particularly complex and environmental users particularly require clarity in their results. In this paper some details about how this can be achieved are provided. The role of the pre and post processing in the whole process of Knowledge Discovery in environmental systems is discussed

    Multimodal decision-level fusion for person authentication

    Get PDF
    In this paper, the use of clustering algorithms for decision-level data fusion is proposed. Person authentication results coming from several modalities (e.g., still image, speech), are combined by using fuzzy k-means (FKM), fuzzy vector quantization (FVQ) algorithms, and median radial basis function (MRBF) network. The quality measure of the modalities data is used for fuzzification. Two modifications of the FKM and FVQ algorithms, based on a novel fuzzy vector distance definition, are proposed to handle the fuzzy data and utilize the quality measure. Simulations show that fuzzy clustering algorithms have better performance compared to the classical clustering algorithms and other known fusion algorithms. MRBF has better performance especially when two modalities are combined. Moreover, the use of the quality via the proposed modified algorithms increases the performance of the fusion system
    corecore