327 research outputs found

    Adapted K-Nearest Neighbors for Detecting Anomalies on Spatio–Temporal Traffic Flow

    Outlier detection is an extensive research area that has been intensively studied in several domains, such as biological sciences, medical diagnosis, surveillance, and traffic anomaly detection. This paper explores advances in the outlier detection area by finding anomalies in spatio-temporal urban traffic flow. It proposes a new approach that considers the distribution of the flows in a given time interval. The flow distribution probability (FDP) databases are first constructed from the traffic flows by considering both spatial and temporal information. The outlier detection mechanism is then applied to the incoming flow distribution probabilities: the inliers are stored to enrich the FDP databases, while the outliers are excluded from them. Moreover, a k-nearest neighbor approach for distance-based outlier detection is investigated and adopted for FDP outlier detection. To validate the proposed framework, real data from the Odense traffic flow case are evaluated at ten locations. The results reveal that the proposed framework is able to detect the real distribution of flow outliers. Another experiment was carried out on Beijing data; the results show that our approach outperforms the baseline algorithms for high-urban traffic flow.
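
    The detection loop described in this abstract can be illustrated with a minimal sketch. It assumes that each FDP is a plain probability vector, that the outlier score is the Euclidean distance to the k-th nearest stored FDP, and that a fixed threshold separates inliers from outliers. The names knn_score and process, the threshold, and the sample vectors are all hypothetical; the paper's actual distance measure and decision rule may differ.

```python
import math

def knn_score(candidate, database, k):
    """Distance from the candidate to its k-th nearest neighbour in the database."""
    distances = sorted(math.dist(candidate, fdp) for fdp in database)
    return distances[k - 1]

def process(candidate, database, k, threshold):
    """Return True for an outlier; otherwise store the inlier and return False."""
    if knn_score(candidate, database, k) > threshold:
        return True  # outliers are kept out of the database
    database.append(candidate)  # inliers enrich the database
    return False

# Hypothetical FDP database: each tuple sums to 1 over three flow ranges.
fdp_database = [
    (0.70, 0.20, 0.10),
    (0.68, 0.22, 0.10),
    (0.72, 0.18, 0.10),
    (0.69, 0.21, 0.10),
]

normal_is_outlier = process((0.71, 0.19, 0.10), fdp_database, k=2, threshold=0.1)
unusual_is_outlier = process((0.10, 0.20, 0.70), fdp_database, k=2, threshold=0.1)
```

    In this sketch the first candidate is close to the stored vectors and is added to the database, while the second is far from all of them and is rejected.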

    Data Mining-Based Decomposition for Solving the MAXSAT Problem: Toward a New Approach

    This article explores advances in the data mining arena to solve the fundamental MAXSAT problem. In the proposed approach, the MAXSAT instance is first decomposed and clustered using data mining decomposition techniques; every cluster resulting from the decomposition is then solved separately to construct a partial solution. All partial solutions are merged into a global one, while managing the conflicting variables that may arise from the separate resolutions. The proposed approach has been numerically evaluated on DIMACS instances and some hard Uniform-Random-3-SAT instances, and compared to state-of-the-art decomposition-based algorithms. The results show that the proposed approach considerably improves the success rate, with a competitive computation time that is very close to that of the compared solutions.
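
    The decompose, solve, and merge pipeline can be sketched as follows. This is only an illustration under strong assumptions: the clustering step is taken as given (the clusters are hard-coded), each cluster is solved by brute force, and a conflicting variable is resolved by keeping the value that satisfies more clauses of the whole instance. None of these choices is taken from the article; the function names and the toy instance are hypothetical.

```python
from itertools import product

def satisfied(clauses, assignment):
    """Count clauses with at least one true literal (positive int = variable true)."""
    return sum(
        any(assignment.get(abs(lit)) == (lit > 0) for lit in clause)
        for clause in clauses
    )

def solve_cluster(clauses):
    """Brute-force the best assignment for the variables of one cluster."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    best = max(
        product([False, True], repeat=len(variables)),
        key=lambda values: satisfied(clauses, dict(zip(variables, values))),
    )
    return dict(zip(variables, best))

def merge(partials, all_clauses):
    """Merge partial solutions, then re-decide every conflicting variable globally."""
    merged, conflicts = {}, set()
    for partial in partials:
        for var, value in partial.items():
            if var in merged and merged[var] != value:
                conflicts.add(var)
            merged.setdefault(var, value)
    for var in sorted(conflicts):
        merged[var] = max(
            [False, True],
            key=lambda v: satisfied(all_clauses, {**merged, var: v}),
        )
    return merged

# Hypothetical clusters of clauses; variable 2 is shared by both clusters.
clusters = [
    [[1, 2], [-1, 2]],
    [[-2, 3], [-3, -2], [2, 3]],
]
all_clauses = [clause for cluster in clusters for clause in cluster]
solution = merge([solve_cluster(cluster) for cluster in clusters], all_clauses)
n_satisfied = satisfied(all_clauses, solution)
```

    Brute force is only workable for very small clusters; it stands in here for whatever solver is applied to each cluster.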

    Machine learning for smart building applications: Review and taxonomy

    © 2019 Association for Computing Machinery. The use of machine learning (ML) in smart building applications is reviewed in this article. We split existing solutions into two main classes: occupant-centric versus energy/devices-centric. The first class groups solutions that use ML for aspects related to the occupants, including (1) occupancy estimation and identification, (2) activity recognition, and (3) estimating preferences and behavior. The second class groups solutions that use ML to estimate aspects related either to energy or devices. They are divided into three categories: (1) energy profiling and demand estimation, (2) appliances profiling and fault detection, and (3) inference on sensors. Solutions in each category are presented, discussed, and compared; open perspectives and research trends are discussed as well. Compared to related state-of-the-art survey papers, the contribution herein is to provide a comprehensive and holistic review from the ML perspective rather than from the architectural and technical aspects of existing building management systems. This is achieved by considering all types of ML tools, buildings, and several categories of applications, and by structuring the taxonomy accordingly. The article ends with a summary discussion of the presented works, with a focus on lessons learned, challenges, and open and future directions of research in this field.

    Frequent itemset mining in big data with effective single scan algorithms

    © 2013 IEEE. This paper considers frequent itemset mining in transactional databases. It introduces a new accurate single scan approach for frequent itemset mining (SSFIM), a heuristic as an alternative approach (EA-SSFIM), as well as a parallel implementation on Hadoop clusters (MR-SSFIM). EA-SSFIM and MR-SSFIM target sparse and big databases, respectively. The proposed approach (in all its variants) requires only one scan to extract the candidate itemsets, and it has the advantage of generating a fixed number of candidate itemsets independently of the value of the minimum support. This accelerates the scan process compared with existing approaches when dealing with sparse and big databases. Numerical results show that SSFIM outperforms the state-of-the-art FIM approaches when dealing with medium and large databases. Moreover, EA-SSFIM provides performance similar to that of SSFIM while considerably reducing the runtime for large databases. The results also reveal the superiority of MR-SSFIM compared with the existing HPC-based solutions for FIM using sparse and big databases.
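
    The single scan idea can be illustrated with a small sketch. It assumes that the candidates are all non-empty subsets of each transaction, counted in a hash table during one pass, with the minimum support applied only afterwards; the abstract does not spell out these details, so the function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations

def single_scan_counts(transactions):
    """One pass over the database: count every non-empty subset of each transaction."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return counts

def frequent_itemsets(counts, n_transactions, min_support):
    """Filter the counted candidates by relative support; no second scan is needed."""
    return {
        itemset: count
        for itemset, count in counts.items()
        if count / n_transactions >= min_support
    }

# Hypothetical toy database of four transactions.
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"]]
counts = single_scan_counts(transactions)
frequent = frequent_itemsets(counts, len(transactions), min_support=0.5)
```

    The candidate table is built before the threshold is known, which is why its size does not depend on the minimum support. The number of subsets grows exponentially with transaction length, so this naive enumeration only suits short transactions.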

    Emergent Deep Learning for Anomaly Detection in Internet of Everything

    This research presents a new generic deep learning framework for anomaly detection in the Internet of Everything (IoE). It combines decomposition methods, deep neural networks, and evolutionary computation to better detect outliers in IoE environments. The dataset is first decomposed into clusters so that similar observations are grouped in the same cluster. Five clustering algorithms were used for this purpose. Deep learning architectures are then trained on the generated clusters. In this context, we propose a new recurrent neural network for training on time series data. Two evolutionary computation algorithms are also proposed to fine-tune the training step: a genetic algorithm and a bee swarm algorithm. These algorithms consider the hyper-parameters of the trained models and try to find their optimal values. The proposed solutions have been experimentally evaluated for two use cases: 1) road traffic outlier detection and 2) network intrusion detection. The results show the advantages of the proposed solutions and a clear superiority compared to state-of-the-art approaches.

    Vehicle detection using improved region convolution neural network for accident prevention in smart roads

    This paper explores the vehicle detection problem and introduces an improved region convolution neural network. The vehicle data (a set of images) is first collected, from which the noise (a set of outlier images) is removed using the SIFT extractor. The region convolution neural network is then used to detect the vehicles. We propose a new hyper-parameter optimization model based on evolutionary computation that can be used to tune the parameters of the deep learning framework. The proposed solution was tested using the well-known Boxy vehicle detection data, which contains more than 200,000 vehicle images and 1,990,000 annotated vehicles. The results are very promising and show superiority over many current state-of-the-art solutions in terms of runtime and accuracy.

    Cluster-based information retrieval using pattern mining

    This paper addresses the problem of responding to user queries by fetching the most relevant object from a clustered set of objects. It addresses the common drawbacks of cluster-based approaches and targets fast, high-quality information retrieval. For this purpose, a novel cluster-based information retrieval approach is proposed, named Cluster-based Retrieval using Pattern Mining (CRPM). This approach integrates various clustering and pattern mining algorithms. First, it generates clusters of objects that contain similar objects. Three clustering algorithms based on k-means, DBSCAN (density-based spatial clustering of applications with noise), and spectral clustering are suggested to minimize the number of shared terms among the clusters of objects. Second, frequent and high-utility pattern mining algorithms are applied to each cluster to extract the pattern bases. Third, the clusters of objects are ranked for every query. In this context, two ranking strategies are proposed: i) Score Pattern Computing (SPC), which calculates a score representing the similarity between a user query and a cluster; and ii) Weighted Terms in Clusters (WTC), which calculates a weight for every term and uses the relevant terms to compute the score between a user query and each cluster. Irrelevant information derived from the pattern bases is also used to deal with unexpected user queries. To evaluate the proposed approach, extensive experiments were carried out on two use cases: a document corpus and a tweet corpus. The results showed that the designed approach outperformed traditional and cluster-based information retrieval approaches in terms of the quality of the returned objects while being very competitive in terms of runtime.

    Adaptive learning-enforced broadcast policy for solar energy harvesting wireless sensor networks

    © 2018 Elsevier B.V. The problem of message broadcast from the base station (BS) to sensor nodes (SNs) in solar energy harvesting enabled wireless sensor networks is considered in this paper. The aim is to ensure fast and reliable broadcast without disturbing upstream communications (from SNs to BS), while taking into account constraints related to the energy harvesting (EH) environment. A new policy is proposed where, on the one hand, the BS first selects the broadcast time-slots adaptively with the SNs' schedules (to meet active periods that are constrained by EH conditions) and, on the other hand, SNs adapt their schedules to enable an optimal selection of the broadcast time-slots that minimizes the number of broadcasts per message and the latency. Compared to the existing solutions, this enables fast broadcast and eliminates the need to add overhead to the broadcast message. For this purpose, an analytical energy model, a Hidden Markov Model (HMM), the Baum–Welch learning algorithm, and a heuristic algorithm for the minimum covering set (MCS) problem are proposed and combined in a unique solution. The proposed solution is analyzed and compared with a state-of-the-art approach. The results confirm that the former has the advantage of performing the broadcast operation more reliably and with lower delay.
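
    The slot selection step can be read as a covering problem: pick few time-slots such that every SN is active in at least one of them. The sketch below uses the classic greedy set-cover heuristic as a stand-in; it is not claimed to be the heuristic proposed in the paper, and the function name, slot indices, and node names are hypothetical.

```python
def greedy_slot_cover(active_nodes_by_slot):
    """Greedily pick slots until every node is active in at least one chosen slot."""
    uncovered = set().union(*active_nodes_by_slot.values())
    chosen = []
    while uncovered:
        # Take the slot that reaches the most still-uncovered nodes
        # (ties go to the lowest slot index because the keys are sorted).
        best = max(
            sorted(active_nodes_by_slot),
            key=lambda slot: len(active_nodes_by_slot[slot] & uncovered),
        )
        chosen.append(best)
        uncovered -= active_nodes_by_slot[best]
    return chosen

# Hypothetical schedule: which sensor nodes are awake in each time-slot.
schedule = {
    0: {"n1", "n2"},
    1: {"n2", "n3", "n4"},
    2: {"n4"},
    3: {"n1", "n5"},
}
broadcast_slots = greedy_slot_cover(schedule)
```

    For this schedule, two broadcasts (slots 1 and 3) reach all five nodes. The greedy rule does not guarantee a minimum cover in general, only a logarithmic approximation.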

    Energy-Aware Constrained Relay Node Deployment for Sustainable Wireless Sensor Networks

    © 2016 IEEE. This paper considers the problem of communication coverage for sustainable data forwarding in wireless sensor networks, where an energy-aware deployment model of relay nodes (RNs) is proposed. The model used in this paper considers constrained placement and is different from the existing one-tiered and two-tiered models. It supposes two different types of sensor nodes to be deployed: i) energy rich nodes (ERNs), and ii) energy limited nodes (ELNs). The aim is thus to use only the ERNs for relaying packets, while the use of ELNs is limited to sensing and transmitting their own readings. A minimum number of RNs is added if necessary to help ELNs. This intuitively ensures sustainable coverage and prolongs the network lifetime. The problem is reduced to the traditional problem of the minimum weighted connected dominating set (MWCDS) in a vertex-weighted graph. It is then solved by taking advantage of the simple form of the weight function, both when deriving exact and approximate solutions. The optimal solution is derived using integer linear programming (ILP), and a heuristic is given for the approximate solution. Upper bounds for the approximation of the heuristic (versus the optimal solution) and for its runtime are formally derived. The proposed model and solutions are also evaluated by simulation. The proposed model is compared with the one-tiered and two-tiered models when using a similar solution to determine RN positions, i.e., minimum connected dominating set (MCDS) calculation. The results demonstrate that the proposed model considerably improves the network lifetime compared to the one-tiered model, while adding fewer RNs than the two-tiered model. Further, both the heuristic and the ILP for the MWCDS are evaluated and compared with a state-of-the-art algorithm. The results show that the proposed heuristic yields solutions close to those of the ILP while clearly reducing the runtime compared to both the ILP and existing heuristics. The results also demonstrate the scalability of the proposed solution.

    Intelligent Deep Fusion Network for Anomaly Identification in Maritime Transportation Systems

    This paper introduces a novel deep learning architecture for identifying outliers in the context of intelligent transportation systems. The use of a convolutional neural network with decomposition is explored to find abnormal behavior in maritime data. The set of maritime data is first decomposed into similar clusters containing homogeneous data, and then a convolutional neural network is used for each data cluster. Different models are trained (one per cluster), and each model is trained on highly correlated data. Finally, the results of the models are merged using a simple but efficient fusion strategy. To verify the performance of the proposed framework, intensive experiments were conducted on marine data. The results show the superiority of the proposed framework compared to the baseline solutions in terms of several accuracy metrics.