    A Graph-Based Semi-Supervised k Nearest-Neighbor Method for Nonlinear Manifold Distributed Data Classification

    k Nearest Neighbors (kNN) is one of the most widely used supervised learning algorithms for classifying Gaussian distributed data, but it does not achieve good results when applied to nonlinear manifold distributed data, especially when only a very limited number of labeled samples is available. In this paper, we propose a new graph-based kNN algorithm which can effectively handle both Gaussian distributed data and nonlinear manifold distributed data. To achieve this goal, we first propose a constrained Tired Random Walk (TRW) by constructing an R-level nearest-neighbor strengthened tree over the graph, and then compute a TRW matrix for similarity measurement purposes. After this, the nearest neighbors are identified according to the TRW matrix, and the class label of a query point is determined by the sum of all the TRW weights of its nearest neighbors. To deal with online situations, we also propose a new algorithm to handle sequential samples based on a local neighborhood reconstruction. Comparison experiments are conducted on both synthetic and real-world data sets to demonstrate the validity of the proposed kNN algorithm and its improvements over other versions of the kNN algorithm. Given the widespread appearance of manifold structures in real-world problems and the popularity of the traditional kNN algorithm, the proposed manifold version of kNN shows promising potential for classifying manifold-distributed data.
    Comment: 32 pages, 12 figures, 7 tables
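The classification step described in this abstract (rank labeled points by a random-walk similarity, then sum the weights per class) can be sketched roughly as follows. This is a simplified illustration, not the paper's algorithm: it assumes a fully connected Gaussian-weighted graph, approximates the similarity matrix by a truncated sum of decayed transition-matrix powers, and omits the R-level tree constraint entirely. All function names and the parameters `alpha`, `sigma`, `steps`, and `k` are hypothetical.

```python
import math

def transition_matrix(points, sigma=1.0):
    """Row-normalized Gaussian affinities (assumed graph construction)."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                W[i][j] = math.exp(-d2 / (2 * sigma ** 2))
    return [[w / sum(row) for w in row] for row in W]

def walk_similarity(P, alpha=0.5, steps=50):
    """Truncated sum of (alpha * P)^t for t = 0..steps (assumed similarity)."""
    n = len(P)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for _ in range(steps):
        # term <- alpha * (term @ P): one more decayed step of the walk
        term = [[alpha * sum(term[i][m] * P[m][j] for m in range(n))
                 for j in range(n)] for i in range(n)]
        for i in range(n):
            for j in range(n):
                total[i][j] += term[i][j]
    return total

def classify(query, labels, S, k=2):
    """Sum similarity weights of the k most similar labeled points per class."""
    nearest = sorted(labels, key=lambda j: S[query][j], reverse=True)[:k]
    votes = {}
    for j in nearest:
        votes[labels[j]] = votes.get(labels[j], 0.0) + S[query][j]
    return max(votes, key=votes.get)

# Toy data invented for illustration: two well-separated clusters,
# with one labeled point in each.
points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.0, 5.0), (5.1, 5.0), (5.2, 5.0)]
labels = {0: "a", 3: "b"}
S = walk_similarity(transition_matrix(points))
print(classify(2, labels, S), classify(5, labels, S))
```

The query index passed to `classify` is assumed to be an unlabeled point; a labeled query would vote for itself.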

    Outlier Detection Techniques For Wireless Sensor Networks: A Survey

    In the field of wireless sensor networks, measurements that deviate significantly from the normal pattern of sensed data are considered outliers. The potential sources of outliers include noise and errors, events, and malicious attacks on the network. Traditional outlier detection techniques are not directly applicable to wireless sensor networks due to the multivariate nature of sensor data and the specific requirements and limitations of these networks. This survey provides a comprehensive overview of existing outlier detection techniques specifically developed for wireless sensor networks. Additionally, it presents a technique-based taxonomy and a decision tree to be used as a guideline to select a technique suitable for the application at hand based on characteristics such as data type, outlier type, outlier degree

    Quantum Algorithm Implementations for Beginners

    As quantum computers become available to the general public, the need has arisen to train a cohort of quantum programmers, many of whom have been developing classical computer programs for most of their careers. While currently available quantum computers have fewer than 100 qubits, quantum computing hardware is widely expected to grow in terms of qubit count, quality, and connectivity. This review aims to explain the principles of quantum programming, which are quite different from classical programming, with straightforward algebra that makes understanding of the underlying fascinating quantum mechanical principles optional. We give an introduction to quantum computing algorithms and their implementation on real quantum hardware. We survey 20 different quantum algorithms, attempting to describe each in a succinct and self-contained fashion. We show how these algorithms can be implemented on IBM's quantum computer, and in each case, we discuss the results of the implementation with respect to differences between the simulator and the actual hardware runs. This article introduces computer scientists, physicists, and engineers to quantum algorithms and provides a blueprint for their implementations.

    Normalized Web Distance and Word Similarity

    There is a great deal of work in cognitive psychology, linguistics, and computer science, about using word (or phrase) frequencies in context in text corpora to develop measures for word similarity or word association, going back to at least the 1960s. The goal of this chapter is to introduce the normalized web distance (NWD) method to determine similarity between words and phrases. It is a general way to tap the amorphous low-grade knowledge available for free on the Internet, typed in by local users aiming at personal gratification of diverse objectives, and yet globally achieving what is effectively the largest semantic electronic database in the world. Moreover, this database is available for all by using any search engine that can return aggregate page-count estimates for a large range of search-queries. In the paper introducing the NWD it was called `normalized Google distance (NGD),' but since Google doesn't allow computer searches anymore, we opt for the more neutral and descriptive NWD.
    Comment: Latex, 20 pages, 7 figures, to appear in: Handbook of Natural Language Processing, Second Edition, Nitin Indurkhya and Fred J. Damerau Eds., CRC Press, Taylor and Francis Group, Boca Raton, FL, 2010, ISBN 978-142008592
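As a rough illustration of how a distance can be derived from aggregate page counts, the sketch below uses the commonly cited NGD/NWD formula, NWD(x, y) = (max{log f(x), log f(y)} - log f(x, y)) / (log N - min{log f(x), log f(y)}), where f(x) and f(y) are the page counts for each term, f(x, y) is the count for pages containing both, and N is a normalizing total. The abstract itself does not state the formula, so treat it as an assumption here; the function name `nwd` and the counts in the usage line are invented for illustration and are not real search results.

```python
import math

def nwd(fx, fy, fxy, n):
    """Normalized web distance from page counts (assumed formula, see lead-in).

    fx, fy: page counts for each term; fxy: count for both terms together;
    n: normalizing total number of indexed pages. All must be positive.
    """
    if min(fx, fy, fxy, n) <= 0:
        raise ValueError("all counts must be positive")
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))

# Hypothetical counts: terms that always co-occur get distance 0,
# while rarer co-occurrence yields a larger distance.
print(nwd(1000, 1000, 1000, 10**6))
print(nwd(100, 100, 10, 10**6))
```

With these made-up counts the first call gives 0 and the second roughly 0.25, so smaller values indicate terms that appear together more often relative to how often they appear at all.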

    Stress Analysis of Operating Gas Pipeline Installed by Horizontal Directional Drilling and Pullback Force Prediction During Installation

    With the development of the natural gas industry, the demand for pipeline construction has also increased. In the context of advocating green construction, horizontal directional drilling (HDD), as one of the most widely utilized trenchless methods for pipeline installation, has received extensive attention in industry and academia in recent years. The safety of natural gas pipelines is very important during construction and operation, so an in-depth study of the safety of pipelines installed by the HDD method is necessary. In this dissertation, motivated by the following considerations, two aspects of HDD installation are studied. First, the literature review shows that one issue that has not received much attention so far is pipe stress under operating conditions. Thus, two chapters (Chapters 3 and 4) in this dissertation are devoted to pipe stress analysis during operation. Regarding this problem, two cases are considered according to the fluidity of the drilling fluid. The more dangerous situation is determined by comparing the pipeline stress in the two working conditions. The stress of pipelines installed by the HDD method and by the open-cut method is also compared, and the comparison indicates that the stress of pipelines installed by the HDD method is lower. Moreover, through the analysis of influence factors and stress sensitivity, the degree of influence of different parameters on pipeline stress is obtained. Second, the literature review indicates that accurate prediction of the pullback force in HDD construction is of great significance to construction safety and success. However, the accuracy of current analytical methods is not high.
    In the context of machine learning and big data, three new hybrid data-driven models are proposed in this dissertation (Chapter 5) for near real-time pullback force prediction: a radial basis function neural network with complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN-RBFNN), a support vector machine using the whale optimization algorithm with CEEMDAN (CEEMDAN-WOA-SVM), and a hybrid model that combines random forest (RF) and CEEMDAN. The three novel models have been verified in two projects in China. It is found that the prediction accuracy is dramatically improved compared with the original analytical models (or empirical models). In addition, the feasibility analysis demonstrates the great potential of machine learning models for near real-time prediction. At the end of this dissertation, in addition to summarizing the primary conclusions, three future research directions are pointed out: (1) stress analysis of pipelines installed by HDD in more complex situations; (2) stress analysis of pipelines during HDD construction; (3) database establishment in HDD engineering