A Graph-Based Semi-Supervised k Nearest-Neighbor Method for Nonlinear Manifold Distributed Data Classification
k Nearest Neighbors (kNN) is one of the most widely used supervised
learning algorithms to classify Gaussian distributed data, but it does not
achieve good results when it is applied to nonlinear manifold distributed data,
especially when only a very limited number of labeled samples is available. In this
paper, we propose a new graph-based kNN algorithm which can effectively
handle both Gaussian distributed data and nonlinear manifold distributed data.
To achieve this goal, we first propose a constrained Tired Random Walk (TRW) by
constructing an -level nearest-neighbor strengthened tree over the graph,
and then compute a TRW matrix for similarity measurement purposes. After this,
the nearest neighbors are identified according to the TRW matrix and the class
label of a query point is determined by the sum of all the TRW weights of its
nearest neighbors. To deal with online situations, we also propose a new
algorithm to handle sequential samples based on local neighborhood
reconstruction. Comparison experiments are conducted on both synthetic data
sets and real-world data sets to demonstrate the validity of the proposed new
kNN algorithm and its improvements over other versions of kNN algorithms.
Given the widespread appearance of manifold structures in real-world problems
and the popularity of the traditional kNN algorithm, the proposed manifold
version of kNN shows promising potential for classifying manifold-distributed
data.
Comment: 32 pages, 12 figures, 7 tables
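The pipeline described in this abstract (build a neighbor graph, derive a random-walk similarity matrix, then classify by summed weights of the nearest labeled neighbors) can be sketched roughly as follows. This is a minimal illustration, not the authors' method: the abstract does not define the TRW matrix, so the sketch assumes a damped walk in which the contribution of each further step is scaled by a hypothetical factor `alpha`, and it uses a plain kNN graph in place of the nearest-neighbor strengthened tree. All function names and parameter values are illustrative.

```python
import math


def knn_graph(points, k):
    """Build a symmetric k-nearest-neighbor graph with Gaussian edge weights."""
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        nearest = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )[:k]
        for d, j in nearest:
            weights[i][j] = weights[j][i] = math.exp(-d * d)
    return weights


def damped_walk_similarity(weights, alpha=0.9, steps=50):
    """Return sum_{t=0..steps} (alpha * P)^t, where P is row-normalized weights.

    Assumption: this damped accumulation stands in for the TRW matrix.
    """
    n = len(weights)
    P = [[w / sum(row) for w in row] for row in weights]
    term = [[float(i == j) for j in range(n)] for i in range(n)]  # (alpha * P)^0
    total = [row[:] for row in term]
    for _ in range(steps):
        term = [
            [alpha * sum(term[i][m] * P[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)
        ]
        for i in range(n):
            for j in range(n):
                total[i][j] += term[i][j]
    return total


def classify(similarity, labels, query, k):
    """Pick the k labeled points most similar to `query`, then sum weights per class."""
    ranked = sorted(labels, key=lambda i: similarity[query][i], reverse=True)[:k]
    votes = {}
    for i in ranked:
        votes[labels[i]] = votes.get(labels[i], 0.0) + similarity[query][i]
    return max(votes, key=votes.get)


# Two well-separated groups of points, with one labeled sample in each.
points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (1.5, 0.0),
          (10.0, 0.0), (10.5, 0.0), (11.0, 0.0), (11.5, 0.0)]
labels = {0: "a", 7: "b"}
S = damped_walk_similarity(knn_graph(points, k=2))
predictions = [classify(S, labels, q, k=2) for q in (2, 5)]
```

Because similarity is accumulated along graph paths rather than measured as straight-line distance, an unlabeled point inherits the label of whichever labeled sample it is connected to through the graph, which is the property that makes graph-based variants attractive for manifold-shaped data.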
Outlier Detection Techniques For Wireless Sensor Networks: A Survey
In the field of wireless sensor networks, measurements that
significantly deviate from the normal pattern of sensed data are
considered as outliers. The potential sources of outliers include
noise and errors, events, and malicious attacks on the network.
Traditional outlier detection techniques are not directly
applicable to wireless sensor networks due to the multivariate
nature of sensor data and the specific requirements and limitations of
wireless sensor networks. This survey provides a comprehensive
overview of existing outlier detection techniques specifically
developed for wireless sensor networks. Additionally, it
presents a technique-based taxonomy and a decision tree to be used
as a guideline to select a technique suitable for the application
at hand based on characteristics such as data type, outlier type,
outlier degree
Quantum Algorithm Implementations for Beginners
As quantum computers become available to the general public, the need has
arisen to train a cohort of quantum programmers, many of whom have been
developing classical computer programs for most of their careers. While
currently available quantum computers have fewer than 100 qubits, quantum
computing hardware is widely expected to grow in terms of qubit count, quality,
and connectivity. This review aims to explain the principles of quantum
programming, which are quite different from classical programming, with
straightforward algebra that makes understanding of the underlying fascinating
quantum mechanical principles optional. We give an introduction to quantum
computing algorithms and their implementation on real quantum hardware. We
survey 20 different quantum algorithms, attempting to describe each in a
succinct and self-contained fashion. We show how these algorithms can be
implemented on IBM's quantum computer, and in each case, we discuss the results
of the implementation with respect to differences between the simulator and the
actual hardware runs. This article introduces computer scientists, physicists,
and engineers to quantum algorithms and provides a blueprint for their
implementations.
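As a small illustration of the "straightforward algebra" this abstract refers to, the sketch below simulates a two-qubit state vector in plain Python and prepares a Bell state with a Hadamard gate followed by a CNOT. It is not taken from the review and does not use IBM's tooling; the function names and the convention that qubit 0 is the least significant bit of the basis-state index are assumptions made for this example.

```python
import math


def apply_hadamard(state, qubit):
    """Apply a Hadamard gate to one qubit of a state vector (list of amplitudes)."""
    s = 1 / math.sqrt(2)
    bit = 1 << qubit
    new = state[:]
    for i in range(len(state)):
        if not i & bit:  # visit each pair (i, i with the qubit's bit set) once
            a, b = state[i], state[i | bit]
            new[i] = s * (a + b)
            new[i | bit] = s * (a - b)
    return new


def apply_cnot(state, control, target):
    """Flip the target qubit in every basis state where the control qubit is 1."""
    new = state[:]
    for i in range(len(state)):
        if i & (1 << control) and not i & (1 << target):
            j = i | (1 << target)
            new[i], new[j] = state[j], state[i]
    return new


state = [1.0, 0.0, 0.0, 0.0]           # start in |00>
state = apply_hadamard(state, 0)       # superposition on qubit 0
state = apply_cnot(state, 0, 1)        # entangle qubit 1 with qubit 0
probabilities = [abs(a) ** 2 for a in state]
```

Measuring this state yields the outcomes 00 and 11 with equal probability and never 01 or 10; an ideal simulator reproduces those probabilities exactly, whereas runs on real hardware would be expected to show some deviation due to noise.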
Normalized Web Distance and Word Similarity
There is a great deal of work in cognitive psychology, linguistics, and
computer science, about using word (or phrase) frequencies in context in text
corpora to develop measures for word similarity or word association, going back
to at least the 1960s. The goal of this chapter is to introduce the
normalized web distance (NWD) method to determine similarity between words and
phrases. It is a general way to tap the amorphous low-grade knowledge available
for free on the Internet, typed in by local users aiming at personal
gratification of diverse objectives, and yet globally achieving what is
effectively the largest semantic electronic database in the world. Moreover,
this database is available for all by using any search engine that can return
aggregate page-count estimates for a large range of search-queries. In the
paper introducing the NWD it was called `normalized Google distance (NGD),' but
since Google doesn't allow computer searches anymore, we opt for the more
neutral and descriptive NWD.
Comment: Latex, 20 pages, 7 figures, to appear in: Handbook of Natural
Language Processing, Second Edition, Nitin Indurkhya and Fred J. Damerau
Eds., CRC Press, Taylor and Francis Group, Boca Raton, FL, 2010, ISBN
978-142008592
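The abstract does not spell out the formula, so the sketch below uses the definition commonly given in the NGD literature: with page counts f(x), f(y), and f(x, y) and a total of N indexed pages, the distance is (max{log f(x), log f(y)} - log f(x, y)) / (log N - min{log f(x), log f(y)}). The function name, argument names, and page counts are made up for illustration; in practice the counts would come from a search engine's aggregate page-count estimates.

```python
import math


def normalized_web_distance(fx, fy, fxy, n):
    """Compute the NWD from page counts.

    fx, fy: pages containing each term; fxy: pages containing both;
    n: total number of indexed pages (an estimate in practice).
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return math.inf  # terms never occur, or never occur together
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))


# Hypothetical counts: two terms that always co-occur, and two that rarely do.
always_together = normalized_web_distance(1000, 1000, 1000, 10**6)
rarely_together = normalized_web_distance(1000, 1000, 10, 10**6)
```

With these made-up counts, the first pair has distance 0 (the terms always appear together), while the second pair has distance 2/3, reflecting that only a small fraction of the pages mentioning one term also mention the other.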
Stress Analysis of Operating Gas Pipeline Installed by Horizontal Directional Drilling and Pullback Force Prediction During Installation
With the development of the natural gas industry, the demand for pipeline construction has also increased. In the context of advocating green construction, horizontal directional drilling (HDD), one of the most widely used trenchless methods for pipeline installation, has received extensive attention in industry and academia in recent years. The safety of natural gas pipelines is very important during both construction and operation, so an in-depth study of the safety of pipelines installed by the HDD method is necessary.
In this dissertation, two aspects of HDD installation are studied. First, the literature review shows that one issue that has received little attention so far is pipe stress under operating conditions. Thus, two chapters (Chapters 3 and 4) of this dissertation address pipe stress analysis during operation. Two cases are considered according to the fluidity of the drilling fluid, and the more dangerous situation is determined by comparing the pipeline stress under the two working conditions. The stress of pipelines installed by the HDD method and by the open-cut method is also compared, and the comparison indicates that the stress of pipelines installed by the HDD method is lower. Moreover, through analysis of influencing factors and stress sensitivity, the degree of influence of different parameters on pipeline stress is obtained.
Second, the literature review indicates that accurate prediction of the pullback force in HDD construction is of great significance to construction safety and success. However, the accuracy of current analytical methods is not high. In the context of machine learning and big data, three new hybrid data-driven models are proposed in this dissertation (Chapter 5) for near real-time pullback force prediction: a radial basis function neural network with complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN-RBFNN), a support vector machine using the whale optimization algorithm with CEEMDAN (CEEMDAN-WOA-SVM), and a hybrid model that combines random forest (RF) and CEEMDAN. The three models have been verified on two projects in China. The prediction accuracy is found to be dramatically improved compared with the original analytical (or empirical) models. In addition, a feasibility analysis demonstrates the great potential of machine learning models for near real-time prediction.
At the end of this dissertation, in addition to summarizing the primary conclusions, three future research directions are pointed out: (1) stress analysis of pipelines installed by HDD in more complex situations; (2) stress analysis of pipelines during HDD construction; (3) database establishment in HDD engineering.