3,365 research outputs found
Hierarchical word clustering - automatic thesaurus generation
In this paper, we propose a hierarchical, lexical clustering neural network algorithm that automatically generates a thesaurus (synonym abstraction) using purely stochastic information derived from unstructured text corpora and requiring no prior word classifications. The lexical hierarchy overcomes the Vocabulary Problem by accommodating paraphrasing through synonym clusters and overcomes Information Overload by focusing search within cohesive clusters. We describe existing word categorisation methodologies, identifying their respective strengths and weaknesses, and evaluate our proposed approach against an existing neural approach using a benchmark statistical approach and a human generated thesaurus for comparison. We also evaluate our word context vector generation methodology against two similar approaches to investigate the effect of word vector dimensionality and the effect of the number of words in the context window on the quality of word clusters produced. We demonstrate the effectiveness of our approach and its superiority to existing techniques. (C) 2002 Elsevier Science B.V. All rights reserved.
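As a rough illustration of what a word context vector built from a context window can look like, the sketch below counts co-occurrences within a fixed window and compares words by cosine similarity. It is only a minimal sketch under stated assumptions: the function names `context_vectors` and `cosine`, the toy sentence, and the use of raw counts are all hypothetical choices made here, and this is not the vector generation method evaluated in the paper.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(tokens, window=2):
    """Map each word to counts of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # skip the centre word itself
                vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[term] for term, count in u.items())
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy corpus (an assumption for illustration only).
tokens = "the cat sat on the mat while the dog sat on the rug".split()
vecs = context_vectors(tokens, window=2)
sim_cat_dog = cosine(vecs["cat"], vecs["dog"])
sim_cat_on = cosine(vecs["cat"], vecs["on"])
```

In this toy corpus, "cat" and "dog" share more of their contexts than "cat" and "on" do, so a clustering step built on such vectors would tend to group them first. Changing `window` changes which neighbours contribute, which is the kind of effect the abstract says the authors investigate.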
A high performance k-NN approach using binary neural networks
This paper evaluates a novel k-nearest neighbour (k-NN) classifier built from binary neural networks. The binary neural approach uses robust encoding to map standard ordinal, categorical and numeric data sets onto a binary neural network. The binary neural network uses high speed pattern matching to recall a candidate set of matching records, which are then processed by a conventional k-NN approach to determine the k-best matches. We compare various configurations of the binary approach to a conventional approach for memory overheads, training speed, retrieval speed and retrieval accuracy. We demonstrate the superior performance with respect to speed and memory requirements of the binary approach compared to the standard approach and we pinpoint the optimal configurations. (C) 2003 Elsevier Ltd. All rights reserved.
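The two-stage idea described above (a fast binary match to recall candidates, followed by conventional k-NN on those candidates) can be sketched as follows. This is a simplified stand-in, not the binary neural network from the paper: the binning scheme, the overlap score, and the names `encode` and `two_stage_knn` are assumptions introduced here for illustration.

```python
from math import dist


def encode(record, bin_edges):
    """Turn a numeric record into a set of (attribute, bin) pairs, i.e. its set bits."""
    bits = set()
    for attr, (value, edges) in enumerate(zip(record, bin_edges)):
        bin_index = sum(value >= edge for edge in edges)
        bits.add((attr, bin_index))
    return bits


def two_stage_knn(query, records, bin_edges, k=3, n_candidates=10):
    """Recall candidates by bit overlap, then rank them by exact Euclidean distance."""
    query_bits = encode(query, bin_edges)
    # Stage 1: cheap match score = number of shared set bits.
    # (A real system would encode the records once at training time.)
    scored = [(len(query_bits & encode(r, bin_edges)), i) for i, r in enumerate(records)]
    scored.sort(key=lambda pair: -pair[0])
    candidates = [i for _, i in scored[:n_candidates]]
    # Stage 2: conventional k-NN restricted to the candidate set.
    candidates.sort(key=lambda i: dist(query, records[i]))
    return candidates[:k]


# Hypothetical toy data: two numeric attributes, three bins per attribute.
records = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.0), (5.1, 4.8), (9.0, 9.0)]
bin_edges = [(3.0, 7.0), (3.0, 7.0)]
nearest = two_stage_knn((1.1, 1.0), records, bin_edges, k=2, n_candidates=3)
```

Because stage 1 only looks at coarse bins, a true neighbour that falls just across a bin boundary can be missed when `n_candidates` is small, so the size of the candidate set trades retrieval speed against retrieval accuracy in this sketch.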
Measurement of statistical evidence on an absolute scale following thermodynamic principles
Statistical analysis is used throughout biomedical research and elsewhere to
assess strength of evidence. We have previously argued that typical outcome
statistics (including p-values and maximum likelihood ratios) have poor
measure-theoretic properties: they can erroneously indicate decreasing evidence
as data supporting an hypothesis accumulate; and they are not amenable to
calibration, necessary for meaningful comparison of evidence across different
study designs, data types, and levels of analysis. We have also previously
proposed that thermodynamic theory, which allowed for the first time derivation
of an absolute measurement scale for temperature (T), could be used to derive
an absolute scale for evidence (E). Here we present a novel
thermodynamically-based framework in which measurement of E on an absolute
scale, for which "one degree" always means the same thing, becomes possible for
the first time. The new framework invites us to think about statistical
analyses in terms of the flow of (evidential) information, placing this work in
the context of a growing literature on connections among physics, information
theory, and statistics.
Comment: Final version of manuscript as published in Theory in Biosciences (2013)
Improved AURA k-Nearest Neighbour approach
The k-Nearest Neighbour (kNN) approach is a widely used technique for pattern classification. Ranked distance measurements to a known sample set determine the classification of unknown samples. Though effective, kNN, like most classification methods, does not scale well with increased sample size. This is because there is a relationship between the unknown query and every other sample in the data space. In order to make this operation scalable, we apply AURA to the kNN problem. AURA is a highly scalable, associative-memory-based binary neural network intended for high-speed approximate search and match operations on large unstructured datasets. Previous work has seen AURA methods applied to this problem as a scalable, but approximate, kNN classifier. This paper continues this work by using AURA in conjunction with kernel-based input vectors, in order to create a fast, scalable kNN classifier, whilst improving recall accuracy to levels similar to standard kNN implementations.
On Lorentz Invariance, Spin-Charge Separation And SU(2) Yang-Mills Theory
Previously it has been shown that in spin-charge separated SU(2) Yang-Mills
theory Lorentz invariance can become broken by a one-cocycle that appears in
the Lorentz boosts. Here we study in detail the structure of this one-cocycle.
In particular we show that its non-triviality relates to the presence of a
(Dirac) magnetic monopole bundle. We also explicitly present the finite
version of the cocycle.
Comment: 4 pages
Hierarchical growing neural gas
“The original publication is available at www.springerlink.com”. Copyright Springer. This paper describes TreeGNG, a top-down unsupervised learning method that produces hierarchical classification schemes. TreeGNG is an extension to the Growing Neural Gas algorithm that maintains a time history of the learned topological mapping. TreeGNG is able to correct poor decisions made during the early phases of the construction of the tree, and provides the novel ability to influence the general shape and form of the learned hierarchy.
The Biot-Savart operator and electrodynamics on subdomains of the three-sphere
We study steady-state magnetic fields in the geometric setting of positive
curvature on subdomains of the three-dimensional sphere. By generalizing the
Biot-Savart law to an integral operator BS acting on all vector fields, we show
that electrodynamics in such a setting behaves rather similarly to Euclidean
electrodynamics. For instance, for current J and magnetic field BS(J), we show
that Maxwell's equations naturally hold. In all instances, the formulas we give
are geometrically meaningful: they are preserved by orientation-preserving
isometries of the three-sphere.
This article describes several properties of BS: we show it is self-adjoint,
bounded, and extends to a compact operator on a Hilbert space. For vector
fields that act like currents, we prove the curl operator is a left inverse to
BS; thus the Biot-Savart operator is important in the study of curl
eigenvalues, with applications to energy-minimization problems in geometry and
physics. We conclude with two examples, which indicate our bounds are typically
within an order of magnitude of being sharp.
Comment: 24 pages (was 28 pages). Revised to include a new introduction, a
detailed example, and results about helicity; other changes for readability
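For orientation, the Euclidean statements that the abstract says are generalised to the three-sphere can be written out as below. This is the standard flat-space form only, given as a reminder under stated assumptions: units are chosen so that the magnetic constant equals 1, the domain symbol \Omega is introduced here, and J is taken to be a steady (divergence-free) current tangent to the boundary of \Omega. The spherical formulas from the paper itself are not reproduced.

```latex
% Euclidean Biot-Savart operator (assumption: units with \mu_0 = 1)
\[
  \mathrm{BS}(J)(y) \;=\; \frac{1}{4\pi} \int_{\Omega} J(x) \times
  \frac{y - x}{\lvert y - x \rvert^{3}} \, dx
\]
% Magnetostatic Maxwell equations for a steady current J
\[
  \nabla \cdot \mathrm{BS}(J) = 0, \qquad \nabla \times \mathrm{BS}(J) = J
\]
```

The second equation is the flat-space counterpart of the abstract's statement that curl is a left inverse to BS for vector fields that act like currents.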
Observations of Stellar Objects at a Shell Boundary in the Star-Forming Complex in the Galaxy IC1613
The single region of ongoing star formation in the galaxy IC 1613 has been
observed in order to reveal the nature of compact emission-line objects at the
edges of two shells in the complex, identified earlier in H-alpha line images.
The continuum images show these compact objects to be stars. Detailed
spectroscopic observations of these stars and the surrounding nebulae were
carried out with the integral field spectrograph MPFS mounted on the 6m
telescope of the Special Astrophysical Observatory. The resulting stellar
spectra were used to determine the spectral types and luminosity classes of the
objects. An Of star we identified is the only object of this spectral type in
IC 1613. The results of optical observations of the multi-shell complex are
compared to 21cm radio observations. The shells harboring the stars at their
boundaries constitute the most active part of the star-forming region. There is
evidence that shocks have played an important role in the formation of the
shells.
Comment: 10 pages, 5 PS and 1 color JPEG figure
Earth Occultation Imaging of the Low Energy Gamma-Ray Sky with GBM
The Earth Occultation Technique (EOT) has been applied to Fermi's Gamma-ray
Burst Monitor (GBM) to perform all-sky monitoring for a predetermined catalog
of hard X-ray/soft gamma-ray sources. In order to search for sources not in the
catalog, thus completing the catalog and reducing a source of systematic error
in EOT, an imaging method has been developed -- Imaging with a Differential
filter using the Earth Occultation Method (IDEOM). IDEOM is a tomographic
imaging method that takes advantage of the orbital precession of the Fermi
satellite. Using IDEOM, all-sky reconstructions have been generated for ~4
years of GBM data in the 12-50 keV, 50-100 keV and 100-300 keV energy bands in
search of sources otherwise unmodeled by the GBM occultation analysis. IDEOM
analysis resulted in the detection of 57 sources in the 12-50 keV energy band,
23 sources in the 50-100 keV energy band, and 7 sources in the 100-300 keV
energy band. Seventeen sources were not present in the original GBM-EOT catalog
and have now been added. We also present the first joined averaged spectra for
four persistent sources detected by GBM using EOT and by the Large Area
Telescope (LAT) on Fermi: NGC 1275, 3C 273, Cen A, and the Crab.