Classification under Streaming Emerging New Classes: A Solution using Completely Random Trees
This paper investigates an important problem in stream mining, i.e.,
classification under streaming emerging new classes or SENC. The common
approach is to treat it as a classification problem and solve it using either a
supervised learner or a semi-supervised learner. We propose an alternative
approach by using unsupervised learning as the basis to solve this problem. The
SENC problem can be decomposed into three sub problems: detecting emerging new
classes, classifying for known classes, and updating models to enable
classification of instances of the new class and detection of more emerging new
classes. The proposed method employs completely random trees which have been
shown to work well in unsupervised learning and supervised learning
independently in the literature. This is the first time, as far as we know,
that completely random trees are used as a single common core to solve all
three sub problems: unsupervised learning, supervised learning and model update
in data streams. We show that the proposed unsupervised-learning-focused method
often achieves significantly better outcomes than existing
classification-focused methods.
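To make the idea of one tree structure serving both detection and classification concrete, here is a minimal sketch. It is not the paper's algorithm: the isolation-style depth score, the `min_depth` threshold, the toy data, and all function names are assumptions made for illustration. Each tree picks its split attribute and split value at random, ignoring labels. Points that reach a leaf after only a few splits lie in sparse regions and are flagged as a possible new class. Otherwise the labels stored in the leaves vote on a known class.

```python
import random
from collections import Counter

def build_tree(data, rng, depth=0, max_depth=8):
    """Grow one completely random tree from a list of (features, label) pairs.
    Split attribute and cut point are random; labels are only stored in leaves."""
    if len(data) <= 1 or depth >= max_depth:
        return {"labels": Counter(lbl for _, lbl in data)}
    dim = rng.randrange(len(data[0][0]))
    values = [x[dim] for x, _ in data]
    lo, hi = min(values), max(values)
    if lo == hi:  # nothing to split on along this attribute
        return {"labels": Counter(lbl for _, lbl in data)}
    cut = rng.uniform(lo, hi)
    left = [d for d in data if d[0][dim] < cut]
    right = [d for d in data if d[0][dim] >= cut]
    return {"dim": dim, "cut": cut,
            "left": build_tree(left, rng, depth + 1, max_depth),
            "right": build_tree(right, rng, depth + 1, max_depth)}

def descend(tree, x):
    """Return the leaf reached by x and the number of splits taken to get there."""
    depth = 0
    while "dim" in tree:
        tree = tree["left"] if x[tree["dim"]] < tree["cut"] else tree["right"]
        depth += 1
    return tree, depth

def mean_depth(forest, x):
    """Average leaf depth of x over the forest; small values suggest a sparse region."""
    return sum(descend(t, x)[1] for t in forest) / len(forest)

def predict(forest, x, min_depth):
    """Return 'new' if x is isolated too quickly (hypothetical threshold),
    otherwise the majority label among the leaves that x falls into."""
    if mean_depth(forest, x) < min_depth:
        return "new"
    votes = Counter()
    for tree in forest:
        votes.update(descend(tree, x)[0]["labels"])
    return votes.most_common(1)[0][0] if votes else "new"

# Toy data (assumed): two known classes as Gaussian blobs in the plane.
rng = random.Random(0)
known = [((rng.gauss(0, 1), rng.gauss(0, 1)), "a") for _ in range(100)]
known += [((rng.gauss(6, 1), rng.gauss(6, 1)), "b") for _ in range(100)]
forest = [build_tree(known, rng) for _ in range(50)]
```

In this sketch, a point far from both blobs, such as `(30.0, -30.0)`, reaches a leaf in fewer splits on average than a point at the centre of a blob, which is the signal used to flag it. Model update, the third subproblem, is not covered here.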
Clustering comparison of point processes with applications to random geometric models
In this chapter we review some examples, methods, and recent results
involving comparison of clustering properties of point processes. Our approach
is founded on some basic observations allowing us to consider void
probabilities and moment measures as two complementary tools for capturing
clustering phenomena in point processes. As might be expected, smaller values
of these characteristics indicate less clustering. Also, various global and
local functionals of random geometric models driven by point processes admit
more or less explicit bounds involving void probabilities and moment measures,
thus aiding the study of the impact of clustering of the underlying point process.
When stronger tools are needed, directional convex ordering of point processes
happens to be an appropriate choice, as well as the notion of (positive or
negative) association, when comparison to the Poisson point process is
considered. We explain the relations between these tools and provide examples
of point processes admitting them. Furthermore, we sketch some recent results
obtained using the aforementioned comparison tools, regarding percolation and
coverage properties of the Boolean model, the SINR model, subgraph counts in
random geometric graphs, and more generally, U-statistics of point processes.
We also mention some results on Betti numbers for \v{C}ech and Vietoris-Rips
random complexes generated by stationary point processes. A general observation
is that many of the results derived previously for the Poisson point process
generalise to some "sub-Poisson" processes, defined as those clustering less
than the Poisson process in the sense of void probabilities and moment
measures, negative association or dcx-ordering.
Comment: 44 pages, 4 figure
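As a reading aid, the comparison with the Poisson process in terms of void probabilities and moment measures can be written out as follows. This is one common formalisation and is offered as an assumption about the intended definitions, not a quotation from the chapter. The symbols $\Phi$, $\alpha$ and $\nu$ are notation introduced here.

```latex
% Phi: a point process; alpha(B) = E[Phi(B)] its mean measure; B, B_i bounded Borel sets.
% Void probability of Phi on B:
\nu(B) = \mathbb{P}\bigl(\Phi(B) = 0\bigr).
% For a Poisson point process with mean measure alpha:
\nu_{\mathrm{Poisson}}(B) = e^{-\alpha(B)}.
% Less clustering than Poisson in the sense of void probabilities:
\mathbb{P}\bigl(\Phi(B) = 0\bigr) \le e^{-\alpha(B)}.
% Less clustering than Poisson in the sense of moment measures,
% for pairwise disjoint B_1, ..., B_k (equality holds for Poisson):
\mathbb{E}\Bigl[\prod_{i=1}^{k} \Phi(B_i)\Bigr] \le \prod_{i=1}^{k} \alpha(B_i).
```

Both inequalities match the abstract's remark that smaller values of these characteristics indicate less clustering.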
Structure in the 3D Galaxy Distribution: I. Methods and Example Results
Three methods for detecting and characterizing structure in point data, such
as that generated by redshift surveys, are described: classification using
self-organizing maps, segmentation using Bayesian blocks, and density
estimation using adaptive kernels. The first two methods are new, and allow
detection and characterization of structures of arbitrary shape and at a wide
range of spatial scales. These methods should elucidate not only clusters, but
also the more distributed, wide-ranging filaments and sheets, and further allow
the possibility of detecting and characterizing an even broader class of
shapes. The methods are demonstrated and compared in application to three data
sets: a carefully selected volume-limited sample from the Sloan Digital Sky
Survey redshift data, a similarly selected sample from the Millennium
Simulation, and a set of points independently drawn from a uniform probability
distribution -- a so-called Poisson distribution. We demonstrate a few of the
many ways in which these methods elucidate large scale structure in the
distribution of galaxies in the nearby Universe.
Comment: Re-posted after referee corrections along with partially re-written
introduction. 80 pages, 31 figures, ApJ in Press. For full sized figures
please download from: http://astrophysics.arc.nasa.gov/~mway/lss1.pd
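Of the three methods named in the abstract, adaptive kernel density estimation is the easiest to illustrate briefly. The sketch below is a generic two-stage estimator in one dimension with a Gaussian kernel, not the paper's implementation. The sensitivity exponent `alpha=0.5`, the function names, and the reduction from three dimensions to one are assumptions made for illustration. A fixed-bandwidth pilot estimate is computed first, then each data point receives its own bandwidth, wider where the pilot density is low.

```python
import math

def gauss(u):
    """Standard normal kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def fixed_kde(x, sample, h):
    """Pilot estimate: ordinary kernel density estimate with one bandwidth h."""
    return sum(gauss((x - xi) / h) for xi in sample) / (len(sample) * h)

def local_bandwidths(sample, h, alpha=0.5):
    """Per-point bandwidths h * (pilot_i / g) ** (-alpha), with g the geometric
    mean of the pilot densities, so sparse regions get wider kernels."""
    pilot = [fixed_kde(xi, sample, h) for xi in sample]
    g = math.exp(sum(math.log(p) for p in pilot) / len(pilot))
    return [h * (p / g) ** (-alpha) for p in pilot]

def adaptive_kde(x, sample, bandwidths):
    """Adaptive estimate: each data point contributes a kernel of its own width."""
    return sum(gauss((x - xi) / hi) / hi
               for xi, hi in zip(sample, bandwidths)) / len(sample)

# Toy sample (assumed): a tight group near 0 and one isolated point at 5.
sample = [0.0, 0.1, 0.2, 0.15, 0.05, 5.0]
bandwidths = local_bandwidths(sample, h=0.5)
```

In this toy sample the isolated point at 5.0 receives a larger bandwidth than the points in the tight group. That is the intended effect: fine resolution in dense regions and smoothing in sparse ones.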
Random wavelet series based on a tree-indexed Markov chain
We study the global and local regularity properties of random wavelet series
whose coefficients exhibit correlations given by a tree-indexed Markov chain.
We determine the law of the spectrum of singularities of these series, thereby
performing their multifractal analysis. We also show that almost every sample
path displays an oscillating singularity at almost every point and that the
points at which a sample path has at most a given Hölder exponent form a set
with large intersection.
Comment: 25 page
Local energy statistics in disordered systems: a proof of the local REM conjecture
Recently, Bauke and Mertens conjectured that the local statistics of energies
in random spin systems with discrete spin space should in most circumstances be
the same as in the random energy model. Here we give necessary conditions for
this hypothesis to be true, which we show to hold in wide classes of examples:
short range spin glasses and mean field spin glasses of the SK type. We also
show that, under certain conditions, the conjecture holds even if energy levels
that grow moderately with the volume of the system are considered.