Detecting Baryon Acoustic Oscillations
Baryon Acoustic Oscillations are a feature imprinted in the galaxy
distribution by acoustic waves traveling in the plasma of the early universe.
Their detection at the expected scale in large-scale structures strongly
supports current cosmological models with a nearly linear evolution from
redshift approximately 1000, and the existence of dark energy. In addition,
BAOs provide a standard ruler for studying cosmic expansion. In this paper we focus
on methods for BAO detection using the correlation function measurement. For
each method, we want to understand the tested hypothesis (the hypothesis H0 to
be rejected) and the underlying assumptions. We first present wavelet methods
which are mildly model-dependent and mostly sensitive to the BAO feature. Then
we turn to fully model-dependent methods. We present the most often used method
based on the chi^2 statistic, but we find it has limitations. In general the
assumptions of the chi^2 method do not hold, so it gives only a rough estimate
of the significance. This estimate can become badly wrong under more realistic
hypotheses, where the covariance matrix of the measurement depends on
cosmological parameters. Instead we propose a new
method based on two modifications: we modify the procedure for computing the
significance and make it rigorous, and we modify the statistic to obtain better
results in the case of varying covariance matrix. We verify with simulations
that correct significances are different from the ones obtained using the
classical chi^2 procedure. We also test a simple example of varying covariance
matrix. In this case we find that our modified statistic outperforms the
classical chi^2 statistic when both significances are correctly computed.
Finally we find that taking into account variations of the covariance matrix
can change both BAO detection levels and cosmological parameter constraints.
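The key procedural change can be sketched as follows: instead of reading the significance of the chi^2 difference off the classical chi^2 distribution, it is calibrated by Monte Carlo under the no-BAO null hypothesis H0. This is a minimal illustration, not the paper's actual pipeline; the model vectors, bin count, Gaussian-mock scheme, and fixed covariance are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def chi2(xi, model, cov_inv):
    """Chi-square of a measured correlation function xi against a model."""
    r = xi - model
    return r @ cov_inv @ r

def detection_significance(xi, model_bao, model_nobao, cov, n_sim=2000):
    """Calibrate Delta-chi^2 = chi^2(no-BAO) - chi^2(BAO) by Monte Carlo
    under H0 (no BAO), instead of assuming the classical chi^2 law."""
    cov_inv = np.linalg.inv(cov)
    delta_obs = chi2(xi, model_nobao, cov_inv) - chi2(xi, model_bao, cov_inv)
    chol = np.linalg.cholesky(cov)
    deltas = np.empty(n_sim)
    for i in range(n_sim):
        # Gaussian mock measurement drawn under the no-BAO hypothesis.
        mock = model_nobao + chol @ rng.standard_normal(len(xi))
        deltas[i] = chi2(mock, model_nobao, cov_inv) - chi2(mock, model_bao, cov_inv)
    # p-value: fraction of H0 mocks at least as BAO-like as the data.
    return np.mean(deltas >= delta_obs)
```

The same loop generalises to a covariance matrix that varies with the hypothesised cosmology, which is exactly the regime where the classical chi^2 shortcut breaks down.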
Voxel selection in fMRI data analysis based on sparse representation
Multivariate pattern analysis approaches to detecting brain regions from fMRI data have been gaining attention recently. In this study, we introduce an iterative sparse-representation-based algorithm for detecting voxels in functional MRI (fMRI) data that carry task-relevant information. In each iteration of the algorithm, a linear programming problem is solved and a sparse weight vector is obtained; the final weight vector is the mean of those obtained over all iterations. The characteristics of our algorithm are as follows: 1) the weight vector (output) is sparse; 2) the magnitude of each entry of the weight vector represents the significance of its corresponding variable or feature in a classification or regression problem; and 3) due to the convergence of the algorithm, a stable weight vector is obtained. To demonstrate the validity of our algorithm and illustrate its application, we apply it to the Pittsburgh Brain Activity Interpretation Competition 2007 fMRI dataset to select the voxels most relevant to the subjects' tasks. Based on this dataset, the aforementioned characteristics of our algorithm are analyzed, and our method is compared with univariate general-linear-model-based statistical parametric mapping. Using our method, a combination of voxels is selected based on the principle of effective/sparse representation of a task. The data analysis results show that this combination of voxels is suitable for decoding tasks, demonstrating the effectiveness of our method.
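A minimal sketch of this kind of iterative LP-based selection. The abstract does not give the exact LP formulation or resampling scheme, so the hard-margin L1-minimisation and the 80% subsampling below are assumptions:

```python
import numpy as np
from scipy.optimize import linprog

def sparse_weights_lp(X, y):
    """One LP iteration: minimize ||w||_1 subject to y_i (w.x_i + b) >= 1.
    Sparsity comes from the L1 objective, linearised via w = u - v, u, v >= 0."""
    n, d = X.shape
    c = np.r_[np.ones(2 * d), 0.0]               # minimize sum(u) + sum(v)
    A = -y[:, None] * np.c_[X, -X, np.ones(n)]   # -y_i((u - v).x_i + b) <= -1
    res = linprog(c, A_ub=A, b_ub=-np.ones(n),
                  bounds=[(0, None)] * (2 * d) + [(None, None)])
    u, v = res.x[:d], res.x[d:2 * d]
    return u - v

def iterative_selection(X, y, n_iter=20, frac=0.8, seed=0):
    """Average sparse LP solutions over random subsamples; the magnitude of
    each entry of the mean weight vector scores the corresponding voxel."""
    rng = np.random.default_rng(seed)
    n = len(y)
    W = []
    for _ in range(n_iter):
        idx = rng.choice(n, size=int(frac * n), replace=False)
        W.append(sparse_weights_lp(X[idx], y[idx]))
    return np.mean(W, axis=0)
```

Averaging over iterations stabilises the support of the weight vector, which is the convergence property the abstract emphasises.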
Image feature analysis using the Multiresolution Fourier Transform
The problem of identifying boundary contours or line structures is widely recognised
as an important component in many applications of image analysis and computer
vision. Typical solutions to the problem employ some form of edge detection
followed by line following or, more commonly in recent years, Hough transforms.
To reduce the processing requirements of such methods and to improve their
robustness, a number of authors have explored multiresolution approaches to the
problem. Non-parametric, iterative approaches such as relaxation labelling and
"Snakes" have also been used.
This thesis presents a boundary detection algorithm based on a multiresolution
image representation, the Multiresolution Fourier Transform (MFT), which represents
an image over a range of spatial/spatial-frequency resolutions. A quadtree based
image model is described in which each leaf is a region which can be modelled using
one of a set of feature classes. Consideration is given to using linear and circular arc
features for this modelling, and frequency domain models are developed for them.
A general model based decision process is presented and shown to be applicable
to detecting local image features, selecting the most appropriate scale for modelling
each region of the image and linking the local features into the region boundary
structures of the image. The use of a consistent inference process for all of the
subtasks in the boundary detection represents a significant improvement over the
ad hoc assemblies of estimation and detection that have been common in previous work.
Although the process is applied using a restricted set of local features, the framework
presented allows for expansion of the number of boundary feature models and the
possible inclusion of models of region properties. Results are presented demonstrating
the effective application of these procedures to a number of synthetic and natural
images.
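The quadtree decomposition underlying such an image model can be sketched as follows; a plain variance threshold stands in here for the MFT feature-class fit, so the splitting criterion and the power-of-two image size are assumptions:

```python
import numpy as np

def quadtree(img, x=0, y=0, size=None, tol=10.0, min_size=2):
    """Recursively split a square (power-of-two) image region into four
    quadrants until the region is homogeneous enough to be described by a
    single feature model -- approximated here by a variance threshold."""
    if size is None:
        size = img.shape[0]
    block = img[y:y + size, x:x + size]
    if size <= min_size or block.var() <= tol:
        return [(x, y, size)]      # leaf region, to be fitted by a feature class
    h = size // 2
    leaves = []
    for dx, dy in ((0, 0), (h, 0), (0, h), (h, h)):
        leaves += quadtree(img, x + dx, y + dy, h, tol, min_size)
    return leaves
```

Each leaf returned here would, in the thesis's framework, be modelled by a linear or circular-arc feature class in the frequency domain.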
Characterization of groups using composite kernels and multi-source fMRI analysis data: application to schizophrenia
Pattern classification of brain imaging data can enable the automatic detection of differences in the cognitive processes of specific groups of interest. Furthermore, it can also give neuroanatomical information about the brain regions that are most relevant for detecting these differences, by means of feature selection procedures, which are also well suited to deal with the high dimensionality of brain imaging data. This work proposes the application of recursive feature elimination using a machine learning algorithm based on composite kernels to the classification of healthy controls and patients with schizophrenia. This framework, which evaluates nonlinear relationships between voxels, analyzes whole-brain fMRI data from an auditory task experiment that is segmented into anatomical regions and recursively eliminates the uninformative ones based on their relevance estimates, thus yielding the set of brain areas most discriminative for group classification. The collected data were processed using two analysis methods: the general linear model (GLM) and independent component analysis (ICA). GLM spatial maps as well as ICA temporal lobe and default mode component maps were then input to the classifier. A mean classification accuracy of up to 95%, estimated with a leave-two-out cross-validation procedure, was achieved by multi-source data classification. In addition, it is shown that the classification accuracy obtained using multi-source data surpasses that reached using single-source data, showing that this algorithm takes advantage of the complementary nature of GLM and ICA.
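A toy sketch of region-level recursive elimination with per-region kernels. Kernel-target alignment is used below as a stand-in for the paper's relevance estimate, and the RBF kernel and region encoding are assumptions:

```python
import numpy as np

def rbf(Xr, gamma=0.1):
    """RBF kernel restricted to the voxels of one anatomical region."""
    d = ((Xr[:, None, :] - Xr[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d)

def alignment(K, y):
    """Kernel-target alignment: how well kernel K matches the labels y."""
    Y = np.outer(y, y)
    return (K * Y).sum() / (np.linalg.norm(K) * np.linalg.norm(Y))

def recursive_region_elimination(X, y, regions, keep=1):
    """Greedily eliminate the region whose per-region kernel is least aligned
    with the class labels, until `keep` regions remain. `regions` maps region
    names to voxel (column) index lists."""
    active = dict(regions)
    while len(active) > keep:
        worst = min(active, key=lambda r: alignment(rbf(X[:, active[r]]), y))
        del active[worst]
    return sorted(active)
```

The composite kernel of the paper would sum the per-region kernels of the surviving regions; here the elimination loop alone is illustrated.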
Novel methods for multi-view learning with applications in cyber security
Modern data is complex. It exists in many different forms, shapes and kinds. Vectors, graphs, histograms, sets, intervals, etc.: they each have distinct and varied structural properties. Tailoring models to the characteristics of various feature representations has been the subject of considerable research. In this thesis, we address the challenge of learning from data that is described by multiple heterogeneous feature representations.
This situation arises often in cyber security contexts. Data from a computer network can be represented by a graph of user authentications, a time series of network traffic, a tree of process events, etc. Each representation provides a complementary view of the holistic state of the network, and so data of this type is referred to as multi-view data. Our motivating problem in cyber security is anomaly detection: identifying unusual observations in a joint feature space, which may not appear anomalous marginally.
Our contributions include the development of novel supervised and unsupervised methods, which are applicable not only to cyber security but to multi-view data in general. We extend the generalised linear model to operate in a vector-valued reproducing kernel Hilbert space implied by an operator-valued kernel function, which can be tailored to the structural characteristics of multiple views of data. This is a highly flexible algorithm, able to predict a wide variety of response types. A distinguishing feature is the ability to simultaneously identify outlier observations with respect to the fitted model. Our proposed unsupervised learning model extends multidimensional scaling to directly map multi-view data into a shared latent space. This vector embedding captures both commonalities and disparities that exist between multiple views of the data. Throughout the thesis, we demonstrate our models using real-world cyber security datasets.
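The unsupervised embedding idea can be illustrated with a much simpler construction: averaging per-view squared distances and applying classical MDS. The thesis's model optimises a joint objective directly; the averaging step below is purely illustrative.

```python
import numpy as np

def classical_mds(D_sq, k=2):
    """Classical multidimensional scaling of a squared-distance matrix."""
    n = D_sq.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ D_sq @ J                    # double-centred Gram matrix
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:k]              # top-k eigenpairs
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0))

def multiview_embedding(distance_matrices, k=2):
    """Embed multi-view data in a shared latent space by averaging the
    per-view squared distances before MDS. Each input matrix holds the
    pairwise distances of the same observations under one view."""
    D_sq = np.mean([d ** 2 for d in distance_matrices], axis=0)
    return classical_mds(D_sq, k)
```

Because each view contributes only a distance matrix, the views may be graphs, time series, or trees, as long as a suitable per-view distance is available.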
Two dimensional statistical linear discriminant analysis for real-time robust vehicle type recognition
Automatic vehicle Make and Model Recognition (MMR) systems provide useful performance enhancements to vehicle
recognition systems that are based solely on Automatic License Plate Recognition (ALPR). Several car MMR
systems have been proposed in the literature. However, these approaches are based on feature detection algorithms that can
perform sub-optimally under adverse lighting and/or occlusion conditions. In this paper we propose a real-time,
appearance-based car MMR approach using Two-Dimensional Linear Discriminant Analysis (2D-LDA) that is capable of
addressing this limitation. We provide experimental results analysing the proposed algorithm's robustness under varying
illumination and occlusion conditions. We show that the best performance with the proposed 2D-LDA based car
MMR approach is obtained when the eigenvectors of lower significance are ignored. For the given database of 200 car
images of 25 different make-model classes, a best accuracy of 91% was obtained with the 2D-LDA approach. We
use a direct Principal Component Analysis (PCA) based approach as a benchmark to compare and contrast the
performance of the proposed 2D-LDA approach to car MMR. We conclude that in general the 2D-LDA based algorithm
surpasses the performance of the PCA based approach.
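A compact sketch of 2D-LDA in this spirit: images are kept as matrices, and projection directions come from the generalized eigenproblem of the column scatter matrices. The scatter definitions follow the standard 2D-LDA formulation; the dataset details in any usage are invented for illustration.

```python
import numpy as np
from scipy.linalg import eigh

def two_d_lda(images, labels, n_components=2):
    """2D-LDA on an array of images with shape (n, height, width): find
    column-projection directions maximising between-class over within-class
    scatter. Keeping only the top components discards the eigenvectors of
    lower significance."""
    labels = np.asarray(labels)
    mean_all = images.mean(axis=0)
    d = images.shape[2]
    Sb = np.zeros((d, d))          # between-class column scatter
    Sw = np.zeros((d, d))          # within-class column scatter
    for c in np.unique(labels):
        Xc = images[labels == c]
        mc = Xc.mean(axis=0)
        diff = mc - mean_all
        Sb += len(Xc) * diff.T @ diff
        for img in Xc:
            e = img - mc
            Sw += e.T @ e
    Sw += 1e-6 * np.eye(d)         # small ridge for numerical stability
    w, V = eigh(Sb, Sw)            # generalized symmetric eigenproblem
    order = np.argsort(w)[::-1][:n_components]
    return V[:, order]             # project an image A as A @ W
```

Unlike vectorised LDA or PCA, the scatter matrices here are only width-by-width, which is what makes the approach practical in real time.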
Steganographer Identification
Conventional steganalysis detects the presence of steganography within single
objects. In the real world, we may face a more complex scenario in which one or
more of multiple users, called actors, are guilty of using steganography; this
is typically defined as the Steganographer Identification Problem (SIP). One
might use conventional steganalysis algorithms to separate stego objects from
cover objects and then identify the guilty actors. However, the guilty actors
may be missed due to a number of false alarms. To deal with the SIP, most
state-of-the-art solutions use unsupervised-learning-based approaches. In these
solutions, each actor holds multiple digital objects, from which a set of
feature vectors can be extracted. Well-defined distances between these feature
sets are computed to measure the similarity between the corresponding actors.
By applying clustering or outlier detection, the most suspicious actor(s) are
judged to be the steganographer(s). Though the SIP needs further study, the
existing works can reliably identify the steganographer(s) when non-adaptive
steganographic embedding is applied. In this chapter, we present foundational
concepts and review advanced methodologies for the SIP. This chapter is
self-contained and intended as a tutorial introducing the SIP in the context of
media steganography.
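The distance-plus-outlier-detection recipe described above can be sketched as follows. Squared MMD with an RBF kernel serves here as the set distance, which is an assumption, since the chapter surveys several distance choices:

```python
import numpy as np

def mmd(A, B, gamma=0.5):
    """Squared maximum mean discrepancy between two feature sets, using an
    RBF kernel: small when the sets come from the same distribution."""
    def kmean(X, Y):
        d = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d).mean()
    return kmean(A, A) + kmean(B, B) - 2 * kmean(A, B)

def most_suspicious_actor(feature_sets):
    """Flag the actor whose feature set is, on average, farthest from all
    others -- a simple outlier-detection reading of the SIP. Each element of
    `feature_sets` is an (objects x features) array for one actor."""
    n = len(feature_sets)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = mmd(feature_sets[i], feature_sets[j])
    return int(np.argmax(D.mean(axis=1)))
```

Replacing the final argmax with a clustering step recovers the other family of SIP solutions mentioned in the abstract.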