Classifying LEP Data with Support Vector Algorithms
We have studied the application of different classification algorithms in the
analysis of simulated high energy physics data. Whereas Neural Network
algorithms have become a standard tool for data analysis, the performance of
other classifiers such as Support Vector Machines has not yet been tested in
this environment. We chose two different problems to compare the performance of
a Support Vector Machine and a Neural Net trained with back-propagation:
tagging events of the type e+e- -> ccbar and the identification of muons
produced in multihadronic e+e- annihilation events.
Comment: 7 pages, 4 figures, submitted to proceedings of AIHENP99, Crete, April 199
Synthesis and Characterization of Copolymers of Lanthanide Complexes with Styrene
Copolymers of 2-methyl-5-phenylpentene-1-dione-3,5 with styrene in a 5:95 ratio, containing Eu, Yb and Eu, Yb with 1,10-phenanthroline, were synthesized for the first time. The luminescence spectra of the obtained metal complexes and copolymers in solution, in films, and in the solid state were investigated and analyzed. The solubilization of β-diketonate complexes with phenanthroline was shown to change the luminescence intensity of such complexes. The obtained copolymers can be used as potential materials for organic light-emitting devices.
A Kernel Method for the Two-sample Problem
We propose a framework for analyzing and comparing distributions, allowing us to design statistical tests to determine if two samples are drawn from different distributions. Our test statistic is the largest difference in expectations over functions in the unit ball of a reproducing kernel Hilbert space (RKHS). We present two tests based on large deviation bounds for the test statistic, while a third is based on the asymptotic distribution of this statistic. The test statistic can be computed in quadratic time, although efficient linear time approximations are available. Several classical metrics on distributions are recovered when the function space used to compute the difference in expectations is allowed to be more general (e.g., a Banach space). We apply our two-sample tests to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where they perform strongly. Excellent performance is also obtained when comparing distributions over graphs, for which these are the first such tests.
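As a concrete illustration of the quadratic-time statistic, the sketch below computes a biased estimate of the squared maximum mean discrepancy (MMD) between two samples. The Gaussian kernel and its bandwidth are illustrative assumptions, not choices prescribed by the abstract:

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    # RBF kernel on scalars; sigma is an assumed bandwidth
    return math.exp(-((x - y) ** 2) / (2.0 * sigma ** 2))

def mmd2_biased(xs, ys, sigma=1.0):
    """Biased estimate of squared MMD: the RKHS-unit-ball test statistic.

    Sums all pairwise kernel evaluations, so it runs in quadratic time,
    matching the complexity noted in the abstract.
    """
    m, n = len(xs), len(ys)
    kxx = sum(gaussian_kernel(a, b, sigma) for a in xs for b in xs) / (m * m)
    kyy = sum(gaussian_kernel(a, b, sigma) for a in ys for b in ys) / (n * n)
    kxy = sum(gaussian_kernel(a, b, sigma) for a in xs for b in ys) / (m * n)
    return kxx + kyy - 2.0 * kxy
```

For identical samples the statistic is zero; for samples from well-separated distributions it grows toward the sum of the two within-sample kernel means.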
Local Ranking Problem on the BrowseGraph
The "Local Ranking Problem" (LRP) is related to the computation of a
centrality-like rank on a local graph, where the scores of the nodes could
significantly differ from the ones computed on the global graph. Previous work
has studied LRP on the hyperlink graph but never on the BrowseGraph, namely a
graph where nodes are webpages and edges are browsing transitions. Recently,
this graph has received more and more attention in many different tasks such as
ranking, prediction, and recommendation. However, a web server has access only
to the browsing traffic performed on its own pages (the local BrowseGraph) and,
as a consequence, the local computation can lead to estimation errors, which
hinders the growing number of applications that build on it. Also, although
the divergence between the local and global ranks has been measured, the
possibility of estimating such divergence using only local knowledge has been
mainly overlooked. These aspects are of great interest for online service
providers who want to: (i) gauge their ability to correctly assess the
importance of their resources only based on their local knowledge, and (ii)
take into account real user browsing fluxes that better capture the actual user
interest than the static hyperlink network. We study the LRP on the
BrowseGraph of a large news provider, considering as subgraphs the
aggregations of the browsing traces of users coming from different domains. We
show that the distance between the rankings can be accurately predicted based
only on structural information of the local graph, achieving an average rank
correlation as high as 0.8.
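To make the local-versus-global comparison concrete, here is a minimal sketch (not the paper's implementation): PageRank is computed on a toy global graph and on the local subgraph visible to one server, and the agreement of the two rankings on the shared nodes is measured with Kendall's tau. The toy graph, damping factor, and iteration count are illustrative assumptions:

```python
def pagerank(graph, damping=0.85, iters=50):
    """Power-iteration PageRank; graph maps each node to its out-neighbours."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {v: (1.0 - damping) / n for v in nodes}
        for v, outs in graph.items():
            if outs:
                share = damping * rank[v] / len(outs)
                for w in outs:
                    new[w] += share
            else:  # dangling node: spread its mass evenly
                for w in nodes:
                    new[w] += damping * rank[v] / n
        rank = new
    return rank

def kendall_tau(scores_a, scores_b, keys):
    """Rank agreement of two score dicts over the shared node set `keys`."""
    keys = list(keys)
    conc = disc = 0
    for i in range(len(keys)):
        for j in range(i + 1, len(keys)):
            s = ((scores_a[keys[i]] - scores_a[keys[j]])
                 * (scores_b[keys[i]] - scores_b[keys[j]]))
            if s > 0:
                conc += 1
            elif s < 0:
                disc += 1
    return (conc - disc) / (conc + disc) if conc + disc else 1.0

# Toy global BrowseGraph and the local subgraph seen by one server;
# page "d" is invisible to the local view.
global_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a", "d"], "d": ["a"]}
local_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}

g_rank = pagerank(global_graph)
l_rank = pagerank(local_graph)
tau = kendall_tau(g_rank, l_rank, local_graph)
```

The gap between `tau` and 1 is exactly the kind of local-to-global divergence the paper predicts from structural features of the local graph alone.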
On landmark selection and sampling in high-dimensional data analysis
In recent years, the spectral analysis of appropriately defined kernel
matrices has emerged as a principled way to extract the low-dimensional
structure often prevalent in high-dimensional data. Here we provide an
introduction to spectral methods for linear and nonlinear dimension reduction,
emphasizing ways to overcome the computational limitations currently faced by
practitioners with massive datasets. In particular, a data subsampling or
landmark selection process is often employed to construct a kernel based on
partial information, followed by an approximate spectral analysis termed the
Nystrom extension. We provide a quantitative framework to analyse this
procedure, and use it to demonstrate algorithmic performance bounds on a range
of practical approaches designed to optimize the landmark selection process. We
compare the practical implications of these bounds by way of real-world
examples drawn from the field of computer vision, whereby low-dimensional
manifold structure is shown to emerge from high-dimensional video data streams.
Comment: 18 pages, 6 figures, submitted for publication
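The Nyström extension described above can be sketched in a few lines: the full kernel matrix is approximated from only the columns associated with a chosen set of landmark points. The RBF kernel and the use of a pseudoinverse are illustrative assumptions:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def nystrom(X, landmarks, gamma=1.0):
    """Rank-m Nystrom approximation of the full n x n kernel matrix,
    built from the kernel evaluated only against m landmark points."""
    C = rbf_kernel(X, landmarks, gamma)          # n x m cross-kernel
    W = rbf_kernel(landmarks, landmarks, gamma)  # m x m landmark kernel
    return C @ np.linalg.pinv(W) @ C.T
```

Choosing the m landmarks, whether uniformly at random or by a more careful subsampling scheme, is exactly the step the bounds in the text quantify; with all n points as landmarks the approximation is exact.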
A framework for space-efficient string kernels
String kernels are typically used to compare genome-scale sequences whose
length makes alignment impractical, yet their computation is based on data
structures that are either space-inefficient, or incur large slowdowns. We show
that a number of exact string kernels, like the k-mer kernel, the substrings
kernels, a number of length-weighted kernels, the minimal absent words kernel,
and kernels with Markovian corrections, can all be computed in O(nd) time and
in o(n) bits of space in addition to the input, using just a rangeDistinct
data structure on the Burrows-Wheeler transform of the
input strings, which takes O(d) time per element in its output. The same
bounds hold for a number of measures of compositional complexity based on
multiple values of k, like the k-mer profile and the k-th order empirical
entropy, and for calibrating the value of k using the data.
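As a point of reference for the simplest member of this family, a direct hash-based (not space-efficient, not BWT-based) computation of the k-mer profile and k-mer kernel looks as follows; the framework in the abstract replaces exactly this kind of explicit profile with operations on the Burrows-Wheeler transform:

```python
from collections import Counter

def kmer_profile(s, k):
    """Multiset of all length-k substrings of s."""
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))

def kmer_kernel(s, t, k):
    """k-mer (spectrum) kernel: inner product of the two k-mer profiles."""
    ps, pt = kmer_profile(s, k), kmer_profile(t, k)
    return sum(ps[w] * pt[w] for w in ps if w in pt)
```

This takes space proportional to the number of distinct k-mers, which is what makes it impractical at genome scale and motivates the compressed alternative.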
Deep Learning for Forecasting Stock Returns in the Cross-Section
Many studies have been undertaken by using machine learning techniques,
including neural networks, to predict stock returns. Recently, a method known
as deep learning, which achieves high performance mainly in image recognition
and speech recognition, has attracted attention in the machine learning field.
This paper implements deep learning to predict one-month-ahead stock returns in
the cross-section in the Japanese stock market and investigates the performance
of the method. Our results show that deep neural networks generally outperform
shallow neural networks, and the best networks also outperform representative
machine learning models. These results indicate that deep learning shows
promise as a skillful machine learning method to predict stock returns in the
cross-section.
Comment: 12 pages, 2 figures, 8 tables, accepted at PAKDD 201
Robust artificial neural networks and outlier detection. Technical report
Large outliers break down linear and nonlinear regression models. Robust
regression methods allow one to filter out the outliers when building a model.
By replacing the traditional least squares criterion with the least trimmed
squares criterion, in which half of the data is treated as potential outliers,
one can fit accurate regression models to strongly contaminated data.
High-breakdown methods are well established in linear regression, but have
only recently begun to be applied to non-linear regression. In this work, we
examine the problem of fitting artificial neural networks to contaminated data
using the least trimmed squares criterion. We introduce a
penalized least trimmed squares criterion which prevents unnecessary removal of
valid data. Training of ANNs leads to a challenging non-smooth global
optimization problem. We compare the efficiency of several derivative-free
optimization methods in solving it, and show that our approach identifies the
outliers correctly when ANNs are used for nonlinear regression.
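The trimming idea is simple to state in code. A minimal sketch of the plain least trimmed squares criterion follows (the penalized variant adds a term discouraging unnecessary removal of valid points; its exact form is not given in the abstract, so it is omitted here):

```python
def trimmed_squares(residuals, h):
    """Least trimmed squares loss: sum of the h smallest squared residuals.

    The largest squared residuals, i.e. the suspected outliers, never enter
    the objective, which is what gives the criterion its high breakdown point.
    """
    return sum(sorted(r * r for r in residuals)[:h])
```

With h set to roughly half the sample size, a single gross outlier contributes nothing to the loss, whereas it would dominate an ordinary least squares objective.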
Neuropathology in COVID-19 autopsies is defined by microglial activation and lesions of the white matter with emphasis in cerebellar and brain stem areas
Introduction: This study aimed to investigate microglial and macrophage activation in 17 patients who died in the context of a COVID-19 infection in 2020 and 2021. Methods: Through immunohistochemical analysis, the lysosomal marker CD68 was used to detect diffuse parenchymal microglial activity, pronounced perivascular macrophage activation, and macrophage clusters. COVID-19 patients were compared to control patients and grouped with regard to clinical aspects. Detection of viral proteins was attempted in different regions with multiple commercially available antibodies. Results: Microglial and macrophage activation was most pronounced in the white matter, with emphasis on brain stem and cerebellar areas. Analysis of lesion patterns yielded no correlation between disease severity and neuropathological changes. The occurrence of macrophage clusters could not be associated with a severe course of disease or with preconditions, but represents a more advanced stage of microglial and macrophage activation. Severe neuropathological changes in COVID-19 were comparable to those in severe influenza. Hypoxic damage was not a confounder of the described neuropathology. The macrophage/microglia reaction was less pronounced in post-COVID-19 patients, but still detectable, e.g., in the brain stem. Commercially available antibodies for the detection of SARS-CoV-2 viral material in immunohistochemistry yielded no specific signal over controls. Conclusion: The presented microglial and macrophage activation might be an explanation for the long COVID syndrome.