1,014 research outputs found
Automatic discovery of the statistical types of variables in a dataset
A common practice in statistics and machine learning is to assume that the statistical data types (e.g., ordinal, categorical or real-valued) of variables, and usually also the likelihood model, are known. However, as the availability of real-world data increases, this assumption becomes too restrictive. Data are often heterogeneous, complex, and improperly or incompletely documented. Surprisingly, despite their practical importance, there is still a lack of tools to automatically discover the statistical types of, as well as appropriate likelihood (noise) models for, the variables in a dataset. In this paper, we fill this gap by proposing a Bayesian method, which accurately discovers the statistical data types in both synthetic and real data.
Humboldt Research Fellowship for Postdoctoral Researchers, which funded this research during her stay at the Max Planck Institute for Software Systems.
ATI Grant EP/N510129/1
EPSRC Grant EP/N014162/1
Googl
The Double Strangeness Pentaquark and Other Exotic Hadrons in the Reaction Ξb → J/ψ ϕ Ξ
We study the possibility that four Ξ resonances (Ξ(1620), Ξ(1690), Ξ(1820), Ξ(1950)) could correspond to pentaquark states, in the form of meson-baryon bound systems. We also explore the possible existence of doubly strange pentaquarks with hidden charm (Pcss) and find two candidates structured in a similar form, at energies of 4493 MeV and 4630 MeV. The meson-baryon interaction is built from t-channel meson exchange processes, which are evaluated using effective Lagrangians. Moreover, we analyse the Ξb → Ξ J/ψ ϕ decay process, which permits exploring the existence of the heavy doubly strange pentaquark, as well as other exotic hadrons, in the three different two-body invariant mass spectra of the emitted particles. In the J/ψ ϕ mass spectrum, we analyse the nature of the X(4140) and X(4160) resonances. In the J/ψ Ξ invariant mass spectrum, we study the signal produced by the doubly strange pentaquark, and conclude that it has a good chance of being detected in this reaction if its mass is around 4580–4680 MeV. Finally, in the ϕ Ξ spectrum we study the likelihood of detecting the Ξ(2500) state.
Modeling the Dynamics of Online Learning Activity
People are increasingly relying on the Web and social media to find solutions to their problems in a wide range of domains. In this online setting, closely related problems often lead to the same characteristic learning pattern, in which people sharing these problems visit related pieces of information, perform almost identical queries or, more generally, take a series of similar actions. In this paper, we introduce a novel modeling framework for clustering continuous-time grouped streaming data, the hierarchical Dirichlet Hawkes process (HDHP), which allows us to automatically uncover a wide variety of learning patterns from detailed traces of learning activity. Our model allows for efficient inference, scaling to millions of actions taken by thousands of users. Experiments on real data gathered from Stack Overflow reveal that our framework can recover meaningful learning patterns in terms of both content and temporal dynamics, as well as accurately track users' interests and goals over time.
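As background for the Hawkes-process component named in the abstract above, the following is a minimal sketch of the conditional intensity of a plain univariate Hawkes process. It is not the HDHP model itself: the exponential kernel, the function name `hawkes_intensity`, and the parameter values `mu`, `alpha` and `beta` are all assumptions chosen for illustration.

```python
import math


def hawkes_intensity(t, history, mu=0.5, alpha=0.8, beta=1.0):
    """Intensity at time t of a univariate Hawkes process.

    Assumes an exponential triggering kernel (an illustrative choice):
        lambda(t) = mu + sum over past events s < t of alpha * exp(-beta * (t - s))
    mu is the base rate, alpha the jump added by each event, beta the decay rate.
    """
    # Only events strictly before t can excite the process at time t.
    excitation = sum(alpha * math.exp(-beta * (t - s)) for s in history if s < t)
    return mu + excitation


# With no past events the intensity equals the base rate; each past event
# raises it by an amount that decays exponentially with elapsed time.
baseline = hawkes_intensity(1.0, [])
after_one_event = hawkes_intensity(1.0, [0.0])
```

The self-exciting term is what lets bursts of closely spaced actions be grouped together in time; how the HDHP shares patterns across users is not shown here.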
Handling incomplete heterogeneous data using VAEs.
Variational autoencoders (VAEs), as well as other generative models, have been shown to be efficient and accurate at capturing the latent structure of vast amounts of complex high-dimensional data. However, existing VAEs still cannot directly handle data that are heterogeneous (mixed continuous and discrete) or incomplete (with data missing at random), both of which are common in real-world applications.
In this paper, we propose a general framework to design VAEs suitable for fitting incomplete heterogeneous data. The proposed HI-VAE includes likelihood models for real-valued, positive real-valued, interval, categorical, ordinal and count data, and allows accurate estimation (and potentially imputation) of missing data. Furthermore, HI-VAE presents competitive predictive performance in supervised tasks, outperforming supervised models when trained on incomplete data.
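The idea of pairing each attribute with its own likelihood and letting only observed entries contribute can be sketched as follows. This is an illustration under stated assumptions, not the HI-VAE implementation: the function names, the `None` marker for missing values, the fixed parameters (in a VAE they would come from a decoder network), and the restriction to three of the listed data types are all hypothetical simplifications.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log-density of a Gaussian, used here for real-valued attributes."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def categorical_logpmf(k, probs):
    """Log-probability of class index k under a categorical distribution."""
    return math.log(probs[k])


def poisson_logpmf(k, rate):
    """Log-probability of a count k under a Poisson distribution."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)


def observed_log_likelihood(row, params):
    """Sum per-attribute log-likelihoods, skipping missing entries (None)."""
    total = 0.0
    for value, (kind, p) in zip(row, params):
        if value is None:
            continue  # missing attributes contribute nothing to the objective
        if kind == "real":
            total += gaussian_logpdf(value, *p)
        elif kind == "categorical":
            total += categorical_logpmf(value, p)
        elif kind == "count":
            total += poisson_logpmf(value, p)
        else:
            raise ValueError(f"unknown attribute type: {kind}")
    return total


# One record with a real-valued, a (missing) categorical and a count attribute.
row = [0.0, None, 2]
params = [("real", (0.0, 1.0)), ("categorical", [0.2, 0.8]), ("count", 3.0)]
log_lik = observed_log_likelihood(row, params)
```

Because missing entries are skipped rather than filled in, the same record layout works for any missingness pattern.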
First genetic evaluation for the endurance discipline in the Spanish Purebred Arabian horse
The breeding scheme for the Purebred Arabian horse (Pura Raza Árabe) was approved by the Ministerio de Agricultura, Pesca y Alimentación in September 2005. It specifies that selection will be carried out to improve the traits that enhance the high performance the breed naturally shows in the endurance (raid) discipline. The first genetic evaluation for endurance in the Purebred Arabian horse has been carried out using data from 249 horses with a total of 547 participations in endurance rides of different categories. The genetic evaluation was performed for two traits: finishing position and race time. Beforehand, it was necessary to study the factors that affect performance in this discipline. The factors included in the evaluation model, because they were statistically significant, were the year in which the ride was held, the geographical area where it took place and the distance of the course in kilometres. In addition, the total number of participants in the ride was included as a covariate for finishing position, and the mean race time as a covariate for the time trait. The heritabilities obtained were low to moderate (0.18 for finishing position and 0.13 for time). The trend in breeding values for these traits shows that genetic progress has been limited so far, but the high variability of the trait ensures adequate genetic progress if an appropriate selection intensity is applied to these traits.
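For readers unfamiliar with the heritability values quoted above, a common textbook form of a linear mixed (animal) model and the corresponding narrow-sense heritability is sketched below. The abstract does not state the exact model that was fitted, so the symbols and the assumption of only additive genetic and residual variance components are illustrative.

```latex
% y: vector of records (finishing position or race time)
% b: fixed effects and covariates (e.g. year, area, distance), with design matrix X
% a: additive genetic (breeding) values, with incidence matrix Z
% e: residuals
\[
  y = Xb + Za + e, \qquad
  h^2 = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_e^2}
\]
```

Under this form, a heritability of 0.18 would mean that about 18% of the phenotypic variance remaining after the fixed effects is attributed to additive genetic differences between horses.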
Challenging selection for consistency in the rank of endurance competitions
Control of the environmental variability by genetic selection offers possibilities for new selection objectives for productive traits. This methodology aims at reducing heterogeneity in productive traits and has been applied to several traits and species for which animal homogeneity is profitable. In horse breeding programmes, rank in competitions is a common selection objective but has been challenging to model. In this study, the parameters of environmental variability for the rank of a horse were computed to analyse the capability of a horse to maintain the best ranking across competitions that consist of long-distance races in which the adapted physical condition of the horse is essential. The genetic component of the environmental variance for the rank in endurance competitions was evaluated, which resulted in proposing a new transformation of horse scores in competitions.