Validating Large Language Models with ReLM
Although large language models (LLMs) have been touted for their ability to
generate natural-sounding text, there are growing concerns around possible
negative effects of LLMs such as data memorization, bias, and inappropriate
language. Unfortunately, the complexity and generation capacities of LLMs make
validating (and correcting) such concerns difficult. In this work, we introduce
ReLM, a system for validating and querying LLMs using standard regular
expressions. ReLM formalizes and enables a broad range of language model
evaluations, reducing complex evaluation rules to simple regular expression
queries. Our results exploring queries surrounding memorization, gender bias,
toxicity, and language understanding show that ReLM achieves up to 15x higher
system efficiency, 2.5x data efficiency, and increased statistical and
prompt-tuning coverage compared to state-of-the-art ad-hoc queries. ReLM offers
a competitive and general baseline for the increasingly important problem of
LLM validation
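
The idea of reducing an evaluation rule to a regular expression query can be illustrated with a minimal sketch. It is not ReLM's own API: it only filters already-generated candidate strings against a regex after the fact, using Python's standard `re` module. The function name, the URL pattern, and the sample outputs are all hypothetical and chosen for illustration.

```python
import re
from typing import Iterable, List


def filter_by_query(pattern: str, candidates: Iterable[str]) -> List[str]:
    """Return the candidates that lie entirely inside the regex's language."""
    compiled = re.compile(pattern)
    return [text for text in candidates if compiled.fullmatch(text)]


# Hypothetical query: a simple URL shape, as might be used in a memorization check.
URL_QUERY = r"https://www\.[a-z0-9-]+\.(com|org)"

# Hypothetical model outputs (not taken from any real model).
outputs = [
    "https://www.example.com",
    "visit www.example.com",
    "https://www.example.org",
]

matched = filter_by_query(URL_QUERY, outputs)
```

A post-hoc filter like this only inspects strings that were already sampled; the abstract describes a system that handles the query itself, so the sketch conveys the query language rather than the system design.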
Mixed vine copula flows for flexible modeling of neural dependencies
Recordings of complex neural population responses provide a unique opportunity for advancing our understanding of neural information processing at multiple scales and improving the performance of brain-computer interfaces. However, most existing analytical techniques fall short of capturing the complexity of interactions within the concerted population activity. Vine copula-based approaches have been shown to be successful at addressing complex high-order dependencies within the population, disentangled from the single-neuron statistics. However, most applications have focused on parametric copulas, which bear the risk of misspecifying dependence structures. In order to avoid this risk, we adopted a fully non-parametric approach for the single-neuron margins and copulas by using Neural Spline Flows (NSF). We validated the NSF framework on simulated data of continuous and discrete types with various forms of dependency structures and with different dimensionality. Overall, NSFs performed similarly to existing non-parametric estimators, while allowing for considerably faster and more flexible sampling, which also enables faster Monte Carlo estimation of copula entropy. Moreover, our framework was able to capture low- and higher-order heavy-tail dependencies in neuronal responses recorded in the mouse primary visual cortex during a visual learning task while the animal was navigating a virtual reality environment. These findings highlight an often ignored aspect of complexity in coordinated neuronal activity which can be important for understanding and deciphering collective neural dynamics for neurotechnological applications
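
Monte Carlo estimation of copula entropy, as mentioned in the abstract above, can be illustrated with a minimal sketch. It assumes a bivariate Gaussian copula with a known correlation `rho`, not the Neural Spline Flow model of the paper, because the Gaussian case has a closed-form entropy, 0.5 * log(1 - rho^2), to compare against. All function names are hypothetical.

```python
import math
import random


def gaussian_copula_log_density(x: float, y: float, rho: float) -> float:
    """Log copula density at normal scores x, y (ratio of joint to product of margins)."""
    one_minus = 1.0 - rho * rho
    quad = 2.0 * rho * x * y - rho * rho * (x * x + y * y)
    return -0.5 * math.log(one_minus) + quad / (2.0 * one_minus)


def mc_copula_entropy(rho: float, n_samples: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the copula entropy, -E[log c], from sampled pairs."""
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho * rho)
    total = 0.0
    for _ in range(n_samples):
        # Draw a correlated standard-normal pair; its dependence is the Gaussian copula.
        x = rng.gauss(0.0, 1.0)
        y = rho * x + scale * rng.gauss(0.0, 1.0)
        total += gaussian_copula_log_density(x, y, rho)
    return -total / n_samples


def analytic_copula_entropy(rho: float) -> float:
    """Closed-form entropy of the bivariate Gaussian copula."""
    return 0.5 * math.log(1.0 - rho * rho)


estimate = mc_copula_entropy(0.6)
exact = analytic_copula_entropy(0.6)
```

The copula entropy is the negative of the mutual information between the two variables, so the same sampling loop gives a mutual information estimate; cheaper sampling therefore translates directly into a cheaper estimator.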
Parametric Copula-GP model for analyzing multidimensional neuronal and behavioral relationships
One of the main goals of current systems neuroscience is to understand how neuronal populations integrate sensory information to inform behavior. However, estimating stimulus or behavioral information that is encoded in high-dimensional neuronal populations is challenging. We propose a method based on parametric copulas which allows modeling joint distributions of neuronal and behavioral variables characterized by different statistics and timescales. To account for temporal or spatial changes in dependencies between variables, we model varying copula parameters by means of Gaussian Processes (GP). We validate the resulting Copula-GP framework on synthetic data and on neuronal and behavioral recordings obtained in awake mice. We show that the use of a parametric description of the high-dimensional dependence structure in our method provides better accuracy in mutual information estimation in higher dimensions compared to other non-parametric methods. Moreover, by quantifying the redundancy between neuronal and behavioral variables, our model exposed the location of the reward zone in an unsupervised manner (i.e., without using any explicit cues about the task structure). These results demonstrate that the Copula-GP framework is particularly useful for the analysis of complex multidimensional relationships between neuronal, sensory and behavioral variables
An investigation of galaxy evolution with H-ATLAS and gravitational lenses
This thesis presents a collection of studies which mainly focus on the population
of high-redshift dusty star-forming galaxies (DSFGs) in the context
of galaxy evolution. The sample of DSFGs that is used in this thesis was
discovered as part of the Herschel Astrophysical Terahertz Large Area Survey (H-ATLAS;
Eales et al., 2010) which is the largest area extragalactic survey undertaken
with the Herschel Space Telescope.
One chapter of this thesis studies the clustering statistics, i.e. the angular
correlation function (ACF), of this population, demonstrating that when selected on
the basis of their flux density (i.e. S250μm > 30 mJy) they exhibit a higher clustering
strength at high redshift (z > 1; r0 = 8−14 Mpc/h) than at low redshift (z < 0.3;
r0 = 1−2 Mpc/h). From a galaxy evolution point of view, it is evident that we
are dealing with two different galaxy populations, where the latter is consistent
with being the progenitors of massive early-type galaxies in the local Universe.
This study uses the largest sample of DSFGs (discovered in the H-ATLAS survey)
that has ever been used to measure their ACF.
Another chapter of this thesis studies the properties of the interstellar medium
(ISM) of one of these high-redshift DSFGs (HATLAS J091043.0−000322) using
a suite of multiwavelength observations. As this object is strongly lensed, we
combined the resolving power of the Atacama Large Millimeter/submillimeter Array
(ALMA) with the enhanced resolution offered
by strong lensing to probe scales down to 300−700 pc. Our morphological and
kinematical analysis of the ISM components led us to conclude that this object is
most likely undergoing a major-merger event.
Finally, another chapter of this thesis utilizes a sample of Herschel-selected
strongly-lensed galaxies to study the density profiles of the lens population in a
statistical manner. Using both numerically and analytically derived density distributions,
we were able to reproduce the observed distribution of image separations.
Although we were not able to distinguish between the two profiles, we showed
that with a sample of ∼200 lenses this would become possible, highlighting that
the simplicity of our sample selection does not introduce any additional
systematics
CalciumGAN: A Generative Adversarial Network Model for Synthesising Realistic Calcium Imaging Data of Neuronal Populations
Calcium imaging has become a powerful and popular technique to monitor the
activity of large populations of neurons in vivo. However, for ethical
considerations and despite recent technical developments, recordings are still
constrained to a limited number of trials and animals. This limits the amount
of data available from individual experiments and hinders the development of
analysis techniques and models for neuronal populations of more realistic size.
The ability to artificially synthesize realistic neuronal calcium signals could
greatly alleviate this problem by scaling up the number of trials. Here we
propose a Generative Adversarial Network (GAN) model to generate realistic
calcium signals as seen in neuronal somata with calcium imaging. To this end,
we adapt the WaveGAN architecture and train it with the Wasserstein distance.
We test the model on artificial data with known ground-truth and show that the
distribution of the generated signals closely resembles the underlying data
distribution. Then, we train the model on real calcium signals recorded from
the primary visual cortex of behaving mice and confirm that the deconvolved
spike trains match the statistics of the recorded data. Together, these results
demonstrate that our model can successfully generate realistic calcium imaging
data, thereby providing the means to augment existing datasets of neuronal
activity for enhanced data exploration and modeling
A search for the lenses in the Herschel Bright Sources (HerBS) sample
Verifying that sub-mm galaxies are gravitationally lensed requires time-expensive observations with oversubscribed high-resolution observatories. Here, we aim to strengthen the evidence of gravitational lensing within the Herschel Bright Sources (HerBS) by cross-comparing their positions to optical (SDSS) and near-infrared (VIKING) surveys, in order to search for the foreground lensing galaxy candidates. Resolved observations of the brightest HerBS sources have already shown that most are lensed, and a galaxy evolution model predicts that ∼76 per cent of the total HerBS sources are lensed, although with the SDSS survey we are only able to identify the likely foreground lenses for 25 per cent of the sources. With the near-infrared VIKING survey, however, we are able to identify the likely foreground lenses for 57 per cent of the sources, and we estimate that 82 per cent of the HerBS sources have lenses on the VIKING images even if we cannot identify the lens in every case. We find that the angular offsets between lens and Herschel source are larger than expected if the lensing is done by individual galaxies. We also find that the fraction of HerBS sources that are lensed falls with decreasing 500-micron flux density, which is expected from the galaxy evolution model. Finally, we apply our statistical VIKING cross-identification to the entire Herschel-ATLAS catalogue, where we also find that the number of lensed sources falls with decreasing 500-micron flux density
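
The positional cross-comparison and the angular offsets discussed in the abstract above can be illustrated with a minimal sketch. It is a plain nearest-neighbour match within a fixed search radius, not the statistical cross-identification used by the authors. The function names, the coordinates, and the 10 arcsec radius are hypothetical.

```python
import math
from typing import List, Optional, Tuple


def angular_separation_arcsec(ra1_deg: float, dec1_deg: float,
                              ra2_deg: float, dec2_deg: float) -> float:
    """Great-circle separation between two sky positions (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1_deg, dec1_deg, ra2_deg, dec2_deg))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a)))) * 3600.0


def nearest_counterpart(source: Tuple[float, float],
                        catalogue: List[Tuple[float, float]],
                        max_offset_arcsec: float) -> Optional[Tuple[int, float]]:
    """Return (index, offset in arcsec) of the closest catalogue entry within the radius."""
    best: Optional[Tuple[int, float]] = None
    for index, (ra, dec) in enumerate(catalogue):
        offset = angular_separation_arcsec(source[0], source[1], ra, dec)
        if offset <= max_offset_arcsec and (best is None or offset < best[1]):
            best = (index, offset)
    return best


# Hypothetical sub-mm source and two hypothetical near-infrared detections (RA, Dec in degrees).
submm_source = (10.0, 0.0)
nir_catalogue = [(10.01, 0.0), (10.0, 0.001)]
match = nearest_counterpart(submm_source, nir_catalogue, max_offset_arcsec=10.0)
```

A fixed-radius match ignores the surface density of background objects, which is why a statistical treatment is needed to judge whether a nearby galaxy is a likely lens rather than a chance alignment.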
HEAT SHOCK PROTEINS AND ULCERATIVE COLITIS: THE START OF A NEW ERA?
We read with great interest the article written by Abou El Azm
and coworkers, published in the last issue of the Arab Journal of
Gastroenterology [1]. In this article, the authors investigated the
molecular expression of heat shock proteins (HSP) 70 and 90 in
relation to the grades of inflammation and dysplasia in patients
with ulcerative colitis (UC) before and after treatment.
In this study, in agreement with other published studies [2–4],
the authors found not only a potential role for HSP 70 and HSP 90
in assessing the activity and prognosis of UC, but also that such
markers predicted the presence of dysplasia and differentiated it
from reactive atypia [1].
HSP have been found to be not only a marker of active disease, thus
considering UC as a “chaperonopathy by mistake”, but also to play
a key role in the psychosocial setting in which inflammatory bowel
diseases manifest themselves [5]. Furthermore, they could represent
a new diagnostic tool to differentiate the different phenotypes
of UC, thus allowing a targeted approach to be tailored to better manage
UC patients [6].
However, some unresolved issues remain about the potential
roles of HSP in both the acute and the longstanding disease. First, it
would be interesting to assess the role of HSP in the infections
associated with UC flares, like Clostridium difficile and Cytomegalovirus
(CMV) infections. In fact, HSP could be investigated as a further
marker of inflammation in case of severe and steroid-refractory
disease; with regard to CMV infection, mucosal levels of HSP could
differentiate when CMV plays the role of a direct pathogen from when it
represents merely a “silent bystander”. Second, in longstanding
UC, an integrated approach to colorectal cancer surveillance,
using advanced endoscopic imaging together with mucosal
markers, like HSP, could prove markedly helpful, both to
clinicians and pathologists. In fact, current guidelines note
that image-enhanced endoscopy (IEE) may increase the yield of
dysplasia detection, thus representing a reasonable alternative
to the random sampling of the colon using standard white light [7].
The use of both IEE and new biomarkers, like HSP, predicting future
occurrence of colonic neoplasia, could lead to a more centralised
approach to UC patients, in which a “biomarker-based surveillance”
might play a pivotal rol
- …