207 research outputs found
Bayesian Semi-parametric Expected Shortfall Forecasting in Financial Markets
Bayesian semi-parametric estimation has proven effective for quantile estimation in general and specifically in financial Value at Risk forecasting. Expected shortfall is a competing tail risk measure, involving a conditional expectation beyond a quantile, that has recently been semi-parametrically estimated via asymmetric least squares and so-called expectiles. An asymmetric Gaussian density is proposed, allowing a likelihood to be developed that leads to Bayesian semi-parametric estimation and forecasts of expectiles and expected shortfall. Further, the conditional autoregressive expectile class of model is generalised to two fully nonlinear families. Adaptive Markov chain Monte Carlo sampling schemes are employed for estimation in these families. The proposed models are clearly favoured in an empirical study forecasting eleven financial return series: clear evidence of more accurate expected shortfall forecasting, compared to a range of competing methods, is found. Further, the most favoured models are those estimated by Bayesian methods.
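To make the expectile idea concrete, the following is a minimal sketch of a sample expectile computed by asymmetric least squares. It is not the Bayesian estimator or the conditional autoregressive expectile model described in the abstract; the function name `expectile`, its parameters, and the fixed-point iteration are illustrative assumptions.

```python
def expectile(values, tau, tol=1e-12, max_iter=1000):
    """Sample tau-expectile via asymmetric least squares (illustrative sketch).

    The tau-expectile m minimises sum_i |tau - 1[y_i <= m]| * (y_i - m)^2.
    Setting the derivative to zero gives a weighted mean, which is iterated
    here until the estimate stops changing.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    if not values:
        raise ValueError("values must be non-empty")

    m = sum(values) / len(values)  # start from the mean (the 0.5-expectile)
    for _ in range(max_iter):
        # Observations above m get weight tau, the rest get 1 - tau.
        weights = [tau if v > m else 1.0 - tau for v in values]
        new_m = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(new_m - m) < tol:
            return new_m
        m = new_m
    return m


if __name__ == "__main__":
    returns = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical data, not from the study
    print(expectile(returns, 0.5))  # equals the sample mean
    print(expectile(returns, 0.1))  # pulled toward the lower tail
```

With `tau = 0.5` the weights are symmetric and the result is the ordinary mean; smaller values of `tau` move the estimate toward the lower tail, which is the region relevant to expected shortfall.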
Identifying and Extracting Rare Disease Phenotypes with Large Language Models
Rare diseases (RDs) are collectively common and affect 300 million people
worldwide. Accurate phenotyping is critical for informing diagnosis and
treatment, but RD phenotypes are often embedded in unstructured text and
time-consuming to extract manually. While natural language processing (NLP)
models can perform named entity recognition (NER) to automate extraction, a
major bottleneck is the development of a large, annotated corpus for model
training. Recently, prompt learning emerged as an NLP paradigm that can lead to
more generalizable results without any (zero-shot) or few labeled samples
(few-shot). Despite growing interest in ChatGPT, a revolutionary large language
model capable of following complex human prompts and generating high-quality
responses, none have studied its NER performance for RDs in the zero- and
few-shot settings. To this end, we engineered novel prompts aimed at extracting
RD phenotypes and, to the best of our knowledge, are the first to establish a
benchmark for evaluating ChatGPT's performance in these settings. We compared
its performance to the traditional fine-tuning approach and conducted an
in-depth error analysis. Overall, fine-tuning BioClinicalBERT resulted in
higher performance (F1 of 0.689) than ChatGPT (F1 of 0.472 and 0.591 in the
zero- and few-shot settings, respectively). Despite this, ChatGPT achieved
similar or higher accuracy for certain entities (i.e., rare diseases and signs)
in the one-shot setting (F1 of 0.776 and 0.725). This suggests that with
appropriate prompt engineering, ChatGPT has the potential to match or
outperform fine-tuned language models for certain entity types with just one
labeled sample. While the proliferation of large language models may provide
opportunities for supporting RD diagnosis and treatment, researchers and
clinicians should critically evaluate model outputs and be well-informed of
their limitations.
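The F1 figures quoted above combine precision and recall over extracted entities. As a rough illustration only, the sketch below scores predictions by exact match on (text, type) pairs. The abstract does not state the matching criterion the authors used, so the exact-match rule, the function name `entity_f1`, and the example entities are all assumptions.

```python
def entity_f1(gold, predicted):
    """Entity-level F1 under exact (text, type) matching (assumed criterion)."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)  # share of predictions that are correct
    recall = true_positives / len(gold)          # share of gold entities that were found
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical annotations, not taken from the study's corpus.
    gold = {("Fabry disease", "rare_disease"), ("ataxia", "sign"), ("fatigue", "symptom")}
    predicted = {("Fabry disease", "rare_disease"), ("ataxia", "sign"),
                 ("fever", "symptom"), ("pain", "symptom")}
    print(round(entity_f1(gold, predicted), 3))
```

In this toy case two of four predictions are correct and two of three gold entities are found, so precision is 0.5, recall is about 0.667, and F1 is 4/7.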
Introducing the Arts and Science Section
The sciences and arts are commonly considered to be two vastly different disciplines that contrast with each other primarily in methodology. The sciences value precision and the ability to control and replicate results, while the arts stem from the fluid and unique expression of one’s self through different mediums. However, despite these differences, the ultimate goal of both areas of study is to explore the unknown. For the arts, this entails delving into human emotions through abstract thoughts and ideas, while the sciences use the same abstractions and imagination to experiment with and create using objects in the natural world. This combination of internal and external explorations defines the intrinsic elements of the universe, which is why arts and science belong together.
Using Machine Learning and Natural Language Processing to Review and Classify the Medical Literature on Cancer Susceptibility Genes
PURPOSE: The medical literature relevant to germline genetics is growing
exponentially. Clinicians need tools for monitoring and prioritizing the literature
to understand the clinical implications of the pathogenic genetic variants. We
developed and evaluated two machine learning models to classify abstracts as
relevant to the penetrance (risk of cancer for germline mutation carriers) or
prevalence of germline genetic mutations. METHODS: We conducted literature
searches in PubMed and retrieved paper titles and abstracts to create an
annotated dataset for training and evaluating the two machine learning
classification models. Our first model is a support vector machine (SVM) which
learns a linear decision rule based on the bag-of-ngrams representation of each
title and abstract. Our second model is a convolutional neural network (CNN)
which learns a complex nonlinear decision rule based on the raw title and
abstract. We evaluated the performance of the two models on the classification
of papers as relevant to penetrance or prevalence. RESULTS: For penetrance
classification, we annotated 3740 paper titles and abstracts and used 60% for
training the model, 20% for tuning the model, and 20% for evaluating the model.
The SVM model achieves 89.53% accuracy (percentage of papers that were
correctly classified) while the CNN model achieves 88.95% accuracy. For
prevalence classification, we annotated 3753 paper titles and abstracts. The
SVM model achieves 89.14% accuracy while the CNN model achieves 89.13%
accuracy. CONCLUSION: Our models achieve high accuracy in classifying abstracts
as relevant to penetrance or prevalence. By facilitating literature review,
this tool could help clinicians and researchers keep abreast of the burgeoning
knowledge of gene-cancer associations and keep the knowledge bases for clinical
decision support tools up to date.
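The evaluation setup described above (a bag-of-ngrams representation, a 60%/20%/20% train/tune/test split, and accuracy as the metric) can be sketched in plain Python. This is an illustrative outline under stated assumptions, not the authors' pipeline: the function names, the whitespace tokenisation, the unigram-plus-bigram choice, and the random seed are all hypothetical, and no classifier is trained here.

```python
import random
from collections import Counter


def bag_of_ngrams(text, max_n=2):
    """Count word n-grams up to length max_n (whitespace tokenisation assumed)."""
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def split_60_20_20(items, seed=0):
    """Shuffle and split into train (60%), tune (20%), and test (remainder)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 60 // 100  # integer arithmetic avoids float rounding
    n_tune = len(shuffled) * 20 // 100
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_tune],
            shuffled[n_train + n_tune:])


def accuracy(true_labels, predicted_labels):
    """Fraction of items whose predicted label equals the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


if __name__ == "__main__":
    train, tune, test = split_60_20_20(range(3740))
    print(len(train), len(tune), len(test))
    print(bag_of_ngrams("germline mutation carriers"))
    print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```

Applied to the 3740 penetrance abstracts mentioned in the text, this split yields 2244 training, 748 tuning, and 748 test items.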
A role for Tbx5 in proepicardial cell migration during cardiogenesis
Transcriptional regulatory cascades during epicardial and coronary vascular development from proepicardial progenitor cells remain to be defined. We have used immunohistochemistry of human embryonic tissues to demonstrate that the TBX5 transcription factor is expressed not only in the myocardium, but also throughout the embryonic epicardium and coronary vasculature. TBX5 is not expressed in other human fetal vascular beds. Furthermore, immunohistochemical analyses of human embryonic tissues reveal that, unlike their epicardial counterparts, delaminating epicardial-derived cells do not express TBX5 as they migrate through the subepicardium before undergoing the epithelial-mesenchymal transformation required for coronary vasculogenesis. In the chick, Tbx5 is expressed in the embryonic proepicardial organ (PEO), which is composed of the epicardial and coronary vascular progenitor cells. Retrovirus-mediated overexpression of human TBX5 inhibits incorporation of infected proepicardial cells into the nascent chick epicardium and coronary vasculature. TBX5 overexpression as well as antisense-mediated knockdown of chick Tbx5 produce a cell-autonomous defect in the PEO that prevents proepicardial cell migration. Thus, both increasing and decreasing Tbx5 dosage impair development of the proepicardium. Culture of explanted PEOs demonstrates that untreated chick proepicardial cells downregulate Tbx5 expression during cell migration. Therefore, we propose that Tbx5 participates in regulation of proepicardial cell migration, a critical event in the establishment of the epicardium and coronary vasculature.
Sexual dimorphism in the social behaviour of Cntnap2-null mice correlates with disrupted synaptic connectivity and increased microglial activity in the anterior cingulate cortex
A biological understanding of the apparent sex bias in autism is lacking. Here we have identified Cntnap2 KO mice as a model system to help better understand this dimorphism. Using this model, we observed social deficits in juvenile male KO mice only. These male-specific social deficits correlated with reduced spine densities of Layer 2/3 and Layer 5 pyramidal neurons in the anterior cingulate cortex, a forebrain region prominently associated with the control of social behaviour. Furthermore, in male KO mice, microglia showed an increased activated morphology and phagocytosis of synaptic structures compared to WT mice, whereas no differences were seen between female KO and WT mice. Our data suggest that sexually dimorphic microglial activity may be involved in the aetiology of ASD, disrupting the development of neural circuits that control social behaviour by overpruning synapses at a developmentally critical period.