149 research outputs found
Deep Reference Mining From Scholarly Literature in the Arts and Humanities
We consider the task of reference mining: the detection, extraction and classification of references within the full text of scholarly publications. Reference mining brings forward specific challenges, such as the need to capture the morphology of highly abbreviated words and the dependence among the elements of a reference, both following codified reference styles. This task is particularly difficult, and little explored, with respect to the literature in the arts and humanities, where references are mostly given in footnotes. We apply a deep learning architecture for reference mining from the full text of scholarly publications. We explore and discuss three architectural components: word and character-level word embeddings, different prediction layers (Softmax and Conditional Random Fields) and multi-task over single-task learning. Our best model uses both pre-trained word embeddings and character embeddings, and a BiLSTM-CRF architecture. We test our solution on a dataset of annotated references from the historiography on Venice and, using a linear-chain CRF classifier as a baseline, we show that this deep learning architecture improves on it by a considerable margin. Furthermore, multi-task learning performs almost on par with a single-task approach. We thus confirm that there are important gains to be had by adopting deep learning for the task of reference mining.
Consumer masculinity ideology: conceptualization and initial findings on men's emerging body concerns
Men's body concerns have been increasing in recent decades, as contemporary men express what had almost exclusively been feminine concerns over body appearance. Although traditional masculinity can account for some body concerns, it cannot fully explain their increased prevalence or changing forms. This project examines recent shifts from a production-centered to a consumerist culture, and suggests that this societal change manifests in the emergence of a consumer masculinity ideology. We argue that this new ideology, in which proper masculinity is established, communicated and validated through consumption, is instrumental in explaining men's contemporary body concerns. We provide initial empirical support for the utility of this construct in samples of predominantly ethnic majority, heterosexual, and highly-educated British and Israeli men (N=191, Mage=33.57, SDage=10.24; N=185, Mage=36.05, SDage=11.88, respectively). In both samples, a preliminary measure of this ideology, the Consumer Masculinity Inventory (CMI), mostly confirmed the predicted associations with measures of traditional masculinity and materialist values, as well as with men's behavioral investment in personal aesthetics and self-labeling as metrosexual. Generally supporting the hypotheses, CMI scores also uniquely predicted most indices of men's body concerns (e.g., self-objectification, drives for muscularity and leanness) beyond measures of traditional masculinity and materialist values. Additionally, CMI scores partially mediated the predictive contributions of traditional masculinity to these body concerns. These preliminary findings highlight the potential contribution of this novel conceptualization and operationalization for psychological research and practice. Future research should thus consider the impact of consumer masculinity on the well-being and body concerns of contemporary men.
The X-ray Position and Optical Counterpart of the Accretion-Powered Millisecond Pulsar XTE J1814-338
We report the precise optical and X-ray localization of the 3.2 ms
accretion-powered X-ray pulsar XTE J1814-338 with data from the Chandra X-Ray
Observatory as well as optical observations conducted during the 2003 June
discovery outburst. Optical imaging of the field during the outburst of this
soft X-ray transient reveals an R = 18 star at the X-ray position. This star is
absent (R > 20) from an archival 1989 image of the field and brightened during
the 2003 outburst, and we therefore identify it as the optical counterpart of
XTE J1814-338. The best source position derived from optical astrometry is R.A.
= 18h13m39.s04, Dec.= -33d46m22.3s (J2000). The featureless X-ray spectrum of
the pulsar in outburst is best fit by an absorbed power-law (with photon index
= 1.41 +- 0.06) plus blackbody (with kT = 0.95 +- 0.13 keV) model, where the
blackbody component contributes approximately 10% of the source flux. The
optical broad-band spectrum shows evidence for an excess of infrared emission
with respect to an X-ray heated accretion disk model, suggesting a significant
contribution from the secondary or from a synchrotron-emitting region. A
follow-up observation performed when XTE J1814-338 was in quiescence reveals no
counterpart to a limiting magnitude of R = 23.3. This suggests that the
secondary is an M3 V or later-type star, and therefore very unlikely to be
responsible for the soft excess, making synchrotron emission a more reasonable
candidate.
Comment: Accepted for publication in ApJ. 6 pages; 3 figures
Question Decomposition Improves the Faithfulness of Model-Generated Reasoning
As large language models (LLMs) perform more difficult tasks, it becomes
harder to verify the correctness and safety of their behavior. One approach to
help with this issue is to prompt LLMs to externalize their reasoning, e.g., by
having them generate step-by-step reasoning as they answer a question
(Chain-of-Thought; CoT). The reasoning may enable us to check the process that
models use to perform tasks. However, this approach relies on the stated
reasoning faithfully reflecting the model's actual reasoning, which is not
always the case. To improve over the faithfulness of CoT reasoning, we have
models generate reasoning by decomposing questions into subquestions.
Decomposition-based methods achieve strong performance on question-answering
tasks, sometimes approaching that of CoT while improving the faithfulness of
the model's stated reasoning on several recently-proposed metrics. By forcing
the model to answer simpler subquestions in separate contexts, we greatly
increase the faithfulness of model-generated reasoning over CoT, while still
achieving some of the performance gains of CoT. Our results show it is possible
to improve the faithfulness of model-generated reasoning; continued
improvements may lead to reasoning that enables us to verify the correctness
and safety of LLM behavior.
Comment: For few-shot examples and prompts, see
https://github.com/anthropics/DecompositionFaithfulnessPape
Language Models (Mostly) Know What They Know
We study whether language models can evaluate the validity of their own
claims and predict which questions they will be able to answer correctly. We
first show that larger models are well-calibrated on diverse multiple choice
and true/false questions when they are provided in the right format. Thus we
can approach self-evaluation on open-ended sampling tasks by asking models to
first propose answers, and then to evaluate the probability "P(True)" that
their answers are correct. We find encouraging performance, calibration, and
scaling for P(True) on a diverse array of tasks. Performance at self-evaluation
further improves when we allow models to consider many of their own samples
before predicting the validity of one specific possibility. Next, we
investigate whether models can be trained to predict "P(IK)", the probability
that "I know" the answer to a question, without reference to any particular
proposed answer. Models perform well at predicting P(IK) and partially
generalize across tasks, though they struggle with calibration of P(IK) on new
tasks. The predicted P(IK) probabilities also increase appropriately in the
presence of relevant source materials in the context, and in the presence of
hints towards the solution of mathematical word problems. We hope these
observations lay the groundwork for training more honest models, and for
investigating how honesty generalizes to cases where models are trained on
objectives other than the imitation of human writing.
Comment: 23+17 pages; refs added, typos fixed
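The calibration claims in the abstract above can be made concrete with a small sketch. The Python below computes an expected calibration error (ECE) over equal-width confidence bins, one common way to summarize how well stated probabilities such as P(True) match observed accuracy. It is not taken from the paper: the function name, the binning scheme, and the toy inputs are assumptions chosen for illustration.

```python
def expected_calibration_error(probs, correct, n_bins=10):
    """Weighted mean gap between confidence and accuracy over equal-width bins.

    probs:   predicted probabilities in [0, 1] (e.g. hypothetical P(True) values)
    correct: booleans, True where the proposed answer was actually right
    """
    if not probs or len(probs) != len(correct):
        raise ValueError("probs and correct must be non-empty and equal in length")

    # Assign each prediction to a bin; p == 1.0 goes into the last bin.
    bins = [[] for _ in range(n_bins)]
    for p, c in zip(probs, correct):
        index = min(int(p * n_bins), n_bins - 1)
        bins[index].append((p, c))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_confidence = sum(p for p, _ in members) / len(members)
        accuracy = sum(1 for _, c in members if c) / len(members)
        # Each bin contributes in proportion to how many predictions it holds.
        ece += (len(members) / len(probs)) * abs(mean_confidence - accuracy)
    return ece


# Toy data (invented): ten answers rated 0.9, nine of them correct.
well_calibrated = expected_calibration_error([0.9] * 10, [True] * 9 + [False])
# Toy data (invented): two answers rated 1.0, only one correct.
overconfident = expected_calibration_error([1.0, 1.0], [True, False])
```

A model whose 0.9-rated answers are right about 90% of the time scores near zero, while the overconfident toy case scores 0.5. Lower values indicate better calibration under this particular binning.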
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
We describe our early efforts to red team language models in order to
simultaneously discover, measure, and attempt to reduce their potentially
harmful outputs. We make three main contributions. First, we investigate
scaling behaviors for red teaming across 3 model sizes (2.7B, 13B, and 52B
parameters) and 4 model types: a plain language model (LM); an LM prompted to
be helpful, honest, and harmless; an LM with rejection sampling; and a model
trained to be helpful and harmless using reinforcement learning from human
feedback (RLHF). We find that the RLHF models are increasingly difficult to red
team as they scale, and we find a flat trend with scale for the other model
types. Second, we release our dataset of 38,961 red team attacks for others to
analyze and learn from. We provide our own analysis of the data and find a
variety of harmful outputs, which range from offensive language to more subtly
harmful non-violent unethical outputs. Third, we exhaustively describe our
instructions, processes, statistical methodologies, and uncertainty about red
teaming. We hope that this transparency accelerates our ability to work
together as a community in order to develop shared norms, practices, and
technical standards for how to red team language models.
The GH receptor exon 3 deletion is a marker of male-specific exceptional longevity associated with increased GH sensitivity and taller stature
Although both growth hormone (GH) and insulin-like growth factor 1 (IGF-1) signaling were shown to regulate life span in lower organisms, the role of GH signaling in human longevity remains unclear. Because a GH receptor exon 3 deletion (d3-GHR) appears to modulate GH sensitivity in humans, we hypothesized that this polymorphism could play a role in human longevity. We report a linear increased prevalence of d3-GHR homozygosity with age in four independent cohorts of long-lived individuals: 841 participants [567 of the Longevity Genes Project (LGP) (8% increase; P = 0.01), 152 of the Old Order Amish (16% increase; P = 0.02), 61 of the Cardiovascular Health Study (14.2% increase; P = 0.14), and 61 of the French Long-Lived Study (23.5% increase; P = 0.02)]. In addition, mega analysis of males in all cohorts resulted in a significant positive trend with age (26% increase; P = 0.007), suggesting sexual dimorphism for GH action in longevity. Further, on average, LGP d3/d3 homozygotes were 1 inch taller than the wild-type (WT) allele carriers (P = 0.05) and also showed lower serum IGF-1 levels (P = 0.003). Multivariate regression analysis indicated that the presence of d3/d3 genotype adds approximately 10 years to life span. The LGP d3/d3-GHR transformed lymphocytes exhibited superior growth and extracellular signal-regulated kinase activation in response to GH treatment relative to WT GHR lymphocytes (P < 0.01), indicating a GH dose response. The d3-GHR variant is a common genetic polymorphism that modulates GH responsiveness throughout the life span and positively affects male longevity.
PSR J1723-2837: AN ECLIPSING BINARY RADIO MILLISECOND PULSAR
We present a study of PSR J1723-2837, an eclipsing, 1.86 ms millisecond binary radio pulsar discovered in the Parkes Multibeam survey. Radio timing indicates that the pulsar has a circular orbit with a 15 hr orbital period, a low-mass companion, and a measurable orbital period derivative. The eclipse fraction of ~15% during the pulsar's orbit is twice the Roche lobe size inferred for the companion. The timing behavior is significantly affected by unmodeled systematics of astrophysical origin, and higher-order orbital period derivatives are needed in the timing solution to account for these variations. We have identified the pulsar's (non-degenerate) companion using archival ultraviolet, optical, and infrared survey data and new optical photometry. Doppler shifts from optical spectroscopy confirm the star's association with the pulsar and indicate a pulsar-to-companion mass ratio of 3.3 ± 0.5, corresponding to a companion mass range of 0.4 to 0.7 M⊙ and an orbital inclination angle range of between 30° and 41°, assuming a pulsar mass range of 1.4-2.0 M⊙. Spectroscopy indicates a spectral type of G for the companion and an inferred Roche-lobe-filling distance that is consistent with the distance estimated from radio dispersion. The features of PSR J1723-2837 indicate that it is likely a "redback" system. Unlike the five other Galactic redbacks discovered to date, PSR J1723-2837 has not been detected as a γ-ray source with Fermi. This may be due to an intrinsic spin-down luminosity that is much smaller than the measured value if the unmeasured contribution from proper motion is large.
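The companion mass range quoted in the abstract above follows from simple arithmetic on the mass ratio and the assumed pulsar mass range. The Python below is a minimal sketch of that arithmetic. It assumes the quoted range comes from combining the extremes of both intervals, which the abstract does not state explicitly; the function name is hypothetical.

```python
def companion_mass_range(ratio, ratio_err, pulsar_min, pulsar_max):
    """Bounds on companion mass given ratio = M_pulsar / M_companion.

    Assumption: the bounds pair the lightest pulsar with the largest ratio
    and the heaviest pulsar with the smallest ratio. Masses are in solar masses.
    """
    ratio_low = ratio - ratio_err
    ratio_high = ratio + ratio_err
    if ratio_low <= 0:
        raise ValueError("mass ratio must stay positive across its uncertainty")
    return pulsar_min / ratio_high, pulsar_max / ratio_low


# Values from the abstract: ratio 3.3 +/- 0.5, pulsar mass 1.4 to 2.0 solar masses.
low, high = companion_mass_range(3.3, 0.5, 1.4, 2.0)
print(f"{low:.2f} to {high:.2f} solar masses")
```

Rounded to one decimal place, the two bounds match the 0.4 to 0.7 solar-mass range given in the abstract. The inclination range cannot be reproduced this way, since it also depends on the orbital mass function, which the abstract does not report.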
The international WAO/EAACI guideline for the management of hereditary angioedema – The 2021 revision and update
Hereditary angioedema (HAE) is a rare and disabling disease for which early diagnosis and effective therapy are critical. This revision and update of the global WAO/EAACI guideline on the diagnosis and management of HAE provides up-to-date guidance for the management of HAE. For this update and revision of the guideline, an international panel of experts reviewed the existing evidence, developed 28 recommendations, and established consensus by an online DELPHI process. The goal of these recommendations and guideline is to help physicians and their patients in making rational decisions in the management of HAE with deficient C1 inhibitor (type 1) and HAE with dysfunctional C1 inhibitor (type 2), by providing guidance on common and important clinical issues, such as: (1) How should HAE be diagnosed? (2) When should HAE patients receive prophylactic on top of on-demand treatment and what treatments should be used? (3) What are the goals of treatment? (4) Should HAE management be different for special HAE patient groups such as children or pregnant/breast-feeding women? and (5) How should HAE patients monitor their disease activity, impact, and control? It is also the intention of this guideline to help establish global standards for the management of HAE and to encourage and facilitate the use of recommended diagnostics and therapies for all patients.
- …