145 research outputs found

    Deep Reference Mining From Scholarly Literature in the Arts and Humanities

    We consider the task of reference mining: the detection, extraction and classification of references within the full text of scholarly publications. Reference mining brings forward specific challenges, such as the need to capture the morphology of highly abbreviated words and the dependence among the elements of a reference, both following codified reference styles. This task is particularly difficult, and little explored, with respect to the literature in the arts and humanities, where references are mostly given in footnotes. We apply a deep learning architecture for reference mining from the full text of scholarly publications. We explore and discuss three architectural components: word and character-level word embeddings, different prediction layers (Softmax and Conditional Random Fields) and multi-task over single-task learning. Our best model uses both pre-trained word embeddings and character embeddings, and a BiLSTM-CRF architecture. We test our solution on a dataset of annotated references from the historiography on Venice and, using a linear-chain CRF classifier as a baseline, we show that this deep learning architecture improves by a considerable margin. Furthermore, multi-task learning performs almost on par with a single-task approach. We thus confirm that there are important gains to be had by adopting deep learning for the task of reference mining.

    Consumer masculinity ideology: conceptualization and initial findings on men's emerging body concerns

    Men's body concerns have been increasing in recent decades, as contemporary men express what had almost exclusively been feminine concerns over body appearance. Although traditional masculinity can account for some body concerns, it cannot fully explain their increased prevalence or changing forms. This project examines recent shifts from a production-centered to a consumerist culture, and suggests that this societal change manifests in the emergence of a consumer masculinity ideology. We argue that this new ideology, in which proper masculinity is established, communicated and validated through consumption, is instrumental in explaining men's contemporary body concerns. We provide initial empirical support for the utility of this construct in samples of predominantly ethnic majority, heterosexual, and highly-educated British and Israeli men (N=191, Mage=33.57, SDage=10.24; N=185, Mage=36.05, SDage=11.88, respectively). In both samples, a preliminary measure of this ideology, the Consumer Masculinity Inventory (CMI), mostly confirmed the predicted associations with measures of traditional masculinity and materialist values, as well as with men's behavioral investment in personal aesthetics and self-labeling as metrosexual. Generally supporting the hypotheses, CMI scores also uniquely predicted most indices of men's body concerns (e.g., self-objectification, drives for muscularity and leanness) beyond measures of traditional masculinity and materialist values. Additionally, CMI scores partially mediated the predictive contributions of traditional masculinity to these body concerns. These preliminary findings highlight the potential contribution of this novel conceptualization and operationalization for psychological research and practice. Future research should thus consider the impact of consumer masculinity on the well-being and body concerns of contemporary men.

    The X-ray Position and Optical Counterpart of the Accretion-Powered Millisecond Pulsar XTE J1814-338

    We report the precise optical and X-ray localization of the 3.2 ms accretion-powered X-ray pulsar XTE J1814-338 with data from the Chandra X-Ray Observatory as well as optical observations conducted during the 2003 June discovery outburst. Optical imaging of the field during the outburst of this soft X-ray transient reveals an R = 18 star at the X-ray position. This star is absent (R > 20) from an archival 1989 image of the field and brightened during the 2003 outburst, and we therefore identify it as the optical counterpart of XTE J1814-338. The best source position derived from optical astrometry is R.A. = 18h13m39.s04, Dec.= -33d46m22.3s (J2000). The featureless X-ray spectrum of the pulsar in outburst is best fit by an absorbed power-law (with photon index = 1.41 +- 0.06) plus blackbody (with kT = 0.95 +- 0.13 keV) model, where the blackbody component contributes approximately 10% of the source flux. The optical broad-band spectrum shows evidence for an excess of infrared emission with respect to an X-ray heated accretion disk model, suggesting a significant contribution from the secondary or from a synchrotron-emitting region. A follow-up observation performed when XTE J1814-338 was in quiescence reveals no counterpart to a limiting magnitude of R = 23.3. This suggests that the secondary is an M3 V or later-type star, and therefore very unlikely to be responsible for the soft excess, making synchrotron emission a more reasonable candidate. Comment: Accepted for publication in ApJ. 6 pages; 3 figures

    Question Decomposition Improves the Faithfulness of Model-Generated Reasoning

    As large language models (LLMs) perform more difficult tasks, it becomes harder to verify the correctness and safety of their behavior. One approach to help with this issue is to prompt LLMs to externalize their reasoning, e.g., by having them generate step-by-step reasoning as they answer a question (Chain-of-Thought; CoT). The reasoning may enable us to check the process that models use to perform tasks. However, this approach relies on the stated reasoning faithfully reflecting the model's actual reasoning, which is not always the case. To improve over the faithfulness of CoT reasoning, we have models generate reasoning by decomposing questions into subquestions. Decomposition-based methods achieve strong performance on question-answering tasks, sometimes approaching that of CoT while improving the faithfulness of the model's stated reasoning on several recently-proposed metrics. By forcing the model to answer simpler subquestions in separate contexts, we greatly increase the faithfulness of model-generated reasoning over CoT, while still achieving some of the performance gains of CoT. Our results show it is possible to improve the faithfulness of model-generated reasoning; continued improvements may lead to reasoning that enables us to verify the correctness and safety of LLM behavior. Comment: For few-shot examples and prompts, see https://github.com/anthropics/DecompositionFaithfulnessPape

    Language Models (Mostly) Know What They Know

    We study whether language models can evaluate the validity of their own claims and predict which questions they will be able to answer correctly. We first show that larger models are well-calibrated on diverse multiple choice and true/false questions when they are provided in the right format. Thus we can approach self-evaluation on open-ended sampling tasks by asking models to first propose answers, and then to evaluate the probability "P(True)" that their answers are correct. We find encouraging performance, calibration, and scaling for P(True) on a diverse array of tasks. Performance at self-evaluation further improves when we allow models to consider many of their own samples before predicting the validity of one specific possibility. Next, we investigate whether models can be trained to predict "P(IK)", the probability that "I know" the answer to a question, without reference to any particular proposed answer. Models perform well at predicting P(IK) and partially generalize across tasks, though they struggle with calibration of P(IK) on new tasks. The predicted P(IK) probabilities also increase appropriately in the presence of relevant source materials in the context, and in the presence of hints towards the solution of mathematical word problems. We hope these observations lay the groundwork for training more honest models, and for investigating how honesty generalizes to cases where models are trained on objectives other than the imitation of human writing. Comment: 23+17 pages; refs added, typos fixed

    Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned

    We describe our early efforts to red team language models in order to simultaneously discover, measure, and attempt to reduce their potentially harmful outputs. We make three main contributions. First, we investigate scaling behaviors for red teaming across 3 model sizes (2.7B, 13B, and 52B parameters) and 4 model types: a plain language model (LM); an LM prompted to be helpful, honest, and harmless; an LM with rejection sampling; and a model trained to be helpful and harmless using reinforcement learning from human feedback (RLHF). We find that the RLHF models are increasingly difficult to red team as they scale, and we find a flat trend with scale for the other model types. Second, we release our dataset of 38,961 red team attacks for others to analyze and learn from. We provide our own analysis of the data and find a variety of harmful outputs, which range from offensive language to more subtly harmful non-violent unethical outputs. Third, we exhaustively describe our instructions, processes, statistical methodologies, and uncertainty about red teaming. We hope that this transparency accelerates our ability to work together as a community in order to develop shared norms, practices, and technical standards for how to red team language models.

    The GH receptor exon 3 deletion is a marker of male-specific exceptional longevity associated with increased GH sensitivity and taller stature

    Although both growth hormone (GH) and insulin-like growth factor 1 (IGF-1) signaling were shown to regulate life span in lower organisms, the role of GH signaling in human longevity remains unclear. Because a GH receptor exon 3 deletion (d3-GHR) appears to modulate GH sensitivity in humans, we hypothesized that this polymorphism could play a role in human longevity. We report a linear increased prevalence of d3-GHR homozygosity with age in four independent cohorts of long-lived individuals: 841 participants [567 of the Longevity Genes Project (LGP) (8% increase; P = 0.01), 152 of the Old Order Amish (16% increase; P = 0.02), 61 of the Cardiovascular Health Study (14.2% increase; P = 0.14), and 61 of the French Long-Lived Study (23.5% increase; P = 0.02)]. In addition, mega analysis of males in all cohorts resulted in a significant positive trend with age (26% increase; P = 0.007), suggesting sexual dimorphism for GH action in longevity. Further, on average, LGP d3/d3 homozygotes were 1 inch taller than the wild-type (WT) allele carriers (P = 0.05) and also showed lower serum IGF-1 levels (P = 0.003). Multivariate regression analysis indicated that the presence of d3/d3 genotype adds approximately 10 years to life span. The LGP d3/d3-GHR transformed lymphocytes exhibited superior growth and extracellular signal–regulated kinase activation, to GH treatment relative to WT GHR lymphocytes (P < 0.01), indicating a GH dose response. The d3-GHR variant is a common genetic polymorphism that modulates GH responsiveness throughout the life span and positively affects male longevity.

    PSR J1723−2837: AN ECLIPSING BINARY RADIO MILLISECOND PULSAR

    We present a study of PSR J1723−2837, an eclipsing, 1.86 ms millisecond binary radio pulsar discovered in the Parkes Multibeam survey. Radio timing indicates that the pulsar has a circular orbit with a 15 hr orbital period, a low-mass companion, and a measurable orbital period derivative. The eclipse fraction of ∼15% during the pulsar's orbit is twice the Roche lobe size inferred for the companion. The timing behavior is significantly affected by unmodeled systematics of astrophysical origin, and higher-order orbital period derivatives are needed in the timing solution to account for these variations. We have identified the pulsar's (non-degenerate) companion using archival ultraviolet, optical, and infrared survey data and new optical photometry. Doppler shifts from optical spectroscopy confirm the star's association with the pulsar and indicate a pulsar-to-companion mass ratio of 3.3 ± 0.5, corresponding to a companion mass range of 0.4 to 0.7 M⊙ and an orbital inclination angle range of between 30° and 41°, assuming a pulsar mass range of 1.4-2.0 M⊙. Spectroscopy indicates a spectral type of G for the companion and an inferred Roche-lobe-filling distance that is consistent with the distance estimated from radio dispersion. The features of PSR J1723−2837 indicate that it is likely a "redback" system. Unlike the five other Galactic redbacks discovered to date, PSR J1723−2837 has not been detected as a γ-ray source with Fermi. This may be due to an intrinsic spin-down luminosity that is much smaller than the measured value if the unmeasured contribution from proper motion is large.

    The international WAO/EAACI guideline for the management of hereditary angioedema—The 2021 revision and update

    Hereditary angioedema (HAE) is a rare and disabling disease for which early diagnosis and effective therapy are critical. This revision and update of the global WAO/EAACI guideline on the diagnosis and management of HAE provides up-to-date guidance for the management of HAE. For this update and revision of the guideline, an international panel of experts reviewed the existing evidence, developed 28 recommendations, and established consensus by an online DELPHI process. The goal of these recommendations and guideline is to help physicians and their patients in making rational decisions in the management of HAE with deficient C1 inhibitor (type 1) and HAE with dysfunctional C1 inhibitor (type 2), by providing guidance on common and important clinical issues, such as: (1) How should HAE be diagnosed? (2) When should HAE patients receive prophylactic on top of on-demand treatment and what treatments should be used? (3) What are the goals of treatment? (4) Should HAE management be different for special HAE patient groups such as children or pregnant/breast-feeding women? and (5) How should HAE patients monitor their disease activity, impact, and control? It is also the intention of this guideline to help establish global standards for the management of HAE and to encourage and facilitate the use of recommended diagnostics and therapies for all patients.