109 research outputs found

    Revisiting the Importance of Encoding Logic Rules in Sentiment Classification

    We analyze the performance of different sentiment classification models on syntactically complex inputs like A-but-B sentences. The first contribution of this analysis addresses reproducible research: to meaningfully compare different models, their accuracies must be averaged over far more random seeds than has traditionally been reported. With proper averaging in place, we notice that the distillation model described in arXiv:1603.06318v4 [cs.LG], which incorporates explicit logic rules for sentiment classification, is ineffective. In contrast, using contextualized ELMo embeddings (arXiv:1802.05365v2 [cs.CL]) instead of logic rules yields significantly better performance. Additionally, we provide analysis and visualizations that demonstrate ELMo's ability to implicitly learn logic rules. Finally, a crowdsourced analysis reveals how ELMo outperforms baseline models even on sentences with ambiguous sentiment labels. Comment: EMNLP 2018 Camera Read
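The seed-averaging point above can be illustrated with a minimal sketch. The accuracy values and the helper name `summarize_runs` are hypothetical and are not taken from the paper; the sketch only shows reporting a mean and spread over several seeds instead of a single run.

```python
import statistics


def summarize_runs(accuracies):
    """Return (mean, sample standard deviation) of per-seed accuracies."""
    mean = statistics.mean(accuracies)
    # stdev needs at least two data points; report 0.0 for a single run.
    std = statistics.stdev(accuracies) if len(accuracies) > 1 else 0.0
    return mean, std


# Hypothetical per-seed test accuracies for two models (illustrative only).
runs_by_model = {
    "baseline": [0.871, 0.868, 0.875, 0.869, 0.873],
    "candidate": [0.872, 0.866, 0.877, 0.870, 0.871],
}

for name, runs in runs_by_model.items():
    mean, std = summarize_runs(runs)
    print(f"{name}: {mean:.4f} +/- {std:.4f} over {len(runs)} seeds")
```

When the gap between two means is small relative to the per-seed spread, a single-seed comparison can easily point the wrong way, which is the motivation for averaging over many seeds.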

    A continuum mechanics model for fatigue life prediction with pre-corrosion and sequential corrosion fatigue

    We present a continuum model to predict pre-corrosion fatigue, which is a prevalent damage mechanism in aerospace structures under operational conditions. It is assumed that corrosion and fatigue sometimes occur largely separately. In this scenario, when an aircraft is in flight at high altitude, cyclic loading due to engine vibration and flutter is at its maximum, whereas the corrosive processes due to moisture or temperature are minimal. When the aircraft is on the ground, the corrosive process is at its maximum, whereas vibration loading is non-existent. For demonstration purposes, we study the effect of prior corrosion on the fatigue life of aluminum alloy 7075-T6. In this work, we employ Continuum Damage Mechanics (CDM) as the modeling platform to study fatigue crack initiation and growth from a pre-existing corrosion pit. In the CDM approach, a crack is assumed to initiate when the damage variable, D, attains a critical value Dc. We use the corrosion-free fatigue data to calibrate Dc0 for 7075-T6. This value of critical damage signifies the failure of a representative volume element (RVE) when corrosion is non-existent, see Fig. 1. In other words, the corrosion exposure time is zero, t = 0. The RVE starts to corrode as time elapses. The effect of corrosion appears as an increase in surface roughness. At early exposure times, damage occurs as corrosion pits and increased surface roughness. As time passes, pits grow in size and spread over the entire surface of the RVE. After a long exposure, the RVE corrodes in a self-similar manner, meaning that we assume the surface roughness reaches a limit value while uniform surface recession continues. We refer to this model as the concept of the corroded RVE, as shown in Fig. 1. We used this model to predict the fatigue life of 7075-T6 exposed for 0, 96, 768 and 1536 hrs in the Prohesion spray. The predicted results are in reasonable agreement with experimental data. We further tested the model for life prediction of sequential corrosion-fatigue scenarios where corrosion and fatigue occur in sequence. For a maximum stress of σmax = 340 MPa, a load ratio of R = 0.1 and an exposure time of texp = 100 hrs, the model predicts a 17% longer fatigue life for sequential corrosion-fatigue than for pre-corrosion fatigue. This result is interesting since it shows the interaction between corrosion and fatigue cycles. The result implies that if the corrosion time is spread over the fatigue cycles, the life may increase.
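For readers unfamiliar with the loading parameters quoted above, the following sketch works out what σmax = 340 MPa and R = 0.1 mean for a single load cycle, using the standard definition R = σmin/σmax. It is only arithmetic on the quoted values; the function name `cycle_stresses` is an assumption for illustration and is not part of the abstract's model.

```python
def cycle_stresses(sigma_max, load_ratio):
    """Return (minimum stress, stress amplitude, mean stress) for one cycle.

    Uses the standard load-ratio definition R = sigma_min / sigma_max.
    Units follow whatever unit sigma_max is given in (MPa here).
    """
    sigma_min = load_ratio * sigma_max
    amplitude = (sigma_max - sigma_min) / 2.0
    mean = (sigma_max + sigma_min) / 2.0
    return sigma_min, amplitude, mean


# Values quoted in the abstract: sigma_max = 340 MPa, R = 0.1.
sigma_min, sigma_a, sigma_m = cycle_stresses(340.0, 0.1)
print(f"sigma_min = {sigma_min:.1f} MPa")
print(f"amplitude = {sigma_a:.1f} MPa")
print(f"mean      = {sigma_m:.1f} MPa")
```

A positive R such as 0.1 means the cycle stays in tension throughout, in contrast to completely reversed loading (R = -1).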

    Closure of fatigue cracks at high strains

    Experiments were conducted on smooth specimens to study the closure behavior of short cracks at high cyclic strains under completely reversed cycling. Testing procedures and methodology, and closure measurement techniques, are described in detail. The strain levels chosen for the study range from predominantly elastic to grossly plastic strains. Crack closure measurements are made at different crack lengths. The study reveals that, at high strains, cracks close only as the lowest stress level in the cycle is approached. The crack opening is observed to occur in the compressive part of the loading cycle. The applied stress needed to open a short crack under high strain is found to be less than for cracks under small scale yielding. For increased plastic deformations, the value of σop/σmax is observed to decrease and approach the value of R. Comparison of the experimental results with existing analysis is made and indicates the limitations of the small scale yielding approach where gross plastic deformation behavior occurs.
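One way to read the trend in σop/σmax is through the commonly used effective stress range fraction, U = (σmax − σop)/(σmax − σmin), which reaches 1 when the crack is open over the whole cycle. The sketch below assumes that definition; the abstract does not say which closure analysis was used, and the numerical values are hypothetical.

```python
def effective_range_fraction(sigma_max, sigma_min, sigma_op):
    """Fraction of the stress range over which the crack is open.

    Assumes U = (sigma_max - sigma_op) / (sigma_max - sigma_min), with the
    opening stress clamped to the range [sigma_min, sigma_max].
    """
    if sigma_max <= sigma_min:
        raise ValueError("sigma_max must exceed sigma_min")
    # A crack that opens at or below the minimum stress is open all cycle.
    sigma_open = min(max(sigma_op, sigma_min), sigma_max)
    return (sigma_max - sigma_open) / (sigma_max - sigma_min)


# Completely reversed cycling (R = -1), stresses normalized by sigma_max.
# The opening-stress ratios below are hypothetical, for illustration only.
for ratio in (0.3, -0.4, -0.8, -1.0):
    u = effective_range_fraction(1.0, -1.0, ratio)
    print(f"sigma_op/sigma_max = {ratio:+.1f} -> U = {u:.2f}")
```

Under this definition, an opening ratio that falls toward R corresponds to U approaching 1, that is, almost the full applied stress range acts on the open crack.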

    Large language models effectively leverage document-level context for literary translation, but critical errors persist

    Large language models (LLMs) are competitive with the state of the art on a wide range of sentence-level translation datasets. However, their ability to translate paragraphs and documents remains unexplored because evaluation in these settings is costly and difficult. We show through a rigorous human evaluation that asking the GPT-3.5 (text-davinci-003) LLM to translate an entire literary paragraph (e.g., from a novel) at once results in higher-quality translations than standard sentence-by-sentence translation across 18 linguistically diverse language pairs (e.g., translating into and out of Japanese, Polish, and English). Our evaluation, which took approximately 350 hours of effort for annotation and analysis, is conducted by hiring translators fluent in both the source and target language and asking them to provide both span-level error annotations and preference judgments of which system's translations are better. We observe that discourse-level LLM translators commit fewer mistranslations, grammar errors, and stylistic inconsistencies than sentence-level approaches. That said, critical errors still abound, including occasional content omissions, and a human translator's intervention remains necessary to ensure that the author's voice remains intact. We publicly release our dataset and error annotations to spur future research on evaluation of document-level literary translation. Comment: preprint (31 pages
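The contrast between sentence-by-sentence and whole-paragraph translation can be sketched as two ways of calling the same translator. Everything here is an assumption for illustration: `translate` stands for any callable that maps source text to target text (for example, a wrapper around an LLM), the sentence splitter is deliberately naive, and none of this reproduces the paper's prompts or pipeline.

```python
import re
from typing import Callable, List


def split_sentences(paragraph: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]


def translate_sentence_level(paragraph: str,
                             translate: Callable[[str], str]) -> str:
    """Translate each sentence in isolation, then join the results."""
    return " ".join(translate(s) for s in split_sentences(paragraph))


def translate_paragraph_level(paragraph: str,
                              translate: Callable[[str], str]) -> str:
    """Translate the whole paragraph in one call, keeping its context."""
    return translate(paragraph.strip())


if __name__ == "__main__":
    # Stand-in "translator" that only upper-cases, to show the call pattern.
    def fake(text: str) -> str:
        return text.upper()

    sample = "It rained. She left! Why?"
    print(translate_sentence_level(sample, fake))
    print(translate_paragraph_level(sample, fake))
```

With a real translator, the paragraph-level call lets the model see cross-sentence context such as pronoun referents and tone, which the sentence-level loop hides from it.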

    Generating Question-Answer Hierarchies

    The process of knowledge acquisition can be viewed as a question-answer game between a student and a teacher in which the student typically starts by asking broad, open-ended questions before drilling down into specifics (Hintikka, 1981; Hakkarainen and Sintonen, 2002). This pedagogical perspective motivates a new way of representing documents. In this paper, we present SQUASH (Specificity-controlled Question-Answer Hierarchies), a novel and challenging text generation task that converts an input document into a hierarchy of question-answer pairs. Users can click on high-level questions (e.g., "Why did Frodo leave the Fellowship?") to reveal related but more specific questions (e.g., "Who did Frodo leave with?"). Using a question taxonomy loosely based on Lehnert (1978), we classify questions in existing reading comprehension datasets as either "general" or "specific". We then use these labels as input to a pipelined system centered around a conditional neural language model. We extensively evaluate the quality of the generated QA hierarchies through crowdsourced experiments and report strong empirical results. Comment: ACL camera ready + technical note on pipeline modifications for demo (15 pages
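The question-answer hierarchy described above can be pictured as a small tree in which general questions own more specific follow-ups. The sketch below is one possible representation, not the paper's data format: the class name `QANode`, its fields, and the placeholder answer strings are all assumptions, while the two example questions are the ones quoted in the abstract.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QANode:
    """One question-answer pair plus its more specific follow-ups."""
    question: str
    answer: str
    specificity: str  # "general" or "specific", as in the abstract
    children: List["QANode"] = field(default_factory=list)


def render(node: QANode, depth: int = 0) -> List[str]:
    """Return indented lines showing the hierarchy under `node`."""
    lines = ["  " * depth + f"[{node.specificity}] {node.question}"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Answers are placeholders; a real system would fill them from the document.
root = QANode(
    question="Why did Frodo leave the Fellowship?",
    answer="(answer text from the document)",
    specificity="general",
    children=[
        QANode(
            question="Who did Frodo leave with?",
            answer="(answer text from the document)",
            specificity="specific",
        )
    ],
)

print("\n".join(render(root)))
```

Keeping the specificity label on each node mirrors the abstract's "general" versus "specific" classification and makes it straightforward to show only top-level questions until a user expands one.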