185 research outputs found

    AiM: Taking Answers in Mind to Correct Chinese Cloze Tests in Educational Applications

    To automatically correct handwritten assignments, the traditional approach is to use an OCR model to recognize characters and compare them to answers. The OCR model is easily confused when recognizing handwritten Chinese characters, and the textual information of the answers is missing during model inference. However, teachers always have these answers in mind when reviewing and correcting assignments. In this paper, we focus on correcting Chinese cloze tests and propose a multimodal approach (named AiM). The encoded representations of answers interact with the visual information of students' handwriting. Instead of predicting 'right' or 'wrong', we perform sequence labeling on the answer text to infer, in a fine-grained way, which answer characters differ from the handwritten content. We take samples from OCR datasets as the positive samples for this task, and develop a negative sample augmentation method to scale up the training data. Experimental results show that AiM outperforms OCR-based methods by a large margin. Extensive studies demonstrate the effectiveness of our multimodal approach. Comment: Accepted to COLING 202

    Towards Real-World Writing Assistance: A Chinese Character Checking Benchmark with Faked and Misspelled Characters

    Writing assistance is an application closely related to human life and is also a fundamental Natural Language Processing (NLP) research field. Its aim is to improve the correctness and quality of input texts, with character checking being crucial in detecting and correcting wrong characters. From the perspective of the real world where handwriting occupies the vast majority, characters that humans get wrong include faked characters (i.e., untrue characters created due to writing errors) and misspelled characters (i.e., true characters used incorrectly due to spelling errors). However, existing datasets and related studies only focus on misspelled characters mainly caused by phonological or visual confusion, thereby ignoring faked characters which are more common and difficult. To break through this dilemma, we present Visual-C$^3$, a human-annotated Visual Chinese Character Checking dataset with faked and misspelled Chinese characters. To the best of our knowledge, Visual-C$^3$ is the first real-world visual and the largest human-crafted dataset for the Chinese character checking scenario. Additionally, we also propose and evaluate novel baseline methods on Visual-C$^3$. Extensive empirical results and analyses show that Visual-C$^3$ is high-quality yet challenging. The Visual-C$^3$ dataset and the baseline methods will be publicly available to facilitate further research in the community. Comment: Work in progress

    A Semi-Analytical Model for the Formation and Evolution of Radio Relics in Galaxy Clusters

    Radio relics are Mpc-sized synchrotron sources located in the peripheral regions of galaxy clusters. Models based on the diffusive shock acceleration (DSA) scenario have been widely accepted to explain the formation of radio relics. However, a critical challenge to these models is that most observed shocks seem too weak to generate detectable emission, unless fossil electrons, a population of mildly energetic electrons that have been accelerated previously, are included in the models. To address this issue, we present a new semi-analytical model to describe the formation and evolution of radio relics by incorporating fossil relativistic electrons into DSA theory, which is constrained by a sample of 14 observed relics, and employ the Press-Schechter formalism to simulate the relics in a $20^{\circ} \times 20^{\circ}$ sky field at 50, 158, and 1400 MHz, respectively. Results show that fossil electrons contribute significantly to the radio emission, which can generate radiation four orders of magnitude brighter than that solely produced by thermal electrons at 158 MHz, and the power distribution of our simulated radio relic catalog can reconcile the observed $P_{1400}$-$M_{\mathrm{vir}}$ relation. We predict that $7.1\%$ of clusters with $M_{\mathrm{vir}} > 1.2\times 10^{14}\,\mathrm{M}_{\odot}$ would host relics at 158 MHz, which is consistent with the result of $10 \pm 6\%$ given by the LoTSS DR2. It is also found that radio relics are expected to cause severe foreground contamination in future EoR experiments, similar to that of radio halos. The possibility of AGN providing seed fossil relativistic electrons is evaluated by calculating the number of radio-loud AGNs that a shock is expected to encounter during its propagation. Comment: 15 pages, 20 figures. Accepted for publication in MNRAS. Comments welcome

    Genetic prediction of the causal relationship between schizophrenia and tumors: a Mendelian randomized study

    Background: Patients with schizophrenia are at a higher risk of developing cancer. However, the causal relationship between schizophrenia and different tumor types remains unclear. Methods: Using a two-sample, two-way Mendelian randomization method, we used publicly available genome-wide association analysis (GWAS) aggregate data to study the causal relationship between schizophrenia and different cancer risk factors. These tumors included lung adenocarcinoma, lung squamous cell carcinoma, small-cell lung cancer, gastric cancer, alcohol-related hepatocellular cancer, tumors involving the lungs, breast, thyroid gland, pancreas, prostate, ovaries and cervix, endometrium, colon and colorectum, and bladder. We used the inverse variance weighting (IVW) method to determine the causal relationship between schizophrenia and different tumor risk factors. In addition, we conducted a sensitivity test to evaluate the effectiveness of the causality. Results: After adjusting for heterogeneity, evidence of a causal relationship between schizophrenia and lung cancer risk was observed (odds ratio [OR]=1.001, 95% confidence interval [CI], 1.000–1.001; P=0.0155). In the sensitivity analysis, the causal effect of schizophrenia on the risk of lung cancer was consistent in both direction and degree. However, no evidence of causality or reverse causality between schizophrenia and other tumors was found. Conclusion: This study elucidated a causal relationship between the genetic predictors of schizophrenia and the risk of lung cancer, thereby providing a basis for the prevention, pathogenesis, and treatment of schizophrenia in patients with lung cancer.
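
The inverse variance weighting (IVW) step named in the abstract above can be illustrated with a minimal sketch. It assumes the standard fixed-effect IVW formula on per-variant summary statistics; the function names and the toy numbers are hypothetical and are not taken from the study, whose exact pipeline (instrument selection, heterogeneity adjustment, sensitivity tests) is not described here.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate from per-variant summary statistics.

    beta_exposure: variant effects on the exposure (e.g. schizophrenia)
    beta_outcome:  variant effects on the outcome (e.g. a cancer, log-odds scale)
    se_outcome:    standard errors of the outcome effects
    """
    # Weighted regression of outcome effects on exposure effects through the origin,
    # with weights 1 / se_outcome^2.
    num = sum(bx * by / se ** 2
              for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / se ** 2
              for bx, se in zip(beta_exposure, se_outcome))
    return num / den, math.sqrt(1.0 / den)

def odds_ratio_ci(beta, se, z=1.96):
    """Convert a log-odds estimate into an odds ratio with an approximate 95% CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Hypothetical toy data: two variants, purely for illustration.
beta, se = ivw_estimate([0.1, 0.2], [0.01, 0.02], [0.01, 0.01])
odds_ratio, ci_low, ci_high = odds_ratio_ci(beta, se)
```

An odds ratio whose confidence interval excludes 1 is what such an analysis would read as evidence of a causal effect, subject to the usual Mendelian randomization assumptions.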

    BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

    Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License

    Time-dependent water permeation behavior of concrete under constant hydraulic pressure

    In the present work, a concrete permeability testing setup was designed to study the behavior of hydraulic concrete subjected to constant hydraulic pressure. The results show that when concrete is subjected to high enough constant hydraulic pressure, it will be permeated, and after it reaches its maximum permeation rate, the permeability coefficient will gradually decrease towards a stable value. A time-dependent model of permeability coefficient for concrete subjected to hydraulic pressure is proposed. It is indicated that the decrease of the permeability coefficient with permeation time conforms well to the negative-exponential decrease model
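
As a rough illustration of a negative-exponential decrease towards a stable value, the sketch below assumes the form k(t) = k_stable + (k_peak - k_stable) * exp(-t / tau), with t measured from the moment of maximum permeation. The functional form, the parameter names, and the example values are assumptions made for illustration; the abstract does not give the paper's actual equation or fitted constants.

```python
import math

def permeability_coefficient(t, k_peak, k_stable, tau):
    """Assumed negative-exponential decay of the permeability coefficient.

    t:        time elapsed since the maximum permeation rate was reached
    k_peak:   coefficient at t = 0 (hypothetical parameter)
    k_stable: long-term stable coefficient (hypothetical parameter)
    tau:      decay time constant, same unit as t (hypothetical parameter)
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    return k_stable + (k_peak - k_stable) * math.exp(-t / tau)

# Hypothetical values, chosen only to show the shape of the curve.
curve = [permeability_coefficient(t, 5.0e-11, 1.0e-11, 24.0) for t in range(0, 121, 24)]
```

Under this assumed form the coefficient starts at k_peak, falls monotonically, and approaches k_stable as t grows, which matches the qualitative behavior the abstract describes.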

    Memories of the Gold Foreign Exchange Market Based on a Moving V-Statistic and Wavelet-Based Multiresolution Analysis

    Memory in finance is the foundation of a well-established forecasting model, and new financial theory research shows that the stochastic memory model depends on different time windows. To accurately identify the multivariate long memory model in the financial market, this paper proposes the concept of a moving V-statistic on the basis of a modified R/S method to determine whether the time series has a long-range dependence and subsequently to apply wavelet-based multiresolution analysis to study the multifractality of the financial time series to determine the initial data windows. Finally, we check the moving V-statistic estimation in wavelet analysis in the same condition; the paper selects the volatilities of the gold foreign exchange rates to evaluate the moving V-statistic. According to the results, the method of testing memory established in this paper can identify the breakpoint of the memories effectively. Furthermore, this method can provide support for forecasting returns in the financial market
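
A moving V-statistic of the kind mentioned above can be sketched as follows. The sketch assumes the usual definition V = (R/S) / sqrt(n) evaluated on a sliding window, with an optional Lo-style autocovariance correction of the scale term (Bartlett weights up to lag q). The function names, the window handling, and the choice of correction are assumptions; the paper's exact modified R/S procedure and its wavelet-based window selection are not reproduced here.

```python
import math

def v_statistic(x, q=0):
    """V = (R / S_q) / sqrt(n) for one window of observations.

    q = 0 gives the classical R/S scale; q > 0 adds Bartlett-weighted
    autocovariances to the variance (Lo-style modification, assumed here).
    """
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]

    # Range of the cumulative deviations from the window mean.
    cum, lowest, highest = 0.0, 0.0, 0.0
    for d in dev:
        cum += d
        lowest = min(lowest, cum)
        highest = max(highest, cum)

    variance = sum(d * d for d in dev) / n
    for j in range(1, q + 1):
        gamma_j = sum(dev[i] * dev[i - j] for i in range(j, n)) / n
        variance += 2.0 * (1.0 - j / (q + 1)) * gamma_j
    if variance <= 0:
        raise ValueError("non-positive variance estimate for this window")

    return (highest - lowest) / (math.sqrt(variance) * math.sqrt(n))

def moving_v_statistic(x, window, q=0):
    """V-statistic on every contiguous window of the given length."""
    return [v_statistic(x[i:i + window], q) for i in range(len(x) - window + 1)]
```

Plotting the output of `moving_v_statistic` against the window start index is one way to look for shifts in the memory structure of a series; which window length to use is left open here.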