
    Beyond MLE: Convex Learning for Text Generation

    Maximum likelihood estimation (MLE) is a statistical method used to estimate the parameters of a probability distribution that best explain the observed data. In the context of text generation, MLE is often used to train generative language models, which can then be used to generate new text. However, we argue that MLE is not always necessary or optimal, especially for closed-ended text generation tasks like machine translation. In these tasks, the goal of the model is to generate the most appropriate response, which does not necessarily require it to estimate the entire data distribution with MLE. To this end, we propose a novel class of training objectives based on convex functions, which enables text generation models to focus on highly probable outputs without having to estimate the entire data distribution. We investigate the theoretical properties of the optimal predicted distribution when applying convex functions to the loss, demonstrating that convex functions can sharpen the optimal distribution, thereby enabling the model to better capture outputs with high probabilities. Experiments on various text generation tasks and models show the effectiveness of our approach. It enables autoregressive models to bridge the gap between greedy and beam search, and facilitates the learning of non-autoregressive models with a maximum improvement of 9+ BLEU points. Moreover, our approach also exhibits significant impact on large language models (LLMs), substantially enhancing their generative capability on various tasks. Source code is available at https://github.com/ictnlp/Convex-Learning. Comment: NeurIPS 202

    Non-autoregressive Streaming Transformer for Simultaneous Translation

    Simultaneous machine translation (SiMT) models are trained to strike a balance between latency and translation quality. However, training these models to achieve high quality while maintaining low latency often leads to a tendency for aggressive anticipation. We argue that this issue stems from the autoregressive architecture upon which most existing SiMT models are built. To address it, we propose the non-autoregressive streaming Transformer (NAST), which comprises a unidirectional encoder and a non-autoregressive decoder with intra-chunk parallelism. We enable NAST to generate the blank token or repetitive tokens to adjust its READ/WRITE strategy flexibly, and train it to maximize the non-monotonic latent alignment with an alignment-based latency loss. Experiments on various SiMT benchmarks demonstrate that NAST outperforms previous strong autoregressive SiMT baselines. Comment: EMNLP 2023 main conference; Source code is available at https://github.com/ictnlp/NAS

    Butane-1,2,3,4-tetracarboxylic acid–4,4′-bipyridine (1/2)

    The hydrothermal reaction of butane-1,2,3,4-tetracarboxylic acid (H4butca), 4,4′-bipyridine (bipy) and Mn(SO4)2·H2O afforded a new co-crystal, C8H10O8·2C10H8N2 or H4butca·2(bipy), in which strong O—H⋯N hydrogen-bonding and weak π–π stacking [centroid–centroid distance = 3.8459 (19) Å] interactions assemble the organic molecules into a three-dimensional supramolecular framework. C—H⋯O interactions are also present. The whole molecule has inversion symmetry.

    Discriminating bipartite mixed states by local operations

    Unambiguous state discrimination of two mixed bipartite states via local operations and classical communications (LOCC) is studied and compared with the result of a scheme realized via global measurement. We show that the success probability of a global scheme for mixed-state discrimination can be achieved perfectly by the local scheme. In addition, we simulate this discrimination via a pair of pure entangled bipartite states. This simulation is perfect for local rather than global schemes due to the existence of entanglement and global coherence in the pure states. We also prove that the LOCC protocol and sequential state discrimination (SSD) can be interpreted in a unified view. We then hybridize the LOCC protocol with three protocols (SSD, reproducing and broadcasting) relying on classical communications. Such hybridizations extend the gaps between the optimal success probability of global and local schemes, which can be eliminated only for the SSD rather than the other two protocols.

    Molecular phylogeny of the antiangiogenic and neurotrophic serpin, pigment epithelium derived factor in vertebrates

    BACKGROUND: Pigment epithelium derived factor (PEDF), a member of the serpin family, regulates cell proliferation, promotes survival of neurons, and blocks growth of new blood vessels in mammals. Defining the molecular phylogeny of PEDF by bioinformatic analysis is one approach to understanding the link between its gene structure and its function in these biological processes. RESULTS: From a comprehensive search of available DNA databases we identified a single PEDF gene in all vertebrate species examined. These included four mammalian and six non-mammalian vertebrate species in which PEDF had not previously been described. A five-gene cluster around PEDF was found in an approximately 100 kb region in mammals, birds, and amphibians. In ray-finned fish these genes are scattered over three chromosomes, although only one PEDF gene was consistently found. The PEDF gene is absent in invertebrates including Drosophila melanogaster (D. melanogaster), Caenorhabditis elegans (C. elegans), and sea squirt (C. intestinalis). The PEDF gene is transcribed in all vertebrate phyla, suggesting it is biologically active throughout vertebrate evolution. The multiple actions of PEDF are likely conserved in evolution since it has the same gene structure across phyla, although the size of the gene ranges from 48.3 kb in X. tropicalis to 2.9 kb in fugu, with human PEDF at a size of 15.6 kb. A strong similarity in the proximal 200 bp of the PEDF promoter in mammals suggests the existence of a possible regulatory region across phyla. Using a non-synonymous/synonymous substitution rate ratio we show that mammalian and fish PEDFs have similar ratios of <0.13, reflecting strong purifying selection on the PEDF gene. A large number of repetitive transposable elements of the SINE and LINE class were found with random distribution in both the promoter and introns of mammalian PEDF.
    CONCLUSION: The PEDF gene first appears in vertebrates and our studies suggest that the regulation and biological actions of this gene are preserved across vertebrates. This comprehensive analysis of the PEDF gene across phyla provides new information that will aid further characterization of common functional motifs of this serpin in biological processes.

    The therapeutic evaluation and mechanism on treating bronchial hyper-responsiveness cough by ziyinqingre prescription

    Objective: To evaluate the effects of Ziyinqingre prescription on airway resistance (Rrs), airway response threshold (Dmin), airway conductance (sGrs), and the inflammatory cytokines interleukin-4 (IL-4) and interferon-γ (IFN-γ) in patients with bronchial hyper-responsiveness (BHR) cough. Method: 84 subjects diagnosed with BHR were randomly divided into a Traditional Chinese Medicine group of 42 and a control group of 42. The Traditional Chinese Medicine group received Ziyinqingre prescription twice a day and the control group received 10 mg Montelukast Sodium tablets once a day for two weeks. Improvement in clinical symptoms and changes in the levels of Rrs, Dmin, sGrs, IL-4 and IFN-γ were observed. Results: After treatment, the symptoms of the Chinese medicine group were clearly alleviated, and the outcome was more satisfactory than that of the control group. Compared with the control group, the level of Dmin increased and the sGrs level decreased more markedly (P<0.05); the level of IL-4 decreased and the IFN-γ level increased more markedly in the Chinese medicine group (P<0.05). Conclusion: Ziyinqingre prescription can not only improve BHR patients’ symptoms but also reduce the level of bronchial responsiveness, which demonstrates a better curative effect of Chinese medicine. The mechanism is probably the relief of airway inflammation by keeping the balance between Th1 and Th2 cells. Keywords: Ziyinqingre prescription; cough; bronchial hyper-responsiveness; therapeutic mechanism

    GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest

    Instruction tuning large language models (LLMs) on image-text pairs has achieved unprecedented vision-language multimodal abilities. However, their vision-language alignments are built only at the image level, and the lack of region-level alignment limits their progress toward fine-grained multimodal understanding. In this paper, we propose instruction tuning on region-of-interest. The key design is to reformulate the bounding box in the format of a spatial instruction. The interleaved sequences of visual features extracted by the spatial instruction and the language embedding are input to the LLM, which is trained on the transformed region-text data in instruction tuning format. Our region-level vision-language model, termed GPT4RoI, brings a brand new conversational and interactive experience beyond image-level understanding. (1) Controllability: Users can interact with our model by both language and spatial instructions to flexibly adjust the detail level of the question. (2) Capacities: Our model supports not only single-region spatial instruction but also multi-region. This unlocks more region-level multimodal capacities such as detailed region caption and complex region reasoning. (3) Composition: Any off-the-shelf object detector can be a spatial instruction provider so as to mine informative object attributes from our model, like color, shape, material, action, relation to other objects, etc. The code, data, and demo can be found at https://github.com/jshilong/GPT4RoI. Comment: Code has been released at https://github.com/jshilong/GPT4Ro