145 research outputs found

    Leveraging Key Information Modeling to Improve Less-Data Constrained News Headline Generation via Duality Fine-Tuning

    Recent generative language models are mostly trained on large-scale datasets, but in some real scenarios training data are expensive to obtain and therefore small. In this paper we investigate the challenging task of less-data constrained generation, particularly news headline generation, where the generated headlines are short yet expected by readers to be both readable and informative. We highlight the key information modeling task and propose a novel duality fine-tuning method by formally defining the probabilistic duality constraints between the key information prediction and headline generation tasks. The proposed method can capture more information from limited data, build connections between separate tasks, and is suitable for less-data constrained generation tasks. Furthermore, the method can leverage various pre-trained generative regimes, e.g., autoregressive and encoder-decoder models. We conduct extensive experiments on two public datasets to demonstrate that our method is effective and efficient, improving both a language modeling metric and an informativeness correctness metric. Comment: Accepted by AACL-IJCNLP 2022 main conference
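
The abstract does not spell out the duality constraint. As a rough illustration only, one plausible form follows from factorizing the joint probability of key information and headline in two ways. The symbols $d$ (source document), $k$ (key information), $h$ (headline), and the parameters $\theta_k$, $\theta_h$ are assumptions introduced here and are not taken from the source.

```latex
% Assumed notation: d = document, k = key information, h = headline.
% Both sides are factorizations of the same joint P(k, h | d).
\[
  P(k \mid d;\, \theta_k)\, P(h \mid d, k;\, \theta_h)
  \;=\;
  P(h \mid d;\, \theta_h)\, P(k \mid d, h;\, \theta_k)
\]
```

Under this reading, fine-tuning could penalize the gap between the two sides (for example, the squared difference of their logarithms) alongside the usual task losses. Whether the paper uses exactly this form is not stated in the abstract.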

    UER: A Heuristic Bias Addressing Approach for Online Continual Learning

    Online continual learning aims to train neural networks continuously from a data stream in which each sample is seen only once. Rehearsal-based methods, the most effective approach, replay part of the previous data. The predictors commonly used in existing methods tend to generate biased dot-product logits that favor the classes of the current data, which is known as the bias issue and is a symptom of forgetting. Many approaches have been proposed to overcome forgetting by correcting the bias; however, they still need improvement in the online setting. In this paper, we address the bias issue with a simpler and more efficient method. By decomposing the dot-product logits into an angle factor and a norm factor, we empirically find that the bias mainly occurs in the angle factor, which can be used to learn novel knowledge as cosine logits. In contrast, the norm factor, which existing methods discard, helps retain historical knowledge. Based on this observation, we propose to leverage the norm factor to balance new and old knowledge and thereby address the bias. To this end, we develop a heuristic approach called unbias experience replay (UER). UER learns current samples using only the angle factor and replays previous samples using both the norm and angle factors. Extensive experiments on three datasets show that UER achieves superior performance over various state-of-the-art methods. The code is available at https://github.com/FelixHuiweiLin/UER. Comment: 9 pages, 12 figures, ACM MM202
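
The decomposition described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the norm factor is the product of the feature and class-weight norms and the angle factor is their cosine similarity, and the function name `decompose_logit` is hypothetical.

```python
import math


def decompose_logit(feature, weight):
    """Split a dot-product logit into a norm factor and an angle (cosine) factor.

    Assumption: norm factor = ||feature|| * ||weight||,
    angle factor = cos(feature, weight), so that
    logit = norm factor * angle factor.
    """
    dot = sum(f * w for f, w in zip(feature, weight))
    norm = math.sqrt(sum(f * f for f in feature)) * math.sqrt(
        sum(w * w for w in weight)
    )
    # Guard against zero vectors, for which the angle is undefined.
    cosine = dot / norm if norm > 0 else 0.0
    return norm, cosine


feature = [3.0, 4.0]   # toy feature vector
weight = [1.0, 0.0]    # toy class-weight vector
norm, cosine = decompose_logit(feature, weight)

# Multiplying the two factors recovers the original dot-product logit.
logit = norm * cosine
```

Following the abstract, a loss for current samples would use only `cosine`, while replayed samples would use the full `norm * cosine` logit.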

    Notch2 controls hepatocyte-derived cholangiocarcinoma formation in mice.

    Liver cancer comprises a group of malignant tumors, among which hepatocellular carcinoma (HCC) and intrahepatic cholangiocarcinoma (ICC) are the most common. ICC is especially pernicious and is associated with poor clinical outcome. Studies have shown that a subset of human ICCs may originate from mature hepatocytes. However, the mechanisms driving the trans-differentiation of hepatocytes into malignant cholangiocytes remain poorly defined. We used lineage tracing techniques and an established murine hepatocyte-derived ICC model generated by hydrodynamic injection of activated forms of the AKT (myr-AKT) and Yap (YapS127A) proto-oncogenes. Wild-type, Notch1 flox/flox, and Notch2 flox/flox mice were used to investigate the role of canonical Notch signaling and Notch receptors in AKT/Yap-driven ICC formation. Human ICC and HCC cell lines were transfected with siRNA against Notch2 to determine whether Notch2 regulates biliary marker expression in liver tumor cells. We found that AKT/Yap-induced ICC formation is hepatocyte derived and that this process is strictly dependent on the canonical Notch signaling pathway in vivo. Deletion of Notch2 in AKT/Yap-induced tumors switched the phenotype from ICC to hepatocellular adenoma-like lesions, while inactivation of Notch1 in hepatocytes did not result in significant histomorphological changes. Finally, in vitro studies revealed that Notch2 silencing in ICC and HCC cell lines down-regulates the expression of the biliary markers Sox9 and EpCAM. Notch2 is the major determinant of hepatocyte-derived ICC formation in mice.

    Adopting a Theophylline-Responsive Riboswitch for Flexible Regulation and Understanding of Glycogen Metabolism in Synechococcus elongatus PCC7942

    Cyanobacteria are considered promising photosynthetic microbial platforms that use solar energy to convert carbon dioxide into biomass and bioproducts. Glycogen synthesis serves as an essential natural carbon sink mechanism, storing a large portion of the energy and organic carbon fixed by photosynthesis. Engineering glycogen metabolism to harness and rewire carbon flow is an important strategy for optimizing the efficacy of cyanobacterial platforms. ADP-glucose pyrophosphorylase (GlgC) catalyzes the rate-limiting step of glycogen synthesis. However, knockout of glgC fails to promote cell growth or photosynthetic production in cyanobacteria; on the contrary, glgC deficiency impairs cellular fitness and robustness. In this work, we adopted a theophylline-responsive riboswitch to engineer and control glgC expression in Synechococcus elongatus PCC7942 and achieved flexible regulation of intracellular GlgC abundance and glycogen storage. With this approach, glycogen synthesis and glycogen content in PCC7942 cells could be regulated over a range from about 40 to 300% of wild-type levels. In addition, the results supported a positive role of glycogen metabolism in cyanobacterial cellular robustness. When glycogen storage was reduced, cellular physiology and growth under standard conditions were not impaired, but cellular tolerance toward environmental stresses was weakened. Conversely, when glycogen synthesis was enhanced, PCC7942 cells displayed improved cellular robustness. Our findings emphasize the significance of glycogen metabolism for cyanobacterial physiology and the importance of flexible approaches for engineering and understanding cellular physiology and metabolism.

    MaskOCR: Text Recognition with Masked Encoder-Decoder Pretraining

    Text images contain both visual and linguistic information. However, existing pre-training techniques for text recognition mainly focus on either visual representation learning or linguistic knowledge learning. In this paper, we propose a novel approach, MaskOCR, to unify vision and language pre-training in the classical encoder-decoder recognition framework. We adopt the masked image modeling approach to pre-train the feature encoder on a large set of unlabeled real text images, which allows us to learn strong visual representations. Rather than introducing linguistic knowledge through an additional language model, we directly pre-train the sequence decoder. Specifically, we transform text data into synthesized text images to unify the data modalities of vision and language, and enhance the language modeling capability of the sequence decoder using a proposed masked image-language modeling scheme. Notably, the encoder is frozen during the pre-training phase of the sequence decoder. Experimental results demonstrate that our proposed method achieves superior performance on benchmark datasets, including Chinese and English text images.

    Design of 1,600 linear meters of paver-block paving in the neighborhoods Anexo a la Villa Victoria de Julio, Antonio Mendoza, and Rubén Ulloa, in the urban center of Tipitapa, municipality of Managua

    The development of our country rests on fundamental elements such as agriculture, industry, livestock farming, commerce, and tourism. The determining factor among these, however, is the national transport system (land, air, and maritime transport), which is the main link for the development of society. In Nicaragua, land transport is the mode most used by the population, and the growing circulation of vehicles with more powerful engines on the roads makes it necessary to modernize the road infrastructure so that traffic can be safer and more efficient. The expansion of the road network is directly linked to the economy of our country, since it plays an essential role in the daily activities of the different sectors that contribute to the national economy. At present, the construction of new roads, the rehabilitation of highways, and the improvement of existing roads must be a priority for governments, since they are a fundamental component of the well-being and development of society. Their design must also meet the conditions needed to obtain a quality work, complying with all the principles and standards that apply to road design. The present work, entitled "Design of 1,600 linear meters of street in the neighborhoods Anexo la Villa Rubén Ulloa, Villa Victoria de Julio, and Antonio Mendoza, located in the urban center of Tipitapa, municipality of Managua", presents the studies, methods, and standards applicable to the geometric design of the road, the hydraulic design, and the design of the pavement structure, taking into account the specifications for road design in Nicaragua.

    BmC/EBPZ gene is essential for the larval growth and development of silkworm, Bombyx mori

    The genetic male sterile line (GMS) of the silkworm Bombyx mori is a recessive mutant that arose by natural mutation from the wild-type 898WB strain. One of the major characteristics of the GMS mutant is its small larvae. Through positional cloning, candidate genes for the GMS mutant were located in a region approximately 800.5 kb long on the 24th linkage group of the silkworm. One of these genes was Bombyx mori CCAAT/enhancer-binding protein zeta (BmC/EBPZ), which is a member of the basic region-leucine zipper transcription factor family. Compared with the wild-type 898WB strain, the GMS mutant has a 9 bp insertion at the 3′ end of the open reading frame of the BmC/EBPZ gene. Moreover, the high expression level of the BmC/EBPZ gene in the testis suggests that the gene is involved in the regulation of reproduction-related genes. Using the CRISPR/Cas9-mediated knockout system, we found that the BmC/EBPZ knockout strains had the same phenotype as the GMS mutant, that is, the larvae were small. However, the larvae of the BmC/EBPZ knockout strains died during the third instar. Therefore, the BmC/EBPZ gene was identified as the major gene responsible for the GMS mutation.

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome, including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.