27 research outputs found

    Neural Responding Machine for Short-Text Conversation

    We propose Neural Responding Machine (NRM), a neural-network-based response generator for Short-Text Conversation. NRM adopts the general encoder-decoder framework: it formalizes the generation of a response as a decoding process based on the latent representation of the input text, with both encoding and decoding realized by recurrent neural networks (RNNs). NRM is trained with a large amount of one-round conversation data collected from a microblogging service. An empirical study shows that NRM can generate grammatically correct and content-wise appropriate responses to over 75% of input texts, outperforming state-of-the-art methods in the same setting, including retrieval-based and SMT-based models.
    Comment: accepted as a full paper at ACL 201
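As a rough illustration of the encoder-decoder idea described in the abstract, here is a minimal sketch: an "encoder" folds input tokens into one latent vector and a "decoder" emits response tokens conditioned on it. The embeddings, vocabulary, and untrained update rule are invented stand-ins, not the paper's model.

```python
import math

VOCAB = ["<eos>", "hello", "how", "are", "you", "fine", "thanks"]

def embed(token, dim=4):
    # Deterministic pseudo-embedding so the sketch is self-contained.
    h = sum(ord(c) for c in token)
    return [math.sin(h * (i + 1)) for i in range(dim)]

def rnn_step(h, x):
    # Elman-style update, elementwise tanh (weight matrices omitted).
    return [math.tanh(a + b) for a, b in zip(h, x)]

def encode(tokens, dim=4):
    # Fold the whole input into a single latent representation.
    h = [0.0] * dim
    for t in tokens:
        h = rnn_step(h, embed(t, dim))
    return h

def decode(latent, max_len=5):
    # Greedy decoding: pick the vocab token whose embedding best matches
    # the current hidden state, then feed it back in.
    h, out = latent, []
    for _ in range(max_len):
        tok = max(VOCAB, key=lambda t: sum(a * b for a, b in zip(h, embed(t))))
        if tok == "<eos>":
            break
        out.append(tok)
        h = rnn_step(h, embed(tok))
    return out

response = decode(encode(["how", "are", "you"]))
```

A trained NRM would replace the toy embeddings and update with learned GRU parameters; the control flow, however, mirrors the decode-from-latent process the abstract describes.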

    Multimodal Convolutional Neural Networks for Matching Image and Sentence

    In this paper, we propose multimodal convolutional neural networks (m-CNNs) for matching image and sentence. Our m-CNN provides an end-to-end framework with convolutional architectures to exploit image representation, word composition, and the matching relations between the two modalities. More specifically, it consists of one image CNN encoding the image content, and one matching CNN learning the joint representation of image and sentence. The matching CNN composes words into different semantic fragments and learns the inter-modal relations between the image and the composed fragments at different levels, thus fully exploiting the matching relations between image and sentence. Experimental results on benchmark databases for bidirectional image and sentence retrieval demonstrate that the proposed m-CNNs can effectively capture the information necessary for image and sentence matching. Specifically, our proposed m-CNNs achieve state-of-the-art performance on bidirectional image and sentence retrieval on the Flickr30K and Microsoft COCO databases.
    Comment: Accepted by ICCV 201
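The retrieval task behind this abstract reduces to scoring image-sentence pairs and ranking candidates by that score. A toy sketch with invented stand-in vectors (the real m-CNN learns image and fragment representations with CNNs and a learned matching network):

```python
def dot(u, v):
    # Matching score as a simple inner product of the two modality vectors.
    return sum(a * b for a, b in zip(u, v))

# Invented "embeddings" for one image and two candidate sentences.
image = [0.9, 0.1, 0.2]
sentences = {
    "a dog runs on grass": [0.8, 0.2, 0.1],
    "a red car on a road": [0.1, 0.9, 0.3],
}

# Image-to-sentence retrieval: rank candidates by matching score.
best = max(sentences, key=lambda s: dot(image, sentences[s]))
```

Bidirectional retrieval just swaps the roles: fix a sentence vector and rank candidate images by the same score.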

    Paraphrase Generation with Deep Reinforcement Learning

    Automatic generation of paraphrases from a given sentence is an important yet challenging task in natural language processing (NLP), and plays a key role in a number of applications such as question answering, search, and dialogue. In this paper, we present a deep reinforcement learning approach to paraphrase generation. Specifically, we propose a new framework for the task, which consists of a \textit{generator} and an \textit{evaluator}, both of which are learned from data. The generator, built as a sequence-to-sequence learning model, can produce paraphrases given a sentence. The evaluator, constructed as a deep matching model, can judge whether two sentences are paraphrases of each other. The generator is first trained by deep learning and then further fine-tuned by reinforcement learning in which the reward is given by the evaluator. For the learning of the evaluator, we propose two methods based on supervised learning and inverse reinforcement learning respectively, depending on the type of available training data. An empirical study shows that the learned evaluator can guide the generator to produce more accurate paraphrases. Experimental results demonstrate that the proposed models (the generators) outperform state-of-the-art methods in paraphrase generation in both automatic and human evaluation.
    Comment: EMNLP 201
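The generator-evaluator loop can be sketched as a tiny REINFORCE bandit: a "generator" holds one logit per candidate paraphrase, an "evaluator" scores candidates, and reward-weighted updates shift probability mass toward candidates the evaluator prefers. Everything here (the candidates, the word-overlap evaluator, the hyperparameters) is an invented stand-in for the paper's seq2seq generator and deep matching evaluator.

```python
import math
import random

candidates = ["how old are you", "what is your age", "where do you live"]

def evaluator(source, candidate):
    # Stand-in matching model: Jaccard word overlap with the source.
    s, c = set(source.split()), set(candidate.split())
    return len(s & c) / len(s | c)

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

def train(source, steps=500, lr=0.5, seed=0):
    rng = random.Random(seed)
    logits = [0.0] * len(candidates)
    for _ in range(steps):
        probs = softmax(logits)
        i = rng.choices(range(len(candidates)), weights=probs)[0]
        reward = evaluator(source, candidates[i])
        # REINFORCE: raise the log-probability of the sampled candidate
        # in proportion to the evaluator's reward.
        for j in range(len(logits)):
            grad = (1.0 if j == i else 0.0) - probs[j]
            logits[j] += lr * reward * grad
    return softmax(logits)

probs = train("how old are you")
```

Note this toy evaluator rewards copying the source verbatim; the paper's learned evaluator is precisely what replaces such a naive signal so that genuine paraphrases, not copies, get high reward.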

    Neural Generative Question Answering

    This paper presents an end-to-end neural network model, named Neural Generative Question Answering (GENQA), that can generate answers to simple factoid questions based on the facts in a knowledge-base. More specifically, the model is built on the encoder-decoder framework for sequence-to-sequence learning, equipped with the ability to query the knowledge-base, and is trained on a corpus of question-answer pairs with their associated triples in the knowledge-base. An empirical study shows that the proposed model can effectively deal with variations in questions and answers, and generate correct and natural answers by referring to the facts in the knowledge-base. The experiment on question answering demonstrates that the proposed model can outperform an embedding-based QA model as well as a neural dialogue model trained on the same data.
    Comment: Accepted by IJCAI 201
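The core idea — answer a factoid question by retrieving a knowledge-base triple and wrapping it in natural language — can be sketched with a rule-based stand-in for GENQA's learned enquiry mechanism. The triples and the matching pattern below are invented for illustration.

```python
# Knowledge-base as (subject, predicate, object) triples.
KB = [
    ("Beijing", "capital_of", "China"),
    ("Paris", "capital_of", "France"),
]

def answer(question):
    q = question.lower()
    for subj, pred, obj in KB:
        # Match a "capital of X" style question against the triples and
        # wrap the retrieved fact in a natural-language answer.
        if pred == "capital_of" and obj.lower() in q and "capital" in q:
            return f"The capital of {obj} is {subj}."
    return "I don't know."

ans = answer("What is the capital of France?")
```

GENQA replaces the hand-written pattern with a learned decision, at each decoding step, between generating a common word and emitting the retrieved KB entity.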

    Fe/MOF based platform for NIR laser induced efficient PDT/PTT of cancer

    Introduction: Photodynamic therapy (PDT) and photothermal therapy (PTT) are widely used in the treatment of tumors. However, their clinical application is limited by the complexity and irreversible hypoxia of the environment generated by tumor tissues. To overcome this limitation, a nanoparticle composed of indocyanine green (ICG) and Fe-MOF-5 was developed.
    Methods: We prepared F-I@FM5 and measured its morphology, particle size, and stability. Its enzyme-like activity and optical effects were verified. We then used MTT assays, staining, and flow cytometry to evaluate the anti-tumor effect on EMT-6 cells in vitro. Finally, the anti-tumor effect in vivo was studied in EMT-6 tumor-bearing mice.
    Results: For the composite nanoparticle, we confirmed that Fe-MOF-5 has the best nanozyme activity. In addition, it has excellent photothermal conversion efficiency and generates reactive oxygen species (ROS) under near-infrared light irradiation (808 nm). The composite nanoparticle showed a good tumor inhibition effect in vitro and in vivo, superior to free ICG or Fe-MOF-5 alone. Moreover, there was no obvious cytotoxicity in major organs within the effective therapeutic concentration.
    Discussion: Fe-MOF-5 mimics catalase, promoting the decomposition of excess H2O2 in the tumor microenvironment to produce oxygen and thereby relieve hypoxia. The improvement of tumor hypoxia can enhance the efficacy of PDT and PTT. This research not only provides an efficient and stable anti-tumor nanoplatform with broad application prospects in tumor therapy, but also offers a new approach to using MOFs as carrier materials in photodynamic therapy

    Baichuan 2: Open Large-scale Language Models

    Large language models (LLMs) have demonstrated remarkable performance on a variety of natural language tasks based on just a few examples of natural language instructions, reducing the need for extensive feature engineering. However, most powerful LLMs are closed-source or limited in their capability for languages other than English. In this technical report, we present Baichuan 2, a series of large-scale multilingual language models containing 7 billion and 13 billion parameters, trained from scratch on 2.6 trillion tokens. Baichuan 2 matches or outperforms other open-source models of similar size on public benchmarks like MMLU, CMMLU, GSM8K, and HumanEval. Furthermore, Baichuan 2 excels in vertical domains such as medicine and law. We will release all pre-training model checkpoints to benefit the research community in better understanding the training dynamics of Baichuan 2.
    Comment: Baichuan 2 technical report. Github: https://github.com/baichuan-inc/Baichuan

    Neural Machine Translation with Reconstruction

    Although end-to-end Neural Machine Translation (NMT) has achieved remarkable progress in the past two years, it suffers from a major drawback: translations generated by NMT systems often lack adequacy. It has been widely observed that NMT tends to repeatedly translate some source words while mistakenly ignoring other words. To alleviate this problem, we propose a novel encoder-decoder-reconstructor framework for NMT. The reconstructor, incorporated into the NMT model, reconstructs the input source sentence from the hidden layer of the output target sentence, ensuring that the information on the source side is transferred to the target side as much as possible. Experiments show that the proposed framework significantly improves the adequacy of NMT output and achieves superior translation results over state-of-the-art NMT and statistical MT systems
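The training objective behind this framework is additive: the usual translation likelihood plus a reconstruction term, weighted by a hyperparameter (often written λ). A minimal sketch with made-up per-token probabilities standing in for real model outputs:

```python
import math

def nll(probs):
    # Negative log-likelihood of a sequence from per-token probabilities.
    return -sum(math.log(p) for p in probs)

def joint_loss(translation_probs, reconstruction_probs, lam=1.0):
    # Translation loss plus lambda-weighted reconstruction loss: the
    # reconstruction term penalizes target hidden states from which the
    # source sentence cannot be recovered.
    return nll(translation_probs) + lam * nll(reconstruction_probs)

# Invented token probabilities for a 3-token translation and a
# 2-token source reconstruction.
loss = joint_loss([0.9, 0.8, 0.7], [0.6, 0.5], lam=1.0)
```

Setting lam=0 recovers standard NMT training; larger values push the model to preserve more source information, which is how the framework counters over- and under-translation.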

    Assessment of small strain modulus in soil using advanced computational models

    Abstract: The small-strain shear modulus (G_0) of soils is a crucial dynamic parameter that significantly impacts seismic site response analysis and foundation design. G_0 is sensitive to multiple factors, including the soil uniformity coefficient (C_u), void ratio (e), mean particle size (d_50), and confining stress (σ′). This study aims to establish a G_0 database and proposes three advanced computational models for G_0 prediction. Nine performance indicators, including four new indices, are employed to evaluate the models' performance. The XGBoost model outperforms the other two models, with all three models achieving R² values exceeding 0.9, RMSE values below 30, MAE values below 25, VAF values surpassing 80%, and ARE values below 50%. Compared with traditional prediction models based on empirical formulas, the models proposed in this study exhibit better IOS, IOA, a20-index, and PI metric values, with higher prediction accuracy and better generalization ability
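Four of the performance indicators this abstract reports (R², RMSE, MAE, VAF) have standard definitions that can be computed directly. A short sketch on invented G_0 predictions (values in MPa are made up for illustration, not from the study's database):

```python
import math

def metrics(y_true, y_pred):
    n = len(y_true)
    mean_true = sum(y_true) / n
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e ** 2 for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mean_err = sum(errors) / n
    var_err = sum((e - mean_err) ** 2 for e in errors) / n
    var_true = ss_tot / n
    return {
        "R2": 1 - ss_res / ss_tot,            # coefficient of determination
        "RMSE": math.sqrt(ss_res / n),        # root-mean-square error
        "MAE": sum(abs(e) for e in errors) / n,
        "VAF": (1 - var_err / var_true) * 100,  # variance accounted for, %
    }

# Invented measured vs. predicted G_0 values (MPa).
m = metrics([80, 95, 110, 130], [82, 93, 112, 127])
```

The thresholds quoted in the abstract (R² > 0.9, RMSE < 30, MAE < 25, VAF > 80%) are checks on exactly these quantities.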