Neural Responding Machine for Short-Text Conversation
We propose Neural Responding Machine (NRM), a neural network-based response
generator for Short-Text Conversation. NRM takes the general encoder-decoder
framework: it formalizes the generation of a response as a decoding process based
on the latent representation of the input text, while both encoding and
decoding are realized with recurrent neural networks (RNN). The NRM is trained
with a large amount of one-round conversation data collected from a
microblogging service. Empirical study shows that NRM can generate
grammatically correct and content-wise appropriate responses to over 75% of the
input text, outperforming state-of-the-art methods in the same setting, including
retrieval-based and SMT-based models.
Comment: accepted as a full paper at ACL 201
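The encoder-decoder scheme this abstract describes can be sketched with a toy RNN. This is an illustration only, not the authors' implementation: the vocabulary, dimensions, and weights are made up and untrained, and (unlike the real NRM) the encoder and decoder share one set of weights here for brevity.

```python
import math, random

random.seed(0)

VOCAB = ["<eos>", "hello", "how", "are"]  # hypothetical toy vocabulary
H = 3  # hidden size

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

W_xh = rand_matrix(H, len(VOCAB))   # input-to-hidden weights
W_hh = rand_matrix(H, H)            # hidden-to-hidden weights
W_hy = rand_matrix(len(VOCAB), H)   # hidden-to-output weights

def step(h, token_id):
    # one RNN step: h' = tanh(W_xh * onehot(token) + W_hh * h)
    return [math.tanh(W_xh[i][token_id] + sum(W_hh[i][j] * h[j] for j in range(H)))
            for i in range(H)]

def encode(token_ids):
    # run the encoder RNN; the final hidden state is the latent
    # representation of the input text
    h = [0.0] * H
    for t in token_ids:
        h = step(h, t)
    return h

def decode(context, max_len=5):
    # greedy decoding conditioned on the latent representation
    h, out, prev = context, [], 0  # start from <eos>
    for _ in range(max_len):
        h = step(h, prev)
        scores = [sum(W_hy[k][j] * h[j] for j in range(H)) for k in range(len(VOCAB))]
        prev = max(range(len(VOCAB)), key=scores.__getitem__)
        if prev == 0:  # emitted <eos>
            break
        out.append(VOCAB[prev])
    return out

reply = decode(encode([1, 2, 3]))  # toy token ids for "hello how are"
print(reply)
```

With trained weights and a real vocabulary, the same encode-then-decode loop is what generates the responses evaluated in the paper.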
Multimodal Convolutional Neural Networks for Matching Image and Sentence
In this paper, we propose multimodal convolutional neural networks (m-CNNs)
for matching image and sentence. Our m-CNN provides an end-to-end framework
with convolutional architectures to exploit image representation, word
composition, and the matching relations between the two modalities. More
specifically, it consists of one image CNN encoding the image content, and one
matching CNN learning the joint representation of image and sentence. The
matching CNN composes words to different semantic fragments and learns the
inter-modal relations between image and the composed fragments at different
levels, thus fully exploiting the matching relations between image and sentence.
Experimental results on benchmark databases of bidirectional image and sentence
retrieval demonstrate that the proposed m-CNNs can effectively capture the
information necessary for image and sentence matching. Specifically, our
proposed m-CNNs for bidirectional image and sentence retrieval on Flickr30K and
Microsoft COCO databases achieve state-of-the-art performance.
Comment: Accepted by ICCV 201
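The core matching idea, composing word vectors into semantic fragments and scoring them against the image representation, can be sketched minimally. This is a hand-rolled illustration, not the m-CNN architecture: the "convolution" here is just a width-2 average of neighboring word vectors, and the pooling is a max over fragment scores.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def compose(word_vecs, width=2):
    # sliding-window composition of word vectors into "semantic fragments"
    # (a real matching CNN learns this composition with convolution filters)
    return [[(wa + wb) / 2 for wa, wb in zip(word_vecs[i], word_vecs[i + 1])]
            for i in range(len(word_vecs) - width + 1)]

def match_score(image_vec, word_vecs):
    # score every fragment against the image vector, max-pool over fragments
    fragments = compose(word_vecs)
    return max(dot(image_vec, f) for f in fragments)

# toy 2-d embeddings: an image vector and three word vectors
score = match_score([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
print(score)
```

Stacking such composition layers at several window widths is what lets the model capture inter-modal relations at word, phrase, and sentence level.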
Paraphrase Generation with Deep Reinforcement Learning
Automatic generation of paraphrases from a given sentence is an important yet
challenging task in natural language processing (NLP), and plays a key role in
a number of applications such as question answering, search, and dialogue. In
this paper, we present a deep reinforcement learning approach to paraphrase
generation. Specifically, we propose a new framework for the task, which
consists of a \textit{generator} and an \textit{evaluator}, both of which are
learned from data. The generator, built as a sequence-to-sequence learning
model, can produce paraphrases given a sentence. The evaluator, constructed as
a deep matching model, can judge whether two sentences are paraphrases of each
other. The generator is first trained by deep learning and then further
fine-tuned by reinforcement learning in which the reward is given by the
evaluator. For the learning of the evaluator, we propose two methods based on
supervised learning and inverse reinforcement learning respectively, depending
on the type of available training data. Empirical study shows that the learned
evaluator can guide the generator to produce more accurate paraphrases.
Experimental results demonstrate the proposed models (the generators)
outperform the state-of-the-art methods in paraphrase generation in both
automatic evaluation and human evaluation.
Comment: EMNLP 201
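The fine-tuning loop described above, a generator updated by policy gradient with the evaluator supplying the reward, can be illustrated with a deliberately tiny REINFORCE example. Everything here is hypothetical: the "generator" is a one-parameter Bernoulli policy choosing between two candidate paraphrases, and the "evaluator" simply rewards candidate 1.

```python
import math, random

random.seed(1)

def evaluator(candidate):
    # stands in for the deep matching model: rewards the better paraphrase
    return 1.0 if candidate == 1 else 0.0

theta, lr = 0.0, 0.5  # single policy logit and learning rate
for _ in range(200):
    p1 = 1 / (1 + math.exp(-theta))        # P(generate candidate 1)
    a = 1 if random.random() < p1 else 0   # sample a paraphrase
    r = evaluator(a)                       # reward from the evaluator
    # REINFORCE update: d/dtheta log pi(a) = (a - p1) for a Bernoulli policy
    theta += lr * (a - p1) * r

final_p = 1 / (1 + math.exp(-theta))
print(final_p)  # probability mass shifts toward the rewarded paraphrase
```

The same mechanism, with a sequence-to-sequence generator and a learned matching-model reward, is what lets the evaluator steer the generator toward more accurate paraphrases.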
Neural Generative Question Answering
This paper presents an end-to-end neural network model, named Neural
Generative Question Answering (GENQA), that can generate answers to simple
factoid questions, based on the facts in a knowledge-base. More specifically,
the model is built on the encoder-decoder framework for sequence-to-sequence
learning, equipped with the ability to query the knowledge-base, and is
trained on a corpus of question-answer pairs, with their associated triples in
the knowledge-base. Empirical study shows the proposed model can effectively
deal with the variations of questions and answers, and generate correct and
natural answers by referring to the facts in the knowledge-base. The experiment
on question answering demonstrates that the proposed model can outperform an
embedding-based QA model as well as a neural dialogue model trained on the same
data.
Comment: Accepted by IJCAI 201
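The retrieve-then-generate behavior, querying the knowledge-base and weaving the retrieved fact into a generated sentence, can be sketched as follows. The knowledge-base, the string matching, and the answer template are all invented for illustration; GENQA learns both the retrieval and the generation end-to-end rather than using rules.

```python
# hypothetical toy knowledge-base of (subject, predicate, object) triples
KB = [
    ("Beijing", "capital_of", "China"),
    ("Paris", "capital_of", "France"),
]

def answer(question):
    # "query" the knowledge-base: find the triple whose object entity
    # appears in the question and matches the asked-about relation
    for subj, pred, obj in KB:
        if obj in question and "capital" in question:
            # the decoder mixes a KB word (subj) into a generated sentence
            return f"The capital of {obj} is {subj}."
    return "I don't know."

print(answer("What is the capital of France?"))
```

In the real model, the decision of when to emit a common word versus a KB entity is made softly at each decoding step, which is what handles the variations in question phrasing.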
Fe/MOF based platform for NIR laser induced efficient PDT/PTT of cancer
Introduction: Photodynamic therapy (PDT) and photothermal therapy (PTT) are widely used in the treatment of tumors. However, their clinical application is limited by the complex and irreversible hypoxic environment generated by tumor tissues. To overcome this limitation, a nanoparticle composed of indocyanine green (ICG) and Fe-MOF-5 was developed.
Methods: We prepared F-I@FM5 and measured its morphology, particle size, and stability. Its enzyme-like activity and optical effects were verified. We then used MTT assays, staining, and flow cytometry to evaluate the anti-tumor effect on EMT-6 cells in vitro. Finally, the anti-tumor effect in vivo was studied on EMT-6 tumor-bearing mice.
Results: For the composite nanoparticle, we confirmed that Fe-MOF-5 has the best nanozyme activity. In addition, it has excellent photothermal conversion efficiency and generates reactive oxygen species (ROS) under near-infrared light irradiation (808 nm). The composite nanoparticle showed a good tumor inhibition effect in vitro and in vivo, superior to free ICG or Fe-MOF-5 alone. Moreover, there was no obvious cytotoxicity in major organs within the effective therapeutic concentration.
Discussion: Fe-MOF-5 mimics catalase, promoting the decomposition of excess H2O2 in the tumor microenvironment and producing oxygen to relieve hypoxia. Improving tumor hypoxia enhances the efficacy of PDT and PTT. This research not only provides an efficient and stable anti-tumor nano-platform with broad application prospects in tumor therapy, but also offers a new idea for the application of MOFs as carrier materials in photodynamic therapy.
Baichuan 2: Open Large-scale Language Models
Large language models (LLMs) have demonstrated remarkable performance on a
variety of natural language tasks based on just a few examples of natural
language instructions, reducing the need for extensive feature engineering.
However, most powerful LLMs are closed-source or limited in their capability
for languages other than English. In this technical report, we present Baichuan
2, a series of large-scale multilingual language models containing 7 billion
and 13 billion parameters, trained from scratch on 2.6 trillion tokens.
Baichuan 2 matches or outperforms other open-source models of similar size on
public benchmarks like MMLU, CMMLU, GSM8K, and HumanEval. Furthermore, Baichuan
2 excels in vertical domains such as medicine and law. We will release all
pre-training model checkpoints to benefit the research community in better
understanding the training dynamics of Baichuan 2.
Comment: Baichuan 2 technical report. Github:
https://github.com/baichuan-inc/Baichuan
Predicting the concentration of indoor culturable fungi using a kernel-based extreme learning machine (K-ELM)
Neural Machine Translation with Reconstruction
Although end-to-end Neural Machine Translation (NMT) has achieved remarkable progress in the past two years, it suffers from a major drawback: translations generated by NMT systems often lack adequacy. It has been widely observed that NMT tends to repeatedly translate some source words while mistakenly ignoring other words. To alleviate this problem, we propose a novel encoder-decoder-reconstructor framework for NMT. The reconstructor, incorporated into the NMT model, reconstructs the input source sentence from the hidden layer of the output target sentence, to ensure that the information on the source side is carried over to the target side as much as possible. Experiments show that the proposed framework significantly improves the adequacy of NMT output and achieves superior translation results over state-of-the-art NMT and statistical MT systems.
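The training objective this framework implies, a translation likelihood plus a reconstruction term, can be written down directly. The probabilities and the weight lambda below are toy numbers for illustration; in the paper both terms come from neural models and are optimized jointly.

```python
import math

def combined_loss(p_target_given_source, p_source_reconstructed, lam=1.0):
    # negative log-likelihood of the translation, plus a reconstruction
    # term scoring how well the source can be rebuilt from the decoder's
    # hidden states, weighted by lambda
    translation_loss = -math.log(p_target_given_source)
    reconstruction_loss = -math.log(p_source_reconstructed)
    return translation_loss + lam * reconstruction_loss

# a translation that ignores source words reconstructs the source poorly,
# so it is penalized even when its target-side likelihood is identical
adequate = combined_loss(0.6, 0.5)
fluent_but_inadequate = combined_loss(0.6, 0.1)
print(adequate, fluent_but_inadequate)
```

This is exactly why the reconstruction term discourages the under-translation behavior described above: dropping source content leaves the translation likelihood untouched but inflates the reconstruction loss.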
Assessment of small strain modulus in soil using advanced computational models
Abstract: The small-strain shear modulus (G0) of soils is a crucial dynamic parameter that significantly impacts seismic site response analysis and foundation design. G0 is sensitive to multiple factors, including the soil uniformity coefficient (Cu), void ratio (e), mean particle size (d50), and confining stress (σ′). This study establishes a G0 database and proposes three advanced computational models for G0 prediction. Nine performance indicators, including four new indices, are employed to calculate and analyze model performance. The XGBoost model outperforms the other two, with all three models achieving R² values exceeding 0.9, RMSE values below 30, MAE values below 25, VAF values surpassing 80%, and ARE values below 50%. Compared with traditional empirical-formula-based prediction models, the proposed model performs better on the IOS, IOA, a20-index, and PI metrics, with higher prediction accuracy and better generalization ability.
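Several of the performance indicators quoted in this abstract have standard definitions that can be computed directly. The sketch below implements R², RMSE, MAE, and VAF from scratch (the remaining indices, e.g. the a20-index and PI, follow the same pattern); the example arrays are made-up values, not data from the study.

```python
import math

def metrics(y_true, y_pred):
    # R², RMSE, MAE, and VAF for a regression model such as an XGBoost G0 predictor
    n = len(y_true)
    mean_t = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)             # total sum of squares
    r2 = 1 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    # VAF: share of the target's variance not left in the prediction error, as a percentage
    err = [t - p for t, p in zip(y_true, y_pred)]
    mean_e = sum(err) / n
    var_e = sum((e - mean_e) ** 2 for e in err) / n
    var_t = ss_tot / n
    vaf = (1 - var_e / var_t) * 100
    return r2, rmse, mae, vaf

# illustrative G0 values in MPa (not from the paper's database)
r2, rmse, mae, vaf = metrics([110.0, 140.0, 170.0], [112.0, 138.0, 171.0])
print(r2, rmse, mae, vaf)
```

A perfect predictor gives R² = 1, RMSE = MAE = 0, and VAF = 100%, which is the reference point against which the thresholds reported above (R² > 0.9, VAF > 80%) are read.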