
    Tibetan Sentence Boundaries Automatic Disambiguation Based on Bidirectional Encoder Representations from Transformers on Byte Pair Encoding Word Cutting Method

    Sentence Boundary Disambiguation (SBD) is crucial for building datasets for tasks such as machine translation, syntactic analysis, and semantic analysis. Most automatic sentence segmentation for Tibetan currently relies on rule-based methods, statistical learning, or a combination of the two; these approaches place high demands on the corpus and on the researchers' linguistic expertise, and manual annotation is costly. In this study, we explore Tibetan SBD using deep learning. First, we analyze the characteristics of Tibetan and various subword techniques, selecting Byte Pair Encoding (BPE) and SentencePiece (SP) for text segmentation, and train Bidirectional Encoder Representations from Transformers (BERT) pre-trained language models. Second, we study Tibetan SBD with different BERT pre-trained language models: the model learns the ambiguity of the shad ("།") at different positions in modern Tibetan texts and determines whether a given shad functions as a sentence delimiter. We also introduce four BERT-based models, BERT-CNN, BERT-RNN, BERT-RCNN, and BERT-DPCNN, for performance comparison. Finally, to verify the performance of pre-trained language models on the SBD task, we conduct experiments with both the publicly available Tibetan pre-trained language model TiBERT and the multilingual pre-trained language model Multi-BERT. The experimental results show that the F1 score of our BERT (BPE) model reaches 95.32% on 465,669 Tibetan sentences, nearly five percentage points higher than BERT (SP) and Multi-BERT. The SBD method based on pre-trained language models presented here lays the foundation for building datasets for downstream Tibetan tasks such as pre-training, summary extraction, and machine translation.
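
    As a rough illustration of the setup the abstract describes, the sketch below frames each shad occurrence as binary sequence classification over a local context window with a BERT encoder. The Hugging Face stack, the multilingual stand-in checkpoint, and the 32-character window are our assumptions for the sketch, not details taken from the paper.

```python
# Minimal sketch: shad ("\u0F0D") disambiguation as binary classification with BERT.
# The checkpoint name, window width, and label convention are illustrative only.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

CHECKPOINT = "bert-base-multilingual-cased"  # stand-in for a Tibetan BPE model
SHAD = "\u0F0D"  # Tibetan mark shad

tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForSequenceClassification.from_pretrained(CHECKPOINT, num_labels=2)

def shad_windows(text, width=32):
    """Yield a fixed character window around every shad occurrence."""
    for i, ch in enumerate(text):
        if ch == SHAD:
            yield text[max(0, i - width): i + width + 1]

@torch.no_grad()
def is_sentence_boundary(window):
    """Label 1 = this shad ends a sentence, label 0 = it does not."""
    inputs = tokenizer(window, return_tensors="pt", truncation=True)
    logits = model(**inputs).logits
    return int(logits.argmax(dim=-1))

# Usage: after fine-tuning on shad-annotated text, classify each occurrence.
# for w in shad_windows(tibetan_text):
#     print(w, is_sentence_boundary(w))
```

    In this framing, fine-tuning on shad-annotated sentences is an ordinary sequence-classification run; swapping the tokenizer between a BPE and a SentencePiece vocabulary reproduces the paper's main comparison axis.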

    MHlinker: Research on a Joint Extraction Method of Fault Entity Relationship for Mine Hoist

    Triplet extraction is a key technology for automatically constructing knowledge graphs, and extracting fault-relationship triplets for mechanical equipment is of great significance for the fault diagnosis of mine hoists. Pipeline-based triplet extraction suffers from error accumulation and information redundancy, while existing joint learning methods cannot handle fault texts with many overlapping relations and ignore the particularities of domain knowledge about complex mechanical equipment faults. Therefore, building on the Chinese pre-trained language model BERT Whole Word Masking (BERT-wwm), this paper proposes MHlinker (Mine Hoist linker), a joint entity and relation extraction model for the mine hoist fault domain. The method uses BERT-wwm as the underlying encoder; in the entity recognition stage, it constructs a classification matrix using the multi-head extraction paradigm, which effectively solves the problem of nested entities, as sketched below. The results show that this method enhances the model's overall ability to extract fault relationships: on a small manually labeled dataset of mine hoist fault texts, entity and relation extraction improve significantly over several baseline models.
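
    To make the entity stage concrete, here is a minimal sketch of a multi-head span-classification matrix over a BERT-wwm encoder, in the spirit of the approach the abstract describes. The checkpoint name, head dimension, and thresholded decoding are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal sketch: an L x L span-classification matrix for nested entity
# recognition over BERT-wwm. Hyperparameters below are assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel

class SpanMatrixNER(nn.Module):
    def __init__(self, encoder_name="hfl/chinese-bert-wwm-ext", head_dim=64):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        self.q = nn.Linear(hidden, head_dim)  # start-token projection
        self.k = nn.Linear(hidden, head_dim)  # end-token projection

    def forward(self, input_ids, attention_mask):
        h = self.encoder(input_ids=input_ids,
                         attention_mask=attention_mask).last_hidden_state
        # Score every (start, end) token pair: an L x L classification matrix.
        scores = torch.einsum("bid,bjd->bij", self.q(h), self.k(h))
        # Only spans with start <= end are valid candidates.
        L = scores.size(-1)
        valid = torch.triu(torch.ones(L, L, dtype=torch.bool,
                                      device=scores.device))
        return scores.masked_fill(~valid, float("-inf"))

# Decoding: any cell (i, j) above a tuned threshold is read as an entity
# span [i, j]; overlapping cells can fire together, which is how nested
# entities are recovered without BIO-style tagging conflicts.
```

    The design choice worth noting is that the matrix scores spans rather than individual tokens, so two entities where one contains the other simply correspond to two active cells.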

    Generation of Hepatocytes and Nonparenchymal Cell Codifferentiation System from Human-Induced Pluripotent Stem Cells

    To date, hepatocytes derived from human-induced pluripotent stem cells (hiPSCs) provide a potentially unlimited resource for clinical application and drug development. However, most hiPSC-derived hepatocyte-like cells are differentiated from highly purified definitive endoderm, which cannot accurately replicate the complex signaling among multiple cells and tissues during liver organogenesis; the resulting cells therefore display an immature phenotype and a short survival time in vitro. Here, we describe a protocol that achieves codifferentiation of endoderm-derived hepatocytes and mesoderm-derived nonparenchymal cells by including BMP4 in the hepatic differentiation medium, which benefits hepatocyte maturation and lifespan in vitro. Our codifferentiation system highlights the important role of nonparenchymal cells in liver organogenesis, and the hepatocytes described here may provide a promising approach to the therapy of liver diseases.

    Melatonin inhibits bladder tumorigenesis by suppressing PPARγ/ENO1-mediated glycolysis

    Melatonin is a well-known natural hormone with a potential anticancer effect in many human cancers. Bladder cancer (BLCA) is one of the most malignant human cancers worldwide, and chemoresistance is an increasingly prominent obstacle to its clinical treatment, so there is an urgent need to investigate novel drugs. In this study, we comprehensively explored the inhibitory effect of melatonin on BLCA and found that it suppresses glycolysis. Moreover, we discovered that ENO1, the glycolytic enzyme catalyzing the ninth step of glycolysis, is the downstream effector of melatonin and could serve as a predictive biomarker of BLCA. We also showed that enhanced glycolysis, simulated by adding exogenous pyruvate, induces gemcitabine resistance, and that melatonin treatment or silencing of ENO1 intensifies the cytotoxic effect of gemcitabine on BLCA cells. Excessive accumulation of reactive oxygen species (ROS) mediates the inhibitory effect of melatonin on BLCA cells. Additionally, we uncovered PPARγ as a novel upstream regulator of ENO1 that mediates the melatonin-induced downregulation of ENO1. Our study offers a fresh perspective on the anticancer effect of melatonin and encourages further studies on clinical chemoresistance.