Assertion Detection Large Language Model In-context Learning LoRA Fine-tuning
In this study, we aim to address the task of assertion detection when
extracting medical concepts from clinical notes, a key process in clinical
natural language processing (NLP). Assertion detection in clinical NLP usually
involves identifying assertion types for medical concepts in the clinical text,
namely certainty (whether the medical concept is positive, negated, possible,
or hypothetical), temporality (whether the medical concept refers to the
present or to past history), and experiencer (whether the medical concept is described
for the patient or a family member). These assertion types are essential for
healthcare professionals to quickly and clearly understand the context of
medical conditions from unstructured clinical texts, directly influencing the
quality and outcomes of patient care. Although widely used, traditional
methods, particularly rule-based NLP systems and machine learning or deep
learning models, demand intensive manual effort to create patterns and tend to
overlook less common assertion types, leading to an incomplete understanding of
the context. To address this challenge, our research introduces a novel
methodology that utilizes Large Language Models (LLMs) pre-trained on a vast
array of medical data for assertion detection. We enhanced the current method
with advanced reasoning techniques, including Tree of Thought (ToT), Chain of
Thought (CoT), and Self-Consistency (SC), and refined it further with Low-Rank
Adaptation (LoRA) fine-tuning. We first evaluated the model on the i2b2 2010
assertion dataset. Our method achieved a micro-averaged F-1 of 0.89, a 0.11
improvement over previous work. To further assess the generalizability of
our approach, we extended our evaluation to a local dataset that focused on
sleep concept extraction. Our approach achieved an F-1 of 0.74, which is 0.31
higher than the previous method.
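To make the fine-tuning step concrete, the following is a minimal sketch of LoRA adaptation using the Hugging Face peft library; the base model (a clinical BERT checkpoint), the target modules and all hyperparameters are illustrative assumptions, not the configuration reported here.

```python
# Minimal sketch: LoRA fine-tuning for assertion classification. The
# backbone, target modules and hyperparameters are illustrative
# placeholders, not the configuration used in the study.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForSequenceClassification, AutoTokenizer

base = "emilyalsentzer/Bio_ClinicalBERT"  # assumed clinical backbone
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSequenceClassification.from_pretrained(
    base, num_labels=6  # the six i2b2 2010 assertion classes
)

lora_config = LoraConfig(
    r=8,                                # low-rank adapter dimension
    lora_alpha=16,                      # adapter scaling factor
    lora_dropout=0.1,
    target_modules=["query", "value"],  # attention projections to adapt
    task_type="SEQ_CLS",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()      # only the small adapters train
```

Because LoRA freezes the backbone and trains only the low-rank adapters, the number of trainable parameters printed above is a small fraction of the full model, which is what makes this kind of adaptation practical for large medical models.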
Long Non-Coding RNA PVT1/miR-150/HIG2 Axis Regulates the Proliferation, Invasion and the Balance of Iron Metabolism of Hepatocellular Carcinoma
Background/Aims: To investigate the biological roles and underlying molecular mechanisms of the long non-coding RNA (lncRNA) PVT1 in hepatocellular carcinoma (HCC). Methods: qRT-PCR was performed to measure miRNA and mRNA expression. Western blot was performed to measure protein expression. CCK-8 assay was performed to determine cell proliferation. Flow cytometry was performed to detect cell apoptosis. Wound-healing and Transwell assays were performed to detect cell migration and invasion. Dual luciferase reporter assay was performed to verify the target relationship. QuantiChrom iron assay was performed to measure cellular iron uptake. Results: PVT1 expression was up-regulated in HCC tissues and cell lines. Functional studies revealed that PVT1 knockdown significantly suppressed cell proliferation, migration and invasion, and induced cell apoptosis in vitro. Furthermore, PVT1 could directly bind to microRNA (miR)-150 and down-regulate miR-150 expression. Hypoxia-inducible protein 2 (HIG2) was found to be a target gene of miR-150, and PVT1 knockdown could inhibit HIG2 expression by up-regulating miR-150. In addition, miR-150 expression was down-regulated, while HIG2 expression was up-regulated, in HCC tissues and cell lines. Moreover, inhibition of miR-150 could partly reverse the biological effects of PVT1 knockdown on proliferation, motility, apoptosis and iron metabolism in vitro, which might be associated with dysregulation of HIG2. In vivo results showed that PVT1 knockdown suppressed tumorigenesis and iron metabolism disorder by regulating the expression of miR-150 and HIG2. Conclusion: Taken together, the present study demonstrates that the PVT1/miR-150/HIG2 axis regulates HCC progression and iron metabolism, which may lead to a better understanding of HCC pathogenesis and provide potential therapeutic targets for HCC.
Optimization of Warehouse Picking Based on TSP
Modern warehouse management has entered the era of intelligent information systems, which requires a set of accurate, reasonable and efficient warehouse management methods to improve worker efficiency. In this paper, a series of TSP (traveling salesman problem) algorithms is used to optimize the efficiency of picking goods out of the warehouse. The Floyd (Floyd-Warshall) algorithm was used to solve the distance problem between cargo slots and recheck stations in problem 1, and a genetic algorithm for the TSP was applied to the specific problems in problems 2, 3 and 4.

For problem 1, by observing the relation between the given coordinates and the actual routes, the cargo-slot coordinates were divided into odd and even columns, and the two kinds of coordinates were processed separately. After this coordinate processing, the distance matrix between the slots was obtained with the Floyd algorithm. Because the recheck stations are few and regularly distributed, the distance matrix between each recheck station and each cargo slot was obtained by manually classifying the station coordinates and applying the Floyd algorithm again. Outputting all the matrices into one EXCEL table gives the distance matrix over all 3013 elements (see Appendix 1 for the distance matrix and the algorithm).

For problem 2, macro analysis shows that this is a one-way TSP. The distance matrix of the required points was taken from the solution of problem 1 and imported into a LINGO program, in which 0-1 decision variables and an objective function were set up and solved under the principles of single-direction connection and loop-breaking, yielding the connection sequence and loop distance. The cargo slot that would return to the initial recheck station is instead connected to its nearest recheck station, which breaks the loop and gives the final connection sequence and total distance. The outbound time is divided into three parts, (1) travel time, (2) picking time and (3) packing time, which are calculated by group and summed. The total distance is 382.5 m (about 1254.92 ft) and the minimum total time is 462 seconds (about 7.7 minutes).

For problem 3, the distance matrix of the required points was again taken from the solution of problem 1 and imported into LINGO, and 0-1 decision variables and an objective function were solved under the same single-direction connection and loop-breaking principles, with the loop broken as in problem 2 to obtain the connection sequence and total distance. Compared with problem 2, problem 3 specifies which recheck stations are available and asks for the total time needed to complete the shipment. Each task order was classified and analysed separately (a double-starting-point operation), the overall route was optimized according to the start and end points of each task order, and the shortest total time obtained was 2288.6 seconds (about 38 minutes) (see the attachment for the results).

For problem 4, queuing theory is needed. To reduce the total outbound time, orders with shorter completion times must be given priority. First, the shortest outbound time of each of the 49 task orders is calculated and the orders are sorted from shortest to longest. Then, following this order, nine pickers pick goods in sequence; a variable is introduced to record which recheck stations are occupied and the remaining time of each current occupation.
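To make the distance-matrix step concrete, here is a minimal sketch of the Floyd-Warshall all-pairs shortest-path computation in Python; the toy graph, weights and node count are illustrative placeholders, not the warehouse layout from the paper.

```python
# Minimal sketch of the Floyd-Warshall all-pairs shortest-path step used
# to build the slot-to-slot and slot-to-station distance matrix. The toy
# graph below is a placeholder, not the paper's warehouse layout.
import math

def floyd_warshall(n, edges):
    """edges: iterable of (i, j, weight); returns an n x n distance matrix."""
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
    for i, j, w in edges:
        dist[i][j] = min(dist[i][j], w)
        dist[j][i] = min(dist[j][i], w)  # aisles can be walked both ways
    for k in range(n):                   # allow k as an intermediate stop
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

# Toy usage: 4 locations connected by weighted aisle segments (metres).
d = floyd_warshall(4, [(0, 1, 2.5), (1, 2, 1.0), (2, 3, 4.0), (0, 3, 9.0)])
print(d[0][3])  # shortest walk from location 0 to location 3 -> 7.5
```

The cubic loop structure is why the paper computes the full matrix once up front and then reuses it in problems 2 through 4 rather than recomputing shortest paths per order.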
The Smart Grids in China—A Review
The concept of the smart grid has been gaining more and more attention worldwide since it was proposed by the U.S. Electric Power Research Institute in 2001. Recently, it has been propelled again by the promotion of low-carbon economies in developing countries. To satisfy the exponential increase in electricity demand and alleviate the environmental degradation caused by fossil fuel-based power generation, China has made great efforts to construct a smart grid as a substitute for the traditional energy-intensive power grid. In the 12th Five-Year Plan in particular, it was stated that emphasis should be placed on the development of renewable energy and smart grids. The objective of this paper is to provide insight into current research on smart grids and to shed light on the development of smart grids in China, and, based on this analysis, to identify the obstacles and barriers in the development process. Finally, policy prospects for the construction of smart grids in China are proposed from the aspects of technology, administration and management.
Credit Decision Problems of MSMEs (Medium, Small and Micro-sized Enterprises)
Micro, small and medium enterprises (MSMEs) have become an important force driving the country's market economy in the 21st century. However, because of drawbacks such as a fragile single-enterprise capital chain, unstable operations and high risk, banks take on considerable risk when lending to MSMEs. Therefore, it is necessary to build a sound bank credit decision system to promote the development of MSMEs. The analytic hierarchy process (AHP) is employed to organize the indicators by importance: credit rating and enterprise strength are used as first-level indexes, and six indicators such as total sales and total profit are used as second-level indexes. Then, the eigenvalue method is used to obtain the importance weights of the indicators at each level, and the weights of each influencing factor are calculated to achieve a quantitative analysis and rating of each enterprise's credit risk. This paper combines existing loan pricing and loan interest rates to give preferential interest rates and higher loan amounts to enterprises with excellent credit risk ratings, and risk-adjusted interest rates and lower loan amounts to enterprises with medium credit risk ratings. Based on the model, a quantitative analysis of the credit risk of 302 enterprises without credit records is carried out and the bank's credit strategy is provided for a total annual credit of 100 million yuan. Finally, this paper comprehensively considers the impact of credit risk and unexpected factors (e.g., COVID-19) on enterprises, and provides the bank's credit adjustment strategy when the total annual credit is 100 million yuan.
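As an illustration of the AHP eigenvalue step, the sketch below derives importance weights as the principal eigenvector of a pairwise comparison matrix; the 3x3 judgments are made-up examples rather than the paper's actual comparisons.

```python
import numpy as np

# Pairwise comparison matrix (Saaty scale); these 3x3 judgments are
# made-up examples, not the paper's actual comparisons.
A = np.array([
    [1.0, 3.0, 5.0],   # e.g. credit rating vs. the other indexes
    [1/3, 1.0, 2.0],
    [1/5, 1/2, 1.0],
])

eigvals, eigvecs = np.linalg.eig(A)
k = int(np.argmax(eigvals.real))        # principal eigenvalue lambda_max
w = np.abs(eigvecs[:, k].real)
w = w / w.sum()                         # normalized importance weights

# Consistency index CI = (lambda_max - n) / (n - 1); in full AHP it is
# compared against a random index (RI) to accept or reject the judgments.
n = A.shape[0]
ci = (eigvals[k].real - n) / (n - 1)
print("weights:", w.round(3), "CI:", round(ci, 4))
```

Applying this weighting at both the first and second index levels, and multiplying the weights down the hierarchy, gives the composite score used to rank each enterprise's credit risk.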
Robust Textual Data Streams Mining Based on Continuous Transfer Learning
In a textual data stream environment, concept drift can occur at any time. Existing approaches that partition streams into chunks run into problems when a chunk boundary does not coincide with the change point, which is impossible to predict. Since concept drift can occur at any point of a stream, it will inevitably occur within chunks; this is called random concept drift. This paper proposes a chunk level-based concept drift method (CLCD) that overcomes this chunking problem by continuously monitoring chunk characteristics and revising the classifier based on transfer learning in a positive and unlabeled (PU) textual data stream environment. The proposed approach works in three steps. In the first step, we propose core vocabulary-based criteria to identify random concept drift. In the second step, we put forward an extension of LELC (PU learning by extracting likely positive and negative microclusters) [1], called soft-LELC, to extract representative examples from unlabeled data and assign a confidence score to each extracted example; the confidence score represents the degree to which an example belongs to its corresponding class. In the third step, we set up a transfer learning-based SVM to build an accurate classifier for the chunks where concept drift was identified in the first step. Extensive experiments show that CLCD can capture random concept drift and outperforms state-of-the-art methods in positive and unlabeled textual data stream environments.
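The abstract does not spell out the core-vocabulary criteria, so the following is only a plausible minimal sketch of a chunk-level drift check, using top-k frequent tokens and a Jaccard-overlap threshold as assumed stand-ins for the paper's actual criteria.

```python
# Assumed sketch of a core-vocabulary drift check between segments of a
# chunk; the tokenization, top_k and threshold are illustrative choices,
# not CLCD's published criteria.
from collections import Counter

def core_vocab(docs, top_k=50):
    """Top-k most frequent tokens across a batch of documents."""
    counts = Counter(tok for doc in docs for tok in doc.lower().split())
    return {tok for tok, _ in counts.most_common(top_k)}

def drift_detected(prev_docs, curr_docs, threshold=0.6):
    """Flag drift when the core vocabularies of two segments diverge."""
    a, b = core_vocab(prev_docs), core_vocab(curr_docs)
    union = a | b
    jaccard = len(a & b) / len(union) if union else 1.0
    return jaccard < threshold

# Toy usage: compare the two halves of a chunk, so drift that occurs
# mid-chunk (random concept drift) can still be caught.
first_half = ["stocks rally as markets open", "tech shares climb again"]
second_half = ["team wins the league final", "coach praises the defence"]
print(drift_detected(first_half, second_half))  # True: topics diverged
```

Monitoring within chunks in this way, rather than only at chunk boundaries, is what lets a chunk-level method react to a change point that falls anywhere inside a chunk.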