49 research outputs found

    LEEC: A Legal Element Extraction Dataset with an Extensive Domain-Specific Label System

    As a pivotal task in natural language processing, element extraction has gained significance in the legal domain. Extracting legal elements from judicial documents helps enhance the interpretative and analytical capacities of legal cases, thereby facilitating a wide array of downstream applications in various domains of law. Yet existing element extraction datasets are limited by their restricted access to legal knowledge and insufficient coverage of labels. To address this shortfall, we introduce a more comprehensive, large-scale criminal element extraction dataset, comprising 15,831 judicial documents and 159 labels. This dataset was constructed in two main steps: first, our team of legal experts designed the label system based on prior legal research that identified the critical factors driving, and the processes generating, sentencing outcomes in criminal cases; second, this legal knowledge was employed to annotate judicial documents according to the label system and annotation guideline. The Legal Element ExtraCtion dataset (LEEC) represents the most extensive and domain-specific legal element extraction dataset for the Chinese legal system. Leveraging the annotated data, we employed various SOTA models, which validate the applicability of LEEC for the Document Event Extraction (DEE) task. The LEEC dataset is available at https://github.com/THUlawtech/LEEC

    Expression of the chemokine receptor CXCR4 in human hepatocellular carcinoma and its role in portal vein tumor thrombus

    Background: This study was conducted to investigate the expression of CXCR4 in portal vein tumor thrombus (PVTT) tissue and its possible role in the invasiveness of tumor thrombus cells. Methods: We detected differential expression of CXCR4 between PVTT and hepatocellular carcinoma (HCC) by an immunohistochemical assay. Lentivirus-mediated RNA interference and a migration assay were performed on human primary cells derived from PVTT to study the impact of CXCR4 on the invasiveness of HCC. Results: The expression of CXCR4 in tumor thrombus tissue was higher than that in HCC tissue. The invasion ratio of PVTT cells was significantly decreased (P < 0.05) after being infected with a CXCR4-targeting siRNA lentivirus, indicating that downregulation of CXCR4 by lentivirus-mediated RNA interference significantly impaired the invasive potential of PVTT. Conclusions: These results indicate that CXCR4 is an effective curative target for hepatocellular carcinomas with PVTT.

    Improving alignment accuracy on homopolymer regions for semiconductor-based sequencing technologies

    BACKGROUND: Ion Torrent and Ion Proton are semiconductor-based sequencing technologies that feature rapid sequencing speed and low upfront and operating costs, thanks to the avoidance of modified nucleotides and optical measurements. Despite these advantages, however, Ion semiconductor sequencing technologies suffer from much-reduced sequencing accuracy at genomic loci with homopolymer repeats of the same nucleotide. This limitation significantly reduces their effectiveness for biological applications that aim to identify genetic variants accurately. RESULTS: In this study, we propose a Bayesian inference-based method that takes advantage of the signal distributions of the electrical voltages measured for all homopolymers of a fixed length. By cross-referencing the length of homopolymers in the reference genome with the voltage signal distribution derived from the experiment, the proposed integrated model significantly improves the alignment accuracy around homopolymer regions. CONCLUSIONS: Besides improving alignment accuracy on homopolymer regions for semiconductor-based sequencing technologies with the proposed model, similar strategies can also be used on other high-throughput sequencing technologies that share similar limitations
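
    The abstract does not give the model itself, so the following is only a minimal sketch of the general idea of combining a per-length voltage signal distribution with the homopolymer length in the reference genome. The Gaussian likelihood, the numeric signal parameters, the prior values, and the function names are all illustrative assumptions, not the authors' method.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution; assumed shape of the per-length signal."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def homopolymer_posterior(signal, signal_models, prior):
    """Return P(length | signal) for each candidate homopolymer length.

    signal_models maps length -> (mean, std) of the measured signal (hypothetical values).
    prior maps length -> prior probability, e.g. peaked at the reference-genome length.
    """
    unnormalized = {
        length: gaussian_pdf(signal, *signal_models[length]) * prior[length]
        for length in prior
    }
    total = sum(unnormalized.values())
    return {length: p / total for length, p in unnormalized.items()}


# Hypothetical signal distributions: the spread grows with homopolymer length.
signal_models = {3: (3.0, 0.3), 4: (4.0, 0.4), 5: (5.0, 0.5)}
flat_prior = {3: 1 / 3, 4: 1 / 3, 5: 1 / 3}
reference_prior = {3: 0.05, 4: 0.90, 5: 0.05}  # the reference genome says length 4

for name, prior in (("flat", flat_prior), ("reference", reference_prior)):
    posterior = homopolymer_posterior(3.4, signal_models, prior)
    print(name, max(posterior, key=posterior.get))
```

    With these made-up numbers, an ambiguous signal of 3.4 is called as length 3 on the signal alone but as length 4 once the reference length enters through the prior, which is the kind of cross-referencing the abstract describes.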

    Virtual AP based indoor localization in area without linear constraints

    For narrow areas such as corridors, the positioning accuracy can be significantly improved by using behavioral landmarks and electronic indoor maps. However, in wide indoor areas without linear constraints, such as offices, shops and airport halls, the use of inertial sensors and indoor maps cannot achieve a significant performance gain. A virtual access point (AP) is the "virtual position" of an AP, calculated from reference points by using a simplified formula of wireless signal attenuation. Based on this, a fingerprint point clustering algorithm based on virtual AP coordinates is proposed for the offline phase. An AP selection algorithm based on the eight-diagram is proposed to reduce the amount of calculation in online positioning. Finally, experiments conducted in different office buildings demonstrate that the positioning accuracy of the proposed algorithms is considerably better than that of the traditional methods
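
    The abstract does not state which attenuation formula is used. As a rough illustration only, the sketch below assumes the common log-distance path-loss model, RSS(d) = RSS(d0) - 10 n log10(d / d0), and inverts it to turn a received signal strength into a distance. The reference power, path-loss exponent, and function names are hypothetical.

```python
import math


def rss_at_distance(distance_m, rss_ref_dbm=-40.0, path_loss_exp=2.0, ref_distance_m=1.0):
    """Predicted RSS (dBm) under the assumed log-distance path-loss model."""
    return rss_ref_dbm - 10.0 * path_loss_exp * math.log10(distance_m / ref_distance_m)


def distance_from_rss(rss_dbm, rss_ref_dbm=-40.0, path_loss_exp=2.0, ref_distance_m=1.0):
    """Invert the model: estimated AP-to-receiver distance in metres."""
    return ref_distance_m * 10.0 ** ((rss_ref_dbm - rss_dbm) / (10.0 * path_loss_exp))


# A reading 20 dB below the 1 m reference, with exponent 2, maps to roughly 10 m.
print(distance_from_rss(-60.0))
```

    Distances estimated this way from several reference points could then be combined to place an AP, but how the paper does so is not described in the abstract.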

    Road Extraction from High-spatial-resolution Remote Sensing Image by Combining GVF Snake with Salient Features

    The road information in high-spatial-resolution remote sensing images is of great significance for updating GIS databases. By analyzing the road features shown in the remote sensing image, this paper presents a road detection method based on salient features and the gradient vector flow (GVF) Snake. According to visual cognition theory, the geometric and radiation features of roads are treated as salient features in this paper. First, the saliency map is calculated by fusing the color-based and structure-based contrasts, and the maximum saliency values are taken as the seeds of the GVF Snake. Then, a region-growing algorithm is applied to compute the initial boundaries, and the energy function of the GVF Snake is minimized by iteratively solving the gradient vector flow model to obtain the final road information. Experimental results show that the proposed method can enhance computational efficiency and has good detection accuracy

    Design of Extensible Structured Interferometric Array Utilizing the “Coarray” Concept

    The optimum placement of receiving telescope antennas is a central topic in designing radio interferometric arrays, and it determines the quality of the obtained information. A variety of arrays have been designed for different purposes, but they scale poorly. In this paper, we consider a subclass of structured sparse arrays, namely nested arrays, and examine the important role of the “coarray” in interferometric synthesis imaging, which is utilized to design nested array configurations for complete uniform Fourier plane coverage in both supersynthesis and instantaneous modes. Both nested arrays and coarray theory have been studied extensively, and we apply them to astronomy to design arrays with good scalability and imaging performance. Simulated celestial source image retrieval performance validates the effectiveness of nested interferometric arrays
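
    To make the coarray idea concrete, here is a small sketch of a one-dimensional, two-level nested array in the standard textbook construction and of its difference coarray, the set of all pairwise position differences (baselines). The paper's actual configurations for Fourier plane coverage are not given in the abstract, so the one-dimensional layout and the function names here are illustrative assumptions.

```python
def nested_array(n_inner, n_outer):
    """Sensor positions (in units of the base spacing) of a two-level nested array."""
    inner = list(range(1, n_inner + 1))                             # dense inner subarray
    outer = [m * (n_inner + 1) for m in range(1, n_outer + 1)]      # sparse outer subarray
    return inner + outer


def difference_coarray(positions):
    """All distinct pairwise differences; each one corresponds to a sampled baseline."""
    return sorted({a - b for a in positions for b in positions})


positions = nested_array(3, 3)
lags = difference_coarray(positions)
print(positions)  # [1, 2, 3, 4, 8, 12]
print(lags == list(range(-11, 12)))  # True: every lag from -11 to 11 is covered
```

    Six antennas yield 23 contiguous lags with no holes, and adding outer elements extends the covered range without rearranging the existing ones, which illustrates why such structured arrays are attractive for extensible designs.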

    An Improved Weighted K-Nearest Neighbor Algorithm for Indoor Localization

    The weighted K-nearest neighbor (WKNN) algorithm is the most commonly used algorithm for indoor localization. Traditional WKNN algorithms adopt the received signal strength (RSS) spatial distance (usually the Euclidean or Manhattan distance) to select reference points (RPs) for position determination. This may lead to inaccurate position estimation because the relationship between received signal strength and distance is exponential. To improve the positioning accuracy, this paper proposes an improved weighted K-nearest neighbor algorithm. The spatial distance and physical distance of RSS are used for RP selection, and a fusion weighted algorithm based on these two distances is used for position calculation. The experimental results demonstrate that the proposed algorithm outperforms traditional algorithms, such as K-nearest neighbor (KNN), Euclidean distance-based WKNN (E-WKNN), and physical distance-based WKNN (P-WKNN). Compared with the KNN, E-WKNN, and P-WKNN algorithms, the positioning accuracy of the proposed method is improved by about 29.4%, 23.5%, and 20.7%, respectively. Compared with some recently improved WKNN algorithms, our proposed algorithm also obtains better positioning performance
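
    For orientation, the traditional Euclidean-distance WKNN baseline that the abstract compares against can be sketched as follows. This is not the improved fusion algorithm proposed in the paper, whose details the abstract does not give; the fingerprint values, the inverse-distance weighting, and the function name are illustrative assumptions.

```python
import math


def wknn_locate(rss, fingerprints, k=3, eps=1e-6):
    """Estimate (x, y) from an online RSS vector with baseline Euclidean WKNN.

    fingerprints is a list of ((x, y), rss_vector) pairs collected offline.
    """
    # Distance in signal space between the online reading and each reference point.
    scored = []
    for position, ref_rss in fingerprints:
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(rss, ref_rss)))
        scored.append((d, position))

    # Keep the k closest reference points and weight them by inverse distance.
    nearest = sorted(scored, key=lambda item: item[0])[:k]
    weights = [1.0 / (d + eps) for d, _ in nearest]
    total = sum(weights)

    x = sum(w * pos[0] for w, (_, pos) in zip(weights, nearest)) / total
    y = sum(w * pos[1] for w, (_, pos) in zip(weights, nearest)) / total
    return x, y


# Hypothetical fingerprint database: two APs, three reference points.
fingerprints = [
    ((0.0, 0.0), [-40.0, -70.0]),
    ((10.0, 0.0), [-70.0, -40.0]),
    ((0.0, 10.0), [-90.0, -90.0]),
]
print(wknn_locate([-55.0, -55.0], fingerprints, k=2))
```

    Because the weights depend only on signal-space distance, two reference points that are equally close in RSS contribute equally even if they are far apart physically, which is the weakness the abstract attributes to the traditional approach.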

    Real-time comprehensive driving ability evaluation algorithm for intelligent assisted driving

    Human-machine co-driving decisions in intelligent assisted driving systems require a real-time, comprehensive evaluation of the driver's driving ability. To meet this need, this paper proposes a real-time comprehensive driving ability evaluation method that integrates driving skill, driving state, and driving style. First, driving experiment data obtained on an intelligent driving simulation platform are analyzed (the experiment can effectively distinguish drivers' driving skills and avoid the interference of driving style), the feature values that significantly represent driving skill and driving state are selected, and the time correlation between driving state and driving skill is pointed out. The concept of relativity in comprehensive driving ability evaluation is then proposed. Under this concept, the natural driving trajectory dataset HighD is used to establish a distribution map of the feature values of the human driver group as the evaluation benchmark, enabling the relative evaluation of driving skill and driving state. Similarly, HighD is used to establish a distribution map of human driver style feature values as an evaluation benchmark to achieve relative driving style evaluation. Finally, a comprehensive driving ability evaluation model with a “punishment” and “affirmation” mechanism is proposed. The experimental comparative analysis shows that the proposed evaluation algorithm takes into account the driver's driving skill, driving state, and driving style in the real-time comprehensive driving ability evaluation, and draws differentiated evaluation conclusions based on the “punishment” and “affirmation” mechanism model, achieving a comprehensive and objective evaluation of the driver's driving ability. It can meet the needs of human-machine shared driving decisions for the evaluation of the driver's driving ability

    Spectroscopic Detection of Rice Leaf Blast Infection at Different Leaf Positions at The Early Stages With Solar-Induced Chlorophyll Fluorescence

    Objective: Rice blast is considered the most destructive disease threatening global rice production and causes severe economic losses worldwide. Early detection of rice blast plays an important role in resistance breeding and plant protection. At present, most studies on rice blast detection have been devoted to its symptomatic stage, and no previous study has used solar-induced chlorophyll fluorescence (SIF) to monitor rice leaf blast (RLB) at early stages. This research was conducted to investigate the early identification of RLB-infected leaves based on solar-induced chlorophyll fluorescence at different leaf positions. Methods: Greenhouse experiments and field trials were conducted separately in Nanjing and Nantong in July and August 2021 to record SIF data of the top 1st to 4th leaves of rice plants at the jointing and heading stages with an Analytical Spectral Devices (ASD) spectrometer coupled with a FluoWat leaf clip and a halogen lamp. At the same time, the disease severity levels of the measured samples were recorded manually according to the GB/T 15790-2009 standard. After the continuous wavelet transform (CWT) of the SIF spectra, separability assessment and feature selection were applied to them. Wavelet features sensitive to RLB were extracted, and the sensitive features and their accuracy in identifying infected leaves were compared across leaf positions. Finally, RLB identification models were constructed based on linear discriminant analysis (LDA). Results and Discussion: The results showed that the upward and downward SIF in the far-red region of infected leaves at each leaf position were significantly higher than those of healthy leaves. This may be due to infection by the fungal pathogen Magnaporthe oryzae, which may have destroyed the chloroplast structure and ultimately inhibited the primary reaction of photosynthesis. In addition, both the upward and downward SIF in the red region and the far-red region increased with decreasing leaf position. The sensitive wavelet features varied by leaf position, while most of them were distributed on the steep slope of the SIF spectrum and at wavelet scales 3, 4 and 5. The sensitive features of the top 1st leaf were mainly located at 665-680 nm, 755-790 nm and 815-830 nm. For the top 2nd leaf, the sensitive features were mainly found at 665-680 nm and 815-830 nm. For the top 3rd leaf, most of the sensitive features lay at 690 nm, 755-790 nm and 815-830 nm, and sensitive bands were observed around 690 nm. The sensitive features of the top 4th leaf were primarily located at 665-680 nm, 725 nm and 815-830 nm, and sensitive bands were observed around 725 nm. The wavelet features of the common sensitive region (665-680 nm) not only had physiological significance but also coincided with the chlorophyll absorption peak, which allows for a reasonable spectral interpretation. The accuracy of the RLB identification models differed between leaf positions. Based on the upward and downward SIF, the overall accuracies for the top 1st leaf were 70% and 71%, respectively, which were higher than those for the other leaf positions. As a result, the top 1st leaf was an ideal indicator leaf for diagnosing RLB in the field. The classification accuracy of the SIF wavelet features was higher than that of the original SIF bands. Based on CWT and feature selection, the overall accuracy of the upward and downward optimal features of the top 1st to 4th leaves reached 70.13%, 63.70%, 64.63%, 64.53% and 70.90%, 63.12%, 62.00%, 64.02%, respectively. All of them were higher than those of the canopy monitoring feature F760, whose overall accuracy was 69.79%, 61.31%, 54.41%, 61.33% and 69.99%, 58.79%, 54.62%, 60.92%, respectively. This may be caused by differences in the physiological states of the top four leaves. In addition to RLB infection, the SIF data of some top 3rd and top 4th leaves may also be affected by leaf senescence, while the SIF data of the top 1st leaf, the most recently unfolded leaf of the rice plant, were less affected by other physical and chemical parameters. This may explain why the top 1st leaf responded to RLB earlier than the other leaves. The results also showed that the common sensitive features of the four leaf positions were concentrated on the steep slope of the SIF spectrum, with better classification performance around 675 and 815 nm. The classification accuracy of the optimal common features, ↑WF832,3 and ↓WF809,3, reached 69.45%, 62.19%, 60.35%, 63.00% and 69.98%, 62.78%, 60.51%, 61.30% for the top 1st to top 4th leaf positions, respectively. The optimal common features, ↑WF832,3 and ↓WF809,3, were both located at wavelet scale 3 and within 800-840 nm, which may be related to the destruction of the cell structure in response to Magnaporthe oryzae infection. Conclusions: In this study, the SIF spectral response to RLB was revealed, and the identification models of the top 1st leaf were found to be the most accurate among the top four leaves. In addition, the common wavelet features sensitive to RLB, ↑WF832,3 and ↓WF809,3, were extracted with an identification accuracy of 70%. The results demonstrate the potential of CWT and SIF for RLB detection, which can provide an important reference and technical support for the early, rapid and non-destructive diagnosis of RLB in the field