25 research outputs found

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article(s). The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research. Peer reviewed.

    Remote Sensing Time Series Classification Based on Self-Attention Mechanism and Time Sequence Enhancement

    In data mining, time series analysis is an important and challenging subject. This is especially true for the classification of remote sensing time series. The classification of remote sensing images is an important source of information for land resource planning and management, rational development, and protection. Many methods have been proposed to classify time series data, but when they are applied to real remote sensing time series data, their classification accuracy falls short. Based on previous experience and the processing methods used for time series in other fields, we propose a neural network model based on a self-attention mechanism and time sequence enhancement to classify real remote sensing time series data. The model is divided into five parts: (1) memory feature extraction in subsequence blocks; (2) a self-attention layer among blocks; (3) time sequence enhancement; (4) spectral sequence relationship extraction; and (5) a simplified ResNet neural network. The model simultaneously considers three characteristics of the time series (local information, global information, and spectral series relationship information) to classify remote sensing time series. Good experimental results have been obtained using our model.

    Inter-Comparison of Four Models for Detecting Forest Fire Disturbance from MOD13A2 Time Series

    Many models for change point detection from time series remote sensing images have been developed to date. For forest ecosystems, fire disturbance detection models have always been an important topic. However, due to a lack of benchmark datasets, it is difficult to determine which model is appropriate. Therefore, we collected and generated a benchmark dataset specifically for forest fire disturbance detection, named CUG-FFireMCD1. CUG-FFireMCD1 contains a total of 132 MODIS MOD13A2 time series, each of which contains at least one fire disturbance. The occurrence time of each forest fire disturbance was determined using the National Cryosphere Desert Data Center (NCDC) website, and the precise latitude and longitude coordinates were determined using the FireCCI51 dataset. In addition, we selected four commonly used time series change detection models and validated their advantages and limitations through dataset analysis. Finally, we used the detection results of the models and their applicable scenarios to label the additional change points. The four models we used are breaks for additive season and trend (BFAST), Prophet, continuous change detection and classification (CCDC), and Landsat-based detection of trends in disturbance and recovery (LandTrendR). The experiments show that BFAST outperformed the other three models in forest fire disturbance detection from MOD13A2 time series, with a successful-detection proportion of 96.2% on the benchmark dataset. The Prophet model does not detect as well as BFAST, but it also performs well, with a successful-detection proportion of 87.9%. The detection results of CCDC and LandTrendR are similar to each other, and their detection success rates are lower than those of BFAST and Prophet, but their results can be used as data support for labeling work. However, to apply them well to MOD13A2 time series change detection, some model adaptation is advisable. In summary, the CUG-FFireMCD1 data were verified using different types of time series change detection models, and the change points we marked are credible. CUG-FFireMCD1 will provide a reliable benchmark for model optimization and the accuracy verification of remote sensing time series change detection.

    Application Progresses on Near-infrared Spectroscopy in Quality Detection of Edible Fungi

    Edible fungi are large fungi with high protein content, low fat content and medicinal value. They have become common food ingredients because they are nutritious, flavorful, low in calories and multifunctional. As consumption habits continue to shift, high quality and nutrition have become necessary requirements for edible fungi. Near-infrared spectroscopy (NIRS) technology has been widely used in the quality detection of edible fungi because it is fast, non-destructive, environmentally friendly and pollution-free, and can detect multiple components simultaneously. This paper comprehensively summarizes the research and application progress over the past five years of NIRS technology combined with chemometrics in the quality evaluation of edible fungi, covering physical and chemical components, active ingredients, variety identification, origin tracing, pathogen contamination and adulteration recognition. A development strategy for NIRS technology in the quality detection of edible fungi is also proposed, which will provide methodological reference and ideas for improving the detection theory of NIRS technology and developing dedicated testing equipment.