Capturing Evolution Genes for Time Series Data
The modeling of time series is becoming increasingly critical in a wide
variety of applications. Overall, data evolves by following different patterns,
which are generally caused by different user behaviors. Given a time series, we
define the evolution gene to capture the latent user behaviors and to describe
how the behaviors lead to the generation of time series. In particular, we
propose a uniform framework that recognizes different evolution genes of
segments by learning a classifier, and adopt an adversarial generator to
implement the evolution gene by estimating the segments' distribution.
Experimental results based on a synthetic dataset and five real-world datasets
show that our approach not only achieves good prediction results (e.g.,
+10.56% on average in terms of F1), but is also able to provide explanations of
the results.
Comment: a preprint version. arXiv admin note: text overlap with
arXiv:1703.10155 by other authors
Climatic and Soil Factors Shape the Demographical History and Genetic Diversity of a Deciduous Oak (Quercus liaotungensis) in Northern China
Past and current climatic changes have affected the demography, patterns of genetic diversity, and genetic structure of extant species. The study of these processes provides valuable information to forecast evolutionary changes and to identify conservation priorities. Here, we sequenced two functional nuclear genes and four chloroplast DNA regions for 105 samples from 21 populations of Quercus liaotungensis across its distribution range. Coalescent-based Bayesian analysis, approximate Bayesian computation (ABC), and ecological niche modeling (ENM) were integrated to investigate the genetic patterns and demographical history of this species. Association estimates including Mantel tests and multiple linear regressions were used to infer the effects of geographical and ecological factors on temporal genetic variation and diversity of this oak species. Based on multiple loci, Q. liaotungensis populations clustered into two phylogenetic groups; this grouping pattern could be the result of adaptation to habitats with different temperature and precipitation seasonality conditions. Demographical reconstructions and ENMs suggest an expansion-decline trend of this species during the Quaternary climatic oscillations. Association analyses based on nuclear data indicated that intraspecific genetic differentiation of Q. liaotungensis was clearly correlated with ecological distance; specifically, the genetic diversity of this species was significantly correlated with temperature seasonality and soil pH, but negatively correlated with precipitation. Our study highlights the impact of Pleistocene climate oscillations on the demographic history of a tree species in Northern China, and suggests that climatic and soil conditions are the major factors shaping the genetic diversity and population structure of Q. liaotungensis
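The Mantel test mentioned in this abstract can be illustrated with a minimal sketch. It assumes two symmetric distance matrices over the same populations and a simple permutation scheme; the function and variable names (`mantel`, `genetic`, `ecological`) are hypothetical and do not come from the study, which does not describe its software or settings here.

```python
import random
from math import sqrt


def _upper(matrix, order):
    """Upper-triangle entries of a distance matrix, with rows/columns reordered."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    """Pearson correlation of two equal-length lists (assumes non-zero variance)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mantel(genetic, ecological, permutations=999, seed=0):
    """Return (r, p): matrix correlation and a one-sided permutation p-value."""
    rng = random.Random(seed)
    identity = list(range(len(genetic)))
    x = _upper(genetic, identity)
    observed = _pearson(x, _upper(ecological, identity))
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute population labels of one matrix only
        if _pearson(x, _upper(ecological, order)) >= observed:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return observed, (hits + 1) / (permutations + 1)


# Toy distances between four hypothetical populations.
toy = [[0, 1, 3, 7], [1, 0, 2, 6], [3, 2, 0, 4], [7, 6, 4, 0]]
r, p = mantel(toy, toy, permutations=99)
```

Permuting whole rows and columns together, rather than individual cells, keeps each permuted matrix a valid distance matrix, which is why the test is applied to distance data of this kind.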
Remote Sensing Image Detection Based on YOLOv4 Improvements
Remote sensing image target object detection and recognition are widely used both in military and civil fields. There are many models proposed for this purpose, but their effectiveness on target object detection in remote sensing images is not ideal due to the influence of climate conditions, obstacles and confusing objects presented in images, image clarity, and associated problems with small-target and multi-target detection and recognition. Therefore, how to accurately detect target objects in images is an urgent problem to be solved. To this end, a novel model, called YOLOv4_CE, is proposed in this paper, based on the classical YOLOv4 model with added improvements, resulting from replacing the backbone feature-extraction CSPDarknet53 network with a ConvNeXt-S network, replacing the Complete Intersection over Union (CIoU) loss with the Efficient Intersection over Union (EIoU) loss, and adding a coordinate attention mechanism to YOLOv4, so as to improve its remote sensing image detection capabilities. The results, obtained through experiments conducted on two open data sets, demonstrate that the proposed YOLOv4_CE model outperforms, in this regard, both the original YOLOv4 model and four other state-of-the-art models, namely Faster R-CNN, Gliding Vertex, Oriented R-CNN, and EfficientDet, in terms of the mean average precision (mAP) and F1 score, by achieving respective values of 95.03% and 0.933 on the NWPU VHR-10 data set, and 95.89% and 0.937 on the RSOD data set.
Funding: National Key Research and Development Program of China under Grant 2017YFE0135700; MES under Grant No. D01-168/28.07.2022 for NCDSC part of the Bulgarian National Roadmap on RIs; Telecommunications Research Centre (TRC), University of Limerick, Ireland
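As an illustration of the EIoU loss named in the abstract above, the following is a minimal sketch for a single pair of axis-aligned boxes. It follows the commonly cited EIoU form (an IoU term plus centre-distance, width, and height penalties normalised by the smallest enclosing box). The function name `eiou_loss` and the `(x1, y1, x2, y2)` box layout are assumptions for illustration, and the paper's actual implementation may differ.

```python
def eiou_loss(pred, target, eps=1e-9):
    """EIoU loss for two boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Widths and heights of both boxes.
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between box centres, normalised by the enclosing diagonal.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    centre_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Separate width and height penalties (the part that distinguishes EIoU from CIoU).
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    return 1.0 - iou + centre_term + width_term + height_term


same = eiou_loss((0, 0, 2, 2), (0, 0, 2, 2))      # close to 0 for identical boxes
apart = eiou_loss((0, 0, 1, 1), (2, 2, 3, 3))     # exceeds 1 for disjoint boxes
```

Unlike a plain IoU loss, the penalty terms stay informative when the boxes do not overlap, so the loss still varies with how far apart the boxes are.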
Multi-tissue integrative analysis of personal epigenomes
Evaluating the impact of genetic variants on transcriptional regulation is a central goal in biological science that has been constrained by reliance on a single reference genome. To address this, we constructed phased, diploid genomes for four cadaveric donors (using long-read sequencing) and systematically charted noncoding regulatory elements and transcriptional activity across more than 25 tissues from these donors. Integrative analysis revealed over a million variants with allele-specific activity, coordinated, locus-scale allelic imbalances, and structural variants impacting proximal chromatin structure. We relate the personal genome analysis to the ENCODE encyclopedia, annotating allele- and tissue-specific elements that are strongly enriched for variants impacting expression and disease phenotypes. These experimental and statistical approaches, and the corresponding EN-TEx resource, provide a framework for personalized functional genomics
Complex Knowledge Graph Embeddings Based on Convolution and Translation
Link prediction involves the use of entities and relations that already exist in a knowledge graph to reason about missing entities or relations. Different approaches have been proposed to date for performing this task. This paper proposes a combined use of the translation-based approach with the Convolutional Neural Network (CNN)-based approach, resulting in a novel model, called ConCMH. In the proposed model, first, entities and relations are embedded into the complex space, followed by a vector multiplication of entity embeddings and relational embeddings and taking the real part of the results to generate a feature matrix of their interaction. Next, a 2D convolution is used to extract features from this matrix and generate feature maps. Finally, the feature vectors are transformed into predicted entity embeddings by obtaining the inner product of the feature mapping and the entity embedding matrix. The proposed ConCMH model is compared against state-of-the-art models on the four most commonly used benchmark datasets and the obtained experimental results confirm its superiority in the majority of cases
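The first two steps described in this abstract (complex-space interaction, then 2D convolution) can be sketched as follows. This is an illustrative toy rather than the ConCMH implementation: the function names, the reshaping into a `rows` x `cols` matrix, and the hand-picked kernel are all assumptions, and the final scoring step against the entity embedding matrix is omitted because the abstract does not specify how feature maps are projected back to the embedding dimension.

```python
def interaction_matrix(entity, relation, rows, cols):
    """Element-wise complex product of two embeddings, real part, reshaped to 2D."""
    assert len(entity) == len(relation) == rows * cols
    real = [(e * r).real for e, r in zip(entity, relation)]
    return [real[i * cols:(i + 1) * cols] for i in range(rows)]


def conv2d_valid(matrix, kernel):
    """Single-channel 2D convolution without padding (cross-correlation form)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(matrix) - kh + 1
    out_w = len(matrix[0]) - kw + 1
    return [
        [
            sum(matrix[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 4-dimensional complex embeddings for one entity and one relation.
entity = [1 + 1j, 2 + 0j, 0 + 1j, 1 - 1j]
relation = [1 - 1j, 0 + 1j, 0 + 1j, 2 + 0j]

features = interaction_matrix(entity, relation, rows=2, cols=2)
feature_map = conv2d_valid(features, kernel=[[1.0, 0.0], [0.0, 1.0]])
```

In a trained model the kernel weights and embeddings would be learned; here they are fixed so that the arithmetic can be followed by hand.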
Development of Chloroplast and Nuclear DNA Markers for Chinese Oaks (Quercus Subgenus Quercus) and Assessment of Their Utility as DNA Barcodes
Chloroplast DNA (cpDNA) is frequently used for species demography, evolution, and species discrimination of plants. However, the lack of efficient and universal markers often brings particular challenges for genetic studies across different plant groups. In this study, chloroplast genomes from two closely related species (Quercus rubra and Castanea mollissima) in Fagaceae were compared to explore universal cpDNA markers for the Chinese oak species in Quercus subgenus Quercus, a diverse species group without sufficient molecular differentiation. With the comparison, nine and 14 plastid markers were selected as barcoding and phylogeographic candidates for the Chinese oaks. Five (psbA-trnH, matK-trnK, ycf3-trnS, matK, and ycf1) of the nine plastid candidate barcodes, with the addition of newly designed ITS and a single-copy nuclear gene (SAP), were then tested on 35 Chinese oak species employing four different barcoding approaches (genetic distance-, BLAST-, character-, and tree-based methods). The four methods showed different species identification powers, with the character-based method performing best. Of the seven barcodes tested, a barcoding gap was absent in all of them across the Chinese oaks, while ITS and psbA-trnH provided the highest species resolution (30.30%) with the character- and BLAST-based methods, respectively. The six-marker combination (psbA-trnH + matK-trnK + matK + ycf1 + ITS + SAP) showed the best species resolution (84.85%) using the character-based method for barcoding the Chinese oaks. The barcoding results provided additional implications for taxonomy of the Chinese oaks in subg. Quercus, basically identifying three major infrageneric clades of the Chinese oaks (corresponding to Groups Quercus, Cerris, and Ilex) referenced to previous phylogenetic classification of Quercus, while the morphology-based allocations proposed for the Chinese oaks in subg. Quercus were challenged. A low variation rate of the chloroplast genome, and complex speciation patterns involving incomplete lineage sorting, interspecific hybridization and introgression, possibly have negative impacts on the species assignment and phylogeny of oak species
Detection of River Floating Garbage Based on Improved YOLOv5
The random dumping of garbage in rivers has led to the continuous deterioration of water quality and affected people’s living environment. The accuracy of detection of garbage floating in rivers is greatly affected by factors such as floating speed, night/daytime natural light, viewing angle and position, etc. This paper proposes a novel detection model, called YOLOv5_CBS, for the detection of garbage objects floating in rivers, based on improvements of the YOLOv5 model. Firstly, a coordinate attention (CA) mechanism is added to the original C3 module (without compressing the number of channels in the bottleneck), forming a new C3-CA-Uncompress Bottleneck (CCUB) module for improving the size of the receptive field and allowing the model to pay more attention to important parts of the processed images. Then, the Path Aggregation Network (PAN) in YOLOv5 is replaced with a Bidirectional Feature Pyramid Network (BiFPN), as proposed by other researchers, to enhance the depth of information mining and improve the feature extraction capability and detection performance of the model. In addition, the Complete Intersection over Union (CIoU) loss function, which was originally used in YOLOv5 for the calculation of location score of the compound loss, is replaced with the SCYLLA-IoU (SIoU) loss function, so as to speed up the model convergence and improve its regression precision. The results, obtained through experiments conducted on two datasets, demonstrate that the proposed YOLOv5_CBS model outperforms the original YOLOv5 model, along with three other state-of-the-art models (Faster R-CNN, YOLOv3, and YOLOv4), when used for river floating garbage objects detection, in terms of the recall, average precision, and F1 score achieved by reaching respective values of 0.885, 90.85%, and 0.8669 on the private dataset, and 0.865, 92.18%, and 0.9006 on the Flow-Img public dataset