Identification of Physical Processes and Unknown Parameters of 3D Groundwater Contaminant Problems via Theory-guided U-net
Identification of unknown physical processes and parameters of groundwater
contaminant sources is a challenging task due to their ill-posed and non-unique
nature. Numerous works have focused on determining nonlinear physical processes
through model selection methods. However, identifying corresponding nonlinear
systems for different physical phenomena using numerical methods can be
computationally prohibitive. With the advent of machine learning (ML)
algorithms, more efficient surrogate models based on neural networks (NNs) have
been developed in various disciplines. In this work, a theory-guided U-net
(TgU-net) framework is proposed for surrogate modeling of three-dimensional
(3D) groundwater contaminant problems, in order to efficiently identify the
involved processes and unknown parameters. In TgU-net, the underlying governing
equations are embedded into the loss function of U-net as soft constraints. For
the considered groundwater contaminant problem, sorption is considered to be a
potential process of an uncertain type, and three equilibrium sorption isotherm
types (i.e., linear, Freundlich, and Langmuir) are considered. Different from
traditional approaches in which one model corresponds to one equation, these
three sorption types are modeled through only one TgU-net surrogate. The three
mentioned sorption terms are integrated into one equation by assigning
indicators. Accurate predictions demonstrate the satisfactory generalization
and extrapolation capabilities of the constructed TgU-net. Furthermore, based on the
constructed TgU-net surrogate, a data assimilation method is employed to
identify the physical process and parameters simultaneously. This work
demonstrates the possibility of discovering the governing equations of physical
problems that involve multiple, and even uncertain, processes by combining deep
learning with data assimilation methods.
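As a rough illustration of the two key ideas above (not code from the paper), the sketch below shows how indicator weights can fold the three sorption isotherms into a single governing equation, and how the equation's finite-difference residual can enter the training loss as a soft constraint alongside the data misfit. It uses a 1D advection-dispersion stand-in for the 3D problem; all function names, parameter values, and the loss weighting are assumptions.

```python
import torch

def sorption(c, ind, kd=1.0, kf=1.0, nf=0.8, qmax=1.0, b=1.0):
    # One-hot indicator `ind` selects the active equilibrium isotherm, so a
    # single surrogate spans all three sorption types (values are illustrative).
    s_lin = kd * c                           # linear isotherm
    s_fre = kf * c.clamp(min=1e-8) ** nf     # Freundlich isotherm
    s_lan = qmax * b * c / (1.0 + b * c)     # Langmuir isotherm
    return ind[0] * s_lin + ind[1] * s_fre + ind[2] * s_lan

def pde_residual(c, ind, dt=1.0, dx=1.0, v=0.1, D=0.01, rho_b=1.5, theta=0.3):
    # Finite-difference residual of a 1D advection-dispersion equation with
    # equilibrium sorption: d(c + rho_b/theta * s)/dt + v*dc/dx - D*d2c/dx2 = 0.
    # `c` has shape (time, space); the residual is evaluated at interior points.
    total = c + (rho_b / theta) * sorption(c, ind)
    dtot_dt = (total[1:, 1:-1] - total[:-1, 1:-1]) / dt
    dc_dx = (c[:-1, 2:] - c[:-1, :-2]) / (2 * dx)
    d2c_dx2 = (c[:-1, 2:] - 2 * c[:-1, 1:-1] + c[:-1, :-2]) / dx ** 2
    return dtot_dt + v * dc_dx - D * d2c_dx2

def theory_guided_loss(c_pred, c_obs, ind, lam=1.0):
    # Data misfit plus the governing-equation residual as a soft constraint.
    data_term = torch.mean((c_pred - c_obs) ** 2)
    pde_term = torch.mean(pde_residual(c_pred, ind) ** 2)
    return data_term + lam * pde_term
```

Because the indicators enter the residual directly, a data assimilation scheme can update them together with the physical parameters, which is how a single surrogate supports simultaneous process and parameter identification.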
Effects of tea garden soil on aroma components and related gene expression in tea leaves
To explore the effect of soil on the synthesis of aroma components in tea leaves, tea seedlings replanted in tea rhizosphere soil of different ages were used as research materials. Tea seedlings were replanted in soils aged 0, 4, 9, and 30 years, and after one year of growth, 34, 37, 29, and 26 aroma substances were detected in the tea leaves, respectively, using gas chromatography-mass spectrometry (GC-MS). As the age of the rhizosphere soil increased, the relative contents of terpenoids and alcohols in the tea leaves dropped from 66.40% to 44.52% and from 5.21% to 2.61%, respectively, while aldehydes, esters, and nitrogen compounds increased from 3.80% to 22.36%, from 1.33% to 12.02%, and from 3.13% to 19.96%, respectively. Differential gene expression measured by quantitative real-time PCR (qRT-PCR) showed that the expression of the nerolidol synthase and linalool synthase genes in tea leaves increased significantly with soil age, whereas the terpineol synthase, phellandrene synthase, myrcene synthase, ocimene synthase, limonene synthase, germacrene synthase, and farnesene synthase genes declined significantly. In summary, as the age of the tea-planted soil increased, the soil significantly affected the expression of terpene synthase genes in tea leaves, in turn changing the composition and content of aroma substances. These results provide a theoretical basis for the improvement of tea quality.
GoMatching: A Simple Baseline for Video Text Spotting via Long and Short Term Matching
Beyond the text detection and recognition tasks in image text spotting, video
text spotting presents an augmented challenge with the inclusion of tracking.
While advanced end-to-end trainable methods have shown commendable performance,
the pursuit of multi-task optimization may pose the risk of producing
sub-optimal outcomes for individual tasks. In this paper, we highlight a main
bottleneck in state-of-the-art video text spotters: limited recognition
capability. In response to this issue, we propose to efficiently turn an
off-the-shelf query-based image text spotter into a specialist on video and
present a simple baseline termed GoMatching, which focuses the training efforts
on tracking while maintaining strong recognition performance. To adapt the
image text spotter to video datasets, we add a rescoring head to rescore each
detected instance's confidence via efficient tuning, leading to a better
tracking candidate pool. Additionally, we design a long-short term matching
module, termed LST-Matcher, to enhance the spotter's tracking capability by
integrating both long- and short-term matching results via Transformer. Based
on the above simple designs, GoMatching achieves impressive performance on two
public benchmarks and one novel test set with arbitrary-shaped text, e.g.,
setting a new record on the ICDAR15-video dataset, while saving considerable
training budget. The code will be released at
https://github.com/Hxyz-123/GoMatching
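To make the two design choices above concrete, here is a minimal sketch (not the released GoMatching code) of a rescoring head and a similarity-based association step. The score fusion, feature dimension, and greedy matching are simplifying assumptions; the actual LST-Matcher aggregates long- and short-term histories with a Transformer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RescoringHead(nn.Module):
    # Small MLP tuned on video data to rescore each detected instance, while
    # the frozen image spotter keeps its recognition ability (dim is assumed).
    def __init__(self, dim=256):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, query_feats, image_scores):
        video_scores = self.mlp(query_feats).squeeze(-1).sigmoid()
        return 0.5 * (image_scores + video_scores)  # simple score fusion (assumed)

def associate(curr_feats, track_feats, thresh=0.5):
    # Greedy one-to-one association by cosine similarity between current
    # detections and tracked instances (short- or long-term memory alike).
    sim = torch.einsum('qd,td->qt',
                       F.normalize(curr_feats, dim=-1),
                       F.normalize(track_feats, dim=-1))
    matches = []
    for q in range(sim.size(0)):
        t = sim[q].argmax().item()
        if sim[q, t] > thresh:
            matches.append((q, t))
            sim[:, t] = -1.0  # each track is matched at most once
    return matches
```

Concentrating training on such lightweight added modules, while leaving the image spotter's recognition pathway untouched, is what allows recognition quality to be preserved at a modest training cost.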
Scalable Mask Annotation for Video Text Spotting
Video text spotting refers to localizing, recognizing, and tracking textual
elements such as captions, logos, license plates, signs, and other forms of
text within consecutive video frames. However, current datasets available for
this task rely on quadrilateral ground truth annotations, which may result in
including excessive background content and inaccurate text boundaries.
Furthermore, methods trained on these datasets often produce prediction results
in the form of quadrilateral boxes, which limits their ability to handle
complex scenarios such as dense or curved text. To address these issues, we
propose a scalable mask annotation pipeline called SAMText for video text
spotting. SAMText leverages the Segment Anything Model (SAM) to generate mask annotations for
scene text images or video frames at scale. Using SAMText, we have created a
large-scale dataset, SAMText-9M, that contains over 2,400 video clips sourced
from existing datasets and over 9 million mask annotations. We have also
conducted a thorough statistical analysis of the generated masks and their
quality, identifying several research topics that could be further explored
based on this dataset. The code and dataset will be released at
https://github.com/ViTAE-Transformer/SAMText.
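As a hedged sketch of what such a pipeline could look like (the abstract does not spell out SAMText's exact prompting strategy), the snippet below converts each quadrilateral ground-truth annotation into a bounding-box prompt for SAM and keeps the predicted mask; the checkpoint path is a placeholder.

```python
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

def quad_to_box(quad):
    # Axis-aligned bounding box of a quadrilateral annotation
    # (4 x 2 array of corners), used as the box prompt for SAM.
    xs, ys = quad[:, 0], quad[:, 1]
    return np.array([xs.min(), ys.min(), xs.max(), ys.max()])

def masks_from_quads(image, quads, checkpoint="sam_vit_h.pth"):
    # Prompt SAM once per text instance and keep the single best mask.
    sam = sam_model_registry["vit_h"](checkpoint=checkpoint)
    predictor = SamPredictor(sam)
    predictor.set_image(image)  # RGB uint8 array of shape (H, W, 3)
    masks = []
    for quad in quads:
        m, scores, _ = predictor.predict(box=quad_to_box(quad),
                                         multimask_output=False)
        masks.append(m[0])      # boolean (H, W) mask for this instance
    return masks
```

Replacing the quadrilateral boxes with such per-instance masks is what gives downstream methods tighter text boundaries for dense or curved text.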