PDF-VQA: A New Dataset for Real-World VQA on PDF Documents
Document-based Visual Question Answering examines the understanding of
document images conditioned on natural language questions. We propose a
new document-based VQA dataset, PDF-VQA, to comprehensively examine
document understanding from various aspects, including document element
recognition, document layout structural understanding, contextual
understanding, and key information extraction. Our PDF-VQA dataset extends the
current scale of document understanding, which is limited to a single document
page, to a new scale that asks questions over full documents of multiple pages.
We also propose a new graph-based VQA model that explicitly integrates the
spatial and hierarchical structural relationships between different document
elements to boost document structural understanding. Its performance is
compared with several baselines over different question types and
tasks. The full dataset will be released after paper acceptance.
Doc-GCN: Heterogeneous Graph Convolutional Networks for Document Layout Analysis
Recognizing the layout of unstructured digital documents is crucial when
parsing documents into a structured, machine-readable format for
downstream applications. Recent studies in Document Layout Analysis usually
rely on computer vision models to understand documents while ignoring other
information, such as context information or relation of document components,
which are vital to capture. Our Doc-GCN presents an effective way to harmonize
and integrate heterogeneous aspects for Document Layout Analysis. We first
construct graphs to explicitly describe four main aspects, including syntactic,
semantic, density, and appearance/visual information. Then, we apply graph
convolutional networks for representing each aspect of information and use
pooling to integrate them. Finally, we aggregate each aspect and feed them into
2-layer MLPs for document layout component classification. Our Doc-GCN achieves
new state-of-the-art results in three widely used DLA datasets.
Comment: Accepted by COLING 202
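The pipeline outlined in the Doc-GCN abstract (one graph per aspect, a graph convolution over each, then integration and an MLP classifier) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, the feature sizes, the random weights, and the use of plain concatenation to integrate the aspects are all assumptions made for the example.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetrically normalise A + I, as in a standard GCN layer."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(adj_norm, feats, weight):
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    return np.maximum(adj_norm @ feats @ weight, 0.0)

def fuse_aspects(aspects, weights):
    """Run one GCN layer per aspect graph and concatenate the node embeddings.

    Concatenation is an assumed stand-in for the pooling/aggregation step.
    """
    outputs = []
    for name, (adj, feats) in aspects.items():
        outputs.append(gcn_layer(normalize_adjacency(adj), feats, weights[name]))
    return np.concatenate(outputs, axis=1)

def mlp_classify(x, w1, w2):
    """2-layer MLP followed by an argmax over class scores."""
    hidden = np.maximum(x @ w1, 0.0)
    return (hidden @ w2).argmax(axis=1)

# Toy example: 5 layout components, 4 aspects, 4 input features per aspect.
rng = np.random.default_rng(0)
n = 5
adj = np.ones((n, n)) - np.eye(n)           # hypothetical fully connected graph
names = ("syntactic", "semantic", "density", "appearance")
aspects = {name: (adj, rng.normal(size=(n, 4))) for name in names}
weights = {name: rng.normal(size=(4, 3)) for name in names}

fused = fuse_aspects(aspects, weights)      # one 12-dim vector per component
labels = mlp_classify(fused, rng.normal(size=(12, 8)), rng.normal(size=(8, 3)))
```

With untrained random weights the predicted labels are meaningless; the sketch only shows how per-aspect graph embeddings could be combined before classification.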
Training Robust Spiking Neural Networks on Neuromorphic Data with Spatiotemporal Fragments
Neuromorphic vision sensors (event cameras) are inherently suitable for
spiking neural networks (SNNs) and provide novel neuromorphic vision data for
this biomimetic model. Due to the spatiotemporal characteristics, novel data
augmentations are required to process the unconventional visual signals of
these cameras. In this paper, we propose a novel Event SpatioTemporal Fragments
(ESTF) augmentation method. It preserves the continuity of neuromorphic data by
drifting or inverting fragments of the spatiotemporal event stream to simulate
the disturbance of brightness variations, leading to more robust spiking neural
networks. Extensive experiments are performed on prevailing neuromorphic
datasets. It turns out that ESTF provides substantial improvements over pure
geometric transformations and outperforms other event data augmentation
methods. It is worth noting that the SNNs with ESTF achieve the
state-of-the-art accuracy of 83.9% on the CIFAR10-DVS dataset.
Comment: Accepted by ICASSP 202
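As a rough illustration of what drifting or inverting a fragment of an event stream could look like, the sketch below works on events stored as rows of (x, y, t, polarity). It is not the ESTF implementation: treating a fragment as a time window, "drifting" as a spatial shift, and "inverting" as a polarity flip are assumptions made for the example, as are all function names.

```python
import numpy as np

def select_fragment(events, t_start, t_end):
    """Boolean mask for events whose timestamp lies in [t_start, t_end)."""
    t = events[:, 2]
    return (t >= t_start) & (t < t_end)

def drift_fragment(events, mask, dx, dy, width, height):
    """Shift the fragment's events by (dx, dy); drop events leaving the sensor."""
    out = events.copy()
    out[mask, 0] += dx
    out[mask, 1] += dy
    inside = (
        (out[:, 0] >= 0) & (out[:, 0] < width)
        & (out[:, 1] >= 0) & (out[:, 1] < height)
    )
    return out[inside]

def invert_fragment(events, mask):
    """Flip the polarity (0 <-> 1) of the fragment's events."""
    out = events.copy()
    out[mask, 3] = 1 - out[mask, 3]
    return out

# Toy stream on a hypothetical 4x4 sensor: columns are x, y, t, polarity.
events = np.array([
    [0, 0, 0, 1],
    [1, 1, 5, 0],
    [2, 2, 10, 1],
    [3, 3, 15, 0],
])
mask = select_fragment(events, t_start=5, t_end=15)   # selects rows 1 and 2
drifted = drift_fragment(events, mask, dx=2, dy=0, width=4, height=4)
inverted = invert_fragment(events, mask)
```

Events outside the selected fragment are left untouched, so the rest of the stream keeps its original timing and polarity.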
D-IF: Uncertainty-aware Human Digitization via Implicit Distribution Field
Realistic virtual humans play a crucial role in numerous industries, such as
the metaverse, intelligent healthcare, and self-driving simulation. However,
creating them at scale with a high level of realism remains a challenge. The
use of deep implicit functions has sparked a new era of image-based 3D
clothed human reconstruction, enabling pixel-aligned shape recovery with fine
details. Since then, the vast majority of works have located the surface by
regressing the deterministic implicit value for each point. However, should all
points be treated equally regardless of their proximity to the surface? In this
paper, we propose replacing the implicit value with an adaptive uncertainty
distribution, to differentiate between points based on their distance to the
surface. This simple "value to distribution" transition yields significant
improvements on nearly all the baselines. Furthermore, qualitative results
demonstrate that models trained with our uncertainty distribution loss
can capture more intricate wrinkles and more realistic limbs. Code and models are
available for research purposes at https://github.com/psyai-net/D-IF_release
Research Progress in the Regulation Mechanisms of White and Brown Adipose Tissue in the Body by Functionally Active Factors
Brown adipose tissue (BAT) improves the metabolic level of the body by promoting energy expenditure, which can contribute to the prevention and treatment of metabolic diseases such as obesity and diabetes, and BAT has therefore become a new target for the treatment of metabolic diseases. Enhancing BAT activity in the body is a hot topic but also a challenge for researchers, and research on functionally active factors in foods that regulate BAT can help to develop new nutritional activators. In this paper, we summarize the development and thermogenesis of BAT and thermogenesis-related factors, review active ingredients in foods that regulate brown fat and their mechanisms of action, and briefly introduce the effects of white adipose tissue (WAT) and BAT on the body’s health. We also discuss recent developments in understanding the role of BAT in regulating energy metabolic balance and various diseases in the body. We hope that the present review will provide a theoretical basis for the future development of brown adipose nutritional activators and the improvement of individualized healthy dietary management programs in order to prevent and treat various diseases.
Training Stronger Spiking Neural Networks with Biomimetic Adaptive Internal Association Neurons
As the third generation of neural networks, spiking neural networks (SNNs)
are dedicated to exploring more insightful neural mechanisms to achieve
near-biological intelligence. Intuitively, biomimetic mechanisms are crucial to
understanding and improving SNNs. For example, the associative long-term
potentiation (ALTP) phenomenon suggests that in addition to learning mechanisms
between neurons, there are associative effects within neurons. However, most
existing methods only focus on the former and lack exploration of the internal
association effects. In this paper, we propose a novel Adaptive Internal
Association (AIA) neuron model to establish previously ignored influences
within neurons. Consistent with the ALTP phenomenon, the AIA neuron model is
adaptive to input stimuli, and internal associative learning occurs only when
both dendrites are stimulated at the same time. In addition, we employ weighted
weights to measure internal associations and introduce intermediate caches to
reduce the volatility of associations. Extensive experiments on prevailing
neuromorphic datasets show that the proposed method can potentiate or depress
the firing of spikes more specifically, resulting in better performance with
fewer spikes. It is worth noting that without adding any parameters at
inference, the AIA model achieves state-of-the-art performance on
DVS-CIFAR10 (83.9%) and N-CARS (95.64%) datasets.
Comment: Accepted by ICASSP 202
Form-NLU: Dataset for the Form Language Understanding
Compared to general document analysis tasks, form document structure
understanding and retrieval are challenging. Form documents are typically created
by two types of authors: a form designer, who develops the form structure and
keys, and a form user, who fills out form values based on the provided keys.
Hence, the form values may not be aligned with the form designer's intention
(structure and keys) if a form user gets confused. In this paper, we introduce
Form-NLU, the first dataset for form structure understanding and key-value
information extraction that interprets the form designer's intent and
the alignment of user-written values with it. It consists of 857 form images, 6k
form keys and values, and 4k table keys and values. Our dataset also includes
three form types: digital, printed, and handwritten, which cover diverse form
appearances and layouts. We propose a robust positional and logical
relation-based form key-value information extraction framework. Using this
dataset, Form-NLU, we first examine strong object detection models for the form
layout understanding, then evaluate the key information extraction task on the
dataset, providing fine-grained results for different types of forms and keys.
Furthermore, we examine it with an off-the-shelf PDF layout extraction tool
and demonstrate its feasibility in real-world cases.
Comment: Accepted by SIGIR 202
Molecular epidemiology and antimicrobial resistance of outbreaks of Klebsiella pneumoniae clinical mastitis in Chinese dairy farms
Klebsiella pneumoniae is an opportunistic pathogen that causes serious infections in humans and animals. However, the availability of epidemiological information on clinical mastitis due to K. pneumoniae is limited. To acquire new information regarding K. pneumoniae mastitis, data were mined about K. pneumoniae strains on dairy cattle farms (farms A to H) in 7 Chinese provinces in 2021. Hypermucoviscous strains of K. pneumoniae were obtained by the string test. MICs of antimicrobial agents were determined via the broth microdilution method. Ten antimicrobial resistance genes and virulence genes were identified by PCR. The prevalence of K. pneumoniae was 35.91% (65/181), and 100% of the bacteria were sensitive to enrofloxacin. Nine antimicrobial resistance genes and virulence genes were identified and compared among farms. The hypermucoviscous phenotype was present in 94.44% of isolates from farm B, which may be a function of the rmpA virulence gene. Based on these data, the multidrug-resistant strains SD-14 and HB-21 were chosen and sequenced. Genotypes were assayed for K. pneumoniae isolates from different countries and different hosts using multilocus sequence typing (MLST). Ninety-four sequence types (STs) were found, and 6 STs present a risk for spreading in specific regions. Interestingly, ST43 was observed in bovine isolates for the first time. Our study partially reveals the current distribution characteristics of bovine K. pneumoniae in China and may provide a theoretical basis for the prevention and treatment of bovine K. pneumoniae mastitis.