SVCNet: Scribble-based Video Colorization Network with Temporal Aggregation
In this paper, we propose a scribble-based video colorization network with
temporal aggregation called SVCNet. It can colorize monochrome videos based on
different user-given color scribbles. It addresses three common issues in the
scribble-based video colorization area: colorization vividness, temporal
consistency, and color bleeding. To improve the colorization quality and
strengthen the temporal consistency, we adopt two sequential sub-networks in
SVCNet for precise colorization and temporal smoothing, respectively. The first
stage includes a pyramid feature encoder to incorporate color scribbles with a
grayscale frame, and a semantic feature encoder to extract semantics. The
second stage finetunes the output from the first stage by aggregating the
information of neighboring colorized frames (as short-range connections) and
the first colorized frame (as a long-range connection). To alleviate the color
bleeding artifacts, we learn video colorization and segmentation
simultaneously. Furthermore, we set the majority of operations on a fixed small
image resolution and use a Super-resolution Module at the tail of SVCNet to
recover original sizes. It allows the SVCNet to fit different image resolutions
at the inference. Finally, we evaluate the proposed SVCNet on DAVIS and Videvo
benchmarks. The experimental results demonstrate that SVCNet produces both
higher-quality and more temporally consistent videos than other well-known
video colorization approaches. The code and models are available at https://github.com/zhaoyuzhi/SVCNet.
Comment: accepted by IEEE Transactions on Image Processing (TIP)
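The fixed-working-resolution design described above can be sketched as follows. This is a minimal illustration, not SVCNet's implementation: `colorize_small` stands in for the trained colorization sub-networks, and a naive nearest-neighbor upscale is used as a placeholder for the learned Super-resolution Module.

```python
import numpy as np

def nn_resize(img, size):
    """Nearest-neighbor resize of a 2-D array to (height, width)."""
    h, w = size
    ys = np.minimum((np.arange(h) * img.shape[0]) // h, img.shape[0] - 1)
    xs = np.minimum((np.arange(w) * img.shape[1]) // w, img.shape[1] - 1)
    return img[ys][:, xs]

def colorize_any_resolution(gray, colorize_small, work_size=(64, 64)):
    """Run a colorizer at a fixed small working resolution, then restore
    the original size. The upscale here is a nearest-neighbor stand-in
    for a super-resolution tail."""
    small = nn_resize(gray, work_size)
    colored = colorize_small(small)  # expected shape (work_h, work_w, 3)
    return np.stack(
        [nn_resize(colored[..., c], gray.shape) for c in range(3)], axis=-1
    )
```

Because the heavy work always happens at `work_size`, the same network weights serve inputs of any resolution; only the cheap resize steps depend on the input shape.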
VCGAN: Video Colorization with Hybrid Generative Adversarial Network
We propose the Video Colorization with Hybrid Generative Adversarial Network (VCGAN), an improved end-to-end recurrent approach to video colorization. VCGAN addresses two prevalent issues in the video colorization domain: temporal consistency and the unification of the colorization and refinement networks into a single architecture. To enhance colorization quality and spatiotemporal consistency, the main branch of the VCGAN generator is assisted by two additional networks: a global feature extractor and a placeholder feature extractor. The global feature extractor encodes the global semantics of the grayscale input to improve colorization quality, whereas the placeholder feature extractor acts as a feedback connection that encodes the semantics of the previously colorized frame to maintain spatiotemporal consistency. If the input to the placeholder feature extractor is replaced with the grayscale input itself, the hybrid VCGAN can also perform image colorization. To improve the consistency of distant frames, we propose a dense long-term loss that smooths the temporal disparity between every pair of remote frames. Trained jointly with colorization and temporal losses, VCGAN strikes a good balance between color vividness and video continuity. Experimental results demonstrate that VCGAN produces higher-quality
and more temporally consistent colorized videos than existing approaches.
Comment: submitted as a major revision manuscript to IEEE Transactions on Multimedia (TMM)
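The dense long-term loss described above can be illustrated with a minimal sketch: here it is simply the mean L1 disparity over every pair of frames in a clip, including remote pairs. The flow-based alignment a full video loss would apply before comparing remote frames is omitted for brevity, so this is a simplified reading, not the paper's exact loss.

```python
import numpy as np

def dense_long_term_loss(frames):
    """Mean L1 disparity over every (i, j) pair of frames in a clip,
    including remote pairs - a simplified sketch of a dense long-term
    temporal loss. Frames are arrays of identical shape."""
    total, pairs = 0.0, 0
    for i in range(len(frames)):
        for j in range(i + 1, len(frames)):
            total += float(np.abs(frames[i] - frames[j]).mean())
            pairs += 1
    return total / pairs
```

The key design point is density: with T frames, all T(T-1)/2 pairs contribute, so even the first and last frames of a clip are directly pulled toward consistent colors rather than relying on a chain of neighbor-to-neighbor constraints.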
Characterization of Non-heading Mutation in Heading Chinese Cabbage (Brassica rapa L. ssp. pekinensis)
Heading is a key agronomic trait of Chinese cabbage. A non-heading mutant with flat growth of heading leaves (fg-1) was isolated from an EMS-induced mutant population of the heading Chinese cabbage inbred line A03. In fg-1 mutant plants, the heading leaves are flat, similar to rosette leaves. The epidermal cells on the adaxial surface of these leaves are significantly smaller, while those on the abaxial surface are much larger, than in A03 plants. The segregation of the heading phenotype in the F2 and BC1 populations suggests that the mutant trait is controlled by a pair of recessive alleles. Phytohormone analysis at the early heading stage showed significant decreases in IAA, ABA, JA, and SA, with increases in methyl IAA and trans-zeatin levels, suggesting that these hormones may coordinate leaf adaxial-abaxial polarity, development, and morphology in fg-1. RNA-sequencing analysis at the early heading stage showed a decrease in the expression levels of several auxin transport (BrAUX1, BrLAXs, and BrPINs) and responsive genes. Transcript levels of important ABA-responsive genes, including BrABF3, were up-regulated in mid-leaf sections, suggesting that both the auxin and ABA signaling pathways play important roles in regulating leaf heading. In addition, a significant reduction in BrIAMT1 transcripts in fg-1 might contribute to leaf epinastic growth. The expression profiles of 19 genes with known roles in leaf polarity were significantly different in fg-1 leaves compared to wild type, suggesting that these genes might also regulate leaf heading in Chinese cabbage. In conclusion, leaf heading in Chinese cabbage is controlled through a complex network of hormone signaling and abaxial-adaxial patterning pathways. These findings increase our understanding of the molecular basis of head formation in Chinese cabbage.
Applying generative AI with retrieval augmented generation to summarize and extract key clinical information from electronic health records
Background: Malnutrition is a prevalent issue in residential aged care facilities (RACFs), leading to adverse health outcomes. The ability to efficiently extract key clinical information from the large volume of data in electronic health records (EHRs) can improve understanding of the extent of the problem and support the development of effective interventions. This research aimed to test the efficacy of zero-shot prompt engineering applied to generative artificial intelligence (AI) models, on their own and in combination with retrieval augmented generation (RAG), for automating the tasks of summarizing both structured and unstructured EHR data and extracting important malnutrition information. Methodology: We utilized the Llama 2 13B model with zero-shot prompting. The dataset comprises unstructured and structured EHRs related to malnutrition management in 40 Australian RACFs. We first applied zero-shot prompting to the model alone, then combined it with RAG, to accomplish two tasks: generating structured summaries of a client's nutritional status and extracting key information about malnutrition risk factors. We utilized 25 notes for the first task and 1,399 for the second. We evaluated the model's output for each task manually against a gold-standard dataset. Results: The evaluation outcomes indicated that zero-shot learning applied to a generative AI model is highly effective in summarizing and extracting information about the nutritional status of RACF clients. The generated summaries provided a concise and accurate representation of the original data, with an overall accuracy of 93.25%. The addition of RAG improved the summarization process, leading to a 6% increase and an accuracy of 99.25%. The model also proved capable of extracting risk factors with an accuracy of 90%. However, adding RAG did not further improve accuracy in this task.
Overall, the model showed robust performance when information was explicitly stated in the notes; however, it could encounter hallucination limitations, particularly when details were not explicitly provided. Conclusion: This study demonstrates the high performance, and the limitations, of applying zero-shot learning to generative AI models for automatically generating structured summaries of EHR data and extracting key clinical information. The inclusion of the RAG approach improved model performance and mitigated the hallucination problem.
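The retrieve-then-prompt pattern behind RAG can be sketched with a toy bag-of-words retriever. The study itself used the Llama 2 13B model over real EHR notes; the scoring function, note texts, and prompt template below are illustrative assumptions, not the paper's implementation.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def retrieve(query, notes, k=2):
    """Return the k notes most similar to the query (bag-of-words)."""
    q = Counter(query.lower().split())
    return sorted(notes,
                  key=lambda n: cosine(q, Counter(n.lower().split())),
                  reverse=True)[:k]

def build_prompt(task, notes, k=2):
    """Prepend the retrieved notes as context before the task, so the
    generative model answers from retrieved evidence rather than from
    its parametric memory alone."""
    context = "\n".join(retrieve(task, notes, k))
    return f"Context:\n{context}\n\nTask: {task}"
```

Grounding the prompt in retrieved notes is what reduces hallucination: the model is asked to summarize text it was just shown, rather than to recall facts, which matches the accuracy gain the study reports for the summarization task.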
Dark Current Noise Correction Method Based on Dark Pixels for LWIR QWIP Detection Systems
The long-wave infrared (LWIR) quantum-well photodetector (QWIP) operates at low temperatures but is prone to focal plane temperature changes when imaging in complex thermal environments. This causes dark current changes and generates low-frequency temporal dark current noise. To address this, a dark current noise correction method based on dark pixels is proposed. First, dark pixels were constructed in a QWIP system and the response components of imaging pixels and dark pixels were analyzed. Next, the feature data of dark pixels and imaging pixels were collected and preprocessed, after which a recurrent neural network (RNN) was used to fit the dark current response model. Target data were then collected and input into the dark current response model to obtain dark level correction values and correct the original data. Finally, after calculation and correction, temporal noise was reduced by 49.02% on average. The proposed method uses the characteristics of dark pixels to reduce dark current temporal noise, which is difficult to achieve with conventional radiation calibration; this is helpful in promoting the application of QWIPs in LWIR remote sensing.
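The final correction step can be sketched as follows. The linear stand-in model below replaces the trained RNN, and its gain and offset parameters are purely illustrative, not fitted values from the paper.

```python
def linear_dark_model(features, gain=0.5, offset=1.0):
    """Toy stand-in for the trained RNN: predicts the dark level as a
    linear function of the mean dark-pixel response. Gain and offset
    are illustrative assumptions."""
    return gain * (sum(features) / len(features)) + offset

def correct_frame(raw, predict_dark_level, dark_features):
    """Subtract the predicted dark level from every imaging pixel to
    remove the low-frequency dark-current component of the signal."""
    dark = predict_dark_level(dark_features)
    return [[p - dark for p in row] for row in raw]
```

The point of the dark pixels is that they see the same focal-plane temperature drift as the imaging pixels but no scene radiation, so their response isolates the dark-current term that the model learns to predict and subtract.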
A Five-Step Workflow to Manually Annotate Unstructured Data into Training Dataset for Natural Language Processing
Natural Language Processing (NLP) is a powerful technique for extracting valuable information from unstructured electronic health records (EHRs). However, a prerequisite for NLP is the availability of high-quality annotated datasets. To date, there is a lack of effective methods to guide the research effort of manually annotating unstructured datasets, which can hinder NLP performance. Therefore, this study develops a five-step workflow for manually annotating unstructured datasets: (1) annotator training and familiarisation with the text corpus, (2) vocabulary identification, (3) annotation schema development, (4) annotation execution, and (5) result validation. This framework was then applied to annotate agitation symptoms in the unstructured EHRs of 40 Australian residential aged care facilities. The annotated corpus achieved an accuracy rate of 96%. This suggests that our proposed annotation workflow can be used in manual data processing to develop annotated training corpora for NLP algorithms.
Extracting Symptoms of Agitation in Dementia from Free-Text Nursing Notes Using Advanced Natural Language Processing
Nursing staff record observations about older people under their care in free-text nursing notes. These notes contain older people's care needs, disease symptoms, frequency of symptom occurrence, nursing actions, etc. It is therefore vital to develop a technique to uncover important data from these notes. This study developed and evaluated a deep learning and transfer learning-based named entity recognition (NER) model for extracting symptoms of agitation in dementia from nursing notes. We employed a Clinical BioBERT model for word embedding. We then applied bidirectional long short-term memory (BiLSTM) and conditional random field (CRF) models for NER on nursing notes from Australian residential aged care facilities. The proposed NER model achieves satisfactory performance in extracting symptoms of agitation in dementia, with a 75% F1 score and 78% accuracy. We will further develop machine learning models to recommend the optimal nursing actions to manage agitation.
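After a tagger such as a BiLSTM-CRF assigns a label to each token, entity spans are typically recovered from the tag sequence. The sketch below assumes a standard BIO scheme with a hypothetical B-AGI/I-AGI label for agitation symptoms; the abstract does not specify the actual tag set.

```python
def extract_entities(tokens, tags):
    """Collect (label, text) spans from a BIO tag sequence - the usual
    post-processing step after sequence labeling. A span starts at a
    B- tag and extends through matching I- tags."""
    spans, current, label = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                spans.append((label, " ".join(current)))
            current, label = [tok], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == label:
            current.append(tok)
        else:
            if current:
                spans.append((label, " ".join(current)))
            current, label = [], None
    if current:
        spans.append((label, " ".join(current)))
    return spans
```

The CRF layer's job upstream of this step is to keep tag sequences well-formed (e.g., no I-AGI immediately after O), which is why this decoding loop can stay simple.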
Spatial context-aware network for salient object detection
Salient Object Detection (SOD) is a fundamental problem in the field of computer vision. This paper presents a novel Spatial Context-Aware Network (SCA-Net) for SOD in images. Compared with other recent deep learning based SOD algorithms, SCA-Net can more effectively aggregate multi-level deep features. A Long-Path Context Module (LPCM) is employed to grant better discrimination ability to feature maps that incorporate coarse global information. Consequently, a more accurate initial saliency map can be obtained to facilitate subsequent predictions. SCA-Net also adopts a Short-Path Context Module (SPCM) to progressively enforce the interaction between local contextual cues and global features. Extensive experiments on five large-scale benchmarks demonstrate that SCA-Net achieves favorable performance against very recent state-of-the-art algorithms
Machine Learning Model to Extract Malnutrition Data from Nursing Notes
Malnutrition is a severe health problem that is prevalent among older people residing in residential aged care facilities. Recent advancements in machine learning have made it possible to extract key insights from electronic health records. To date, few researchers have applied these techniques to classify nursing notes automatically. Therefore, we propose a model based on ClinicalBioBert to identify malnutrition notes. We evaluated our approach against two mainstream approaches. Our approach had the highest F1-score, 0.90.