48 research outputs found
ULTRASOUND-BASED STRUCTURAL AND FUNCTIONAL MODIFICATIONS OF OAT PROTEIN: COMPLEMENTARY EFFECTS OF HIGH PRESSURE, CYSTEINE, AND FLAVONOIDS
Oat (Avena sativa L.), a specialty cereal grain belonging to the Poaceae family, is relatively abundant in nutritious protein (12–20%) and dietary fiber (β-glucan). The balanced hydrophilic/hydrophobic amino acid profile in oat protein relative to other plant sources provides numerous possibilities for formulating oat protein isolate (OPI) into a wide range of food products. Yet OPI has received limited research attention, and the application of this valuable protein source is hampered by its low solubility under neutral and mildly acidic conditions. This dissertation research aimed to elucidate the efficacy of high-intensity ultrasound (HIU) for structural modification of OPI and investigate the complementary roles of high pressure, disulfide-breaking agents (mainly cysteine), and flavonoids for the ultimate improvement of OPI functionalities, including solubility, emulsification, and gelation.
OPI (mainly 12S globulins) was isolated from oat groats by alkaline extraction–isoelectric precipitation. Because protein functionality is influenced by environmental factors, such as pH, ionic strength, and temperature, it is essential to understand the surface properties of native protein (charge, polarity, and amphiphilicity). Therefore, Experiment 1 was conducted to examine the solubilization behavior of OPI in relation to surface properties when subjected to varying pH and ionic strength conditions. A characteristic U-shaped solubility curve was observed within pH 2.0–8.0 at low ionic strengths (I < 0.01), and minimum solubility was found at pH 4.0–5.0 and I of 0.03–0.2. Particle size, ζ-potential, and SDS–PAGE patterns supported the solubility profile.
In seeking effective modification strategies, Experiment 2 was carried out to determine the effect of internal disulfide cleavage on OPI solubility and emulsifying activity. Differential scanning calorimetry revealed that disulfide bonds contributed to the remarkably high thermal stability (Tm of 111.1 °C) of OPI but impeded its molecular flexibility and functionality. Hence, cysteine was applied to disrupt inter-subunit S–S bonds of OPI. Cysteine at concentrations higher than 1.0 mM/mg protein induced disulfide cleavage, and the 12S globulin subunit dissociation reached a maximum (80%) at 3.3 to 6.7 mM/mg protein. Correspondingly, emulsions prepared with cysteine-treated OPI showed superior interfacial protein coverage (0.170 m²/g versus 0.092 m²/g protein for the control) and reduced emulsion particle size from 4722 to 2238 nm.
While cysteine was able to modulate OPI emulsifying activity, more robust techniques were necessary to modify protein structure for functionality improvement. Hence, Experiment 3 sought to apply high-intensity ultrasound (HIU) to dissociate 12S globulins and disrupt tertiary structure to overcome the low-solubility constraints. Through cavitation-induced microscopic shearing and pressure, HIU treatment (5 minutes at 70% amplitude) significantly reduced OPI particle size (up to 37%) and increased solubility (up to 48%) at pH 5.5–8. Fluorescence spectrometry and hydrophobicity measurements revealed protein particle dissociation and major conformational changes, with the exposure of previously occluded hydrophilic/hydrophobic groups predisposing OPI to increased interfacial activity.
To enhance the effectiveness of HIU, high pressure (HP) at 30 MPa was applied to complement its effect on modulating OPI functionality (Experiment 4). The HIU+HP combination treatment at 20 and 60 °C further increased the emulsifying activity of OPI, and the resulting emulsions exhibited an additional particle size reduction (from 1653 to 955 nm). Confocal microscopy revealed increased distribution uniformity of oil droplets and improved emulsion stability. Although in theory both HP and the moderate temperature (60 °C) could aid HIU in dissociating protein aggregates, the improvement in emulsifying activity was still dominated by the HIU process.
Based on the above findings, cysteine was reintroduced in Experiment 5 to assist the formation of oat protein-based gel networks after HIU pretreatment. Native OPI samples treated with cysteine alone at up to 400 mg/g protein showed significantly increased gel strength (from 0.0049 to 0.23 N) and decreased cooking loss (by up to 44%). For HIU-treated OPI, the efficacy of cysteine increased three-fold (gel strength improved to 0.68 N). The combination of cysteine and HIU also enhanced the rheological (storage modulus), structural, and water-immobilizing properties of the heat-induced gel.
Polyphenols are naturally present in oat groats and are increasingly utilized to modulate plant protein functionality. Accordingly, Experiment 6 was conducted to investigate the potential collaborative effect of three flavonoids with varying numbers of B-ring hydroxyl groups (kaempferol, quercetin, and myricetin, bearing one, two, and three hydroxyl groups, respectively) combined with HIU for enhancing OPI emulsifying activity. Binding with flavonoids (shown by fluorescence quenching) further increased the hydrophobicity of HIU-treated OPI, with 0.5 mmol/L myricetin producing the maximum effect. The HIU+flavonoid treatment improved emulsion stability, and all flavonoid-treated samples exhibited higher oxidative stability against free radicals.
Overall, this dissertation research demonstrated that natural chemical compounds (cysteine and flavonoids) and a physical treatment (high pressure) acted synergistically with HIU to increase the structural flexibility and improve the amphiphilicity of oat globulins. The resulting physicochemical changes led to enhanced emulsifying and gelling properties, accentuating the potential of these structure-modifying techniques for oat protein.
Memories are One-to-Many Mapping Alleviators in Talking Face Generation
Talking face generation aims at generating photo-realistic video portraits of
a target person driven by input audio. Due to its nature of one-to-many mapping
from the input audio to the output video (e.g., one speech content may have
multiple feasible visual appearances), learning a deterministic mapping like
previous works brings ambiguity during training, and thus causes inferior
visual results. Although this one-to-many mapping could be alleviated in part
by a two-stage framework (i.e., an audio-to-expression model followed by a
neural-rendering model), it is still insufficient since the prediction is
produced without enough information (e.g., emotions, wrinkles, etc.). In this
paper, we propose MemFace to complement the missing information with an
implicit memory and an explicit memory that follow the sense of the two stages
respectively. More specifically, the implicit memory is employed in the
audio-to-expression model to capture high-level semantics in the
audio-expression shared space, while the explicit memory is employed in the
neural-rendering model to help synthesize pixel-level details. Our experimental
results show that our proposed MemFace surpasses all the state-of-the-art
results across multiple scenarios consistently and significantly. Comment: Project page: see https://memoryface.github.i
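The implicit and explicit memories described above both build on the generic mechanism of soft attention over a learned memory bank, which can be sketched as follows. The names and shapes here are illustrative assumptions, not the exact MemFace modules:

```python
import numpy as np

def memory_lookup(query, keys, values):
    """Soft attention over a memory bank: the query retrieves a convex
    combination of stored values, weighted by key similarity.
    query: (D,), keys: (M, D), values: (M, V). Illustrative sketch only."""
    scores = keys @ query                           # similarity to each memory slot
    scores -= scores.max()                          # subtract max for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum() # softmax over the M slots
    return weights @ values                         # weighted read-out of stored values

# With sharply separated keys, the lookup returns (almost) one stored value.
keys = np.eye(3) * 10.0
values = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
out = memory_lookup(np.array([10.0, 0.0, 0.0]), keys, values)
```

In the paper's framing, such a lookup lets the model complement an under-determined prediction (expression or pixel detail) with information retrieved from what it has memorized during training.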
Feature-based encoding of face identity by single neurons in the human medial temporal lobe
Neurons in the human medial temporal lobe (MTL) that are selective for the identity of specific people are classically thought to encode identity invariant to visual features. However, it remains largely unknown how visual information from higher visual cortex is translated into a semantic representation of an individual person. Here, we show that some MTL neurons are selective to multiple different face identities on the basis of shared features that form clusters in the representation of a deep neural network trained to recognize faces. Contrary to prevailing views, we find that these neurons represent an individual’s face with feature-based encoding, rather than through association with concepts. The response of feature neurons depended on neither face identity nor face familiarity, and the region of feature space to which they are tuned predicted their response to new face stimuli. Our results provide critical evidence bridging the perception-driven representation of facial features in the higher visual cortex and the memory-driven representation of semantics in the MTL, which may form the basis for declarative memory.
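The claim that a neuron's tuned region in DNN feature space predicts its response to new faces can be illustrated with a toy tuning model. Gaussian tuning around a cluster center is an assumption made here for illustration, not the fitted model from the study:

```python
import numpy as np

def feature_neuron_response(face_feat, center, radius):
    """Toy model of a feature-tuned MTL neuron: predicted firing falls off
    with distance from a preferred region (center, radius) in the feature
    space of a face-recognition DNN. Gaussian tuning is an illustrative
    assumption, not the paper's analysis."""
    d2 = np.sum((face_feat - center) ** 2)            # squared distance in feature space
    return float(np.exp(-d2 / (2.0 * radius ** 2)))   # peak response of 1.0 at the center

# A face whose DNN features land inside the tuned region drives the
# neuron strongly; a distant face barely drives it at all.
center = np.zeros(4)
near = feature_neuron_response(np.zeros(4), center, 1.0)
far = feature_neuron_response(np.ones(4) * 3.0, center, 1.0)
```

Under such a model, two visually similar but different identities can excite the same neuron, which is the feature-based (rather than concept-based) encoding the abstract argues for.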
Rethinking Range View Representation for LiDAR Segmentation
LiDAR segmentation is crucial for autonomous driving perception. Recent
trends favor point- or voxel-based methods as they often yield better
performance than the traditional range view representation. In this work, we
unveil several key factors in building powerful range view models. We observe
that the "many-to-one" mapping, semantic incoherence, and shape deformation are
possible impediments against effective learning from range view projections. We
present RangeFormer -- a full-cycle framework comprising novel designs across
network architecture, data augmentation, and post-processing -- that better
handles the learning and processing of LiDAR point clouds from the range view.
We further introduce a Scalable Training from Range view (STR) strategy that
trains on arbitrary low-resolution 2D range images, while still maintaining
satisfactory 3D segmentation accuracy. We show that, for the first time, a
range view method is able to surpass the point, voxel, and multi-view fusion
counterparts in the competing LiDAR semantic and panoptic segmentation
benchmarks, i.e., SemanticKITTI, nuScenes, and ScribbleKITTI. Comment: ICCV 2023; 24 pages, 10 figures, 14 tables; Webpage at
https://ldkong.com/RangeForme
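The range view representation the abstract builds on is the standard spherical projection of a LiDAR sweep onto a 2D image, which also makes the "many-to-one" mapping problem concrete: multiple 3D points can land on the same pixel. A minimal sketch, with illustrative 64-beam sensor parameters rather than RangeFormer's exact configuration:

```python
import numpy as np

def to_range_image(points, h=64, w=1024, fov_up=3.0, fov_down=-25.0):
    """Project an (N, 3) point cloud onto an (h, w) range image via
    spherical projection. FOV values are typical for a 64-beam sensor
    and are assumptions for this sketch."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)              # range of each point
    yaw = np.arctan2(y, x)                          # azimuth in [-pi, pi]
    pitch = np.arcsin(z / np.maximum(r, 1e-8))      # elevation angle
    fov_up_r, fov_down_r = np.radians(fov_up), np.radians(fov_down)
    fov = fov_up_r - fov_down_r
    u = 0.5 * (1.0 - yaw / np.pi) * w               # column from azimuth
    v = (1.0 - (pitch - fov_down_r) / fov) * h      # row from elevation
    u = np.clip(np.floor(u), 0, w - 1).astype(int)
    v = np.clip(np.floor(v), 0, h - 1).astype(int)
    img = np.full((h, w), -1.0, dtype=np.float32)   # -1 marks empty pixels
    order = np.argsort(-r)                          # write far points first...
    img[v[order], u[order]] = r[order]              # ...so colliding near points win
    return img
```

The last two lines are where the many-to-one mapping shows up: when several points collide on a pixel, only the nearest survives, which is one of the information losses range view methods must compensate for.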
Robo3D: Towards Robust and Reliable 3D Perception against Corruptions
The robustness of 3D perception systems under natural corruptions from
environments and sensors is pivotal for safety-critical applications. Existing
large-scale 3D perception datasets often contain data that are meticulously
cleaned. Such configurations, however, cannot reflect the reliability of
perception models during the deployment stage. In this work, we present Robo3D,
the first comprehensive benchmark heading toward probing the robustness of 3D
detectors and segmentors under out-of-distribution scenarios against natural
corruptions that occur in real-world environments. Specifically, we consider
eight corruption types stemming from adversarial weather conditions, external
disturbances, and internal sensor failure. We uncover that, although promising
results have been progressively achieved on standard benchmarks,
state-of-the-art 3D perception models are at risk of being vulnerable to
corruptions. We draw key observations on the use of data representations,
augmentation schemes, and training strategies, that could severely affect the
model's performance. To pursue better robustness, we propose a
density-insensitive training framework along with a simple flexible
voxelization strategy to enhance the model resiliency. We hope our benchmark
and approach could inspire future research in designing more robust and
reliable 3D perception models. Our robustness benchmark suite is publicly
available. Comment: 33 pages, 26 figures, 26 tables; code at
https://github.com/ldkong1205/Robo3D; project page at
https://ldkong.com/Robo3
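The kind of corruption such a benchmark injects can be sketched with a toy example: random point dropout (e.g., beam or packet loss) plus Gaussian coordinate jitter (sensor noise). This is only an illustration; Robo3D's eight corruption types are more physically grounded than this sketch:

```python
import numpy as np

def corrupt_cloud(points, drop_ratio=0.2, sigma=0.02, seed=0):
    """Toy point-cloud corruption: drop ~drop_ratio of the points at
    random, then jitter the survivors with Gaussian noise of std sigma
    (meters). Illustrative only, not a Robo3D corruption model."""
    rng = np.random.default_rng(seed)
    keep = rng.random(points.shape[0]) >= drop_ratio   # Bernoulli dropout mask
    kept = points[keep]
    noise = rng.normal(0.0, sigma, kept.shape)         # per-coordinate jitter
    return kept + noise
```

Evaluating a model trained on clean data against such perturbed inputs is the basic protocol for exposing the robustness gap the abstract describes.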
CLIP2Scene: Towards Label-efficient 3D Scene Understanding by CLIP
Contrastive Language-Image Pre-training (CLIP) achieves promising results in
2D zero-shot and few-shot learning. Despite the impressive performance in 2D,
applying CLIP to help the learning in 3D scene understanding has yet to be
explored. In this paper, we make the first attempt to investigate how CLIP
knowledge benefits 3D scene understanding. We propose CLIP2Scene, a simple yet
effective framework that transfers CLIP knowledge from 2D image-text
pre-trained models to a 3D point cloud network. We show that the pre-trained 3D
network yields impressive performance on various downstream tasks, i.e.,
annotation-free and fine-tuning with labelled data for semantic segmentation.
Specifically, built upon CLIP, we design a Semantic-driven Cross-modal
Contrastive Learning framework that pre-trains a 3D network via semantic and
spatial-temporal consistency regularization. For the former, we first leverage
CLIP's text semantics to select the positive and negative point samples and
then employ the contrastive loss to train the 3D network. In terms of the
latter, we force the consistency between the temporally coherent point cloud
features and their corresponding image features. We conduct experiments on
SemanticKITTI, nuScenes, and ScanNet. For the first time, our pre-trained
network achieves annotation-free 3D semantic segmentation with 20.8% and 25.08%
mIoU on nuScenes and ScanNet, respectively. When fine-tuned with 1% or 100%
labelled data, our method significantly outperforms other self-supervised
methods, with improvements of 8% and 1% mIoU, respectively. Furthermore, we
demonstrate the generalizability for handling cross-domain datasets. Code is
publicly available at https://github.com/runnanchen/CLIP2Scene. Comment: CVPR 202
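The semantic branch of the contrastive pre-training described above (text embeddings select positives, a contrastive loss trains the 3D network) can be sketched as a standard InfoNCE objective over point features and per-class text features. This is a minimal hypothetical form, not CLIP2Scene's exact loss:

```python
import numpy as np

def info_nce(point_feats, text_feats, labels, tau=0.07):
    """InfoNCE-style loss: each point feature is pulled toward the text
    embedding of its (pseudo-)class and pushed from the other classes.
    point_feats: (N, D), text_feats: (C, D), labels: (N,). Sketch only."""
    p = point_feats / np.linalg.norm(point_feats, axis=1, keepdims=True)
    t = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    logits = p @ t.T / tau                               # (N, C) cosine similarities / temperature
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()
```

When point features align with their class text embeddings the loss is near zero, and it grows as features drift toward the wrong class, which is the training signal that transfers CLIP's semantics into the 3D network.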
UniSeg: A Unified Multi-Modal LiDAR Segmentation Network and the OpenPCSeg Codebase
Point-, voxel-, and range-views are three representative forms of point
clouds. All of them have accurate 3D measurements but lack color and texture
information. RGB images are a natural complement to these point cloud views and
fully utilizing the comprehensive information of them benefits more robust
perceptions. In this paper, we present a unified multi-modal LiDAR segmentation
network, termed UniSeg, which leverages the information of RGB images and three
views of the point cloud, and accomplishes semantic segmentation and panoptic
segmentation simultaneously. Specifically, we first design the Learnable
cross-Modal Association (LMA) module to automatically fuse voxel-view and
range-view features with image features, which fully utilize the rich semantic
information of images and are robust to calibration errors. Then, the enhanced
voxel-view and range-view features are transformed to the point space, where
three views of point cloud features are further fused adaptively by the
Learnable cross-View Association module (LVA). Notably, UniSeg achieves
promising results in three public benchmarks, i.e., SemanticKITTI, nuScenes,
and Waymo Open Dataset (WOD); it ranks 1st on two challenges of two benchmarks,
including the LiDAR semantic segmentation challenge of nuScenes and panoptic
segmentation challenges of SemanticKITTI. Besides, we construct the OpenPCSeg
codebase, which is the largest and most comprehensive outdoor LiDAR
segmentation codebase. It contains most of the popular outdoor LiDAR
segmentation algorithms and provides reproducible implementations. The
OpenPCSeg codebase will be made publicly available at
https://github.com/PJLab-ADG/PCSeg. Comment: ICCV 2023; 21 pages; 9 figures; 18 tables; Code at
https://github.com/PJLab-ADG/PCSe
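The learnable cross-modal/cross-view fusion idea above can be illustrated with the simplest learnable fusion: a sigmoid gate computed from two aligned feature views decides, per point, how much each source contributes. This generic gated fusion is an assumption for illustration, not UniSeg's actual LMA/LVA modules:

```python
import numpy as np

def gated_fuse(feat_a, feat_b, w_gate):
    """Gated fusion of two aligned feature views (e.g., a voxel-view
    feature and an image feature per point). feat_a, feat_b: (N, D);
    w_gate: (2*D,) learned weights. Illustrative sketch only."""
    cat = np.concatenate([feat_a, feat_b], axis=1)    # (N, 2D) joint features
    gate = 1.0 / (1.0 + np.exp(-(cat @ w_gate)))      # (N,) sigmoid mixing weight
    return gate[:, None] * feat_a + (1.0 - gate[:, None]) * feat_b
```

Because the gate is learned from both inputs, the network can lean on the image features where they are informative and fall back to geometry where calibration or texture is unreliable, which is the robustness property the abstract attributes to its fusion modules.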