Recent progress in mitochondria-targeted drug and drug-free agents for cancer therapy
The mitochondrion is a dynamic eukaryotic organelle that controls lethal and vital functions of the cell. As a critical center of metabolic activity implicated in many diseases, mitochondria have attracted attention as a potential therapeutic target, especially for cancer treatment. Structural and functional differences between healthy and cancerous mitochondria, such as membrane potential, respiratory rate, energy production pathway, and gene mutations, can be exploited to design systems that selectively target cancer mitochondria. A number of mitochondria-targeting compounds, including mitochondria-directed conventional drugs, inhibitors of mitochondrial proteins and metabolism, and mitochondria-targeted photosensitizers, are discussed. Recently, drug-free approaches, such as intramitochondrial aggregation, self-assembly, and biomineralization, have been introduced as alternatives that selectively induce dysfunction in cancer mitochondria. In this review, we discuss recent progress in mitochondria-targeted cancer therapy, from the conventional approach of drug/cytotoxic-agent conjugates to advanced drug-free approaches.
FPANet: Frequency-based Video Demoireing using Frame-level Post Alignment
Interference between overlapping grid patterns creates moire patterns,
degrading the visual quality of an image that captures a screen of a digital
display device by an ordinary digital camera. Removing such moire patterns is
challenging due to their complex patterns of diverse sizes and color
distortions. Existing approaches mainly focus on filtering in the spatial
domain, failing to remove large-scale moire patterns. In this paper, we
propose a novel model called FPANet that learns filters in both frequency and
spatial domains, improving the restoration quality by removing various sizes of
moire patterns. To further enhance restoration quality, our model takes
multiple consecutive frames, learning to extract frame-invariant content
features and to output temporally consistent, higher-quality images. We
demonstrate the effectiveness
of our proposed method with a publicly available large-scale dataset, observing
that ours outperforms the state-of-the-art approaches, including ESDNet,
VDmoire, MBCNN, WDNet, UNet, and DMCNN, in terms of the image and video quality
metrics, such as PSNR, SSIM, LPIPS, FVD, and FSIM.
Array-Based Protein Sensing Using an Aggregation-Induced Emission (AIE) Light-Up Probe
Protein detection and identification are important for the diagnosis of diseases; however, the development of facile sensing probes remains challenging. Here, we present an array-based "turn on" protein-sensing platform capable of detecting and identifying proteins using aggregation-induced emission luminogens (AIEgens). The water-soluble AIEgens, in which fluorescence was initially turned off, showed strong fluorescence in the presence of nanomolar concentrations of proteins via restriction of the intramolecular rotation of the AIEgens. The binding affinities between the AIEgens and proteins were associated with the various chemical functional groups on the AIEgens, resulting in distinct fluorescent-signal outcomes for each protein. The combined fluorescence outputs provided sufficient information to detect and discriminate proteins of interest by linear discriminant analysis. Furthermore, the array-based sensor enabled classification of different concentrations of specific proteins. These results provide novel insight into the use of AIEgens as a new type of sensing probe in array-based systems.
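For readers unfamiliar with the discrimination step, classical linear discriminant analysis reduces to class means plus a pooled within-class covariance. The sketch below, using invented fingerprint data rather than the paper's measurements, shows how an array of per-channel fluorescence readouts can be turned into a protein classifier:

```python
import numpy as np

def fit_lda(X, y):
    """Minimal LDA with equal priors: per-class means plus a pooled,
    lightly regularized within-class covariance."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    d = X.shape[1]
    Sw = np.zeros((d, d))
    for c, mu in zip(classes, means):
        diff = X[y == c] - mu
        Sw += diff.T @ diff
    Sw /= len(X) - len(classes)
    Sw += 1e-6 * np.eye(d)                  # regularize for stability
    return classes, means, np.linalg.inv(Sw)

def predict_lda(model, X):
    classes, means, Sw_inv = model
    # Linear discriminant score per class (equal priors assumed).
    scores = X @ Sw_inv @ means.T - 0.5 * np.sum(means @ Sw_inv * means, axis=1)
    return classes[np.argmax(scores, axis=1)]
```

Each row of `X` plays the role of one sample's fluorescence fingerprint across the AIEgen channels; the discriminant then separates proteins whose fingerprints differ consistently across channels.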
LISA: Localized Image Stylization with Audio via Implicit Neural Representation
We present a novel framework, Localized Image Stylization with Audio (LISA)
which performs audio-driven localized image stylization. Sound often provides
information about the specific context of the scene and is closely related to a
certain part of the scene or object. However, existing image stylization works
have focused on stylizing the entire image using an image or text input.
Stylizing a particular part of the image based on audio input is natural but
challenging. In this work, we propose a framework in which a user provides
an audio input to localize the sound source in the input image and to
locally stylize the target object or scene. LISA first produces a delicate
localization map with an audio-visual localization network by leveraging CLIP
embedding space. We then utilize implicit neural representation (INR) along
with the predicted localization map to stylize the target object or scene based
on sound information. The proposed INR can manipulate the localized pixel
values to be semantically consistent with the provided audio input. Through a
series of experiments, we show that the proposed framework outperforms the
other audio-guided stylization methods. Moreover, LISA constructs concise
localization maps and naturally manipulates the target object or scene in
accordance with the given audio input.
Event Fusion Photometric Stereo Network
We present a novel method to estimate the surface normal of an object in an
ambient light environment using RGB and event cameras. Modern photometric
stereo methods rely on an RGB camera, mainly in a dark room, to avoid ambient
illumination. To alleviate the limitations of the darkroom environment and to
use essential light information, we employ an event camera with a high dynamic
range and low latency. This is the first study that uses an event camera for
the photometric stereo task, which works under continuous light sources and
ambient light environments. In this work, we also curate a novel photometric
stereo dataset that is constructed by capturing objects with event and RGB
cameras under numerous ambient light environments. Additionally, we propose a
novel framework named Event Fusion Photometric Stereo Network~(EFPS-Net), which
estimates the surface normals of an object using both RGB frames and event
signals. Our proposed method interpolates event observation maps, which encode
light information from sparse event signals, to obtain richer light
information. Subsequently, the event-interpolated observation maps are fused
with the RGB observation maps. Our numerous experiments showed that EFPS-Net
outperforms state-of-the-art methods on a dataset captured in the real world
where ambient lights exist. Consequently, we demonstrate that incorporating
additional modalities with EFPS-Net alleviates the limitations caused by
ambient illumination.
Comment: 33 pages, 11 figures
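As background, the classical Lambertian formulation that photometric stereo networks build on solves for albedo-scaled normals by least squares, given per-pixel intensities under known light directions. This minimal numpy sketch illustrates that baseline, not EFPS-Net itself:

```python
import numpy as np

def photometric_stereo_normals(intensities, lights):
    """Classical Lambertian photometric stereo: solve L @ (rho * n) = I
    per pixel by least squares.

    intensities: (m, h, w) images under m lights; lights: (m, 3) unit
    light directions. Returns unit normals (3, h, w) and albedo (h, w)."""
    m, h, w = intensities.shape
    I = intensities.reshape(m, -1)                    # (m, h*w)
    G, *_ = np.linalg.lstsq(lights, I, rcond=None)    # (3, h*w), rho*n
    rho = np.linalg.norm(G, axis=0)                   # albedo magnitude
    normals = G / np.maximum(rho, 1e-12)              # normalize to unit n
    return normals.reshape(3, h, w), rho.reshape(h, w)
```

The least-squares solve assumes shadow-free, noise-free Lambertian shading under point sources; ambient illumination violates exactly these assumptions, which motivates fusing event-camera observations as in the paper.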
ORA3D: Overlap Region Aware Multi-view 3D Object Detection
Current multi-view 3D object detection methods often fail to detect objects
in the overlap region properly, and the networks' understanding of the scene is
often limited to that of a monocular detection network. Moreover, objects in
the overlap region are often largely occluded or suffer from deformation due to
camera distortion, causing a domain shift. To mitigate this issue, we propose
using the following two main modules: (1) Stereo Disparity Estimation for Weak
Depth Supervision and (2) Adversarial Overlap Region Discriminator. The former
utilizes the traditional stereo disparity estimation method to obtain reliable
disparity information from the overlap region. Given the disparity estimates as
supervision, we propose regularizing the network to fully utilize the geometric
potential of binocular images and improve the overall detection accuracy
accordingly. Further, the latter module minimizes the representational gap
between non-overlap and overlapping regions. We demonstrate the effectiveness
of the proposed method with the nuScenes large-scale multi-view 3D object
detection data. Our experiments show that our proposed method outperforms
current state-of-the-art models, i.e., DETR3D and BEVDet.
Comment: BMVC202
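The "traditional stereo disparity estimation method" the first module relies on can be as simple as winner-take-all block matching. The sketch below is illustrative of that classical family, not the paper's exact implementation:

```python
import numpy as np

def block_matching_disparity(left, right, max_disp=16, patch=3):
    """Winner-take-all block matching with a sum-of-absolute-differences
    cost, over grayscale float images of equal size."""
    h, w = left.shape
    r = patch // 2
    L = np.pad(left, r, mode="edge")
    R = np.pad(right, r, mode="edge")
    disparity = np.zeros((h, w), dtype=np.int64)
    for y in range(h):
        for x in range(w):
            ref = L[y:y + patch, x:x + patch]
            best_cost, best_d = np.inf, 0
            # Candidate matches shifted leftward in the right image.
            for d in range(min(max_disp, x) + 1):
                cand = R[y:y + patch, x - d:x - d + patch]
                cost = np.abs(ref - cand).sum()
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disparity[y, x] = best_d
    return disparity
```

Even a crude map like this carries metric depth cues (disparity is inversely proportional to depth), which is why it can serve as weak supervision for a detection network rather than as a final depth estimate.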
The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion
In recent years, video generation has become a prominent generative tool and
has drawn significant attention. However, audio-to-video generation has
received little consideration, even though audio contains unique qualities
like temporal semantics and magnitude. Hence, we propose The Power of Sound
(TPoS) model to
incorporate audio input that includes both changeable temporal semantics and
magnitude. To generate video frames, TPoS utilizes a latent stable diffusion
model with textual semantic information, which is then guided by the sequential
audio embedding from our pretrained Audio Encoder. As a result, this method
produces audio-reactive video content. We demonstrate the effectiveness of
TPoS across various tasks and compare its results with current state-of-the-art
techniques in the field of audio-to-video generation. More examples are
available at https://ku-vai.github.io/TPoS/
Comment: ICCV202