19 research outputs found

DEFormer: DCT-driven Enhancement Transformer for Low-light Image and Dark Vision

Author: Gao Xin
Ju Ran
Sun Xiao
Yin Xiangchen
Yu Zhenda
Zhang Xinyu
Publication date: 13/09/2023

The goal of low-light image enhancement is to restore the color and details of the image, which is of great significance for high-level visual tasks in autonomous driving. However, it is difficult to restore the lost details in dark areas by relying only on the RGB domain. In this paper we introduce frequency as a new clue into the network and propose a novel DCT-driven enhancement transformer (DEFormer). First, we propose a learnable frequency branch (LFB) for frequency enhancement, which contains DCT processing and curvature-based frequency enhancement (CFE). CFE calculates the curvature of each channel to represent the detail richness of different frequency bands; we then divide the frequency features so that the network focuses on frequency bands with richer textures. In addition, we propose a cross domain fusion (CDF) for reducing the differences between the RGB domain and the frequency domain. We also adopt DEFormer as a preprocessing step in dark detection, where it effectively improves the performance of the detector, bringing 2.1% and 3.4% mAP improvements on the ExDark and DARK FACE datasets, respectively. Comment: submit to ICRA202

arXiv.org e-Print Archive
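
The DEFormer abstract above relies on DCT processing to move image features into the frequency domain. As a rough illustration of that one ingredient only, here is a minimal sketch of an orthonormal 2D DCT-II and its inverse on a small block, written in plain Python. The function names (`dct_1d`, `dct_2d`, and so on) are hypothetical, and the block is not the paper's learnable frequency branch or its curvature-based enhancement; it only shows the underlying transform.

```python
import math


def dct_1d(signal):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(signal)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(x * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i, x in enumerate(signal))
        out.append(scale * total)
    return out


def idct_1d(coeffs):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        out.append(total)
    return out


def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(r) for r in zip(*m)]


def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(c) for c in transpose(rows)]
    return transpose(cols)


def idct_2d(coeffs):
    """Inverse of dct_2d: undo the column transform, then the row transform."""
    cols = [idct_1d(c) for c in transpose(coeffs)]
    return [idct_1d(r) for r in transpose(cols)]
```

For a flat block all energy ends up in the single lowest-frequency coefficient, while textured blocks spread energy into higher-frequency coefficients, which is the kind of band-wise detail information a frequency branch could exploit.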

DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior

Author: Liu Wen
Liu Yebin
Shao Ruizhi
Sun Jingxiang
Wang Lizhen
Xie Zhenda
Zhang Bo
Publication date: 26/10/2023

We present DreamCraft3D, a hierarchical 3D content generation method that produces high-fidelity and coherent 3D objects. We tackle the problem by leveraging a 2D reference image to guide the stages of geometry sculpting and texture boosting. A central focus of this work is to address the consistency issue that existing works encounter. To sculpt geometries that render coherently, we perform score distillation sampling via a view-dependent diffusion model. This 3D prior, alongside several training strategies, prioritizes the geometry consistency but compromises the texture fidelity. We further propose Bootstrapped Score Distillation to specifically boost the texture. We train a personalized diffusion model, Dreambooth, on the augmented renderings of the scene, imbuing it with 3D knowledge of the scene being optimized. The score distillation from this 3D-aware diffusion prior provides view-consistent guidance for the scene. Notably, through an alternating optimization of the diffusion prior and 3D scene representation, we achieve mutually reinforcing improvements: the optimized 3D scene aids in training the scene-specific diffusion model, which offers increasingly view-consistent guidance for 3D optimization. The optimization is thus bootstrapped and leads to substantial texture boosting. With tailored 3D priors throughout the hierarchical generation, DreamCraft3D generates coherent 3D objects with photorealistic renderings, advancing the state-of-the-art in 3D content generation. Code available at https://github.com/deepseek-ai/DreamCraft3D. Comment: Project Page: https://mrtornado24.github.io/DreamCraft3D

arXiv.org e-Print Archive

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Author: Deng Chengqi
Dong Kai
Li Zhuoshu
Liu Bo
Liu Wen
Lu Haoyu
Ren Tongzheng
Ruan Chong
Sun Jingxiang
Sun Yaofeng
Wang Bingxuan
Xie Zhenda
Xu Hanwei
Yang Hao
Zhang Bo
Publication date: 11/03/2024

We present DeepSeek-VL, an open-source Vision-Language (VL) Model designed for real-world vision and language understanding applications. Our approach is structured around three key dimensions: We strive to ensure our data is diverse, scalable, and extensively covers real-world scenarios including web screenshots, PDFs, OCR, charts, and knowledge-based content, aiming for a comprehensive representation of practical contexts. Further, we create a use case taxonomy from real user scenarios and construct an instruction tuning dataset accordingly. The fine-tuning with this dataset substantially improves the model's user experience in practical applications. Considering efficiency and the demands of most real-world scenarios, DeepSeek-VL incorporates a hybrid vision encoder that efficiently processes high-resolution images (1024 x 1024), while maintaining a relatively low computational overhead. This design choice ensures the model's ability to capture critical semantic and detailed information across various visual tasks. We posit that a proficient Vision-Language Model should, foremost, possess strong language abilities. To ensure the preservation of LLM capabilities during pretraining, we investigate an effective VL pretraining strategy by integrating LLM training from the beginning and carefully managing the competitive dynamics observed between vision and language modalities. The DeepSeek-VL family (both 1.3B and 7B models) showcases superior user experiences as a vision-language chatbot in real-world applications, achieving state-of-the-art or competitive performance across a wide range of visual-language benchmarks at the same model size while maintaining robust performance on language-centric benchmarks. We have made both 1.3B and 7B models publicly accessible to foster innovations based on this foundation model. Comment: https://github.com/deepseek-ai/DeepSeek-V

arXiv.org e-Print Archive

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

Author: Bi Xiao
Chen Deli
Chen Guanting
Chen Shanhuang
Dai Damai
DeepSeek-AI
Deng Chengqi
Ding Honghui
Dong Kai
Du Qiushi
Fu Zhe
Gao Huazuo
Gao Kaige
Gao Wenjun
Ge Ruiqi
Guan Kang
Guo Daya
Guo Jianzhong
Hao Guangbo
Hao Zhewen
He Ying
Hu Wenjie
Huang Panpan
Li Erhang
Li Guowei
Li Jiashi
Li Y. K.
Li Yao
Liang Wenfeng
Lin Fangyun
Liu A. X.
Liu Bo
Liu Wen
Liu Xiaodong
Liu Xin
Liu Yiyuan
Lu Haoyu
Lu Shanghao
Luo Fuli
Ma Shirong
Nie Xiaotao
Pei Tian
Piao Yishi
Qiu Junjie
Qu Hui
Ren Tongzheng
Ren Zehui
Ruan Chong
Sha Zhangli
Shao Zhihong
Song Junxiao
Su Xuecheng
Sun Jingxiang
Sun Yaofeng
Tang Minghui
Wang Bingxuan
Wang Peiyi
Wang Shiyu
Wang Yaohui
Wang Yongji
Wu Tong
Wu Y.
Xie Xin
Xie Zhenda
Xie Ziwei
Xiong Yiliang
Xu Hanwei
Xu R. X.
Xu Yanhong
Yang Dejian
You Yuxiang
Yu Shuiping
Yu Xingkai
Zhang B.
Zhang Haowei
Zhang Lecong
Zhang Liyue
Zhang Mingchuan
Zhang Minghua
Zhang Wentao
Zhang Yichao
Zhao Chenggang
Zhao Yao
Zhou Shangyan
Zhou Shunfeng
Zhu Qihao
Zou Yuheng
Publication date: 05/01/2024

The rapid development of open-source large language models (LLMs) has been truly remarkable. However, the scaling law described in previous literature presents varying conclusions, which casts a dark cloud over scaling LLMs. We delve into the study of scaling laws and present our distinctive findings that facilitate scaling of large-scale models in two commonly used open-source configurations, 7B and 67B. Guided by the scaling laws, we introduce DeepSeek LLM, a project dedicated to advancing open-source language models with a long-term perspective. To support the pre-training phase, we have developed a dataset that currently consists of 2 trillion tokens and is continuously expanding. We further conduct supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) on DeepSeek LLM Base models, resulting in the creation of DeepSeek Chat models. Our evaluation results demonstrate that DeepSeek LLM 67B surpasses LLaMA-2 70B on various benchmarks, particularly in the domains of code, mathematics, and reasoning. Furthermore, open-ended evaluations reveal that DeepSeek LLM 67B Chat exhibits superior performance compared to GPT-3.5.

arXiv.org e-Print Archive

Redundancy Analysis of Capacitance Data of a Coplanar Electrode Array for Fast and Stable Imaging Processing

Author: Dongtao Sun
Yintang Wen
Yuyan Zhang
Zhenda Zhang
Publication venue: MDPI AG
Publication date: 01/12/2017

A coplanar electrode array sensor is established for the imaging of composite-material adhesive-layer defect detection. The sensor is based on the capacitive edge effect, which leads to capacitance data being considerably weak and susceptible to environmental noise. The inverse problem of coplanar array electrical capacitance tomography (C-ECT) is ill-conditioned: a small error in the capacitance data can seriously affect the quality of reconstructed images. In order to achieve a stable image reconstruction process, a redundancy analysis method for capacitance data is proposed. The proposed method is based on contribution rate and anti-interference capability. According to the redundancy analysis, the capacitance data are divided into valid and invalid data. When the image is reconstructed from valid data, the sensitivity matrix needs to be changed accordingly. In order to evaluate the effectiveness of the sensitivity map, singular value decomposition (SVD) is used. Finally, the two-dimensional (2D) and three-dimensional (3D) images are reconstructed by the Tikhonov regularization method. Comparison with images reconstructed from the raw capacitance data shows that the stability of the image reconstruction process is improved and the quality of the reconstructed images is not degraded. As a result, much of the invalid data need not be collected, and the data acquisition time can also be reduced.

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals
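
The C-ECT abstract above names Tikhonov regularization as the reconstruction method for an ill-conditioned inverse problem. A minimal sketch of the standard form, g = (SᵀS + λI)⁻¹ Sᵀc, is given below in plain Python. The symbols S (sensitivity matrix), c (capacitance vector), g (reconstructed image vector) and λ (regularization weight) are assumed names, and the tiny matrix in the usage note is made up for illustration; none of it is taken from the paper.

```python
def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(r) for r in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes `a` is square and non-singular (no zero-pivot handling).
    """
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= f * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def tikhonov(S, c, lam):
    """Return g minimizing ||S g - c||^2 + lam * ||g||^2."""
    St = transpose(S)
    A = matmul(St, S)
    for i in range(len(A)):
        A[i][i] += lam  # add lam * I to the normal matrix
    rhs = [sum(St[i][k] * c[k] for k in range(len(c)))
           for i in range(len(St))]
    return solve(A, rhs)
```

With a nearly collinear matrix such as `S = [[1, 1], [1, 1.001], [1, 0.999]]` and slightly perturbed data, `tikhonov(S, c, 0.0)` swings far from the true solution, while a small positive `lam` keeps the estimate bounded at the cost of some bias. That trade-off is what makes the choice of λ matter in practice.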

Cellular alterations and crosstalk in the osteochondral joint in osteoarthritis and promising therapeutic strategies

Author: Jiang Ai
Leng Huijie
Li Weishi
Song Chunli
Sun Shang
Tan Qizhao
Xu Peng
Zhao Zhenda
Publication venue: Informa UK Limited
Publication date: 01/01/2021

Osteoarthritis (OA) is a joint disorder involving cartilage degeneration and subchondral bone sclerosis. The bone-cartilage interface is implicated in OA pathogenesis due to its susceptibility to mechanical and biological factors. The crosstalk between cartilage and the underlying subchondral bone is elevated in OA due to multiple factors, such as increased vascularization, porosity, microcracks and fissures. Changes in the osteochondral joint are traceable to alterations in chondrocytes and bone cells (osteoblasts, osteocytes and osteoclasts). The phenotypes of these cells can change with the progression of OA. Aberrant intercellular communications among bone cells and between bone cells and chondrocytes are of great importance and might be factors in OA development. An appreciation of cellular phenotypic changes in OA and the mechanisms by which these cells communicate would be expected to lead to the development of targeted drugs with fewer side effects.

of Botany, Chinese Academy of Sciences

Microscale Corrosion Inhibition Behavior of Four Corrosion Inhibitors (BTA, MBI, MBT, and MBO) on Archeological Silver Artifacts Based on Scanning Electrochemical Cell Microscopy

Author: Dongbo Hu
Gang Hu
Meiqin Zhang
Pei Hu
Shengyu Liu
Siyuan Sun
Xiangyu Sun
Zhenda Xie
Publication date: 15/09/2023

The problem of corrosion-induced discoloration and embrittlement in silverware is a significant concern for the long-term preservation of excavated archeological silver artifacts, even after thermal restoration. The key to addressing this issue lies in the meticulous selection and evaluation of corrosion inhibitors that possess targeted corrosion inhibition capabilities. This study focuses on the evaluation of corrosion inhibitors for archeological silver artifacts using scanning electrochemical cell microscopy (SECCM) and X-ray photoelectron spectroscopy (XPS). The researchers aimed to compare the inhibition effects of four corrosion inhibitors [1,2,3-benzotriazole (BTA), 2-mercaptobenzimidazole (MBI), 2-mercaptobenzothiazole (MBT), and 2-mercaptobenzoxazole (MBO)] on a simulated Ag–Cu alloy sample and understand their mechanisms. The results showed that MBT exhibited better corrosion inhibition for microstructural regions with higher silver content due to its ability to form stable chelation structures with Ag(I). MBO exhibited better corrosion inhibition for microstructural regions with higher copper content due to its strong affinity with Cu(I). The targeted corrosion inhibition ability for the β-phase was ranked as MBO > BTA ≈ MBI > MBT, while for the α-phase the ranking was MBT > MBO > MBI > BTA. The study demonstrated the feasibility and capabilities of SECCM in the targeted screening of corrosion inhibitors for different compositions and microstructural regions in archeological metal artifacts. This study highlights the potential of SECCM in corrosion inhibitor research for archeological metal artifacts and wider applications in metal material corrosion protection.

The Francis Crick Institute

Metallurgically lithiated SiOx anode with high capacity and ambient air compatibility

Author: Lee Hyun-Wook
Lin Dingchang
Liu Wei
Liu Yayuan
Lu Zhenda
Sun Jie
Yan Kai
Yi Cui
Zhao Jie
Zhou Guangmin
Publication venue: National Academy of Sciences
Publication date: 16/06/2016

A common issue plaguing battery anodes is the large consumption of lithium in the initial cycle as a result of the formation of a solid electrolyte interphase followed by gradual loss in subsequent cycles. It presents a need for prelithiation to compensate for the loss. However, anode prelithiation faces the challenge of high chemical reactivity because of the low anode potential. Previous efforts have produced prelithiated Si nanoparticles with dry air stability, which cannot be stabilized under ambient air. Here, we developed a one-pot metallurgical process to synthesize LixSi/Li2O composites by using low-cost SiO or SiO2 as the starting material. The resulting composites consist of homogeneously dispersed LixSi nanodomains embedded in a highly crystalline Li2O matrix, providing the composite excellent stability even in ambient air with 40% relative humidity. The composites are readily mixed with various anode materials to achieve high first cycle Coulombic efficiency (CE) of >100% or serve as an excellent anode material by itself with stable cyclability and consistently high CEs (99.81% at the seventh cycle and ~99.87% for subsequent cycles). Therefore, LixSi/Li2O composites achieved balanced reactivity and stability, promising a significant boost to lithium ion batteries.

PubMed Central

ScholarWorks@UNIST

Arbitrary coherent distributions in a programmable quantum walk

Author: Chang-Wei Sun
Heng Zhou
Jian Guo
Ping Xu
Ran Yang
Rong Zhang
Shi-Ning Zhu
Yan-Xiao Gong
Yi-Chen Liu
Zhenda Xie
Publication venue: American Physical Society
Publication date: 19/02/2022

The coherent superposition of position states in a quantum walk (QW) can be precisely engineered towards the desired distributions to meet the needs of quantum information applications. The coherent distribution can make full use of quantum parallelism in computation and simulation. In particular, the uniform superposition provides robust nonlocality, which has wide applications such as the generation of genuine multibit random numbers without postprocessing. We experimentally demonstrate that rich dynamics featuring arbitrary coherent distributions can be obtained by introducing different sets of time- and position-dependent operations. Such a QW is realized by a resource-constant and flexible optical circuit, in which the variable operation is executed based on a Sagnac interferometer in an intrinsically stable and precisely controlled way. Our results contribute to the practical realization of quantum-walk-based quantum computation, quantum simulations, and quantum information protocols.

arXiv.org e-Print Archive

Directory of Open Access Journals
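
The quantum walk abstract above describes shaping the position distribution through time- and position-dependent operations. The following is a minimal numerical sketch of a discrete-time coined quantum walk on a line in which the coin angle may depend on the step index and the position. The coin parameterization, the function names (`coin_matrix`, `walk`) and the `theta_of(t, x)` callback are assumptions made for illustration; this is a generic textbook-style simulation, not a model of the optical circuit or Sagnac interferometer used in the experiment.

```python
import math


def coin_matrix(theta):
    """Real 2x2 unitary coin; theta = pi/4 gives the Hadamard coin."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, s), (s, -c))


def walk(steps, theta_of):
    """Run a coined quantum walk and return position probabilities.

    theta_of(t, x) is a hypothetical callback giving the coin angle at
    step t and position x (x = 0 is the starting site). The returned
    list has length 2 * steps + 1 and covers positions -steps..+steps.
    """
    size = 2 * steps + 1
    # state[i] = [amplitude with coin 0, amplitude with coin 1] at x = i - steps
    state = [[0j, 0j] for _ in range(size)]
    state[steps][0] = 1 + 0j  # start at the origin with coin state 0
    for t in range(steps):
        new = [[0j, 0j] for _ in range(size)]
        for i in range(size):
            a0, a1 = state[i]
            if a0 == 0 and a1 == 0:
                continue  # nothing to propagate from an empty site
            (m00, m01), (m10, m11) = coin_matrix(theta_of(t, i - steps))
            b0 = m00 * a0 + m01 * a1
            b1 = m10 * a0 + m11 * a1
            new[i - 1][0] += b0  # coin 0 shifts one site to the left
            new[i + 1][1] += b1  # coin 1 shifts one site to the right
        state = new
    return [abs(a0) ** 2 + abs(a1) ** 2 for a0, a1 in state]
```

Calling `walk(steps, lambda t, x: math.pi / 4)` gives the ordinary Hadamard walk, while other choices of `theta_of` reshape the final distribution. Finding the angles that produce a specific target distribution is a separate design problem that this sketch does not address.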