The Protection, Designation and Management of Cultural Routes: A Case Study of the Tea & Horse Road in China
Cultural routes are a relatively new and much-discussed concept in heritage designation and management. The extent to which this concept provides an effective theoretical framework for the management of diverse sites, monuments, and landscapes, encompassing multiple stakeholders and values, is under debate.
The research explores the so-called Tea & Horse Road (THR), which stretched from southwestern China to the South Asian subcontinent. It is an intriguing example of a historic network of interactions, combining multidimensional issues of protection, designation, and management, within a challenging contemporary social and political context.
Using literature reviews, case studies, semi-structured interviews, and field investigations, the thesis focuses on the THR within Yunnan Province in China. The selected case study was divided into three categories: productive regions, transfer regions and consuming regions, in order to both articulate the assorted THR heritage, and to explore relevant crucial issues: the nature of the physical remains; their integrity and authenticity; the potential and impacts of tourism; local, regional and state-based values; and the prospective management, protection and designation of these areas.
The research concludes that introducing the concept of cultural routes enables these multifaceted sites and landscapes to be integrated within a wider systematic framework, which offers possible approaches to top-down preservation and management of the THR. However, the research also reveals tensions between the cultural route and cultural landscape approaches, with the latter far easier to implement at a local/regional level. More broadly, it also raises questions about the implementation of cultural routes as a nomination strategy when dealing with diverse heritage resources, landscapes and communities.
3D-Printed Artificial Microfish
Hydrogel microfish featuring biomimetic structures, locomotive capabilities, and functionalized nanoparticles are engineered using a rapid 3D printing platform: microscale continuous optical printing (μCOP). The 3D-printed microfish exhibit chemically powered and magnetically guided propulsion, as well as highly efficient detoxification capabilities that highlight the technical versatility of this platform for engineering advanced functional microswimmers for diverse biomedical applications.
A Robotic Visual Grasping Design: Rethinking Convolution Neural Network with High-Resolutions
High-resolution representations are important for vision-based robotic
grasping problems. Existing works generally encode the input images into
low-resolution representations via sub-networks and then recover
high-resolution representations. This will lose spatial information, and errors
introduced by the decoder will be more serious when multiple types of objects
are considered or objects are far away from the camera. To address these
issues, we revisit the design paradigm of CNN for robotic perception tasks. We
demonstrate that using parallel branches as opposed to serial stacked
convolutional layers will be a more powerful design for robotic visual grasping
tasks. In particular, guidelines of neural network design are provided for
robotic perception tasks, e.g., high-resolution representation and lightweight
design, which respond to the challenges in different manipulation scenarios. We
then develop a novel grasping visual architecture referred to as HRG-Net, a
parallel-branch structure that always maintains a high-resolution
representation and repeatedly exchanges information across resolutions.
Extensive experiments validate that these two designs can effectively enhance
the accuracy of visual-based grasping and accelerate network training. We show
a series of comparative experiments in real physical environments on YouTube:
https://youtu.be/Jhlsp-xzHFY
Lightweight Neural Path Planning
Learning-based path planning is becoming a promising robot navigation
methodology due to its adaptability to various environments. However, the
expensive computing and storage associated with networks impose significant
challenges for their deployment on low-cost robots. Motivated by this practical
challenge, we develop a lightweight neural path planning architecture with a
dual input network and a hybrid sampler for resource-constrained robotic
systems. Our architecture is designed with efficient task feature extraction
and fusion modules to translate the given planning instance into a guidance
map. The hybrid sampler is then applied to restrict the planning within the
prospective regions indicated by the guide map. To enable the network training,
we further construct a publicly available dataset with various successful
planning instances. Numerical simulations and physical experiments demonstrate
that, compared with baseline approaches, our approach has nearly an order of
magnitude smaller model size and five times lower computational cost while achieving
promising performance. In addition, our approach accelerates planning
convergence, requiring fewer planning iterations than sampling-based
methods.
Comment: 8 pages
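The abstract does not say how the hybrid sampler combines the guidance map with ordinary sampling. One plausible reading, sketched below purely as an assumption, is to draw a grid cell from the guidance map with some probability and fall back to uniform sampling otherwise. The names `hybrid_sample` and `guided_ratio` are hypothetical and do not come from the paper.

```python
import numpy as np

def hybrid_sample(guide_map, guided_ratio, rng):
    """Return a (row, col) grid cell.

    With probability `guided_ratio` the cell is drawn in proportion to the
    guidance map; otherwise it is drawn uniformly over the whole grid.
    This mixing rule is an illustrative assumption, not the paper's method.
    """
    h, w = guide_map.shape
    total = guide_map.sum()
    if total > 0 and rng.random() < guided_ratio:
        # Guided branch: treat the normalized map as a distribution over cells.
        probs = (guide_map / total).ravel()
        flat = rng.choice(h * w, p=probs)
    else:
        # Uniform branch: keeps every cell reachable by the planner.
        flat = rng.integers(h * w)
    return divmod(int(flat), w)
```

Keeping a uniform branch is a common design choice in guided sampling-based planners, since it preserves coverage of the full space when the predicted map is wrong.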
TODE-Trans: Transparent Object Depth Estimation with Transformer
Transparent objects are widely used in industrial automation and daily life.
However, robust visual recognition and perception of transparent objects have
always been a major challenge. Currently, most commercial-grade depth cameras
are still not good at sensing the surfaces of transparent objects due to the
refraction and reflection of light. In this work, we present a
transformer-based transparent object depth estimation approach from a single
RGB-D input. We observe that the global characteristics of the transformer make
it easier to extract contextual information to perform depth estimation of
transparent areas. In addition, to better enhance the fine-grained features, a
feature fusion module (FFM) is designed to assist coherent prediction. Our
empirical evidence demonstrates that our model delivers significant
improvements on recent popular datasets, e.g., a 25% gain in RMSE and a 21% gain in
REL over previous state-of-the-art convolution-based counterparts on the
ClearGrasp dataset. Extensive results show that our transformer-based model
enables better aggregation of the object's RGB and inaccurate depth information
to obtain a better depth representation. Our code and the pre-trained model
will be available at https://github.com/yuchendoudou/TODE.
Comment: Submitted to ICRA202
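For reference, the two depth-error metrics quoted in this abstract, RMSE and REL (mean absolute relative error), are commonly computed as in the sketch below. The function name `depth_errors` and the rule of ignoring pixels with non-positive ground-truth depth are assumptions for illustration and are not taken from the TODE-Trans code.

```python
import numpy as np

def depth_errors(pred, gt):
    """Return (RMSE, REL) over pixels with valid ground-truth depth."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    valid = gt > 0  # assumption: non-positive depth marks missing ground truth
    diff = pred[valid] - gt[valid]
    rmse = float(np.sqrt(np.mean(diff ** 2)))        # root mean squared error
    rel = float(np.mean(np.abs(diff) / gt[valid]))   # mean absolute relative error
    return rmse, rel
```

Lower is better for both metrics, so a reported gain corresponds to a reduction in the returned values.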
FaithLM: Towards Faithful Explanations for Large Language Models
Large Language Models (LLMs) have become proficient in addressing complex
tasks by leveraging their extensive internal knowledge and reasoning
capabilities. However, the black-box nature of these models complicates the
task of explaining their decision-making processes. While recent advancements
demonstrate the potential of leveraging LLMs to self-explain their predictions
through natural language (NL) explanations, their explanations may not
accurately reflect the LLMs' decision-making process due to a lack of fidelity
optimization on the derived explanations. Measuring the fidelity of NL
explanations is a challenging issue, as it is difficult to manipulate the input
context to mask the semantics of these explanations. To this end, we introduce
FaithLM to explain the decision of LLMs with NL explanations. Specifically,
FaithLM designs a method for evaluating the fidelity of NL explanations by
incorporating the contrary explanations to the query process. Moreover, FaithLM
conducts an iterative process to improve the fidelity of derived explanations.
Experimental results on three datasets from multiple domains demonstrate that
FaithLM can significantly improve the fidelity of derived explanations, which
also provides better alignment with the ground-truth explanations.
Winner-Take-All Column Row Sampling for Memory Efficient Adaptation of Language Model
With the rapid growth in model size, fine-tuning the large pre-trained
language model has become increasingly difficult due to its extensive memory
usage. Previous works usually focus on reducing the number of trainable
parameters in the network. While the model parameters do contribute to memory
usage, the primary memory bottleneck during training arises from storing
feature maps, also known as activations, as they are crucial for gradient
calculation. Notably, neural networks are usually trained using stochastic
gradient descent. We argue that in stochastic optimization, models can handle
noisy gradients as long as the gradient estimator is unbiased with reasonable
variance. Following this motivation, we propose a new family of unbiased
estimators for matrix products with reduced variance, called WTA-CRS, which
only requires storing the sub-sampled activations for calculating the gradient.
Our work provides both theoretical and experimental evidence that, in the
context of tuning transformers, our proposed estimators exhibit lower variance
compared to existing ones. By replacing the linear operation with our
approximated one in transformers, we can achieve up to a 2.7× peak memory
reduction with almost no accuracy drop and enable a larger
batch size. Under the same hardware, WTA-CRS enables better downstream task
performance by applying larger models and/or faster training speed with larger
batch sizes.
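To make the idea of an unbiased sub-sampled matrix product concrete, the sketch below implements plain column-row sampling (CRS), the classical estimator that this line of work builds on. It is not the winner-take-all variant proposed in the paper, and the function name `crs_matmul` is an assumption for illustration.

```python
import numpy as np

def crs_matmul(A, B, k, rng):
    """Unbiased estimate of A @ B built from k sampled column-row pairs."""
    # Sample pair i with probability proportional to ||A[:, i]|| * ||B[i, :]||.
    weights = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    probs = weights / weights.sum()
    idx = rng.choice(A.shape[1], size=k, replace=True, p=probs)
    # Rescale each sampled pair by 1 / (k * p_i) so the expectation is A @ B.
    scale = 1.0 / (k * probs[idx])
    return (A[:, idx] * scale) @ B[idx, :]
```

Only the `k` sampled columns of `A` and rows of `B` enter the product, which is why an estimator of this kind needs to keep only a sub-sample of the activations for the backward pass.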
BAMBOO: a predictive and transferable machine learning force field framework for liquid electrolyte development
Despite the widespread applications of machine learning force field (MLFF) on
solids and small molecules, there is a notable gap in applying MLFF to complex
liquid electrolytes. In this work, we introduce BAMBOO (ByteDance AI Molecular
Simulation Booster), a novel framework for molecular dynamics (MD) simulations,
with a demonstration of its capabilities in the context of liquid electrolytes
for lithium batteries. We design a physics-inspired graph equivariant
transformer architecture as the backbone of BAMBOO to learn from quantum
mechanical simulations. Additionally, we pioneer an ensemble knowledge
distillation approach and apply it on MLFFs to improve the stability of MD
simulations. Finally, we propose the density alignment algorithm to align
BAMBOO with experimental measurements. BAMBOO demonstrates state-of-the-art
accuracy in predicting key electrolyte properties such as density, viscosity,
and ionic conductivity across various solvents and salt combinations. Our
current model, trained on more than 15 chemical species, achieves an average
density error of 0.01 g/cm³ on various compositions compared with
experimental data. Moreover, our model demonstrates transferability to
molecules not included in the quantum mechanical dataset. We envision this work
as paving the way to a "universal MLFF" capable of simulating properties of
common organic liquids.
An Iterative Optimizing Framework for Radiology Report Summarization with ChatGPT
The 'Impression' section of a radiology report is a critical basis for
communication between radiologists and other physicians, and it is typically
written by radiologists based on the 'Findings' section. However, writing
numerous impressions can be laborious and error-prone for radiologists.
Although recent studies have achieved promising results in automatic impression
generation using large-scale medical text data for pre-training and fine-tuning
pre-trained language models, such models often require substantial amounts of
medical text data and have poor generalization performance. While large
language models (LLMs) like ChatGPT have shown strong generalization
capabilities and performance, their performance in specific domains, such as
radiology, remains under-investigated and potentially limited. To address this
limitation, we propose ImpressionGPT, which leverages the in-context learning
capability of LLMs by constructing dynamic contexts using domain-specific,
individualized data. This dynamic prompt approach enables the model to learn
contextual knowledge from semantically similar examples from existing data.
Additionally, we design an iterative optimization algorithm that performs
automatic evaluation on the generated impression results and composes the
corresponding instruction prompts to further optimize the model. The proposed
ImpressionGPT model achieves state-of-the-art performance on both MIMIC-CXR and
OpenI datasets without requiring additional training data or fine-tuning the
LLMs. This work presents a paradigm for localizing LLMs that can be applied in
a wide range of similar application scenarios, bridging the gap between
general-purpose LLMs and the specific language processing needs of various
domains.
Comment: Change to the published version. "ImpressionGPT" has been removed
from the title
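The dynamic-context step described in this abstract can be illustrated with a minimal sketch: retrieve the stored reports most similar to the new 'Findings' text and place them in the prompt as worked examples. The abstract does not specify the similarity measure or prompt layout, so the token-overlap (Jaccard) score, the prompt template, and the names `jaccard` and `build_dynamic_prompt` are all hypothetical stand-ins.

```python
def jaccard(a, b):
    """Token-set overlap between two texts (illustrative similarity only)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0

def build_dynamic_prompt(findings, corpus, k=2):
    """Build a few-shot prompt from the k most similar stored reports.

    corpus: list of (findings_text, impression_text) pairs.
    """
    ranked = sorted(corpus, key=lambda ex: jaccard(findings, ex[0]), reverse=True)
    parts = [
        f"Findings: {ex_findings}\nImpression: {ex_impression}"
        for ex_findings, ex_impression in ranked[:k]
    ]
    # The query goes last, with the impression left open for the model.
    parts.append(f"Findings: {findings}\nImpression:")
    return "\n\n".join(parts)
```

The iterative optimization described in the abstract would then score the generated impression and revise the instructions in the prompt; that loop is not shown here.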
