197 research outputs found

    Neural Volumetric Memory for Visual Locomotion Control

    Legged robots have the potential to expand the reach of autonomy beyond paved roads. In this work, we consider the difficult problem of locomotion on challenging terrains using a single forward-facing depth camera. Because the problem is only partially observable, the robot has to rely on past observations to infer the terrain currently beneath it. To solve this problem, we follow the paradigm in computer vision that explicitly models the 3D geometry of the scene and propose Neural Volumetric Memory (NVM), a geometric memory architecture that explicitly accounts for the SE(3) equivariance of the 3D world. NVM aggregates feature volumes from multiple camera views by first bringing them back to the ego-centric frame of the robot. We test the learned visual-locomotion policy on a physical robot and show that our approach, which explicitly introduces geometric priors during training, outperforms more naïve methods. We also include ablation studies and show that the representations stored in the neural volumetric memory capture sufficient geometric information to reconstruct the scene. Our project page with videos is https://rchalyang.github.io/NVM (CVPR 2023 Highlight).
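The core operation described above, warping per-view feature volumes into the robot's current ego-centric frame before aggregating them, can be sketched in a few lines. This is a minimal NumPy illustration with nearest-neighbor resampling and plain averaging; the actual NVM uses learned features and learned aggregation, and all shapes and names here are assumptions, not the paper's implementation.

```python
import numpy as np

def make_grid(n):
    # voxel-center coordinates of an n^3 grid over the cube [-1, 1]^3
    axis = np.linspace(-1, 1, n)
    return np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)

def warp_to_ego(volume, T_ego_from_src, n):
    # resample a past feature volume (n, n, n, C) into the current ego frame,
    # using nearest-neighbor lookup for simplicity
    grid = make_grid(n).reshape(-1, 3)
    R, t = T_ego_from_src[:3, :3], T_ego_from_src[:3, 3]
    src = (grid - t) @ R  # inverse rigid transform: ego coords -> source coords
    idx = np.clip(np.round((src + 1) / 2 * (n - 1)).astype(int), 0, n - 1)
    return volume[idx[:, 0], idx[:, 1], idx[:, 2]].reshape(n, n, n, -1)

def aggregate(volumes, poses, n):
    # average the warped volumes -- a simple stand-in for NVM's learned aggregation
    return np.mean([warp_to_ego(v, T, n) for v, T in zip(volumes, poses)], axis=0)
```

Bringing every volume into one shared frame before pooling is what makes the memory consistent under SE(3) motion of the robot: the same world geometry lands in the same voxels regardless of when it was observed.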

    A Self-adaptive Discriminative Autoencoder for Medical Applications

    Computer-aided diagnosis (CAD) systems play an essential role in the early detection and diagnosis of developing disease. To obtain highly discriminative representations of medical images, a self-adaptive discriminative autoencoder (SADAE) is proposed in this paper. SADAE is implemented under a deep metric learning framework consisting of K local autoencoders, which learn the K subspaces that represent the diverse distribution of the underlying data, and a global autoencoder that restricts the spatial scale of the learned image representations. This ensemble of autoencoders is aided by a self-adaptive metric learning method that extracts discriminative features for recognizing the different categories in the given images. The quality of the features extracted by SADAE is compared against that of features extracted by other state-of-the-art deep learning and metric learning methods on five popular medical image data sets. The experimental results demonstrate that SADAE substantially improves medical image recognition over the alternatives.
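The described structure, K local autoencoders partitioning the data plus one global autoencoder over all of it, can be illustrated with a toy linear version. This is a sketch under assumed shapes and a plain squared-error objective; it omits the paper's self-adaptive metric learning term and deep architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

class LinearAE:
    """A one-layer linear autoencoder: x -> z = xW -> x_hat = zW^T."""
    def __init__(self, d_in, d_z):
        self.W = rng.normal(scale=0.1, size=(d_in, d_z))

    def recon(self, X):
        return X @ self.W @ self.W.T

    def step(self, X, lr=1e-2):
        # one gradient-descent step on mean squared reconstruction error
        E = self.recon(X) - X
        grad = 2.0 * (X.T @ E @ self.W + E.T @ X @ self.W) / len(X)
        self.W -= lr * grad
        return float(np.mean(E ** 2))

def train_sadae(X, K=3, d_z=2, iters=200):
    """K local AEs partition the data by reconstruction error; a global AE sees all of it."""
    d = X.shape[1]
    local_aes = [LinearAE(d, d_z) for _ in range(K)]
    global_ae = LinearAE(d, d_z)
    assign = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # assign each sample to the local AE that currently reconstructs it best
        errs = np.stack([np.mean((ae.recon(X) - X) ** 2, axis=1) for ae in local_aes])
        assign = errs.argmin(axis=0)
        for j, ae in enumerate(local_aes):
            if np.any(assign == j):
                ae.step(X[assign == j])
        global_ae.step(X)  # the global AE constrains the overall representation
    return local_aes, global_ae, assign
```

The assignment step is what lets each local autoencoder specialize on one mode of the data distribution, while the global pass keeps all representations on a common scale.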

    Automated Detection and Characterization of Cracks on Concrete using Laser Scanning

    Accurate crack detection and characterization on concrete are essential for the maintenance, safety, and serviceability of various infrastructures. In this paper, an approach was developed to automatically measure cracks from 3D point clouds collected by a phase-shift terrestrial laser scanner (TLS) (FARO Focus3D S120). The approach integrates several techniques: cracks are identified from deviations in point normals computed with k-nearest neighbor (kNN) and principal component analysis (PCA) algorithms, and the principal axes and curve skeletons of cracks determine their projected and real dimensions, respectively. A coordinate transformation was then performed to estimate the projected dimensions of the cracks, and curve skeletons and cross sections were extracted to represent the real dimensions. Two cases of surface cracks were used to validate the developed approach. Because the three methods define crack dimensions differently, and because of the curved shape of the cracks, the width and depth obtained from the cross-section method and manual measurement were close to, but slightly smaller than, those measured by the projection algorithm, whereas the crack lengths determined by the curve-skeleton method were slightly larger than those obtained by manual measurement and the projection method. The real dimensions of a crack agreed well with actual conditions when compared with the results of manual measurement and the projection method.
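The identification step, estimating a per-point normal by PCA over k nearest neighbors and flagging points whose normals deviate from the overall surface plane, might look like the NumPy sketch below. It uses brute-force neighbor search and an assumed deviation threshold; a real pipeline would use a spatial index and the paper's full parameterization.

```python
import numpy as np

def point_normals(pts, k=10):
    # estimate each point's normal as the least-variance PCA axis
    # of its k nearest neighbors (including the point itself)
    normals = np.empty_like(pts)
    for i, p in enumerate(pts):
        nbr = pts[np.argsort(np.sum((pts - p) ** 2, axis=1))[:k]]
        _, _, Vt = np.linalg.svd(nbr - nbr.mean(axis=0))
        normals[i] = Vt[-1]  # singular vector of the smallest singular value
    return normals

def crack_candidates(pts, k=10, max_cos=0.9):
    # flag points whose local normal deviates from the best-fit global plane
    normals = point_normals(pts, k)
    _, _, Vt = np.linalg.svd(pts - pts.mean(axis=0))
    ref = Vt[-1]  # normal of the plane fitted to the whole cloud
    return np.abs(normals @ ref) < max_cos  # abs() absorbs normal sign ambiguity
```

Points on the flat slab keep normals aligned with the global plane normal, while points along a groove or crack edge get tilted local planes and fall below the cosine threshold.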

    Self-Play and Self-Describe: Policy Adaptation with Vision-Language Foundation Models

    Recent progress on vision-language foundation models has brought significant advances toward building general-purpose robots. By using pre-trained models to encode the scene and the instructions as inputs for decision making, an instruction-conditioned policy can generalize across different objects and tasks. While this is encouraging, the policy still fails in most cases given an unseen task or environment. To adapt the policy to unseen tasks and environments, we explore a new paradigm for leveraging pre-trained foundation models with Self-PLAY and Self-Describe (SPLAYD). When deploying the trained policy to a new task or environment, we first let the policy self-play with randomly generated instructions to record demonstrations. While the execution may be wrong, we can use the pre-trained foundation models to accurately self-describe (i.e., re-label or classify) the demonstrations. This automatically provides new pairs of demonstration-instruction data for policy fine-tuning. We evaluate our method on a broad range of experiments focused on generalization to unseen objects, unseen tasks, unseen environments, and sim-to-real transfer. We show SPLAYD improves baselines by a large margin in all cases. Our project page is available at https://geyuying.github.io/SPLAYD/ .
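The adaptation loop, self-play with random instructions, re-labeling rollouts with a describer, then fine-tuning on the relabeled pairs, can be sketched with toy stand-ins. Everything below (the task list, the rule-based describer, the tabular "policy") is a hypothetical placeholder for the paper's learned components; only the loop structure reflects the described method.

```python
import random

random.seed(0)

TASKS = ["pick red block", "pick blue block"]  # hypothetical task set

def policy_act(instruction, weights):
    # an instruction-conditioned policy: pick the highest-weighted task,
    # falling back to random scores for unseen (instruction, task) pairs
    return max(TASKS, key=lambda t: weights.get((instruction, t), random.random()))

def self_describe(executed_task):
    # stand-in for the foundation model that labels what actually happened
    return executed_task

def splayd_round(weights, n_rollouts=50):
    data = []
    for _ in range(n_rollouts):
        instr = random.choice(TASKS)            # self-play: random instruction
        executed = policy_act(instr, weights)   # rollout may not match instr
        label = self_describe(executed)         # re-label the demonstration
        data.append((label, executed))          # new instruction-demonstration pair
    for label, executed in data:                # "fine-tune" on relabeled pairs
        weights[(label, executed)] = weights.get((label, executed), 0) + 1
    return weights
```

The key point the toy preserves: even when the rollout ignores the sampled instruction, the relabeled pair is always correct, so the fine-tuning data is clean by construction.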

    GNFactor: Multi-Task Real Robot Learning with Generalizable Neural Feature Fields

    It is a long-standing problem in robotics to develop agents capable of executing diverse manipulation tasks from visual observations in unstructured real-world environments. To achieve this goal, the robot needs a comprehensive understanding of the 3D structure and semantics of the scene. In this work, we present GNFactor, a visual behavior cloning agent for multi-task robotic manipulation with Generalizable Neural feature Fields. GNFactor jointly optimizes a generalizable neural field (GNF) as a reconstruction module and a Perceiver Transformer as a decision-making module, leveraging a shared deep 3D voxel representation. To incorporate semantics in 3D, the reconstruction module utilizes a vision-language foundation model (e.g., Stable Diffusion) to distill rich semantic information into the deep 3D voxel. We evaluate GNFactor on 3 real robot tasks and perform detailed ablations on 10 RLBench tasks with a limited number of demonstrations. We observe a substantial improvement of GNFactor over current state-of-the-art methods on seen and unseen tasks, demonstrating its strong generalization ability. Our project website is https://yanjieze.com/GNFactor/ (CoRL 2023 Oral).
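The joint optimization described above, a behavior-cloning head and a neural-field reconstruction head trained on one shared voxel representation, reduces to a weighted two-term objective. A toy sketch with linear heads (all shapes and the weight `lam` are assumptions, not the paper's values):

```python
import numpy as np

rng = np.random.default_rng(1)

def shared_voxel_features(obs, W_enc):
    # stand-in for the deep 3D voxel representation (flattened here)
    return np.tanh(obs @ W_enc)

def joint_loss(obs, action_target, render_target, W_enc, W_bc, W_rec, lam=0.01):
    z = shared_voxel_features(obs, W_enc)
    bc = np.mean((z @ W_bc - action_target) ** 2)    # behavior-cloning head
    rec = np.mean((z @ W_rec - render_target) ** 2)  # neural-field reconstruction head
    return bc + lam * rec                            # joint objective on shared z
```

Because both losses backpropagate through the same features `z`, the reconstruction term acts as a geometric/semantic regularizer on the representation the policy head consumes.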

    Genetic Fingerprint Concerned with Lymphatic Metastasis of Human Lung Squamous Cancer

    Background and objective With the recent introduction of microarray technology to biology, comprehensive analysis of gene expression in cancer cells has become possible. In this study, laser microdissection and cDNA microarray analysis were combined to obtain accurate molecular profiles of lymphatic metastasis in patients with lung squamous cell carcinoma. Methods Primary lung squamous cancer tissues and regional lymph nodes were obtained from 10 patients who underwent complete resection of lung cancer. According to the source of the lung cancer cells, the samples were classified into three groups: primary tumors with lymphatic metastasis (TxN+, n=5), primary tumors without lymphatic metastasis (TxN-, n=5), and matched tumor cells from metastatic lymph nodes (N+, n=5). Total RNA was extracted from laser-microdissected tumor samples. mRNA from the primary tumors or metastatic nodes was labeled and hybridized to the same microarray containing 6,000 known, named human genes/ESTs. After scanning, data analysis was performed using GeneSpring 6.2. Results A total of 37 genes separated TxN+ from TxN-. TxN+ tumors showed higher expression of genes encoding structural proteins, signal transducers, chaperones, and enzymes, whereas TxN- tumors showed higher expression of genes encoding cell cycle regulators, transporters, signal transducers, and apoptosis regulators. Interestingly, there were no differentially expressed genes between N+ and TxN+. Conclusion The acquisition of the metastatic phenotype might occur early in the development of lung squamous cancer. We hypothesize that the gene-expression signature described herein is valuable for elucidating the molecular mechanisms of lymphatic metastasis and for identifying novel therapeutic targets.

    Amplified role of potential HONO sources in O3 formation in North China Plain during autumn haze aggravating processes

    Co-occurrences of high concentrations of PM2.5 and ozone (O3) have frequently been observed during haze-aggravating processes in the North China Plain (NCP) over the past few years. Higher O3 concentrations on hazy days were hypothesized to be related to nitrous acid (HONO), but the key HONO sources enhancing O3 during haze-aggravating processes remain unclear. We added six potential HONO sources to the WRF-Chem model: four ground-based sources (traffic, soil, and indoor emissions, and the NO2 heterogeneous reaction on ground surfaces, Hetground) and two aerosol-related sources (the NO2 heterogeneous reaction on aerosol surfaces, Hetaerosol, and nitrate photolysis, Photnitrate), and designed 23 simulation scenarios to identify the key sources. The results indicate that HONO enhancements from the ground-based sources decreased rapidly with height, while those from the NO + OH reaction and the aerosol-related sources decreased slowly with height. Photnitrate contributions to HONO concentrations grew with aggravating pollution levels. The HONO enhancement due to Photnitrate on hazy days was about 10 times greater than on clean days, and Photnitrate dominated daytime HONO sources (~30%-70% when the ratio of the photolysis frequency of nitrate (Jnitrate) to that of gaseous nitric acid (JHNO3) equals 30) at higher layers (>800 m). Compared with clean days, the Photnitrate contribution to the enhanced daily maximum 8 h average (DMA8) O3 increased by over one order of magnitude during the haze-aggravating process. Photnitrate contributed only ~5% of surface HONO in the daytime with a Jnitrate/JHNO3 ratio of 30, but contributed ~30%-50% of the enhanced O3 near the surface in the NCP on hazy days. Surface O3 was dominated by volatile organic compound-sensitive chemistry, while O3 at higher altitudes (>800 m) was dominated by NOx-sensitive chemistry. Photnitrate had a limited impact on nitrate concentrations.
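The quoted Jnitrate/JHNO3 ratio makes the implied nitrate-photolysis HONO source easy to estimate on the back of an envelope. All input values below are illustrative assumptions, not numbers from the study:

```python
# First-order photolysis: P(HONO) = J_nitrate * [nitrate]
J_HNO3 = 7e-7    # s^-1, assumed midday HNO3 photolysis frequency
ratio = 30.0     # the Jnitrate / JHNO3 ratio used in the scenarios
nitrate = 10.0   # ug m^-3, assumed particulate nitrate on a hazy day

J_nitrate = ratio * J_HNO3            # 2.1e-5 s^-1
production = J_nitrate * nitrate      # ug m^-3 s^-1 of HONO (before yield/partitioning)
hourly = production * 3600            # ug m^-3 per hour
```

Even with these rough numbers, a tenfold increase in nitrate loading on hazy days scales the source linearly, which is consistent with Photnitrate mattering far more during haze-aggravating periods.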

    Measurement of distal intramural spread and the optimal distal resection by naked eyes after neoadjuvant radiation for rectal cancers

    BACKGROUND: The safe distance between the intraoperative resection line and the visible margin of the distal rectal tumor after preoperative radiotherapy is unclear. We aimed to investigate the furthest intramural tumor spread distance in fresh tissue to determine a safe distal intraoperative resection margin length. METHODS: Twenty rectal cancer specimens were collected after preoperative radiotherapy. Intramural tumor spread distance was defined as the distance between the tumor's visible and microscopic margins. Visible tumor margins in fresh specimens were identified during the operation and labeled with 5-0 sutures under the naked eye at the distal 5, 6, and 7 o'clock directions immediately after removal of the tumor. After fixation with formalin, the suture sites were injected with nanocarbon particles. Longitudinal tissues were collected along the three labels and stained with hematoxylin and eosin. The spread distance after formalin fixation was measured under a microscope between the furthest intramural spread of tumor cells and the nanocarbon. A positive intramural spread distance indicated that the furthest tumor cell was distal to the nanocarbon; a negative value indicated that it was proximal. According to the literature, the intramural spread distance in fresh tissue during the operation is 1.75 times that measured after formalin fixation. RESULTS: At the distal 5, 6, and 7 o'clock directions, seven (35%), five (25%), and six (30%) patients, respectively, had a distal intramural tumor cell spread distance > 0 mm. The mean intramural spread distances in fresh tissue during the operation were −0.3 mm (95% CI −4.0 to 3.4), −0.9 mm (95% CI −3.4 to 1.7), and −0.4 mm (95% CI −3.5 to 2.8), respectively. The maximal intraoperative intramural spread distances in fresh tissue were 8.8, 7, and 7 mm, respectively.
    CONCLUSIONS: To ensure oncological safety, the intraoperative distance between the distal resection line and the visible margin of the rectal tumor after radiotherapy should be at least 1 cm.
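The 1.75 fresh-to-fixed conversion factor is simple to apply in practice. A small sketch with hypothetical fixed-tissue measurements (not values from the paper), checking them against the 1 cm margin recommendation:

```python
SHRINKAGE = 1.75  # fresh-tissue distance = 1.75 x formalin-fixed distance (per the study)

def fresh_mm(fixed_mm):
    """Convert a formalin-fixed spread measurement back to an intraoperative distance."""
    return SHRINKAGE * fixed_mm

# hypothetical fixed-tissue spread measurements at the three labeled directions:
fixed_spreads = [5.0, 4.0, 4.0]
max_fresh = max(fresh_mm(d) for d in fixed_spreads)
margin_ok = max_fresh <= 10.0  # does a 1 cm resection margin cover the worst case?
```

The conversion matters because fixation shrinks tissue: a spread that looks safely under 1 cm in the fixed specimen can correspond to a larger in-vivo distance.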