684 research outputs found

    Underwater Acoustic Signal Recognition Based on Salient Feature

    With the rapid advancement of technology, the recognition of underwater acoustic signals in complex environments has become increasingly important. Mainstream underwater acoustic signal recognition currently relies primarily on time-frequency analysis to extract spectral features, an approach that has found widespread application in the field. However, existing recognition methods depend heavily on expert systems, which face limitations such as restricted knowledge bases and difficulty handling complex relationships; these limitations stem from the complexity and maintenance burden of their rules and inference engines. Recognizing the potential of deep learning for modeling intricate relationships, this paper proposes a neural-network-based method for underwater acoustic signal recognition. The proposed approach continually learns features extracted from spectra to classify underwater acoustic signals. Deep learning models can automatically learn abstract features from data and continually adjust their weights during training to improve classification performance.
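    The pipeline this abstract describes, spectral features fed to a trainable classifier whose weights are adjusted iteratively during training, can be sketched in miniature. This is an illustrative toy, not the paper's model: the synthetic two-class tone data, the FFT feature extractor, and the plain logistic classifier are all assumptions standing in for real sonar recordings and a deep network.

```python
import numpy as np

rng = np.random.default_rng(0)

def spectrum_features(signal, n_fft=128):
    """Magnitude spectrum of a windowed frame: a crude time-frequency feature."""
    return np.abs(np.fft.rfft(signal[:n_fft] * np.hanning(n_fft)))

def make_signal(freq):
    """Hypothetical acoustic source: a noisy tone at the given frequency (Hz)."""
    t = np.linspace(0, 1, 1000)
    return np.sin(2 * np.pi * freq * t) + 0.3 * rng.normal(size=t.size)

# Two stand-in signal classes: low-frequency vs high-frequency sources.
X = np.stack([spectrum_features(make_signal(f)) for f in [20] * 50 + [200] * 50])
y = np.array([0] * 50 + [1] * 50)

# A logistic classifier trained by gradient descent: the weights are adjusted
# step by step to improve classification, as a deep model would at scale.
w = np.zeros(X.shape[1])
b = 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))          # predicted class-1 probability
    grad = p - y                                 # gradient of the mean log loss
    w -= 0.001 * X.T @ grad / len(y)
    b -= 0.001 * grad.mean()

acc = ((1 / (1 + np.exp(-(X @ w + b))) > 0.5) == y).mean()
print(f"training accuracy: {acc:.2f}")
```

    A real system would replace the logistic layer with a convolutional network over full spectrograms, but the training loop has the same shape: extract spectral features, compute a loss, adjust weights.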

    Causal Mediation Analysis with a Three-Dimensional Image Mediator

    Causal mediation analysis is increasingly common in biology, psychology, epidemiology, and related fields. In particular, with the advent of the big data era, high-dimensional mediators are becoming more prevalent. In neuroscience, with the widespread application of magnetic resonance imaging to brain studies, work treating an image as a mediator has emerged. In this study, a novel causal mediation analysis method with a three-dimensional image mediator is proposed. We define the average causal effects under the potential outcome framework, explore several sufficient conditions for valid identification, and develop techniques for estimation and inference. To verify the effectiveness of the proposed method, a series of simulations under various scenarios is performed. Finally, the proposed method is applied to a study of the causal effect of the mother's delivery mode on the child's IQ development. It is found that white matter in certain regions of the frontal-temporal areas has mediating effects. (Comment: 35 pages, 9 figures)
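    The decomposition of a total effect into a direct path and a mediated path can be illustrated with a deliberately simplified linear model. This is a one-dimensional toy, not the paper's three-dimensional image-mediator estimator: the structural equations, the coefficient values, and the product-of-coefficients estimator are assumptions chosen only to make the direct/indirect split concrete.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Toy linear structural model (a scalar stand-in for a 3-D image mediator):
# treatment T -> mediator M -> outcome Y, plus a direct path T -> Y.
alpha, beta, gamma = 0.8, 1.5, 0.5          # true T->M, M->Y, T->Y effects
T = rng.integers(0, 2, n).astype(float)
M = alpha * T + rng.normal(size=n)
Y = beta * M + gamma * T + rng.normal(size=n)

def ols(X_cols, y):
    """Least-squares fit with an intercept; returns [intercept, *slopes]."""
    design = np.column_stack([np.ones(len(y)), *X_cols])
    return np.linalg.lstsq(design, y, rcond=None)[0]

a_hat = ols([T], M)[1]                      # mediator model: M ~ T
coef = ols([T, M], Y)                       # outcome model:  Y ~ T + M
direct, b_hat = coef[1], coef[2]
indirect = a_hat * b_hat                    # product-of-coefficients estimator
print(f"direct={direct:.2f}, indirect={indirect:.2f}, "
      f"total={direct + indirect:.2f}")
```

    Under the linear model the estimates should recover the true direct effect (0.5) and indirect effect (0.8 × 1.5 = 1.2); the paper's contribution is making this kind of identification and estimation work when M is an entire 3-D image rather than a scalar.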

    LLM Connection Graphs for Global Feature Extraction in Point Cloud Analysis

    Graph convolutional networks (GCNs) have effectively utilized local connections for point cloud analysis. However, capturing distant dependencies (i.e., global features) with a single local connection graph, such as the Euclidean k-nearest neighbor graph, remains challenging. To address this, we introduce the Multi-Space Graph Convolutional Network (PointGCNN), which leverages reinforcement learning to adaptively construct connection graphs in multiple latent spaces, integrating both local and non-local dependencies. Initially, we encode and concatenate low-level local features from Euclidean and Eigenvalue spaces. Convolution layers are then hierarchically built, with each layer forming dynamic connection graphs to guide the propagation of low-level features. These implicitly constructed graphs enable our model to uncover hidden dependencies. The assorted connections from different graphs support the extraction of fine-grained features from various perspectives, enhancing complex scene recognition. Thus, our model can capture multiple global contexts beyond the local scope of a single space, providing strong robustness against perturbations. Experimental results demonstrate that the proposed method achieves state-of-the-art performance on two major public point cloud benchmarks.
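    The building block the abstract starts from, a local connection graph (Euclidean k-NN) with dynamic graphs rebuilt in a learned feature space so that geometrically distant but similar points become connected, can be sketched as follows. This is a minimal illustration, not PointGCNN itself: the random weights, max aggregation, and two-layer structure are assumptions, and the reinforcement-learned graph construction is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
pts = rng.normal(size=(64, 3))             # a toy point cloud
feats = pts.copy()                          # initial per-point features

def knn_graph(x, k):
    """Connection graph from pairwise Euclidean distances: k nearest neighbors."""
    d = np.linalg.norm(x[:, None] - x[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)             # exclude self-loops
    return np.argsort(d, axis=1)[:, :k]     # (N, k) neighbor indices

def graph_conv(feats, nbrs, W):
    """One edge-conv-style layer: aggregate neighbor features, then project."""
    agg = feats[nbrs].max(axis=1)           # max over the k neighbors
    return np.maximum(np.concatenate([feats, agg], axis=1) @ W, 0)  # ReLU

# Layer 1: connection graph built in Euclidean space.
W1 = rng.normal(size=(6, 16)) * 0.1
h = graph_conv(feats, knn_graph(pts, k=8), W1)

# Layer 2: the graph is rebuilt dynamically in the learned feature space,
# so points far apart in 3-D but similar in features can exchange information.
W2 = rng.normal(size=(32, 32)) * 0.1
h2 = graph_conv(h, knn_graph(h, k=8), W2)
print(h2.shape)                             # one 32-d feature per point
```

    The multi-space idea in the paper generalizes the second step: rather than a single feature-space graph, several graphs in different latent spaces are constructed and their connections combined.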

    Training-Free Layout Control with Cross-Attention Guidance

    Recent diffusion-based generators can produce high-quality images from textual prompts. However, they often disregard textual instructions that specify the spatial layout of the composition. We propose a simple approach that achieves robust layout control without the need for training or fine-tuning of the image generator. Our technique manipulates the cross-attention layers that the model uses to interface textual and visual information and steers the generation toward a desired spatial arrangement, e.g., a user-specified layout. To determine how best to guide attention, we study the role of attention maps and explore two alternative strategies, forward and backward guidance. We thoroughly evaluate our approach on three benchmarks and provide several qualitative examples and a comparative analysis of the two strategies, demonstrating the superiority of backward guidance over forward guidance as well as prior work. We further demonstrate the versatility of layout guidance by extending it to applications such as editing the layout and context of real images. (Comment: WACV 2024, Project page: https://silent-chen.github.io/layout-guidance)
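    The forward-guidance idea, steering cross-attention toward a user-specified layout with no training, can be illustrated on a toy attention map. The grid size, bias strength, and mask below are hypothetical stand-ins; a real diffusion model applies such a bias inside multi-head cross-attention layers over its latent grid.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
H = W = 8                                   # toy 8x8 latent grid
scores = rng.normal(size=(H * W,))          # raw cross-attention scores
                                            # for one text token

# Hypothetical user layout: the token should attend to the top-left quadrant.
mask = np.zeros((H, W))
mask[:4, :4] = 1.0
mask = mask.ravel()

# Forward-style guidance: add a bias to the scores inside the layout region
# before the softmax, concentrating attention there without any training.
bias_strength = 5.0
attn_plain = softmax(scores)
attn_guided = softmax(scores + bias_strength * mask)

print(f"attention mass in region: {attn_plain[mask == 1].sum():.2f} -> "
      f"{attn_guided[mask == 1].sum():.2f}")
```

    Backward guidance, which the paper finds superior, instead defines a loss on the attention maps and backpropagates it to update the latent, rather than editing the maps directly as above.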

    DGE: Direct Gaussian 3D Editing by Consistent Multi-View Editing

    We consider the problem of editing 3D objects and scenes based on open-ended language instructions. A common approach to this problem is to use a 2D image generator or editor to guide the 3D editing process, obviating the need for 3D data. However, this process is often inefficient due to the need for iterative updates of costly 3D representations, such as neural radiance fields, either through individual view edits or score distillation sampling. A major disadvantage of this approach is the slow convergence caused by aggregating inconsistent information across views, as the guidance from 2D models is not multi-view consistent. We thus introduce the Direct Gaussian Editor (DGE), a method that addresses these issues in two stages. First, we modify a given high-quality image editor like InstructPix2Pix to be multi-view consistent. To do so, we propose a training-free approach that integrates cues from the 3D geometry of the underlying scene. Second, given a multi-view consistent edited sequence of images, we directly and efficiently optimize the 3D representation, which is based on 3D Gaussian Splatting. Because it avoids incremental and iterative edits, DGE is significantly more accurate and efficient than existing approaches and offers additional benefits, such as enabling selective editing of parts of the scene.
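    The second stage, directly optimizing a 3D representation once the edited views agree, can be illustrated with a toy color-fitting problem that also shows why inconsistent 2D guidance converges to a poor average. Per-point colors stand in for full 3D Gaussian parameters; the view counts, noise levels, and least-squares objective are synthetic assumptions, not the paper's actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
n_points, n_views = 100, 8

true_color = rng.uniform(size=(n_points, 3))    # edited target appearance

# Consistent edited views: every view sees the same target (plus small noise).
consistent = true_color[None] + 0.01 * rng.normal(size=(n_views, n_points, 3))
# Inconsistent edits: each view pulls toward a different random appearance,
# mimicking per-view 2D edits that do not agree with one another.
inconsistent = rng.uniform(size=(n_views, n_points, 3))

def fit(views, steps=100, lr=0.5):
    """Directly optimize per-point colors against the views under an L2 loss."""
    c = np.full((n_points, 3), 0.5)
    for _ in range(steps):
        c -= lr * (c[None] - views).mean(axis=0)    # gradient of mean L2 loss
    return c

err_consistent = np.abs(fit(consistent) - true_color).mean()
err_inconsistent = np.abs(fit(inconsistent) - true_color).mean()
print(f"error with consistent views: {err_consistent:.3f}, "
      f"with inconsistent views: {err_inconsistent:.3f}")
```

    With consistent targets the direct optimization converges to the intended edit; with conflicting targets it converges to their average, which matches none of them. This is the failure mode DGE's first stage (making the 2D editor multi-view consistent) is designed to remove.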