18 research outputs found

    Generative Modeling in Structural-Hankel Domain for Color Image Inpainting

    In recent years, some researchers have focused on using a single image to obtain a large number of samples through multi-scale features. This study proposes a new idea that requires only ten or even fewer samples to construct a low-rank structural-Hankel matrices-assisted score-based generative model (SHGM) for the color image inpainting task. During the prior learning process, a number of internal-middle patches are first extracted from several images, and structural-Hankel matrices are then constructed from these patches. To better apply the score-based generative model to learning the internal statistical distribution within patches, the large-scale Hankel matrices are finally folded into higher-dimensional tensors for prior learning. During the iterative inpainting process, SHGM views the inpainting problem as a conditional generation procedure in a low-rank environment. As a result, the intermediate restored image is acquired by alternately performing the stochastic differential equation solver, the alternating direction method of multipliers, and data consistency steps. Experimental results demonstrate the remarkable performance and diversity of SHGM. Comment: 11 pages, 10 figures
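
    The patch-to-Hankel step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact construction used by SHGM, so the sliding-window layout below, the function name `patch_to_hankel`, and the window size `k` are all assumptions made for illustration only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def patch_to_hankel(patch: np.ndarray, k: int) -> np.ndarray:
    """Build a Hankel-structured matrix from a color patch (assumed layout).

    patch: array of shape (H, W, C).
    k:     side length of the sliding window (hypothetical parameter).

    Each row of the result is one k x k x C window, flattened. Neighbouring
    windows overlap, so the same pixel values repeat across rows, which gives
    the matrix its Hankel-like structure.
    """
    # Result has shape (H-k+1, W-k+1, C, k, k): window axes are appended last.
    windows = sliding_window_view(patch, (k, k), axis=(0, 1))
    n_rows = windows.shape[0] * windows.shape[1]
    return windows.reshape(n_rows, -1)


def numerical_rank(matrix: np.ndarray) -> int:
    """Rank estimate, useful for checking the low-rank assumption."""
    return int(np.linalg.matrix_rank(matrix))
```

    For an 8 x 8 x 3 patch and `k = 3`, the matrix has 36 rows and 27 columns. Smooth patches tend to give a numerical rank well below the column count, which is the property a low-rank prior relies on.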

    Listening to Distances and Hearing Shapes: Inverse Problems in Room Acoustics and Beyond

    A central theme of this thesis is using echoes to achieve useful, interesting, and sometimes surprising results. One should have no doubts about the echoes' constructive potential; it is, after all, demonstrated masterfully by Nature. Just think about the bat's intriguing ability to navigate in unknown spaces and hunt for insects by listening to echoes of its calls, or about similar (albeit less well-known) abilities of toothed whales, some birds, shrews, and ultimately people. We show that, perhaps contrary to conventional wisdom, multipath propagation resulting from echoes is our friend. When we think about it the right way, it reveals essential geometric information about the sources–channel–receivers system. The key idea is to think of echoes as being more than just delayed and attenuated peaks in 1D impulse responses; they are actually additional sources with their corresponding 3D locations. This transformation allows us to forget about the abstract room, and to replace it by more familiar point sets. We can then engage the powerful machinery of Euclidean distance geometry. A problem that always arises is that we do not know a priori the matching between the peaks and the points in space, and solving the inverse problem is achieved by echo sorting, a tool we developed for learning correct labelings of echoes. This has applications beyond acoustics, whenever one deals with waves and reflections, or more generally, time-of-flight measurements. Equipped with this perspective, we first address the "Can one hear the shape of a room?" question, and we answer it with a qualified "yes". Even a single impulse response uniquely describes a convex polyhedral room, whereas a more practical algorithm to reconstruct the room's geometry uses only first-order echoes and a few microphones. Next, we show how different problems of localization benefit from echoes. The first one is multiple indoor sound source localization.
    Assuming the room is known, we show that discretizing the Helmholtz equation yields a system of sparse reconstruction problems linked by the common sparsity pattern. By exploiting the full bandwidth of the sources, we show that it is possible to localize multiple unknown sound sources using only a single microphone. We then look at indoor localization with known pulses from the geometric echo perspective introduced previously. Echo sorting enables localization in non-convex rooms without a line-of-sight path, and localization with a single omni-directional sensor, which is impossible without echoes. A closely related problem is microphone position calibration; we show that echoes can help even without assuming that the room is known. Using echoes, we can localize arbitrary numbers of microphones at unknown locations in an unknown room using only one source at an unknown location (for example, a finger snap) and get the room's geometry as a byproduct. Our study of source localization outgrew the initial form factor when we looked at source localization with spherical microphone arrays. Spherical signals appear well beyond spherical microphone arrays; for example, any signal defined on Earth's surface lives on a sphere. This resulted in the first slight departure from the main theme: We develop the theory and algorithms for sampling sparse signals on the sphere using finite rate-of-innovation principles and apply it to various signal processing problems on the sphere.
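
    The view of echoes as additional sources with their own 3D locations can be illustrated with a small sketch of a first-order image source. This is a generic illustration, not code from the thesis: the function names, the planar-wall description (a point on the wall plus a normal vector), and the speed of sound of 343 m/s are assumptions.

```python
import numpy as np


def image_source(source, wall_point, wall_normal):
    """Mirror a source across a planar wall to get its first-order image.

    The wall is described by any point on it and a normal vector
    (an assumed parameterization for this sketch).
    """
    source = np.asarray(source, dtype=float)
    wall_point = np.asarray(wall_point, dtype=float)
    n = np.asarray(wall_normal, dtype=float)
    n = n / np.linalg.norm(n)
    # Signed distance from the source to the wall along the normal.
    signed_dist = np.dot(source - wall_point, n)
    return source - 2.0 * signed_dist * n


def echo_delay(source, mic, wall_point, wall_normal, speed_of_sound=343.0):
    """Time of flight of the first-order echo off one wall.

    The echo path length equals the straight-line distance from the
    image source to the microphone.
    """
    img = image_source(source, wall_point, wall_normal)
    mic = np.asarray(mic, dtype=float)
    return float(np.linalg.norm(img - mic) / speed_of_sound)
```

    For example, a source at (1, 2, 1.5) mirrored across the wall x = 0 gives an image at (-1, 2, 1.5). A microphone at (3, 2, 1.5) is then 4 m from the image, so the echo arrives after 4 / 343 seconds under the assumed speed of sound.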