Multiple Hankel matrix rank minimization for audio inpainting
Sasaki et al. (2018) presented an efficient audio declipping algorithm, based
on the properties of Hankel-structured matrices constructed from time-domain
signal blocks. We adapt their approach to solve the audio inpainting problem,
where samples are missing in the signal. We analyze the algorithm and provide
modifications, some of which improve performance. Overall, the new
algorithms perform reasonably well for speech signals but are not
competitive for music signals.
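As a rough illustration of why Hankel structure is useful here, the following sketch builds a Hankel matrix from a time-domain block and estimates its rank. It is not the algorithm of Sasaki et al. or of this paper; the helper names `hankel` and `numerical_rank`, the block length, and the test signal are all assumptions chosen for the demo. It relies only on the standard fact that a sampled real sinusoid obeys a second-order linear recurrence, so its Hankel matrix has rank 2, while zeroing a run of samples (a crude stand-in for a gap) breaks that structure.

```python
import math

def hankel(block, rows):
    """Hankel matrix with entries H[i][j] = block[i + j] (hypothetical helper)."""
    cols = len(block) - rows + 1
    return [[block[i + j] for j in range(cols)] for i in range(rows)]

def numerical_rank(matrix, tol=1e-9):
    """Rank via Gaussian elimination with partial pivoting; adequate for small demos only."""
    m = [row[:] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank

# Assumed test signal: one real sinusoid, 32 samples.
block = [math.sin(0.3 * n) for n in range(32)]
rank_clean = numerical_rank(hankel(block, rows=8))

# Zero out a short run of samples to mimic a gap in the signal.
gapped = block[:]
for n in range(10, 14):
    gapped[n] = 0.0
rank_gapped = numerical_rank(hankel(gapped, rows=8))

print(rank_clean, rank_gapped)
```

A rank-minimization approach would, loosely speaking, search for the missing samples that bring the rank back down; the sketch only shows the quantity being measured, not how to minimize it.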
Proceedings of the second "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'14)
The implicit objective of the biennial "international Traveling Workshop on
Interactions between Sparse models and Technology" (iTWIST) is to foster
collaboration between international scientific teams by disseminating ideas
through both specific oral/poster presentations and free discussions. For its
second edition, the iTWIST workshop took place in the medieval and picturesque
town of Namur in Belgium, from Wednesday August 27th till Friday August 29th,
2014. The workshop was conveniently located in "The Arsenal" building within
walking distance of both hotels and town center. iTWIST'14 gathered about
70 international participants and featured 9 invited talks, 10 oral
presentations, and 14 posters on the following themes, all related to the
theory, application and generalization of the "sparsity paradigm":
Sparsity-driven data sensing and processing; Union of low dimensional
subspaces; Beyond linear and convex inverse problem; Matrix/manifold/graph
sensing/processing; Blind inverse problems and dictionary learning; Sparsity
and computational neuroscience; Information theory, geometry and randomness;
Complexity/accuracy tradeoffs in numerical methods; Sparsity? What's next?;
Sparse machine learning and inference.
Comment: 69 pages, 24 extended abstracts, iTWIST'14 website:
http://sites.google.com/site/itwist1
Revisiting Synthesis Model of Sparse Audio Declipper
The state of the art in audio declipping is currently represented by the SPADE
(SParse Audio DEclipper) algorithm by Kitić et al. Until now, the
synthesis/sparse variant, S-SPADE, has been considered significantly slower
than its analysis/cosparse counterpart, A-SPADE. It turns out that the opposite
is true: by exploiting a recent projection lemma, individual iterations of both
algorithms can be made equally computationally expensive, while S-SPADE tends
to require considerably fewer iterations to converge. In this paper, the two
algorithms are compared across a range of parameters such as the window length,
window overlap and redundancy of the transform. The experiments show that
although S-SPADE typically converges faster, the average performance in terms
of restoration quality is not superior to that of A-SPADE.
A new generalized projection and its application to acceleration of audio declipping
In convex optimization, it is often necessary to work with projectors onto convex sets composed with a linear operator. This need arises in both theory and applications, with signal processing being a prominent and broad field where convex optimization has been used recently. In this article, a novel projector is presented. It generalizes previous results in that it works with a broader family of linear transforms than the state of the art; on the other hand, it is limited to box-type convex sets in the transformed domain. The new projector is described by an explicit formula, which makes it simple to implement and computationally cheap. The projector is interpreted within the framework of proximal splitting theory. The convenience of the new projector is demonstrated on an example from signal processing, where it sped up the convergence of a signal declipping algorithm by a factor of more than two.
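To make the kind of object discussed above concrete, here is a minimal sketch of a classical closed-form case of the same problem, not the article's new projector. It assumes a linear operator L with L Lᵀ = νI, for which the projection onto {u : Lu ∈ C} is known to be x + ν⁻¹ Lᵀ(proj_C(Lx) − Lx). The function names, the small matrix `L`, and the box bounds are illustrative assumptions.

```python
import math

def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def clip_to_box(v, lo, hi):
    """Projection onto the box [lo, hi]^n is an elementwise clip."""
    return [min(max(x, lo), hi) for x in v]

def project_preimage_of_box(x, L, lo, hi, nu=1.0):
    """Projection of x onto {u : lo <= (L u)_i <= hi}.

    Valid only under the assumption L L^T = nu * I (classical case,
    NOT the generalized projector from the article).
    """
    Lx = matvec(L, x)
    correction = [c - y for c, y in zip(clip_to_box(Lx, lo, hi), Lx)]
    back = matvec(transpose(L), correction)
    return [xi + bi / nu for xi, bi in zip(x, back)]

# Assumed toy operator with orthonormal rows, so L L^T = I (nu = 1).
s = 1.0 / math.sqrt(2.0)
L = [[s, s, 0.0], [0.0, 0.0, 1.0]]
x = [3.0, 1.0, 0.5]

p = project_preimage_of_box(x, L, -1.0, 1.0)
Lp = matvec(L, p)                                   # lies in the box
p_again = project_preimage_of_box(p, L, -1.0, 1.0)  # projecting twice changes nothing

print(p, Lp)
```

The formula needs only one application of L and one of Lᵀ, which is why explicit projectors of this type are attractive inside iterative algorithms.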
Mathematical modeling of human behavior in video image
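Outline of research results (translated from Japanese): This research addressed methods for using human behavior observed by security cameras and similar devices as evidence in court. We derived a method for estimating the positions of body joints that appear in security camera footage but cannot be measured because they are hidden by obstacles. We also worked on building a mathematical model of human behavior, and constructed one that represents human behavior as a weighted average of multiple linear systems. Outline of research results (English): This work provided an estimation method for unobserved human behavior in video images and proposed a new mathematical model that describes human behavior in video images as a weighted combination of linear systems.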
Reconstruction de phase et de signaux audio avec des fonctions de coût non-quadratiques
Audio signal reconstruction consists of recovering sound signals from incomplete or degraded representations. This problem can be cast as an inverse problem. Such problems are frequently tackled with optimization or machine learning strategies. In this thesis, we propose to change the cost function in inverse problems related to audio signal reconstruction. We mainly address the phase retrieval problem, which is common when manipulating audio spectrograms. A first line of work tackles the optimization of non-quadratic cost functions for phase retrieval. We study this problem in two contexts: audio signal reconstruction from a single spectrogram and source separation. We introduce a novel formulation of the problem with Bregman divergences, as well as algorithms to solve it. A second line of work proposes to learn the cost function from a given dataset. This is done under the framework of unfolded neural networks, which are derived from iterative algorithms. We introduce a neural network based on the unfolding of the Alternating Direction Method of Multipliers that includes learnable activation functions. We show the relation between the learning of its parameters and the learning of the cost function for phase retrieval. We conduct numerical experiments for each of the proposed methods to evaluate their performance and their potential for audio signal reconstruction.