43 research outputs found
Limit bipolar sequences for robust watermarking of digital audio signals by the patchwork method
Ensuring that digital audio watermarks remain robust under interference, signal transformations, and possible attacks is a pressing problem. One of the most widely used and reasonably robust watermarking methods is the patchwork method. Its robustness comes from using spreading bipolar numerical sequences to form and embed the watermark in the digital audio signal, and correlation detection to detect and extract the watermark sequence. An analysis of the bipolar sequences used in the patchwork method showed that the absolute value of the ratio of the maximum of the autocorrelation function (ACF) to its minimum, for both the spreading bipolar sequences and the spread watermark sequences used in traditional watermarking, approaches 2 with high accuracy. This made it possible to formulate criteria for finding special spreading bipolar sequences with improved correlation properties and greater robustness. The article develops a mathematical apparatus for finding and constructing limit spreading bipolar sequences for robust digital audio watermarking by the patchwork method. Limit bipolar sequences are defined as sequences whose autocorrelation functions have the largest possible absolute ratio of maximum to minimum. Theorems and their corollaries are formulated and proved: on the existence of an upper bound on the minimum values of the autocorrelation functions of limit bipolar sequences, and on the values of the first and second lobes of the ACF. On this basis, a rigorous mathematical definition of limit bipolar sequences is given. A method for finding the complete set of limit bipolar sequences based on rational enumeration and a method for constructing limit bipolar sequences of arbitrary length using generating functions are developed.
Computer simulation results are presented that estimate the absolute value of the ratio of the maximum to the minimum of the autocorrelation and cross-correlation functions of the studied bipolar sequences under blind detection. They show that the proposed limit bipolar sequences have better correlation properties than the traditionally used bipolar sequences and are more robust.
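The quantity at the centre of this abstract, the absolute ratio of the ACF maximum to the ACF minimum, can be illustrated with a short sketch. It assumes the aperiodic (linear) autocorrelation, which the abstract does not state explicitly, and the two example sequences are hypothetical choices made here for illustration: a block of +1s followed by a block of -1s, and the length-7 Barker code as a familiar low-sidelobe sequence. Neither is claimed to be one of the article's limit sequences.

```python
import math


def aperiodic_acf(seq):
    """Aperiodic autocorrelation of a bipolar sequence for lags 0..len(seq)-1."""
    n = len(seq)
    return [sum(seq[i] * seq[i + lag] for i in range(n - lag)) for lag in range(n)]


def max_min_ratio(seq):
    """Absolute value of (ACF maximum / ACF minimum); infinite if the minimum is 0."""
    acf = aperiodic_acf(seq)
    lowest = min(acf)
    if lowest == 0:
        return math.inf
    return abs(max(acf) / lowest)


# Hypothetical example sequences (assumptions, not taken from the article).
block = [1, 1, 1, 1, -1, -1, -1, -1]
barker7 = [1, 1, 1, -1, -1, 1, -1]

print(aperiodic_acf(block))    # [8, 5, 2, -1, -4, -3, -2, -1]
print(max_min_ratio(block))    # 2.0
print(max_min_ratio(barker7))  # 7.0
```

For the block sequence the ratio is exactly 2, which matches the value the abstract reports for traditionally used sequences; a sequence whose most negative ACF value is small in magnitude yields a larger ratio.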
Securing Multi-Layer Communications: A Signal Processing Approach
Security is becoming a major concern in this information era. The development in wireless communications, networking technology, personal computing devices, and software engineering has led to numerous emerging applications whose security requirements are beyond the framework of conventional cryptography. The primary motivation of this dissertation research is to develop new approaches to the security problems in secure communication systems, without unduly increasing the complexity and cost of the entire system.
Signal processing techniques have been widely applied in communication systems. In this dissertation, we investigate the potential, the mechanism, and the performance of incorporating signal processing techniques into various layers along the chain of secure information processing. For example, for application-layer data confidentiality, we have proposed atomic encryption operations for multimedia data that can preserve standard compliance and are friendly to communications and delegate processing. For multimedia authentication, we have discovered the potential key disclosure problem for popular image hashing schemes, and proposed mitigation solutions. In physical-layer wireless communications, we have discovered the threat of signal garbling attack from compromised relay nodes in the emerging cooperative communication paradigm, and proposed a countermeasure to trace and pinpoint the adversarial relay. For the design and deployment of secure sensor communications, we have proposed two sensor location adjustment algorithms for mobility-assisted sensor deployment that can jointly optimize sensing coverage and secure communication connectivity. Furthermore, for general scenarios of group key management, we have proposed a time-efficient key management scheme that can improve the scalability of contributory key management from O(log n) to O(log(log n)) using scheduling and optimization techniques.
This dissertation demonstrates that signal processing techniques, along with optimization, scheduling, and beneficial techniques from other related fields of study, can be successfully integrated into security solutions in practical communication systems. The fusion of different technical disciplines can take place at every layer of a secure communication system to strengthen communication security and improve the performance-security tradeoff.
Intelligent watermarking of long streams of document images
Digital watermarking has numerous applications in the imaging domain, including (but not limited to) fingerprinting, authentication, and tampering detection. Because of the trade-off between watermark robustness and image quality, the heuristic parameters associated with digital watermarking systems need to be optimized. A common strategy for tackling this optimization formulation of digital watermarking, known as intelligent watermarking (IW), is to employ evolutionary computing (EC) to optimize these parameters for each image, at a computational cost that is infeasible for practical applications. However, in industrial applications involving streams of document images, one can expect instances of problems to reappear over time. Therefore, computational cost can be saved by preserving the knowledge of previous optimization problems in a separate archive (memory) and employing that memory to speed up or even replace optimization for future similar problems.
That is the basic principle behind the research presented in this thesis. Although similarity in the image space can lead to similarity in the problem space, there is no guarantee of that and for this reason, knowledge about the image space should not be employed whatsoever. Therefore, in this research, strategies to appropriately represent, compare, store and sample from problem instances are investigated. The objective behind these strategies is to allow for a comprehensive representation of a stream of optimization problems in a way to avoid re-optimization whenever a previously seen problem provides solutions as good as those that would be obtained by reoptimization, but at a fraction of its cost. Another objective is to provide IW systems with a predictive capability which allows replacing costly fitness evaluations with cheaper regression models whenever re-optimization cannot be avoided.
To this end, IW of streams of document images is first formulated as the problem of optimizing a stream of recurring problems, and a Dynamic Particle Swarm Optimization (DPSO) technique is proposed to tackle this problem. This technique is based on a two-tiered memory of static solutions. Memory solutions are re-evaluated for every new image, and the re-evaluated fitness distribution is then compared with the stored fitness distribution as a means of measuring the similarity between both problem instances (change detection). In simulations involving homogeneous streams of bi-tonal document images, the proposed approach resulted in a decrease of 95% in computational burden with little impact on watermarking performance. Optimization cost was severely decreased by replacing re-optimizations with recall of previously seen solutions.
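The change-detection step described above can be sketched as follows. The abstract does not say how the two fitness distributions are compared, so this sketch assumes a two-sample Kolmogorov-Smirnov statistic with a hypothetical threshold of 0.5; the function names, the memory layout, and the assumption that lower fitness is better are also illustrative choices, not details from the thesis.

```python
def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    points = sorted(set(a) | set(b))
    gap = 0.0
    for x in points:
        cdf_a = sum(1 for v in a if v <= x) / len(a)
        cdf_b = sum(1 for v in b if v <= x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def recall_or_reoptimize(memory, fitness_fn, threshold=0.5):
    """Re-evaluate stored solutions on a new image and decide what to do.

    memory: list of (solution, stored_fitness) pairs from an earlier image.
    fitness_fn: fitness of a solution on the new image (lower is better, by assumption).
    Returns ("recall", best_solution) when the fitness distributions are similar,
    otherwise ("reoptimize", None).
    """
    stored = [fitness for _, fitness in memory]
    fresh = [fitness_fn(solution) for solution, _ in memory]
    if ks_statistic(stored, fresh) <= threshold:
        best = min(range(len(fresh)), key=fresh.__getitem__)
        return "recall", memory[best][0]
    return "reoptimize", None


# Toy memory of three one-parameter solutions with their stored fitness values.
memory = [((0.1,), 1.0), ((0.5,), 2.0), ((0.9,), 3.0)]

# A new image on which the stored solutions behave almost as before.
similar_image = lambda s: {0.1: 1.05, 0.5: 2.05, 0.9: 3.05}[s[0]]
# A new image on which every stored solution is much worse.
different_image = lambda s: {0.1: 11.0, 0.5: 12.0, 0.9: 13.0}[s[0]]

print(recall_or_reoptimize(memory, similar_image))    # ('recall', (0.1,))
print(recall_or_reoptimize(memory, different_image))  # ('reoptimize', None)
```

Only the handful of memory solutions is evaluated in the recall case, which is where the saving over a full swarm optimization would come from.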
After that, the problem of representing the stream of optimization problems in a compact manner is addressed, so that new optimization concepts can be incorporated into previously learned concepts in an incremental fashion. The proposed strategy is based on a Gaussian Mixture Model (GMM) representation, trained with parameter and fitness data of all intermediate (candidate) solutions of a given problem instance. GMM sampling replaces selection of individual memory solutions during change detection. Simulation results demonstrate that such a memory of GMMs is more adaptive and can thus better tackle the optimization of embedding parameters for heterogeneous streams of document images than the approach based on a memory of static solutions.
Finally, the knowledge provided by the memory of GMMs is employed to decrease the computational cost of re-optimization. To this end, the GMM is employed in regression mode during re-optimization, replacing part of the costly fitness evaluations in a strategy known as surrogate-based optimization. Optimization is split into two levels, where the first relies primarily on regression while the second relies primarily on exact fitness values and provides a safeguard for the whole system. Simulation results demonstrate that the use of surrogates allows for better adaptation in situations involving significant variations in problem representation, such as when the set of attacks employed in the fitness function changes.
In general, the intelligent watermarking system proposed in this thesis is well adapted to the optimization of streams of recurring optimization problems. The quality of the resulting solutions for both homogeneous and heterogeneous image streams is comparable to that obtained through full optimization, but at a fraction of its computational cost. More specifically, the number of fitness evaluations is 97% smaller than that of full optimization for homogeneous streams and 95% smaller for highly heterogeneous streams of document images. The proposed method is general and can be easily adapted to other applications involving streams of recurring problems.
Intelligent Circuits and Systems
ICICS-2020 is the third conference initiated by the School of Electronics and Electrical Engineering at Lovely Professional University that explored recent innovations of researchers working for the development of smart and green technologies in the fields of Energy, Electronics, Communications, Computers, and Control. ICICS helps innovators identify new opportunities for the social and economic benefit of society. This conference bridges the gap between academics and R&D institutions, social visionaries, and experts from all strata of society to present their ongoing research activities and foster research relations between them. It provides opportunities for the exchange of new ideas, applications, and experiences in the field of smart technologies and for finding global partners for future collaboration. ICICS-2020 was conducted in two broad categories, Intelligent Circuits & Intelligent Systems and Emerging Technologies in Electrical Engineering.
An investigation of the utility of monaural sound source separation via nonnegative matrix factorization applied to acoustic echo and reverberation mitigation for hands-free telephony
In this thesis we investigate the applicability and utility of Monaural Sound Source Separation (MSSS) via Nonnegative Matrix Factorization (NMF) for various problems related to audio for hands-free telephony. We first investigate MSSS via NMF as an alternative acoustic echo reduction approach to existing approaches such as Acoustic Echo Cancellation (AEC). To this end, we present the single-channel acoustic echo problem as an MSSS problem, in which the objective is to extract the user's signal from a mixture also containing acoustic echo and noise. To perform separation, NMF is used to decompose the near-end microphone signal onto the union of two nonnegative bases in the magnitude Short Time Fourier Transform domain. One of these bases is for the spectral energy of the acoustic echo signal, and is formed from the incoming far-end user's speech, while the other basis is for the spectral energy of the near-end speaker, and is trained with speech data a priori. In comparison to AEC, the speaker extraction approach obviates Double-Talk Detection (DTD), and is demonstrated to attain its maximal echo mitigation performance immediately upon initiation and to maintain that performance during and after room changes for similar computational requirements. Speaker extraction is also shown to introduce distortion of the near-end speech signal during double-talk, which is quantified by means of a speech distortion measure and compared to that of AEC. Subsequently, we address Double-Talk Detection (DTD) for block-based AEC algorithms. We propose a novel block-based DTD algorithm that uses the available signals and the estimate of the echo signal that is produced by NMF-based speaker extraction to compute a suitably normalized correlation-based decision variable, which is compared to a fixed threshold to decide on double-talk.
Using a standard evaluation technique, the proposed algorithm is shown to have comparable detection performance to an existing conventional block-based DTD algorithm. It is also demonstrated to inherit the room change insensitivity of speaker extraction, with the proposed DTD algorithm generating minimal false double-talk indications upon initiation and in response to room changes in comparison to the existing conventional DTD. We also show that this property allows its paired AEC to converge at a rate close to the optimum. Another focus of this thesis is the problem of inverting a single measurement of a non-minimum phase Room Impulse Response (RIR). We describe the process by which perceptually detrimental all-pass phase distortion arises in reverberant speech filtered by the inverse of the minimum phase component of the RIR; in short, such distortion arises from inverting the magnitude response of the high-Q maximum phase zeros of the RIR. We then propose two novel partial inversion schemes that precisely mitigate this distortion. One of these schemes employs NMF-based MSSS to separate the all-pass phase distortion from the target speech in the magnitude STFT domain, while the other approach modifies the inverse minimum phase filter such that the magnitude response of the maximum phase zeros of the RIR is not fully compensated. Subjective listening tests reveal that the proposed schemes generally produce better quality output speech than a comparable inversion technique.
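The speaker-extraction idea in this abstract, decomposing a magnitude spectrogram onto the union of an echo basis and a near-end speech basis, can be sketched in a few lines. This is a minimal illustration under stated assumptions: both bases are held fixed, only the activations are estimated, and the Euclidean multiplicative update of Lee and Seung is used because the abstract does not name a cost function. The matrices are hypothetical toy values with non-overlapping bases so that the result is easy to check by hand; real speech and echo bases overlap.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def solve_activations(v, w, iters=200, eps=1e-12):
    """Nonnegative activations H with V ~ W H, using H <- H * (W^T V) / (W^T W H).

    W is held fixed; eps guards against division by zero.
    """
    rank, frames = len(w[0]), len(v[0])
    h = [[1.0] * frames for _ in range(rank)]
    wt = transpose(w)
    wtv = matmul(wt, v)
    wtw = matmul(wt, w)
    for _ in range(iters):
        denom = matmul(wtw, h)
        h = [[h[r][t] * wtv[r][t] / (denom[r][t] + eps) for t in range(frames)]
             for r in range(rank)]
    return h


def extract_near_end(v, w_echo, w_speech, iters=200):
    """Estimate the near-end magnitude spectrogram as W_speech times its activations."""
    # Union of the two bases: echo columns first, then speech columns.
    w = [row_e + row_s for row_e, row_s in zip(w_echo, w_speech)]
    h = solve_activations(v, w, iters)
    n_echo = len(w_echo[0])
    return matmul(w_speech, h[n_echo:])


# Toy data (hypothetical): 3 frequency bins, 2 time frames.
w_echo = [[1.0], [0.0], [0.0]]    # echo energy lives in bin 0
w_speech = [[0.0], [1.0], [1.0]]  # near-end speech energy lives in bins 1 and 2
v = [[2.0, 0.0],                  # mixture = echo * [2, 0] + speech * [1, 3]
     [1.0, 3.0],
     [1.0, 3.0]]

near_end = extract_near_end(v, w_echo, w_speech)
print([[round(x, 6) for x in row] for row in near_end])
# [[0.0, 0.0], [1.0, 3.0], [1.0, 3.0]]
```

In a full system the estimated magnitudes would typically be turned into a mask and applied to the complex STFT of the microphone signal; that step is omitted here.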