71 research outputs found

High Level Synthesis for Packet Processing Pipelines

Author: Soviani Cristian
Publication venue: Department of Computer Science, Columbia University
Publication date: 01/01/2007

Packet processing is an essential function of state-of-the-art network routers and switches. Implementing packet processors in pipelined architectures is a well-known, established technique, although different approaches have been proposed. The design of packet processing pipelines is a delicate trade-off between the desire for abstract specifications, short development time, and design maintainability on one hand and very aggressive performance requirements on the other. This thesis proposes a coherent design flow for packet processing pipelines. Like the design process itself, I start by introducing a novel domain-specific language that provides a high-level specification of the pipeline. Next, I address synthesizing this model and calculating its worst-case throughput. Finally, I address some specific circuit optimization issues. I claim, based on experimental results, that my proposed technique can dramatically improve the design process of these pipelines, while the resulting performance matches the expectations of hand-crafted design. The considered pipelines exhibit a pseudo-linear topology, which can be too restrictive in the general case. However, especially due to its high performance, such an architecture may be suitable for applications outside packet processing, in which case some of my proposed techniques could be easily adapted. Since I ran my experiments on FPGAs, this work has an inherent bias towards that technology; however, most results are technology-independent.

Columbia University Academic Commons

TamPub Julkaisuarkisto - TamPub Institutional Repository

Trepo - Institutional Repository of Tampere University

Natural image processing and synthesis using deep learning

Author: Ganin Iaroslav
Publication date: 01/09/2019

In the present thesis, we study how deep neural networks can be applied to various tasks in computer vision. Computer vision is an interdisciplinary field that deals with the understanding of digital images and video. Traditionally, the problems arising in this domain were tackled using heavily hand-engineered ad hoc methods. Until recently, a typical computer vision system consisted of a sequence of independent modules which barely talked to each other. Such an approach is quite reasonable in the case of limited data, as it takes major advantage of the researcher's domain expertise. This strength turns into a weakness if some of the input scenarios are overlooked in the algorithm design process. With the rapidly increasing volumes and varieties of data and the advent of cheaper and faster computational resources, end-to-end deep neural networks have become an appealing alternative to the traditional computer vision pipelines. We demonstrate this in a series of research articles, each of which considers a particular task of either image analysis or synthesis and presents a solution based on a "deep" backbone. In the first article, we deal with a classic low-level vision problem of edge detection. Inspired by a top-performing non-neural approach, we take a step towards building an end-to-end system by combining feature extraction and description in a single convolutional network. The resulting fully data-driven method matches or surpasses the detection quality of the existing conventional approaches in the settings for which they were designed, while being significantly more usable in out-of-domain situations. In our second article, we introduce a custom architecture for image manipulation based on the idea that most of the pixels in the output image can be directly copied from the input.
This technique has several significant advantages over the naive black-box neural approach. Among others, it retains the level of detail of the original images, does not introduce artifacts due to insufficient capacity of the underlying neural network, and simplifies the training process. We demonstrate the efficiency of the proposed architecture on the challenging gaze correction task, where our system achieves excellent results. In the third article, we slightly diverge from pure computer vision and study the more general problem of domain adaptation. There, we introduce a novel training-time algorithm (i.e., adaptation is attained by using an auxiliary objective in addition to the main one). We seek to extract features that maximally confuse a dedicated network, called the domain classifier, while being useful for the task at hand. The domain classifier is learned simultaneously with the features and attempts to tell whether those features come from the source or the target domain. The proposed technique is easy to implement, yet results in superior performance on all the standard benchmarks. Finally, the fourth article presents a new kind of generative model for image data. Unlike conventional neural-network-based approaches, our system, dubbed SPIRAL, describes images in terms of concise low-level programs executed by off-the-shelf rendering software used by humans to create visual content. Among other things, this allows SPIRAL not to waste its capacity on the minutiae of datasets and to focus more on the global structure. The latent space of our model is easily interpretable by design and provides means for predictable image manipulation. We test our approach on several popular datasets and demonstrate its power and flexibility.

Dépôt Institutionnel Numérique

Modeling of Polish Intonation for Statistical-Parametric Speech Synthesis

Author: Kuczmarski Tomasz
Publication date: 01/01/2022

Faculty of Modern Languages (Wydział Neofilologii)

This work presents an attempt to build a neurobiologically inspired, Convolutional Neural Network-based model of the mappings from discrete high-level linguistic categories to a continuous signal of fundamental frequency in Polish neutral read speech. After a brief introduction of the research problem in the context of intonation, speech synthesis and the phonetics-phonology gap, the work goes on to describe the training of the model on a special speech corpus, and an evaluation of the naturalness of the F0 contour produced by the trained model through ABX and MOS perception experiments conducted with the help of a specially built Neural Source Filter resynthesizer. Finally, an in-depth exploration of the phonology-to-phonetics mappings learned by the model is presented; the Layer-wise Relevance Propagation explainability method was used to perform an extensive quantitative analysis of the relevance of 1297 specially engineered linguistic input features, and of their groupings at various levels of abstraction, for the specific contours of the fundamental frequency. The work ends with an in-depth interpretation of these results and a discussion of the advantages and disadvantages of the current method, and lists a number of potential future improvements.

The research presented in this work was partially carried out under the Harmonia research grant no. UMO-2014/14/M/HS2/00631 awarded by the National Science Centre (Narodowe Centrum Nauki).

Adam Mickiewicz University Repository

Repozytorium Uniwersytetu im. Adama Mickiewicza (AMUR)

Acoustic investigations of concert halls for rock music

Author: Adelman-Larsen Niels Werner
Gade Anders Christian
Thompson Eric Robert
Publication venue: 'Acoustical Society of America (ASA)'
Publication date: 01/01/2007

Crossref

Online Research Database In Technology

End-to-end Speech Separation with Neural Networks

Author: Luo Yi
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2021

Speech separation has long been an active research topic in the signal processing community, with its importance in a wide range of applications such as hearable devices and telecommunication systems. It not only serves as a fundamental problem for all higher-level speech processing tasks such as automatic speech recognition, natural language understanding, and smart personal assistants, but also plays an important role in smart earphones and augmented and virtual reality devices. With the recent progress in deep neural networks, the separation performance has been significantly advanced by various new problem definitions and model architectures. The most widely used approach in past years performs separation in the time-frequency domain, where a spectrogram or a time-frequency representation is first calculated from the mixture signal and multiple time-frequency masks are then estimated for the target sources. The masks are applied to the mixture's time-frequency representation to extract the target representations, and then operations such as the inverse short-time Fourier transform are used to convert them back to waveforms. However, such frequency-domain methods may have difficulties in modeling the phase spectrogram, as the conventional time-frequency masks often only consider the magnitude spectrogram. Moreover, the training objectives for the frequency-domain methods are typically also in the frequency domain, which may not be in line with widely used time-domain evaluation metrics such as signal-to-noise ratio and signal-to-distortion ratio. The problem formulation of time-domain, end-to-end speech separation naturally arises to tackle the disadvantages of the frequency-domain systems. The end-to-end speech separation networks take the mixture waveform as input and directly estimate the waveforms of the target sources.
Following the general pipeline of conventional frequency-domain systems, which contains a waveform encoder, a separator, and a waveform decoder, time-domain systems can be designed in a similar way while significantly improving the separation performance. In this dissertation, I focus on multiple aspects of the general problem formulation of end-to-end separation networks, including the system designs, model architectures, and training objectives. I start with a single-channel pipeline, which we refer to as the time-domain audio separation network (TasNet), to validate the advantage of end-to-end separation compared with the conventional time-frequency domain pipelines. I then move to the multi-channel scenario and introduce the filter-and-sum network (FaSNet) for both fixed-geometry and ad-hoc geometry microphone arrays. Next, I introduce methods for lightweight network architecture design that allow the models to maintain the separation performance while using as little as 2.5% of the model size and 17.6% of the model complexity. After that, I look into the training objective functions for end-to-end speech separation and describe two training objectives, for separating varying numbers of sources and for improving the robustness under reverberant environments, respectively. Finally, I take a step back and revisit several problem formulations in the end-to-end separation pipeline and raise more questions in this framework to be further analyzed and investigated in future work.

Columbia University Academic Commons

Anonymizing Speech: Evaluating and Designing Speaker Anonymization Techniques

Author: Champion Pierre
Publication date: 04/09/2023

The growing use of voice user interfaces has led to a surge in the collection and storage of speech data. While data collection allows for the development of efficient tools powering most speech services, it also poses serious privacy issues for users as centralized storage makes private personal speech data vulnerable to cyber threats. With the increasing use of voice-based digital assistants like Amazon's Alexa, Google's Home, and Apple's Siri, and with the increasing ease with which personal speech data can be collected, the risk of malicious use of voice-cloning and speaker/gender/pathological/etc. recognition has increased. This thesis proposes solutions for anonymizing speech and evaluating the degree of the anonymization. In this work, anonymization refers to making personal speech data unlinkable to an identity while maintaining the usefulness (utility) of the speech signal (e.g., access to linguistic content). We start by identifying several challenges that evaluation protocols need to consider to evaluate the degree of privacy protection properly. We clarify how anonymization systems must be configured for evaluation purposes and highlight that many practical deployment configurations do not permit privacy evaluation. Furthermore, we study and examine the most common voice conversion-based anonymization system and identify its weak points before suggesting new methods to overcome some limitations. We isolate all components of the anonymization system to evaluate the degree of speaker PPI associated with each of them. Then, we propose several transformation methods for each component to reduce as much as possible speaker PPI while maintaining utility. We promote anonymization algorithms based on quantization-based transformation as an alternative to the most-used and well-known noise-based approach. 
Finally, we attempt a new attack method to invert anonymization.

Comment: PhD Thesis Pierre Champion | Université de Lorraine - INRIA Nancy | for associated source code, see https://github.com/deep-privacy/SA-toolki

arXiv.org e-Print Archive

Myalgic encephalomyelitis/chronic fatigue syndrome and encephalomyelitis disseminata/multiple sclerosis show remarkable levels of similarity in phenomenology and neuroimmune characteristics

Author: Morris Gerwyn
Maes Michael
Publication venue: 'Springer Science and Business Media LLC'

Crossref

New FPGA design tools and architectures

Author: Vansteenkiste Elias
Publication venue: Ghent University. Faculty of Engineering and Architecture
Publication date: 01/01/2016

Ghent University Academic Bibliography

The IPS fidelity scale as a guideline to implement Supported Employment

Author: DeSmet Ann
Knaeps Jeroen
Van Audenhove Chantal
Publication venue: 'IOS Press'
Publication date: 01/01/2012

Ghent University Academic Bibliography

DI-fusion

Approximate logic circuits: Theory and applications

Author: Choudhury Mihir Rajanikant
Publication date: 01/01/2011

CMOS technology scaling, the process of shrinking transistor dimensions based on Moore's law, has been the thrust behind increasingly powerful integrated circuits for over half a century. As dimensions are scaled to a few tens of nanometers, process and environmental variations can significantly alter transistor characteristics, thus degrading reliability and reducing performance gains in CMOS designs with technology scaling. Although design solutions proposed in recent years to improve the reliability of CMOS designs are power-efficient, the performance penalty associated with these solutions further reduces performance gains with technology scaling, and hence these solutions are not well suited for high-performance designs. This thesis proposes approximate logic circuits as a new logic synthesis paradigm for reliable, high-performance computing systems. Given a specification, an approximate logic circuit is functionally equivalent to the given specification for a "significant" portion of the input space, but has a smaller delay and power as compared to a circuit implementation of the original specification. The contributions of this thesis include (i) a general theory of approximation and efficient algorithms for automated synthesis of approximations for unrestricted random logic circuits, (ii) logic design solutions based on approximate circuits to improve reliability of designs with negligible performance penalty, and (iii) efficient decomposition algorithms based on approximate circuits to improve performance of designs during logic synthesis. This thesis concludes with other potential applications of approximate circuits and identifies open problems in logic decomposition and approximate circuit synthesis.
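
The notion of an approximate circuit agreeing with its specification on a "significant" portion of the input space can be illustrated with a small sketch. The example below is not taken from the thesis: the choice of the full-adder carry as the specification, the simplified `approx_carry` function, and the `agreement` helper are all assumptions made purely for illustration. It simply enumerates every input vector and reports the fraction on which the two functions match.

```python
from itertools import product


def exact_carry(a, b, c):
    """Illustrative specification: carry-out of a 1-bit full adder."""
    return (a & b) | (a & c) | (b & c)


def approx_carry(a, b, c):
    """Hypothetical approximation: keeps only the a AND b product term."""
    return a & b


def agreement(spec, approx, n_inputs):
    """Fraction of the 2**n_inputs input vectors where approx equals spec."""
    vectors = list(product((0, 1), repeat=n_inputs))
    matches = sum(spec(*v) == approx(*v) for v in vectors)
    return matches / len(vectors)


if __name__ == "__main__":
    # The approximation uses one gate instead of five and differs from the
    # specification only on inputs (0, 1, 1) and (1, 0, 1).
    print(agreement(exact_carry, approx_carry, 3))  # prints 0.75
```

Exhaustive enumeration only scales to a handful of inputs; for realistic circuits the agreement would presumably have to be estimated or bounded by other means, which this sketch does not attempt.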

DSpace at Rice University