Search CORE

51 research outputs found

On-device Efficient Acoustic Modeling with Simple Gated Convolutional Networks

Author: 이루카스
Publication venue: 서울대학교 대학원
Publication date: 01/02/2019
Field of study

학위논문 (석사)-- 서울대학교 대학원 : 공과대학 전기·정보공학부, 2019. 2. 성원용.오늘날, 자동 음성 인식 시스템으로 인공신경망 기반의 알고리즘이 주요하게 활용되고 있다. 그런 가운데, 스마트폰이나 임베디드 장치에서 서버를 거치지 않고 진행되는 온-디바이스 음성 인식 시스템에 대한 수요가 증가하고 있다. 온-디바이스 음성 인식 시스템은 사용자의 음성이 서비스 제공자의 서버로 제공되지 않고, 음성인식이 사용자의 장치에서 독립적으로 이루어진다. 따라서, 프라이버시 침해와 보안에 대한 우려를 상당 부분 해소할 수 있다. 그러나, 인공신경망 기반의 음성 인식 시스템에서 주로 사용되는 LSTM 기반의 회귀신경망(RNN)은 온-디바이스 음성 인식에 효율적이지 않다. LSTM RNN은 시퀀스(sequence) 정보의 병렬화가 어렵다. 이는 LSTM RNN에는 현재의 시간 스텝(step)이 과거의 시간 스텝에 의존하는 되먹임(Feedback) 특성이 존재하기 때문이다. 또, 이 되먹임 정보는 너무 커서 캐시 메모리에 들어갈 수 없다. 따라서, 시퀀스 정보의 매 시간 스텝마다 DRAM에 접근하여 샘플을 불러와야 한다. 이 경우 매 시간 스텝마다 DRAM에 접근하여 전력소모가 증가할 뿐만 아니라, 실행 시간도 증가하게 된다. 우리는 이 논문에서 온-디바이스에 친화적인 인공신경말 모델을 제시한다. 이 모델들을 음향 모델링에 활용하여 LSTM RNN을 대체한다. 게이티드 콘볼루션 네크워크(Gated ConvNet), 대각성분 LSTM(Diagonal LSTM), QRNN(the quasi RNN)이 활용되었다. 이들 모델은 대부분의 연산에서 순서 의존성이 존재 하지 않아 시간 스텝별 병렬화가 가능하다. \\ \\ \\ \\ 이들 모델들은 자동 음성 인식에서 1차원 깊이 콘볼루션(1D depthwise Convolution)이 추가된 후에는 LSTM RNN의 성능을 훨씬 능가하였다. 특히 게이티드 콘볼루션 네트워크의 경우 깊은 구조를 채택하였을 때, 음향 모델 없이 가장 좋은 성능을 보여주었다. 무엇보다도 온-디바이스에 효율적인 인공신경망 모델들은 시퀀스의 시간 스텝별 병렬화를 통해 실제 임베디드 장치에서 LSTM RNN 대비 최소 5배의 실행 속도 증가를 보여주었다. 우리는 여기서 더 나아가, 심플 게이티드 콘볼루션 네트워크(Simple Gated ConvNet)을 제시한다. 심플 게이티드 콘볼루션은 게이티드 콘볼루션의 가장 단순화 된 형태에 기반을 둔 것으로, 파라미터의 수가 혁명적으로 감소한다. 이는 하드웨어 사양의 제한을 받는 온-디바이스 음성인식에 유리한 특성이다. 또한 심플 게이티드 콘볼루션 네트워크는 시간 스텝 별 순서 의존성이 존재하지 않기 때문에 시간 스텝별 병렬화도 가능하다. 우리는 1차원 깊이 병렬화(1D depthwise convolution)을 여러 방향을 적용하여 성능 향상을 이끌어 내었다. 구체적으로, 우리는 심플 게이티드 콘볼루션 네크워크를 활용해 파라미터 사용량을 3 M 이하로 줄였다. 동일한 파라미터 수가 주어졌을 때 심플 게이티드 콘볼루션 네트워크는 자동 음성 인식에서 LSTM RNN이나 게이티드 콘볼루션 네트워크의 성능을 능가했다. 3 M 아래의 심플 게이티드 콘볼루션 네크워크는 10 M의 LSTM보다 더 좋은 성능을 보여주기도 하였다. 또한, 시간 스텝 별 병렬화를 통해서 ARM CPU에서 LSTM RNN 대비 10 배의 실행 속도 증가를 얻어냈다.Automatic speech recognition (ASR) is widely adopted for smartphones and many embedded devices in recent years, and neural network based algorithms show the best performance for ASR. While most of ASR systems are based on server-based processing, there is an increasing demand for on-device speech recognition because of privacy concern and low latency processing. Reducing the power consumption is especially important for on-device speech recognition to lengthen the battery life. Among several neural network models, recurrent neural network (RNN) based algorithms are mostly used for speech recognition, and long short-term memory(LSTM) RNN is most popular because of its superior performance over the other ones. However, executing LSTM RNN demands many DRAM accesses because the cache size of embedded devices is usually much smaller than the parameter size of RNN. Multi-time step parallelization technique computes multiple output samples at a time by fetching one set of parameters, and thus it can reduce the number of DRAM accesses in proportional to the number of time steps computed at a time. However, LSTM RNN does not permit the multi-time step parallelization because of complex feedback structure of the model. This thesis presents neural network models that support efficient on-device speech recognition. First, a few models that permit multi-time step parallel processing are evaluated. The models evaluated include Gated ConvNet, Diagonal LSTM, and QRNN (quasi RNN). Since the performance of these models are not as good as the LSTM, one-dimensional depthwise convolution is added to improve the performance. The one-dimensional convolution helps finding the temporal patterns of speech signal. Second, Simple Gated Convolution Network (Simple Gated ConvNet) is proposed for improved performance when the parameter count is very small. The Simple Gated ConvNet employs the simplest form of Gated ConvNet. Instead it relies on one-dimensional convolution for temporal observation. Simple Gated ConvNet supports low-power on-device speech recognition because it can be executed employing multi-time step parallelization. The Simple Gated ConvNet under 3 million even shows better performance than the LSTM with 10 million parameters. In addition, the execution speed in ARM CPU can be increased more than ten-times compared with the LSTM RNN through multi-time step parallelization.1 Introduction 1 1.1 On-device speech recognition: advantages and challenges . . . . . . 1 1.2 The components of speech recognition . . . . . . . . . . . . . . . . 3 1.3 The downsides of RNN based acoustic models . . . . . . . . . . . . . 4 1.4 Exploration of efficient on-device acoustic modeling with neural networks . . . . . . . . . . 5 1.5 Simple Gated ConvNet for small footprint acoustic modeling . . . . . 6 1.6 Outline of the thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 2 Exploration of Efficient On-device Acoustic Modeling with Neural Networks 8 2.1 Acoustic Modeling Algorithms . . . . . . . . . . . . . . . . . . . . 8 2.1.1 Diagonal LSTM RNN . . . . . . . . . . . . . . . . . . . . . 8 2.1.2 QRNN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.1.3 Gated ConvNet . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.2 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.2.1 End-to-end speech recognition . . . . . . . . . . . . . . . . . 11 2.2.2 Phoneme classification . . . . . . . . . . . . . . . . . . . . . 15 2.2.3 Implementation Results on Embedded Systems . . . . . . . . 17 2.3 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . 18 3 Simple Gated Convolutional Networks for small footprint acoustic modeling 20 3.1 Simple Gated ConvNet . . . . . . . . . . . . . . . . . . . . . . . . . 20 3.1.1 Gated ConvNet . . . . . . . . . . . . . . . . . . . . . . . . . 20 3.1.2 Simple Gated ConvNet . . . . . . . . . . . . . . . . . . . . . 21 3.2 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 3.2.1 Experiment Setups . . . . . . . . . . . . . . . . . . . . . . . 24 3.2.2 Experimental Results . . . . . . . . . . . . . . . . . . . . . . 25 3.3 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . 31 4 Conclusions 32 Abstract (In Korean) 39Maste

End-to-End Neural Network-based Speech Recognition for Mobile and Embedded Devices

Author: 박진환
Publication venue: 서울대학교 대학원
Publication date: 01/08/2020
Field of study

학위논문 (박사) -- 서울대학교 대학원 : 공과대학 전기·정보공학부, 2020. 8. 성원용.Real-time automatic speech recognition (ASR) on mobile and embedded devices has been of great interest in recent years. Deep neural network-based automatic speech recognition demands a large number of computations, while the memory bandwidth and power storage of mobile devices are limited. The server-based implementation is often employed, but this increases latency or privacy concerns. Therefore, the need of the on-device ASR system is increasing. Recurrent neural networks (RNNs) are often used for the ASR model. The RNN implementation on embedded devices can suffer from excessive DRAM accesses, because the parameter size of a neural network usually exceeds that of the cache memory. Also, the parameters of RNN cannot be reused for multiple time-steps due to its feedback structure. To solve this problem, multi-time step parallelizable models are applied for speech recognition. The multi-time step parallelization approach computes multiple output samples at a time with the parameters fetched from the DRAM. Since the number of DRAM accesses can be reduced in proportion to the number of parallelization steps, a high processing speed can be achieved for the parallelizable model. In this thesis, a connectionist temporal classification (CTC) model is constructed by combining simple recurrent units (SRUs) and depth-wise 1-dimensional convolution layers for multi-time step parallelization. Both the character and word piece models are developed for the CTC model, and the corresponding RNN based language models are used for beam search decoding. A competitive WER for WSJ corpus is achieved using the entire model size of approximately 15MB. The system operates in real-time speed using only a single core ARM without GPU or special hardware. A low-latency on-device speech recognition system with a simple gated convolutional network (SGCN) is also proposed. The SGCN shows a competitive recognition accuracy even with 1M parameters. 8-bit quantization is applied to reduce the memory size and computation time. The proposed system features an online recognition with a 0.4s latency limit and operates in 0.2 RTF with only a single 900MHz CPU core. In addition, an attention-based model with the depthwise convolutional encoder is proposed. Convolutional encoders enable faster training and inference of attention models than recurrent neural network-based ones. However, convolutional models often require a very large receptive field to achieve high recognition accuracy, which not only increases the parameter size but also the computational cost and run-time memory footprint. A convolutional encoder with a short receptive field length often suffers from looping or skipping problems. We believe that this is due to the time-invariance of convolutions. We attempt to remedy this issue by adding positional information to the convolution-based encoder. It is shown that the word error rate (WER) of a convolutional encoder with a short receptive field size can be reduced significantly by augmenting it with positional information. Visualization results are presented to demonstrate the effectiveness of incorporating positional information. The streaming end-to-end ASR model is also developed by applying monotonic chunkwise attention.최근 모바일 및 임베디드 기기에서 실시간 동작하는 음성 인식 시스템을 개발하는 것이 큰 관심을 받고 있다. 깊은 인공 신경망 음성인식은 많은 양의 연산을 필요로 하는 반면, 모바일 기기의 메모리 대역폭이나 전력은 제한되어 있다. 이러한 한계 때문에 서버 기반 구현이 보통 사용되어지지만, 이는 지연 시간 및 사생활 침해 문제를 일으킨다. 따라서 모바일 기기 상 동작하는 음성 인식 시스템의 요구가 커지고 있다. 음성 인식 시스템에 주로 사용되는 모델은 재귀형 인공 신경망이다. 재귀형 인공 신경망의 모델 크기는 보통 캐시의 크기보다 크고 피드백 구조 때문에 재사용이 어렵기 때문에 많은 DRAM 접근을 필요로 한다. 이러한 문제를 해결하기 위해 다중 시간의 입력에대해 병렬화 가능한 모델을 이용한 음성 인식 시스템을 제안한다. 다중 시간 병렬화 기법은 한 번의 메모리 접근으로 여러 시간의 출력을 동시에 계산하는 방법이다. 병렬화 수에 따라 DRAM 접근 횟수를 줄일 수 있기 때문에, 병렬화 가능한 모델에 대하여 빠른 연산이 가능하다. 단순 재귀 유닛과 1차원 컨벌루션을 이용한 CTC 모델을 제시하였다. 문자와 단어 조각 수준의 모델이 개발되었다. 각 출력 단위에 해당하는 재귀형 신경망 기반 언어 모델을 이용하여 디코딩에 사용되었다. 전체 15MB의 메모리 크기로 WSJ 에서 높은 수준의 인식 성능을 얻었으며 GPU나 기타 하드웨어 없이 1개의 ARM CPU 코어로 실시간 처리를 달성하였다. 또한 단순 컨벌루션 인공 신경망 (SGCN)을 이용한 낮은 지연시간을 가지는 음성인식 시스템을 개발하였다. SGCN은 1M의 매우 낮은 변수 갯수로도 경쟁력 있는 인식 정확도를 보여준다. 추가적으로 8-bit 양자화를 적용하여 메모리 크기와 연산 시간을 감소 시켰다. 해당 시스템은 0.4초의 이론적 지연시간을 가지며 900MHz의 CPU 상에서 0.2의 RTF로 동작하였다. 추가적으로, 깊이별 컨벌루션 인코더를 이용한 어텐션 기반 모델이 개발되었다. 컨벌루션 기반의 인코더는 재귀형 인공 신경망 기반 모델보다 빠른 처리 속도를 가진다. 하지만 컨벌루션 모델은 높은 성능을 위해서 큰 입력 범위를 필요로 한다. 이는 모델 크기 및 연산량, 그리고 동작 시 메모리 소모를 증가 시킨다. 작은 크기의 입력 범위를 가지는 컨벌루션 인코더는 출력의 반복이나 생략으로 인하여 높은 오차율을 가진다. 이것은 컨벌루션의 시간 불변성 때문으로 여겨지며, 이 문제를 위치 인코딩 벡터를 이용하여 해결하였다. 위치 정보를 이용하여 작은 크기의 필터를 가지는 컨벌루션 모델의 성능을 높일 수 있음을 보였다. 또한 위치 정보가 가지는 영향을 시각화 하였다. 해당 방법을 단조 어텐션을 이용한 모델에 활용하여 컨벌루션 기반의 스트리밍 가능한 음성 인식 시스템을 개발하였다.1 Introduction 1 1.1 End-to-End Automatic Speech Recognition with Neural Networks . . 1 1.2 Challenges on On-device Implementation of Neural Network-based ASR 2 1.3 Parallelizable Neural Network Architecture 3 1.4 Scope of Dissertation 3 2 Simple Recurrent Units for CTC-based End-to-End Speech Recognition 6 2.1 Introduction 6 2.2 Related Works 8 2.3 Speech Recognition Algorithm 9 2.3.1 Acoustic modeling 10 2.3.2 Character-based model 12 2.3.3 Word piece-based model 14 2.3.4 Decoding 14 2.4 Experimental Results 15 2.4.1 Acoustic models 15 2.4.2 Word piece based speech recognition 22 2.4.3 Execution time analysis 25 2.5 Concluding Remarks 27 3 Low-Latency Lightweight Streaming Speech Recognition with 8-bit Quantized Depthwise Gated Convolutional Neural Networks 28 3.1 Introduction 28 3.2 Simple Gated Convolutional Networks 30 3.2.1 Model structure 30 3.2.2 Multi-time-step parallelization 31 3.3 Training CTC AM with SGCN 34 3.3.1 Regularization with symmetrical weight noise injection 34 3.3.2 8-bit quantization 34 3.4 Experimental Results 36 3.4.1 Experimental setting 36 3.4.2 Results on WSJ eval92 38 3.4.3 Implementation on the embedded system 38 3.5 Concluding Remarks 39 4 Effect of Adding Positional Information on Convolutional Neural Networks for End-to-End Speech Recognition 41 4.1 Introduction 41 4.2 Related Works 43 4.3 Model Description 45 4.4 Experimental Results 46 4.4.1 Effect of receptive field size 46 4.4.2 Visualization 49 4.4.3 Comparison with other models 53 4.5 Concluding Remarks 53 5 Convolution-based Attention Model with Positional Encoding for Streaming Speech Recognition 55 5.1 Introduction 55 5.2 Related Works 58 5.3 End-to-End Model for Speech Recognition 61 5.3.1 Model description 61 5.3.2 Monotonic chunkwise attention 62 5.3.3 Positional encoding 63 5.4 Experimental Results 64 5.4.1 Effect of positional encoding 66 5.4.2 Comparison with other models 68 5.4.3 Execution time analysis 70 5.5 Concluding Remarks 71 6 Conclusion 72 Abstract (In Korean) 86Docto

효율적인 키워드 인식을 위한 간략 콘볼루션 신경망

Author: QIAN XUE
Publication venue: 서울대학교 대학원
Publication date: 01/08/2020
Field of study

학위논문 (석사) -- 서울대학교 대학원 : 공과대학 전기·정보공학부, 2020. 8. 성원용.키워드 스팟팅(KWS)은 현재의 음성 기반 휴먼-컴퓨터 상호작용에서 중요한 역할을 하며 스마트 기기에서 널리 사용되고 있다. 신경망의 급속한 발달로 음성인식, 음성 합성, 화자인식 등 여러 음성 처리 분야에 걸친 어플리케이션에서 큰 성과를 거뒀다. 다양한 음성 처리 분야에서 강점을 보이고 있는 인공 신경망은 KWS를 위한 시스템에도 매력적인 선택이 되었다. 그러나 애플리케이션 환경은 스마트폰, 패드 및 일부 스마트 홈 기기를 포함한 소형 스마트 기기들이 대부분이기 때문에, 신경 네트워크 아키텍처들은 KWS 시스템을 설계할 때 이러한 스마트 기기의 제한된 메모리와 계산 용량을 고려해야 한다. 동시에 실시간, 사용자 친화적, 높은 정확도로 대응하려면 낮은 대기 시간을 유지할 수 있어야 한다. 또한 KWS는 다른 업무와 달라 상시 온라인 상태에서 이용자의 호출을 기다려야 하기 때문에 KWS 애플리케이션의 전력 예산도 크게 제한된다. 메인스트림 신경망 모델 중에는 과거 DNN, CNN, RNN, 그리고 서로의 조합이 주로 KWS에 사용되면서 최근에는 Attention 기반 모델도 점점 인기를 끌고 있다. 그 중에서도 CNN은 정확성과 견고성, 병렬처리가 뛰어나 KWS에서 널리 채택되고 있다. 본 연구에서는 효율적인 키워드 스팟팅을 지원하는 신경망 모델인 신플 콘볼루션 네트워크를 제시한다. 높은 정확도를 유지하기 위한 중간 과정으로 보다 컴팩트한 residual 네트워크와 노이즈 인식 훈련법을 주로 사용한다. ResNet은 좋은 성능을 얻기 위해 항상 수십만 개의 매개 변수를 필요로 하기 때문에, 우리 모델에서는 한정된 자원을 가진 스마트 기기에 더 적합할 수 있도록 depthwise 콘볼루션 네트워크를 사용하여 파라미터 수를 줄이는 법을 제시한다. 마지막으로 실제 모바일 기기인 삼성 갤럭시 S6 엣지에서 제안된 모델의 실제 추론 시간(즉, 지연 시간)을 측정하였다. 온라인 상 공개된 Google 음성 명령 데이터 집합이 모델을 평가하는 데 사용되었다. 결과는 제시된 모델이 기존 모델보다 약 1/2 의 매개변수와 계산 횟수를 훨씬 적게 사용한다는 것을 보여주며거의 동일한 정확도로 속도가 17.5 % 빠르며 6.9ms에 도달했다. 훨씬 작은 메모리 소모로도 다른 최신 KWS 모델을 능가하는 96.59%의 높은 정확도를 유지하고 있다.Keyword spotting (KWS) plays an important role in the current speech-based human-computer interaction, and is widely used on smart devices. With the rapid development of neural networks, various applications in speech related fields such as speech recognition, speech synthesis and speaker recognition have achieved great performances. Neural networks have become attractive choices for KWS architectures because of their good performance in speech processing. However, since the application environment is mostly in small smart devices including smart phones, tablets and smart home devices, neural network architectures must consider the limited memory and computation capacity of these smart devices when designing a KWS system . At the same time, the KWS system should be able to maintain low latency in order to respond in real time. In addition, KWS is different from other tasks, because it needs to be always online and waiting for the call from the users, therefore, the power budget of the KWS application is also greatly restricted. Among the mainstream neural network models, FCDNN (fully connected deep neural network), CNN (convolutional neural network), RNN (recurrent neural network) and the combination of them are mainly used for KWS in the past. Recently, attention-based models have become more and more popular. Among them, CNN is widely adopted in KWS, because of its excellent accuracy, robustness, and parallel processing capacity. Parallel processing capacity is essential for low-power implementations. In this work, we present a neural network model-Simple Depthwise Convolutional Network, which supports an efficient keyword spotting. We mainly focus on a more compact Residual Network, and apply noise injection as an intermediate process to maintain high accuracy. Typically, ResNet always requires several hundred thousands parameters to achieve good performance. In our model, we employ depthwise convolutional neural networks to decrease the number of parameters, so that it can be more suitable for smart devices with limited resources. Finally, our model is tested on a real mobile device Samsung Galaxy S6 Edge, reality in the real inference time (that is, latency) of about 6.9ms, which is 17.5% faster than the state-of-the-art model TC-ResNet. The publicly available Google Speech Commands dataset is used to evaluate the models. The results show that we only use about one half of the parameters and at most 300 times fewer number of computations than the original base model, meanwhile, much smaller memory footprint yet maintain the 96.59% comparable high accuracy which outperforms the other state-of-the-art KWS models.1. Introduction 1 1.1 Keyword Spotting System (KWS) 1 1.2 Challenges in Keyword Spotting 6 1.3 Neural Network Architecture for Small-Footprint KWS 6 1.3.1 TDNN-SWSA 7 1.3.2 TC-ResNet 9 1.3.3 DS-CNN 9 1.4 Simple Depthwise Convolutional Neural Network for Efficient KWS 10 1.5 Outline of the Thesis 11 2.Simple Depthwise Convolutional Neural Network 12 2.1 Depthwise ConvNet 12 2.2 Simple Depthwise ConvNet 14 2.3 Residual Simple Depthwise ConvNet 15 2.4 Experiments and Results 17 3. Robustness of Efficient Keyword Spotting 19 3.1 Weight Noise Injection 19 3.2 Experiments on Two Different GSCs 21 3.2.1 Standard GSC 21 3.2.2 Augmented GSC 22 3.2.3 Experiments and Results 22 3.3 FRR and FAR in a 3rd Dataset 24 3.3.1 FRR and FAR 24 3.3.2 The third GSC 24 3.3.3 Experiments and Results 25 4. Conclusions 28 5. Bibliography 29 Abstract (in Korean) 32Maste

MatchboxNet: 1D Time-Channel Separable Convolutional Neural Network Architecture for Speech Commands Recognition

Author: Ginsburg Boris
Majumdar Somshubra
Publication venue
Publication date: 21/04/2020
Field of study

We present an MatchboxNet - an end-to-end neural network for speech command recognition. MatchboxNet is a deep residual network composed from blocks of 1D time-channel separable convolution, batch-normalization, ReLU and dropout layers. MatchboxNet reaches state-of-the-art accuracy on the Google Speech Commands dataset while having significantly fewer parameters than similar models. The small footprint of MatchboxNet makes it an attractive candidate for devices with limited computational resources. The model is highly scalable, so model accuracy can be improved with modest additional memory and compute. Finally, we show how intensive data augmentation using an auxiliary noise dataset improves robustness in the presence of background noise

arXiv.org e-Print Archive

EASTER: Efficient and Scalable Text Recognizer

Author: Bali Raghav
Chaudhary Kartik
Publication venue
Publication date: 19/08/2020
Field of study

Recent progress in deep learning has led to the development of Optical Character Recognition (OCR) systems which perform remarkably well. Most research has been around recurrent networks as well as complex gated layers which make the overall solution complex and difficult to scale. In this paper, we present an Efficient And Scalable TExt Recognizer (EASTER) to perform optical character recognition on both machine printed and handwritten text. Our model utilises 1-D convolutional layers without any recurrence which enables parallel training with considerably less volume of data. We experimented with multiple variations of our architecture and one of the smallest variant (depth and number of parameter wise) performs comparably to RNN based complex choices. Our 20-layered deepest variant outperforms RNN architectures with a good margin on benchmarking datasets like IIIT-5k and SVT. We also showcase improvements over the current best results on offline handwritten text recognition task. We also present data generation pipelines with augmentation setup to generate synthetic datasets for both handwritten and machine printed text.Comment: 9 pages, fixed typos and minor edit

arXiv.org e-Print Archive

Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop (DCASE2017)

Author
Publication venue: Tampere University of Technology. Laboratory of Signal Processing
Publication date: 01/01/2017
Field of study

COMPUTATIONAL ANALYSIS OF CODE-MULTIPLEXED COULTER SENSOR SIGNALS

Author: Wang Ningquan
Publication venue: Georgia Institute of Technology
Publication date: 15/09/2021
Field of study

Nowadays, lab-on-a-chip (LoC) technology has been applied in a variety of applications because of its capability to perform accurate microscale manipulations of cells for point-of-care diagnostics. On the other hand, such a result is not readily available from an LoC device and typically still requires a post-inspection of the chip using traditional laboratory equipment such as a microscope, negating the advantages of the LoC technology. To solve this dilemma, my doctoral research mainly focuses on developing portable and disposable biosensors for interfacing with and digitizing the information from an LoC system. Our sensor platform, integrated with multiple microfluidic impedance sensors, electrically monitors and tracks manipulated cells on an LoC device. The sensor platform compresses information from each sensor into a 1-dimensional electrical waveform, and therefore, further signal processing is required to recover the readout of each sensor and extract information of detected cells. Furthermore, with the capability of the sensor platform, we have introduced integrated microfluidic cytometers to characterize properties of cells such as cell surface expression and mechanical properties.Ph.D