
55 research outputs found

A New Design of Ultra-Flattened Near-zero Dispersion PCF Using Selectively Liquid Infiltration

Authors: Chaudhuri Partha Roy; Maji Partha Sona
Publication date: 25/12/2014

The paper reports new results on chromatic dispersion in photonic crystal fibers (PCFs), obtained by appropriately designing an index-guiding triangular-lattice structure in which only the first air-hole ring is selectively infiltrated with index-matching liquid. The proposed structure can be implemented for both ultra-low and ultra-flattened dispersion over a wide wavelength range. The dependence of the PCF's dispersion parameter on the index of the infiltrating liquid, the hole-to-hole distance, and the air-hole diameter is investigated in detail. The results establish that the design yields a dispersion of 0 ± 0.15 ps/(nm·km) in the communication wavelength band. We propose designs pertaining to practical infiltrating liquids for realizing near-zero ultra-flat dispersion of D = 0 ± 0.48 ps/(nm·km), achievable over a bandwidth of 276-492 nm in the wavelength range of 1.26 µm to 1.80 µm.
Comment: 6 pages, 13 figures, 1 table

arXiv.org e-Print Archive

CiteSeerX


Model-Architecture Co-design of Deep Neural Networks for Embedded Systems

Author: Maji Partha
Publication venue: University of Cambridge
Publication date: 24/06/2020

In deep learning, a convolutional neural network (ConvNet or CNN) is a powerful tool for building embedded applications that use data to make predictions. An application running on an embedded system typically has limited access to memory, processing power, and storage. Implementing deep convolutional neural network-based inference on resource-constrained devices can be very challenging, as these environments usually cannot draw on the massive computing power and storage available in cloud server environments. Furthermore, the constantly evolving nature of modern deep network architectures aggravates the problem by making it necessary to balance flexibility against specialisation so that a design remains able to adapt. However, much of the baseline architecture of deep convolutional neural networks has stayed the same. With careful optimisation of the most common and widely occurring layer architectures, it is typically possible to accelerate these emerging workloads on resource-constrained embedded systems. This thesis makes four contributions. I first developed a lossy three-stage low-rank approximation scheme that can reduce the computational complexity of a pre-trained model by 3-5x, and by up to 8-9x for individual convolutional layers. This scheme requires restructuring of the convolutional layers and generally suits the scenario where both the training data and the trained model are available. In many scenarios, however, the training data is not available for fine-tuning to recover any loss in prediction accuracy caused by structural changes made to a model as a post-processing step. Besides the lack of training data, there are other situations where the architecture of a model cannot be changed after training. My second contribution handles this scenario with a low-level optimisation scheme that, unlike the low-rank approximation scheme, requires no changes to the model architecture. This novel scheme uses a modified version of the Cook-Toom algorithm to reduce the computational intensity of commonly occurring dense and spatial convolutional layers and to speed up inference by 2-4x. My third contribution is an efficient implementation of the Cook-Toom class of algorithms on Arm's ubiquitous low-power Cortex processors. Unlike direct convolution, computing convolutions with the modified Cook-Toom algorithm requires a different data processing pipeline, as it involves pre- and post-transformations of the intermediate activations. I introduced a multi-channel multi-region (MCMR) scheme to enable an efficient implementation of the fast Cook-Toom algorithm. I demonstrate that, by effectively using SIMD instructions and the MCMR scheme, an average 2-3x and a peak 4x per-layer speedup is achievable. My final contribution is the Cook-Toom accelerator, a custom hardware architecture for modern convolutional neural networks. This accelerator architecture is designed from the ground up to address some of the limitations of a resource-constrained SIMD processor. I also illustrate how emerging layer types can be mapped efficiently to the same flexible architecture without any modification.
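To make the idea of trading accuracy for computation through low-rank approximation more concrete, the sketch below counts multiply-accumulate operations (MACs) for a generic two-factor factorisation of a weight matrix. This is only an illustration under stated assumptions: it is not the three-stage scheme described in the thesis, and the function names and the layer sizes (512, 512, rank 64) are hypothetical.

```python
def dense_macs(m: int, n: int) -> int:
    """MACs needed for y = W x when W has shape (m, n)."""
    return m * n


def low_rank_macs(m: int, n: int, r: int) -> int:
    """MACs when W is approximated as U @ V with U: (m, r) and V: (r, n).

    Computing V x costs r * n MACs, then U (V x) costs m * r MACs.
    """
    return r * (m + n)


def speedup(m: int, n: int, r: int) -> float:
    """Ratio of dense to low-rank MAC counts (ignores memory traffic)."""
    return dense_macs(m, n) / low_rank_macs(m, n, r)


# Hypothetical layer: 512 x 512 weights approximated with rank 64.
print(speedup(512, 512, 64))  # 4.0
```

The factorisation only reduces work when r is smaller than m * n / (m + n); whether the accuracy loss at that rank is acceptable depends on the model and is not captured by this count.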

Apollo (Cambridge)

Quarc: a high-efficiency network on-chip architecture

Authors: Maji Partha; Moadeli Mahmoud; Vanderbauwhede Wim
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2009

The novel Quarc NoC architecture, inspired by the Spidergon scheme, is introduced as a NoC architecture that is highly efficient in performing collective communication operations, including broadcast and multicast. The efficiency of the Quarc architecture is achieved by balancing the traffic, which is the result of modifications applied to the topology and the routing elements of the Spidergon NoC. This paper provides an ASIC implementation of both architectures using UMC's 0.13 µm CMOS technology and presents an analysis and comparison of the cost and performance of the Quarc and Spidergon NoCs.

Crossref

Enlighten

Near-elliptic core triangular-lattice and square-lattice PCFs: a comparison of birefringence, cut-off and GVD characteristics towards fiber device application

Authors: Chaudhuri Partha Roy; Maji Partha Sona
Publication date: 01/01/2014

In this work, a detailed numerical analysis of near-elliptic core index-guiding triangular-lattice and square-lattice photonic crystal fibers (PCFs) is reported, covering birefringence, single-mode and cut-off behavior, group velocity dispersion, and effective area. For the same relative values of d/P, triangular-lattice PCFs show higher birefringence, whereas square-lattice PCFs show a wider range of single-mode operation. The square-lattice PCF was found to be endlessly single-mode for higher air-filling fraction (d/P). Smaller lengths of triangular-lattice PCF are required for dispersion compensation, whereas square-lattice PCFs, with a nearer relative dispersion slope (RDS), can better compensate broadband dispersion. Square-lattice PCFs show a red-shifted ZDW, making them preferable for mid-IR supercontinuum generation (SCG) with highly non-linear chalcogenide material. Square-lattice PCFs show a higher dispersion slope, which leads to compression of the broadband and thus accumulates more power in the pulse. On the other hand, a triangular-lattice PCF with a flat dispersion profile can generate broader SCG. A square-lattice PCF with low group velocity dispersion (GVD) in the anomalous dispersion region corresponds to a higher dispersion length and a higher degree of solitonic interaction. The effective area of a square-lattice PCF is always greater than that of its triangular-lattice counterpart, making it better suited for high-power applications. A smaller length of symmetric-core PCF is required for dispersion compensation, while broadband dispersion compensation can be better performed with asymmetric-core PCF. Mid-infrared SCG can be better performed with asymmetric-core PCF, with a compressed and high-power pulse, while a wider range of SCG can be achieved with symmetric-core PCF. This study should therefore be useful for realizing fibers for custom applications built around these characteristics.
Comment: 10 pages, 17 figures

arXiv.org e-Print Archive

CiteSeerX

On the Reduction of Computational Complexity of Deep Convolutional Neural Networks.

Authors: Maji Partha; Mullins Robert
Publication venue: Entropy (Basel)
Publication date: 01/04/2018

Deep convolutional neural networks (ConvNets), which are at the heart of many emerging applications, achieve remarkable performance in audio and visual recognition tasks. Unfortunately, achieving accuracy often implies significant computational costs, limiting deployability. In modern ConvNets it is typical for the convolution layers to consume the vast majority of computational resources during inference. This has made the acceleration of these layers an important research area in academia and industry. In this paper, we examine the effects of co-optimizing the internal structures of the convolutional layers and the underlying implementation of the fundamental convolution operation. We demonstrate that a combination of these methods can have a big impact on the overall speedup of a ConvNet, achieving a ten-fold increase over baseline. We also introduce a new class of fast one-dimensional (1D) convolutions for ConvNets using the Toom-Cook algorithm. We show that our proposed scheme is mathematically well-grounded, robust, and does not require any time-consuming retraining, while still achieving speedups solely from convolutional layers with no loss in baseline accuracy.
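As a rough illustration of how a Toom-Cook style fast 1D convolution saves multiplications, the sketch below implements the textbook F(2,3) case: two outputs of a 3-tap filter computed from four inputs with four multiplications instead of six. It is a minimal sketch of the general technique, not the scheme proposed in the paper; the function names are hypothetical.

```python
def direct_fir(d, g):
    """Reference: sliding dot product as used in ConvNets (no filter flip).

    Produces len(d) - len(g) + 1 outputs.
    """
    k = len(g)
    return [sum(d[i + j] * g[j] for j in range(k))
            for i in range(len(d) - k + 1)]


def toom_cook_f23(d, g):
    """F(2,3): two outputs of a 3-tap filter from 4 inputs, 4 multiplications."""
    d0, d1, d2, d3 = d
    g0, g1, g2 = g

    # Filter transform: can be precomputed once per filter.
    u0 = g0
    u1 = (g0 + g1 + g2) / 2
    u2 = (g0 - g1 + g2) / 2
    u3 = g2

    # Input transform: additions and subtractions only.
    v0 = d0 - d2
    v1 = d1 + d2
    v2 = d2 - d1
    v3 = d1 - d3

    # The four element-wise multiplications.
    m0, m1, m2, m3 = v0 * u0, v1 * u1, v2 * u2, v3 * u3

    # Output transform: additions and subtractions only.
    return [m0 + m1 + m2, m1 - m2 - m3]


d = [1, 2, 3, 4]
g = [2, 1, 3]
print(direct_fir(d, g))     # [13, 19]
print(toom_cook_f23(d, g))  # [13.0, 19.0]
```

The direct version uses six multiplications for the same two outputs. Because the filter transform involves division by two, results are floating point; for larger tile sizes the transform constants grow and numerical error becomes a practical concern.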

Directory of Open Access Journals

Apollo (Cambridge)

On the effects of quantisation on model uncertainty in Bayesian neural networks

Authors: Ferianc Martin; Maji Partha; Mattina Matthew; Rodrigues Miguel
Publication venue: PMLR
Publication date: 01/01/2021

Bayesian neural networks (BNNs) are making significant progress in many research areas where decision-making needs to be accompanied by uncertainty estimation. Being able to quantify uncertainty while making decisions is essential for understanding when the model is over- or under-confident, and hence BNNs are attracting interest in safety-critical applications such as autonomous driving, healthcare, and robotics. Nevertheless, BNNs have not been as widely used in industrial practice, mainly because of their increased memory and compute costs. In this work, we investigate quantisation of BNNs by compressing 32-bit floating-point weights and activations to their integer counterparts, an approach that has already been successful in reducing the compute demand of standard pointwise neural networks. We study three types of quantised BNNs, evaluate them under a wide range of settings, and empirically demonstrate that a uniform quantisation scheme applied to BNNs does not substantially decrease the quality of their uncertainty estimation.
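For readers unfamiliar with uniform quantisation, the following sketch shows one common variant: symmetric per-tensor quantisation of floating-point values to signed 8-bit integers. It is an assumption-laden illustration, not the scheme evaluated in the paper; the function names, the choice of a symmetric range, and the sample weights are all hypothetical.

```python
def quantise_symmetric(values, num_bits=8):
    """Map floats to signed integers on a uniform grid centred at zero.

    Returns the integer codes and the scale needed to reconstruct them.
    """
    qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest grid point and clip to the representable range.
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dequantise(codes, scale):
    """Reconstruct approximate float values from integer codes."""
    return [c * scale for c in codes]


weights = [0.5, -1.0, 0.25, 0.0]              # hypothetical weights
codes, scale = quantise_symmetric(weights)
restored = dequantise(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

With this grid, the reconstruction error of each value is at most half a quantisation step (scale / 2), which is the kind of perturbation whose effect on uncertainty estimates the abstract describes studying.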

UCL Discovery