A compressed sensing approach to block-iterative equalization: connections and applications to radar imaging reconstruction
The widespread presence of underdetermined systems has brought forth a variety of new algorithmic solutions, which capitalize on the Compressed Sensing (CS) of sparse data. While well-known greedy or iterative-thresholding CS recursions take the form of an adaptive filter followed by a proximal operator, this is no different in spirit from the role of block-iterative decision-feedback equalizers (BI-DFE), where structure is roughly exploited by the signal-constellation slicer. By taking advantage of the intrinsic sparsity of signal modulations in a communications scenario, interblock interference (IBI) can be approached more cunningly in light of CS concepts, whereby the optimal feedback of detected symbols is devised adaptively. The new DFE takes the form of a more efficient re-estimation scheme, proposed under recursive-least-squares-based adaptations. Whenever suitable, these recursions are derived under a reduced-complexity, widely-linear formulation, which further reduces the minimum mean-square error (MMSE) in comparison with traditional strictly-linear approaches. Besides maximizing system throughput, the new algorithms exhibit significantly higher performance than existing methods. Our reasoning will also show that a properly formulated BI-DFE turns out to be a powerful CS algorithm itself. A new algorithm, referred to as CS-Block DFE (CS-BDFE), exhibits improved convergence and detection compared to first-order methods, thus outperforming the state-of-the-art Complex Approximate Message Passing (CAMP) recursions. The merits of the new recursions are illustrated under a novel 3D MIMO radar formulation, where the CAMP algorithm is shown to fail with respect to important performance measures.
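The adaptive-filter-plus-proximal-operator structure this abstract refers to can be illustrated with a minimal iterative soft-thresholding (ISTA) recursion for sparse recovery. This is a generic sketch of that CS recursion family, not the CS-BDFE algorithm itself; all names and parameter values are illustrative.

```python
import numpy as np

def ista(A, y, lam=0.01, step=None, iters=1000):
    """Minimal ISTA sketch: a linear filtering update (gradient step on the
    residual) followed by a proximal operator -- here soft-thresholding,
    which plays the sparsity-enforcing role of the constellation slicer."""
    m, n = A.shape
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L for the quadratic term
    x = np.zeros(n)
    for _ in range(iters):
        r = y - A @ x                            # residual (filtering role)
        z = x + step * A.T @ r                   # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # prox step
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100)) / np.sqrt(40)   # underdetermined system
x_true = np.zeros(100)
x_true[[3, 17, 58]] = [1.5, -2.0, 1.0]             # 3-sparse signal
x_hat = ista(A, A @ x_true)
print(np.argsort(-np.abs(x_hat))[:3])              # should recover the support
```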
FPGA Implementation of Fast Fourier Transform Core Using NEDA
Transforms like the DFT are a major block in communication systems such as OFDM. This thesis reports the architecture of a DFT core using NEDA. The advantage of the proposed architecture is that the entire transform can be implemented using adders/subtractors and shifters only, thus minimising the hardware requirement compared to other architectures. The proposed design is implemented for a 16-bit data path (12-bit for comparison), considering both integer and fixed-point representations, thus increasing the scope of usage. The proposed design is mapped onto a Xilinx XC2VP30 FPGA, which is fabricated using a 130 nm process technology. The maximum on-board frequency of operation of the proposed design is 122 MHz. NEDA is one of the techniques used to implement signal processing systems that require multiply-and-accumulate units. The FFT is one of the most widely employed blocks in communication and signal processing systems. An FPGA implementation of a 16-point radix-4 complex FFT is proposed. The proposed design improves hardware utilisation compared to traditional methods and has been implemented on a range of FPGAs to compare performance. The maximum frequency achieved is 114.27 MHz on the XC5VLX330 FPGA, with a maximum throughput of 1828.32 Mbit/s and a minimum slice-delay product of 9.18. The design is also implemented using Synopsys DC synthesis in both 65 nm and 180 nm technology libraries. The advantages of multiplier-less architectures are reduced hardware and improved latency. The multiplier-less architectures for the implementation of a radix-2^2 folded pipelined complex FFT core are based on NEDA. The number of points considered in this work is sixteen, and the folding is done by a factor of four. The proposed designs are implemented on a Xilinx XC5VSX240T FPGA. The proposed NEDA-based designs reduce area by over 83%; the observed slice-delay products for the NEDA-based designs are 2.196 and 5.735.
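The multiplier-less idea underlying NEDA-style designs can be sketched in software: a fixed transform coefficient is decomposed into powers of two, so each product becomes a sum of shifted copies of the input. This is only an illustrative sketch of shift-and-add constant multiplication, not the thesis's NEDA architecture; the names and the Q8 encoding are assumptions.

```python
def shift_add_multiply(x, coeff_bits):
    """Multiply integer x by a fixed-point constant encoded as a list of
    (sign, bit_position) terms, using only shifts and adds/subtracts --
    no hardware multiplier required."""
    acc = 0
    for sign, pos in coeff_bits:
        acc += sign * (x << pos)   # each term is a shifted copy of x
    return acc

# Example: cos(pi/4) ~ 0.7071 quantised to Q8 is 181 = 128 + 32 + 16 + 4 + 1
COS45_Q8 = [(1, 7), (1, 5), (1, 4), (1, 2), (1, 0)]
y = shift_add_multiply(100, COS45_Q8)   # == 100 * 181
print(y, y / 256.0)                     # Q8 product and its real-valued result
```

In hardware, each `(sign, pos)` term maps to one adder/subtractor fed through a fixed wiring shift, which is why such designs trade multipliers for a small tree of adders.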
Deep neural mobile networking
The next generation of mobile networks is set to become increasingly complex, as networks struggle to accommodate the tremendous data traffic demands generated by ever-more connected devices with diverse performance requirements in terms of throughput, latency, and reliability. This makes monitoring and managing the multitude of network elements intractable with existing tools, and impractical for traditional machine learning algorithms that rely on hand-crafted feature engineering. In this context, embedding machine intelligence into mobile networks becomes necessary, as it enables systematic mining of valuable information from mobile big data and automatically uncovers correlations that would otherwise be too difficult for human experts to extract. In particular, deep learning based solutions can automatically extract features from raw data, without human expertise. The performance that artificial intelligence (AI) has achieved in other domains has drawn unprecedented interest from both academia and industry in employing deep learning approaches to address technical challenges in mobile networks.
This thesis attacks important problems in the mobile networking area from various perspectives by harnessing recent advances in deep neural networks. As a preamble, we bridge the gap between deep learning and mobile networking by presenting a survey on the crossovers between the two areas. Secondly, we design dedicated deep learning architectures to forecast mobile traffic consumption at city scale. In particular, we tailor our deep neural network models to different mobile traffic data structures (i.e. data originating from urban grids and geospatial point-cloud antenna deployments) to deliver precise predictions. Next, we propose a mobile traffic super resolution (MTSR) technique to achieve coarse-to-fine grain transformations on mobile traffic measurements using generative adversarial network architectures. This can provide mobile operators with insightful knowledge about mobile traffic distribution, while effectively reducing the data post-processing overhead. Subsequently, a mobile traffic decomposition (MTD) technique is proposed to break aggregated mobile traffic measurements into service-level time series, using a deep learning based framework. With MTD, mobile operators can perform more efficient resource allocation for network slicing (i.e. the logical partitioning of physical infrastructure) and alleviate the privacy concerns that come with the extensive use of deep packet inspection. Finally, we study the robustness of network-specific deep anomaly detectors under a realistic black-box threat model and propose reliable solutions for defending against attacks that seek to subvert deep learning based network intrusion detection systems (NIDS).
Lastly, based on the results obtained, we identify important research directions that are worth pursuing in the future, including (i) serving deep learning with massive high-quality data, (ii) deep learning for spatio-temporal mobile data mining, (iii) deep learning for geometric mobile data mining, (iv) deep unsupervised learning in mobile networks, and (v) deep reinforcement learning for mobile network control. Overall, this thesis demonstrates that deep learning can underpin powerful tools that address data-driven problems in the mobile networking domain. With such intelligence, future mobile networks can be monitored and managed more effectively, and thus a higher user quality of experience can be guaranteed.
RFold: RNA Secondary Structure Prediction with Decoupled Optimization
The secondary structure of ribonucleic acid (RNA) is more stable and
accessible in the cell than its tertiary structure, making it essential for
functional prediction. Although deep learning has shown promising results in
this field, current methods suffer from poor generalization and high
complexity. In this work, we present RFold, a simple yet effective end-to-end
RNA secondary structure prediction method. RFold introduces a
decoupled optimization process that decomposes the vanilla constraint
satisfaction problem into row-wise and column-wise optimization, simplifying
the solving process while guaranteeing the validity of the output. Moreover,
RFold adopts attention maps as informative representations instead of designing
hand-crafted features. Extensive experiments demonstrate that RFold achieves
competitive performance and inference about eight times faster than the
state-of-the-art method. The code and Colab demo are available at
\href{http://github.com/A4Bio/RFold}{http://github.com/A4Bio/RFold}.
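The row-wise/column-wise decoupling described above can be illustrated with a toy constraint-satisfaction step: from a matrix of pairing scores, keep an entry only if it wins both its row and its column, which guarantees a valid output with at most one partner per base. This is an illustrative hard-argmax sketch under simplified constraints, not RFold's exact (softmax-based) formulation; the function name is hypothetical.

```python
import numpy as np

def decoupled_pairing(scores):
    """Toy row/column-decoupled constraint satisfaction: an entry survives
    only if it is the argmax of both its row and its column, so every row
    and column of the output contains at most a single 1."""
    n = scores.shape[0]
    row_best = scores.argmax(axis=1)        # best partner per row
    col_best = scores.argmax(axis=0)        # best partner per column
    pairing = np.zeros_like(scores, dtype=int)
    for i in range(n):
        j = row_best[i]
        if col_best[j] == i:                # mutual best match only
            pairing[i, j] = 1
    return pairing

S = np.array([[0.1, 0.9, 0.2],
              [0.8, 0.1, 0.3],
              [0.2, 0.4, 0.7]])
P = decoupled_pairing(S)
print(P)   # each row and each column sums to at most 1
```

Because the row pass and the column pass are independent, validity falls out of their intersection rather than from solving a coupled assignment problem, which is the essence of the decoupling.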
Modelling the transcriptional regulation of androgen receptor in prostate cancer
Transcription of genes and production of proteins are essential functions of a normal cell. If disturbed, misregulation of crucial genes leads to aberrant cell behaviour and, in some cases, to the development of diseased states such as cancer. One major transcriptional regulation mechanism involves the binding of a transcription factor onto enhancer sequences, which will encourage or repress transcription depending on the role of the transcription factor. In prostate cells, misregulation of the androgen receptor (AR), a key transcriptional regulator, leads to the development and maintenance of prostate cancer. The androgen receptor binds to numerous locations in the genome, but it is still unclear how, and which, other key transcription factors aid or repress AR-mediated transcription. Here I analyzed data capturing the transcriptional activity of 4139 putative AR binding sites (ARBS) in the genome, with and without the presence of hormone, using the STARR-seq assay. Only a small fraction of ARBS showed significant differential expression when treated with hormone. To understand the essential factors underlying hormone-dependent behaviour, we developed both machine learning and biophysical models to identify active enhancers in prostate cancer cells. We also identify potentially crucial transcription factors for androgen-dependent behaviour and discuss the benefits and shortcomings of each modelling method.
Efficient architectures and power modelling of multiresolution analysis algorithms on FPGA
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. In the past two decades, there has been a huge amount of interest in Multiresolution Analysis Algorithms (MAAs) and their applications. Some of their applications, such as medical imaging, are computationally intensive and power hungry and require large amounts of memory, which creates a high demand for efficient algorithm implementation, low-power architectures and acceleration. Recently, some MAAs, such as the Finite Ridgelet Transform (FRIT) and the Haar Wavelet Transform (HWT), have become very popular, as they are suitable for a number of image processing applications such as detection of line singularities and contiguous edges, edge detection (useful for compression and feature detection), and medical image denoising and segmentation. Efficient hardware implementation and acceleration of these algorithms, particularly when addressing large problems, is becoming very challenging and consumes a lot of power, which leads to a number of issues, including mobility and reliability concerns. To overcome the computational problems, Field Programmable Gate Arrays (FPGAs) are the technology of choice for accelerating computationally intensive applications due to their high performance. Addressing the power issue requires optimisation and awareness at all levels of abstraction in the design flow.
The most important achievements of the work presented in this thesis are summarised
here.
Two factorisation methodologies for the HWT, called HWT Factorisation Method 1 (HWTFM1) and HWT Factorisation Method 2 (HWTFM2), have been explored to increase the number of zeros and reduce hardware resources. In addition, two novel, efficient and optimised architectures for the proposed methodologies, based on Distributed Arithmetic (DA) principles, have been proposed. The evaluation of the architectural results has shown that the proposed architectures reduce the arithmetic calculations (additions/subtractions) by 33% and 25% respectively compared to a direct implementation of the HWT, and outperform existing results. The proposed HWTFM2 is implemented on advanced, low-power FPGA devices using the Handel-C language. The FPGA implementation results outperform other existing results in terms of area and maximum frequency. In addition, a novel efficient architecture for the Finite Radon Transform (FRAT) has also been proposed. The proposed architecture is integrated with the developed HWT architecture to build an optimised architecture for the FRIT. Strategies such as parallelism and pipelining have been deployed at the architectural level for efficient implementation on different FPGA devices. The performance of the proposed FRIT architecture has been evaluated, and the results outperform other existing architectures. Both FRAT and FRIT architectures have been implemented on FPGAs using the Handel-C language. The evaluation of both architectures has shown that the obtained results outperform existing results by almost 10% in terms of frequency and area. The proposed architectures are also applied to image data (256 × 256), and their Peak Signal-to-Noise Ratio (PSNR) is evaluated for quality purposes.
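The reason the HWT lends itself to multiplier-less, DA-style hardware is that one decomposition level needs only sums and differences of neighbouring samples, with the normalisation reducible to a shift. The sketch below shows a single unnormalised Haar level; it is a generic illustration, not the thesis's HWTFM1/HWTFM2 factorisations.

```python
def haar_level(x):
    """One level of the (unnormalised) Haar wavelet transform: only
    additions and subtractions, which is what distributed-arithmetic
    hardware exploits. The 1/2 normalisation is a deferred right-shift."""
    approx = [x[i] + x[i + 1] for i in range(0, len(x), 2)]  # low-pass band
    detail = [x[i] - x[i + 1] for i in range(0, len(x), 2)]  # high-pass band
    return approx, detail

a, d = haar_level([4, 6, 10, 12, 8, 8, 0, 2])
print(a)  # [10, 22, 16, 2]
print(d)  # [-2, -2, 0, -2]
```

Further levels recurse on the approximation band, so the full transform remains adder/subtractor-only end to end.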
Two architectures for cyclic convolution based on systolic arrays, using parallelism and pipelining, which can serve as the main building block for the proposed FRIT architecture, have been proposed. The first proposed architecture is a linear systolic array with a pipelined process, and the second is a systolic array with a parallel process. The second architecture reduces the number of registers by 42% compared to the first, and both architectures outperform other existing results. The proposed pipelined architecture has been implemented on different FPGA devices with vector sizes (N) of 4, 8, 16 and 32 and a word length (W) of 8. The implementation results have shown a significant improvement and outperform other existing results.
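The operation those systolic arrays compute is the N-point cyclic convolution, where each cell accumulates one output sample. A software reference of that computation, cross-checked via the circular-convolution theorem, might look as follows (an illustrative sketch, not the proposed architecture):

```python
import numpy as np

def cyclic_convolution(x, h):
    """Direct N-point cyclic convolution, the per-cell accumulation a
    systolic array performs: y[n] = sum_k x[k] * h[(n - k) mod N]."""
    N = len(x)
    return [sum(x[k] * h[(n - k) % N] for k in range(N)) for n in range(N)]

x = [1, 2, 3, 4]
h = [1, 0, 1, 0]
y = cyclic_convolution(x, h)
print(y)                                          # [4, 6, 4, 6]

# Cross-check: circular convolution equals pointwise multiplication of DFTs.
y_fft = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)))
print(np.allclose(y, y_fft))                      # True
```

In the array realisation, the inner sum is unrolled across cells so a new output emerges every cycle once the pipeline is full.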
Ultimately, an in-depth evaluation of a high-level power macromodelling technique for design space exploration and characterisation of custom IP cores for FPGAs, called the functional-level power modelling approach, has been presented. The mathematical techniques that form the basis of the proposed power modelling have been validated on a range of custom IP cores. The proposed power modelling is scalable, platform independent and compares favourably with existing approaches. A hybrid, top-down design flow paradigm integrating functional-level power modelling with commercially available design tools for systematic optimisation of IP cores has also been developed. The in-depth evaluation of this tool enables us to observe the behaviour of different custom IP cores in terms of power consumption and accuracy using different design methodologies and arithmetic techniques on various FPGA platforms. Based on the results achieved, the proposed model's accuracy is almost 99% for all IP cores' Dynamic Power (DP) components.
Thomas Gerald Gray Charitable Trust
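A functional-level power macromodel of the kind evaluated above is typically a closed-form expression fitted once from characterisation runs and then reused during design-space exploration. The sketch below fits a simple least-squares macromodel to hypothetical characterisation data; the model form, data values, and names are assumptions for illustration, not the thesis's actual model.

```python
import numpy as np

# Hypothetical characterisation data for one IP core:
# clock frequency (MHz), average switching activity, measured dynamic power (mW).
freq     = np.array([  25,   50,    75,   100,   125,   150], dtype=float)
activity = np.array([ 0.2,  0.3,  0.25,   0.4,  0.35,   0.5])
power    = np.array([12.0, 27.2,  35.5,  62.4,  70.5, 107.8])

# Assumed macromodel form: P ~ c0 + c1*f + c2*(f * activity),
# fitted by least squares, then reused without gate-level re-simulation.
X = np.column_stack([np.ones_like(freq), freq, freq * activity])
coeffs, *_ = np.linalg.lstsq(X, power, rcond=None)
pred = X @ coeffs
err = np.abs(pred - power) / power * 100           # per-sample error (%)
print(coeffs)
print(f"max relative error {err.max():.2f}%")
```

Once fitted, evaluating the macromodel for a candidate design point is a single dot product, which is what makes functional-level models attractive for rapid exploration.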
Identifying and Harnessing Concurrency for Parallel and Distributed Network Simulation
Although computer networks are inherently parallel systems, the parallel execution of network simulations on interconnected processors frequently yields only limited benefits. In this thesis, methods are proposed to estimate and understand the parallelization potential of network simulations. Further, mechanisms and architectures for exploiting the massively parallel processing resources of modern graphics cards to accelerate network simulations are proposed and evaluated.
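One common way to estimate parallelization potential in discrete-event simulation is critical-path analysis: the ratio of total event count to the longest dependency chain bounds the achievable speedup. The sketch below computes that bound on a toy event graph; it is a generic illustration of the idea, not the thesis's specific estimation method, and all names are hypothetical.

```python
def critical_path_speedup(events, deps):
    """Upper-bound the speedup of a discrete-event simulation as
    (total events) / (critical-path length): even with unlimited
    processors, dependent events must still execute sequentially."""
    depth = {}
    def d(e):
        if e not in depth:
            depth[e] = 1 + max((d(p) for p in deps.get(e, [])), default=0)
        return depth[e]
    longest = max(d(e) for e in events)
    return len(events) / longest

# Hypothetical dependency graph: event -> events it must wait for.
events = ["a", "b", "c", "d", "e", "f"]
deps = {"c": ["a"], "d": ["b"], "e": ["c", "d"], "f": ["e"]}
print(critical_path_speedup(events, deps))   # 6 events / path length 4 = 1.5
```

A ratio close to 1 indicates the simulation is essentially sequential regardless of hardware, which is one way to explain the limited benefits noted above.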