
    Why and When Can Deep – but Not Shallow – Networks Avoid the Curse of Dimensionality: a Review

    The paper characterizes classes of functions for which deep learning can be exponentially better than shallow learning. Deep convolutional networks are a special case of these conditions, though weight sharing is not the main reason for their exponential advantage.

    Regularized Neural Detection for One-Bit Massive MIMO Communication Systems

    Detection for one-bit massive MIMO systems presents several challenges, especially for higher-order constellations. Recent advances in both model-based analysis and deep learning frameworks have resulted in several robust one-bit detector designs. Our work builds on the current state-of-the-art gradient descent (GD)-based detector. We introduce two novel contributions in our detector design: (i) we augment each GD iteration with a deep learning-aided regularization step, and (ii) we introduce a novel constellation-based loss function for our regularized DNN detector. This one-bit detection strategy is applied to two different DNN architectures based on algorithm unrolling, namely, a deep unfolded neural network and a deep recurrent neural network. Being trained on multiple randomly sampled channel matrices, these networks are developed as general one-bit detectors. The numerical results show that the combination of DNN-augmented regularized GD and the constellation-based loss function improves the quality of our one-bit detector, especially for higher-order M-QAM constellations.

    Comment: Initially submitted to IEEE TMLCN in October 202
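
    To make the unrolled structure concrete, the following PyTorch sketch shows one plausible reading of such a detector: each unfolded layer performs a gradient step on a one-bit sign-consistency objective and then applies a small learned regularizer. The real-valued model y = sign(Hx + n), the squared-hinge surrogate objective, and all layer sizes are illustrative assumptions, not the paper's exact design:

        import torch
        import torch.nn as nn

        class UnfoldedOneBitDetector(nn.Module):
            """Hypothetical deep-unfolded, DNN-regularized GD detector
            for the one-bit model y = sign(Hx + n)."""

            def __init__(self, n_tx: int, n_layers: int = 10, hidden: int = 64):
                super().__init__()
                # One learned step size and one small regularizer MLP per layer.
                self.steps = nn.Parameter(torch.full((n_layers,), 0.1))
                self.regs = nn.ModuleList(
                    nn.Sequential(nn.Linear(n_tx, hidden), nn.ReLU(), nn.Linear(hidden, n_tx))
                    for _ in range(n_layers)
                )

            def forward(self, y: torch.Tensor, H: torch.Tensor) -> torch.Tensor:
                x = torch.zeros(H.shape[1], device=y.device)  # initial symbol estimate
                for mu, reg in zip(self.steps, self.regs):
                    # Gradient of a squared-hinge sign-consistency objective:
                    # the hinge is active wherever sign(Hx) disagrees with y.
                    margin = torch.relu(-y * (H @ x))
                    grad = -(H.T @ (y * margin))
                    x = x - mu * grad   # (i) gradient descent step
                    x = x + reg(x)      # (ii) deep learning-aided regularization step
                return x

    During training, a constellation-based loss in the spirit of the paper could, for example, penalize the distance from the final estimate to the nearest valid M-QAM symbol; that component is omitted from the sketch.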

    Neural network inference in digital signal processing under hard real-time constraints

    The main objective of this thesis is to investigate how neural network inference can be implemented efficiently, in terms of execution speed, on a digital signal processor under hard real-time constraints. Theory on digital signal processors, software optimization, and neural networks is discussed. A neural network model is designed for the specific use case, and a digital signal processor implementation is created based on that model. The model is built from data produced by a Matlab simulation model, then trained and validated in Python using the Keras package, and finally implemented on the CEVA-XC4500 digital signal processor in C++ with processor-specific vector-processing intrinsics. The neural network model is evaluated on accuracy, precision, recall, and f1-score, and its performance is compared to the conventional implementation of the use case via the 3GPP-specified metrics of misdetection probability, false alarm rate, and bit error rate. Execution speed is measured both with the profiling tool of the CEVA integrated development environment and with a Lauterbach PowerTrace profiling module attached to a real base station product. The resulting optimized CEVA-XC4500 implementation consumes 88 percent fewer cycles than the conventional implementation, and the neural network model fulfills the 3GPP specification requirements.
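
    As a concrete reference for the evaluation metrics named above, here is a short Python sketch (scikit-learn, with placeholder labels rather than the thesis data) computing precision, recall, f1-score, and confusion-matrix-based misdetection and false-alarm rates; reducing the 3GPP metrics to these binary-detection ratios is an assumption for illustration:

        import numpy as np
        from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score

        # Placeholder detector decisions; the thesis data is not reproduced here.
        y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
        y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])

        print("precision:", precision_score(y_true, y_pred))
        print("recall   :", recall_score(y_true, y_pred))
        print("f1-score :", f1_score(y_true, y_pred))

        # Misdetection probability and false alarm rate from the confusion matrix.
        tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
        print("P_md:", fn / (fn + tp))  # missed detections among true positives
        print("P_fa:", fp / (fp + tn))  # false alarms among true negatives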

    Theory I: Why and When Can Deep Networks Avoid the Curse of Dimensionality?

    [formerly titled "Why and When Can Deep – but Not Shallow – Networks Avoid the Curse of Dimensionality: a Review"] The paper reviews and extends an emerging body of theoretical results on deep learning, including the conditions under which it can be exponentially better than shallow learning. A class of deep convolutional networks represents an important special case of these conditions, though weight sharing is not the main reason for their exponential advantage. Implications of a few key theorems are discussed, together with new results, open problems, and conjectures.

    This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.
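
    The structural condition behind this exponential gap is compositionality. As a hedged illustration in the spirit of the paper's binary-tree example, consider a function of eight variables built from two-variable constituent functions, which a deep network can mirror node by node while a shallow network must treat it as a generic eight-variable function:

        % Binary-tree compositional function of 8 variables (illustrative):
        \[
        f(x_1,\dots,x_8) =
          h_3\Bigl(
            h_{21}\bigl(h_{11}(x_1,x_2),\,h_{12}(x_3,x_4)\bigr),\;
            h_{22}\bigl(h_{13}(x_5,x_6),\,h_{14}(x_7,x_8)\bigr)
          \Bigr)
        \]
        % For m-smooth constituents the paper's bounds are, roughly,
        % N = O(\epsilon^{-n/m}) units for a shallow network versus
        % N = O((n-1)\,\epsilon^{-2/m}) for a deep network matching the tree.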

    Doctor of Philosophy

    The goal of machine learning is to develop efficient algorithms that use training data to create models that generalize well to unseen data. Learning algorithms can use labeled data, unlabeled data, or both. Supervised learning algorithms learn a model using labeled data only. Unsupervised learning methods learn the internal structure of a dataset using only unlabeled data. Lastly, semisupervised learning is the task of finding a model using both labeled and unlabeled data. In this research work, we contribute to both supervised and semisupervised learning. We contribute to supervised learning by proposing an efficient high-dimensional space coverage scheme based on the disjunctive normal form: conjunctions of a set of half-spaces create a set of convex polytopes, and the disjunction of these polytopes can provide the desired coverage of space. Unlike traditional neural network methods, we do not initialize the model parameters randomly; as a result, our model minimizes the risk of poor local minima, and higher learning rates can be used, which leads to faster convergence. We contribute to semisupervised learning by proposing two unsupervised loss functions that form the basis of a novel semisupervised learning method. The first loss function is called Mutual-Exclusivity. The motivation for this method is the observation that an optimal decision boundary lies between the manifolds of different classes, where there are no or very few samples. Decision boundaries can be pushed away from training samples by maximizing their margin, and it is not necessary to know the class labels of the samples to maximize the margin. The second loss is named Transformation/Stability and is based on the fact that the prediction of a classifier for a data sample should not change under transformations and perturbations applied to that sample; likewise, internal variations of a learning system should have little to no effect on the output. The proposed loss minimizes the variation in the network's predictions for a specific data sample, as sketched below. We also show that the same technique can be used to improve the robustness of a learning model with respect to adversarial examples.
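
    As one plausible rendering of the Transformation/Stability idea, the sketch below passes the same sample through a stochastic network several times and penalizes the variance of its predictions; the use of dropout as the perturbation source, and all names, are illustrative assumptions rather than the dissertation's exact formulation:

        import torch

        def transformation_stability_loss(model: torch.nn.Module,
                                          x: torch.Tensor,
                                          n_passes: int = 4) -> torch.Tensor:
            """Unsupervised loss: predictions for the same sample should
            agree across random perturbations (here: dropout noise)."""
            model.train()  # keep dropout and similar stochastic layers active
            preds = torch.stack(
                [torch.softmax(model(x), dim=-1) for _ in range(n_passes)]
            )
            mean = preds.mean(dim=0, keepdim=True)
            # Average squared deviation from the mean prediction; no labels used.
            return ((preds - mean) ** 2).sum(dim=-1).mean()

    Minimized alongside a supervised loss on the labeled subset, a term of this form yields a semisupervised objective of the kind described above.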