A novel non-intrusive objective method to predict voice quality of service in LTE networks.
This research aimed to introduce a novel approach for non-intrusive objective measurement of voice Quality of Service (QoS) in LTE networks. In pursuing this aim, the thesis established a thorough understanding of how voice traffic is handled in LTE networks; of the LTE network architecture and its similarities to and differences from its predecessors and traditional terrestrial IP networks; and, most importantly, of the QoS-affecting parameters that are exclusive to LTE environments. Mean Opinion Score (MOS) is the scoring system used to measure the QoS of voice traffic, and it was originally intended to be measured subjectively. Subjective QoS measurement methods are costly and time-consuming, so objective methods such as Perceptual Evaluation of Speech Quality (PESQ) were developed to address these limitations. These objective methods correlate highly with subjective MOS scores. However, they either require individual calculation of many network parameters or are intrusive in nature, requiring access to both the reference signal and the degraded signal for comparison by software. The current objective methods are therefore not suitable for real-time measurement and prediction scenarios.
A major contribution of the research was identifying LTE-specific QoS-affecting parameters. No previous work combines these parameters to assess their impact on QoS.
The experiment was configured in a hardware-in-the-loop environment. This configuration could serve as a platform for future research requiring simulation of voice traffic in LTE environments.
The key contribution of this research is a novel non-intrusive objective method
for QoS measurement and prediction using neural networks. A comparative
analysis is presented that examines the performance of four neural network
algorithms for non-intrusive measurement and prediction of voice quality over
LTE networks. In conclusion, the Bayesian Regularization algorithm with 4 neurons in the hidden layer and a sigmoid symmetric transfer function was identified as the best solution, with a Mean Square Error (MSE) of 0.001 and a regression value of 0.998 on the testing data set.
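As a hedged sketch of the winning architecture (not the thesis's actual implementation or data): a single-hidden-layer network with 4 tanh ("sigmoid symmetric") neurons mapping network parameters to a predicted MOS, trained by plain gradient descent. The feature names (jitter, packet loss, delay), the synthetic target, and the learning rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for QoS-affecting inputs (illustrative only):
# columns = [jitter, packet_loss, delay] (normalized), target = MOS-like score.
X = rng.uniform(0, 1, size=(200, 3))
y = (4.5 - 2.0 * X[:, 1] - 1.0 * X[:, 2] + 0.1 * rng.normal(size=200)).reshape(-1, 1)

# One hidden layer with 4 tanh ("sigmoid symmetric") neurons, linear output.
W1 = rng.normal(scale=0.5, size=(3, 4)); b1 = np.zeros(4)
W2 = rng.normal(scale=0.5, size=(4, 1)); b2 = np.zeros(1)

def forward(X):
    h = np.tanh(X @ W1 + b1)        # hidden activations
    return h, h @ W2 + b2           # predicted score

def mse(pred, y):
    return float(np.mean((pred - y) ** 2))

lr = 0.05
loss_before = mse(forward(X)[1], y)
for _ in range(500):                 # plain batch gradient descent
    h, pred = forward(X)
    g = 2 * (pred - y) / len(X)      # dMSE/dpred
    gh = (g @ W2.T) * (1 - h ** 2)   # backprop through tanh
    W2 -= lr * h.T @ g; b2 -= lr * g.sum(0)
    W1 -= lr * X.T @ gh; b1 -= lr * gh.sum(0)

loss_after = mse(forward(X)[1], y)
print(loss_after < loss_before)      # training reduces the MSE
```

The thesis reports Bayesian Regularization as the best training algorithm; the simple gradient-descent loop here only illustrates the network shape, not that training method.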
Cellular and Wi-Fi technologies evolution: from complementarity to competition
This PhD thesis has the particularity of spanning a long period of time because, while working on it, I was employed as a research engineer at CTTC with highly demanding development duties. This delayed the deposit more than I would have liked. On the other hand, it has given me the privilege of witnessing and studying how wireless technologies have evolved over a decade, from 4G to 5G and beyond.
When I started my PhD thesis, IEEE and 3GPP were defining the two main wireless technologies of the time, Wi-Fi and LTE, to cover two substantially complementary market targets. Wi-Fi was designed to operate mostly indoors, in unlicensed spectrum, and was meant to be a simple and cheap technology. Its primary mechanism for coexistence was based on the assumption that the spectrum on which it operated was free of charge, so it was designed around interference avoidance through the famous CSMA/CA protocol. On the other hand, 3GPP was designing technologies for licensed spectrum, a costly kind of spectrum. As a result, LTE was designed to take the best advantage of that spectrum while providing the best QoE in mainly outdoor scenarios.
The PhD thesis starts in this context and evolves with these two technologies. In the first chapters, the thesis studies radio resource management solutions for the standalone operation of Wi-Fi in unlicensed and LTE in licensed spectrum. We anticipated the now-fundamental machine learning trend by working on machine learning-based radio resource management solutions to improve LTE and Wi-Fi operation in their respective spectrum. We pay particular attention to small cell deployments aimed at improving spectrum efficiency in licensed spectrum, reproducing the short-range scenarios typical of Wi-Fi settings.
IEEE and 3GPP continued evolving the technologies over the years: Wi-Fi has grown into a much more complex and sophisticated technology, incorporating key features of cellular technologies, such as HARQ, OFDMA, MU-MIMO, MAC scheduling and spatial reuse. On the other hand, since Release 13, cellular networks have also been designed for unlicensed spectrum. As a result, the last two chapters of this thesis focus on coexistence scenarios, in which LTE needs to be designed to coexist fairly with Wi-Fi, and NR, the radio access for 5G, with Wi-Fi at 5 GHz and WiGig at 60 GHz. Unlike LTE, which was adapted to operate in unlicensed spectrum, NR-U is natively designed with this feature, including the capability to operate in unlicensed spectrum in a completely standalone fashion, a fundamental new milestone for cellular. In this context, our focus of analysis changes. We consider that these two technological families are no longer targeting complementarity but are now competing, and we claim that this will be the trend for the years to come.
To enable the research in these multi-RAT scenarios, another fundamental result of this PhD thesis, besides the scientific contributions, is the release of high-fidelity models for LTE and NR and their coexistence with Wi-Fi and WiGig to the ns-3 open-source community. ns-3 is a popular open-source network simulator with the characteristic of being multi-RAT, which naturally allows the evaluation of coexistence scenarios between different technologies. These models, for which I led the development, are, by academic citations, the most used open-source simulation models for LTE and NR, and they have received funding from industry (Ubiquisys, WFA, SpiderCloud, Interdigital, Facebook) and federal agencies (NIST, LLNL) over the years.
Novel Neural Network Applications to Mode Choice in Transportation: Estimating Value of Travel Time and Modelling Psycho-Attitudinal Factors
Whenever researchers wish to study the behaviour of individuals choosing among a set of alternatives, they usually rely on models based on random utility theory, which postulates that individuals adapt their behaviour so as to maximise their utility. These models, often identified as discrete choice models (DCMs), usually require the definition of a utility for each alternative, by first identifying the variables influencing the decisions. Traditionally, DCMs focused on observable variables and treated users as optimizing agents with predetermined needs. However, such an approach contrasts with results from studies in the social sciences, which show that choice behaviour can be influenced by psychological factors such as attitudes and preferences. Recently there have been formulations of DCMs which include latent constructs to capture the impact of subjective factors. These are called hybrid choice models, or integrated choice and latent variable (ICLV) models. However, DCMs are not exempt from issues, such as the fact that researchers have to choose the variables to include and their relations in order to define the utilities. This is probably one of the reasons which has recently led to an influx of studies using machine learning (ML) methods to study mode choice, in which researchers have tried to find alternative ways to analyse travellers' choice behaviour. An ML algorithm is any generic method that uses the data itself to understand and build a model, improving its performance the more it is allowed to learn. This means ML methods do not require any a priori input or hypotheses on the structure and nature of the relationships between the several variables used as inputs.
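As a hedged illustration of the random-utility idea behind DCMs (a generic multinomial logit, not the thesis's specific models): linear utilities over observed variables are turned into choice probabilities via a softmax. The mode names, attributes, and coefficients below are illustrative assumptions.

```python
import numpy as np

# Illustrative multinomial logit: utility = beta . x for each mode.
# Hypothetical attributes per alternative: [travel_time_h, cost].
X = np.array([
    [0.5, 2.0],   # bus
    [0.3, 5.0],   # car
    [0.4, 3.0],   # rail
])
beta = np.array([-2.0, -0.3])    # assumed disutility of time and of cost

V = X @ beta                     # systematic utilities
P = np.exp(V) / np.exp(V).sum()  # logit choice probabilities

print(int(P.argmax()))           # the highest-utility alternative is most probable
```

In this linear-utility setting the value of travel time falls out as beta_time / beta_cost ≈ 6.7 cost units per hour; the thesis's contribution, by contrast, is a model-agnostic way of recovering such a quantity from individual-level probabilities, applicable to ML models as well.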
ML models are usually considered black-box methods, but whenever researchers have felt the need for interpretability of ML results, they have tried to find alternative ways to use ML methods, such as building them with some a priori knowledge to induce specific constraints. Some researchers have also transformed the outputs of ML algorithms so that they could be interpreted from an economic point of view, or have built hybrid ML-DCM models. The objective of this thesis is to investigate the benefits and disadvantages of adopting either DCMs or ML methods to study the phenomenon of mode choice in transportation. The strongest feature of DCMs is that they produce very precise and descriptive results, allowing for a thorough interpretation of their outputs. On the other hand, ML models offer a substantial benefit by being truly data-driven methods, learning most relations from the data itself. As a first contribution, we tested an alternative method for calculating the value of travel time (VTT) from the results of ML algorithms. VTT is a very informative parameter, since the time consumed by individuals whenever they need to travel normally represents an undesirable factor, so they are usually willing to exchange money to reduce travel times. The proposed method is independent of the mode-choice functions, so it can be applied equally to econometric models and ML methods, provided they allow the estimation of individual-level probabilities. Another contribution of this thesis is a neural network (NN) for the estimation of choice models with latent variables, as an alternative to DCMs. This issue arose from wanting to include in ML models not only level-of-service variables of the alternatives and socio-economic attributes of the individuals, but also psycho-attitudinal indicators, to better describe the influence of psychological factors on choice behaviour. The results were estimated using two different datasets.
Since NN results depend on the values of their hyper-parameters and on their initialization, several NNs were estimated using different hyper-parameters to find the optimal values, which were then used to verify the stability of the results under different initializations.
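A minimal sketch of the stability check described above, under assumed conditions (the thesis's own NN architecture and datasets are not reproduced): the same small network is trained several times on a fixed synthetic task, with only the random weight initialization varying, and the spread of final errors measures stability.

```python
import numpy as np

def train_once(seed, hidden=8, epochs=300, lr=0.1):
    """Train a tiny one-hidden-layer net on a fixed synthetic task and
    return its final mean-squared error; only the weight init varies."""
    data_rng = np.random.default_rng(0)            # dataset fixed across runs
    X = data_rng.uniform(-1, 1, size=(100, 2))
    y = (X[:, 0] * X[:, 1]).reshape(-1, 1)         # illustrative target

    w_rng = np.random.default_rng(seed)            # initialization varies
    W1 = w_rng.normal(scale=0.5, size=(2, hidden)); b1 = np.zeros(hidden)
    W2 = w_rng.normal(scale=0.5, size=(hidden, 1)); b2 = np.zeros(1)
    for _ in range(epochs):                        # plain gradient descent
        h = np.tanh(X @ W1 + b1)
        pred = h @ W2 + b2
        g = 2 * (pred - y) / len(X)
        gh = (g @ W2.T) * (1 - h ** 2)
        W2 -= lr * h.T @ g; b2 -= lr * g.sum(0)
        W1 -= lr * X.T @ gh; b1 -= lr * gh.sum(0)
    return float(np.mean((np.tanh(X @ W1 + b1) @ W2 + b2 - y) ** 2))

losses = [train_once(seed) for seed in range(5)]
spread = max(losses) - min(losses)
print(spread)   # a small spread suggests results are stable across initializations
```

In the thesis this kind of loop would sit inside an outer hyper-parameter search; here the hyper-parameters are held fixed to isolate the initialization effect.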
Next-Generation Self-Organizing Networks through a Machine Learning Approach
Date of the doctoral thesis defense: 17 December 2018. To reduce the management costs of cellular networks, which have grown in complexity over time, the concept of self-organizing networks (SON) emerged: the automation of the management tasks of a cellular network in order to reduce infrastructure (CAPEX) and operating (OPEX) costs.
SON tasks are divided into three categories: self-configuration, self-optimization and self-healing. The objective of this thesis is to improve SON functions through the development and use of machine learning (ML) tools for network management. On the one hand, self-healing is addressed through the proposal of a novel tool for automatic diagnosis (root cause analysis, RCA), consisting of the combination of multiple independent RCA systems to build an improved composite RCA system. In turn, to increase the accuracy of RCA tools while reducing both CAPEX and OPEX, this thesis proposes and evaluates dimensionality-reduction ML tools in combination with RCA tools.
On the other hand, this thesis studies multi-link functionalities within self-optimization and proposes techniques for their automatic management. In the field of enhanced mobile broadband communications, a tool for radio carrier management is proposed that enables the implementation of operator policies, while, in the field of low-latency vehicular communications, a multipath mechanism is proposed for redirecting traffic across multiple radio interfaces.
Many of the methods proposed in this thesis have been evaluated using data from real cellular networks, which has made it possible to demonstrate their validity in realistic environments, as well as their capacity to be deployed in current and future mobile networks.
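A hedged sketch of the dimensionality-reduction-plus-diagnosis idea (not the thesis's actual RCA system): hypothetical per-cell KPI vectors are projected with PCA, and a fault cause is then classified on the reduced features. The KPI dimensionality, the fault labels, and the nearest-centroid classifier are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-cell KPI vectors (e.g. drop rate, RSRP, throughput, ...),
# each labeled with an assumed fault cause: 0 = coverage hole, 1 = interference.
n, d = 100, 8
labels = rng.integers(0, 2, size=n)
centers = np.array([[0.0] * d, [2.0] * d])      # well-separated synthetic faults
kpis = centers[labels] + rng.normal(size=(n, d))

# PCA via SVD: keep the top-2 principal components.
mu = kpis.mean(axis=0)
U, S, Vt = np.linalg.svd(kpis - mu, full_matrices=False)
reduced = (kpis - mu) @ Vt[:2].T                # n x 2 reduced features

# Minimal diagnosis stage: nearest class centroid in the reduced space.
cents = np.stack([reduced[labels == c].mean(axis=0) for c in (0, 1)])
pred = np.argmin(((reduced[:, None, :] - cents) ** 2).sum(-1), axis=1)
accuracy = float((pred == labels).mean())
print(accuracy)   # cleanly separated synthetic faults are recovered
```

The thesis combines RCA with dimensionality reduction precisely so that the diagnosis stage works on fewer, more informative features, as the two-component projection above is meant to suggest.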
Mobility management in multi-RAT multi-band heterogeneous networks
Support for user mobility is the raison d'être of mobile cellular networks. However, mounting pressure for more capacity is leading to the adoption of multi-band multi-RAT ultra-dense network designs, particularly with the increased use of mmWave-based small cells. While such a design for emerging cellular networks is expected to offer many-fold more capacity, it gives rise to a new set of challenges in user mobility management. Among others, the most critical challenges are frequent handovers (HOs), and thus a higher impact of poor mobility management on quality of user experience (QoE) and link capacity; the lack of an intelligent solution to manage the activation and deactivation of dual connectivity (of a user with both 4G and 5G cells); and mmWave cell discovery. In this dissertation, I propose and evaluate a set of solutions to address the aforementioned challenges.
The first outcome of our investigations into the aforementioned problems is the first-ever taxonomy of mobility-related 3GPP-defined network parameters and Key Performance Indicators (KPIs), followed by a tutorial on 3GPP-based 5G mobility management procedures. The first major contribution of the thesis is a novel framework to characterize the relationship between the 28 critical mobility-related network parameters and the 8 most vital KPIs.
A critical hurdle in addressing all mobility-related challenges in emerging networks is the complexity of modeling a realistic mobility and HO process. Mathematical models are not suitable here, as they cannot capture the dynamics or the myriad parameters and KPIs involved. Existing simulators also mostly either omit or overly abstract HOs and user mobility, chiefly because the problems caused by poor HO management had relatively little impact on overall performance in legacy networks, which were not multi-RAT and multi-band and therefore incurred a much smaller number of HOs than emerging networks. The second key contribution of this dissertation is the development of a first-of-its-kind system-level simulator, called SyntheticNET, that can help the research community overcome the hurdle of realistic mobility and HO process modeling. SyntheticNET is the very first Python-based simulator that fully conforms to the 3GPP Release 15 5G standard. Compared to existing simulators, SyntheticNET includes a modular structure, flexible propagation modeling, adaptive numerology, realistic mobility patterns, and detailed HO evaluation criteria. SyntheticNET's Python-based platform allows the effective application of Artificial Intelligence (AI) to various network functionalities.
Another key challenge in emerging multi-RAT technologies is the lack of an intelligent solution to manage the dual connectivity with both a 4G and a 5G cell that a user needs in order to access 5G infrastructure. The third contribution of this thesis is a solution to address this challenge. I present a QoE-aware E-UTRAN New Radio-Dual Connectivity (EN-DC) activation scheme in which AI is leveraged to develop a model that can accurately predict radio link failure (RLF) and voice muting using low-level measurements collected from a real network. The insights from the AI-based RLF and mute prediction models are then leveraged to configure sets of 3GPP parameters to maximize EN-DC activation while keeping the QoE-affecting RLF and mute anomalies to a minimum.
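A hedged, much-simplified sketch of the prediction stage described above (the dissertation's actual models and network measurements are not reproduced): a logistic classifier is trained on hypothetical low-level radio measurements to flag likely radio link failure. The feature names, the synthetic labeling rule, and the learning rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical normalized per-sample measurements: [RSRP, RSRQ, SINR].
X = rng.normal(size=(300, 3))
# Assumed ground truth: RLF is more likely when all three indicators are poor.
y = (X.sum(axis=1) < -1.0).astype(float)

w = np.zeros(3); b = 0.0
for _ in range(2000):                        # batch gradient descent
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted P(RLF)
    g = p - y                                # gradient of the log-loss
    w -= 0.1 * X.T @ g / len(X)
    b -= 0.1 * g.mean()

p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
accuracy = float(((p > 0.5) == (y == 1)).mean())
print(accuracy)   # the separable synthetic task is learned easily
```

In the scheme described in the thesis, such a predicted failure probability would then feed the EN-DC activation decision rather than stand alone.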
The last contribution of this dissertation is a novel solution to the mmWave cell discovery problem, which stems from the highly directional nature of mmWave transmission. The proposed mmWave cell discovery scheme builds upon a joint search method in which mmWave cells exploit an overlay coverage layer of macro cells that share the UE location with the mmWave cell. The proposed scheme is made more practical by investigating and developing solutions for the data sparsity issue in model training. The ability to work with sparse data makes the proposed scheme feasible in realistic scenarios, where user density is often not high enough to provide coverage reports from each bin of the coverage area. Simulation results show that the proposed scheme efficiently activates EN-DC to a nearby mmWave 5G cell and thus substantially reduces mmWave cell discovery failures compared to state-of-the-art cell discovery methods.
Systems engineering approaches to safety in transport systems
During driving, driver behavior monitoring may provide useful information to prevent road traffic accidents caused by driver distraction. It has been shown that 90% of road traffic accidents are due to human error, and in 75% of these cases human error is the only cause. Car manufacturers have been interested in driver monitoring research for several years, aiming to enhance the general knowledge of driver behavior and to evaluate the driver's functional state, as it may drastically influence driving safety through distraction, fatigue, mental workload and attention. Fatigue and sleepiness at the wheel are well-known risk factors for traffic accidents.
The Human Factor (HF) plays a fundamental role in modern transport systems. Drivers and transport operators control a vehicle towards its destination according to their own senses, physical condition, experience and ability, and safety strongly relies on the HF making the right decisions. On the other hand, we are experiencing a gradual shift towards increasingly autonomous vehicles where the HF still constitutes an important component, but may in fact become the "weakest link of the chain", requiring strong and effective training feedback.
The studies that investigate the possibility of using biometric or biophysical signals as data sources to evaluate the interaction between human brain activity and an electronic machine belong to the Human Machine Interface (HMI) framework. The HMI can acquire human signals to analyse the specific embedded structures and recognize the behavior of the subject during his/her interaction with the machine or with virtual interfaces such as PCs or other communication systems. Based on my previous experience in the planning and monitoring of hazardous material transport, this work aims to create control models focused on driver behavior and changes in his/her physiological parameters. Three case studies have been considered, using the interaction between an EEG system and an external device, such as a driving simulator or electronic components. One case study concerns the detection of the driver's behavior during a driving test. Another concerns the detection of the driver's arm movements from EEG data during a driving test. The third is the setting up of a Brain Computer Interface (BCI) model able to detect head movements in human participants from the EEG signal and to control an electronic component according to the electrical brain activity due to head-turning movements. Some videos showing the experimental results are available at https://www.youtube.com/channel/UCj55jjBwMTptBd2wcQMT2tg.
XXXIV CICLO - INFORMATICA E INGEGNERIA DEI SISTEMI / COMPUTER SCIENCE AND SYSTEMS ENGINEERING - Ingegneria dei sistemi. Zero, Enric
Object tracking in video with TensorFlow
This thesis [13] was born as a collaboration between the BSC Computer Science Department [5] and the UPC Image Processing Group [23], with the purpose of developing a hybrid thesis on Deep Learning.
Nowadays, interest in Machine Learning is among the fastest growing. On the side of the BSC Computer Science Department [5], which mainly uses its computational power for data mining and modelling analysis, the main purpose was to assess how difficult it would be to adapt its infrastructure "asterix", from the GPU Center of Excellence at BSC/UPC [4], to Deep Learning. On the side of the UPC IPG, the interest was in testing the environment by developing a model for Object Tracking in Video suitable for the ILSVRC VID challenge [43]. To achieve the first goal and analyze the workload on the machine, I became an active user of the TensorFlow [21] community, learning from posts and blogs, and I decided to use virtual environments, which let us switch between different dependencies and versions of the library depending on the model and the goal at hand. From the computer science point of view, this setup proved to be the best choice and the most useful to work with, being easy to use and to install. I only had problems with third-party libraries specific to the Visual Recognition area, such as OpenCV.
To develop the model for the VID challenge, I first learned the basics of Deep Learning through online courses such as Stanford's. I then moved on to the deeper and more complex knowledge of the Visual Recognition topic, reading papers and understanding the main strategies and models that would be useful to me later, during development. Discovering many new concepts gave me enthusiasm and scared me at the same time: theory and practice were complementary, but it was not easy to move from the first to the second.
The latter was the most difficult part of the project, because it was not enough to adapt my previous knowledge and programming skills to the new ones, and mainly to the TensorFlow environment. Because of its recent birth, the Python library does not yet offer as many models or components as other environments such as Caffe [3] or Theano [50], but the community's interest is growing so fast that, luckily, I did not have to start from scratch. I used some models available directly from Google [1] and some GitHub projects, such as TensorBox [44] from a Stanford PhD student [45], and tested others, such as the TensorFlow version of YOLO [16]. The main components came from some of the GitHub projects I found, but none of them was problem-free.
Due to time constraints, I started by trying to extend OverFeat (the TensorBox project) [44] from single-class to multi-class detection, spending effort and much time trying to increase the quality of the model, to which I also made some important contributions by solving some of the main pitfalls in the code. In the end, the big reverse-engineering effort, carried out with the help of the author, did not give the expected results. So I had to change the main architecture, using other strategies and introducing other, redundant components to obtain what is, in principle, a still-image detection model. I had to introduce a time-and-space analysis to correlate results between frames and be more consistent in the detection and tracking of the objects themselves. Starting from the modular architecture proposed by K. Kang et al. [32], I decided to use the single-class OverFeat implementation as a General Object Detector, training it on the whole class set, and to follow it with other components. After the General Detector, I implemented a Tracker & Smoother to be more consistent in shape and motion over time and space across the whole frame set, using the Slow and Steady feature analysis described by Dinesh Jayaraman and Kristen Grauman [31].
The Inception component, the final one, is the most redundant module, because the same kind of network is at the base of the OverFeat architecture; but using it was the fastest, and indeed only, solution to label each object easily. Thanks to the available model [20] implemented by Google [1], trained on the ILSVRC Classification task, I only had to retrain it on a much smaller class set, thirty classes instead of one hundred, a workload sustainable by any personal computer on the market. The complete architecture was composed of three main components, in the following order of connection: General Detector, Tracker & Smoother, Inception.
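A hedged sketch of how such a three-stage pipeline can be wired together (the thesis's actual TensorFlow components are not reproduced): per-frame detections are linked into tracks by intersection-over-union, the core operation a Tracker stage needs before any smoothing or labeling. The box coordinates and the 0.5 IoU threshold are illustrative assumptions.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def track(frames, thr=0.5):
    """Link per-frame detections into tracks: a detection joins the track
    whose last box overlaps it most (IoU >= thr), else it starts a new track."""
    tracks = []                        # each track is a list of boxes
    for dets in frames:
        for box in dets:
            best = max(tracks, key=lambda t: iou(t[-1], box), default=None)
            if best is not None and iou(best[-1], box) >= thr:
                best.append(box)
            else:
                tracks.append([box])
    return tracks

# Two frames; the slightly shifted box should join the same track.
frames = [[(0, 0, 10, 10)], [(1, 1, 11, 11)]]
tracks = track(frames)
print(len(tracks))   # 1
```

In the full pipeline the detector would produce `frames`, a smoother would regularize each track's boxes over time, and the classifier would then label one crop per track.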
Finally, I arrived at a working environment and model, which let me submit results for the challenge, evaluate the workload for the "asterix" infrastructure from the GPU Center of Excellence at BSC/UPC, and prove that a department can adapt and develop a working Deep Learning research and development area in a few months.
The methodology I used could be described as fast learning and fast developing. Since I had to start everything from scratch, acquiring first the necessary basic theory, then the most complex and specific knowledge, and finally implementing it all in the shortest possible time was not easy at all. These reasons, together with the time constraints, pushed me to learn and develop as fast as possible, using tricks and tips and available components, to save time for error solving. The latter is a substantial part of my work: solving runtime and project problems took me sometimes hours, sometimes entire days, and once caused a system crash of the "asterix" infrastructure, with ten days of blackout during August, due to the summer period.
The model reached the last position in the ILSVRC VID competition because of its low mAP results. As I will explain, the source of these results is the first component, which is not followed by any module able to boost its accuracy; at the same time, I will highlight how a different ordering and a better implementation of the components, for example a trainable Tracker plus Smoother, could be the first improvements to this draft work. Moreover, further precautions could be taken on the dataset used to train the single components, boosting their accuracy; I trained only on the provided training database, without applying any tricks to it. The thesis goals were reached as fully as I could manage, solving and walking through a path full of pitfalls and problems that made the project harder to complete.
Improving next-generation wireless network performance and reliability with deep learning
The rudimentary question of whether machine learning in general, or deep learning in particular, could add to the well-established field of wireless communications, which has been evolving for close to a century, is often raised. While the use of deep learning based methods is likely to help build intelligent wireless solutions, their use becomes particularly challenging in the lower layers of the wireless communication stack. The introduction of the fifth generation of wireless communications (5G) has triggered the demand for “network intelligence” to support its promises of very high data rates and extremely low latency. Consequently, 5G wireless operators face the challenges of network complexity, diversification of services, and personalized user experience. Industry standards have created enablers (such as the network data analytics function), but these enablers focus on post-mortem analysis at higher stack layers and operate with a periodicity on the time scale of seconds (or larger). The goal of this dissertation is to show how a data-driven approach using deep learning can address these challenges and add to the field of wireless communications. In particular, I propose intelligent predictive and prescriptive abilities to boost reliability and eliminate performance bottlenecks in 5G cellular networks and beyond, present contributions that justify the value of deep learning in wireless communications across several different layers, and offer in-depth analysis and comparisons with baselines and industry standards. First, to improve multi-antenna network reliability against wireless impairments with power control and interference coordination for both packetized voice and beamformed data bearers, I propose a joint beamforming, power control, and interference coordination algorithm based on deep reinforcement learning.
This algorithm uses a string of bits and logic operations to enable simultaneous actions to be performed by the reinforcement learning agent. Consequently, a joint reward function is also proposed. I compare the performance of my proposed algorithm with the brute-force approach and show that similar performance is achievable but with a faster run time as the number of transmit antennas increases. Second, to enhance the performance of coordinated multipoint, I propose the use of deep learning binary classification to learn a surrogate function that triggers a second transmission stream, instead of depending on the popular signal to interference plus noise ratio measurement. This surrogate function improves the users' sum rate by focusing on the pre-logarithmic terms in the sum-rate formula, which have a larger impact on this rate. Third, the performance of band switching can be improved without the need for full channel estimation. My proposal of using deep learning to classify the quality of two frequency bands prior to granting the band switch leads to a significant improvement in users' throughput. This is due to the elimination of the industry-standard measurement gap requirement, a period of silence during which no data is sent to the users so they can measure the frequency bands before switching. In this dissertation, a group of algorithms for downlink wireless network performance and reliability is proposed. My results show that the introduction of user coordinates enhances the accuracy of the predictions made with deep learning. Also, the choice of signal to interference plus noise ratio as the optimization objective may not always be the best choice for improving user throughput rates. Further, exploiting the spatial correlation of channels in different frequency bands can improve certain network procedures without the need for perfect knowledge of the per-band channel state information.
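The bit-string joint action idea can be sketched as follows. This is a hedged illustration: the sub-action fields and their widths are my assumptions, not the dissertation's exact design:

```python
# Illustrative sketch of a bit-string joint action: separate sub-actions
# (e.g. a power-control step, a beamforming index, an interference-
# coordination command) are packed into one integer with shifts and
# masks, so a single RL action selects all of them simultaneously.
# Field widths and meanings are assumptions for illustration.

POWER_BITS, BEAM_BITS, INTERF_BITS = 2, 3, 1

def encode(power, beam, interf):
    """Pack sub-actions into one joint action via logic operations."""
    return (power << (BEAM_BITS + INTERF_BITS)) | (beam << INTERF_BITS) | interf

def decode(action):
    """Unpack a joint action back into its sub-actions."""
    interf = action & ((1 << INTERF_BITS) - 1)
    beam = (action >> INTERF_BITS) & ((1 << BEAM_BITS) - 1)
    power = action >> (BEAM_BITS + INTERF_BITS)
    return power, beam, interf
```

With these widths the joint action space has 2**(2+3+1) = 64 discrete actions, so one agent output drives power, beamforming, and interference coordination at once.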
Hence, an understanding of these results helps develop novel solutions for enhancing these wireless networks at a much smaller time scale compared to today's industry standards.
Electrical and Computer Engineering