
    A novel non-intrusive objective method to predict voice quality of service in LTE networks.

    This research aimed to introduce a novel approach for non-intrusive objective measurement of voice Quality of Service (QoS) in LTE networks. In achieving this aim, the thesis established a thorough knowledge of how voice traffic is handled in LTE networks, of the LTE network architecture and its similarities to and differences from its predecessors and traditional ground IP networks, and, most importantly, of the QoS-affecting parameters that are exclusive to LTE environments. Mean Opinion Score (MOS) is the scoring system used to measure the QoS of voice traffic; as originally intended, it is measured subjectively. Subjective QoS measurement methods are costly and time-consuming, so objective methods such as Perceptual Evaluation of Speech Quality (PESQ) were developed to address these limitations. These objective methods correlate highly with subjective MOS scores. However, they either require individual calculation of many network parameters or are intrusive, requiring access to both the reference signal and the degraded signal for comparison by software. The current objective methods are therefore not suitable for real-time measurement and prediction scenarios. A major contribution of the research was identifying LTE-specific QoS-affecting parameters; no previous work combines these parameters to assess their impact on QoS. The experiment was configured in a hardware-in-the-loop environment. This configuration could serve as a platform for future research requiring simulation of voice traffic in LTE environments. The key contribution of this research is a novel non-intrusive objective method for QoS measurement and prediction using neural networks. A comparative analysis is presented that examines the performance of four neural network algorithms for non-intrusive measurement and prediction of voice quality over LTE networks.
In conclusion, the Bayesian Regularization algorithm with 4 neurons in the hidden layer and a sigmoid symmetric transfer function was identified as the best solution, with a Mean Square Error (MSE) of 0.001 and a regression value of 0.998 measured on the testing data set.
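The shape of such a predictor can be sketched as a tiny feed-forward pass: a hidden layer of 4 tanh (sigmoid symmetric) neurons mapping network-layer measurements to a MOS estimate. The feature set and all weights below are illustrative placeholders, not the trained values from the thesis.

```python
import math

def tansig(x):
    """Sigmoid symmetric transfer function (tanh)."""
    return math.tanh(x)

def predict_mos(features, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: inputs -> 4 tanh hidden neurons -> linear output (MOS)."""
    hidden = [tansig(sum(w * f for w, f in zip(ws, features)) + b)
              for ws, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

# Made-up inputs: e.g. [packet loss, jitter, delay] scaled to [0, 1].
features = [0.02, 0.1, 0.15]
w_hidden = [[0.5, -0.3, 0.2]] * 4      # placeholder weights, 4 hidden neurons
b_hidden = [0.1, -0.1, 0.0, 0.2]
w_out = [0.8, 0.6, 0.7, 0.5]
b_out = 2.5                             # biases the output toward mid-scale MOS
mos = predict_mos(features, w_hidden, b_hidden, w_out, b_out)
```

In the actual work the weights would come from Bayesian Regularization training against subjective or PESQ-derived MOS targets; the forward pass above only shows why the method is non-intrusive, since it needs no reference signal at prediction time.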

    Cellular and Wi-Fi technologies evolution: from complementarity to competition

    This PhD thesis has the characteristic of spanning a long period because, while working on it, I was employed as a research engineer at CTTC with highly demanding development duties. This delayed the deposit more than I would have liked. On the other hand, it has given me the privilege of witnessing and studying how wireless technologies have evolved over a decade, from 4G to 5G and beyond. When I started my PhD thesis, IEEE and 3GPP were defining the two main wireless technologies at the time, Wi-Fi and LTE, covering two substantially complementary market targets. Wi-Fi was designed to operate mostly indoors, in unlicensed spectrum, and was aimed at being a simple and cheap technology. Its coexistence approach was based on the assumption that the spectrum in which it operated was free, and so it was designed around interference avoidance through the famous CSMA/CA protocol. On the other hand, 3GPP was designing technologies for licensed spectrum, a costly kind of spectrum. As a result, LTE was designed to take the best advantage of it while providing the best QoE in mainly outdoor scenarios. The PhD thesis starts in this context and evolves with these two technologies. In the first chapters, the thesis studies radio resource management solutions for standalone operation of Wi-Fi in unlicensed and LTE in licensed spectrum. We anticipated the now-fundamental machine learning trend by working on machine-learning-based radio resource management solutions to improve LTE and Wi-Fi operation in their respective spectrum. We pay particular attention to small cell deployments aimed at improving spectrum efficiency in licensed spectrum, reproducing the short-range scenarios typical of Wi-Fi settings.
IEEE and 3GPP continued evolving the technologies over the years: Wi-Fi has grown into a much more complex and sophisticated technology, incorporating key features of cellular technologies such as HARQ, OFDMA, MU-MIMO, MAC scheduling and spatial reuse. On the other hand, since Release 13, cellular networks have also been designed for unlicensed spectrum. As a result, the last two chapters of this thesis focus on coexistence scenarios, in which LTE needs to be designed to coexist fairly with Wi-Fi, and NR, the radio access for 5G, with Wi-Fi in 5 GHz and WiGig in 60 GHz. Unlike LTE, which was adapted to operate in unlicensed spectrum, NR-U is natively designed with this feature, including the capability to operate in unlicensed spectrum in a completely standalone fashion, a fundamental new milestone for cellular. In this context, our focus of analysis changes. We consider that these two technological families no longer target complementarity but now compete, and we claim that this will be the trend for the years to come. To enable research in these multi-RAT scenarios, another fundamental result of this PhD thesis, besides the scientific contributions, is the release of high-fidelity models for LTE and NR and their coexistence with Wi-Fi and WiGig to the ns-3 open-source community. ns-3 is a popular open-source network simulator that is multi-RAT by design and so naturally allows the evaluation of coexistence scenarios between different technologies. These models, whose development I led, are, by academic citations, the most used open-source simulation models for LTE and NR, and have received funding from industry (Ubiquisys, WFA, SpiderCloud, Interdigital, Facebook) and federal agencies (NIST, LLNL) over the years.


    Novel Neural Network Applications to Mode Choice in Transportation: Estimating Value of Travel Time and Modelling Psycho-Attitudinal Factors

    Whenever researchers wish to study the behaviour of individuals choosing among a set of alternatives, they usually rely on models based on random utility theory, which postulates that individuals modify their behaviour so as to maximise their utility. These models, often identified as discrete choice models (DCMs), usually require defining a utility for each alternative by first identifying the variables influencing the decisions. Traditionally, DCMs focused on observable variables and treated users as optimizing agents with predetermined needs. However, such an approach contrasts with results from studies in the social sciences showing that choice behaviour can be influenced by psychological factors such as attitudes and preferences. Recently there have been formulations of DCMs which include latent constructs for capturing the impact of subjective factors. These are called hybrid choice models or integrated choice and latent variable (ICLV) models. However, DCMs are not exempt from issues, such as the fact that researchers have to choose which variables to include, and their relations, to define the utilities. This is probably one of the reasons that has recently led to an influx of studies using machine learning (ML) methods to study mode choice, in which researchers have tried to find alternative ways to analyse travellers' choice behaviour. An ML algorithm is any generic method that uses the data itself to build a model, improving its performance the more it is allowed to learn. This means it does not require any a priori input or hypotheses on the structure and nature of the relationships between the several variables used as inputs.
ML models are usually considered black-box methods, but whenever researchers felt the need for interpretability of ML results, they sought alternative ways to use ML methods, such as building them with some a priori knowledge to induce specific constraints. Some researchers also transformed the outputs of ML algorithms so that they could be interpreted from an economic point of view, or built hybrid ML-DCM models. The objective of this thesis is to investigate the benefits and disadvantages of adopting either DCMs or ML methods to study the phenomenon of mode choice in transportation. The strongest feature of DCMs is that they produce very precise and descriptive results, allowing a thorough interpretation of their outputs. On the other hand, ML models offer a substantial benefit by being truly data-driven methods and thus learning most relations from the data itself. As a first contribution, we tested an alternative method for calculating the value of travel time (VTT) from the results of ML algorithms. VTT is a very informative parameter, since the time individuals spend travelling normally represents an undesirable factor, so they are usually willing to exchange money to reduce travel times. The proposed method is independent of the mode-choice functions, so it can be applied equally to econometric models and ML methods, provided they allow the estimation of individual-level probabilities. Another contribution of this thesis is a neural network (NN) for the estimation of choice models with latent variables as an alternative to DCMs. This issue arose from wanting to include in ML models not only level-of-service variables of the alternatives and socio-economic attributes of the individuals, but also psycho-attitudinal indicators, to better describe the influence of psychological factors on choice behaviour. The results were estimated using two different datasets.
Since NN results depend on the values of their hyper-parameters and on their initialization, several NNs were estimated using different hyper-parameters to find the optimal values, which were then used to verify the stability of the results under different initializations.
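One way a model-agnostic VTT can be computed, in the spirit of the method described above, is as the ratio of the choice probability's sensitivity to travel time and to cost, obtained by finite differences on individual-level probabilities. The binary logit below is only a stand-in for whatever model exposes P(choice | time, cost); its coefficients are illustrative, not estimated values from the thesis.

```python
import math

def choice_prob(time_min, cost_eur, beta_time=-0.05, beta_cost=-0.4):
    """Illustrative binary logit probability of choosing the mode."""
    v = beta_time * time_min + beta_cost * cost_eur
    return 1.0 / (1.0 + math.exp(-v))

def vtt(time_min, cost_eur, h=1e-4):
    """VTT = (dP/dtime) / (dP/dcost), here in euros per minute,
    via central finite differences on the choice probability."""
    dp_dt = (choice_prob(time_min + h, cost_eur)
             - choice_prob(time_min - h, cost_eur)) / (2 * h)
    dp_dc = (choice_prob(time_min, cost_eur + h)
             - choice_prob(time_min, cost_eur - h)) / (2 * h)
    return dp_dt / dp_dc

v = vtt(30.0, 5.0)  # for a logit this ratio collapses to beta_time / beta_cost
```

For a logit both derivatives share the factor P(1-P), so the ratio reduces to the familiar coefficient ratio; the point of the probability-based formulation is that it also applies to NNs or other ML models that provide individual-level probabilities but no utility coefficients.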

    Next-Generation Self-Organizing Networks through a Machine Learning Approach

    Date of the doctoral thesis defense: 17 December 2018. To reduce the management costs of cellular networks, which have grown in complexity over time, the concept of self-organizing networks (SON) emerged: the automation of the management tasks of a cellular network to lower infrastructure (CAPEX) and operating (OPEX) costs. SON tasks fall into three categories: self-configuration, self-optimization and self-healing. The objective of this thesis is to improve SON functions through the development and use of machine learning (ML) tools for network management. On the one hand, self-healing is addressed through the proposal of a novel tool for automatic diagnosis (root cause analysis, RCA), consisting of the combination of multiple independent RCA systems to build an improved composite RCA system. In turn, to increase the accuracy of RCA tools while reducing both CAPEX and OPEX, this thesis proposes and evaluates dimensionality-reduction ML tools in combination with RCA tools. On the other hand, this thesis studies multi-link functionalities within self-optimization and proposes techniques for their automatic management. In the field of enhanced mobile broadband communications, a tool for the management of radio carriers is proposed that allows the implementation of operator policies, while, in the field of low-latency vehicular communications, a multipath mechanism is proposed for redirecting traffic across multiple radio interfaces. Many of the methods proposed in this thesis have been evaluated using data from real cellular networks, which has demonstrated their validity in realistic environments, as well as their readiness to be deployed in current and future mobile networks.
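One simple way several independent RCA systems can be fused into a composite diagnosis, as a rough illustration of the idea above, is majority voting over their verdicts. This is a generic sketch, not the thesis's actual combination algorithm, and the fault labels are invented.

```python
from collections import Counter

def composite_rca(diagnoses):
    """Fuse independent RCA verdicts for the same faulty cell by
    majority vote; returns the most-voted root cause."""
    votes = Counter(diagnoses)
    cause, _ = votes.most_common(1)[0]
    return cause

# Each element is one RCA system's diagnosis for the same degraded cell.
verdict = composite_rca(["coverage_hole", "interference", "coverage_hole"])
```

A real composite system would typically weight each RCA source by its historical accuracy or output soft probabilities rather than hard votes, but the voting skeleton shows where the individual diagnoses enter.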

    Mobility management in multi-RAT multi-band heterogeneous networks

    Support for user mobility is the raison d'etre of mobile cellular networks. However, mounting pressure for more capacity is leading to the adoption of multi-band multi-RAT ultra-dense network designs, particularly with the increased use of mmWave-based small cells. While such designs for emerging cellular networks are expected to offer manyfold more capacity, they give rise to a new set of challenges in user mobility management. Among others, the most critical challenges are frequent handovers (HOs), and thus a higher impact of poor mobility management on quality of user experience (QoE) and link capacity; the lack of an intelligent solution to manage activation and deactivation of dual connectivity (of a user with both 4G and 5G cells); and mmWave cell discovery. In this dissertation, I propose and evaluate a set of solutions to address these challenges. The first outcome of our investigations is the first-ever taxonomy of mobility-related 3GPP-defined network parameters and Key Performance Indicators (KPIs), followed by a tutorial on 3GPP-based 5G mobility management procedures. The first major contribution of the thesis is a novel framework to characterize the relationship between 28 critical mobility-related network parameters and the 8 most vital KPIs. A critical hurdle in addressing mobility-related challenges in emerging networks is the complexity of modeling realistic mobility and the HO process. Mathematical models are not suitable here, as they cannot capture the dynamics or the myriad parameters and KPIs involved. Existing simulators also mostly either omit or overly abstract the HO process and user mobility, chiefly because the problems caused by poor HO management had relatively little impact on overall performance in legacy networks: not being multi-RAT and multi-band, they incurred a much smaller number of HOs than emerging networks.
The second key contribution of this dissertation is the development of a first-of-its-kind system-level simulator, called SyntheticNET, that can help the research community overcome the hurdle of realistic mobility and HO process modeling. SyntheticNET is the very first Python-based simulator that fully conforms to the 3GPP Release 15 5G standard. Compared to existing simulators, SyntheticNET includes a modular structure, flexible propagation modeling, adaptive numerology, realistic mobility patterns, and detailed HO evaluation criteria. SyntheticNET's Python-based platform allows the effective application of Artificial Intelligence (AI) to various network functionalities. Another key challenge in emerging multi-RAT technologies is the lack of an intelligent solution to manage the dual connectivity with both 4G and 5G cells that a user needs to access 5G infrastructure. The third contribution of this thesis is a solution to this challenge. I present a QoE-aware E-UTRAN New Radio-Dual Connectivity (EN-DC) activation scheme where AI is leveraged to develop a model that can accurately predict radio link failure (RLF) and voice muting using low-level measurements collected from a real network. The insights from the AI-based RLF and mute prediction models are then leveraged to configure sets of 3GPP parameters to maximize EN-DC activation while keeping the QoE-affecting RLF and mute anomalies to a minimum. The last contribution of this dissertation is a novel solution to the mmWave cell discovery problem, which stems from the highly directional nature of mmWave transmission. The proposed mmWave cell discovery scheme builds upon a joint search method in which mmWave cells exploit an overlay coverage layer from macro cells that share the UE location with the mmWave cell. The proposed scheme is made more practical by investigating and developing solutions for the data sparsity issue in model training.
The ability to work with sparse data makes the proposed scheme feasible in realistic scenarios, where user density is often not high enough to provide coverage reports from each bin of the coverage area. Simulation results show that the proposed scheme efficiently activates EN-DC to a nearby 5G mmWave cell and thus substantially reduces mmWave cell discovery failures compared to state-of-the-art cell discovery methods.
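The QoE-aware activation logic described above can be caricatured as a gating rule: activate EN-DC for a UE only when the model-predicted risks of RLF and voice muting stay below operator-set thresholds. The threshold values and the per-UE risk scores below are assumptions for illustration; the thesis derives them from models trained on real network measurements.

```python
def endc_activation(rlf_risk, mute_risk, rlf_max=0.05, mute_max=0.02):
    """Gate EN-DC activation on predicted QoE risks.

    rlf_risk, mute_risk: model-predicted probabilities for this UE
    rlf_max, mute_max: operator thresholds (hypothetical values)
    Returns True if dual connectivity should be activated.
    """
    return rlf_risk < rlf_max and mute_risk < mute_max

# A UE with low predicted RLF and mute risk gets EN-DC activated.
activate = endc_activation(rlf_risk=0.01, mute_risk=0.005)
```

In the actual scheme the thresholds are realized indirectly, by tuning 3GPP measurement and reporting parameters so that EN-DC activation is maximized while the predicted anomalies stay rare; the rule above only makes the trade-off explicit.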

    Systems engineering approaches to safety in transport systems

    During driving, driver behavior monitoring may provide useful information to prevent road traffic accidents caused by driver distraction. It has been shown that 90% of road traffic accidents are due to human error, and in 75% of these cases human error is the only cause. Car manufacturers have been interested in driver monitoring research for several years, aiming to enhance general knowledge of driver behavior and to evaluate the driver's functional state, as it may drastically influence driving safety through distraction, fatigue, mental workload and attention. Fatigue and sleepiness at the wheel are well-known risk factors for traffic accidents. The Human Factor (HF) plays a fundamental role in modern transport systems. Drivers and transport operators steer a vehicle towards its destination according to their own senses, physical condition, experience and ability, and safety strongly relies on the HF taking the right decisions. On the other hand, we are experiencing a gradual shift towards increasingly autonomous vehicles, where the HF still constitutes an important component but may in fact become the "weakest link of the chain", requiring strong and effective training feedback. Studies investigating the possibility of using biometrical or biophysical signals as data sources to evaluate the interaction between human brain activity and an electronic machine belong to the Human Machine Interface (HMI) framework. An HMI can acquire human signals to analyse the specific embedded structures and recognize the behavior of the subject during his/her interaction with the machine or with virtual interfaces such as PCs or other communication systems. Based on my previous experience in planning and monitoring hazardous material transport, this work aims to create control models focused on driver behavior and changes in his/her physiological parameters.
Three case studies have been considered, using the interaction between an EEG system and an external device, such as a driving simulator or electronic components. One case study concerns the detection of the driver's behavior during a driving test. Another concerns the detection of the driver's arm movements from EEG data during a driving test. The third case is the setting up of a Brain Computer Interface (BCI) model able to detect head movements in human participants from the EEG signal and to control an electronic component according to the electrical brain activity due to head-turning movements. Some videos showing the experimental results are available at https://www.youtube.com/channel/UCj55jjBwMTptBd2wcQMT2tg.
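The third case study's control loop can be sketched as a minimal BCI-style rule: compare lateralised EEG activity and emit a command for the external component. This is an invented illustration of the head-movement-to-command mapping, not the author's model; channel grouping and the threshold are assumptions.

```python
def head_turn_command(left_power, right_power, ratio=1.5):
    """Map lateralised EEG band power to a control command.

    left_power, right_power: band power over left/right scalp channels
    ratio: hypothetical asymmetry threshold for declaring a head turn
    """
    if left_power > ratio * right_power:
        return "TURN_LEFT"
    if right_power > ratio * left_power:
        return "TURN_RIGHT"
    return "NO_COMMAND"

cmd = head_turn_command(3.2, 1.0)  # strong left asymmetry
```

A real BCI pipeline would precede this rule with filtering, artifact rejection and a trained classifier, but the final stage, turning a detected head movement into an actuator command, has this shape.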

    Object tracking in video with TensorFlow

    This thesis [13] was born as a collaboration between the BSC Computer Science Department [5] and the UPC Image Processing Group [23], with the purpose of developing a hybrid thesis on Deep Learning. Nowadays, interest in Machine Learning is among the fastest growing. On the side of the BSC Computer Science Department [5], which mainly uses its computational power for data mining and modelling analysis, the main purpose was to assess the difficulty of adapting its infrastructure "asterix", from the GPU Center of Excellence at BSC/UPC [4], to Deep Learning. On the side of the UPC IPG, the interest was in testing the environment by developing a model for Object Tracking in Video suitable for the ILSVRC VID challenge [43]. To achieve the first goal and analyze the workload on the machine, I became an active user of the TensorFlow [21] community, learning from posts and blogs, and I decided to set up a Virtual Environment that let us use different dependencies and different versions of the library software, depending on the model and the goal to reach. So far, from the computer science point of view, this environment has been the best choice and the most useful experience to work with, showing its ease of use and implementation. I had problems only with third-party libraries specific to the Visual Recognition area, like OpenCV. To develop the model for the VID challenge, I began by learning the basics of Deep Learning through online courses such as Stanford's. I then moved on to the deeper and more complex knowledge of the Visual Recognition topic, reading papers and understanding the main strategies and models that would be useful to me later, during development. The discovery of many new concepts gave me enthusiasm and scared me at the same time; theory and practice were complementary, but it was not easy to pass from the first to the second.
The latter was the most difficult part of the project, because it was not enough to adapt my previous knowledge and programming skills to the new ones, and mainly to the TensorFlow environment. Due to its recent birth, the Python library has not yet accumulated as many models or components as other environments like Caffe [3] or Theano [50], but the community's interest is growing so fast that, luckily, I did not have to start from scratch. I used some models available directly from Google [1] and some GitHub projects, like TensorBox [44] from a Stanford PhD student [45], and tested others, like the YOLO TF version [16]. The main components came from some of the GitHub projects I found, but none of them left me without problems. Given the time constraints, I started trying to extend OverFeat (the TensorBox project) [44] from single to multi class, spending much effort and time trying to make a difference and increase the quality of the model, to which I also made some important contributions, solving some of the main code pitfalls. In the end, the big reverse-engineering work I carried out with the help of the author did not give the expected results. So I had to change the main architecture's composition, using other strategies and introducing other redundant components to achieve a theoretically still-image detection model. I had to introduce a time and space analysis to correlate results between frames and be more consistent in the detection and tracking of the objects themselves. Starting from the modular architecture proposed by K. Kang et al. [32], I decided to use the single-class OverFeat implementation as a General Object Detector, training it on the whole class set, and to follow it with other components. After the General Detector, I implemented a Tracker & Smoother to be more consistent in shape and motion over time and space across the whole frame set, using the slow-and-steady feature analysis explained by Dinesh Jayaraman and Kristen Grauman [31].
The Inception component, the final one, is the most redundant module, because it is at the base of the OverFeat architecture; but using it was the fastest and only practical way to label each object easily. Thanks to the available model [20] implemented by Google [1], which was trained on the ILSVRC Classification task, I only had to retrain it on a much smaller class set, thirty classes instead of one hundred, a workload sustainable by any personal computer on the market. The complete architecture was composed of three main components in the following order of connection: General Detector, Tracker & Smoother, Inception. Finally, I reached a working environment and model, which let me submit results for the challenge, evaluate the workload for the "asterix" infrastructure from the GPU Center of Excellence at BSC/UPC, and prove how a department can build a working Deep Learning research and development area in a few months. The methodology I used could be called Fast Learning and Fast Developing. Since I had to start everything from scratch (first the necessary basic theory, then the more complex and specific knowledge) and finally implement it in the shortest possible time, it was not easy at all. These reasons, together with the time constraints, pushed me to learn and develop as fast as possible, using tricks and tips and available components, saving time for error solving. The latter was a substantial part of my work: solving runtime and project problems sometimes took hours, sometimes entire days, and once caused a system crash of the "asterix" infrastructure, with ten days of blackout during August due to the summer period. The model reached the last position in the VID ILSVRC competition because of its low mAP results.
As I will explain, the source of these results is the first component, which is not followed by modules able to boost its accuracy; at the same time, I will highlight how a different ordering of the components and a better implementation of them, for example a trainable Tracker plus Smoother, could be the first improvements to this draft work. Moreover, other precautions can be taken on the dataset used to train the single components, boosting their accuracy; I trained only on the provided training database, without applying tricks and tips to it. The thesis goals were reached as fully as I could manage, solving and walking through a path full of pitfalls and problems that made my project harder to finish.
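The three-stage architecture described above (General Detector, then Tracker & Smoother, then Inception) can be sketched as a simple pipeline with each stage stubbed out. The stage interfaces and the stub outputs are assumptions for illustration, not the actual TensorFlow implementation.

```python
def general_detector(frame):
    """Class-agnostic detector: returns candidate boxes (x, y, w, h).
    Stub standing in for the single-class OverFeat/TensorBox model."""
    return [(10, 10, 50, 50)]

def track_and_smooth(per_frame_boxes):
    """Links boxes across frames and smooths trajectories over time
    and space (stub: identity; a real one would enforce slow-and-steady
    consistency between consecutive frames)."""
    return per_frame_boxes

def inception_classifier(box):
    """Assigns a class label to a tracked box (stub standing in for the
    retrained thirty-class Inception model)."""
    return "car"

def run_pipeline(frames):
    """Detector -> Tracker & Smoother -> Inception, per the thesis order."""
    per_frame = [general_detector(f) for f in frames]
    tracks = track_and_smooth(per_frame)
    return [[(box, inception_classifier(box)) for box in boxes]
            for boxes in tracks]

results = run_pipeline(frames=[None, None])  # two dummy frames
```

The ordering matters: because classification happens last, the class-agnostic detector's recall bounds the whole pipeline's accuracy, which is exactly the limitation the abstract attributes to the first component.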