
    Deep Video Analytics of Humans: From Action Recognition to Forgery Detection

    In this work, we explore a variety of techniques and applications for visual problems involving videos of humans in the contexts of activity detection, pose detection, and forgery detection. The first works discussed here address the issue of human activity detection in untrimmed video, where the actions performed are spatially and temporally sparse. The video may therefore contain long sequences of frames where no actions occur, and the actions that do occur often comprise only a very small percentage of the pixels on the screen. We address this with a two-stage architecture that first creates many coarse proposals with high recall, and then classifies and refines them to create temporally accurate activity proposals. We present two methods that follow this high-level paradigm: TRI-3D and CHUNK-3D. This work on activity detection is then extended to include results on few-shot learning. In this domain, a system must learn to perform detection given only an extremely limited set of training examples. To solve this problem, both in the context of activity detection and image classification, we propose a method we call a Self-Denoising Neural Network (SDNN), which takes inspiration from Denoising Autoencoders. We also propose a method that performs optical character recognition on real-world images when no labels are available in the language we wish to transcribe. Specifically, we build an accurate transcription system for Hebrew street name signs when no labeled training data is available. To do this, we divide the problem into two components and address each separately: content, which refers to the characters and language structure, and style, which refers to the domain of the images (for example, real or synthetic). We train with simple synthetic Hebrew street signs to address the content component, and with labeled French street signs to address the style.
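The coarse-then-refine paradigm behind TRI-3D and CHUNK-3D can be sketched on a 1D stream of per-frame activity scores. The thresholds and scores below are illustrative stand-ins for the learned proposal and classification stages, not the actual networks: stage one merges frames above a permissive threshold into high-recall coarse proposals, and stage two trims each proposal to the span that a stricter criterion still accepts.

```python
def coarse_proposals(scores, low_thr=0.2):
    """Stage 1: merge consecutive frames above a permissive threshold
    into coarse, high-recall temporal proposals (start, end)."""
    proposals, start = [], None
    for i, s in enumerate(scores):
        if s >= low_thr and start is None:
            start = i
        elif s < low_thr and start is not None:
            proposals.append((start, i - 1))
            start = None
    if start is not None:
        proposals.append((start, len(scores) - 1))
    return proposals

def refine(proposal, scores, high_thr=0.6):
    """Stage 2: trim a coarse proposal to the frames a stricter
    criterion accepts; drop it entirely if nothing survives."""
    start, end = proposal
    while start <= end and scores[start] < high_thr:
        start += 1
    while end >= start and scores[end] < high_thr:
        end -= 1
    return (start, end) if start <= end else None

# illustrative per-frame activity scores for a sparse, untrimmed video
scores = [0.0, 0.3, 0.7, 0.9, 0.4, 0.1, 0.0, 0.25, 0.8, 0.1]
coarse = coarse_proposals(scores)            # [(1, 4), (7, 8)]
refined = [r for p in coarse if (r := refine(p, scores)) is not None]
# refined == [(2, 3), (8, 8)]
```

A deliberately low first-stage threshold trades precision for recall, since the second stage can discard or tighten proposals but cannot recover actions the first stage missed.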
We continue our analysis by proposing a method for the automatic detection of facial forgeries in videos and images. This work approaches the problem of facial forgery detection by breaking the face into multiple regions and training separate classifiers for each part. The end result is a collection of high-quality facial forgery detectors that are both accurate and explainable. We exploit this explainability by providing extensive empirical analysis of our method's results. Next, we present work that focuses on multi-camera, multi-person 3D human pose estimation from video. To address this problem, we aggregate the outputs of a 2D human pose detector across cameras and actors using a novel factor graph formulation, which we optimize using the loopy belief propagation algorithm. In particular, our factor graph introduces a temporal smoothing term to create smooth transitions between poses across frames. Finally, our last proposed method covers activity detection, pose detection, and tracking in the game of Ping Pong, where we present a new dataset, dubbed SPIN, with extensive annotations. We introduce several tasks with this dataset, including predicting the future actions of players and tracking ball movements. To evaluate our performance on these tasks, we present a novel recurrent gated CNN architecture.
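The effect of a temporal smoothing term can be sketched in isolation. The toy below assumes one actor, a scalar "pose" per candidate, and a chain-structured graph, where the smoothing objective can be minimized exactly with Viterbi-style dynamic programming; the dissertation's multi-camera, multi-person factor graph is loopy and therefore requires loopy belief propagation instead.

```python
def smooth_track(candidates, unary, smooth_weight=1.0):
    """Choose one candidate per frame minimizing unary cost plus a
    temporal smoothing penalty smooth_weight * |x_t - x_{t-1}|,
    via exact Viterbi-style dynamic programming on a chain."""
    cost = [{j: unary[0][j] for j in range(len(candidates[0]))}]
    back = [{}]
    for t in range(1, len(candidates)):
        cost.append({})
        back.append({})
        for j, x in enumerate(candidates[t]):
            prev_cost, prev_i = min(
                (cost[t - 1][i] + smooth_weight * abs(x - xp), i)
                for i, xp in enumerate(candidates[t - 1])
            )
            cost[t][j] = unary[t][j] + prev_cost
            back[t][j] = prev_i
    j = min(cost[-1], key=cost[-1].get)       # best final candidate
    path = [j]
    for t in range(len(candidates) - 1, 0, -1):
        j = back[t][j]
        path.append(j)
    return path[::-1]

# one scalar "pose" candidate set per frame; the outlier at 10.0 has the
# lowest unary cost in frame 1 but is rejected once smoothness counts
candidates = [[0.0], [0.0, 10.0], [0.0]]
unary = [[0.0], [0.5, 0.0], [0.0]]
track = smooth_track(candidates, unary)      # [0, 0, 0]
```

With `smooth_weight=0.0` the same call degenerates to per-frame argmin and selects the jittery outlier, which is exactly the behavior the smoothing term is there to suppress.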

    Addressing training data sparsity and interpretability challenges in AI based cellular networks

    To meet the diverse and stringent communication requirements of emerging network use cases, zero-touch artificial intelligence (AI) based deep automation in cellular networks is envisioned. However, the full potential of AI in cellular networks remains hindered by two key challenges: (i) training data is not as freely available in cellular networks as in other fields where AI has made a profound impact, and (ii) current AI models tend to exhibit black-box behavior, making operators reluctant to entrust the operation of multibillion-dollar, mission-critical networks to a black-box AI engine that allows little insight into the relationships between configuration and optimization parameters and key performance indicators. This dissertation systematically addresses and proposes solutions to these two key problems faced by emerging networks. A framework for addressing the training data sparsity challenge in cellular networks is developed that can assist network operators and researchers in choosing the optimal data enrichment technique for different network scenarios, based on the available information. The framework encompasses classical interpolation techniques, such as inverse distance weighting and kriging; more advanced ML-based methods, such as transfer learning and generative adversarial networks; several new techniques, such as matrix completion theory and leveraging different types of network geometry; and simulators and testbeds, among others. The proposed framework will lead to more accurate ML models that rely on a sufficient amount of representative training data. Moreover, solutions are proposed to address the data sparsity challenge specifically in Minimization of Drive Tests (MDT) based automation approaches. MDT allows coverage to be estimated at the base station by exploiting measurement reports gathered by the user equipment, without the need for drive tests.
Thus, MDT is a key enabling feature for data- and artificial intelligence-driven autonomous operation and optimization in current and emerging cellular networks. However, to date, the utility of the MDT feature remains thwarted by issues such as the sparsity of user reports and user positioning inaccuracy. For the first time, this dissertation reveals the existence of an optimal bin width for coverage estimation in the presence of inaccurate user positioning, scarcity of user reports, and quantization error. The presented framework can enable network operators to configure, for a given positioning accuracy and user density, the bin size that results in the most accurate MDT-based coverage estimation. The lack of interpretability in AI-enabled networks is addressed by proposing a first-of-its-kind neural network architecture that leverages analytical modeling, domain knowledge, big data, and machine learning to turn black-box machine learning models into more interpretable ones. The proposed approach combines analytical modeling and domain knowledge to custom-design machine learning models, with the aim of moving towards interpretable models that not only require less training time, but can also deal with issues such as sparsity of training data and determination of model hyperparameters. The approach is tested on both simulated and real data, and results show that it outperforms existing mathematical models while remaining interpretable compared with black-box ML models. Thus, the proposed approach can be used to derive better mathematical models of complex systems. The findings from this dissertation can help solve the challenges in emerging AI-based cellular networks and thus aid in their design, operation, and optimization.
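Among the classical interpolation techniques the framework covers, inverse distance weighting is the simplest to sketch: the coverage value at an unmeasured location is a distance-weighted average of nearby reports. The coordinates and RSRP values below are hypothetical, not taken from the dissertation.

```python
def idw(query, samples, power=2.0, eps=1e-12):
    """Inverse-distance-weighted estimate at `query` from (x, y, value)
    samples; `power` controls how quickly a sample's influence decays."""
    num = den = 0.0
    for x, y, v in samples:
        d2 = (x - query[0]) ** 2 + (y - query[1]) ** 2
        if d2 < eps:                  # query coincides with a sample
            return v
        w = 1.0 / d2 ** (power / 2.0)
        num += w * v
        den += w
    return num / den

# hypothetical MDT reports: (x, y, RSRP in dBm)
reports = [(0.0, 0.0, -80.0), (2.0, 0.0, -100.0)]
estimate = idw((1.0, 0.0), reports)   # equidistant -> plain average, -90.0
```

A larger `power` makes the estimate more local (nearby reports dominate), which interacts directly with the report-sparsity and positioning-error trade-off the dissertation studies.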

    Deep Learning for Building Footprint Generation from Optical Imagery

    Deep learning based methods have shown promising results for the task of building footprint generation, but they have two inherent limitations. First, the extracted buildings exhibit blurred boundaries and blob-like shapes. Second, massive pixel-level annotations are required for network training. This dissertation develops a series of methods to address the above problems. Furthermore, the developed methods are transferred into practical applications.

    Deep Learning Methods for Nonlinearity Mitigation in Coherent Fiber-Optic Communication Links

    Nowadays, the demand for telecommunication services is rapidly growing. To meet this ever-increasing connectivity demand, the telecommunication industry needs to maintain the exponential growth of capacity supply. One of the central efforts in this initiative is directed towards coherent fiber-optic communication systems, the backbone of modern telecommunication infrastructure. Nonlinear distortions, i.e., those dependent on the transmitted signal, are widely considered to be one of the major limiting factors of these systems. When mitigating these distortions, we cannot rely on pre-recorded information about channel properties, which is often missing or incorrect, and therefore have to resort to adaptive mitigation techniques that learn the link properties by themselves. Unfortunately, the existing practical approaches are suboptimal: they assume weak nonlinear distortion and propose its compensation via a cascade of separately trained, sub-optimal algorithms. Deep learning, a currently very popular subclass of machine learning, offers a way to address these problems. First, deep learning solutions can approximate an arbitrary nonlinear function well without making any prior assumptions about it. Second, deep learning solutions can jointly optimize a cluster of single-purpose algorithms, driving them toward a global performance optimum. In this thesis, two deep-learning solutions for nonlinearity mitigation in high-baudrate coherent fiber-optic communication links are proposed. The first is a data augmentation technique for improving the training of supervised-learned algorithms for the compensation of nonlinear distortion. Data augmentation encompasses a set of approaches for enhancing the size and quality of training datasets so that they lead to better supervised-learned models.
This thesis shows that specially designed data augmentation techniques can be a very efficient tool for the development of powerful supervised-learned nonlinearity compensation algorithms. In various test cases studied both numerically and experimentally, the suggested augmentation is shown to reduce by up to 6× the size of the dataset required to achieve the desired performance, and to reduce by nearly 2× the training complexity of a nonlinearity compensation algorithm. The proposed approach is generic and can be applied to enhance a multitude of supervised-learned nonlinearity compensation techniques. The second is an end-to-end learning procedure enabling joint optimization of the probabilistic and geometric shaping of symbol sequences. In a general end-to-end learning approach, the whole system is implemented as a single trainable NN from bits-in to bits-out. The novelty of the proposed approach lies in using a cost-effective channel model based on perturbation theory and a refined symbol-probability training procedure. The learned constellation shaping demonstrates considerable mutual information gains in single-channel 64 GBd transmission through both single-span 170 km and multi-span 30×80 km single-mode fiber links. The suggested end-to-end learning procedure is applicable to an arbitrary coherent fiber-optic communication link.
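The abstract does not spell out the augmentation design, so as an illustrative stand-in, the sketch below shows one generic augmentation for coherent-transmission datasets: square QAM constellations are invariant under 90° rotations, so rotating each (received, transmitted) symbol pair by multiples of π/2 yields additional, physically consistent training examples.

```python
import cmath

def rotate_pairs(pairs, k):
    """Rotate (received, transmitted) complex symbol pairs by k * 90 deg.
    For square QAM the rotated transmitted symbols remain valid
    constellation points, so each rotation is a 'free' extra example."""
    phase = cmath.exp(1j * k * cmath.pi / 2)
    return [(rx * phase, tx * phase) for rx, tx in pairs]

def augment(dataset):
    """Quadruple a dataset of symbol pairs via the four QAM rotations."""
    out = []
    for k in range(4):
        out.extend(rotate_pairs(dataset, k))
    return out

# one hypothetical noisy QPSK observation: (received, transmitted)
data = [(1.1 + 0.9j, 1 + 1j)]
aug = augment(data)                   # 4 training examples from 1
```

Because the rotation is applied jointly to input and label, the augmented examples remain consistent with any rotation-equivariant channel impairment, which is what makes such augmentations "free" rather than label-corrupting.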

    Trusted Artificial Intelligence in Manufacturing

    The successful deployment of AI solutions in manufacturing environments hinges on their security, safety, and reliability, which become more challenging in settings where multiple AI systems (e.g., industrial robots, robotic cells, Deep Neural Networks (DNNs)) interact as atomic systems and with humans. To guarantee the safe and reliable operation of AI systems on the shop floor, many challenges must be addressed in the scope of complex, heterogeneous, dynamic, and unpredictable environments. Specifically, data reliability, human-machine interaction, security, transparency, and explainability challenges need to be addressed at the same time. Recent advances in AI research (e.g., in deep neural network security and explainable AI (XAI) systems), coupled with novel research outcomes in the formal specification and verification of AI systems, provide a sound basis for safe and reliable AI deployments in production lines. Moreover, the legal and regulatory dimension of safe and reliable AI solutions in production lines must be considered as well. To address some of the above-listed challenges, fifteen European organizations collaborate in the scope of the STAR project, a research initiative funded by the European Commission under its H2020 program (Grant Agreement Number: 956573). STAR researches, develops, and validates novel technologies that enable AI systems to acquire knowledge in order to take timely and safe decisions in dynamic and unpredictable environments. Moreover, the project researches and delivers approaches that enable AI systems to confront sophisticated adversaries and to remain robust against security attacks. This book is co-authored by the STAR consortium members and provides a review of technologies, techniques, and systems for trusted, ethical, and secure AI in manufacturing.
The different chapters of the book cover systems and technologies for industrial data reliability, responsible and transparent artificial intelligence systems, human-centred manufacturing systems such as human-centred digital twins, cyber-defence in AI systems, simulated reality systems, human-robot collaboration systems, as well as automated mobile robots for manufacturing environments. A variety of cutting-edge AI technologies are employed by these systems, including deep neural networks, reinforcement learning systems, and explainable artificial intelligence systems. Furthermore, relevant standards and applicable regulations are discussed. Beyond reviewing state-of-the-art standards and technologies, the book illustrates how the STAR research goes beyond the state of the art, towards enabling and showcasing human-centred technologies in production lines. Emphasis is put on dynamic human-in-the-loop scenarios, where ethical, transparent, and trusted AI systems co-exist with human workers. The book is made available as an open access publication, making it broadly and freely available to the AI and smart manufacturing communities.

    Applications

    Volume 3 describes how resource-aware machine learning methods and techniques are used to successfully solve real-world problems. The book provides numerous specific application examples: in health and medicine for risk modelling, diagnosis, and treatment selection for diseases; in electronics, steel production, and milling for quality control during manufacturing processes; and in traffic and logistics for smart cities and for mobile communications.

    Identification through Finger Bone Structure Biometrics


    Proceedings of the 2021 Symposium on Information Theory and Signal Processing in the Benelux, May 20-21, TU Eindhoven
