15 research outputs found

    Detection of glaucoma using three-stage training with EfficientNet

    [EN] This paper presents a methodology based on three-stage training of a state-of-the-art network architecture pretrained on ImageNet and iteratively fine-tuned in three steps: first freezing all layers, then retraining a specific number of them, and finally training the entire architecture from scratch, to achieve a system with high accuracy and reliability. To determine the performance of our technique, a dataset of 17,070 cropped color samples of fundus images with two classes, normal and abnormal, is used. Extensive evaluations using baseline models (VGG16, InceptionV3 and ResNet50) are carried out, in addition to thorough experimentation with the proposed pipeline using variants of EfficientNet and EfficientNetV2. The training procedure is described precisely, with emphasis on the number of parameters trained, the confusion matrices (with analysis of false positives and false negatives), accuracy, and F1-score obtained at each stage of the proposed methodology. The results show that the intelligent system presented for the task at hand is reliable and highly precise, that its predictions are consistent, and that the number of parameters that need to be trained is low compared to other alternatives.
    This work is supported by the HK Innovation and Technology Commission (InnoHK Project CIMDA), the HK Research Grants Council (Project CityU 11204821) and City University of Hong Kong (Project 9610034). We acknowledge the support of Universitat Politècnica de València; R&D project PID2021-122580NB-I00, funded by MCIN/AEI/10.13039/501100011033 and ERDF.
    De Zarzà, I.; De Curtò, J.; Tavares De Araujo Cesariny Calafate, CM. (2022). Detection of glaucoma using three-stage training with EfficientNet. Intelligent Systems with Applications. 16:1-10. https://doi.org/10.1016/j.iswa.2022.200140
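The three-stage schedule described in the abstract above can be illustrated with a toy freeze/unfreeze routine that also reports how many parameters are trainable at each stage. This is only a sketch under stated assumptions, not the authors' pipeline: the `Layer` class, the layer sizes, the always-trainable classification head, and the choice to unfreeze the *last* `n_unfrozen` backbone layers in stage 2 are all hypothetical, and stage 3 is read here as making every layer trainable.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """Toy stand-in for a network layer (hypothetical, for illustration only)."""
    name: str
    n_params: int
    trainable: bool = True


def set_stage(backbone, head, stage, n_unfrozen=2):
    """Set trainable flags for one of the three fine-tuning stages.

    Stage 1: the whole pretrained backbone is frozen.
    Stage 2: only the last `n_unfrozen` backbone layers are trainable (assumption).
    Stage 3: every backbone layer is trainable.
    The head is assumed to be trainable in every stage.
    """
    if stage not in (1, 2, 3):
        raise ValueError("stage must be 1, 2 or 3")
    for idx, layer in enumerate(backbone):
        if stage == 1:
            layer.trainable = False
        elif stage == 2:
            layer.trainable = idx >= len(backbone) - n_unfrozen
        else:
            layer.trainable = True
    for layer in head:
        layer.trainable = True


def trainable_params(backbone, head):
    """Count the parameters that would receive gradient updates."""
    return sum(layer.n_params for layer in backbone + head if layer.trainable)


# Made-up layer sizes, chosen only to make the counts easy to follow.
backbone = [Layer("block1", 100), Layer("block2", 200), Layer("block3", 300)]
head = [Layer("classifier", 10)]

for stage in (1, 2, 3):
    set_stage(backbone, head, stage)
    print(f"stage {stage}: {trainable_params(backbone, head)} trainable parameters")
```

In a real framework the same idea is usually expressed by toggling a per-layer or per-parameter "trainable" flag before each training run.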

    AI-Enhanced Methods in Autonomous Systems: Large Language Models, DL Techniques, and Optimization Algorithms

    Thesis by compendium of publications.
    [EN] The proliferation of autonomous systems, and their increasing integration with day-to-day human life, have opened new frontiers of research and development. Within this scope, the current thesis examines the multifaceted applications of Large Language Models (LLMs), Deep Learning (DL) techniques, and Optimization Algorithms within the realm of these autonomous systems. Drawing on the principles of AI-enhanced methods, the studies gathered in this work converge on the exploration and enhancement of different autonomous systems, ranging from B5G Truck Platooning Systems, Multi-Agent Systems (MASs), Unmanned Aerial Vehicles (UAVs), and Forest Fire Area Estimation to the early detection of diseases such as glaucoma. A key research focus of this work is the innovative deployment of adaptive PID controllers in vehicle platooning, facilitated through the integration of LLMs. These PID controllers, when infused with AI capabilities, offer new possibilities in terms of efficiency, reliability, and security of platooning systems. We developed a DL model that emulates an adaptive PID controller, thereby showcasing its potential in AI-enabled radio and networks. Simultaneously, our exploration extends to multi-agent systems, proposing an Extended Coevolutionary (EC) Theory that combines elements of coevolutionary dynamics, adaptive learning, and LLM-based strategy recommendations. This allows for a more nuanced and dynamic understanding of the strategic interactions among heterogeneous agents in MASs. Moreover, we study UAVs, proposing a system for video understanding that employs a language-based world-state history of events and objects present in a scene captured by a UAV. The use of LLMs here enables open-ended reasoning, such as event forecasting, with minimal human intervention. Furthermore, an alternative DL methodology is applied to estimate the affected area during forest fires. This approach leverages a novel architecture called TabNet, integrated with Transformers, thus providing accurate and efficient area estimation. In the field of healthcare, our research outlines a successful early detection methodology for glaucoma. Using a three-stage training approach with EfficientNet on retinal images, we achieved high accuracy in detecting early signs of this disease. Across these diverse applications, the core focus remains the exploration of advanced AI methodologies within autonomous systems. The studies within this thesis seek to demonstrate the power and potential of AI-enhanced techniques in tackling complex problems within these systems. These in-depth investigations, experimental analyses, and developed solutions shed light on the transformative potential of AI methodologies in improving the efficiency, reliability, and security of autonomous systems, ultimately contributing to future research and development in this expansive field.
    De Zarzà I Cubero, I. (2023). AI-Enhanced Methods in Autonomous Systems: Large Language Models, DL Techniques, and Optimization Algorithms [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/202201

    Une théorie unifiante de l’apprentissage: DL rencontre méthodes à noyaux [A Unifying Theory of Learning: DL Meets Kernel Methods]

    We introduce a framework for using kernel approximations in the mini-batch setting with Stochastic Gradient Descent (SGD) as an alternative to Deep Learning. Based on Random Kitchen Sinks, we provide a C++ library for large-scale ML. It contains a CPU-optimized implementation of the algorithm in Le et al. 2013, which allows approximate kernel expansions to be computed in log-linear time. The algorithm requires computing the product of Walsh-Hadamard matrices. A cache-friendly Fast Walsh-Hadamard transform that achieves compelling speed and outperforms current state-of-the-art methods has been developed. McKernel establishes the foundation of a new learning architecture that makes large-scale non-linear classification possible by combining lightning kernel expansions with a linear classifier. It operates in the mini-batch setting, analogously to Neural Networks. We show the validity of our method through extensive experiments on MNIST and FASHION MNIST. We also propose a new architecture to reduce over-parametrization in Neural Networks. It introduces an operand for rapid computation in the framework of Deep Learning that leverages learned weights. The formalism is described in detail, providing both an accurate elucidation of the mechanics and the theoretical implications.
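To make the log-linear claim in the abstract above concrete, here is a minimal textbook Fast Walsh-Hadamard transform. It is a plain reference sketch and not McKernel's cache-friendly C++ implementation; the function name `fwht` and the unnormalized, natural (Hadamard) ordering are choices made for this illustration.

```python
def fwht(values):
    """Unnormalized Walsh-Hadamard transform of a sequence whose length is a power of two.

    Uses log2(n) butterfly passes over n entries, i.e. O(n log n) additions,
    instead of the O(n^2) cost of multiplying by the Hadamard matrix directly.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)
    h = 1
    while h < n:
        # Combine blocks of size h into blocks of size 2h.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


x = [1, 0, 1, 0, 0, 1, 1, 0]
print(fwht(x))  # → [4, 2, 0, -2, 0, 2, 0, 2]
```

Applying the transform twice returns the input scaled by its length, which is a convenient sanity check for any optimized variant.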

    Spectral Properties of Mimetic Operators for Robust Fluid–Structure Interaction in the Design of Aircraft Wings

    This paper presents a comprehensive study on the spectral properties of mimetic finite-difference operators and their application in the robust fluid–structure interaction (FSI) analysis of aircraft wings under uncertain operating conditions. By delving into the eigenvalue behavior of mimetic Laplacian operators and extending the analysis to stochastic settings, we develop a novel stochastic mimetic framework tailored for addressing uncertainties inherent in the fluid dynamics and structural mechanics of aircraft wings. The framework integrates random matrix theory with mimetic discretization methods, enabling the incorporation of uncertainties in fluid properties, structural parameters, and coupling conditions at the fluid–structure interface. Through spectral and localization analysis of the coupled stochastic mimetic operator, we assess the system’s stability, sensitivity to perturbations, and computational efficiency. Our results highlight the potential of the stochastic mimetic approach for enhancing reliability and robustness in the design of aircraft wings, paving the way for optimization algorithms that integrate uncertainties directly into the design process. Our findings reveal a significant impact of stochastic perturbations on the spectral radius and eigenfunction localization, indicating heightened system sensitivity. The introduction of randomized singular value decomposition (RSVD) within our framework not only enhances computational efficiency but also preserves accuracy in low-rank approximations, which is critical for handling large-scale systems. Moreover, Monte Carlo simulations validate the robustness of our stochastic mimetic framework, showcasing its efficacy in capturing the nuanced dynamics of FSI under uncertainty. 
This study contributes to the fields of numerical methods and aerospace engineering by offering a rigorous and scalable approach for conducting uncertainty-aware FSI analysis, which is crucial for the development of safer and more efficient aircraft.

    Optimizing Propellant Distribution for Interorbital Transfers

    The advent of space exploration missions, especially those aimed at establishing a sustainable presence on the Moon and beyond, necessitates the development of efficient propulsion and mission planning techniques. This study presents a comprehensive analysis of chemical and electric propulsion systems for spacecraft, focusing on optimizing propellant distribution for missions involving transfers from Low Earth Orbit (LEO) to Geostationary Orbit (GEO) and the lunar surface. Using mathematical modeling and optimization algorithms, we calculate the delta-v requirements for key mission segments and determine the propellant mass required for each propulsion method. The results highlight the trade-offs between the high thrust of chemical propulsion and the high specific impulse of electric propulsion. An optimization model is developed to minimize the total propellant mass, considering a hybrid approach that leverages the advantages of both propulsion types. This research contributes to the field of aerospace engineering by providing insights into propulsion system selection and mission planning for future exploration missions to the Moon, Mars, and Venus.
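The link between delta-v, specific impulse, and propellant mass mentioned in the abstract above is commonly expressed with the Tsiolkovsky rocket equation, m_prop = m_final * (exp(delta_v / (Isp * g0)) - 1). The sketch below applies it to compare two propulsion types. The function name, the 1000 kg final mass, the 4000 m/s delta-v, and the Isp values of 450 s and 3000 s are illustrative assumptions, not figures from the study, and using the same delta-v for both systems is a simplification.

```python
import math

G0 = 9.80665  # standard gravity in m/s^2, used to convert Isp to exhaust velocity


def propellant_mass(final_mass_kg, delta_v_ms, isp_s):
    """Propellant needed for a given delta-v via the Tsiolkovsky rocket equation.

    final_mass_kg is the mass left after the burn (structure plus payload).
    """
    mass_ratio = math.exp(delta_v_ms / (isp_s * G0))
    return final_mass_kg * (mass_ratio - 1.0)


# Hypothetical mission segment and engine figures, for illustration only.
final_mass = 1000.0   # kg
delta_v = 4000.0      # m/s
chemical = propellant_mass(final_mass, delta_v, isp_s=450.0)
electric = propellant_mass(final_mass, delta_v, isp_s=3000.0)

print(f"chemical propellant: {chemical:.0f} kg")
print(f"electric propellant: {electric:.0f} kg")
```

Because propellant mass grows exponentially with delta-v divided by exhaust velocity, a higher specific impulse reduces propellant sharply, which is the trade-off against thrust that the abstract describes.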

    Signature and Log-Signature for the Study of Empirical Distributions Generated with GANs

    [EN] In this paper, we address the research gap in efficiently assessing Generative Adversarial Network (GAN) convergence and goodness of fit by introducing the application of the Signature Transform to measure similarity between image distributions. Specifically, we propose the novel use of Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) Signature, along with Log-Signature, as alternatives to existing methods such as Fréchet Inception Distance (FID) and Multi-Scale Structural Similarity Index Measure (MS-SSIM). Our approach offers advantages in terms of efficiency and effectiveness, providing a comprehensive understanding and extensive evaluations of GAN convergence and goodness of fit. Furthermore, we present analytical measures based on the Kruskal-Wallis statistical test to evaluate the goodness of fit of GAN sample distributions. Unlike existing GAN measures, which are based on deep neural networks and require extensive GPU computations, our approach significantly reduces computation time and runs on the CPU while maintaining the same level of accuracy. Our results demonstrate the effectiveness of the proposed method in capturing the intrinsic structure of the generated samples, providing meaningful insights into GAN performance. Lastly, we evaluate our approach qualitatively using Principal Component Analysis (PCA) and adaptive t-Distributed Stochastic Neighbor Embedding (t-SNE) for data visualization, illustrating the plausibility of our method.
    This work was supported by the HK Innovation and Technology Commission (InnoHK Project CIMDA). We acknowledge the support of R&D project PID2021-122580NB-I00, funded by MCIN/AEI/10.13039/501100011033 and ERDF. We thank the following funding sources from GOETHE-University Frankfurt am Main: DePP (Dezentrale Plannung von Platoons im Straßengüterverkehr mit Hilfe einer KI auf Basis einzelner LKW) and the Center for Data Science & AI.
    De Curtò, J.; De Zarzà, I.; Roig, G.; Tavares De Araujo Cesariny Calafate, CM. (2023). Signature and Log-Signature for the Study of Empirical Distributions Generated with GANs. Electronics. 12(10). https://doi.org/10.3390/electronics12102192

    Cascading and Ensemble Techniques in Deep Learning

    [EN] In this study, we explore the integration of cascading and ensemble techniques in Deep Learning (DL) to improve prediction accuracy on diabetes data. The primary approach involves creating multiple Neural Networks (NNs), each predicting the outcome independently, and then feeding these initial predictions into another set of NNs. Our exploration starts from a preliminary study and extends to various ensemble techniques, including bagging, stacking, and finally cascading. The cascading ensemble involves training a second layer of models on the predictions of the first. This cascading structure, combined with ensemble voting for the final prediction, aims to exploit the strengths of multiple models while mitigating their individual weaknesses. Our results demonstrate a significant improvement in prediction accuracy, providing a compelling case for the potential utility of these techniques in healthcare applications, specifically for the prediction of diabetes, where we achieve a model accuracy of 91.5% on the test set of a particularly challenging dataset and compare thoroughly against many other methodologies.
    We thank the following funding sources from GOETHE-University Frankfurt am Main: DePP (Dezentrale Plannung von Platoons im Straßengüterverkehr mit Hilfe einer KI auf Basis einzelner LKW), the Center for Data Science & AI, and xAIBiology. We acknowledge the support of R&D project PID2021-122580NB-I00, funded by MCIN/AEI/10.13039/501100011033 and ERDF.
    De Zarzà, I.; De Curtò, J.; Hernández-Orallo, E.; Tavares De Araujo Cesariny Calafate, CM. (2023). Cascading and Ensemble Techniques in Deep Learning. Electronics. 12(15). https://doi.org/10.3390/electronics12153354
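The cascading structure described in the abstract above (first-level predictions fed to a second set of models, then a vote) can be sketched at inference time as follows. This is a schematic illustration only: the hand-written rule functions stand in for trained neural networks, the names `cascade_predict` and `majority_vote` are hypothetical, and the training of either level is not shown.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most frequent label among the votes."""
    return Counter(votes).most_common(1)[0][0]


def cascade_predict(x, first_level, second_level):
    """Run a two-level cascade and vote over the second-level outputs.

    Each first-level model maps the raw features to a label; each
    second-level model maps the list of first-level labels to a label.
    """
    stage_one = [model(x) for model in first_level]
    stage_two = [model(stage_one) for model in second_level]
    return majority_vote(stage_two)


# Toy stand-ins for trained models (assumptions, for illustration only).
first_level = [
    lambda x: int(x[0] > 0.5),
    lambda x: int(x[1] > 0.5),
    lambda x: int(sum(x) > 1.0),
]
second_level = [
    lambda preds: int(sum(preds) >= 2),
    lambda preds: preds[0],
    lambda preds: int(any(preds)),
]

print(cascade_predict([0.9, 0.8], first_level, second_level))
```

Using an odd number of second-level models avoids ties in a binary vote.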

    Optimizing Neural Networks for Imbalanced Data

    Imbalanced datasets pose pervasive challenges in numerous machine learning (ML) applications, notably in areas such as fraud detection, where fraudulent cases are vastly outnumbered by legitimate transactions. Conventional ML methods often struggle with such imbalances, resulting in models with suboptimal performance on the minority class. This study undertakes a thorough examination of strategies for optimizing supervised learning algorithms when confronted with imbalanced datasets, emphasizing resampling techniques. Initially, we explore multiple methodologies, encompassing Gaussian Naive Bayes, linear and quadratic discriminant analysis, K-nearest neighbors (K-NN), support vector machines (SVMs), decision trees, and multi-layer perceptrons (MLPs). We apply these to a four-class spiral dataset, a notoriously demanding non-linear classification problem, to gauge their effectiveness. Subsequently, we leverage the resulting insights for a real-world credit card fraud detection task on a public dataset, where we achieve an accuracy of 99.937%. In this context, we compare and contrast the performance of undersampling, oversampling, and the synthetic minority oversampling technique (SMOTE). Our findings highlight the effectiveness of resampling strategies in improving model performance on the minority class; in particular, oversampling techniques achieve the best performance, resulting in an accuracy of 99.928% with a very low number of false negatives (21/227,451).
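As a concrete picture of the simplest resampling strategy named in the abstract above, the sketch below performs random oversampling: minority-class samples are duplicated at random until every class matches the size of the largest one. It is a minimal illustration, not the study's code; the function name `random_oversample`, the fixed seed, and the toy data are assumptions. SMOTE differs in that it synthesizes new minority points by interpolation rather than duplicating existing ones.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples until all classes have equal counts."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())

    out_samples, out_labels = [], []
    for y, xs in by_class.items():
        # Draw the missing samples with replacement from the class itself.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_samples.append(x)
            out_labels.append(y)
    return out_samples, out_labels


# Toy data: 8 legitimate transactions (label 0) and 2 fraudulent ones (label 1).
X = [[float(i)] for i in range(10)]
y = [0] * 8 + [1] * 2
X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

Resampling is normally applied to the training split only, so that evaluation still reflects the original class distribution.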

    Summarization of Videos with the Signature Transform

    This manuscript presents a new benchmark for assessing the quality of visual summaries without the need for human annotators. It is based on the Signature Transform, specifically the RMSE and MAE Signature and Log-Signature metrics, and builds on the assumption that uniform random sampling can offer accurate summarization capabilities. We provide a new dataset comprising videos from YouTube and their corresponding automatic audio transcriptions. First, we introduce a preliminary baseline for automatic video summarization, which has at its core a Vision Transformer, an image–text model pre-trained with Contrastive Language–Image Pre-training (CLIP), and an object detection module. We then propose an accurate technique grounded in the harmonic components captured by the Signature Transform, which delivers compelling accuracy. The analytical measures are extensively evaluated, and we conclude that they strongly correlate with the notion of a good summary.