2,526 research outputs found

    Data mining based cyber-attack detection

    Deep Learning Models For Biomedical Data Analysis

    The field of biomedical data analysis is a vibrant area of research dedicated to extracting valuable insights from a wide range of biomedical data sources, including biomedical images and genomics data. The emergence of deep learning, an artificial intelligence approach, presents significant prospects for enhancing biomedical data analysis and knowledge discovery. This dissertation focused on exploring innovative deep learning methods for biomedical image processing and gene data analysis. During the COVID-19 pandemic, biomedical imaging data, including CT scans and chest x-rays, played a pivotal role in identifying COVID-19 cases by categorizing patient chest x-ray outcomes as COVID-19-positive or negative. While supervised deep learning methods have effectively recognized COVID-19 patterns in chest x-ray datasets, the availability of annotated training data remains limited. To address this challenge, the thesis introduced a semi-supervised deep learning model named ssResNet, built upon the Residual Neural Network (ResNet) architecture. The model combines supervised and unsupervised paths and incorporates a weighted supervised loss function to manage data imbalance. The thesis also explored strategies to diminish prediction uncertainty in deep learning models for critical applications such as medical image processing. It did so through an ensemble deep learning model that integrates bagging of deep learning models with model calibration techniques. This ensemble model not only boosts biomedical image segmentation accuracy but also reduces prediction uncertainty, as validated on a comprehensive chest x-ray image segmentation dataset. Furthermore, the thesis introduced an ensemble model integrating Proformer and ensemble learning methodologies. This model constructs multiple independent Proformers for predicting gene expression; their predictions are combined through weighted averaging to generate final predictions. 
Experimental outcomes underscore the efficacy of this ensemble model in enhancing prediction performance across various metrics. In conclusion, this dissertation advances biomedical data analysis by harnessing the potential of deep learning techniques. It devises innovative approaches for processing biomedical images and gene data. By leveraging deep learning's capabilities, this work paves the way for further progress in biomedical data analytics and its applications within clinical contexts. Index Terms: biomedical data analysis, COVID-19, deep learning, ensemble learning, gene data analytics, medical image segmentation, prediction uncertainty, Proformer, Residual Neural Network (ResNet), semi-supervised learning

    Managing heterogeneous cues in social contexts. A holistic approach for social interactions analysis

    Social interaction refers to any reciprocal action between two or more individuals in which information is shared without any mediating technology. This interaction is a significant part of an individual's socialization and of the experience gained throughout one's lifetime, and it is studied by different disciplines (sociology, psychology, medicine, etc.). In the context of testing and observational studies, multiple mechanisms are used to study these interactions, such as questionnaires, direct observation and analysis of events by human operators, or a posteriori observation and analysis of recorded events by specialists (psychologists, sociologists, doctors, etc.). However, such mechanisms are expensive in terms of processing time, require a high level of attention to analyze several cues simultaneously, depend on the operator (subjectivity of the analysis), and can target only one facet of the interaction. To address these issues, it is useful to automate the social interaction analysis process, bridging the gap between human-based and machine-based social interaction analysis. We therefore propose a holistic approach that integrates multimodal heterogeneous cues and contextual information (complementary "exogenous" data) dynamically and optionally, according to their availability. Such an approach allows the analysis of multiple "signals" in parallel (whereas humans can focus on only one). This analysis can be further enriched with data related to the context of the scene (location, date, type of music, event description, etc.) or to the individuals (name, age, gender, data extracted from their social networks, etc.). The contextual information enriches the modeling of the extracted metadata and gives it a more "semantic" dimension. Managing this heterogeneity is an essential step in implementing a holistic approach. Automating "in vivo" capture and observation with non-intrusive devices and without predefined scenarios raises issues related to data (i) privacy and security; (ii) heterogeneity; and (iii) volume. Hence, within the holistic approach we propose (1) a comprehensive privacy-preserving data model that decouples metadata extraction from social interaction analysis methods; (2) a geometric, non-intrusive eye contact detection method; and (3) a deep model for French food classification to extract information from the video content. The proposed approach manages heterogeneous cues coming from different modalities as multi-layer sources (visual signals, voice signals, contextual information) at different time scales and with different combinations between layers (the cues are represented as time series). The approach has been designed to operate without intrusive devices, in order to capture real behaviors and achieve naturalistic observation. We have deployed the proposed approach on the OVALIE platform, located at the University of Toulouse-Jean Jaurès, France, which aims to study eating behaviors in different real-life contexts.

    CONTRIBUTIONS TO EFFICIENT AUTOMATIC TRANSCRIPTION OF VIDEO LECTURES

    Thesis by compendium. In recent years, online multimedia repositories have become key knowledge assets thanks to the rise of the Internet, especially in the area of education. Educational institutions around the world have devoted considerable effort to exploring different teaching methods, both to improve the transmission of knowledge and to reach a wider audience. As a result, online video lecture repositories are now available and serve as complementary tools that can boost the learning experience and help students assimilate new concepts. To guarantee the success of these repositories, the transcription of each lecture plays a very important role because it constitutes the first step towards many other features. Transcription makes learning materials searchable, enables translation into other languages, supports recommendation functions, makes content summaries possible, guarantees access for people with hearing disabilities, etc. However, transcribing these videos is expensive in terms of time and human cost. This thesis therefore aims to provide new tools and techniques that ease the transcription of these repositories. In particular, we address the development of a complete Automatic Speech Recognition toolkit, with a special focus on the Deep Learning techniques that contribute to providing accurate transcriptions in real-world scenarios. This toolkit has been tested against many others in different international competitions, showing comparable transcription quality. Moreover, a new technique that makes use of Confidence Measures has been proposed to improve recognition accuracy; it was the spark that motivated the proposal of new Confidence Measure techniques that helped to further improve transcription quality. To this end, a new speaker-adapted confidence measure approach was proposed for models based on Recurrent Neural Networks. The contributions proposed herein have been tested in real-life scenarios in different educational repositories. In fact, the transLectures-UPV toolkit is part of a set of tools for providing video lecture transcriptions in many different Spanish and European universities and institutions. Agua Teba, MÁD. (2019). CONTRIBUTIONS TO EFFICIENT AUTOMATIC TRANSCRIPTION OF VIDEO LECTURES [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/130198

    Design Principles for Special Purpose, Embodied, Conversational Intelligence with Environmental Sensors (SPECIES) Agents

    As information systems increase their ability to gather and analyze data from the natural environment and as computational power increases, the next generation of human-computer interfaces will be able to facilitate more lifelike and natural interactions with humans. This can be accomplished by using sensors to non-invasively gather information from the user, using artificial intelligence to interpret this information to perceive users' emotional and cognitive states, and using customized interfaces and responses based on embodied-conversational-agent (avatar) technology to respond to the user. We refer to this novel class of intelligent agents as Special Purpose, Embodied, Conversational Intelligence with Environmental Sensors (SPECIES) agents. In this paper, we build on interpersonal communication theory to specify four essential design principles of all SPECIES agents. We also share findings of initial research that demonstrates how SPECIES agents can be deployed to augment human tasks. The results of this paper organize future research efforts in collectively studying and creating more robust, influential, and intelligent SPECIES agents.