3 research outputs found

    Mixed convolutional and long short-term memory network for the detection of lethal ventricular arrhythmia

    Early defibrillation by an automated external defibrillator (AED) is key for the survival of out-of-hospital cardiac arrest (OHCA) patients. ECG feature extraction and machine learning have been successfully used to detect ventricular fibrillation (VF) in AED shock decision algorithms. Recently, deep learning architectures based on 1D Convolutional Neural Networks (CNN) have been proposed for this task. This study introduces a deep learning architecture based on 1D-CNN layers and a Long Short-Term Memory (LSTM) network for the detection of VF. Two datasets were used: one from public repositories of Holter recordings captured at the onset of the arrhythmia, and a second from OHCA patients obtained minutes after the onset of the arrest. Data were partitioned patient-wise into a training set (80%) to design the classifiers and a test set (20%) to report the results. The proposed architecture was compared to 1D-CNN-only deep learners and to a classical approach based on VF-detection features and a support vector machine (SVM) classifier. The algorithms were evaluated in terms of balanced accuracy (BAC), the unweighted mean of the sensitivity (Se) and specificity (Sp). The BAC, Se, and Sp of the architecture for 4-s ECG segments were 99.3%, 99.7%, and 98.9% for the public data, and 98.0%, 99.2%, and 96.7% for the OHCA data. The proposed architecture outperformed all other classifiers by at least 0.3 points of BAC in the public data, and by 2.2 points in the OHCA data. The architecture met the 95% Sp and 90% Se requirements of the American Heart Association in both datasets for segment lengths as short as 3 s.
This is, to the best of our knowledge, the most accurate VF detection algorithm to date, especially on OHCA data, and it would enable an accurate shock/no-shock diagnosis in a very short time. This study was supported by the Ministerio de Economía, Industria y Competitividad, Gobierno de España (ES) (TEC-2015-64678-R) to UI and EA, and by Euskal Herriko Unibertsitatea (ES) (GIU17/031) to UI and EA. The funders, Tecnalia Research and Innovation and Banco Bilbao Vizcaya Argentaria (BBVA), provided support in the form of salaries for authors AP, AA, FAA, CF, and EG, but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the author contributions section.
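The evaluation metric above, balanced accuracy, is defined in the abstract as the unweighted mean of sensitivity and specificity. A minimal sketch of how it can be computed from shock/no-shock labels (function and variable names are illustrative, not taken from the study):

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of sensitivity (Se) and specificity (Sp).

    y_true, y_pred: sequences of 0/1 labels, where 1 = VF (shock advised).
    Returns (BAC, Se, Sp).
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    se = np.mean(y_pred[y_true])     # true-positive rate on VF segments
    sp = np.mean(~y_pred[~y_true])   # true-negative rate on non-VF segments
    return 0.5 * (se + sp), se, sp

# Toy example: 3 VF segments and 2 non-VF segments, one VF segment missed
bac, se, sp = balanced_accuracy([1, 1, 1, 0, 0], [1, 1, 0, 0, 0])
# Se = 2/3, Sp = 1.0, BAC = 5/6
```

Unlike plain accuracy, BAC is insensitive to class imbalance, which matters here because shockable rhythms are a minority class in OHCA recordings.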

    Self-supervised learning for image-to-image translation in the small data regime

    The mass irruption of Deep Convolutional Neural Networks (CNNs) in computer vision since 2012 led to a dominance of the image understanding paradigm consisting in an end-to-end fully supervised learning workflow over large-scale annotated datasets. This approach proved to be extremely useful at solving a myriad of classic and new computer vision tasks with unprecedented performance, at the expense of vast amounts of human-labeled data, extensive computational resources and the disposal of all of our prior knowledge on the task at hand. Even though simple transfer learning methods, such as fine-tuning, have achieved remarkable impact, their success is limited when the amount of labeled data in the target domain is small. Furthermore, the non-static nature of data generation sources will often result in data distribution shifts that degrade the performance of deployed models. As a consequence, there is a growing demand for methods that can exploit elements of prior knowledge and sources of information other than the manually generated ground truth annotations of the images during the network training process, so that they can adapt to new domains that constitute, if not a small data regime, at least a small labeled data regime. This thesis targets such few- or no-labeled-data scenarios in three distinct image-to-image mapping learning problems. 
It contributes with various approaches that leverage our previous knowledge of different elements of the image formation process: We first present a data-efficient framework for both defocus and motion blur detection, based on a model able to produce realistic synthetic local degradations. The framework comprises a self-supervised, a weakly-supervised and a semi-supervised instantiation, and outperforms fully-supervised counterparts. Our knowledge on color image formation is then used to gather input and target ground truth image pairs for the RGB to hyperspectral image reconstruction task. We make use of a CNN to tackle this problem, which, for the first time, allows us to exploit spatial context and achieve state-of-the-art results given a limited hyperspectral image set. In our last contribution to the subfield of data-efficient image-to-image transformation problems, we present the novel semi-supervised task of zero-pair cross-view semantic segmentation: we consider the case of relocation of the camera in an end-to-end trained and deployed monocular, fixed-view semantic segmentation system often found in industry. Under the assumption that we are allowed to obtain an additional set of synchronized but unlabeled image pairs of new scenes from both original and new camera poses, we present ZPCVNet, a model and training procedure that enables the production of dense semantic predictions in either source or target views at inference time. The lack of existing suitable public datasets to develop this approach led us to the creation of MVMO, a large-scale Multi-View, Multi-Object path-traced dataset with per-view semantic segmentation annotations. We expect MVMO to propel future research in the exciting under-developed fields of cross-view and multi-view semantic segmentation. 
Last, in a piece of applied research of direct application in the context of process monitoring of an Electric Arc Furnace (EAF) in a steelmaking plant, we also consider the problem of simultaneously estimating the temperature and spectral emissivity of distant hot emissive samples. To that end, we design our own capturing device, which integrates three point spectrometers and is capable of registering the radiance signal incoming from an 8 cm diameter spot located up to 20 m away. We then define a physically accurate radiative transfer model and solve this inverse problem without the need for annotated data using a probabilistic programming-based Bayesian approach, which yields full posterior distribution estimates of the involved variables that are consistent with laboratory-grade measurements.
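The joint temperature/emissivity estimation is an inverse problem: the same measured radiance can be explained by trading temperature against emissivity, so multiple wavelengths are needed to disambiguate. As a toy illustration only (the thesis uses a physically accurate radiative transfer model and full Bayesian inference, not this simplification), a gray-body version of the problem can be solved by a grid search over temperature, recovering the best constant emissivity in closed form at each candidate:

```python
import numpy as np

# Physical constants (SI units)
H = 6.62607015e-34   # Planck constant
C = 2.99792458e8     # speed of light in vacuum
K = 1.380649e-23     # Boltzmann constant

def planck(lam, T):
    """Blackbody spectral radiance at wavelength lam (m) and temperature T (K)."""
    return (2.0 * H * C**2 / lam**5) / np.expm1(H * C / (lam * K * T))

def fit_gray_body(lam, radiance, T_grid):
    """Estimate (T, emissivity) of a gray body from measured spectral radiance.

    For each candidate T, the least-squares constant emissivity is the ratio
    <radiance, planck> / <planck, planck>; keep the candidate with the
    smallest residual.
    """
    best = (None, None, np.inf)
    for T in T_grid:
        b = planck(lam, T)
        eps = np.dot(radiance, b) / np.dot(b, b)      # closed-form LS emissivity
        resid = np.sum((radiance - eps * b) ** 2)
        if resid < best[2]:
            best = (T, eps, resid)
    return best[0], best[1]

# Synthetic check: a gray body at 1800 K with emissivity 0.7,
# sampled at 50 wavelengths in the 1.0-2.5 um near-infrared band
lam = np.linspace(1.0e-6, 2.5e-6, 50)
meas = 0.7 * planck(lam, 1800.0)
T_hat, eps_hat = fit_gray_body(lam, meas, np.arange(1500.0, 2100.0, 1.0))
```

A Bayesian treatment, as in the thesis, would additionally place priors on temperature and (wavelength-dependent) emissivity and report full posteriors rather than this single point estimate.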

    On the duality between retinex and image dehazing

    Paper presented at: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, held in Salt Lake City, United States of America, June 18-23, 2018. Image dehazing deals with the removal of undesired loss of visibility in outdoor images due to the presence of fog. Retinex is a color vision model mimicking the ability of the Human Visual System to robustly discount varying illuminations when observing a scene under different spectral lighting conditions. Retinex has been widely explored in the computer vision literature for image enhancement and other related tasks. While these two problems are apparently unrelated, the goal of this work is to show that they can be connected by a simple linear relationship. Specifically, most Retinex-based algorithms have the characteristic feature of always increasing image brightness, which turns them into ideal candidates for effective image dehazing by directly applying Retinex to a hazy image whose intensities have been inverted. In this paper, we give theoretical proof that Retinex on inverted intensities is a solution to the image dehazing problem. Comprehensive qualitative and quantitative results indicate that several classical and modern implementations of Retinex can be transformed into competing image dehazing algorithms performing on par with more complex fog removal methods, and can overcome some of the main challenges associated with this problem. JVC was supported by the Spanish government grant ref. IJCI-2014-19516, and MB by the European Research Council, Starting Grant ref. 306337, by the Spanish government grant ref. TIN2015-71537-P, and by an ICREA Academia Award.
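The duality described above reduces to: invert the hazy image, apply Retinex, and invert the result back. A dependency-free sketch follows; the Retinex stand-in is a deliberately crude single-scale variant (log image minus log of a blurred surround, with a separable box blur standing in for the usual Gaussian), whereas the paper's experiments use established Retinex implementations:

```python
import numpy as np

def box_blur(img, radius=7):
    """Separable box blur, a crude surround estimate for single-scale Retinex."""
    k = np.ones(2 * radius + 1) / (2 * radius + 1)
    pad = np.pad(img, radius, mode="edge")
    out = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, pad)
    out = np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, out)
    return out

def retinex(img, eps=1e-6):
    """Single-scale Retinex: log(image) - log(surround), rescaled to [0, 1]."""
    r = np.log(img + eps) - np.log(box_blur(img) + eps)
    return (r - r.min()) / (r.max() - r.min() + eps)

def dehaze(hazy):
    """Dehazing via the Retinex duality: invert, apply Retinex, invert back."""
    return 1.0 - retinex(1.0 - hazy)

# Synthetic grayscale "hazy" image: bright and low-contrast, as if dominated
# by airlight (scene contrast compressed into the [0.7, 1.0] range)
rng = np.random.default_rng(0)
scene = rng.random((64, 64))
hazy = 0.7 + 0.3 * scene
out = dehaze(hazy)
```

Because Retinex only ever brightens, applying it to the inverted (dark) image and inverting back can only darken and re-expand the washed-out intensities, which is exactly the behavior a dehazer needs.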