InceptionCaps: A Performant Glaucoma Classification Model for Data-scarce Environment
Glaucoma is an irreversible ocular disease and is the second leading cause of
visual disability worldwide. Slow vision loss and the asymptomatic nature of
the disease make its diagnosis challenging. Early detection is crucial for
preventing irreversible blindness. Ophthalmologists primarily use retinal
fundus images as a non-invasive screening method. Convolutional neural networks
(CNNs) have demonstrated high accuracy in the classification of medical images.
Nevertheless, a CNN's translation-invariant nature and its inability to model the
part-whole relationship between objects make its direct application unsuitable
for glaucomatous fundus image classification, as it requires a large number of
labelled images for training. This work reviews existing state-of-the-art
models and proposes InceptionCaps, a novel capsule network (CapsNet) based deep
learning model having pre-trained InceptionV3 as its convolution base, for
automatic glaucoma classification. InceptionCaps achieved an accuracy of 0.956,
specificity of 0.96, and AUC of 0.9556, which surpasses several
state-of-the-art deep learning model performances on the RIM-ONE v2 dataset.
These results demonstrate the robustness of the proposed deep learning
model.
Comment: 8 page
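The accuracy and specificity figures quoted above are standard confusion-matrix ratios. The following is a minimal sketch of how such values are typically derived; the function name and the counts are hypothetical and are not taken from the paper or from RIM-ONE v2.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # recall on the glaucoma (positive) class
    specificity = tn / (tn + fp)  # recall on the healthy (negative) class
    return accuracy, sensitivity, specificity


# Hypothetical counts for illustration only (not results from the paper).
acc, sens, spec = binary_metrics(tp=40, fp=2, tn=48, fn=10)
print(acc, sens, spec)
```

Reporting specificity alongside accuracy matters in screening settings, because a model can reach high accuracy on an imbalanced test set while still misclassifying many healthy eyes.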
Climate variability and human livelihoods in western India: 1780-1860
This thesis presents a unique exploration of societal vulnerability to climate variability
through an analysis of the historical climatology of colonial western India between
1780 and 1860. It utilises a range of historical documentary sources, most notably
English language newspapers alongside materials written by officials of the British East
India Company and British and American missionaries. Information from these sources
is used to reconstruct past rainfall variability, with the resulting climatic chronology
used as a backdrop against which to examine societal responses to climate.
The study adopts a content analysis methodology to reconstruct monsoon intensity
from 1780 to 1860. This is calibrated against the instrumental rainfall record for western
India, which extends back to 1847. The reconstruction therefore represents a 67-year
extension of the monsoon record for western India. The extended chronology is
compared with existing reconstructions of climatic forcings related to monsoon
rainfall, including the strength of the Somali jet and indices of the El Niño Southern
Oscillation. These suggest a stationarity in the relationship between these forcings and
monsoon rainfall during and after the study period, indicating that the reconstruction
methodology is robust.
The analysis of societal vulnerability to climate focuses upon severe drought episodes
identified through the rainfall reconstruction. Eight such episodes are identified, all
occurring where drought was widespread across the study area. Of these, five drought
episodes occurred after previous years of deficient monsoon rainfall. Vulnerability at
the local level appears to have been driven predominantly by indebtedness and a lack
of government accountability, coupled with limited markets. Institutional adaptation
policy changed significantly with the shift from Maratha to British rule in 1818 through
the adoption of laissez-faire drought remediation. Evidence suggests that this did not
affect vulnerability significantly during the study period, as the
widespread acceptance of the doctrine amongst the colonial community avoided
institutional inertia. However, this may have served to increase vulnerability to
droughts in the later part of the nineteenth century.
Two-Stream Action Recognition Neural Network with Spatio-Temporal Attention
Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2019. 8. 전화숙.
With today's active research on deep neural networks and advances in data storage and processing technology, research on recognition problems over large-scale data with a temporal dimension, such as video rather than still images alone, is attracting growing interest. In particular, since two-stream networks first showed that features learned by a neural network outperform hand-crafted features, they have become the mainstream architecture for video action recognition. This thesis extends that architecture and proposes one that applies spatio-temporal attention to independently trained two-stream networks for action recognition in video. Through cross attention, the previously independent networks learn in a mutually complementary way, which improves performance. The architecture was evaluated on the standard video action recognition benchmark HMDB-51 and achieved better performance than existing architectures.
Two-stream architecture has been mainstream since the success of [1], but its two important sources of information are processed independently and do not interact until late fusion. We investigate a different spatio-temporal attention architecture based on two separate recognition streams (spatial and temporal), which interact with each other through cross attention. The spatial stream performs action recognition from still video frames, whilst the temporal stream is trained to recognise action from motion in the form of dense optical flow. Both streams convey their learned knowledge to the other stream in the form of attention maps. Cross attention allows us to exploit the availability of supplemental information and enhance the learning of both streams. To demonstrate the benefits of our proposed cross-stream spatio-temporal attention architecture, it has been evaluated on two standard action recognition benchmarks, where it improves on previous performance.
Abstract (Korean)
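Chapter 1 Introduction
Chapter 2 Related Work
2.1 Two-Stream Networks in Action Recognition
2.2 Attention in Action Recognition
Chapter 3 Two-Stream Action Recognition Neural Network with Spatio-Temporal Attention
3.1 Effective Attention Extraction
3.2 Action Pattern Learning Process
Chapter 4 Experiments
4.1 Datasets and Implementation Details
4.2 Performance Comparison
Chapter 5 Conclusion
ABSTRACT
Maste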
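The cross-attention exchange described in the abstract of this thesis, where each stream passes an attention map to the other, might be sketched as follows. This is an illustrative assumption only: the thesis abstract does not specify how the attention maps are computed or applied, so the softmax over channel-averaged activations, the residual reweighting, and the names `attention_map` and `apply_cross_attention` are all hypothetical.

```python
import math


def attention_map(features):
    """Softmax over spatial positions of the channel-averaged activation.

    features: list of positions, each a list of channel activations.
    (Assumed form of attention; the thesis may compute it differently.)
    """
    energy = [sum(pos) / len(pos) for pos in features]
    peak = max(energy)  # subtract the maximum for numerical stability
    exps = [math.exp(e - peak) for e in energy]
    total = sum(exps)
    return [e / total for e in exps]


def apply_cross_attention(features, other_attention):
    """Reweight one stream's features with the other stream's attention map.

    The residual factor (1 + a) keeps the original signal and only
    amplifies positions the other stream considers important.
    """
    return [[c * (1.0 + a) for c in pos]
            for pos, a in zip(features, other_attention)]


# Toy features: 3 spatial positions, 2 channels per stream (hypothetical values).
spatial = [[1.0, 2.0], [0.5, 0.5], [3.0, 1.0]]
temporal = [[0.0, 0.2], [2.0, 2.0], [0.1, 0.1]]

spatial_att = attention_map(spatial)
temporal_att = attention_map(temporal)

# Each stream is modulated by the attention map of the other stream.
spatial_out = apply_cross_attention(spatial, temporal_att)
temporal_out = apply_cross_attention(temporal, spatial_att)
```

In this toy setup the temporal stream attends most strongly to the second position, so the spatial features at that position receive the largest boost, which is the kind of complementary signal the abstract refers to.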
Evaluation of fringe projection and laser scanning for 3D reconstruction of dental pieces
The rapid prototyping and copying of real 3D objects play a key role in some industries. Both applications rely on the generation of appropriate computer aided manufacturing (CAM) files. These files represent a set of coordinates of an object and can be understood by a computer numerically controlled (CNC) machine. Non-contact techniques, like laser scanning and fringe projection, are among the possibilities for obtaining such CAM files. In this work, a comparison between these two non-contact techniques is presented. The comparison is based on their performance as candidates for generating CAM files of objects of high reflectivity and maximum lateral dimensions of the order of 15 mm. The parameters tested are the quality of the 3D reconstruction, the processing time, and the feasibility of implementing the techniques in industrial scenarios, among others. Under the scope of these parameters, it is concluded that laser scanning offers superior performance for the kind of objects considered here. The techniques are evaluated with dental pieces in order to validate these methodologies in the rapid prototyping and copying of teeth.
MGMAE: Motion Guided Masking for Video Masked Autoencoding
Masked autoencoding has shown excellent performance on self-supervised video
representation learning. Temporal redundancy has led to a high masking ratio
and customized masking strategy in VideoMAE. In this paper, we aim to further
improve the performance of video masked autoencoding by introducing a motion
guided masking strategy. Our key insight is that motion is a general and unique
prior in video, which should be taken into account during masked pre-training.
Our motion guided masking explicitly incorporates motion information to build
a temporally consistent masking volume. Based on this masking volume, we can track
the unmasked tokens in time and sample a set of temporally consistent cubes from
videos. These temporally aligned unmasked tokens will further relieve the
information leakage issue in time and encourage the MGMAE to learn more useful
structure information. We implement our MGMAE with an online efficient optical
flow estimator and backward masking map warping strategy. We perform
experiments on the datasets of Something-Something V2 and Kinetics-400,
demonstrating the superior performance of our MGMAE to the original VideoMAE.
In addition, we provide a visualization analysis to illustrate that our MGMAE
can sample temporally consistent cubes in a motion-adaptive manner for more
effective video pre-training.
Comment: ICCV 2023 camera-ready versio
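The idea of warping a masking map backward along optical flow can be illustrated with a small sketch. It assumes a nearest-neighbour lookup on a per-token grid and a precomputed backward flow field; the function name `warp_mask_backward` and the toy data are hypothetical, and the actual MGMAE implementation may differ in how it estimates flow and samples cubes.

```python
def warp_mask_backward(prev_mask, backward_flow):
    """Build the mask for frame t+1 by sampling the mask of frame t.

    prev_mask: 2D list of 0/1 values (1 = masked) for frame t.
    backward_flow: 2D list of (dx, dy) pairs; for each position in
        frame t+1 it points to the matching position in frame t.
    """
    height, width = len(prev_mask), len(prev_mask[0])
    warped = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = backward_flow[y][x]
            # Nearest-neighbour lookup, clamped to the frame borders.
            src_x = min(max(int(round(x + dx)), 0), width - 1)
            src_y = min(max(int(round(y + dy)), 0), height - 1)
            warped[y][x] = prev_mask[src_y][src_x]
    return warped


# Toy example: content moves one step to the right between frames,
# so every position in frame t+1 looks one step to the left in frame t.
mask_t = [[1, 0, 0],
          [0, 1, 0]]
flow = [[(-1, 0)] * 3 for _ in range(2)]
mask_t1 = warp_mask_backward(mask_t, flow)
```

Backward warping assigns exactly one source value to every target position, so the warped mask has no holes, which forward warping cannot guarantee.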
Segmentation of Medical Image Data and Image-Guided Intraoperative Navigation
The development of algorithms for the automatic or semi-automatic processing of medical image data has become increasingly important in recent years. One reason is the steadily improving medical imaging modalities, which can render the human body virtually in ever finer detail. Another is improved computer hardware, which allows datasets, some of them in the gigabyte range, to be processed algorithmically in a reasonable time. The aim of this habilitation thesis is the development and evaluation of algorithms for medical image processing. The thesis consists of a series of publications organized into three overarching topic areas:
-Segmentation of medical image data using template-based algorithms
-Experimental evaluation of open-source segmentation methods under clinical conditions
-Navigation to support intraoperative therapies
In the area of segmentation of medical image data using template-based algorithms, several graph-based algorithms were developed in 2D and 3D that construct a directed graph by means of a template. These include an algorithm for segmenting vertebrae in 2D and 3D: a rectangular template is used in 2D and a cubic template in 3D to construct the graph and compute the segmentation result. In addition, a graph-based segmentation of prostate glands is presented that uses a spherical template to automatically determine the boundaries between the prostate gland and the surrounding organs. Building on the template-based algorithms, an interactive segmentation algorithm that displays the segmentation result to the user in real time was designed and implemented. The algorithm uses the various templates for segmentation but requires only a single seed point from the user. In a further approach, the user can refine the segmentation interactively by adding seed points. This makes it possible to bring a semi-automatic segmentation to a satisfactory result even in difficult cases.
In the area of evaluation of open-source segmentation methods under clinical conditions, several freely available segmentation algorithms were tested on patient data from clinical routine. This included evaluating the semi-automatic segmentation of brain tumors, for example pituitary adenomas and glioblastomas, with the freely available open-source platform 3D Slicer. The results showed how a purely manual slice-by-slice measurement of tumor volume can be supported and accelerated in practice. Furthermore, the segmentation of language pathways in medical images of brain tumor patients was evaluated on different platforms.
In the area of navigation to support intraoperative therapies, software modules were developed to accompany intraoperative interventions in the different phases of a treatment (therapy planning, execution, follow-up control). These include the first integration of the OpenIGTLink network protocol into the medical prototyping platform MeVisLab, which was evaluated with an NDI navigation system. In addition, the design and implementation of a medical software prototype to support intraoperative gynecological brachytherapy was presented here, also for the first time. The software prototype also included a module for enhanced visualization in MR-guided interstitial gynecological brachytherapy, which, among other things, enabled the registration of a gynecological brachytherapy instrument to an intraoperative dataset of a patient. The individual modules led to the presentation of a comprehensive image-guided system for gynecological brachytherapy in a multimodal operating room. This system covers the pre-, intra-, and postoperative treatment phases of interstitial gynecological brachytherapy.