ImageCLEF 2014: Overview and analysis of the results
This paper presents an overview of the ImageCLEF 2014 evaluation lab. Since its first edition in 2003, ImageCLEF has become one of the key initiatives promoting the benchmark evaluation of algorithms for the annotation and retrieval of images in various domains, ranging from public and personal images to data acquired by mobile robot platforms and medical archives. Over the years, by providing new data collections and challenging tasks to the community of interest, the ImageCLEF lab has achieved a unique position in the image annotation and retrieval research landscape. The 2014 edition consists of four tasks: domain adaptation, scalable concept image annotation, liver CT image annotation and robot vision. This paper describes the tasks and the 2014 competition, giving a unifying perspective of the present activities of the lab while discussing future challenges and opportunities. This work has been partially supported by the tranScriptorium FP7 project under grant #600707 (M. V., R. P.).
Caputo, B.; Müller, H.; Martinez-Gomez, J.; Villegas Santamaría, M.; Acar, B.; Patricia, N.; Marvasti, N.... (2014). ImageCLEF 2014: Overview and analysis of the results. In: Information Access Evaluation. Multilinguality, Multimodality, and Interaction: 5th International Conference of the CLEF Initiative, CLEF 2014, Sheffield, UK, September 15-18, 2014. Proceedings. Springer Verlag (Germany). 192-211. https://doi.org/10.1007/978-3-319-11382-1_18
Experiences from the ImageCLEF Medical Retrieval and Annotation Tasks
The medical tasks in ImageCLEF ran every year from 2004 to 2018, and many different tasks and data sets have been used over these years. The resources created are used by many researchers well beyond the actual evaluation campaigns and allow the performance of many techniques to be compared on the same grounds and in a reproducible way. Many of the larger data sets come from the medical literature, as such images are easier to obtain and to share than clinical data; clinical data was used in a few smaller ImageCLEF challenges that are specifically marked with the disease type and anatomic region. This chapter describes the main results of the various tasks over the years, including data, participants, types of tasks evaluated, and also the lessons learned in organizing such tasks for the scientific community.
Anatomy-specific classification of medical images using deep convolutional nets
Automated classification of human anatomy is an important prerequisite for
many computer-aided diagnosis systems. The spatial complexity and variability
of anatomy throughout the human body makes classification difficult. "Deep
learning" methods such as convolutional networks (ConvNets) outperform other
state-of-the-art methods in image classification tasks. In this work, we
present a method for organ- or body-part-specific anatomical classification of
medical images acquired using computed tomography (CT) with ConvNets. We train
a ConvNet using 4,298 separate axial 2D key-images to learn 5 anatomical
classes. Key-images were mined from a hospital PACS archive, using a set of
1,675 patients. We show that a data augmentation approach can help to enrich
the data set and improve classification performance. Using ConvNets and data
augmentation, we achieve an anatomy-specific classification error of 5.9% and
an average area-under-the-curve (AUC) value of 0.998 in testing. We
demonstrate that deep learning can be used to train very reliable and accurate
classifiers that could initialize further computer-aided diagnosis.
Comment: Presented at the 2015 IEEE International Symposium on Biomedical Imaging, April 16-19, 2015, New York Marriott at Brooklyn Bridge, NY, US.
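The data augmentation step described in the abstract above can be illustrated with a minimal sketch: generating randomly shifted and flipped variants of a 2D key-image. This is a generic NumPy fragment, not the authors' actual pipeline; the function name and parameters are hypothetical.

```python
import numpy as np

def augment(image, n_variants=4, max_shift=6, rng=None):
    """Return randomly shifted/flipped variants of a 2D key-image (illustrative only)."""
    if rng is None:
        rng = np.random.default_rng(0)
    out = []
    for _ in range(n_variants):
        # random translation via circular shift (a crude stand-in for cropping/padding)
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        v = np.roll(np.roll(image, dy, axis=0), dx, axis=1)
        # random horizontal flip
        if rng.random() < 0.5:
            v = np.fliplr(v)
        out.append(v)
    return out

variants = augment(np.zeros((256, 256)))
print(len(variants), variants[0].shape)  # 4 (256, 256)
```

Each variant keeps the original image size, so the enlarged training set can be fed to the same ConvNet input layer.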
Preface: CLEF (Working Notes) 2014
The CLEF 2014 conference is the fifteenth edition of the popular CLEF campaign and workshop series, which has run since 2000, contributing to the systematic evaluation of multilingual and multimodal information access systems, primarily through experimentation on shared tasks. In 2010 CLEF was relaunched in a new format: a conference with research presentations, panels, poster and demo sessions, and laboratory evaluation workshops. These labs are proposed and operated by groups of organizers volunteering their time and effort to define, promote, administrate and run an evaluation activity.
Multimodal Information Retrieval in Medical Imaging Repositories (Recuperação de informação multimodal em repositórios de imagem médica)
The proliferation of digital medical imaging modalities in hospitals and other
diagnostic facilities has created huge repositories of valuable data, often
not fully explored. Moreover, the past few years show a growing trend
of data production. As such, studying new ways to index, process and
retrieve medical images becomes an important subject to be addressed by
the wider community of radiologists, scientists and engineers. Content-based
image retrieval, which encompasses various methods, can exploit the visual
information of a medical imaging archive, and is known to be beneficial to
practitioners and researchers. However, the integration of the latest systems
for medical image retrieval into clinical workflows is still rare, and their
effectiveness still shows room for improvement.
This thesis proposes solutions and methods for multimodal information
retrieval, in the context of medical imaging repositories. The major
contributions are a search engine for medical imaging studies supporting
multimodal queries in an extensible archive; a framework for automated
labeling of medical images for content discovery; and an assessment and
proposal of feature learning techniques for concept detection from medical
images, exhibiting greater potential than feature extraction algorithms that
were pertinently used in similar tasks. These contributions, each in their
own dimension, seek to narrow the scientific and technical gap towards
the development and adoption of novel multimodal medical image retrieval
systems, to ultimately become part of the workflows of medical practitioners,
teachers, and researchers in healthcare.
Programa Doutoral em Informática
Use Case Oriented Medical Visual Information Retrieval & System Evaluation
Large amounts of medical visual data are produced daily in hospitals, while new imaging techniques continue to emerge. In addition, many images are made available continuously via publications in the scientific literature and can also be valuable for clinical routine, research and education. Information retrieval systems are useful tools to provide access to the biomedical literature and fulfil the information needs of medical professionals. The tools developed in this thesis can potentially help clinicians make decisions about difficult diagnoses via a case-based retrieval system based on a use case associated with a specific evaluation task. This system retrieves articles from the biomedical literature when queried with a case description and attached images. This thesis proposes a multimodal approach for medical case-based retrieval with a focus on the integration of visual information connected to text. Furthermore, the ImageCLEFmed evaluation campaign was organised during this thesis, promoting medical retrieval system evaluation.
Image Area Reduction for Efficient Medical Image Retrieval
Content-based image retrieval (CBIR) has been one of the most active areas in medical image analysis in the last two decades because of the steady increase in the number of digital images used. Efficient diagnosis and treatment planning can be supported by developing retrieval systems that help provide high-quality healthcare. Extensive research has attempted to improve image retrieval efficiency. The critical factors when searching large databases are time and storage requirements. In general, although many methods have been suggested to increase accuracy, fast retrieval has been investigated only sporadically. In this thesis, two different approaches are proposed to reduce both time and space requirements for medical image retrieval. The IRMA data set is used to validate the proposed methods. Both methods utilize Local Binary Pattern (LBP) histogram features extracted from 14,410 X-ray images of the IRMA data set. The first method is image folding, which operates on salient regions in an image. Saliency is determined by a context-aware saliency algorithm that guides the folding of the image. After the folding process, the reduced image area is used to extract multi-block and multi-scale LBP features and to classify them with a multi-class support vector machine (SVM). The second method consists of classification and distance-based feature similarity. Images are first classified into general classes using LBP features; retrieval is then performed within the class to locate the most similar images. Between the classification and retrieval steps, LBP features are pruned by employing the error histogram of a shallow (n/p/n) autoencoder to quantify the retrieval relevance of image blocks: if a region is relevant, the autoencoder yields a large decoding error for it, so by examining the autoencoder error of image blocks, irrelevant regions can be detected and eliminated.
To calculate similarity within the general classes, the distance between the LBP features of the relevant regions is computed. The results show that retrieval time can be reduced and storage requirements lowered without a significant decrease in accuracy.
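As an illustration of the LBP features underlying both methods, a basic 8-neighbour LBP histogram can be computed with NumPy alone. This is a simplified stand-in for the multi-block, multi-scale LBP features used in the thesis, and `lbp_histogram` is a hypothetical helper, not code from the work itself.

```python
import numpy as np

def lbp_histogram(img, n_bins=256):
    """Basic 8-neighbour LBP code per pixel, summarised as a normalised histogram."""
    c = img[1:-1, 1:-1]  # centre pixels (border excluded)
    # neighbour offsets, clockwise from top-left; each contributes one bit
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        nb = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        code |= (nb >= c).astype(np.uint8) << bit  # set bit if neighbour >= centre
    hist, _ = np.histogram(code, bins=n_bins, range=(0, n_bins), density=True)
    return hist

img = (np.arange(64).reshape(8, 8) % 7).astype(np.uint8)
hist = lbp_histogram(img)
print(hist.shape, round(hist.sum(), 6))  # (256,) 1.0
```

In a multi-block variant, the image would be partitioned into blocks and one such histogram computed (and concatenated) per block before SVM classification.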
The Liver Tumor Segmentation Benchmark (LiTS)
In this work, we report the set-up and results of the Liver Tumor Segmentation Benchmark (LiTS), organized in conjunction with the IEEE International Symposium on Biomedical Imaging (ISBI) 2016 and the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) 2017. Twenty-four valid state-of-the-art liver and liver tumor segmentation algorithms were applied to a set of 131 computed tomography (CT) volumes with different types of tumor contrast levels (hyper-/hypo-intense), abnormalities in tissue (metastasectomy), and lesions of varying size and number. The submitted algorithms were tested on 70 undisclosed volumes. The dataset was created in collaboration with seven hospitals and research institutions and manually reviewed by three independent radiologists. We found that no single algorithm performed best for both liver and tumors. The best liver segmentation algorithm achieved a Dice score of 0.96 (MICCAI), whereas for tumor segmentation the best algorithms evaluated at 0.67 (ISBI) and 0.70 (MICCAI). The LiTS image data and manual annotations continue to be publicly available through an online evaluation system as an ongoing benchmarking resource.
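The Dice scores reported above measure the overlap between a predicted and a reference segmentation mask. A minimal NumPy version of the metric (an illustrative sketch, not the benchmark's official evaluation code) is:

```python
import numpy as np

def dice(pred, truth):
    """Dice similarity coefficient between two binary masks: 2|A∩B| / (|A|+|B|)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    # convention: two empty masks count as a perfect match
    return 2.0 * inter / denom if denom else 1.0

a = np.array([[1, 1, 0], [0, 1, 0]])
b = np.array([[1, 0, 0], [0, 1, 1]])
print(round(dice(a, b), 4))  # 0.6667
```

A Dice score of 1.0 means perfect overlap and 0.0 means no overlap, so the reported 0.96 (liver) versus 0.70 (tumors) quantifies how much harder the tumor task is.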
Visual search for musical performances and endoscopic videos
This project explores the potential of LIRE, an existing Content-Based Image Retrieval (CBIR) system, when used to retrieve medical videos. These videos are recordings of the live streams used by surgeons during endoscopic procedures, captured from inside the subject. The growth of such video content stored on servers requires search engines capable of assisting surgeons in its management and retrieval. In our tool, queries are formulated by visual examples, which allow surgeons to re-find shots taken during the procedure. This thesis presents an extension and adaptation of LIRE for video retrieval based on visual features and late fusion. The results are assessed from two perspectives: a quantitative and a qualitative one. While the quantitative one follows the standard practices and metrics for video retrieval, the qualitative assessment was based on an empirical social study using a semi-interactive web interface. In particular, a thinking-aloud test was applied to analyze whether user expectations and requirements were fulfilled. Due to the scarcity of surgeons available for the qualitative tests, a second domain was also addressed: videos captured at musical performances. This type of video has also experienced exponential growth with the advent of affordable multimedia smartphones, available to a large audience. Analogously to the endoscopic videos, searching a large data set of such videos is a challenging topic.
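The late-fusion strategy mentioned above can be illustrated at the score level: each visual feature produces its own distances from the query to the database items, and these are normalised and combined before ranking. This is a generic sketch, not LIRE's actual implementation; the function name, min-max normalisation, and uniform weighting are assumptions.

```python
import numpy as np

def late_fusion(distance_lists, weights=None):
    """Fuse per-feature distance lists into one ranking (most similar item first)."""
    D = np.asarray(distance_lists, dtype=float)  # shape: (n_features, n_items)
    # min-max normalise each feature's distances so scales are comparable
    mins = D.min(axis=1, keepdims=True)
    spans = D.max(axis=1, keepdims=True) - mins
    spans[spans == 0] = 1.0  # guard against constant rows
    norm = (D - mins) / spans
    # weighted average across features (uniform weights by default)
    w = np.full(len(D), 1.0 / len(D)) if weights is None else np.asarray(weights, float)
    fused = (w[:, None] * norm).sum(axis=0)
    return np.argsort(fused)  # indices sorted by fused distance, ascending

# two features disagree on item 2 vs item 1; fusion settles the order
ranks = late_fusion([[0.2, 0.9, 0.1], [10.0, 30.0, 20.0]])
print(list(ranks))  # [0, 2, 1]
```

Normalising before averaging matters here: without it, the second feature's larger raw distances would dominate the fused score.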