Search CORE

64 research outputs found

Report on the Evaluation-as-a-Service (EaaS) Expert Workshop

Author: Allan Hanbury
Anastasia Krithara
Georgios Balikas
Frank Hopfgartner
Henning Müller
Ivan Eggel
Jayashree Kalpathy-Cramer
Jimmy Lin
Krisztian Balog
Martin Potthast
Noriko Kando
Iadh Ounis
Simon Mercer
Tim Gollub
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 23/06/2015

In this report, we summarize the outcome of the "Evaluation-as-a-Service" workshop that was held on the 5th and 6th of March 2015 in Sierre, Switzerland. The objective of the meeting was to bring together initiatives that use cloud infrastructures, virtual machines, APIs (Application Programming Interfaces), and related projects that provide evaluation of information retrieval or machine learning tools as a service.

Crossref

Hes-so: ArODES Open Archive (University of Applied Sciences and Arts Western Switzerland / Haute école spécialisée de Suisse occidentale / FH Westschweiz)

Enlighten

Recent trends in digital text forensics and its evaluation

Author: D. De Roure
E. Stamatatos
H. Blockeel
J.S. Downie
J.W. Pennebaker
M. Koppel
M. Wojnarski
P. Clough
P. Juola
S. Argamon
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-40802-1_28. This paper outlines the concepts and achievements of our evaluation lab on digital text forensics, PAN 13, which called for original research and development on plagiarism detection, author identification, and author profiling. We present a standardized evaluation framework for each of the three tasks and discuss the evaluation results of the 58 submitted contributions. For the first time, instead of accepting the output of software runs, we collected the software itself and ran it on a computer cluster at our site. As our evaluation and experimentation platform we use TIRA, which is being developed at the Webis Group in Weimar. TIRA can handle large-scale software submissions by means of virtualization, sandboxed execution, tailored unit testing, and staged submission. In addition to the evaluation results, a major achievement of our lab is that we now have at our disposal, for further analysis, the largest collection of state-of-the-art approaches to these tasks. This work was partially supported by the WIQ-EI IRSES project (Grant No. 269180) within the FP7 Marie Curie action.
Gollub, T.; Potthast, M.; Beyer, A.; Busse, M.; Rangel Pardo, FM.; Rosso, P.; Stamatatos, E.... (2013). Recent trends in digital text forensics and its evaluation. In Information Access Evaluation. Multilinguality, Multimodality, and Visualization. Springer Verlag (Germany). 282-302. https://doi.org/10.1007/978-3-642-40802-1_28

Crossref

RiuNet

Evaluation-as-a-service for the computational sciences: overview and outlook

Author: Balog K.
Brodt T.
Cormack G.V.
Eggel I.
Gollub T.
Hanbury A.
Hopfgartner F.
Kalpathy-Cramer J.
Kando N.
Kato M.P.
Krithara A.
Lin J.
Mercer S.
Muller H.
Potthast M.
Viegas E.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 03/11/2018

Evaluation in empirical computer science is essential to show progress and assess the technologies developed. Several research domains, such as information retrieval, have long relied on systematic evaluation to measure progress: here, the Cranfield paradigm of creating shared test collections, defining search tasks, and collecting ground truth for these tasks has persisted up until now. In recent years, however, several new challenges have emerged that do not fit this paradigm very well: extremely large data sets, confidential data sets as found in the medical domain, and rapidly changing data sets as often encountered in industry. Crowdsourcing has also changed the way in which industry approaches problem-solving, with companies now organizing challenges and handing out monetary awards to incentivize people to work on them, particularly in the field of machine learning. This article is based on discussions at a workshop on Evaluation-as-a-Service (EaaS). EaaS is the paradigm of not providing data sets to participants and having them work on the data locally, but instead keeping the data central and allowing access via Application Programming Interfaces (APIs), Virtual Machines (VMs), or other possibilities to ship executables. The objectives of this article are to summarize and compare the current approaches and to consolidate the experiences gained with them in order to outline the next steps of EaaS, particularly toward sustainable research infrastructures. The article summarizes several existing approaches to EaaS and analyzes their usage scenarios as well as their advantages and disadvantages. The many factors influencing EaaS are summarized, as is the environment in terms of the motivations of the various stakeholders, from funding agencies to challenge organizers, researchers and participants, to industry interested in supplying real-world problems for which they require solutions.
EaaS solves many problems of the current research environment, where data sets are often not accessible to many researchers. Executables of published tools are equally often unavailable, making it impossible to reproduce results. EaaS, by contrast, creates reusable and citable data sets as well as available executables. Many challenges remain, but such a framework for research can also foster more collaboration between researchers, potentially increasing the speed of obtaining research results.
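
The central idea of the abstract above, keeping the data with the organizers and letting participants reach it only through an API, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class name `EvaluationService`, its methods, and the use of accuracy as the measure are hypothetical and are not taken from the article or from any existing EaaS platform.

```python
class EvaluationService:
    """Hypothetical EaaS endpoint: ground truth stays inside the service."""

    def __init__(self, ground_truth):
        # Mapping of item id -> true label; never returned to participants.
        self._ground_truth = dict(ground_truth)

    def item_ids(self):
        """Expose only the ids that participants must label."""
        return sorted(self._ground_truth)

    def evaluate(self, predictions):
        """Score a submitted run and return accuracy only."""
        if not self._ground_truth:
            raise ValueError("the service holds no ground truth")
        correct = sum(
            1
            for item_id, label in self._ground_truth.items()
            if predictions.get(item_id) == label
        )
        return correct / len(self._ground_truth)


# Toy usage: a participant labels every item "relevant" and gets a score back.
service = EvaluationService(
    {"d1": "relevant", "d2": "not_relevant", "d3": "relevant"}
)
run = {item_id: "relevant" for item_id in service.item_ids()}
score = service.evaluate(run)
print(f"accuracy = {score:.3f}")
```

In this toy run two of the three labels match, so the participant learns a single aggregate score and never sees the labels themselves. A real service would also need authentication, submission limits, and sandboxing, none of which is sketched here.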

White Rose Research Online

Experiences from the ImageCLEF Medical Retrieval and Annotation Tasks

Author: A Depeursinge
A Hanbury
Allan Hanbury
BH Menze
CV Thornley
H Müller
Henning Müller
Jayashree Kalpathy-Cramer
M Krenn
O Jimenez-del-Toro
P Clough
Paul Clough
R Buyya
T Gollub
T Heimann
T Tommasi
Theodora Tsikrika
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 26/08/2019

The medical tasks in ImageCLEF were run every year from 2004 to 2018, and many different tasks and data sets have been used over these years. The resources created are used by many researchers well beyond the actual evaluation campaigns and allow the performance of many techniques to be compared on the same grounds and in a reproducible way. Many of the larger data sets are from the medical literature, as such images are easier to obtain and to share than clinical data, which was used in a few smaller ImageCLEF challenges that are specifically marked with the disease type and anatomic region. This chapter describes the main results of the various tasks over the years, including data, participants, and types of tasks evaluated, as well as the lessons learned in organizing such tasks for the scientific community.

University of Essex Research Repository

Crossref

Hes-so: ArODES Open Archive (University of Applied Sciences and Arts Western Switzerland / Haute école spécialisée de Suisse occidentale / FH Westschweiz)

Report from Dagstuhl Seminar 23031: Frontiers of Information Access Experimentation for Research and Education

Author: Bauer Christine
Carterette Ben
Faggioli Guglielmo
Ferro Nicola
Fuhr Norbert
Publication venue
Publication date: 01/01/2023

This report documents the program and the outcomes of Dagstuhl Seminar 23031 "Frontiers of Information Access Experimentation for Research and Education", which brought together 37 participants from 12 countries. The seminar addressed technology-enhanced information access (information retrieval, recommender systems, natural language processing) and specifically focused on developing more responsible experimental practices leading to more valid results, both for research and for scientific education. The seminar brought together experts from various sub-fields of information access, namely IR, RS, NLP, information science, and human-computer interaction, to create a joint understanding of the problems and challenges presented by next-generation information access systems, from both the research and the experimentation points of view, to discuss existing solutions and impediments, and to propose next steps to be pursued in the area in order to improve not only our research methods and findings but also the education of the new generation of researchers and developers. The seminar featured a series of long and short talks delivered by participants, which helped to set a common ground and to let topics of interest emerge for exploration as the main output of the seminar. This led to the definition of five groups, which investigated challenges, opportunities, and next steps in the following areas: reality check, i.e. conducting real-world studies; human-machine-collaborative relevance judgment frameworks; overcoming methodological challenges in information retrieval and recommender systems through awareness and education; results-blind reviewing; and guidance for authors. Comment: Dagstuhl Seminar 23031, report

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Geographic information extraction from texts

Author: Hu Xuke
Hu Yingjie
Kersten Jens
Resch Bernd
Publication venue
Publication date: 05/12/2023

A large volume of unstructured texts containing valuable geographic information is available online. This information, provided implicitly or explicitly, is useful not only for scientific studies (e.g., spatial humanities) but also for many practical applications (e.g., geographic information retrieval). Although considerable progress has been made in geographic information extraction from texts, there are still unsolved challenges and issues, ranging from methods, systems, and data to applications and privacy. Therefore, this workshop will provide a timely opportunity to discuss recent advances, new ideas, and concepts, and also to identify research gaps in geographic information extraction.

Institute of Transport Research:Publications

Resting state fMRI experimental and analytical methodology: a functional connectivity analysis

Author: Fernandes Catarina Dinis
Publication venue
Publication date: 01/01/2013

Integrated master's thesis in Biomedical Engineering and Biophysics, presented to the Universidade de Lisboa, through the Faculdade de Ciências, 2013. Human beings have always been fascinated by the study of their own bodies and of their functional properties. From the desire to understand and explore the human body came techniques that allow it to be studied non-invasively. Among the first imaging techniques are X-rays, computed axial tomography (CAT), and positron emission tomography (PET). All of them, however, use ionizing radiation, which created the desire to develop new methodologies that are equally non-invasive but do not use any ionizing radiation. Among these is magnetic resonance imaging (MRI), which can be used to study anatomical structures and also their functional mechanisms through functional magnetic resonance imaging (fMRI). Unlike techniques that use ionizing radiation, MRI takes advantage of the fact that the human body consists mostly of water. An adult human typically consists of about 70 to 80% water (H2O), which translates into a great abundance of protons (1H nuclei). When subjected to a strong magnetic field, the magnetic moment of these particles tends to align with the direction of the external magnetic field (B0). Once aligned, the protons are subjected to a radiofrequency pulse (with a frequency equal to the Larmor frequency of these particles), which is absorbed and modifies the magnetic moment (i.e., spin) of the protons.
When this pulse is switched off, the proton spins return to thermodynamic equilibrium, along the direction of the magnetic field B0, emitting energy in the form of radiofrequency (RF). These relaxation mechanisms differ according to the water content of the tissues, and it is these differences that allow tissue structure to be identified. Magnetic field gradients are also used to create slight differences in the magnetic field that allow the signal to be encoded with spatial information. MRI is now part of hospital routine, providing images with high precision and anatomical resolution. However, structural information is not always sufficient to study pathologies that show no anatomical differences, such as depression or schizophrenia. This is where functional MRI comes in, using the blood-oxygenation-level-dependent (BOLD) signal as an indirect measure of neuronal activation. With this technique it is possible to map the brain areas responsible for processing signals such as visual, tactile, or auditory stimuli. Examples include the study of disorders such as autism or even disorders of consciousness. Clinically, fMRI is used to map critical functions such as speech, movement, and task planning. The technique gives health professionals the chance to develop better surgical planning, and it is also applied in planning brain radiotherapy treatments, in order to map the brain functionally and detect the effects that tumours, strokes, and brain lesions may have on the reorganization of its functions.
Until very recently, the great majority of the available information about anatomical brain connectivity came strictly from studies in primates using extremely invasive techniques (Felleman, Van Essen 1991, Jones, Powell 1970, Mesulam 2000, Ungerleider, Haxby 1994) and from lesion studies in human cases (e.g., Geschwind 1965). Friston (Friston et al. 1993), using PET, and Biswal (Biswal et al. 1995), using fMRI, were the first to show that, beyond the anatomical connections between different brain structures, it is also possible to identify functional connections between regions that at first sight appear to have no connection at all. The technique that uses MRI to study functional connectivity was named functional connectivity MRI (fcMRI). It uses fMRI and the low-frequency oscillations of the BOLD signal in each voxel to establish correlations. Based on the idea that two areas can be called functionally related if they take part in the same process, it can be assumed that the variations in their BOLD signals will be very similar and therefore highly correlated. As an example, consider two regions of the primary motor cortex that are located in opposite hemispheres and nevertheless show highly correlated BOLD signals. With this idea in mind, the concept of functional networks was developed; these are usually studied during periods of rest. Precisely in this condition, a functional network was found that is extremely consistent across individuals, and even across different states such as sleep or anaesthesia. This network was named the "default-mode network" (Raichle et al. 2001), and it includes regions of the posterior cingulate cortex, the precuneus, and the medial prefrontal cortex.
The default-mode network is the most studied network, but there are others, such as the visual, auditory, executive control, and attention networks. These networks are frequently disrupted or altered in disease. The projects described in this dissertation focus on the study of these networks and of their properties in disease (disorders of consciousness, stroke) and during physical activity. To study these functional networks, different methods were used to compute functional connectivity. Among the best-known methods are region-of-interest analysis, independent component analysis, and methods that compute brain connectivity at a global level. Region-of-interest methods compute the connectivity between that region and the rest of the brain using correlation measures. The second method separates the various neuronal networks by maximizing their statistical independence. Finally, global analysis methods compute the correlation of each voxel's time series with those of all other voxels in the brain. The author's contribution to the studies described in this dissertation focused on the use of two of these techniques, "seed-based analysis" and "wGBC", to compute brain connectivity in each of the projects. The first project, described in chapter 3, presents several paradigms that, together with fMRI, were designed to detect consciousness and awareness in patients with disorders of consciousness. These paradigms were tested in a group of healthy volunteers to check whether they are adequate or need to be optimized.
The author was responsible for carrying out individual and group analyses of the activation induced by these paradigms. The development of paradigms suited to these patients, combined with fMRI, complements and improves their diagnosis and prognosis. In chapter 4, the author focused on the analysis of functional connectivity in patients diagnosed with minor stroke, migraine, and transient ischaemic attacks (TIAs). This work used region-of-interest connectivity techniques and global measures of functional connectivity. The aim of the study is again to determine whether including a functional connectivity sequence could facilitate the diagnosis and prognosis of these patients. In chapter 5, the author studies the differences in functional connectivity induced by a single session of physical exercise. Region-of-interest connectivity techniques are again used, along with other methods implemented by other researchers in the department. The dissertation also includes a chapter analysing the properties of these neuronal networks in a healthy population. It is important that both the acquisition conditions for fMRI data and the analysis methodologies be well established, so that data from different studies are comparable and reliable conclusions can be drawn about healthy and patient populations. The concept of rest is still highly variable, particularly when participants are simply asked to remain calm and still. Some studies require participants to keep their eyes closed, others to keep them open, and others to fixate on an image projected on a screen.
This experimental design can give rise to a wide variety of states, ranging from simple mind-wandering around whatever subject happens to be most strongly in mind to falling asleep. To study these variations, chapter 6 investigates the brain connectivity resulting from two different situations, as well as its variability. In this chapter the author studied the reproducibility and reliability of these functional brain networks when participants are asked to perform a task of low cognitive demand. The analysis was carried out by computing correlations between time series and analysing them statistically, using measures such as the intraclass correlation coefficient, which provides an estimate of reproducibility across different measurements. This work resulted in one oral presentation and one poster presentation. The results were generally positive but in some cases quite ambiguous. The most recent publications show interest in studying not only the spatial distribution of these networks but also their temporal properties, which appear to be extremely dynamic. This leaves the way open for continued exploration of functional brain networks and their variability. As a final note, we consider it important to stress that the broad study of brain connectivity and its mechanisms is a research area little more than a decade old, with a long way still to go. Conventional functional magnetic resonance imaging (fMRI) is used to measure small fluctuations in the blood oxygenation level dependent (BOLD) signal resulting from neural activation due to an external stimulus or task. Nonetheless, this imaging technique can also be applied to the study of functional connectivity in the human brain.
Since it was first acknowledged that BOLD signal fluctuations also occur during resting periods, increased attention has been directed to the investigation of brain behaviour during this particular state. There is still an ongoing debate as to whether these fluctuations actually reflect neuronal baseline activity or are just the result of physiological metabolism and therefore independent of neuronal function. Also, can this resting state activity truly be called a "baseline" for comparisons? Moreover, functional connectivity has identified several networks, of which the default mode network is the most robust. This network is believed to be of great importance in brain awareness and cognition. Further research is crucial to correctly understand these events and also to create a standardised methodology for performing resting state fMRI acquisitions. The RESTATE (Resting State Techniques) project arises from the need to comprehend and correctly interpret the low-frequency BOLD oscillations measured during resting periods. With this longitudinal study, comprising a baseline and a follow-up scan, we aim to assess the implications of using a low cognitive level paradigm upon the reproducibility of the data during functional connectivity analysis.
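
As a rough illustration of the seed-based (region-of-interest) analysis mentioned in this abstract, the Python sketch below correlates one seed time series with the time series of a few other voxels. It is only a toy under stated assumptions: the function names, the voxel labels, and the data values are invented for illustration, and the thesis's actual pipeline (preprocessing, filtering, statistics) is not described in the abstract and is not reproduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Illustrative convention: a constant series has no defined
        # correlation, so report 0.0 instead of dividing by zero.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def seed_connectivity(seed_series, voxel_series):
    """Correlate the seed time series with every other voxel's series."""
    return {
        voxel: pearson(seed_series, series)
        for voxel, series in voxel_series.items()
    }


# Made-up BOLD-like values, four time points per series (hypothetical data).
seed = [1.0, 2.0, 3.0, 4.0]
voxels = {
    "v1": [2.0, 4.0, 6.0, 8.0],  # rises with the seed
    "v2": [4.0, 3.0, 2.0, 1.0],  # falls as the seed rises
    "v3": [1.0, 1.0, 1.0, 1.0],  # constant signal
}
connectivity = seed_connectivity(seed, voxels)
for voxel, r in connectivity.items():
    print(voxel, round(r, 3))
```

Voxels whose correlation with the seed is high would be read as functionally connected to it. Global measures such as the "wGBC" named in the abstract instead aggregate correlations over all voxel pairs, which this sketch does not attempt.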

Universidade de Lisboa: Repositório.UL

Digital History and Hermeneutics

Author
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 06/07/2022
Field of study

For doing history in the digital age, we need to investigate the “digital kitchen” as the place where the “raw” is transformed into the “cooked”. The novel field of digital hermeneutics provides a critical and reflexive frame for digital humanities research by acquiring digital literacy and skills. The Doctoral Training Unit "Digital History and Hermeneutics" is applying this new digital practice by reflecting on digital tools and methods

Directory of Open Access Books (DOAB)

Biclustering electronic health records to unravel disease presentation patterns

Author: Matos Joana Sofia Santos de
Publication venue
Publication date: 01/01/2019
Field of study

Master's thesis, Data Science, Universidade de Lisboa, Faculdade de Ciências, 2019.
Amyotrophic Lateral Sclerosis (ALS) is a heterogeneous neurodegenerative disease with highly variable presentation patterns. Given the heterogeneous nature of ALS patients, at diagnosis clinicians usually estimate disease progression using a rate of functional decline, computed from the Revised ALS Functional Rating Scale (ALSFRS-R). Machine Learning models that can handle these complex patterns are needed to understand the disease and to improve patient care and survival. These models must be explainable so that clinicians can make informed decisions. Our goal is therefore to discover disease presentation patterns, and to that end we propose a new Data Mining approach, Discriminative Meta-features Discovery (DMD), which combines Biclustering, Biclustering-based Classification and Class Association Rule Mining. These patterns (called Meta-features) consist of discriminative subsets of features together with their values, making it possible to distinguish and characterize subgroups of patients with similar disease presentation patterns. The Electronic Health Records (EHR) used in this work come from the JPND ONWebDUALS (ONTology-based Web Database for Understanding Amyotrophic Lateral Sclerosis) dataset, composed of standardized questions about risk factors, genetic mutations, clinical features and survival information for a cohort of patients and controls followed by the ENCALS (European Network to Cure ALS) consortium, which spans several European countries, including Portugal. 
In this thesis the proposed methodology was applied to the Portuguese part of the ONWebDUALS dataset to find disease presentation patterns that: 1) distinguish ALS patients from their controls and 2) characterize groups of ALS patients with different progression rates (categorized into Slow, Neutral and Fast groups). No coherent pattern emerged from the experiments performed for the first task. For the second task, however, the patterns found for each of the three progression groups were recognized and validated by clinicians specialized in ALS as relevant characteristics of patients with Slow, Neutral and Fast progression. These results suggest that our generic Biclustering-based approach has the potential to identify presentation patterns in other similar problems or diseases.
Amyotrophic Lateral Sclerosis (ALS) is a heterogeneous neurodegenerative disease with a high variability of presentation patterns. Given the heterogeneous nature of ALS patients and targeting a better prognosis, clinicians usually estimate disease progression at diagnosis using the rate of decay computed from the Revised ALS Functional Rating Scale (ALSFRS-R). In this context, the use of Machine Learning models able to unravel the complexity of disease presentation patterns is paramount for disease understanding, targeting improved patient care and longer survival times. Furthermore, explainable models are vital, since clinicians must be able to understand the reasoning behind a given model’s result before making a decision that can impact a patient’s life. Therefore we aim at unravelling disease presentation patterns by proposing a new Data Mining approach called Discriminative Meta-features Discovery (DMD), which uses a combination of Biclustering, Biclustering-based Classification and Class Association Rule Mining. 
These patterns (called Meta-features) are composed of discriminative subsets of features together with their values, making it possible to distinguish and characterize subgroups of patients with similar disease presentation patterns. The Electronic Health Record (EHR) data used in this work come from the JPND ONWebDUALS (ONTology-based Web Database for Understanding Amyotrophic Lateral Sclerosis) dataset, comprising standardized questionnaire answers regarding risk factors, genetic mutations, clinical features and survival information from a cohort of patients and controls from ENCALS (European Network to Cure ALS), a consortium of diverse European countries, including Portugal. In this work the proposed methodology was used on the ONWebDUALS Portuguese EHR data to find disease presentation patterns that: 1) distinguish the ALS patients from their controls and 2) characterize groups of ALS patients with different progression rates (categorized into Slow, Neutral and Fast groups). No clear pattern emerged from the experiments performed for the first task. However, in the second task the patterns found for each of the three progression groups were recognized and validated by ALS expert clinicians as relevant characteristics of slow, neutral and fast progressing patients. These results suggest that our generic Biclustering approach is a promising way to unravel disease presentation patterns and could be applied to similar problems and other diseases
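
A pattern made of feature values tied to a class, as in the Meta-features this abstract describes, is usually scored in class association rule mining by its support and confidence. The sketch below shows only that generic scoring step. It is not the DMD method, and the function name, feature names, values and labels are hypothetical placeholders, not data from the thesis.

```python
def rule_stats(records, labels, pattern, target):
    """Support and confidence of the rule `pattern -> target`.

    records: list of dicts mapping feature name to value
    labels:  class label of each record (e.g. a progression group)
    pattern: dict of feature/value pairs that must all match
    """
    # Labels of the records that contain every feature/value in the pattern.
    matched = [
        label
        for record, label in zip(records, labels)
        if all(record.get(f) == v for f, v in pattern.items())
    ]
    hits = sum(1 for label in matched if label == target)
    support = hits / len(records)          # share of all records
    confidence = hits / len(matched) if matched else 0.0
    return support, confidence


# Hypothetical placeholder records; not clinical data.
records = [
    {"feature_a": "x", "feature_b": "p"},
    {"feature_a": "x", "feature_b": "q"},
    {"feature_a": "y", "feature_b": "q"},
    {"feature_a": "y", "feature_b": "p"},
]
labels = ["Fast", "Fast", "Slow", "Fast"]

support, confidence = rule_stats(records, labels, {"feature_a": "x"}, "Fast")
```

A pattern is discriminative for a group when its confidence for that group is high while its support stays large enough to be more than a chance match. The thresholds for both are a design choice that the abstract does not specify.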

Universidade de Lisboa: Repositório.UL