1,495 research outputs found

    Computing Healthcare Quality Indicators Automatically: Secondary Use of Patient Data and Semantic Interoperability

    Harmelen, F.A.H. van [Promotor]; Keizer, N.F. de [Copromotor]; Cornet, R. [Copromotor]; Teije, A.C.M. [Copromotor]

    The SciQA Scientific Question Answering Benchmark for Scholarly Knowledge

    Knowledge graphs have gained increasing popularity in science and technology over the last decade. However, knowledge graphs are currently relatively simple to moderately complex semantic structures that are mainly a collection of factual statements. Question answering (QA) benchmarks and systems have so far mainly been geared towards encyclopedic knowledge graphs such as DBpedia and Wikidata. We present SciQA, a scientific QA benchmark for scholarly knowledge. The benchmark leverages the Open Research Knowledge Graph (ORKG), which includes almost 170,000 resources describing research contributions of almost 15,000 scholarly articles from 709 research fields. Following a bottom-up methodology, we first manually developed a set of 100 complex questions that can be answered using this knowledge graph. Furthermore, we devised eight question templates with which we automatically generated a further 2,465 questions that can also be answered with the ORKG. The questions cover a range of research fields and question types and are translated into corresponding SPARQL queries over the ORKG. Based on two preliminary evaluations, we show that the resulting SciQA benchmark represents a challenging task for next-generation QA systems. This task is part of the open competitions at the 22nd International Semantic Web Conference 2023 as the Scholarly Question Answering over Linked Data (QALD) Challenge
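
    Since each benchmark question is paired with a SPARQL query over the ORKG, answers can in principle be retrieved programmatically from the graph's public SPARQL endpoint. The sketch below is a minimal, hypothetical illustration of that workflow using the SPARQLWrapper library; the endpoint URL and the toy query are assumptions for illustration and are not taken from the SciQA benchmark itself.

```python
# Hypothetical sketch: issue a SPARQL query against the ORKG triplestore.
# The endpoint URL and the query are illustrative assumptions, not SciQA content.
from SPARQLWrapper import SPARQLWrapper, JSON

ORKG_ENDPOINT = "https://orkg.org/triplestore"  # assumed public SPARQL endpoint

sparql = SPARQLWrapper(ORKG_ENDPOINT)
sparql.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?resource ?label
    WHERE { ?resource rdfs:label ?label . }
    LIMIT 10
""")
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["label"]["value"])
```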

    Initiating Self-Management of Lifestyle-Related Chronic Disease Resulting from Unhealthy Weight: Identifying Thoughts and Actions through Phenomenology

    A heuristic phenomenological approach drew out participant-identified factors that initiated self-management of unhealthy weight-related chronic diseases, resulting in attenuation of the severity of current chronic disease(s) and lowered risk of developing comorbidities. Prior to this study, there had been three omissions relating to older individuals who have unhealthy weight-related chronic diseases. First, direct first-person narratives of the process of arriving at self-management had not been reported. Second, self-identification of the initiating factors leading to self-management had not been found in the scholarly literature. Finally, cognitive reflection, an activity that results in paradigmatic shifts in adult behaviors, had not been discussed from the first-person perspective (Cranton, 2006; Taylor, 2009; Taylor & Cranton, 2012). Narratives of 10 participants resulted in understanding the challenges of managing chronic health conditions and the factors associated with self-management, from these first-person perspectives. Two themes of factors that initiated the self-management process were developed, emanating from each participant's conscious acknowledgment and acceptance of responsibility. The themes identified were (a) acknowledgment of the significant, 'last straw' diagnosis and (b) conscientiously accepting physical limitations resulting from the respective chronic disease(s). It has been important to acknowledge the need for such a study from societal and personal perspectives. First, because individuals are living longer, the risk of developing chronic diseases such as hypertension, type-2 diabetes, and cardiovascular disease is further enhanced if the individual has unhealthy weight (Ginsberg & MacCallum, 2009; Morrell, Lofgren, Burke, & Reilly, 2012). Voluntary lifestyle choices of poor nutrition and lack of routine physical activity significantly contribute to the etiology of unhealthy weight, while healthy weight can significantly reduce this risk and sustain and increase quality of life as aging proceeds (Centers for Disease Control and Prevention, 2005). While the necessity of maintaining a healthy weight through aging is well documented (Centers for Disease Control and Prevention, 2014; Clark & Brancati, 2000), statistics continue to reflect increasing numbers of older individuals succumbing to unhealthy weight and concomitant chronic disease (Kart, Metress, & Metress, 1992). This study obtained information directly from participants who had experienced a paradigmatic shift to self-management of weight and concomitant chronic diseases, as evidenced by participation in self-management programs and reductions of risks associated with weight and other comorbidities. Delimitation to a small homogeneous population segment does not preclude the value of the study. Limitations include employing heuristic phenomenology and a non-validated data collection instrument. However, this study provides a novel approach to gathering data from the first-person perspective. A future study should focus on the transformative learning process of transformative learning theory within the first phase of the transtheoretical model of the stages of change to better understand the self-identified factors and how they initiated the self-management process. The efforts extended in this study should be of interest to individuals engaged in adult learning, healthcare, and public health

    Semantic Segmentation of Ambiguous Images

    Medical images can be difficult to interpret. Not only because recognizing structures and possible changes requires experience and years of training, but also because the measurements they depict are often inherently ambiguous. Fundamentally, this is a consequence of the fact that medical imaging modalities, such as MRI or CT, provide only indirect measurements of the underlying molecular identities. The semantic meaning of an image can therefore generally only be grasped given a larger image context, which, however, often does not suffice for an unambiguous interpretation in the form of a single hypothesis. Similar scenarios exist in natural images, where the context information needed to resolve ambiguities can be limited, for example due to occlusions or noise in the acquisition. In addition, overlapping or vague class definitions can lead to ill-posed or diverse solution spaces. The presence of such ambiguities can also impair the training and performance of machine learning methods. Moreover, current models are largely unable to provide complex, structured, and diverse predictions, and are instead forced to restrict themselves to sub-optimal single solutions or indistinguishable mixtures. This can be particularly problematic when classification methods are scaled up to pixel-wise predictions, as in semantic segmentation. Semantic segmentation is concerned with assigning a class category to every pixel in an image. This kind of detailed image understanding also plays an important role in the diagnosis and treatment of diseases such as cancer: tumors are frequently detected in MRI or CT images, and their precise localization and segmentation is of great importance for their assessment, for the preparation of possible biopsies, or for the planning of focal therapies. These clinical image-analysis tasks, but also the visual perception of our environment during everyday activities such as driving, are currently carried out by humans. As machine learning methods are increasingly integrated into our decision-making processes, it is important to model these tasks adequately. This includes uncertainty estimates for model predictions, among them the uncertainties that can be attributed to ambiguities in the image. This thesis proposes several ways of dealing with ambiguous image evidence. First, we examine the current clinical standard, which, in the case of prostate lesions, consists of subjectively rating MRI-visible lesions for their aggressiveness, a process that comes with high inter-rater variability. Our studies indicate that even simple machine learning methods, and even simple quantitative MRI-based parameters, can outperform an individual, subjective expert, suggesting promising potential in quantifying this process. Furthermore, we put the currently most successful segmentation architecture to the test on a highly ambiguous dataset that was acquired and annotated during clinical routine. Our experiments show that the standard segmentation loss function can be sub-optimal in scenarios with strong annotation noise. As an alternative, we explore the possibility of learning a model of the loss function with the aim of allowing plausible solutions to coexist during training. We observe increased performance when using this training method for otherwise unchanged neural network architectures, and find further increased relative improvements in the limit of little data. Scarcity of data and annotations, high levels of image and annotation noise, and ambiguous image evidence are particularly common in medical imaging datasets. This part of the thesis therefore exposes some of the weaknesses that standard machine learning techniques can exhibit in light of these peculiarities. Current segmentation models, such as those considered above, are limited in that they can provide only a single prediction. This contrasts with the observation that a group of annotators, given ambiguous image data, typically produces a set of diverse but plausible annotations. To remedy this model limitation and to enable an appropriate probabilistic treatment of the task, we develop two models that predict a distribution over plausible annotations instead of a single deterministic annotation. The first of the two models combines an encoder-decoder model with variational inference and uses a global latent vector that encodes the space of possible annotations for a given image. We show that this model performs considerably better than the reference methods and exhibits well-calibrated uncertainties. The second model improves on this approach by using a more flexible, hierarchical formulation that allows the variability of the segmentations to be captured at different scales. This increases the granularity of the segmentation details the model can produce and makes it possible to model independently varying image regions and scales. Both of these novel generative segmentation models make it possible to produce diverse and coherent image segmentations where appropriate, in contrast to previous work that is either deterministic, models uncertainties only at the pixel level, or suffers from representing inappropriately low diversity. In summary, this thesis addresses the application of machine learning to the interpretation of medical images: we show that the clinical standard can be improved by quantitative use of image parameters that currently enter diagnoses only subjectively, we show the potential benefit of a new training procedure for mitigating the apparent vulnerability of the standard segmentation loss function to strong annotation noise, and we propose two new probabilistic segmentation models that can accurately learn the distribution over appropriate annotations. These contributions can be seen as steps towards a more quantitative, more principled, and uncertainty-aware analysis of medical images, an important goal in view of the ongoing integration of learning-based systems into clinical workflows
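
    The first generative model described in the abstract combines an encoder-decoder segmentation network with variational inference through a global latent vector. The following is a minimal, hypothetical PyTorch sketch of that general idea, heavily simplified relative to the thesis: a small prior network predicts a Gaussian over a latent code from the image, a sample of that code is broadcast and concatenated with the image features, and a 1x1 convolution produces segmentation logits. All layer sizes, names, and the toy input are illustrative assumptions.

```python
# Minimal, hypothetical sketch of a segmentation model with a global latent code,
# in the spirit of "encoder-decoder + variational inference"; not the thesis code.
import torch
import torch.nn as nn

class LatentSegNet(nn.Module):
    def __init__(self, in_ch=1, num_classes=2, latent_dim=6):
        super().__init__()
        # Feature backbone (stands in for a full encoder-decoder such as a U-Net).
        self.features = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        )
        # Prior network: predicts mean and log-variance of the global latent code.
        self.prior = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, 2 * latent_dim),
        )
        # Combines image features with the broadcast latent sample.
        self.head = nn.Conv2d(32 + latent_dim, num_classes, 1)

    def forward(self, x):
        feats = self.features(x)
        mu, logvar = self.prior(x).chunk(2, dim=1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        z_map = z[:, :, None, None].expand(-1, -1, feats.shape[2], feats.shape[3])
        return self.head(torch.cat([feats, z_map], dim=1))  # segmentation logits

# Sampling the latent code repeatedly yields a set of plausible segmentations.
model = LatentSegNet()
image = torch.randn(1, 1, 64, 64)
samples = [model(image).argmax(dim=1) for _ in range(4)]
```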

    I Want an Omnipotent Doctor: North Korean Defectors’ Unmet Expectations of South Korean Medical Providers

    This study examines North Korean defectors’ unmet expectations of South Korean medical providers from the perspectives of both the defectors and their medical providers. Seventeen defectors and 12 medical providers were recruited for focus groups and in-depth interviews. Grounded theory was used for data analysis. The data indicate that the North Korean defectors were not satisfied with their providers because they (1) preferred human techniques over computerized technology, (2) expected the doctors to be omnipotent, and (3) expected to receive emergency medical service but did not expect to pay for it. Their medical providers felt that it was impossible to satisfy the defectors because the defectors expected to (1) receive medical services based on self-diagnosis and/or nonmedical personal needs, (2) have the doctor listen to their stories, and (3) receive medical services without booking an appointment. The findings of this study suggest that more effort toward mutual understanding and effective communication is urgently needed from both providers and defectors

    The Benefits and Risks of Telling and Listening to Stories of Difficulty Over Time: Experimentally Testing the Expressive Writing Paradigm in the Context of Interpersonal Communication Between Friends

    The overarching goal of the current study was to determine the impact of talking interpersonally over time on emerging adults’ individual and relational health. Using an expressive writing study design (see Frattaroli, 2006), we assessed the degree to which psychological health improved over time for college students who told and listened to stories about friends’ current difficulties in comparison with tellers in control conditions. We also investigated the effects on tellers’ and listeners’ perceptions of each other’s communication competence, communicated perspective taking, and the degree to which each threatened the other’s face during the interaction over time, to better understand the interpersonal communication complexities associated with talking about difficulty over time. After completing prestudy questionnaires, 49 friend pairs engaged in three interpersonal interactions over the course of 1 week wherein one talked about and one listened to a story of difficulty (treatment) or daily events (control). All participants completed a poststudy questionnaire 3 weeks later. Tellers’ negative affect decreased over time for participants in the treatment condition, although life satisfaction increased and positive affect decreased across time for participants regardless of condition. Perceptions of friends’ communication abilities decreased significantly over time for tellers. The current study contributes to the literature on expressive writing and social support by shedding light on the interpersonal implications of talking about difficulty, the often overlooked effects of disclosure on listeners, and the effects of talking about problems on college students’ health

    Computational analysis of human genomic variants and lncRNAs from sequence data

    High-throughput sequencing technologies have been developed and applied to human genome studies for nearly 20 years. These technologies have provided numerous research applications and have significantly expanded our knowledge about the human genome. In this thesis, computational methods that utilize sequence data to study human genomic variants and transcripts were evaluated and developed. Indels, insertions and deletions, are two types of common genomic variants that are widespread in the human genome. Detecting indels from human genomes is a crucial step for diagnosing indel-related genomic disorders and may potentially identify novel indel markers for studying certain diseases. Compared with previous techniques, high-throughput sequencing technologies, especially next-generation sequencing (NGS), enable indels to be detected accurately and efficiently across wide ranges of the genome. In the first part of the thesis, tools with indel-calling abilities are evaluated with an assortment of indels and different NGS settings. The results show that the selection of tools and NGS settings impacts indel detection significantly, which provides suggestions for tool selection and future developments. In bioinformatics analysis, an indel's position can be marked inconsistently on the reference genome, which may result in an indel having different but equivalent representations and cause trouble for downstream analyses. This problem is related to the complex sequence context of indels, for example short tandem repeats (STRs), where the same short stretch of nucleotides is amplified. In the second part of the thesis, a novel computational tool, VarSCAT, is described, which has various functions for annotating the sequence context of variants, including ambiguous positions, STRs, and other sequence context features. Analysis of several high-confidence human variant sets with VarSCAT reveals that a large number of genomic variants, especially indels, have sequence features associated with STRs. In the human genome, not all genes and their transcripts are translated into proteins; long non-coding ribonucleic acid (lncRNA) is a typical example. Sequence recognition built with machine learning models has improved significantly in recent years. In the last part of the thesis, several machine learning-based lncRNA prediction tools were evaluated on their predictions of the coding potential of transcripts. The results suggest that tools based on deep learning identify lncRNAs best
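
    The problem of inconsistent indel positions mentioned above arises because, inside a tandem repeat, an indel can be shifted left or right without changing the resulting sequence, so different callers may report the same event at different coordinates. Below is a minimal, hypothetical Python sketch of standard left-alignment for VCF-style (position, REF, ALT) records; it is not part of VarSCAT and ignores edge cases such as variants at the chromosome start.

```python
def left_align(chrom_seq, pos, ref, alt):
    """Left-align an indel. `chrom_seq` is the chromosome sequence, `pos` is the
    1-based VCF position, and `ref`/`alt` are the VCF REF/ALT allele strings."""
    changed = True
    while changed:
        changed = False
        # If both alleles end with the same base, truncate that base.
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # If an allele became empty, extend both alleles one base to the left.
        if (not ref or not alt) and pos > 1:
            base = chrom_seq[pos - 2]
            ref, alt = base + ref, base + alt
            pos -= 1
            changed = True
    # Trim a shared leading base while both alleles keep at least two bases.
    while len(ref) >= 2 and len(alt) >= 2 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt

# Two equivalent reports of the same CA deletion normalize to one representation.
seq = "GGGCACACACAGGG"
print(left_align(seq, 7, "ACA", "A"))  # -> (3, 'GCA', 'G')
print(left_align(seq, 9, "ACA", "A"))  # -> (3, 'GCA', 'G')
```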

    Human-in-the-Loop Learning From Crowdsourcing and Social Media

    Computational social studies using public social media data have become more and more popular because of the large amount of user-generated data available. The richness of social media data, coupled with its noise and subjectivity, raises significant challenges for computationally studying social issues in a feasible and scalable manner. Machine learning problems are, as a result, often subjective or ambiguous when humans are involved. That is, humans solving the same problems might come to legitimate but completely different conclusions, based on their personal experiences and beliefs. When building supervised learning models, particularly when using crowdsourced training data, multiple annotations per data item are usually reduced to a single label representing ground truth. This inevitably hides a rich source of diversity and subjectivity of opinions about the labels. Label distribution learning associates with each data item a probability distribution over the labels for that item; it can thus preserve the diversity of opinions, beliefs, etc. that conventional learning hides or ignores. We propose a human-in-the-loop learning framework to model and study large volumes of unlabeled, subjective social media data with less human effort. We study various annotation tasks given to crowdsourced annotators and methods for aggregating their contributions in a manner that preserves subjectivity and disagreement. We introduce a strategy for learning label distributions with only five-to-ten labels per item by aggregating human-annotated labels over multiple, semantically related data items. We conduct experiments using our learning framework on data related to two subjective social issues (work and employment, and suicide prevention) that touch many people worldwide. Our methods can be applied to a broad variety of problems, particularly social problems. Our experimental results suggest that specific label aggregation methods can help provide reliable representative semantics at the population level
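
    Label distribution learning replaces the single majority-vote label with a per-item probability distribution over the labels. The sketch below illustrates the aggregation idea described above: annotations are pooled over semantically related items so that five-to-ten labels per item still yield a usable distribution, which can then serve as a soft training target. The grouping, smoothing, and example labels are illustrative assumptions, not the exact method of the study.

```python
# Hypothetical sketch of label distribution aggregation and a soft-label loss.
import numpy as np
from collections import Counter

def label_distribution(annotations, labels, smoothing=1.0):
    """Turn a list of crowd annotations into a distribution over `labels`.
    Additive (Laplace) smoothing keeps every label at nonzero probability."""
    counts = Counter(annotations)
    raw = np.array([counts.get(l, 0) + smoothing for l in labels], dtype=float)
    return raw / raw.sum()

def pooled_distribution(groups_of_annotations, labels):
    """Pool annotations over semantically related items (e.g. near-duplicate posts)
    so the target distribution rests on more labels than any single item has."""
    pooled = [a for group in groups_of_annotations for a in group]
    return label_distribution(pooled, labels)

def soft_cross_entropy(pred_probs, target_dist, eps=1e-12):
    """Cross-entropy between a predicted distribution and a soft target distribution."""
    return -float(np.sum(target_dist * np.log(pred_probs + eps)))

labels = ["supportive", "neutral", "at-risk"]
item_annotations = [
    ["supportive", "neutral", "supportive", "supportive", "neutral"],
    ["supportive", "supportive", "neutral", "supportive", "at-risk"],
]
target = pooled_distribution(item_annotations, labels)
print(target)                                              # ~[0.54 0.31 0.15]
print(soft_cross_entropy(np.array([0.6, 0.3, 0.1]), target))
```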