78 research outputs found

    Three Essays on Enhancing Clinical Trial Subject Recruitment Using Natural Language Processing and Text Mining

    Patient recruitment and enrollment are critical to a successful clinical trial, yet recruitment tends to be the most common problem trials face. The success of a trial depends on efficiently recruiting suitable patients. Every clinical trial has a protocol, which describes what will be done in the study and how it will be conducted; the protocol also ensures the safety of the trial subjects and the integrity of the data collected. The eligibility criteria section of the protocol is important because it specifies the conditions that participants have to satisfy. Since eligibility criteria are usually written as free text, they are not computer interpretable. To automate their analysis, it is therefore necessary to transform the criteria into a computer-interpretable format. The unstructured format of eligibility criteria also creates search efficiency issues: searching for and selecting appropriate clinical trials for a patient from a relatively large number of available trials is a complex task. A few attempts have been made to automate the matching of patients to clinical trials. However, those attempts have not fully integrated the entire matching process and have not exploited state-of-the-art Natural Language Processing (NLP) techniques that may improve matching performance. Given the importance of patient recruitment in clinical trial research, the objective of this research is to automate the matching process using NLP and text mining techniques and thereby improve the efficiency and effectiveness of recruitment. This dissertation, which comprises three essays, investigates clinical trial subject recruitment using state-of-the-art NLP and text mining techniques.
Essay 1: Building a Domain-Specific Lexicon for Clinical Trial Subject Eligibility Analysis
Essay 2: Clustering Clinical Trials Using Semantic-Based Feature Expansion
Essay 3: An Automatic Matching Process of Clinical Trial Subject Recruitment
In essay 1, I develop a domain-specific lexicon for n-gram Named Entity Recognition (NER) in the breast cancer domain. The lexicon is used to select and reduce n-gram features for the clustering in essay 2. It was evaluated by comparing it with the Systematized Nomenclature of Medicine--Clinical Terms (SNOMED CT). The results showed that it adds a significant number of new terms, which is useful for effective natural language processing. In essay 2, I explore the clustering of similar clinical trials using the domain-specific lexicon and term expansion with synonyms from the Unified Medical Language System (UMLS). I generate word n-gram features and filter them through the domain-specific dictionary matching process. To resolve semantic ambiguity, a semantic-based feature expansion technique using UMLS is applied. A hierarchical agglomerative clustering algorithm is used to generate clinical trial clusters. The focus is on summarizing clinical trial information in order to improve trial search efficiency. Finally, in essay 3, I investigate an automatic process for matching clinical trial clusters to patient medical records. Patient records collected in a prior study were used to test the approach. The records were pre-processed by tokenization and lemmatization, then enhanced by matching against the custom breast cancer dictionary described in essay 1 and by semantic feature expansion using the UMLS Metathesaurus. Finally, I matched each patient record against the clinical trial clusters to select the best-matching cluster(s), and then against the trials within those clusters.
The matching results were evaluated by an internal expert as well as an external medical expert.
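The pipeline described in essays 2 and 3 can be illustrated with a minimal sketch. It is not the dissertation's implementation: the lexicon, the trial texts, the Jaccard similarity, the average-linkage rule, and the 0.3 merge threshold are all assumptions chosen for illustration, and the UMLS synonym expansion step is omitted.

```python
from itertools import combinations

def ngrams(text, n_max=2):
    """Return the set of word n-grams (n = 1..n_max) in lower-cased text."""
    tokens = text.lower().split()
    return {
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    }

def lexicon_features(text, lexicon, n_max=2):
    """Keep only the n-grams that appear in the domain lexicon."""
    return ngrams(text, n_max) & lexicon

def jaccard(a, b):
    """Jaccard similarity of two feature sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def agglomerate(features, threshold=0.3):
    """Average-linkage agglomerative clustering over feature sets.

    Repeatedly merges the most similar pair of clusters and stops when
    the best average similarity falls below the threshold.
    """
    clusters = [[i] for i in range(len(features))]

    def linkage(c1, c2):
        sims = [jaccard(features[i], features[j]) for i in c1 for j in c2]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        (a, b), best = max(
            (((x, y), linkage(clusters[x], clusters[y]))
             for x, y in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # a < b, so index a is unaffected
    return clusters

def match_patient(record, features, clusters, lexicon):
    """Pick the best-matching cluster, then rank the trials inside it."""
    patient = lexicon_features(record, lexicon)

    def cluster_score(cluster):
        return sum(jaccard(patient, features[i]) for i in cluster) / len(cluster)

    best = max(clusters, key=cluster_score)
    ranked = sorted(best, key=lambda i: jaccard(patient, features[i]), reverse=True)
    return best, ranked

# Hypothetical lexicon and trial descriptions, for illustration only.
LEXICON = {"breast cancer", "her2 positive", "chemotherapy", "metastatic", "radiotherapy"}
TRIALS = [
    "her2 positive breast cancer chemotherapy",
    "metastatic breast cancer chemotherapy",
    "radiotherapy after surgery",
]

features = [lexicon_features(t, LEXICON) for t in TRIALS]
clusters = agglomerate(features)
best_cluster, ranked_trials = match_patient(
    "patient with metastatic breast cancer", features, clusters, LEXICON
)
```

Matching against clusters first and individual trials second keeps the number of patient-to-trial comparisons small, which is the search-efficiency motivation the abstract gives for clustering.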

    Automatic Framework to Aid Therapists to Diagnose Children who Stutter

    Dyslexia in higher education

    Automatic Detection of Dementia and related Affective Disorders through Processing of Speech and Language

    In 2019, dementia has become a trillion-dollar disorder. Alzheimer's disease (AD) is a type of dementia in which the main observable symptom is a decline in cognitive functions, notably memory, as well as language and problem-solving. Experts agree that early detection is crucial to effectively develop and apply interventions and treatments, underlining the need for effective and pervasive assessment and screening tools. The goal of this thesis is to explore how computational techniques can be used to process speech and language samples produced by patients suffering from dementia or related affective disorders, with the aim of automatically detecting them in large populations using machine learning models. A strong focus is placed on the detection of early stage dementia (MCI), as most clinical trials today focus on intervention at this level. To this end, novel automatic and semi-automatic analysis schemes for a speech-based cognitive task, namely verbal fluency, are explored and evaluated as an appropriate screening task. Due to a lack of available patient data in most languages, world-first multilingual approaches to detecting dementia are introduced in this thesis. Results are encouraging, and clear benefits become visible on a small French dataset. Lastly, the task of detecting those people with dementia who also suffer from an affective disorder called apathy is explored. Since they are more likely to progress to a later stage of dementia faster, it is crucial to identify them. These are the first experiments that consider this task using solely speech and language as inputs. Results are again encouraging, using either only speech or language data elicited with emotional questions. Overall, strong results encourage further research into establishing speech-based biomarkers for early detection and monitoring of these disorders to better patients' lives.

    Simulated role-playing from crowdsourced data

    Thesis (Ph. D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2013. Cataloged from PDF version of thesis. Includes bibliographical references (p. 173-178). Collective Artificial Intelligence (CAI) simulates human intelligence from data contributed by many humans, mined for inter-related patterns. This thesis applies CAI to social role-playing, introducing an end-to-end process for compositing recorded performances from thousands of humans, and simulating open-ended interaction from this data. The CAI process combines crowdsourcing, pattern discovery, and case-based planning. Content creation is crowdsourced by recording role-players online. Browser-based tools allow nonexperts to annotate data, organizing content into a hierarchical narrative structure. Patterns discovered from data power a novel system combining plan recognition with case-based planning. The combination of this process and structure produces a new medium, which exploits a massive corpus to realize characters who interact and converse with humans. This medium enables new experiences in videogames, and new classes of training simulations, therapeutic applications, and social robots. While advances in graphics support incredible freedom to interact physically in simulations, current approaches to development restrict simulated social interaction to hand-crafted branches that do not scale to the thousands of possible patterns of actions and utterances observed in actual human interaction. There is a tension between freedom and system comprehension due to two bottlenecks, making open-ended social interaction a challenge. First is the authorial effort entailed to cover all possible inputs. Second, like other cognitive processes, imagination is a bounded resource. Any individual author only has so much imagination.
The convergence of advances in connectivity, storage, and processing power is bringing people together in ways never before possible, amplifying the imagination of individuals by harnessing the creativity and productivity of the crowd, revolutionizing how we create media, and what media we can create. By embracing data-driven approaches, and capitalizing on the creativity of the crowd, authoring bottlenecks can be overcome, taking a step toward realizing a medium that robustly supports player choice. Doing so requires rethinking both technology and division of labor in media production. As a proof of concept, a CAI system has been evaluated by recording over 10,000 performances in The Restaurant Game, automating an AI-controlled waitress who interacts in the world, and converses with a human via text or speech. Quantitative results demonstrate how CAI supports significantly more open-ended interaction with humans, while focus groups reveal factors for improving engagement. By Jeffrey David Orkin. Ph.D.

    Was the Patient Cured? Understanding Semantic Categories and Their Relationships in Patient Records

    MEng thesis. In this thesis, we detail an approach to extracting key information in medical discharge summaries. Starting with a narrative patient report, we first identify and remove information that compromises privacy (de-identification); next we recognize words and phrases in the text belonging to semantic categories of interest to doctors (semantic category recognition). For diseases and symptoms, we determine whether the problem is present, absent, uncertain, or associated with somebody else (assertion classification). Finally, we classify the semantic relationships existing between our categories (semantic relationship classification). Our approach utilizes a series of statistical models that rely heavily on local lexical and syntactic context, and achieve competitive results compared to more complex NLP solutions. We conclude the thesis by presenting the design for the Category and Relationship Extractor (CaRE). CaRE combines our solutions to de-identification, semantic category recognition, assertion classification, and semantic relationship classification into a single application that facilitates the easy extraction of semantic information from medical text.
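The assertion classification step (present, absent, uncertain, or associated with somebody else) can be illustrated with a toy sketch. The thesis uses statistical models over local context; the version below swaps in hand-written cue patterns purely to show the input and output of the task. The cue lists, the five-word context window, and the function name `classify_assertion` are assumptions, not part of CaRE.

```python
import re

# Hypothetical cue words; a trained statistical model would learn such cues from data.
OTHER = re.compile(r"\b(mother|father|sister|brother|family history)\b")
ABSENT = re.compile(r"\b(no|denies|without|negative for)\b")
UNCERTAIN = re.compile(r"\b(possible|possibly|probable|suspected|rule out)\b")

def classify_assertion(sentence, problem, window=5):
    """Label a problem mention using only the few words that precede it."""
    text = sentence.lower()
    idx = text.find(problem.lower())
    if idx == -1:
        raise ValueError(f"{problem!r} not found in sentence")
    # Local lexical context: the last `window` words before the mention.
    context = " ".join(text[:idx].split()[-window:])
    if OTHER.search(context):
        return "someone_else"
    if ABSENT.search(context):
        return "absent"
    if UNCERTAIN.search(context):
        return "uncertain"
    return "present"

labels = [
    classify_assertion("Patient denies chest pain", "chest pain"),
    classify_assertion("Possible pneumonia on x-ray", "pneumonia"),
    classify_assertion("Mother had breast cancer", "breast cancer"),
    classify_assertion("Patient has diabetes", "diabetes"),
]
```

Restricting the classifier to a short window before the mention mirrors the abstract's emphasis on local context, although real discharge summaries need far richer cues than this sketch covers.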

    Networking acupuncture in Vietnam

    This thesis proposes that medical anthropologists change the way we think about acupuncture in Vietnam. Acupuncture should not be conceived as a discrete medicophilosophical system as has been acupuncture’s textual identity in academic writings to date. Acupuncture is rather a performative network, in the sense used by Bruno Latour, constituted through energetic relationships between science, people, textbooks, classrooms, pedagogic practices, clinical technologies and much more. These come into interaction and their collaborations produce acupuncture in unexpected ways. This conclusion was generated through 15 months of ethnographic fieldwork with acupuncturists in Ho Chi Minh City and catchments from 2007-08. Fieldwork involved observing acupuncturists engage patients, participating in acupuncture classes and volunteering on acupuncture charity teaching and treating missions. A snowballing method was used to generate connections with a mobile and diverse group of medical specialists. First, it will be shown that in Vietnam, science and tradition were united in the creation of a New Medicine that must be considered on its own terms rather than as a grafting of two different types of medical system. The New Medicine modelled pedagogic and legitimacy-making practices which circulated in the city. Second, local formation of acupuncture objects and shaping of clinical treatment flatten out previously taken for granted hierarchies when describing clinical medical knowledge. The technology of vision was integral to the construction of such knowledge and when interrupted caused acupuncture to grind to a halt. Finally, person networks, after Mark Granovetter, were active in the city generating professional success and legality for practitioners but these will also be analysed using a Latourian approach. 
Recent ethnographic investigations of science and technology are used to help portray, more faithfully, the interactive dynamic of acupuncture experienced during fieldwork. Such writings extend the scope of what can be investigated as participating in the creation of medical realities in southern Vietnam. I argue that medical knowledge is a reality constructed through continual practices. Knowledge is not a commodity or an eternally static entity; knowledge is what we do.