
    Automated Reading Passage Generation with OpenAI's Large Language Model

    The widespread use of computer-based assessments and individualized learning platforms has resulted in an increased demand for the rapid production of high-quality items. Automated item generation (AIG), the process of using item models to generate new items with the help of computer technology, was proposed to reduce reliance on human subject experts at each step of the process. AIG has been used in test development for some time, but machine learning algorithms have introduced the potential to greatly improve the efficiency and effectiveness of the process. The approach presented in this paper uses OpenAI's latest transformer-based language model, GPT-3, to generate reading passages. Existing reading passages were used in carefully engineered prompts to ensure that the AI-generated text had content and structure similar to a fourth-grade reading passage. For each prompt, we generated multiple passages; the final passage was selected according to its Lexile score agreement with the original passage. In the final round, the selected passage underwent a light revision by a human editor to ensure the text was free of grammatical and factual errors. All AI-generated passages, along with the original passages, were evaluated by human judges for coherence, appropriateness for fourth graders, and readability.
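    The generate-then-select loop described above can be sketched in a few lines. This is a minimal illustration, assuming the legacy (pre-1.0) OpenAI Python SDK with its `Completion` API; the prompt template is illustrative, and `estimate_lexile` is a crude hypothetical stand-in for the proprietary Lexile scoring service used in the paper.

```python
# Sketch: GPT-3 produces several candidate passages per prompt, and the
# candidate whose readability score best matches the original passage is
# kept. Assumes the legacy (pre-1.0) OpenAI Python SDK; the prompt template
# is illustrative, and estimate_lexile is a hypothetical stand-in for the
# proprietary Lexile scorer.
import openai

openai.api_key = "sk-..."  # set from a secure source in practice


def estimate_lexile(text: str) -> float:
    """Hypothetical stand-in for a Lexile scorer: a mean-sentence-length
    proxy on an arbitrary scale, used here only so the sketch runs."""
    sentences = [s for s in text.split(".") if s.strip()]
    return 100.0 * len(text.split()) / max(len(sentences), 1)


def generate_passages(source_passage: str, n: int = 5) -> list:
    prompt = (
        "Write a new fourth-grade reading passage with content and "
        "structure similar to the following passage:\n\n"
        f"{source_passage}\n\nNew passage:"
    )
    response = openai.Completion.create(
        model="text-davinci-003",   # legacy GPT-3 completion model
        prompt=prompt,
        max_tokens=600,
        temperature=0.8,
        n=n,                        # several candidates per prompt
    )
    return [choice.text.strip() for choice in response.choices]


def select_best(source_passage: str, candidates: list) -> str:
    target = estimate_lexile(source_passage)
    # Keep the candidate whose score is closest to the original's.
    return min(candidates, key=lambda c: abs(estimate_lexile(c) - target))
```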

    Spatiotemporal anomaly detection: streaming architecture and algorithms

    Anomaly detection is the science of identifying one or more rare or unexplainable samples or events in a dataset or data stream. The field of anomaly detection has been extensively studied by mathematicians, statisticians, economists, engineers, and computer scientists. One open research question remains the design of distributed cloud-based architectures and algorithms that can accurately identify anomalies in previously unseen, unlabeled, streaming, multivariate spatiotemporal data. With streaming data, time is of the essence, and insights are perishable. Real-world streaming spatiotemporal data originate from many sources, including mobile phones, supervisory control and data acquisition (SCADA) devices, the internet of things (IoT), distributed sensor networks, and social media. Baseline experiments are performed on four non-streaming, static multivariate anomaly detection datasets using unsupervised offline traditional machine learning (TML) and unsupervised neural network techniques. Multiple architectures, including autoencoders, generative adversarial networks, convolutional networks, and recurrent networks, are adapted for experimentation. Extensive experimentation demonstrates that neural networks produce superior detection accuracy over TML techniques. These same neural network architectures can be extended to process unlabeled spatiotemporal streaming data using online learning. Space and time relationships are further exploited to provide additional insights and increased anomaly detection accuracy. A novel domain-independent architecture and set of algorithms called the Spatiotemporal Anomaly Detection Environment (STADE) is formulated. STADE is based on a federated learning architecture. STADE's streaming algorithms are based on geographically unique, persistently executing neural networks using online stochastic gradient descent (SGD). STADE is designed to be pluggable, meaning that alternative algorithms may be substituted or combined to form an ensemble. STADE incorporates a Stream Anomaly Detector (SAD) and a Federated Anomaly Detector (FAD). The SAD executes at multiple locations on streaming data, while the FAD executes at a single server and identifies global patterns and relationships among the site anomalies. Each STADE site streams anomaly scores to the centralized FAD server for further spatiotemporal dependency analysis and logging. The FAD is based on recent advances in DNN-based federated learning. A STADE testbed is implemented to facilitate globally distributed experimentation using low-cost, commercial cloud infrastructure provided by Microsoft™. STADE testbed sites are situated in the cloud within each continent: Africa, Asia, Australia, Europe, North America, and South America. Communication occurs over the commercial internet. Three STADE case studies are investigated. The first case study processes commercial air traffic flows, the second processes global earthquake measurements, and the third processes social media (i.e., Twitter™) feeds. These case studies confirm that STADE is a viable architecture for the near-real-time identification of anomalies in streaming data originating from (possibly) computationally disadvantaged, geographically dispersed sites. Moreover, the addition of the FAD provides enhanced anomaly detection capability. Since STADE is domain-independent, these findings can be easily extended to additional application domains and use cases.
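    A minimal sketch of the per-site stream scorer described here (the SAD side): an autoencoder trained by online SGD on each arriving record, with reconstruction error serving as the anomaly score a site would stream to the FAD. The network size, learning rate, and PyTorch implementation are assumptions for illustration, not details from the dissertation.

```python
# Sketch of a per-site stream scorer in the spirit of STADE's SAD: an
# autoencoder updated by online SGD as each multivariate record arrives,
# with reconstruction error serving as the anomaly score a site would
# stream to the central FAD server. Network size, learning rate, and the
# PyTorch implementation are assumptions, not details from the dissertation.
import torch
import torch.nn as nn


class StreamingAutoencoder(nn.Module):
    def __init__(self, n_features: int, hidden: int = 8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU())
        self.decoder = nn.Linear(hidden, n_features)

    def forward(self, x):
        return self.decoder(self.encoder(x))


def score_stream(stream, n_features: int, lr: float = 1e-3):
    """Consume an iterable of 1-D feature vectors; yield an anomaly
    score per record while continually updating the model online."""
    model = StreamingAutoencoder(n_features)
    opt = torch.optim.SGD(model.parameters(), lr=lr)  # online SGD
    loss_fn = nn.MSELoss()
    for record in stream:
        x = torch.as_tensor(record, dtype=torch.float32).unsqueeze(0)
        loss = loss_fn(model(x), x)
        opt.zero_grad()
        loss.backward()
        opt.step()
        yield loss.item()  # high reconstruction error = likely anomaly
```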

    Investigating High Speed Localization Microscopy Through Experimental Methods, Data Processing Methods, and Applications of Localization Microscopy to Biological Questions

    Fluorescence Photoactivation Localization Microscopy (FPALM) and other super-resolution localization microscopy techniques can resolve structures with nanoscale resolution. Unlike electron microscopy techniques, they are also compatible with live-cell and live-animal studies, making FPALM and related techniques ideal for answering questions about the dynamic nature of molecular biology in living systems. Many processes in biology occur on rapid sub-second time scales, requiring the imaging technique to resolve these processes not just with a high enough spatial resolution but with an appropriate temporal resolution. To that end, this dissertation in part investigates high-speed FPALM as an experimental technique, showing that images can be reconstructed with effective temporal resolutions of 0.1 s. Using fluorescent proteins attached to an influenza viral protein, hemagglutinin (HA), questions of protein clustering and cluster dynamics on the host cell membrane are explored. The results indicate that these HA clusters may be more dynamic than previously thought. The principal disadvantage of the increased imaging speed is the reduction in information that comes from collecting fewer photons to localize each molecule, and fewer molecules overall. As the molecules become dimmer, they also become harder to identify using conventional identification algorithms. Tools from machine learning and computer vision, such as artificial neural networks (ANNs), have been shown to be adept at object identification. Here a method for repeatedly training an ANN is investigated. This method is shown to have exceptional performance on simulations, indicating that it can be regarded as a high-fidelity method even in the presence of weakly fluorescent molecules. Development of this technique can be used to recover more molecules from data sets with weaker molecular fluorescence, such as those obtained with high-speed imaging, allowing for higher sampling and overall higher spatial resolution of the final image. The combination of a high-speed experimental technique with a sensitive and robust identification algorithm allows FPALM and related techniques to probe questions of fast biological processes while limiting the sacrifice in spatial resolution inherent in high-speed techniques.
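    The ANN-based identification step can be pictured as a patch classifier over camera frames. A minimal sketch, assuming a small CNN over fixed-size patches; the architecture, 7x7 patch size, and `train_round` helper are illustrative, not the network developed in the dissertation.

```python
# Sketch of ANN-based molecule identification: a small CNN classifies
# fixed-size camera patches as "molecule" vs "background", and train_round
# reflects the repeated-training idea in the abstract. The architecture
# and the 7x7 patch size are illustrative assumptions.
import torch
import torch.nn as nn


class PatchClassifier(nn.Module):
    def __init__(self, patch: int = 7):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(16 * patch * patch, 2),  # molecule / background logits
        )

    def forward(self, x):  # x: (batch, 1, patch, patch) image patches
        return self.net(x)


def train_round(model, patches, labels, epochs: int = 5):
    """One round of repeated training on labeled (simulated) patches."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        loss = loss_fn(model(patches), labels)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model
```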

    Neural correlates of post-traumatic brain injury (TBI) attention deficits in children

    Traumatic brain injury (TBI) in children is a major public health concern worldwide. Attention deficits are among the most common neurocognitive and behavioral consequences in children post-TBI; they have significant negative impacts on educational and social outcomes and compromise quality of life. However, there is a paucity of evidence to guide optimal treatment strategies for attention-deficit-related symptoms in children post-TBI, owing to the lack of understanding of its neurobiological substrate. Thus, it is critical to understand the neural mechanisms associated with TBI-induced attention deficits in children so that more refined and tailored strategies can be developed for diagnosis and long-term treatment and intervention. This dissertation is the first study to investigate the neurobiological substrates associated with post-TBI attention deficits in children using both anatomical and functional neuroimaging data. The goals of this project are to discover quantitatively measurable markers utilizing diffusion tensor imaging (DTI), structural magnetic resonance imaging (MRI), and functional MRI (fMRI) techniques, and to further identify the most robust neuroimaging features for predicting severe post-TBI attention deficits in children by utilizing machine learning and deep learning techniques. A total of 53 children with TBI and 55 controls, aged 9 to 17, were recruited. The results show that systems-level topological properties of left frontal regions, parietal regions, and medial occipitotemporal regions in the structural and functional brain networks are significantly associated with inattentive and/or hyperactive/impulsive symptoms in children post-TBI. Semi-supervised deep learning modeling further confirms the significant contributions of these brain features to the prediction of elevated attention deficits in children post-TBI. The findings of this project provide a valuable foundation for future research on developing neural markers for TBI-induced attention deficits in children, which may significantly assist the development of more effective and individualized diagnostic and treatment strategies.
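    The systems-level topological properties mentioned here are typically graph metrics computed from region-by-region connectivity matrices. A minimal sketch with NetworkX, assuming a simple binarizing threshold; the specific metrics and threshold are illustrative, not the study's actual feature set.

```python
# Sketch of systems-level topological features of the kind related to
# attention symptoms in the study: graph metrics computed from a
# region-by-region brain connectivity matrix with NetworkX. The binarizing
# threshold and metric choices are illustrative assumptions.
import networkx as nx
import numpy as np


def topological_features(conn: np.ndarray, threshold: float = 0.2) -> dict:
    """conn: symmetric (n_regions x n_regions) connectivity matrix."""
    adj = (conn > threshold).astype(int)  # binarize weak connections away
    np.fill_diagonal(adj, 0)              # no self-loops
    g = nx.from_numpy_array(adj)
    return {
        "global_efficiency": nx.global_efficiency(g),
        "mean_clustering": nx.average_clustering(g),
        "degree_per_region": dict(g.degree()),
    }
```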

    Understanding Biology in the Age of Artificial Intelligence

    Modern life sciences research increasingly relies on artificial intelligence (AI) approaches to model biological systems, primarily centered on the use of machine learning (ML) models. Although ML is undeniably useful for identifying patterns in large, complex data sets, its widespread application in the biological sciences represents a significant deviation from traditional methods of scientific inquiry. As such, the interplay between these models and scientific understanding in biology is a topic with important implications for the future of scientific research, yet it has received little attention. Here, we draw from an epistemological toolkit to contextualize recent applications of ML in the biological sciences under modern philosophical theories of understanding, identifying general principles that can guide the design and application of ML systems to model biological phenomena and advance scientific knowledge. We propose that conceptions of scientific understanding as information compression, qualitative intelligibility, and dependency relation modeling provide a useful framework for interpreting ML-mediated understanding of biological systems. Through a detailed analysis of two key application areas of ML in modern biological research, protein structure prediction and single-cell RNA sequencing, we explore how these features have thus far enabled ML systems to advance scientific understanding of their target phenomena, how they may guide the development of future ML models, and the key obstacles that remain preventing ML from achieving its potential as a tool for biological discovery. Consideration of the epistemological features of ML applications in biology will improve the prospects of these methods for solving important problems and advancing scientific understanding of living systems.

    Toward diffusion tensor imaging as a biomarker in neurodegenerative diseases: technical considerations to optimize recordings and data processing

    Neuroimaging biomarkers such as diffusion tensor imaging (DTI) have shown high potential to map disease processes in neurodegenerative diseases (NDD). For DTI, the implementation of a standardized scanning and analysis cascade in clinical trials has the potential to be further optimized. Over the last few years, various approaches to improve DTI applications to NDD have been developed. The core aim of this review is to address the considerations and limitations of DTI in NDD and to discuss suggestions for improving DTI applications to NDD. Based on this technical approach, a set of recommendations is proposed for a standardized DTI scan protocol and an analysis cascade covering DTI data pre- and postprocessing and statistical analysis. In summary, considering the advantages and limitations of DTI in NDD, we suggest improvements toward a standardized framework for a DTI-based protocol to be applied in future imaging studies in NDD, with the goal of establishing DTI as a biomarker in clinical trials in neurodegeneration.
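    As a concrete example of one stage in such an analysis cascade, the tensor fit and derivation of fractional anisotropy (FA) can be done with the open-source DIPY library. This is a minimal sketch assuming preprocessed input (denoising and eddy-current/motion correction already applied); the file paths are placeholders.

```python
# Sketch of one stage of a standardized DTI analysis cascade: fitting the
# diffusion tensor and deriving a fractional anisotropy (FA) map with DIPY.
# Assumes the input volume has already been preprocessed (denoising,
# eddy-current and motion correction); file paths are placeholders.
import nibabel as nib
from dipy.core.gradients import gradient_table
from dipy.io.gradients import read_bvals_bvecs
from dipy.reconst.dti import TensorModel


def compute_fa(dwi_path: str, bval_path: str, bvec_path: str):
    img = nib.load(dwi_path)  # 4-D diffusion-weighted volume
    bvals, bvecs = read_bvals_bvecs(bval_path, bvec_path)
    gtab = gradient_table(bvals, bvecs)
    tenfit = TensorModel(gtab).fit(img.get_fdata())
    return tenfit.fa  # voxel-wise FA map, one scalar per voxel
```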

    Object Detection in medical imaging

    A thesis submitted in partial fulfillment of the requirements for the degree of Doctor in Information Management, specialization in Information and Decision Systems.

    Artificial intelligence, assisted by deep learning, has emerged in various fields of our society. These systems allow the automation and improvement of several tasks, even surpassing, in some cases, human capability. Object detection methods are used nowadays in several areas, including medical imaging analysis. However, these methods are susceptible to errors, and there is no universally accepted method that can be applied across all types of applications with the precision needed in the medical field. Additionally, the application of object detectors in medical imaging analysis has yet to be thoroughly analyzed to achieve a richer understanding of the state of the art. To tackle these shortcomings, we present three studies with distinct goals. First, a quantitative and qualitative analysis of academic research was conducted to gain a picture of which object detectors are employed, the modalities of medical imaging used, and the particular body parts under investigation. Second, we propose an optimized version of a widely used algorithm to overcome limitations commonly encountered in medical imaging by fine-tuning several hyperparameters. Third, we develop a novel stacking approach to increase the precision of detections in medical imaging analysis. The findings show that, despite the late arrival of object detection in medical imaging analysis, the number of publications has increased in recent years, demonstrating significant potential for growth. Additionally, we establish that it is possible to address some constraints on the data through an exhaustive optimization of the algorithm. Finally, our last study highlights that there is still room for improvement in these advanced techniques, using stacking approaches as an example. The contributions of this dissertation are several: it puts forward a deeper overview of state-of-the-art applications of object detection algorithms in the medical field and presents strategies for addressing typical constraints in this area.
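    The stacking idea, generically, pools boxes from several detectors and fuses overlapping ones into consensus detections. Below is a minimal sketch of such an IoU-based fusion in plain Python; it illustrates the general ensembling concept under our own assumptions, not the dissertation's specific stacking model.

```python
# Sketch of a detection-fusion ensemble: boxes pooled from several
# detectors are clustered by IoU, and each cluster is reduced to a single
# consensus box with averaged coordinates and confidence. Class labels are
# omitted for brevity; this is a generic illustration, not the
# dissertation's specific stacking model.

def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse_detections(detections, iou_thr=0.5):
    """detections: list of (box, score) pairs pooled from all detectors."""
    pooled = sorted(detections, key=lambda d: d[1], reverse=True)
    merged = []
    while pooled:
        box, score = pooled.pop(0)  # highest-scoring box seeds a cluster
        cluster, rest = [(box, score)], pooled
        pooled = []
        for b, s in rest:
            (cluster if iou(box, b) >= iou_thr else pooled).append((b, s))
        n = len(cluster)
        avg_box = [sum(b[i] for b, _ in cluster) / n for i in range(4)]
        avg_score = sum(s for _, s in cluster) / n
        merged.append((avg_box, avg_score))
    return merged
```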