6,761 research outputs found

    Optimizing digital archiving: An artificial intelligence approach for OCR error correction

    Project Work presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Business Analytics. This thesis addresses the knowledge gap around effective ways to correct OCR errors and the importance of training datasets of adequate size and quality for improving the efficiency of OCR on digital documents. The main goal is to examine the trade-offs among input size, performance, and time efficiency when sourcing data, and to propose a new design that includes a machine translation model to automate the correction of errors caused by OCR scanning. The study implemented various LSTM models, with different thresholds, to recover errors generated by OCR systems. Although the results did not surpass the performance of existing OCR systems, owing to dataset size limitations, a step forward was achieved: a relationship between performance and input size was established, providing meaningful insights for optimising future digital archiving systems. This dissertation presents a new approach to OCR problems, together with implementation considerations, that can be followed up to optimise the efficiency and results of digital archive systems.

    Machine learning in solar physics

    The application of machine learning in solar physics has the potential to greatly enhance our understanding of the complex processes that take place in the atmosphere of the Sun. By using techniques such as deep learning, we are now in a position to analyze large amounts of data from solar observations and identify patterns and trends that may not have been apparent using traditional methods. This can help us improve our understanding of explosive events like solar flares, which can have a strong effect on the Earth's environment. Predicting hazardous events on Earth is becoming crucial for our technological society. Machine learning can also improve our understanding of the inner workings of the Sun itself by allowing us to go deeper into the data and to propose more complex models to explain them. Additionally, the use of machine learning can help to automate the analysis of solar data, reducing the need for manual labor and increasing the efficiency of research in this field. Comment: 100 pages, 13 figures, 286 references, accepted for publication as a Living Review in Solar Physics (LRSP)

    Interpretable and explainable machine learning for ultrasonic defect sizing

    Despite its popularity in literature, there are few examples of machine learning (ML) being used for industrial nondestructive evaluation (NDE) applications. A significant barrier is the ‘black box’ nature of most ML algorithms. This paper aims to improve the interpretability and explainability of ML for ultrasonic NDE by presenting a novel dimensionality reduction method: Gaussian feature approximation (GFA). GFA involves fitting a 2D elliptical Gaussian function to an ultrasonic image and storing the seven parameters that describe each Gaussian. These seven parameters can then be used as inputs to data analysis methods such as the defect sizing neural network presented in this paper. GFA is applied to ultrasonic defect sizing for inline pipe inspection as an example application. This approach is compared to sizing with the same neural network, and two other dimensionality reduction methods (the parameters of 6 dB drop boxes and principal component analysis), as well as a convolutional neural network applied to raw ultrasonic images. Of the dimensionality reduction methods tested, GFA features produce the closest sizing accuracy to sizing from the raw images, with only a 23% increase in RMSE, despite a 96.5% reduction in the dimensionality of the input data. Implementing ML with GFA is implicitly more interpretable than doing so with principal component analysis or raw images as inputs, and gives significantly more sizing accuracy than 6 dB drop boxes. Shapley additive explanations (SHAP) are used to calculate how each feature contributes to the prediction of an individual defect’s length. Analysis of SHAP values demonstrates that the GFA-based neural network proposed displays many of the same relationships between defect indications and their predicted size as occur in traditional NDE sizing methods.
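
    To make the idea of reducing an image to seven Gaussian parameters concrete, the following is a minimal sketch, not the paper's implementation. It assumes the seven parameters are amplitude, centre (x0, y0), two axis widths, rotation angle, and a constant offset; the abstract does not list them. It also swaps the fitting step for a simpler moment-based estimate rather than a least-squares fit. The function names `elliptical_gaussian` and `gaussian_features` are hypothetical.

```python
import numpy as np


def elliptical_gaussian(x, y, amp, x0, y0, sigma_major, sigma_minor, theta, offset):
    """Rotated 2D elliptical Gaussian; theta is the major-axis angle in radians."""
    ct, st = np.cos(theta), np.sin(theta)
    u = (x - x0) * ct + (y - y0) * st    # coordinate along the major axis
    v = -(x - x0) * st + (y - y0) * ct   # coordinate along the minor axis
    return offset + amp * np.exp(-0.5 * ((u / sigma_major) ** 2 + (v / sigma_minor) ** 2))


def gaussian_features(image):
    """Estimate seven Gaussian parameters from image moments.

    Assumption: this is a moment-based approximation, not a least-squares fit,
    and the parameter set (amp, x0, y0, sigma_major, sigma_minor, theta, offset)
    is chosen for illustration only.
    """
    image = np.asarray(image, dtype=float)
    offset = image.min()                 # treat the minimum as the background level
    weights = image - offset
    total = weights.sum()
    if total <= 0:
        raise ValueError("image is constant; no Gaussian can be estimated")

    y, x = np.indices(image.shape)       # row index is y, column index is x
    x0 = (weights * x).sum() / total     # intensity-weighted centroid
    y0 = (weights * y).sum() / total

    dx, dy = x - x0, y - y0
    cxx = (weights * dx * dx).sum() / total
    cxy = (weights * dx * dy).sum() / total
    cyy = (weights * dy * dy).sum() / total
    cov = np.array([[cxx, cxy], [cxy, cyy]])

    # Eigen-decomposition of the covariance gives the axis widths and orientation.
    eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    sigma_minor, sigma_major = np.sqrt(eigvals)
    major = eigvecs[:, 1]
    theta = np.arctan2(major[1], major[0]) % np.pi  # angle is defined modulo pi

    amp = image.max() - offset
    return np.array([amp, x0, y0, sigma_major, sigma_minor, theta, offset])


# Usage: a synthetic 101 x 101 "image" is reduced to a 7-element feature vector.
yy, xx = np.indices((101, 101))
synthetic = elliptical_gaussian(xx, yy, 2.0, 50.0, 40.0, 8.0, 4.0, 0.5, 0.3)
features = gaussian_features(synthetic)
print(features.shape)
```

    The moment-based estimate is only reliable when a single, well-contained indication dominates the image; a least-squares fit, as the abstract describes, would be the more robust choice for noisy data.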

    Designing a Direct Feedback Loop between Humans and Convolutional Neural Networks through Local Explanations

    The local explanation provides heatmaps on images to explain how Convolutional Neural Networks (CNNs) derive their output. Due to its visual straightforwardness, the method has been one of the most popular explainable AI (XAI) methods for diagnosing CNNs. Through our formative study (S1), however, we captured ML engineers' ambivalent perspective about the local explanation as a valuable and indispensable envision in building CNNs versus the process that exhausts them due to the heuristic nature of detecting vulnerability. Moreover, steering the CNNs based on the vulnerability learned from the diagnosis seemed highly challenging. To mitigate the gap, we designed DeepFuse, the first interactive design that realizes the direct feedback loop between a user and CNNs in diagnosing and revising CNN's vulnerability using local explanations. DeepFuse helps CNN engineers to systematically search for "unreasonable" local explanations and annotate the new boundaries for those identified as unreasonable in a labor-efficient manner. Next, it steers the model based on the given annotation such that the model doesn't introduce similar mistakes. We conducted a two-day study (S2) with 12 experienced CNN engineers. Using DeepFuse, participants made a more accurate and "reasonable" model than the current state-of-the-art. Also, participants found that the way DeepFuse guides case-based reasoning can practically improve their current practice. We provide implications for design that explain how future HCI-driven design can move our practice forward to make XAI-driven insights more actionable. Comment: 32 pages, 6 figures, 5 tables. Accepted for publication in the Proceedings of the ACM on Human-Computer Interaction (PACM HCI), CSCW 202

    Beam scanning by liquid-crystal biasing in a modified SIW structure

    A fixed-frequency beam-scanning 1D antenna based on Liquid Crystals (LCs) is designed for application in 2D scanning with lateral alignment. The 2D array environment imposes full decoupling of adjacent 1D antennas, which often conflicts with the LC requirement of DC biasing: the proposed design accommodates both. The LC medium is placed inside a Substrate Integrated Waveguide (SIW) modified to work as a Groove Gap Waveguide, with radiating slots etched on the upper broad wall, that radiates as a Leaky-Wave Antenna (LWA). This allows effective application of the DC bias voltage needed for tuning the LCs. At the same time, the RF field remains laterally confined, making it possible to lay several antennas in parallel and achieve 2D beam scanning. The design is validated by simulation employing the actual properties of a commercial LC medium.

    The Forward Physics Facility at the High-Luminosity LHC


    Vitalism and Its Legacy in Twentieth Century Life Sciences and Philosophy

    This Open Access book combines philosophical and historical analysis of various forms of alternatives to mechanism and mechanistic explanation, focusing on the 19th century to the present. It addresses vitalism, organicism and responses to materialism, and their relevance to current biological science. In doing so, it promotes dialogue and discussion about the historical and philosophical importance of vitalism and other non-mechanistic conceptions of life. It points towards the integration of genomic science into the broader history of biology. It details a broad engagement with a variety of nineteenth, twentieth and twenty-first century vitalisms and conceptions of life. In addition, it discusses important threads in the history of concepts in the United States and Europe, including charting new reception histories in eastern and south-eastern Europe. While vitalism, organicism and similar epistemologies are often the concern of specialists in the history and philosophy of biology and of historians of ideas, the range of the contributions as well as the geographical and temporal scope of the volume allows for it to appeal to the historian of science and the historian of biology generally.

    Specificity of the innate immune responses to different classes of non-tuberculous mycobacteria

    Mycobacterium avium is the most common nontuberculous mycobacterium (NTM) species causing infectious disease. Here, we characterized an M. avium infection model in zebrafish larvae, and compared it to M. marinum infection, a model of tuberculosis. M. avium bacteria are efficiently phagocytosed and frequently induce granuloma-like structures in zebrafish larvae. Although macrophages can respond to both mycobacterial infections, their migration speed is faster in infections caused by M. marinum. Tlr2 is conservatively involved in most aspects of the defense against both mycobacterial infections. However, Tlr2 has a function in the migration speed of macrophages and neutrophils to infection sites with M. marinum that is not observed with M. avium. Using RNAseq analysis, we found a distinct transcriptome response in cytokine-cytokine receptor interaction for M. avium and M. marinum infection. In addition, we found differences in gene expression in metabolic pathways, phagosome formation, matrix remodeling, and apoptosis in response to these mycobacterial infections. In conclusion, we characterized a new M. avium infection model in zebrafish that can be further used in studying pathological mechanisms for NTM-caused diseases.

    Collective moderation of hate, toxicity, and extremity in online discussions

    How can citizens moderate hate, toxicity, and extremism in online discourse? We analyze a large corpus of more than 130,000 discussions on German Twitter over the turbulent four years marked by the migrant crisis and political upheavals. With the help of human annotators, language models, machine learning classifiers, and longitudinal statistical analyses, we discern the dynamics of different dimensions of discourse. We find that expressing simple opinions, not necessarily supported by facts but also without insults, relates to the least hate, toxicity, and extremity of speech and speakers in subsequent discussions. Sarcasm also helps in achieving those outcomes, in particular in the presence of organized extreme groups. More constructive comments such as providing facts or exposing contradictions can backfire and attract more extremity. Mentioning either outgroups or ingroups is typically related to a deterioration of discourse in the long run. A pronounced emotional tone, either negative such as anger or fear, or positive such as enthusiasm and pride, also leads to worse outcomes. Going beyond one-shot analyses on smaller samples of discourse, our findings have implications for the successful management of online commons through collective civic moderation.