
    The Slow Professor: Challenging the Culture of Speed in the Academy


    Explainable Artificial Intelligence: Approaching it From the Lowest Level

    The increasing complexity of artificial intelligence models has given rise to extensive work toward understanding the inner workings of neural networks. Much of that work, however, has focused on manipulating the input data fed to the network to assess its effect on network output, or on pruning model components after often extensive and time-consuming training. This study postulates that understanding of neural networks can benefit from simplifying model structure and, in turn, shows that model simplification can benefit from investigating how network nodes, the most fundamental units of neural networks, evolve during training. Whereas studies on simplifying model structure have mostly required repeated model training at prohibitive time costs, assessing evolving trends in node weights toward model stabilization may circumvent that limitation. Node positional and magnitude stabilities were the central constructs used to investigate neuronal patterns over time and to determine node influence on model predictive ability. Positional stability was defined as the number of epochs in which a node held its location relative to the stable model, defined in this study as a model with accuracy >0.90. Node magnitude stability was defined as the number of epochs in which a node's weight retained its magnitude, within a tolerance value, when compared to the stable model. To test evolving trends, a manipulated data set, a contrived data set, and two life science data sets were used. The data sets were run through convolutional (CNN) and deep neural network (DNN) models. Experiments were conducted to test neural network training for patterns as a predicate for investigating evolving node trends. It was postulated that highly stable nodes were the most influential in determining model predictions, as measured by accuracy. Furthermore, this study suggested that the addition of influential nodes to a model during training follows a biological growth curve. Findings indicated that neural network weight assignment, weight spatial structure, and progression through time were not random but were strongly shaped by the choice of model and of data set. Moreover, progress toward stability differed by model, with CNNs adding influential nodes more evenly during training. The CNN model runs generally followed a biological growth curve covering an entire life span, whereas for the DNN model runs the growth curve shape was more characteristic of an organism early in life, or of a population unconstrained by resources, where growth tends to be exponential. The stability approach of this study showed superior time efficiency compared to competing methods. The contributions of this work may help make AI models more transparent and easier to understand for all stakeholders, adding to the benefits of AI technologies by minimizing and dispelling the fears associated with the adoption of black-box automation approaches in science and industry.
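    The abstract defines the two stability measures only in words. A minimal numpy sketch of how they might be computed is given below, assuming per-epoch weight snapshots are available; the rank-by-magnitude notion of "location", the tolerance value, and all function names are illustrative assumptions, not the study's exact procedure.

```python
import numpy as np

def positional_stability(epoch_weights, stable_weights):
    """Per-node count of epochs in which the node's position (here: rank by
    absolute weight, an assumption) matches its position in the stable model.

    epoch_weights  : shape (n_epochs, n_nodes), one weight snapshot per epoch
    stable_weights : shape (n_nodes,), weights of the stable model (accuracy > 0.90)
    """
    stable_rank = np.argsort(np.argsort(-np.abs(stable_weights)))
    counts = np.zeros(epoch_weights.shape[1], dtype=int)
    for snapshot in epoch_weights:
        rank = np.argsort(np.argsort(-np.abs(snapshot)))
        counts += (rank == stable_rank)
    return counts

def magnitude_stability(epoch_weights, stable_weights, tol=0.05):
    """Per-node count of epochs in which the weight stays within a relative
    tolerance of its value in the stable model."""
    rel_diff = np.abs(epoch_weights - stable_weights) / (np.abs(stable_weights) + 1e-12)
    return (rel_diff <= tol).sum(axis=0)

# Toy usage: 20 epochs of 8 node weights drifting toward the stable model.
rng = np.random.default_rng(0)
stable = rng.normal(size=8)
history = np.stack([stable + rng.normal(scale=1.0 / (t + 1), size=8) for t in range(20)])
print(positional_stability(history, stable))
print(magnitude_stability(history, stable, tol=0.1))
```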

    A low-loss, broadband antenna for efficient photon collection from a coherent spin in diamond

    We report the creation of a low-loss, broadband optical antenna giving highly directed output from a coherent single spin in the solid state. The device, the first solid-state realization of a dielectric antenna, is engineered for individual nitrogen-vacancy (NV) electronic spins in diamond. We demonstrate a directionality close to 10. The photonic structure preserves the high spin coherence of single-crystal diamond (T2 > 100 µs). The single-photon count rate approaches 1 MHz, facilitating efficient spin readout. We thus demonstrate a key enabling technology for quantum applications such as high-sensitivity magnetometry and long-distance spin entanglement.
    Comment: 5 pages, 4 figures and supplementary information (5 pages, 8 figures). Comments welcome. Further information under http://www.quantum-sensing.physik.unibas.c

    Numeracy for Language Models: Evaluating and Improving their Ability to Predict Numbers

    Numeracy is the ability to understand and work with numbers. It is a necessary skill for composing and understanding documents in clinical, scientific, and other technical domains. In this paper, we explore different strategies for modelling numerals with language models, such as memorisation and digit-by-digit composition, and propose a novel neural architecture that uses a continuous probability density function to model numerals from an open vocabulary. Our evaluation on clinical and scientific datasets shows that using hierarchical models to distinguish numerals from words improves a perplexity metric on the subset of numerals by 2 and 4 orders of magnitude, respectively, over non-hierarchical models. A combination of strategies can further improve perplexity. Our continuous probability density function model reduces mean absolute percentage errors by 18% and 54% in comparison to the second best strategy for each dataset, respectively.
    Comment: accepted at ACL 201
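    The continuous density strategy can be pictured as attaching a mixture-of-Gaussians head to the language model: rather than a softmax over a closed numeral vocabulary, the model emits mixture parameters and scores any number by its density. The sketch below is a hand-rolled illustration with fixed, made-up parameters, not the paper's architecture; the component count and values are assumptions.

```python
import numpy as np

def mixture_log_density(x, weights, means, stds):
    """Log-density of value x under a Gaussian mixture -- the kind of
    open-vocabulary numeral distribution a continuous-output head could
    parameterise."""
    weights, means, stds = map(np.asarray, (weights, means, stds))
    log_comp = (
        np.log(weights)
        - 0.5 * np.log(2 * np.pi * stds ** 2)
        - 0.5 * ((x - means) / stds) ** 2
    )
    # log-sum-exp over mixture components for numerical stability
    m = log_comp.max()
    return m + np.log(np.exp(log_comp - m).sum())

# Toy mixture that might sit over clinical lab values: components near 5, 37, and 120.
w, mu, sigma = [0.5, 0.3, 0.2], [5.0, 37.0, 120.0], [1.0, 0.5, 15.0]
for value in (4.8, 36.6, 250.0):
    print(value, mixture_log_density(value, w, mu, sigma))
```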

    Learning to Reason with Adaptive Computation

    Multi-hop inference is necessary for machine learning systems to successfully solve tasks such as Recognising Textual Entailment and Machine Reading. In this work, we demonstrate the effectiveness of adaptive computation for learning the number of inference steps required for examples of different complexity, and we show that learning the correct number of inference steps is difficult. We introduce the first model involving Adaptive Computation Time, which provides a small performance benefit over a similar model without an adaptive component, while also enabling considerable insight into the reasoning process of the model.
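    Adaptive Computation Time, in the sense of Graves (2016), lets a model take a variable number of steps per example by accumulating a halting probability until it crosses 1 − ε; the output is the halting-weighted mixture of the intermediate states. The snippet below is a toy numpy illustration of that halting loop with invented step and halting functions, not the authors' reasoning model.

```python
import numpy as np

def act_halting_loop(state, step_fn, halt_fn, max_steps=10, eps=0.01):
    """Run a variable number of inference steps, ACT-style.

    step_fn(state) -> new state; halt_fn(state) -> halting probability in (0, 1).
    Steps continue until the halting probabilities sum past 1 - eps (or max_steps
    is hit); the result is the halting-weighted mixture of intermediate states.
    """
    cumulative, weighted_state, n_steps = 0.0, np.zeros_like(state), 0
    for _ in range(max_steps):
        state = step_fn(state)
        p = halt_fn(state)
        n_steps += 1
        if cumulative + p >= 1.0 - eps or n_steps == max_steps:
            remainder = 1.0 - cumulative          # final step absorbs the leftover mass
            weighted_state += remainder * state
            break
        cumulative += p
        weighted_state += p * state
    return weighted_state, n_steps

# Toy usage: states decay toward the origin; halting confidence grows as they settle.
rng = np.random.default_rng(1)
def step_fn(s):
    return 0.5 * s + rng.normal(scale=0.1, size=s.shape)
def halt_fn(s):
    # purely illustrative: more confident to halt as the state norm shrinks
    return 1.0 / (1.0 + np.exp(np.linalg.norm(s) - 1.0))
out, steps = act_halting_loop(np.ones(4), step_fn, halt_fn)
print(steps, out)
```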