145 research outputs found

    Label-free video-rate micro-endoscopy through flexible fibers via Fiber Bundle Distal Holography (FiDHo)

    Fiber-based micro-endoscopes are a critically important tool for minimally invasive deep-tissue imaging. However, state-of-the-art micro-endoscopes cannot perform three-dimensional imaging through dynamically bent fibers without bulky optical elements such as lenses and scanners at the distal end, which increase the footprint and tissue damage. While great efforts have been invested in developing approaches that avoid bulky distal optics, the fundamental barrier of dynamic optical wavefront distortions during propagation through flexible fibers limits current approaches to nearly static or non-flexible fibers. Here, we present an approach that allows holographic, 3D, bend-insensitive, coherence-gated micro-endoscopic imaging using commercially available multi-core fibers (MCFs). We achieve this by adding a miniature partially reflecting mirror to the distal fiber tip, which allows us to perform low-coherence full-field phase-shifting holography. We demonstrate widefield diffraction-limited reflection imaging of amplitude and phase targets through dynamically bent fibers at video rates. Our approach holds great potential for label-free investigations of dynamic samples. Comment: 28 pages, 6 figures plus 4 supplementary figures. Movies not included.
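
The abstract does not spell out the reconstruction step, so the following is only a generic sketch of four-step phase-shifting holography, the technique the authors name: given four intensity frames recorded with reference-phase shifts of 0, π/2, π and 3π/2, the complex object field can be recovered pixel-wise. The function name, sign convention and NumPy usage are illustrative assumptions, not the FiDHo processing pipeline.

```python
import numpy as np

def reconstruct_field(i_0, i_90, i_180, i_270):
    """Recover a complex field from four phase-shifted interferograms.

    i_0 .. i_270 are intensity frames recorded with reference-phase
    shifts of 0, pi/2, pi and 3*pi/2.  Up to a constant factor set by
    the reference beam, the object field is
        E ~ (i_0 - i_180) + 1j * (i_270 - i_90)
    (the sign of the imaginary part depends on the phase-step
    convention of the particular setup).
    """
    field = (i_0 - i_180) + 1j * (i_270 - i_90)
    return np.abs(field), np.angle(field)

# Toy usage: random frames standing in for the four camera exposures.
rng = np.random.default_rng(0)
frames = [rng.random((64, 64)) for _ in range(4)]
amplitude, phase = reconstruct_field(*frames)
```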

    Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding

    Recent state-of-the-art natural language understanding models, such as BERT and XLNet, score a pair of sentences (A and B) using multiple cross-attention operations - a process in which each word in sentence A attends to all words in sentence B and vice versa. As a result, computing the similarity between a query sentence and a set of candidate sentences requires propagating all query-candidate sentence pairs through a stack of cross-attention layers. This exhaustive process becomes computationally prohibitive when the number of candidate sentences is large. In contrast, sentence embedding techniques learn a sentence-to-vector mapping and compute the similarity between the sentence vectors via simple elementary operations. In this paper, we introduce Distilled Sentence Embedding (DSE) - a model based on knowledge distillation from cross-attentive models, focusing on sentence-pair tasks. The outline of DSE is as follows: given a cross-attentive teacher model (e.g., a fine-tuned BERT), we train a sentence-embedding-based student model to reconstruct the sentence-pair scores obtained by the teacher model. We empirically demonstrate the effectiveness of DSE on five GLUE sentence-pair tasks. DSE significantly outperforms several ELMo variants and other sentence embedding methods, while accelerating the computation of query-candidate sentence-pair similarities by several orders of magnitude, with an average relative degradation of 4.6% compared to BERT. Furthermore, we show that DSE produces sentence embeddings that reach state-of-the-art performance on universal sentence representation benchmarks. Our code is made publicly available at https://github.com/microsoft/Distilled-Sentence-Embedding. Comment: In Proceedings of AAAI 2020.
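
A rough sketch of the distillation objective described above, in PyTorch. The toy encoder, the mean pooling, the [u, v, |u−v|, u·v] scoring head and the MSE loss are illustrative assumptions rather than the authors' exact recipe, which is available in the linked repository; the point is that each sentence is encoded once, independently, and the student regresses onto the teacher's pair scores.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SentenceEncoder(nn.Module):
    """Toy stand-in for the student sentence encoder: each sentence is
    embedded on its own, independently of its pair, and mean-pooled."""
    def __init__(self, vocab_size=30522, dim=256):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)

    def forward(self, token_ids):                 # (batch, seq_len)
        return self.emb(token_ids).mean(dim=1)    # (batch, dim)

class PairScorer(nn.Module):
    """Scores a sentence pair from the two independent embeddings."""
    def __init__(self, dim=256):
        super().__init__()
        self.encoder = SentenceEncoder(dim=dim)
        self.head = nn.Linear(4 * dim, 1)

    def forward(self, a_ids, b_ids):
        u, v = self.encoder(a_ids), self.encoder(b_ids)
        feats = torch.cat([u, v, (u - v).abs(), u * v], dim=-1)
        return self.head(feats).squeeze(-1)

def distillation_step(student, optimizer, a_ids, b_ids, teacher_scores):
    """Regress the student's pair score onto the (precomputed) score of
    the cross-attentive teacher for the same sentence pair."""
    optimizer.zero_grad()
    loss = F.mse_loss(student(a_ids, b_ids), teacher_scores)
    loss.backward()
    optimizer.step()
    return loss.item()
```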

    Experimental investigation of the trigger problem in magnetic reconnection

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Physics, 2010. This electronic version was submitted by the student author; the certified thesis is available in the Institute Archives and Special Collections. Cataloged from the student-submitted PDF version of the thesis. Includes bibliographical references (p. 145-154). By Noam Karasov Katz, Ph.D.
    Magnetic reconnection is a fundamental process in plasma physics, which involves the often explosive release of magnetically stored energy in both space and laboratory plasmas. In order for this sudden release of energy to occur, there must be a period of slow reconnection, in which magnetic stress accumulates in the system, followed by a quick transition to fast reconnection. The question of what causes this transition is known as the 'trigger problem' and is not well understood. We address the trigger problem using the Versatile Toroidal Facility (VTF) at MIT, which we operate in the strong magnetic guide field regime. The resulting reconnection occurs in spontaneous events, in which there is a transition to fast reconnection. The reconnection in these events is asymmetric: it begins at one toroidal location and propagates toroidally in both directions. The spontaneous onset is facilitated by an interaction between the x-line current channel and a global mode, which breaks axisymmetry. We model the onset using an empirical Ohm's law and current continuity, which is maintained by ion polarization currents associated with the mode. The model reproduces the exponential growth of the reconnection electric field, and the model growth rate agrees well with the experimentally measured growth rate. We begin, however, by discussing reconnection in the collisional regime and the effect of neutral gas on plasma flows. We perform experiments that are relevant to plasmas at the edge of tokamaks, but may also be applicable to reconnection in the solar photosphere and the interstellar medium, where the ionization fraction is low. In these experiments, a plasma filament propagates across a magnetic field in a background of neutral atoms. The filament motion is driven by charge separation in an inhomogeneous magnetic field, and this drive is balanced by collisional damping. The filament propagation and internal structure are described in detail.

    Knowledge is a Region in Weight Space for Fine-tuned Language Models

    Research on neural networks has focused on understanding a single model trained on a single dataset. However, relatively little is known about the relationships between different models, particularly those trained or tested on different datasets. We address this by studying how the weight space and the underlying loss landscape of different models are interconnected. Specifically, we demonstrate that finetuned models that were optimized for high performance reside in well-defined regions in weight space, and vice versa -- any model that resides anywhere in those regions also exhibits high performance. Notably, we show that language models finetuned on the same dataset form a tight cluster in weight space, while models finetuned on different datasets from the same underlying task form a looser cluster. Moreover, traversing the region between models leads to new models that perform comparably to, or even better than, models obtained via finetuning, even on tasks that the original models were not finetuned on. Our findings provide insight into the relationships between models, demonstrating that a model positioned between two similar models can acquire the knowledge of both. We leverage this to design a method for selecting a better model for efficient finetuning. Specifically, we show that starting from the center of the region is as effective as, if not more effective than, using the pretrained model in 11 out of 12 datasets, resulting in an average accuracy improvement of 3.06.
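
A minimal sketch of the weight-space operations the abstract refers to, assuming all models share an architecture and that "traversing the region" and "the center of the region" correspond to parameter-wise interpolation and averaging of finetuned checkpoints; the function names are hypothetical and the exact procedure in the paper may differ.

```python
import torch

def interpolate(model_a, model_b, alpha=0.5):
    """Point on the straight line between two finetuned checkpoints:
    theta = (1 - alpha) * theta_a + alpha * theta_b."""
    sd_a, sd_b = model_a.state_dict(), model_b.state_dict()
    return {k: (1 - alpha) * sd_a[k].float() + alpha * sd_b[k].float()
            for k in sd_a}

def centroid(models):
    """Parameter-wise mean of several finetuned checkpoints, used here
    as a stand-in for 'the center of the region'."""
    sds = [m.state_dict() for m in models]
    return {k: torch.stack([sd[k].float() for sd in sds]).mean(dim=0)
            for k in sds[0]}

# Usage: load the centroid into a fresh copy of the same architecture
# and finetune from there instead of from the pretrained checkpoint:
#   new_model.load_state_dict(centroid(finetuned_models))
```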

    ColD Fusion: Collaborative Descent for Distributed Multitask Finetuning

    We propose a new paradigm for continually evolving pretrained models, which we denote ColD Fusion. It provides the benefits of multitask learning but leverages distributed computation with limited communication and eliminates the need for shared data. Consequently, ColD Fusion can give rise to a synergistic loop in which finetuned models are recycled to continually improve the pretrained model they are based upon. We show that ColD Fusion yields benefits comparable to multitask training by producing a model that (a) attains strong performance on all of the datasets it was trained on and (b) is a better starting point for finetuning on unseen datasets. We show that ColD Fusion outperforms RoBERTa and even previous multitask models. Specifically, when training and testing on 35 diverse datasets, a ColD Fusion-based model outperforms RoBERTa by 2.33 points on average without any changes to the architecture. Comment: ACL 2023.
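
A minimal sketch of one collaborative round as described above, in PyTorch. It assumes the fusion step is plain parameter-wise averaging of the contributors' finetuned weights (the paper's exact fusion operator may differ) and that `finetune_on` is a hypothetical per-contributor training routine; only weights, not data, travel back to the shared model.

```python
import copy
import torch

def fuse(state_dicts):
    """Parameter-wise average of the contributors' finetuned weights
    (assumed here as the fusion operator)."""
    return {k: torch.stack([sd[k].float() for sd in state_dicts]).mean(dim=0)
            for k in state_dicts[0]}

def cold_fusion_round(shared_model, private_datasets, finetune_on):
    """One collaborative round: each contributor finetunes its own copy
    of the current shared model on its private data, only the weights
    are communicated back, and the fused result becomes the next
    shared model."""
    local_states = []
    for dataset in private_datasets:
        local = copy.deepcopy(shared_model)
        finetune_on(local, dataset)          # hypothetical helper
        local_states.append(local.state_dict())
    shared_model.load_state_dict(fuse(local_states))
    return shared_model
```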