
    Electron Energy Loss Microspectroscopy and the Characterization of Solids

    The inelastic scattering of fast electrons provides a detailed means of characterizing the chemical composition and electronic properties of thin samples in an electron microscope. Collective and single-electron excitations occurring in the low-energy region of the spectrum can be described in terms of the generalized dielectric formulation. Important information is contained in this part of the spectrum, but some prior detailed knowledge of the sample is usually required for proper interpretation. The core excitations allow microanalytical information to be obtained, and quantitative procedures are now quite well developed, at least for K and L edges. Sample thickness is one factor that limits the quality of data in energy loss spectra, and it is now possible to remove the effects of plural scattering from core edges as well as from the low-loss spectrum. Several advances in instrumentation have been made recently, both in electron optics and in recording devices. It appears that the detection limits are very low, possibly 1 to 10 atoms in an optimized system. Measurements also show that the core edges offer a sensitive method for probing chemical bonding and electronic structure, provided the energy resolution is sufficient (≤ 1 eV). Of particular interest is the momentum-transfer and orientation dependence of the fine structure for crystalline materials. The transition elements exhibit very sharp features near the L2,3 threshold due to transitions to unoccupied d states, and reasonable agreement with theory is found here. Another type of information can be obtained from the extended fine structure above the core edges (EXELFS), which is capable of yielding the local atomic environment around the different atomic species.
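    The plural-scattering removal mentioned in this abstract is classically done by Fourier-log deconvolution (the textbook method due to Egerton). Below is a minimal numpy sketch of that relation; the function name and arguments are ours, and real spectra would additionally need windowing, noise suppression, and careful zero-loss extraction.

    ```python
    import numpy as np

    def fourier_log_deconvolve(spectrum, zero_loss):
        """Estimate the single-scattering distribution S(E) from a full
        energy-loss spectrum J(E) that includes the zero-loss peak Z(E).

        Textbook Fourier-log relation: J(v) = Z(v) * exp(S(v) / I0),
        where lowercase-v quantities are Fourier transforms and I0 is
        the integrated zero-loss intensity. Illustrative sketch only.
        """
        i0 = zero_loss.sum()                 # total zero-loss intensity I0
        j = np.fft.fft(spectrum)             # J(v)
        z = np.fft.fft(zero_loss)            # Z(v)
        s = i0 * np.log(j / z)               # solve for S(v)
        return np.fft.ifft(s).real           # back to the energy axis
    ```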

    Workshop on Viability of Halophilic Bacteria in Salt Deposits

    The significance of finding viable extreme halophiles in halites associated with Permian-aged sedimentary deposits is considered. Issues related to the microbiology and geochemistry of the halite environment are addressed. Recommendations relating the significance of this phenomenon to NASA's interest in planetary exploration and the early evolution of life are provided.

    Is margin all you need? An extensive empirical study of active learning on tabular data

    Given a labeled training set and a collection of unlabeled data, the goal of active learning (AL) is to identify the best unlabeled points to label. In this comprehensive study, we analyze the performance of a variety of AL algorithms on deep neural networks trained on 69 real-world tabular classification datasets from the OpenML-CC18 benchmark. We consider different data regimes and the effect of self-supervised model pre-training. Surprisingly, we find that the classical margin sampling technique matches or outperforms all others, including the current state of the art, in a wide range of experimental settings. To researchers, we hope to encourage rigorous benchmarking against margin; to practitioners facing tabular data labeling constraints, we suggest that hyperparameter-free margin sampling may often be all they need.
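    For reference, margin sampling itself is a one-liner over predicted class probabilities: score each unlabeled point by the gap between its top two predicted probabilities and label the points with the smallest gaps. A minimal sketch follows (function names ours, assuming a (n_examples, n_classes) probability matrix):

    ```python
    import numpy as np

    def margin_scores(probs):
        """Top-1 minus top-2 predicted probability per example; a small
        margin means the model is torn between two classes, so that
        point is among the most informative to label next."""
        top2 = np.sort(probs, axis=1)[:, -2:]     # two largest probs per row
        return top2[:, 1] - top2[:, 0]

    def select_for_labeling(probs, batch_size):
        """Indices of the `batch_size` unlabeled points with smallest margin."""
        return np.argsort(margin_scores(probs))[:batch_size]
    ```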

    Anchor Points: Benchmarking Models with Much Fewer Examples

    Modern language models often exhibit powerful but brittle behavior, leading to the development of larger and more diverse benchmarks to reliably assess their behavior. Here, we suggest that model performance can be benchmarked and elucidated with much smaller evaluation sets. We first show that in six popular language classification benchmarks, model confidence in the correct class on many pairs of points is strongly correlated across models. We build upon this phenomenon to propose Anchor Point Selection, a technique to select small subsets of datasets that capture model behavior across the entire dataset. Anchor points reliably rank models: across 87 diverse language model-prompt pairs, evaluating models using 1-30 anchor points outperforms uniform sampling and other baselines at accurately ranking models. Moreover, just a few anchor points can be used to estimate model per-class predictions on all other points in a dataset with low mean absolute error, sufficient for gauging where the model is likely to fail. Lastly, we present Anchor Point Maps for visualizing these insights and facilitating comparisons of the performance of different models on various regions within the dataset distribution.
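    The abstract does not spell out the selection procedure, but one plausible way to exploit the cross-model correlation it describes is a farthest-point heuristic over example-example confidence correlations: repeatedly add the example least similar to those already chosen. The sketch below is our illustration, not the paper's algorithm.

    ```python
    import numpy as np

    def pick_anchor_points(conf, k, seed=0):
        """Pick k anchor examples from a (n_models, n_examples) matrix of
        model confidences in the correct class. Examples whose confidence
        profiles correlate strongly across models are redundant, so we
        greedily keep mutually dissimilar representatives (a farthest-point
        heuristic; illustrative, not the paper's exact method)."""
        rng = np.random.default_rng(seed)
        dist = 1.0 - np.corrcoef(conf.T)          # example-example distance
        anchors = [int(rng.integers(conf.shape[1]))]
        while len(anchors) < k:
            d_min = dist[:, anchors].min(axis=1)  # closeness to chosen set
            anchors.append(int(d_min.argmax()))   # farthest from all anchors
        return np.asarray(anchors)
    ```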

    Collective cell migration requires vesicular trafficking for chemoattractant delivery at the trailing edge

    Chemoattractant signaling induces the polarization and directed movement of cells secondary to the activation of multiple effector pathways. In addition, chemotactic signals can be amplified and relayed to proximal cells via the synthesis and secretion of additional chemoattractant. The mechanisms underlying such remarkable features remain ill defined. We show that the asymmetrical distribution of adenylyl cyclase (ACA) at the back of Dictyostelium discoideum cells, an essential determinant of their ability to migrate in a head-to-tail fashion, requires vesicular trafficking. This trafficking results in a local accumulation of ACA-containing intracellular vesicles and involves intact actin and microtubule networks and de novo protein synthesis. We also show that migrating cells leave behind ACA-containing vesicles, likely secreted as multivesicular bodies and presumably involved in the formation of head-to-tail arrays of migrating cells. We propose that similar compartmentalization and shedding mechanisms exist in mammalian cells during embryogenesis, wound healing, neuron growth, and metastasis.

    Beyond neural scaling laws: beating power law scaling via data pruning

    Widely observed neural scaling laws, in which error falls off as a power of the training set size, model size, or both, have driven substantial performance improvements in deep learning. However, these improvements through scaling alone come at considerable cost in compute and energy. Here we focus on the scaling of error with dataset size and show how, in theory, we can break beyond power-law scaling and potentially even reduce it to exponential scaling if we have access to a high-quality data-pruning metric that ranks the order in which training examples should be discarded to achieve any pruned dataset size. We then test this improved scaling prediction with pruned dataset size empirically, and indeed observe better-than-power-law scaling in practice on ResNets trained on CIFAR-10, SVHN, and ImageNet. Next, given the importance of finding high-quality pruning metrics, we perform the first large-scale benchmarking study of ten different data pruning metrics on ImageNet. We find that most existing high-performing metrics scale poorly to ImageNet, while the best are computationally intensive and require labels for every image. We therefore develop a new simple, cheap, and scalable self-supervised pruning metric that demonstrates comparable performance to the best supervised metrics. Overall, our work suggests that the discovery of good data-pruning metrics may provide a viable path forward to substantially improved neural scaling laws, thereby reducing the resource costs of modern deep learning.
    Comment: Outstanding Paper Award @ NeurIPS 2022.
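    A self-supervised pruning metric of the kind the abstract describes can be built by clustering embeddings and scoring each example by its distance to the nearest cluster prototype: examples near a prototype are "easy"/redundant, far ones are "hard", and which end you keep depends on how much data you have. A rough sketch in that spirit follows (the encoder, cluster count, and pruning direction are simplified assumptions, not the paper's exact recipe):

    ```python
    import numpy as np
    from sklearn.cluster import KMeans

    def prototype_distance_scores(embeddings, n_clusters=100, seed=0):
        """Score each example by distance to its nearest k-means centroid
        in (self-supervised) embedding space. Near = easy/redundant,
        far = hard. Sketch in the spirit of the paper's metric."""
        km = KMeans(n_clusters=n_clusters, random_state=seed, n_init=10)
        labels = km.fit_predict(embeddings)
        return np.linalg.norm(embeddings - km.cluster_centers_[labels], axis=1)

    def prune(embeddings, keep_frac, keep_hard=True):
        """Keep a fraction of examples: hard ones when data is plentiful,
        easy ones when data is scarce (per the paper's finding)."""
        order = np.argsort(prototype_distance_scores(embeddings))  # easy -> hard
        n_keep = int(len(order) * keep_frac)
        return order[-n_keep:] if keep_hard else order[:n_keep]
    ```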

    Plex: Towards Reliability using Pretrained Large Model Extensions

    A recent trend in artificial intelligence is the use of pretrained models for language and vision tasks, which have achieved extraordinary performance but also exhibit puzzling failures. Probing these models' abilities in diverse ways is therefore critical to the field. In this paper, we explore the reliability of models, where we define a reliable model as one that not only achieves strong predictive performance but also performs well consistently over many decision-making tasks involving uncertainty (e.g., selective prediction, open set recognition), robust generalization (e.g., accuracy and proper scoring rules such as log-likelihood on in- and out-of-distribution datasets), and adaptation (e.g., active learning, few-shot uncertainty). We devise 10 types of tasks over 40 datasets in order to evaluate different aspects of reliability on both vision and language domains. To improve reliability, we develop ViT-Plex and T5-Plex, pretrained large model extensions for vision and language modalities, respectively. Plex greatly improves the state of the art across reliability tasks and simplifies the traditional protocol, as it improves out-of-the-box performance and does not require designing scores or tuning the model for each task. We demonstrate scaling effects over model sizes up to 1B parameters and pretraining dataset sizes up to 4B examples. We also demonstrate Plex's capabilities on challenging tasks including zero-shot open set recognition, active learning, and uncertainty in conversational language understanding.
    Comment: Code available at https://goo.gle/plex-cod
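    As one concrete instance of the uncertainty tasks listed above, selective prediction can be scored with nothing more than a confidence threshold: predict when the model's top probability clears the threshold, abstain otherwise, and report coverage alongside accuracy on the accepted points. A generic sketch (not Plex-specific; names ours):

    ```python
    import numpy as np

    def selective_prediction(probs, labels, threshold=0.9):
        """Abstain whenever the max class probability < threshold.
        Returns (coverage, accuracy on accepted examples)."""
        conf = probs.max(axis=1)
        accept = conf >= threshold
        if not accept.any():
            return 0.0, float("nan")          # model abstained everywhere
        preds = probs.argmax(axis=1)
        acc = (preds[accept] == labels[accept]).mean()
        return float(accept.mean()), float(acc)
    ```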