44 research outputs found

    Finding the SWEET Spot: Analysis and Improvement of Adaptive Inference in Low Resource Settings

    Adaptive inference is a simple method for reducing inference costs. The method works by maintaining multiple classifiers of different capacities, and allocating resources to each test instance according to its difficulty. In this work, we compare the two main approaches for adaptive inference, Early-Exit and Multi-Model, when training data is limited. First, we observe that for models with the same architecture and size, individual Multi-Model classifiers outperform their Early-Exit counterparts by an average of 2.3%. We show that this gap is caused by Early-Exit classifiers sharing model parameters during training, resulting in conflicting gradient updates of model weights. We find that despite this gap, Early-Exit still provides a better speed-accuracy trade-off due to the overhead of the Multi-Model approach. To address these issues, we propose SWEET (Separating Weights in Early Exit Transformers), an Early-Exit fine-tuning method that assigns each classifier its own set of unique model weights, not updated by other classifiers. We compare SWEET's speed-accuracy curve to standard Early-Exit and Multi-Model baselines and find that it outperforms both methods at fast speeds while maintaining comparable scores to Early-Exit at slow speeds. Moreover, SWEET individual classifiers outperform Early-Exit ones by 1.1% on average. SWEET enjoys the benefits of both methods, paving the way for further reduction of inference costs in NLP. Comment: Proceedings of ACL 202
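The general idea of adaptive inference described in this abstract can be illustrated with a minimal sketch. It assumes a confidence-threshold exit rule (stop at the first classifier whose top softmax probability reaches a threshold), which is one common choice and is not stated in the abstract. The names `adaptive_predict`, `small`, and `large` are hypothetical toy stand-ins, not the SWEET method or the authors' code.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def adaptive_predict(classifiers, x, threshold=0.9):
    """Try classifiers from cheapest to most expensive.

    Exit at the first one whose top probability reaches `threshold`
    (an assumed exit rule); the last classifier always answers.
    Returns (predicted_label, index_of_classifier_used).
    """
    last = len(classifiers) - 1
    for i, clf in enumerate(classifiers):
        probs = softmax(clf(x))
        confidence = max(probs)
        if confidence >= threshold or i == last:
            return probs.index(confidence), i

# Hypothetical toy classifiers: each maps an input to two logits.
def small(x):
    return [0.1 * x, 0.2 * x]

def large(x):
    return [0.0, 5.0 * x]

easy = adaptive_predict([small, large], 100.0)  # small model is confident
hard = adaptive_predict([small, large], 1.0)    # falls through to large model
```

In this toy setup the "easy" input exits at the first classifier and the "hard" input uses the second, which is the per-instance resource allocation the abstract refers to.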

    Creative Industries’ Network of Entrepreneurs: Lessons learned from the offering of an Acceleration Program in Portugal, Spain and Greece, to foster entrepreneurship in CCIs

    The Creative Industries Network of Entrepreneurs (CINet) is a research project in innovation and creative entrepreneurship being implemented within the Lifelong Learning Programme, Leonardo da Vinci, of the European Commission. The CINet project aims at improving business skills for creative entrepreneurs and enhancing the potential for business creation in the creative industries in three Southern European countries (Greece, Portugal, and Spain). To achieve its objectives, CINet brings together six partners (Universidade Aberta, the University of Piraeus, the Open University of Catalonia, UKWON, MediaDeals, and DNA Cascais), with expertise in entrepreneurship research and education provision for potential entrepreneurs. The course was offered in a pilot fashion during the April – June 2015 period, and aimed to support would-be entrepreneurs who wish to start up in the creative sector (including arts and crafts, architecture, gastronomy, leisure, videogames, advertising, press and media, film and audiovisual activities, public relations and publishing industry, among others). After a period of conception, development and testing, this acceleration program was offered in Portugal, Spain and Greece in three different modalities: face-to-face in Greece; bLearning in Portugal; and eLearning in Spain.

    How Much Does Attention Actually Attend? Questioning the Importance of Attention in Pretrained Transformers

    The attention mechanism is considered the backbone of the widely-used Transformer architecture. It contextualizes the input by computing input-specific attention matrices. We find that this mechanism, while powerful and elegant, is not as important as typically thought for pretrained language models. We introduce PAPA, a new probing method that replaces the input-dependent attention matrices with constant ones -- the average attention weights over multiple inputs. We use PAPA to analyze several established pretrained Transformers on six downstream tasks. We find that without any input-dependent attention, all models achieve competitive performance -- an average relative drop of only 8% from the probing baseline. Further, little or no performance drop is observed when replacing half of the input-dependent attention matrices with constant (input-independent) ones. Interestingly, we show that better-performing models lose more from applying our method than weaker models, suggesting that the utilization of the input-dependent attention mechanism might be a factor in their success. Our results motivate research on simpler alternatives to input-dependent attention, as well as on methods for better utilization of this mechanism in the Transformer architecture. Comment: Findings of EMNLP 202
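The core substitution described in this abstract, replacing input-dependent attention matrices with their average over several inputs, can be sketched as follows. This is a toy illustration under stated assumptions: a single attention head, a fixed sequence length, scaled dot-product attention, and random vectors in place of real model activations. All function names are hypothetical, and this is not the authors' PAPA implementation.

```python
import math
import random

def attention_weights(queries, keys):
    """Row-wise softmax of scaled dot products: one weight row per query token."""
    scale = math.sqrt(len(queries[0]))
    rows = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        rows.append([e / total for e in exps])
    return rows

def average_matrices(matrices):
    """Element-wise mean of equally shaped matrices."""
    n = len(matrices)
    return [[sum(m[i][j] for m in matrices) / n
             for j in range(len(matrices[0][0]))]
            for i in range(len(matrices[0]))]

def apply_attention(weights, values):
    """Mix value vectors with a (possibly constant) attention matrix."""
    return [[sum(w * v[d] for w, v in zip(row, values))
             for d in range(len(values[0]))]
            for row in weights]

def random_matrix(rng, rows, cols):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
seq_len, dim, n_inputs = 4, 8, 16  # assumed toy sizes

# Input-dependent attention matrices for several (random) inputs.
per_input = [attention_weights(random_matrix(rng, seq_len, dim),
                               random_matrix(rng, seq_len, dim))
             for _ in range(n_inputs)]

# Constant matrix: the average over inputs, reused for any new input.
constant_attn = average_matrices(per_input)
values = random_matrix(rng, seq_len, dim)
output = apply_attention(constant_attn, values)
```

Because each per-input matrix has rows that sum to 1, their average does too, so the constant matrix is still a valid attention distribution per token.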

    Temporal pattern of C1q deposition after transient focal cerebral ischemia

    Recent studies have focused on elucidating the contribution of individual complement proteins to post-ischemic cellular injury. As the timing of complement activation and deposition after cerebral ischemia is not well understood, our study investigates the temporal pattern of C1q accumulation after experimental murine stroke. Brains were harvested from mice subjected to transient focal cerebral ischemia at 3, 6, 12, and 24 hr post reperfusion. Western blotting and light microscopy were employed to determine the temporal course of C1q protein accumulation and correlate this sequence with infarct evolution observed with TTC staining. Confocal microscopy was utilized to further characterize the cellular localization and characteristics of C1q deposition. Western blot analysis showed that C1q protein begins to accumulate in the ischemic hemisphere between 3 and 6 hr post-ischemia. Light microscopy confirmed these findings, showing concurrent C1q protein staining of neurons. Confocal microscopy demonstrated co-localization of C1q protein with neuronal cell bodies as well as necrotic cellular debris. These experiments demonstrate the accumulation of C1q protein on neurons during the period of greatest infarct evolution. These data provide information regarding the optimal time window during which a potentially neuroprotective anti-C1q strategy is most likely to achieve therapeutic success. © 2006 Wiley-Liss, Inc. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/50651/1/20775_ftp.pd

    Efficient Methods for Natural Language Processing: A Survey

    Recent work in natural language processing (NLP) has yielded appealing results from scaling model parameters and training data; however, using only scale to improve performance means that resource consumption also grows. Such resources include data, time, storage, or energy, all of which are naturally limited and unevenly distributed. This motivates research into efficient methods that require fewer resources to achieve similar results. This survey synthesizes and relates current methods and findings in efficient NLP. We aim both to provide guidance for conducting NLP under limited resources and to point towards promising research directions for developing more efficient methods. Comment: Accepted at TACL, pre publication versio

    High-intensity interval training and moderate-intensity continuous training in adults with Crohn’s disease: a pilot randomised controlled trial

    Background: This study assessed the feasibility and acceptability of two common types of exercise training, high-intensity interval training (HIIT) and moderate-intensity continuous training (MICT), in adults with Crohn’s disease (CD). Methods: In this mixed-methods pilot trial, participants with quiescent or mildly-active CD were randomly assigned 1:1:1 to HIIT, MICT or usual care control, and followed up for 6 months. The HIIT and MICT groups were offered three exercise sessions per week for the first 12 weeks. Feasibility outcomes included rates of recruitment, retention, outcome completion, and exercise attendance. Data were collected on cardiorespiratory fitness (e.g., peak oxygen uptake), disease activity, fatigue, quality of life, adverse events, and intervention acceptability (via interviews). Results: Over 17 months, 53 patients were assessed for eligibility and 36 (68%) were randomised (47% male; mean age 36.9 [SD 11.2] years); 13 to HIIT, 12 to MICT, and 11 to control. The exercise session attendance rate was 62% for HIIT (288/465) and 75% for MICT (320/429), with 62% of HIIT participants (8/13) and 67% of MICT participants (8/12) completing at least 24 of 36 sessions. One participant was lost to follow-up. Outcome completion rates ranged from 89 to 97%. The mean increase in peak oxygen uptake, relative to control, was greater following HIIT than MICT (2.4 vs. 0.7 mL/kg/min). There were three non-serious exercise-related adverse events, and two exercise participants experienced disease relapse during follow-up. Conclusions: The findings support the feasibility and acceptability of the exercise programmes and trial procedures. A definitive trial is warranted. Physical exercise remains a potentially useful adjunct therapy in CD.

    Mapping soil health over large agriculturally important areas

    Soil health deterioration due to intensive agricultural activity is a worldwide problem. To better understand this process, there is a prime need to map soil health over wide areas. This paper aims to quantify soil health in a spatially explicit manner over a large area using soil health indicators. The methodology includes sampling design, autocorrelation analysis and Kriging interpolation. The following variables were measured from vertisol clayey soils: aggregate stability (AS); available water capacity (AWC); surface and subsurface penetration resistance (PR15 and PR45 respectively); root health (RH); organic matter (OM); pH; electrical conductivity (EC); cation-exchange capacity (CEC); exchangeable K; nitrification potential (Np); and P. Stratified random sampling was found to be a more efficient method than random sampling for representing a large area with a limited number of sampling locations. The variogram envelope method was found to be more conservative in determining the significance of autocorrelation than the classical Moran’s I approach. Phosphorus, CEC, PR15, EC, and K exhibited strong autocorrelation in space; other variables showed no autocorrelation. Land management factors were found to control the spatial variability of most soil variables. Kriging with an external drift (KED) was found to be the most useful approach for spatial prediction of soil health. A positive correlation was found between the interpolated soil health index and NDVI (Normalized Difference Vegetation Index). These results suggest that soil health maps can be used to explore how cultivation activities limit crop yields at the catchment scale, and to determine whether these activities create distinctive soil characteristics.
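For readers unfamiliar with the classical Moran's I statistic mentioned in this abstract, a minimal sketch of its standard definition follows. The spatial weights here are a hypothetical chain of sampling points where only adjacent points are neighbours; the study's actual weighting scheme and data are not given in the abstract, so the inputs below are made-up toy values.

```python
def morans_i(values, weights):
    """Moran's I = (N / W) * sum_ij w_ij * d_i * d_j / sum_i d_i^2,
    where d_i is the deviation from the mean and W is the sum of all weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    total_weight = sum(sum(row) for row in weights)
    return (n / total_weight) * cross / sum(d * d for d in dev)

def chain_weights(n):
    """Assumed toy neighbourhood: points on a line, adjacent points weighted 1."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
            for i in range(n)]

w = chain_weights(4)
clustered = morans_i([0.0, 0.0, 1.0, 1.0], w)    # similar values are neighbours
alternating = morans_i([1.0, 0.0, 1.0, 0.0], w)  # dissimilar values are neighbours
```

Positive values indicate that nearby samples tend to be similar (spatial autocorrelation), while negative values indicate that neighbours tend to differ; judging significance requires a further test, which this sketch does not cover.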