
    An energy-based comparative analysis of common approaches to text classification in the Legal domain

    Most Machine Learning research evaluates the best solutions in terms of performance. However, in the race for the best-performing model, many important aspects are often overlooked when, on the contrary, they should be carefully considered. In fact, the gaps in performance between different approaches are sometimes negligible, whereas factors such as production costs, energy consumption, and carbon footprint must be taken into consideration. Large Language Models (LLMs) are extensively adopted to address NLP problems in academia and industry. In this work, we present a detailed quantitative comparison of LLM and traditional approaches (e.g. SVM) on the LexGLUE benchmark, which takes into account both performance (standard indices) and alternative metrics such as timing, power consumption, and cost: in a word, the carbon footprint. In our analysis, we considered the prototyping phase (model selection by training-validation-test iterations) and the in-production phase separately, since they follow different implementation procedures and also require different resources. The results indicate that, very often, the simplest algorithms achieve performance very close to that of large LLMs but with very low power consumption and lower resource demands. These results may encourage companies to include additional evaluations when choosing Machine Learning (ML) solutions. Comment: Accepted at The 4th International Conference on NLP & Text Mining (NLTM 2024), January 27-28 2024, Copenhagen, Denmark - 12 pages, 1 figure, 7 tables
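    The kind of cost accounting this abstract argues for can be sketched with simple arithmetic: energy in kWh is average power draw times duration, and emissions are energy times the grid's carbon intensity. The power-draw and intensity figures below are illustrative assumptions, not values from the paper.

    ```python
    # Back-of-the-envelope energy and carbon accounting for a training run.
    # All numeric figures here are hypothetical, for illustration only.

    def energy_kwh(avg_power_watts: float, duration_hours: float) -> float:
        """Energy consumed, in kilowatt-hours."""
        return avg_power_watts * duration_hours / 1000.0

    def carbon_grams(kwh: float, grid_intensity_g_per_kwh: float) -> float:
        """CO2-equivalent emissions, in grams."""
        return kwh * grid_intensity_g_per_kwh

    # Hypothetical prototyping comparison: fine-tuning an LLM on a 300 W GPU
    # for 8 hours vs. training a linear SVM on a 65 W CPU for half an hour.
    llm_kwh = energy_kwh(300, 8)    # 2.4 kWh
    svm_kwh = energy_kwh(65, 0.5)   # 0.0325 kWh

    INTENSITY = 400  # g CO2 per kWh, an assumed grid average
    print(f"LLM: {llm_kwh:.3f} kWh, {carbon_grams(llm_kwh, INTENSITY):.0f} g CO2")
    print(f"SVM: {svm_kwh:.4f} kWh, {carbon_grams(svm_kwh, INTENSITY):.1f} g CO2")
    ```

    Under these assumed figures the LLM run consumes roughly 70x the energy of the SVM run, which is the order-of-magnitude gap the abstract suggests should enter model-selection decisions.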

    Research and Applications of the Processes of Performance Appraisal: A Bibliography of Recent Literature, 1981-1989

    [Excerpt] There have been several recent reviews of different subtopics within the general performance appraisal literature. The reader of these reviews will find, however, that the accompanying citations may be of limited utility for one or more reasons. For example, the reference sections of these reviews are usually composed of citations which support a specific theory or practical approach to the evaluation of human performance. Consequently, the citation lists for these reviews are, as they must be, highly selective and do not include works that may have only a peripheral relationship to a given reviewer's target concerns. Another problem is that the citations are out of date. That is, review articles frequently contain many citations that are fifteen or more years old. The generation of new studies and knowledge in this field occurs very rapidly. This creates a need for additional reference information solely devoted to identifying the wealth of new research, ideas, and writing that is changing the field.

    Judging Competency: A study of in-training evaluation of veterinary students: A thesis presented in partial fulfilment of the requirements for the degree of Doctor of Education at Massey University, Manawatū, New Zealand

    Listed in 2016 Dean's List of Exceptional Theses.

    In-training evaluations are a common but highly criticised method of assessing the competency of veterinary students completing training. They involve assessment of on-going performance in the workplace, performed by the supervisor. They are highly feasible and one of the few ways that a student's performance in an authentic context can be evaluated. Psychometric research has suggested, however, that in-training evaluations are unreliable, do not discriminate aspects of performance, and do not predict performance on other assessments, casting doubt on the credibility of scores. Research on rater judgement processes suggests, in contrast, that multiple aspects are discriminated and that accounting for context and inferred reasons for behaviour contributes to rater variability. Very little research has considered in-training evaluation in a veterinary context. In a mixed-method study, this research investigated how well the in-training evaluation used during clinical placements in one veterinary school captured the aspects of student performance it was designed to capture. It explored the supervisor's view of student performance, and how that related to the dimensions being assessed in in-training evaluation and to the constructs of competency articulated in frameworks. Complementary research strands involved analysis of semi-structured interviews with supervisors, common factor analysis of in-training evaluation scores, ordinal logistic regression relating factors to overall judgement, and thematic comparisons of findings with competency frameworks. Together, the nature of what supervisors considered, the dimensional structure of scores, and the relationship of dimensions with the overall judgement suggested that the in-training evaluation is both holistic and discriminating, and that important aspects of performance are student engagement and trustworthiness.
    The aspects captured by the evaluation aligned well with the design of the instrument, and generally well with the veterinary competency frameworks. However, some areas were highlighted where concepts of veterinary competency, and the competencies required in different subdisciplines, need further consideration by the profession. The findings give insights into the process of judgement of competency by veterinary supervisors that will inform further research. They support some aspects of a validity argument in relation to scoring processes, and inform the design of evaluation instruments by underscoring the construct-relevance of interrelated dimensions.

    Practical Bayesian Optimization of Machine Learning Algorithms

    Machine learning algorithms frequently require careful tuning of model hyperparameters, regularization terms, and optimization parameters. Unfortunately, this tuning is often a "black art" that requires expert experience, unwritten rules of thumb, or sometimes brute-force search. Much more appealing is the idea of developing automatic approaches which can optimize the performance of a given learning algorithm to the task at hand. In this work, we consider the automatic tuning problem within the framework of Bayesian optimization, in which a learning algorithm's generalization performance is modeled as a sample from a Gaussian process (GP). The tractable posterior distribution induced by the GP leads to efficient use of the information gathered by previous experiments, enabling optimal choices about what parameters to try next. Here we show how the effects of the Gaussian process prior and the associated inference procedure can have a large impact on the success or failure of Bayesian optimization. We show that thoughtful choices can lead to results that exceed expert-level performance in tuning machine learning algorithms. We also describe new algorithms that take into account the variable cost (duration) of learning experiments and that can leverage the presence of multiple cores for parallel experimentation. We show that these proposed algorithms improve on previous automatic procedures and can reach or surpass human expert-level optimization on a diverse set of contemporary algorithms including latent Dirichlet allocation, structured SVMs and convolutional neural networks.
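    The loop this abstract describes, a GP surrogate whose posterior guides the choice of the next experiment, can be sketched in miniature. The following is a toy 1D maximization with an RBF-kernel GP and the expected-improvement acquisition, not the paper's algorithm; the candidate grid, kernel length-scale, and objective function are all illustrative assumptions.

    ```python
    import math

    def rbf(x1, x2, length=0.2):
        """Squared-exponential kernel for a 1D input."""
        return math.exp(-((x1 - x2) ** 2) / (2 * length * length))

    def solve(A, b):
        """Solve A x = b by Gaussian elimination with partial pivoting."""
        n = len(A)
        M = [row[:] + [b[i]] for i, row in enumerate(A)]
        for col in range(n):
            piv = max(range(col, n), key=lambda r: abs(M[r][col]))
            M[col], M[piv] = M[piv], M[col]
            for r in range(col + 1, n):
                f = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= f * M[col][c]
        x = [0.0] * n
        for i in range(n - 1, -1, -1):
            x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
        return x

    def gp_posterior(X, y, x_star, length=0.2):
        """GP posterior mean and std-dev at x_star given observations (X, y)."""
        n = len(X)
        K = [[rbf(X[i], X[j], length) + (1e-6 if i == j else 0.0)
              for j in range(n)] for i in range(n)]
        k_s = [rbf(x_star, xi, length) for xi in X]
        alpha = solve(K, y)                       # K^-1 y
        v = solve(K, k_s)                         # K^-1 k*
        mu = sum(k_s[i] * alpha[i] for i in range(n))
        var = rbf(x_star, x_star, length) - sum(k_s[i] * v[i] for i in range(n))
        return mu, math.sqrt(max(var, 1e-12))

    def expected_improvement(mu, sigma, best):
        """EI for maximization: E[max(f - best, 0)] under the GP posterior."""
        z = (mu - best) / sigma
        Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2)))
        phi = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
        return (mu - best) * Phi + sigma * phi

    def bayes_opt(f, n_iter=8):
        """Maximize f over a fixed grid on [0, 1] by GP + EI."""
        grid = [i / 20 for i in range(21)]
        X = [0.0, 1.0]                  # two initial "experiments"
        y = [f(x) for x in X]
        for _ in range(n_iter):
            best = max(y)
            scores = [-1.0 if x in X else
                      expected_improvement(*gp_posterior(X, y, x), best)
                      for x in grid]
            x_next = grid[max(range(len(grid)), key=lambda i: scores[i])]
            X.append(x_next)
            y.append(f(x_next))
        i_best = max(range(len(y)), key=lambda i: y[i])
        return X[i_best], y[i_best]

    # Toy objective with its maximum at x = 0.3.
    best_x, best_y = bayes_opt(lambda x: -(x - 0.3) ** 2)
    print(f"best x = {best_x}, f(x) = {best_y:.4f}")
    ```

    Recomputing the full GP posterior at every grid point is wasteful but keeps the sketch short; the paper's contribution is precisely in making choices like the prior, kernel, and acquisition robust enough to outperform expert manual tuning.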

    A comparison of the comprehension of procedural information using computer and hard-copy media

    Users of technical procedures must be able to understand the documents in order to use them to perform their work. As more companies contemplate putting their procedures on-line, it is important to know whether computer systems will be as effective as traditional hard-copy presentation in communicating procedures to the employees who must use them.

    To determine whether there is a relationship between computer usage and the comprehension of technical procedures, an experiment was conducted among employees of a scientific and technical company in Las Vegas, Nevada. A control group read and demonstrated its comprehension of hard-copy procedures only, while an experimental group read and demonstrated its comprehension of a hard-copy and then an on-line procedure.

    The experimental group selected fewer correct answers on a comprehension test for the on-line than for the hard-copy procedure. This suggests that when readers accustomed to the hard-copy medium switch to the computer medium, comprehension decreases.