An energy-based comparative analysis of common approaches to text classification in the Legal domain
Most Machine Learning research evaluates the best solutions in terms of
performance. However, in the race for the best performing model, many important
aspects are often overlooked when, on the contrary, they should be carefully
considered. In fact, the gaps in performance between different approaches are
sometimes negligible, whereas factors such as production costs, energy
consumption, and carbon footprint must be taken into account. Large Language
Models (LLMs) are extensively adopted to address NLP problems in academia and
industry. In this work, we present a detailed quantitative comparison of LLM
and traditional approaches (e.g. SVM) on the LexGLUE benchmark, which takes
into account both performance (standard indices) and alternative metrics such
as timing, power consumption, and cost: in a word, the carbon footprint. In our
analysis, we considered the prototyping phase (model selection by
training-validation-test iterations) and in-production phases separately, since
they follow different implementation procedures and also require different
resources. The results indicate that very often, the simplest algorithms
achieve performance very close to that of large LLMs but with very low power
consumption and lower resource demands. These results may encourage companies
to include such additional evaluations when choosing Machine Learning (ML)
solutions.
Comment: Accepted at The 4th International Conference on NLP & Text Mining
(NLTM 2024), January 27-28 2024, Copenhagen, Denmark - 12 pages, 1 figure, 7
tables
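The cost accounting this abstract describes can be made concrete with a toy sketch: time a workload, then convert elapsed time into energy via an assumed average power draw. All names, workloads, and wattage figures below are hypothetical stand-ins for illustration, not measurements from the paper.

```python
import time

# Hypothetical average power draw (watts) per setup; illustrative only
ASSUMED_POWER_W = {"svm-like": 65.0, "llm-like": 300.0}

def timed(fn):
    """Return wall-clock seconds taken by fn()."""
    t0 = time.perf_counter()
    fn()
    return time.perf_counter() - t0

def small_workload():
    # Stand-in for a lightweight classical-model training step
    return sum(i * i for i in range(100_000))

def large_workload():
    # Stand-in for a much heavier LLM fine-tuning step (10x the work)
    return sum(i * i for i in range(1_000_000))

results = {}
for name, fn in [("svm-like", small_workload), ("llm-like", large_workload)]:
    secs = timed(fn)
    # energy (Wh) = average power (W) * time (h)
    results[name] = {"seconds": secs, "wh": ASSUMED_POWER_W[name] * secs / 3600.0}
    print(f"{name}: {secs:.4f} s, ~{results[name]['wh']:.6f} Wh")
```

In practice one would measure power rather than assume it (e.g. with hardware counters or a carbon-tracking library), but the arithmetic — power times time, summed over prototyping and production runs — is the same.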
Research and Applications of the Processes of Performance Appraisal: A Bibliography of Recent Literature, 1981-1989
[Excerpt] There have been several recent reviews of different subtopics within the general performance appraisal literature. The reader of these reviews will find, however, that the accompanying citations may be of limited utility for one or more reasons. For example, the reference sections of these reviews are usually composed of citations which support a specific theory or practical approach to the evaluation of human performance. Consequently, the citation lists for these reviews are, as they must be, highly selective and do not include works that may have only a peripheral relationship to a given reviewer's target concerns. Another problem is that the citations are out of date. That is, review articles frequently contain many citations that are fifteen or more years old. The generation of new studies and knowledge in this field occurs very rapidly. This creates a need for additional reference information solely devoted to identifying the wealth of new research, ideas, and writing that is changing the field.
Judging Competency A study of in-training evaluation of veterinary students : A thesis presented in partial fulfilment of the requirements for the degree of Doctor of Education at Massey University, Manawatū, New Zealand
Listed in 2016 Dean's List of Exceptional Theses. In-training evaluations are a common but highly criticised method of assessing the competency of veterinary students completing training. They involve assessment of on-going performance in the workplace, performed by the supervisor. They are highly feasible and one of the few ways that a student’s performance in an authentic context can be evaluated. Psychometric research has suggested, however, that in-training evaluations are unreliable, do not discriminate aspects of performance, and do not predict performance on other assessments, casting doubt on the credibility of scores. Research on rater judgement processes suggests, in contrast, that multiple aspects are discriminated and that accounting for context and inferred reasons for behaviour contributes to rater variability. Very little research has considered in-training evaluation in a veterinary context.
In a mixed method study this research investigated how well the in-training evaluation used during clinical placements in one veterinary school captured the aspects of student performance it was designed to capture. It explored the supervisor’s view of student performance, and how that related to the dimensions being assessed in in-training evaluation, and to the constructs of competency articulated in frameworks. Complementary research strands involved analysis of semi-structured interviews with supervisors, common factor analysis of in-training evaluation scores, ordinal logistic regression relating factors to overall judgement, and thematic comparisons of findings with competency frameworks.
Together, the nature of what supervisors considered, the dimensional structure of scores, and the relationship of dimensions with the overall judgement suggested that the in-training evaluation is both holistic and discriminating, and that important aspects of performance are student engagement and trustworthiness. The aspects captured by the evaluation aligned well with the design of the instrument, and generally well with the veterinary competency frameworks. However, some areas were highlighted where concepts of veterinary competency and the competencies required in different subdisciplines need further consideration by the profession. The findings give insights into the process of judgement of competency by veterinary supervisors that will inform further research. They support some aspects of a validity argument in relation to scoring processes, and inform the design of evaluation instruments by underscoring the construct-relevance of interrelated dimensions
Practical Bayesian Optimization of Machine Learning Algorithms
Machine learning algorithms frequently require careful tuning of model
hyperparameters, regularization terms, and optimization parameters.
Unfortunately, this tuning is often a "black art" that requires expert
experience, unwritten rules of thumb, or sometimes brute-force search. Much
more appealing is the idea of developing automatic approaches which can
optimize the performance of a given learning algorithm to the task at hand. In
this work, we consider the automatic tuning problem within the framework of
Bayesian optimization, in which a learning algorithm's generalization
performance is modeled as a sample from a Gaussian process (GP). The tractable
posterior distribution induced by the GP leads to efficient use of the
information gathered by previous experiments, enabling optimal choices about
what parameters to try next. Here we show how the effects of the Gaussian
process prior and the associated inference procedure can have a large impact on
the success or failure of Bayesian optimization. We show that thoughtful
choices can lead to results that exceed expert-level performance in tuning
machine learning algorithms. We also describe new algorithms that take into
account the variable cost (duration) of learning experiments and that can
leverage the presence of multiple cores for parallel experimentation. We show
that these proposed algorithms improve on previous automatic procedures and can
reach or surpass human expert-level optimization on a diverse set of
contemporary algorithms including latent Dirichlet allocation, structured SVMs
and convolutional neural networks
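The loop this abstract describes — model the objective with a Gaussian process, use the tractable posterior to decide which hyperparameter to evaluate next — can be sketched in a few dozen lines. The kernel choice, the expected-improvement acquisition, and all constants below are illustrative assumptions, not the paper's actual configuration.

```python
import math
import numpy as np

def rbf_kernel(a, b, length=0.3, var=1.0):
    # Squared-exponential kernel between two 1-D point arrays
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(X, y, Xs, noise=1e-5):
    # GP posterior mean/std at query points Xs given observations (X, y)
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    Ks = rbf_kernel(X, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = rbf_kernel(Xs, Xs).diagonal() - np.sum(v ** 2, axis=0)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, best):
    # EI for maximization: E[max(f - best, 0)] under the GP posterior
    z = (mu - best) / sigma
    cdf = 0.5 * (1.0 + np.array([math.erf(t / math.sqrt(2)) for t in z]))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2 * math.pi)
    return (mu - best) * cdf + sigma * pdf

def objective(x):
    # Hypothetical black-box "validation score"; true optimum at x = 0.7
    return -(x - 0.7) ** 2

X = np.array([0.1, 0.5, 0.9])        # initial design
y = objective(X)
grid = np.linspace(0.0, 1.0, 101)    # candidate hyperparameter values

for _ in range(5):                   # BO loop: fit GP, maximize EI, evaluate
    mu, sigma = gp_posterior(X, y, grid)
    x_next = grid[np.argmax(expected_improvement(mu, sigma, y.max()))]
    X = np.append(X, x_next)
    y = np.append(y, objective(x_next))

best_x = X[np.argmax(y)]
print(f"best x after 5 BO evaluations: {best_x:.2f}")
```

The paper's contributions go beyond this sketch — integrating out the GP hyperparameters, modeling evaluation cost, and parallelizing across cores — but the posterior-then-acquisition structure above is the core mechanism.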
A comparison of the comprehension of procedural information using computer and hard-copy media
Users of technical procedures must be able to understand the documents to use them to perform their work. As more companies contemplate putting their procedures on-line, it is important to know whether computer systems will be as effective as traditional hard-copy presentation in communicating procedures to the employees who must use them. To determine whether there is a relationship between computer usage and the comprehension of technical procedures, an experiment was conducted among employees of a scientific and technical company in Las Vegas, Nevada. A control group read and demonstrated its comprehension of hard-copy procedures only, while an experimental group read and demonstrated its comprehension of a hard-copy and then an on-line procedure. The experimental group selected fewer correct answers on a comprehension test for the on-line than for the hard-copy procedure. This suggests that when readers accustomed to the hard-copy medium switch to the computer medium, comprehension decreases