    The standard error of measurement is a more appropriate measure of quality for postgraduate medical assessments than is reliability: an analysis of MRCP(UK) examinations

    Background: Cronbach's alpha is widely used as the preferred index of reliability for medical postgraduate examinations. A value of 0.8-0.9 is seen by providers and regulators alike as an adequate demonstration of acceptable reliability for any assessment. Of the other statistical parameters, Standard Error of Measurement (SEM) is mainly seen as useful only in determining the accuracy of a pass mark. However, the alpha coefficient depends both on SEM and on the ability range (standard deviation, SD) of candidates taking an exam. This study investigated the extent to which the necessarily narrower ability range in candidates taking the second of the three-part MRCP(UK) diploma examinations biases assessment of reliability and SEM. Methods: a) The interrelationships of standard deviation (SD), SEM and reliability were investigated in a Monte Carlo simulation of 10,000 candidates taking a postgraduate examination. b) Reliability and SEM were studied in the MRCP(UK) Part 1 and Part 2 Written Examinations from 2002 to 2008. c) Reliability and SEM were studied in eight Specialty Certificate Examinations introduced in 2008-9. Results: The Monte Carlo simulation showed, as expected, that restricting the range of an assessment only to those who had already passed it dramatically reduced the reliability but did not affect the SEM of a simulated assessment. The analysis of the MRCP(UK) Part 1 and Part 2 written examinations showed that the MRCP(UK) Part 2 written examination had a lower reliability than the Part 1 examination, but, despite that lower reliability, the Part 2 examination also had a smaller SEM (indicating a more accurate assessment). The Specialty Certificate Examinations had small Ns and, as a result, wide variability in their reliabilities, but SEMs were comparable with MRCP(UK) Part 2. Conclusions: An emphasis upon assessing the quality of assessments primarily in terms of reliability alone can produce a paradoxical and distorted picture, particularly in the situation where a narrower range of candidate ability is an inevitable consequence of being able to take a second part examination only after passing the first part examination. Reliability also shows problems when numbers of candidates in examinations are low and sampling error affects the range of candidate ability. SEM is not subject to such problems; it is therefore a better measure of the quality of an assessment and is recommended for routine use.
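
    The range-restriction effect described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes a classical test theory model (observed score = true ability + independent normal error), estimates reliability as the correlation between two parallel forms rather than Cronbach's alpha, and computes SEM as SD × sqrt(1 − reliability). The function names, the ability and error SDs, and the pass mark are all hypothetical. Candidates are selected on a separate earlier sitting, which is one simple way to model "already passed the first part".

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation of two equal-length score lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def reliability_and_sem(form_a, form_b):
    """Parallel-forms reliability and SEM = SD * sqrt(1 - reliability)."""
    r = pearson(form_a, form_b)
    sem = statistics.stdev(form_a) * (1.0 - r) ** 0.5
    return {"reliability": r, "sem": sem}


def run_simulation(n=10_000, true_sd=10.0, error_sd=4.0, pass_mark=50.0, seed=0):
    """Compare reliability and SEM for all candidates vs. prior passers.

    All numeric defaults are illustrative assumptions, not values
    taken from the study.
    """
    rng = random.Random(seed)
    ability = [rng.gauss(50.0, true_sd) for _ in range(n)]
    # Earlier sitting used only to decide who may proceed.
    earlier = [t + rng.gauss(0.0, error_sd) for t in ability]
    # Two parallel forms of the assessment being evaluated.
    form_a = [t + rng.gauss(0.0, error_sd) for t in ability]
    form_b = [t + rng.gauss(0.0, error_sd) for t in ability]

    full = reliability_and_sem(form_a, form_b)

    passed = [i for i in range(n) if earlier[i] >= pass_mark]
    restricted = reliability_and_sem(
        [form_a[i] for i in passed], [form_b[i] for i in passed]
    )
    return full, restricted


if __name__ == "__main__":
    full, restricted = run_simulation()
    print("all candidates:", full)
    print("prior passers: ", restricted)
```

    Under these assumptions the restricted group shows a visibly lower reliability, while the SEM stays close to the simulated error SD in both groups, which is the pattern the abstract reports.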

    Grip and muscle strength dynamometry in acute burn injury: Evaluation of an updated assessment protocol

    External stabilization is reported to improve reliability of hand held dynamometry, yet this has not been tested in burns. We aimed to assess the reliability of dynamometry using an external system of stabilization in people with moderate burn injury and explore construct validity of strength assessment using dynamometry. Participants were assessed on muscle and grip strength three times on each side. Assessment occurred three times per week for up to four weeks. Within session reliability was assessed using intraclass correlations calculated for within session data grouped prior to surgery, immediately after surgery and in the sub-acute phase of injury. Minimum detectable differences were also calculated. In the same timeframe categories, construct validity was explored using regression analysis incorporating burn severity and demographic characteristics. Thirty-eight participants with total burn surface area 5 – 40% were recruited. Reliability was determined to be clinically applicable for the assessment method (intraclass correlation coefficient >0.75) at all phases after injury. Muscle strength was associated with sex and burn location during injury and wound healing. Burn size in the immediate period after surgery and age in the sub-acute phase of injury were also associated with muscle strength assessment results. Hand held dynamometry is a reliable assessment tool for evaluating within session muscle strength in the acute and sub-acute phase of injury in burns up to 40% total burn surface area. External stabilization may assist to eliminate reliability issues related to patient and assessor strength.

    The Telehealth Skills, Training, and Implementation Project: An evaluation protocol


    Reliability and risk assessment of structures

    Development of reliability and risk assessment of structural components and structures is a major activity at Lewis Research Center. It consists of five program elements: (1) probabilistic loads; (2) probabilistic finite element analysis; (3) probabilistic material behavior; (4) assessment of reliability and risk; and (5) probabilistic structural performance evaluation. Recent progress includes: (1) the evaluation of the various uncertainties in terms of cumulative distribution functions for various structural response variables based on known or assumed uncertainties in primitive structural variables; (2) evaluation of the failure probability; (3) reliability and risk-cost assessment; and (4) an outline of an emerging approach for eventual certification of man-rated structures by computational methods. Collectively, the results demonstrate that the structural durability/reliability of man-rated structural components and structures can be effectively evaluated by using formal probabilistic methods
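
    The "evaluation of the failure probability" from uncertain primitive variables can be illustrated with a minimal Monte Carlo sketch. This is not the Lewis Research Center methodology. It assumes a single limit state (failure when load exceeds resistance) with independent, normally distributed resistance and load, and every number and function name below is hypothetical. For this special case a closed-form answer exists, so the sampled estimate can be checked against it.

```python
import random
from statistics import NormalDist


def failure_probability_mc(n=200_000, seed=0,
                           r_mean=300.0, r_sd=30.0,
                           s_mean=200.0, s_sd=40.0):
    """Estimate P(resistance - load < 0) by direct sampling.

    Resistance R and load S are assumed independent and normal;
    the parameter values are illustrative only.
    """
    rng = random.Random(seed)
    failures = sum(
        1 for _ in range(n)
        if rng.gauss(r_mean, r_sd) - rng.gauss(s_mean, s_sd) < 0.0
    )
    return failures / n


def failure_probability_exact(r_mean=300.0, r_sd=30.0,
                              s_mean=200.0, s_sd=40.0):
    """Closed form for the normal case: P_f = Phi(-beta)."""
    # beta is the reliability index: mean margin over margin SD.
    beta = (r_mean - s_mean) / (r_sd ** 2 + s_sd ** 2) ** 0.5
    return NormalDist().cdf(-beta)


if __name__ == "__main__":
    print("Monte Carlo:", failure_probability_mc())
    print("Closed form:", failure_probability_exact())
```

    Direct sampling is used here only because it is the simplest option to read. Real structural models with many variables and small failure probabilities generally need more efficient techniques than this.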

    Reliability verification of an existing reinforced concrete slab

    The submitted contribution provides background information on the principles accepted in the CEN Technical Specification (TS). The application of the verification methods provided in the TS is clarified by an assessment of a reinforced concrete precast panel. The panel provides insufficient resistance in comparison to that required by Eurocodes for design of new structures. The critical comparison of the reliability levels indicated by Eurocodes, the assessment value method, and the fully probabilistic approach demonstrates the benefits gained by applying the principles of the TS. While the partial factors recommended in Eurocodes lead to a negative result, the assessment value method and the probabilistic method indicate sufficient structural reliability.

    Evaluating testing methods by delivered reliability

    There are two main goals in testing software: (1) to achieve adequate quality (debug testing), where the objective is to probe the software for defects so that these can be removed, and (2) to assess existing quality (operational testing), where the objective is to gain confidence that the software is reliable. Debug methods tend to ignore random selection of test data from an operational profile, while for operational methods this selection is all-important. Debug methods are thought to be good at uncovering defects so that these can be repaired, but having done so they do not provide a technically defensible assessment of the reliability that results. On the other hand, operational methods provide accurate assessment, but may not be as useful for achieving reliability. This paper examines the relationship between the two testing goals, using a probabilistic analysis. We define simple models of programs and their testing, and try to answer the question of how to attain program reliability: is it better to test by probing for defects as in debug testing, or to assess reliability directly as in operational testing? Testing methods are compared in a model where program failures are detected and the software changed to eliminate them. The “better” method delivers higher reliability after all test failures have been eliminated. Special cases are exhibited in which each kind of testing is superior. An analysis of the distribution of the delivered reliability indicates that even simple models have unusual statistical properties, suggesting caution in interpreting theoretical comparisons

    Reliability approach in spacecraft structures

    This paper presents an application of the probabilistic approach with reliability assessment on a spacecraft structure. The adopted strategy uses meta-modeling with first and second order polynomial functions. This method aims at minimizing computational time while giving relevant results. The first part focuses on computational tools employed in the strategy development. The second part presents a spacecraft application. The purpose is to highlight benefits of the probabilistic approach compared with the current deterministic one. From examples of reliability assessment we show some advantages which could be found in industrial applications