Model Cards for Model Reporting
Trained machine learning models are increasingly used to perform high-impact
tasks in areas such as law enforcement, medicine, education, and employment. In
order to clarify the intended use cases of machine learning models and minimize
their usage in contexts for which they are not well suited, we recommend that
released models be accompanied by documentation detailing their performance
characteristics. In this paper, we propose a framework that we call model
cards, to encourage such transparent model reporting. Model cards are short
documents accompanying trained machine learning models that provide benchmarked
evaluation in a variety of conditions, such as across different cultural,
demographic, or phenotypic groups (e.g., race, geographic location, sex,
Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex
and Fitzpatrick skin type) that are relevant to the intended application
domains. Model cards also disclose the context in which models are intended to
be used, details of the performance evaluation procedures, and other relevant
information. While we focus primarily on human-centered machine learning models
in the application fields of computer vision and natural language processing,
this framework can be used to document any trained machine learning model. To
solidify the concept, we provide cards for two supervised models: One trained
to detect smiling faces in images, and one trained to detect toxic comments in
text. We propose model cards as a step towards the responsible democratization
of machine learning and related AI technology, increasing transparency into how
well AI technology works. We hope this work encourages those releasing trained
machine learning models to accompany model releases with similar detailed
evaluation numbers and other relevant documentation.
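The disaggregated evaluation described in this abstract can be illustrated with a minimal sketch. It assumes a list of per-example records with a true label, a prediction, and group attributes; the function name `accuracy_by_group`, the field names, and the toy records are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def accuracy_by_group(records, keys):
    """Return {group_tuple: accuracy}, grouping records by the given keys.

    Passing one key gives per-group results; passing several keys gives
    intersectional groups (e.g. sex and age together).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for r in records:
        group = tuple(r[k] for k in keys)
        total[group] += 1
        correct[group] += int(r["label"] == r["pred"])
    return {g: correct[g] / total[g] for g in total}


# Hypothetical toy records for illustration only (not data from the paper).
records = [
    {"sex": "F", "age": "young", "label": 1, "pred": 1},
    {"sex": "F", "age": "old", "label": 1, "pred": 0},
    {"sex": "M", "age": "young", "label": 0, "pred": 0},
    {"sex": "M", "age": "old", "label": 1, "pred": 1},
]

by_sex = accuracy_by_group(records, ["sex"])             # single attribute
by_sex_age = accuracy_by_group(records, ["sex", "age"])  # intersectional
```

In this toy example the aggregate accuracy hides a gap that only appears once results are broken down by group, which is the kind of information a model card is meant to surface.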
Data Structure Lower Bounds for Document Indexing Problems
We study data structure problems related to document indexing and pattern
matching queries and our main contribution is to show that the pointer machine
model of computation can be extremely useful in proving high and unconditional
lower bounds that cannot be obtained in any other known model of computation
with the current techniques. Often our lower bounds match the known space-query
time trade-off curve and in fact for all the problems considered, there is a
very good and reasonable match between our lower bounds and the known upper
bounds, at least for some choice of input parameters. The problems that we
consider are set intersection queries (both the reporting variant and the
semi-group counting variant), indexing a set of documents for two-pattern
queries, or forbidden-pattern queries, or queries with wild-cards, and
indexing an input set of gapped-patterns (or two-patterns) to find those
matching a document given at the query time.
Comment: Full version of the conference version that appeared at ICALP 2016,
25 pages
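For readers unfamiliar with the two set intersection variants named in this abstract, the sketch below shows what each query returns. It is a naive illustration of the problem statement only, not a data structure from the paper; the function names and toy inputs are hypothetical.

```python
from functools import reduce


def report_intersection(sets, i, j):
    """Reporting variant: list every element common to sets[i] and sets[j]."""
    return sorted(sets[i] & sets[j])


def semigroup_count(sets, weights, i, j, op):
    """Semi-group variant: combine the weights of the common elements with op.

    Only an associative operation is assumed (no inverse, no identity),
    so an empty intersection has no answer and None is returned.
    """
    common = sorted(sets[i] & sets[j])
    if not common:
        return None
    return reduce(op, (weights[x] for x in common))


# Hypothetical toy input for illustration.
sets = [{1, 2, 3, 5}, {2, 3, 7}, {4}]
weights = {x: 10 * x for x in range(1, 8)}

reported = report_intersection(sets, 0, 1)
summed = semigroup_count(sets, weights, 0, 1, lambda a, b: a + b)
largest = semigroup_count(sets, weights, 0, 1, max)
empty = semigroup_count(sets, weights, 0, 2, max)
```

This brute-force version takes time proportional to the set sizes per query; the lower bounds in the paper concern how much space a preprocessed structure needs in order to answer such queries faster.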
Statistical and Clinical Aspects of Hospital Outcomes Profiling
Hospital profiling involves a comparison of a health care provider's
structure, processes of care, or outcomes to a standard, often in the form of a
report card. Given the ubiquity of report cards and similar consumer ratings in
contemporary American culture, it is notable that these are a relatively recent
phenomenon in health care. Prior to the 1986 release of Medicare hospital
outcome data, little such information was publicly available. We review the
historical evolution of hospital profiling with special emphasis on outcomes;
present a detailed history of cardiac surgery report cards, the paradigm for
modern provider profiling; discuss the potential unintended negative
consequences of public report cards; and describe various statistical
methodologies for quantifying the relative performance of cardiac surgery
programs. Outstanding statistical issues are also described.
Comment: Published in Statistical Science (http://www.imstat.org/sts/) at
http://dx.doi.org/10.1214/088342307000000096 by the Institute of
Mathematical Statistics (http://www.imstat.org)
Scaling up antiretroviral therapy in Malawi-implications for managing other chronic diseases in resource-limited countries.
The national scale-up of antiretroviral therapy (ART) in Malawi is based on the public health approach, with principles and practices borrowed from the successful DOTS (directly observed treatment, short course) tuberculosis control framework. The key principles include political commitment, free care, and standardized systems for case finding, treatment, recording and reporting, and drug procurement. Scale-up of ART started in June 2004, and by December 2008, 223,437 patients were registered for treatment within a health system that is severely underresourced. The Malawi model for delivering lifelong ART can be adapted and used for managing patients with chronic noncommunicable diseases, the burden of which is already high and continues to grow in low-income and middle-income countries. This article discusses how the principles behind the successful Malawi model of ART delivery can be applied to the management of other chronic diseases in resource-limited settings and how this paradigm can be used for health systems strengthening.
Validity of parental recalls to estimate vaccination coverage: evidence from Tanzania.
BACKGROUND: Vaccination coverage is estimated from administrative data and from population-based surveys. Surveys collect both card-based and recall data, with recall used when the card is missing, but card-based estimates remain preferred because of the limited validity of parental recall. Missing cards are a concern in poor settings, yet the evidence on the validity of parental recall is limited and varies across vaccine types, so further evidence is timely and needed. We validated parental recall against card-based data from a population survey in Tanzania. METHODS: We used a cross-sectional survey of about 3000 households with women who had delivered in the 12 months prior to the interview in 2012, from three regions in Tanzania. Data on vaccination status for four vaccine types were collected from two sources, card and recall. We compared the level of agreement and identified the recall bias between the two data sources. We further computed the sensitivity and specificity of parental recall, and used a multivariate logit model to identify the determinants of parental recall bias. RESULTS: Most parents (85.4%) were able to present vaccination cards during the survey, and these were used for analysis. Although coverage levels were generally similar across data sources, the recall-based data slightly overestimated coverage. The level of agreement between the two data sources was high (above 94%), with minimal recall bias (less than 6%). Recall bias due to over-reporting was slightly higher than that due to under-reporting. The sensitivity of parental recall was generally high for all vaccine types, while the specificity was generally low across vaccine types except for measles. The minimal recall bias for DPT and measles was associated with the mother's age, education level, health insurance status, region, and the child's age.
CONCLUSION: Parental recall, when compared to card-based data, is highly accurate with minimal recall bias in Tanzania. Our findings support the use of parental recall collected through surveys to identify child vaccination status in the absence of vaccination cards. The use of recall data alongside card-based estimates also ensures more representative coverage estimates.
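The validity measures named in this abstract can be written out as a short sketch, treating the vaccination card as the reference and parental recall as the test. The function name `recall_validity` and the counts below are hypothetical and chosen only for illustration; they are not the study's data, and the sketch assumes the sample contains both card-positive and card-negative children.

```python
def recall_validity(pairs):
    """Compare parental recall against the card (reference standard).

    pairs: iterable of (card_vaccinated, recall_vaccinated) booleans.
    Assumes at least one card-positive and one card-negative child.
    """
    tp = fn = fp = tn = 0
    for card, recall in pairs:
        if card and recall:
            tp += 1          # both sources say vaccinated
        elif card and not recall:
            fn += 1          # under-reporting by recall
        elif not card and recall:
            fp += 1          # over-reporting by recall
        else:
            tn += 1          # both sources say not vaccinated
    n = tp + fn + fp + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "agreement": (tp + tn) / n,
        "over_reporting": fp / n,
        "under_reporting": fn / n,
    }


# Hypothetical counts for 100 children (not the study's data).
pairs = (
    [(True, True)] * 90
    + [(True, False)] * 3
    + [(False, True)] * 4
    + [(False, False)] * 3
)
result = recall_validity(pairs)
```

With these made-up counts, agreement and sensitivity are high while specificity is low, because very few children are unvaccinated according to the card; this shows how high coverage can produce the pattern the abstract reports.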
Credit Cards: Weapons for Domestic Violence
The objectives of this study were to describe the intra-specific variation in herbicide response of weed populations when subjected to new vs. well-established herbicides, and to assess distributions of log LD50 and log GR50 estimates as a potential indicator for early resistance detection. Seeds of two grass weeds (Alopecurus myosuroides, Apera spica-venti) were collected in southern Sweden, mainly in 2002. In line with the objectives of the study, the collection sites were chosen neither for noted herbicide failures nor for detected herbicide resistance, but solely for the presence of the target species. For each species, seedlings were subjected to two herbicides in dose-response experiments in a greenhouse. One herbicide per species was recently introduced and the other had been on the market for control of the species for a decade, with several reports of resistance in the literature. Fresh weight of plants and a visual vigour score were used to estimate GR50 and LD50, respectively. Resistance to fenoxaprop-P-ethyl in A. myosuroides was indicated by the LD50 estimates to be present in frequencies sufficient to affect the population-level response in 9 of 29 samples, and was correlated to response to flupyrsulfuron, while low susceptibility to isoproturon in A. spica-venti populations was not linked to the response to sulfosulfuron. In the study as a whole, the magnitude of the estimated herbicide susceptibility ranges differed irrespective of previous exposure.
No consistent differences were found in the distribution of LD50 estimates for new and "old" herbicides, and normality in the distribution of estimates could not be assumed for a non-exposed sample, even in the absence of an indication of cross-resistance.
Original Publication: Liv A Espeby, Hakan Fogelfors and Per Milberg, Susceptibility variation to new and established herbicides: Examples of inter-population sensitivity of grass weeds, 2011, CROP PROTECTION, (30), 4, 429-435. http://dx.doi.org/10.1016/j.cropro.2010.12.022
Copyright: Elsevier Science B.V., Amsterdam. http://www.elsevier.com