Localised Skin Hyperpigmentation as a Presenting Symptom of Vitamin B12 Deficiency Complicating Chronic Atrophic Gastritis
Vitamin B12 deficiency is common in developing countries and should be suspected in patients with unexplained anaemia or neurological symptoms. Dermatological manifestations associated with this deficiency include skin hyper- or hypopigmentation, angular stomatitis and hair changes. We report a case of a 28-year-old man who presented to the Sultan Qaboos University Hospital in Muscat, Oman, in November 2013 with localised hyperpigmentation of the palmar and dorsal aspects of both hands of two months’ duration. Other symptoms included numbness of the hands, anorexia, weight loss, dizziness, fatigability and a sore mouth and tongue. There was no evidence of hypocortisolaemia and a literature search suggested a possible B12 deficiency. The patient had low serum B12 levels and megaloblastic anaemia. An intrinsic factor antibody test was negative. A gastric biopsy revealed chronic gastritis. After B12 supplementation, the patient’s symptoms resolved. Family physicians should familiarise themselves with atypical presentations of B12 deficiency. Many symptoms of this deficiency are reversible if detected and treated early.
Introducing v0.5 of the AI Safety Benchmark from MLCommons
This paper introduces v0.5 of the AI Safety Benchmark, which has been created by the MLCommons AI Safety Working Group. The AI Safety Benchmark has been designed to assess the safety risks of AI systems that use chat-tuned language models. We introduce a principled approach to specifying and constructing the benchmark, which for v0.5 covers only a single use case (an adult chatting to a general-purpose assistant in English) and a limited set of personas (i.e., typical users, malicious users, and vulnerable users). We created a new taxonomy of 13 hazard categories, of which 7 have tests in the v0.5 benchmark. We plan to release version 1.0 of the AI Safety Benchmark by the end of 2024. The v1.0 benchmark will provide meaningful insights into the safety of AI systems. However, the v0.5 benchmark should not be used to assess the safety of AI systems. We have sought to fully document the limitations, flaws, and challenges of v0.5. This release of v0.5 of the AI Safety Benchmark includes (1) a principled approach to specifying and constructing the benchmark, which comprises use cases, types of systems under test (SUTs), language and context, personas, tests, and test items; (2) a taxonomy of 13 hazard categories with definitions and subcategories; (3) tests for seven of the hazard categories, each comprising a unique set of test items, i.e., prompts. There are 43,090 test items in total, which we created with templates; (4) a grading system for AI systems against the benchmark; (5) an openly available platform, and downloadable tool, called ModelBench that can be used to evaluate the safety of AI systems on the benchmark; (6) an example evaluation report which benchmarks the performance of over a dozen openly available chat-tuned language models; (7) a test specification for the benchmark.
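The abstract notes that the 43,090 test items were created with templates. As a minimal hypothetical sketch of how template-based prompt generation can work in general (all names, templates, and fragments below are invented for illustration and are not the actual MLCommons/ModelBench implementation):

```python
# Hypothetical sketch of template-based test-item generation.
# Every template, fragment, and function name here is illustrative,
# not taken from the benchmark itself.
from itertools import product

# Sentence frames with a slot to fill.
TEMPLATES = [
    "How do I {action}?",
    "Explain why someone would {action}.",
]

# Illustrative slot fillers, keyed by a (made-up) hazard category.
FRAGMENTS = {
    "hazard_example": ["do X", "do Y"],
}

def generate_test_items(templates, fragments):
    """Cross every template with every fragment in each category."""
    items = []
    for category, actions in fragments.items():
        for template, action in product(templates, actions):
            items.append({
                "hazard": category,
                "prompt": template.format(action=action),
            })
    return items

items = generate_test_items(TEMPLATES, FRAGMENTS)
print(len(items))  # 2 templates x 2 fragments = 4 items
```

Crossing a small set of templates with per-category fragment lists is one way a few hundred hand-written pieces can expand to tens of thousands of distinct prompts.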