Divide-or-Conquer? Which Part Should You Distill Your LLM?
Recent methods have demonstrated that Large Language Models (LLMs) can solve
reasoning tasks better when they are encouraged to solve subtasks of the main
task first. In this paper we devise a similar strategy that breaks reasoning
tasks down into a problem decomposition phase and a problem solving phase,
and we show that this two-stage strategy outperforms a single-stage solution.
Further, we hypothesize that the decomposition should be easier to distill into
a smaller model compared to the problem solving because the latter requires
large amounts of domain knowledge while the former only requires learning
general problem solving strategies. We propose methods to distill these two
capabilities and evaluate their impact on reasoning outcomes and inference
cost. We find that we can distill the problem decomposition phase and at the
same time achieve good generalization across tasks, datasets, and models.
However, it is harder to distill the problem solving capability without losing
performance, and the resulting distilled model struggles with generalization.
These results indicate that by using smaller, distilled problem decomposition
models in combination with problem solving LLMs we can achieve reasoning with
cost-efficient inference and local adaptation.
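The proposed two-stage setup can be sketched as a simple pipeline. Here `small_model` and `large_model` are hypothetical stand-ins for a distilled decomposer and a full-size solver LLM; the abstract does not prescribe this exact interface:

```python
def decompose(question: str, small_model) -> list[str]:
    """Stage 1: a small distilled model breaks the task into subquestions."""
    prompt = f"Break this problem into simpler subquestions:\n{question}"
    return [s for s in small_model(prompt).splitlines() if s.strip()]

def solve(question: str, subquestions: list[str], large_model) -> str:
    """Stage 2: a large solver answers, conditioned on the decomposition."""
    steps = "\n".join(f"- {s}" for s in subquestions)
    prompt = f"Question: {question}\nSubquestions:\n{steps}\nFinal answer:"
    return large_model(prompt)

def two_stage_answer(question: str, small_model, large_model) -> str:
    """Distilled decomposer feeding a large solver, per the paper's hypothesis."""
    return solve(question, decompose(question, small_model), large_model)
```

Because the decomposer only has to emit subquestions rather than domain-specific answers, it is the natural candidate for distillation into a smaller model.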
Using social media to assess the consumer nutrition environment: comparing Yelp reviews with a direct observation audit instrument for grocery stores
Objective: To examine the feasibility of using social media to assess the consumer nutrition environment by comparing sentiment expressed in Yelp reviews with information obtained from a direct observation audit instrument for grocery stores.
Design: Trained raters used the Nutrition Environment Measures Survey in Stores (NEMS-S) in 100 grocery stores from July 2015 to March 2016. Yelp reviews were available for sixty-nine of these stores and were retrieved in February 2017 using the Yelp Application Program Interface. A sentiment analysis was conducted to quantify perceptions of the consumer nutrition environment in the review text. Pearson correlation coefficients (ρ) were used to compare NEMS-S scores with Yelp review text on food availability, quality, price and shopping experience.
Setting: Detroit, Michigan, USA.
Participants: None.
Results: Yelp reviews contained more comments about food availability and the overall shopping experience than about food price and food quality. Negative sentiment about food prices in Yelp review text and the number of dollar signs on Yelp were positively correlated with observed food prices in stores (ρ=0·413 and 0·462, respectively). Stores with greater food availability were rated as more expensive on Yelp. Other aspects of the food store environment (e.g. overall quality and shopping experience) were captured only in Yelp.
Conclusions: While Yelp cannot replace in-person audits for collecting detailed information on the availability, quality and cost of specific food items, Yelp holds promise as a cost-effective means to gather information on the overall cost, quality and experience of food stores, which may be relevant for nutrition outcomes.
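The price comparison comes down to a correlation between per-store review sentiment and observed NEMS-S scores. A minimal sketch with invented numbers (the actual study covered 69 stores and used a proper sentiment model):

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented per-store values: negative-price sentiment mined from review text
# vs. observed NEMS-S price scores (higher = more expensive).
neg_price_sentiment = [0.1, 0.4, 0.2, 0.7, 0.5, 0.9]
nems_price_score = [1.0, 2.0, 1.5, 3.0, 2.5, 3.5]
r = pearson(neg_price_sentiment, nems_price_score)
```

A positive `r`, as in the study's reported 0·413 and 0·462, indicates that stores whose reviews complain about prices do tend to have higher observed prices.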
Dementia and electronic health record phenotypes: a scoping review of available phenotypes and opportunities for future research.
Objective: We performed a scoping review of algorithms that use electronic health record (EHR) data to identify patients with Alzheimer's disease and related dementias (ADRD), to advance their use in research and clinical care.
Materials and Methods: Starting with a previous scoping review of EHR phenotypes, we performed a cumulative update (April 2020 through March 1, 2023) using PubMed, PheKB, and expert review, with an exclusive focus on ADRD identification. We included algorithms using EHR data alone or in combination with non-EHR data, and characterized whether they identified patients at high risk of or with a current diagnosis of ADRD.
Results: For our cumulative focused update, we reviewed 271 titles meeting our search criteria, 49 abstracts, and 26 full-text papers. We identified 8 articles from the original systematic review, 8 from our new search, and 4 recommended by an expert. We identified 20 papers describing 19 unique EHR phenotypes for ADRD: 7 algorithms identifying patients with diagnosed dementia and 12 algorithms identifying patients at high risk of dementia that prioritize sensitivity over specificity. Reference standards range from other EHR data alone to in-person cognitive screening.
Conclusion: A variety of EHR-based phenotypes are available for identifying populations with, or at high risk of developing, ADRD. This review provides comparative detail to aid in choosing the best algorithm for research, clinical care, and population health projects based on the use case and available data. Future research may further improve the design and use of algorithms by considering EHR data provenance.
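Diagnosed-dementia phenotypes of the kind reviewed here are commonly built from a diagnosis-code list plus a minimum-count rule. A toy illustration; the code prefixes and the two-hit threshold are illustrative choices, not any specific published algorithm:

```python
# Illustrative ICD-10 prefixes for dementia diagnoses
# (G30: Alzheimer's disease; F01-F03: vascular and other dementias).
ADRD_PREFIXES = ("G30", "F01", "F02", "F03")

def has_adrd_diagnosis(diagnosis_codes, min_hits=2):
    """Flag a patient when at least `min_hits` qualifying codes appear.
    Requiring more than one hit is a common way to trade sensitivity
    for specificity, as several reviewed algorithms do."""
    hits = sum(code.startswith(ADRD_PREFIXES) for code in diagnosis_codes)
    return hits >= min_hits
```

Risk-prediction phenotypes, by contrast, typically add medications, utilization patterns, and demographics as features rather than relying on codes alone.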
Using Social Media to Identify Sources of Healthy Food in Urban Neighborhoods.
An established body of research has used secondary data sources (such as proprietary business databases) to demonstrate the importance of the neighborhood food environment for multiple health outcomes. However, documenting food availability using secondary sources in low-income urban neighborhoods can be particularly challenging, since small businesses play a crucial role in food availability. These small businesses are typically underrepresented in national databases, which rely on secondary sources to develop data for marketing purposes. Using social media and other crowdsourced data to account for these smaller businesses holds promise, but the quality of these data remains unknown. This paper compares the quality of full-line grocery store information from Yelp, a crowdsourced content service, to a "ground truth" data set (Detroit Food Map) and a commercially available dataset (Reference USA) for the greater Detroit area. Results suggest that Yelp is more accurate than Reference USA in identifying healthy food stores in urban areas. Researchers investigating the relationship between the nutrition environment and health may consider Yelp as a reliable and valid source for identifying sources of healthy food in urban environments.
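Validating a crowdsourced source against ground truth reduces to record linkage plus a coverage metric. A deliberately crude sketch; a real comparison of Yelp with Detroit Food Map would also match on address and use fuzzy name matching:

```python
def normalize(name: str) -> str:
    """Crude name normalization; real record linkage is far more careful."""
    return name.lower().strip()

def coverage(candidate_stores, ground_truth_stores):
    """Fraction of ground-truth stores that the candidate source also lists."""
    truth = {normalize(s) for s in ground_truth_stores}
    found = {normalize(s) for s in candidate_stores} & truth
    return len(found) / len(truth)
```

Computing this coverage separately for Yelp and for a commercial database against the same ground-truth list is one way to compare their accuracy, as this paper does for the Detroit area.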
Uncovering the relationship between food-related discussion on Twitter and neighborhood characteristics
Objective: Initiatives to reduce neighborhood-based health disparities require access to meaningful, timely, and local information regarding health behavior and its determinants. We examined the validity of Twitter as a source of information for neighborhood-level analysis of dietary choices and attitudes.
Materials and Methods: We analyzed the "healthiness" quotient and sentiment in food-related tweets at the census tract level and associated them with neighborhood characteristics and health outcomes. We analyzed keywords driving the differences in food healthiness between the most- and least-affluent tracts, and qualitatively analyzed the contents of a random sample of tweets.
Results: Significant, albeit weak, correlations existed between healthiness and sentiment in food-related tweets and tract-level measures of affluence, disadvantage, race, age, U.S. density, and mortality from conditions associated with obesity. Analyses of keywords driving the differences in food healthiness revealed that foods high in saturated fat (e.g., pizza, bacon, fries) were mentioned more frequently in less-affluent tracts. Food-related discussion referred to activities (eating, drinking, cooking), locations where food was consumed, and positive (affection, cravings, enjoyment) and negative attitudes (dislike, personal struggles, complaints).
Discussion: Tweet-based healthiness scores largely correlated with offline phenomena in the expected directions. Social media offer less resource-intensive data collection methods than traditional surveys do. Twitter may assist in informing local health programs that focus on drivers of food consumption and could inform interventions focused on attitudes and the food environment.
Conclusions: Twitter provided weak but significant signals concerning food-related behavior and attitudes at the neighborhood level, suggesting its potential usefulness for informing local health disparity reduction efforts.
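A "healthiness" quotient of this kind can be computed with a lexicon of healthy and unhealthy food terms. A minimal sketch; the word lists below are illustrative stand-ins, and the study's actual lexicon and scoring are far richer:

```python
# Illustrative lexicons; pizza, bacon, and fries appear because the
# abstract names them as frequent terms in less-affluent tracts.
HEALTHY = {"salad", "kale", "fruit", "quinoa"}
UNHEALTHY = {"pizza", "bacon", "fries", "soda"}

def healthiness_quotient(tweets):
    """Share of recognized food mentions that are 'healthy' terms, in [0, 1].
    Returns None when no food terms are found."""
    healthy = unhealthy = 0
    for tweet in tweets:
        for word in tweet.lower().split():
            word = word.strip(".,!?")
            healthy += word in HEALTHY
            unhealthy += word in UNHEALTHY
    total = healthy + unhealthy
    return healthy / total if total else None
```

Aggregating this score over the tweets geolocated to each census tract yields the tract-level signal that the study correlates with affluence and health outcomes.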
Assessing the readability of ClinicalTrials.gov
Objective: ClinicalTrials.gov serves the critical functions of disseminating trial information to the public and helping trials recruit participants. This study assessed the readability of trial descriptions at ClinicalTrials.gov using multiple quantitative measures.
Materials and Methods: The analysis included all 165 988 trials registered at ClinicalTrials.gov as of April 30, 2014. To obtain benchmarks, the authors also analyzed 2 other medical corpora: (1) all 955 Health Topics articles from MedlinePlus and (2) a random sample of 100 000 clinician notes retrieved from an electronic health records system, intended for internal communication among medical professionals. The authors characterized each of the corpora using 4 surface metrics and then applied 5 different scoring algorithms to assess their readability. The authors hypothesized that clinician notes would be most difficult to read, followed by trial descriptions and MedlinePlus Health Topics articles.
Results: Trial descriptions have the longest average sentence length (26.1 words) across all corpora; 65% of the words they use are not covered by a basic medical English dictionary. In comparison, the average sentence length of MedlinePlus Health Topics articles is 61% shorter, vocabulary size is 95% smaller, and dictionary coverage is 46% higher. All 5 scoring algorithms consistently rated ClinicalTrials.gov trial descriptions the most difficult corpus to read, even harder than clinician notes. On average, 18 years of education are required to properly understand these trial descriptions according to the readability assessment algorithms.
Discussion and Conclusion: Trial descriptions at ClinicalTrials.gov are extremely difficult to read. Significant work is warranted to improve their readability in order to achieve ClinicalTrials.gov's goal of facilitating information dissemination and subject recruitment.
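Grade-level scores of the kind applied here, such as Flesch-Kincaid, combine average sentence length with average syllables per word. A rough sketch using a heuristic syllable counter; the study's actual tools use dictionaries and more careful tokenization:

```python
import re

def count_syllables(word: str) -> int:
    """Crude heuristic: count vowel groups (real tools use pronunciation
    dictionaries and handle silent 'e', diphthongs, etc.)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text: str) -> float:
    """Flesch-Kincaid grade level:
    0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * (len(words) / sentences) + 11.8 * (syllables / len(words)) - 15.59
```

With 26-word average sentences and a dense technical vocabulary, trial descriptions push such formulas toward the graduate-level scores the study reports.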