9 research outputs found

    Divide-or-Conquer? Which Part Should You Distill Your LLM?

    Recent methods have demonstrated that Large Language Models (LLMs) can solve reasoning tasks better when they are encouraged to solve subtasks of the main task first. In this paper, we devise a similar strategy that breaks reasoning tasks down into a problem decomposition phase and a problem solving phase, and we show that this strategy outperforms a single-stage solution. Further, we hypothesize that the decomposition should be easier to distill into a smaller model than the problem solving, because the latter requires large amounts of domain knowledge while the former only requires learning general problem solving strategies. We propose methods to distill these two capabilities and evaluate their impact on reasoning outcomes and inference cost. We find that we can distill the problem decomposition phase while achieving good generalization across tasks, datasets, and models. However, it is harder to distill the problem solving capability without losing performance, and the resulting distilled model struggles with generalization. These results indicate that by using smaller, distilled problem decomposition models in combination with problem solving LLMs we can achieve reasoning with cost-efficient inference and local adaptation.
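
    The two-stage setup described here maps naturally onto a small pipeline. Below is a minimal Python sketch, assuming the distilled decomposer and the large solver LLM are available as plain callables; the function names and prompt formats are illustrative assumptions, not the authors' implementation.

        # Hypothetical wiring of the decompose-then-solve strategy described above.
        from typing import Callable, List

        def two_stage_answer(
            question: str,
            decompose: Callable[[str], List[str]],  # small, distilled decomposition model
            solve: Callable[[str], str],            # large problem-solving LLM
        ) -> str:
            """Decomposition phase with the small model, solving phase with the large LLM."""
            subquestions = decompose(question)
            notes: List[str] = []
            for sub in subquestions:
                # The solver sees the original question plus the answers so far, so the
                # domain knowledge stays in the large model; the small model only
                # contributes the general decomposition strategy.
                prompt = question + "\n" + "\n".join(notes) + "\nSubquestion: " + sub
                notes.append(sub + " -> " + solve(prompt))
            final_prompt = question + "\n" + "\n".join(notes) + "\nFinal answer:"
            return solve(final_prompt)

    Any pair of callables with these signatures can be plugged in, for example a locally hosted distilled model for decompose and an API-hosted LLM for solve.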

    Using social media to assess the consumer nutrition environment: comparing Yelp reviews with a direct observation audit instrument for grocery stores

    Objective: To examine the feasibility of using social media to assess the consumer nutrition environment by comparing sentiment expressed in Yelp reviews with information obtained from a direct observation audit instrument for grocery stores. Design: Trained raters used the Nutrition Environment Measures Survey in Stores (NEMS-S) in 100 grocery stores from July 2015 to March 2016. Yelp reviews were available for sixty-nine of these stores and were retrieved in February 2017 using the Yelp Application Program Interface. A sentiment analysis was conducted to quantify perceptions of the consumer nutrition environment in the review text. Pearson correlation coefficients (ρ) were used to compare NEMS-S scores with Yelp review text on food availability, quality, price and shopping experience. Setting: Detroit, Michigan, USA. Participants: None. Results: Yelp reviews contained more comments about food availability and the overall shopping experience than about food price and food quality. Negative sentiment about food prices in Yelp review text and the number of dollar signs on Yelp were positively correlated with observed food prices in stores (ρ = 0·413 and 0·462, respectively). Stores with greater food availability were rated as more expensive on Yelp. Other aspects of the food store environment (e.g. overall quality and shopping experience) were captured only in Yelp. Conclusions: While Yelp cannot replace in-person audits for collecting detailed information on the availability, quality and cost of specific food items, it holds promise as a cost-effective means of gathering information on the overall cost, quality and experience of food stores, which may be relevant for nutrition outcomes.
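
    For illustration only, a sentiment-versus-audit comparison of this kind could be set up as below, using VADER sentiment scores from NLTK and SciPy's Pearson correlation; the store records shown are made-up placeholders, not data from the study.

        # Correlate audited price scores with review sentiment (illustrative data only).
        import nltk
        from nltk.sentiment import SentimentIntensityAnalyzer
        from scipy.stats import pearsonr

        nltk.download("vader_lexicon", quiet=True)
        sia = SentimentIntensityAnalyzer()

        stores = [  # hypothetical records pairing an audit price score with review text
            {"price_score": 8, "reviews": "Great produce but everything is really expensive."},
            {"price_score": 5, "reviews": "Decent selection, prices are okay."},
            {"price_score": 2, "reviews": "Cheap staples, limited fresh food."},
        ]

        # VADER compound score lies in [-1, 1]; lower values mean more negative sentiment.
        sentiments = [sia.polarity_scores(s["reviews"])["compound"] for s in stores]
        prices = [s["price_score"] for s in stores]

        r, p = pearsonr(prices, sentiments)
        print(f"correlation between audited price scores and review sentiment: r={r:.3f}, p={p:.3f}")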

    Using Social Media to Identify Sources of Healthy Food in Urban Neighborhoods.

    An established body of research has used secondary data sources (such as proprietary business databases) to demonstrate the importance of the neighborhood food environment for multiple health outcomes. However, documenting food availability using secondary sources in low-income urban neighborhoods can be particularly challenging, since small businesses play a crucial role in food availability there. These small businesses are typically underrepresented in national databases, which rely on secondary sources to develop data for marketing purposes. Using social media and other crowdsourced data to account for these smaller businesses holds promise, but the quality of these data remains unknown. This paper compares the quality of full-line grocery store information from Yelp, a crowdsourced content service, to a "ground truth" dataset (Detroit Food Map) and a commercially available dataset (Reference USA) for the greater Detroit area. Results suggest that Yelp is more accurate than Reference USA in identifying healthy food stores in urban areas. Researchers investigating the relationship between the nutrition environment and health may consider Yelp a reliable and valid source for identifying sources of healthy food in urban environments.
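
    As a rough sketch of the kind of comparison involved, the share of ground-truth stores listed by each secondary source could be computed by matching on a normalized name-plus-address key; the entries below are placeholders, not actual Detroit Food Map records, and the matching rule is an assumption of this sketch.

        # Fraction of ground-truth full-line grocery stores found in each secondary source.
        def key(name: str, address: str) -> str:
            # Normalize to a crude name|address key for matching.
            return name.strip().lower() + "|" + address.strip().lower()

        ground_truth = {key("Example Fresh Market", "100 Main St"),
                        key("Sample Grocery Co", "200 Oak Ave")}
        yelp = {key("Example Fresh Market", "100 Main St")}
        reference_usa = {key("Sample Grocery Co", "999 Wrong Rd")}  # address mismatch, no match

        for label, source in [("Yelp", yelp), ("Reference USA", reference_usa)]:
            hit_rate = len(ground_truth & source) / len(ground_truth)
            print(f"{label}: {hit_rate:.0%} of ground-truth stores matched")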

    Assessing the readability of ClinicalTrials.gov

    Objective: ClinicalTrials.gov serves the critical functions of disseminating trial information to the public and helping trials recruit participants. This study assessed the readability of trial descriptions at ClinicalTrials.gov using multiple quantitative measures. Materials and Methods: The analysis included all 165 988 trials registered at ClinicalTrials.gov as of April 30, 2014. To obtain benchmarks, the authors also analyzed 2 other medical corpora: (1) all 955 Health Topics articles from MedlinePlus and (2) a random sample of 100 000 clinician notes retrieved from an electronic health records system, intended for internal communication among medical professionals. The authors characterized each corpus using 4 surface metrics and then applied 5 different scoring algorithms to assess readability. The authors hypothesized that clinician notes would be the most difficult to read, followed by trial descriptions and MedlinePlus Health Topics articles. Results: Trial descriptions have the longest average sentence length (26.1 words) across all corpora, and 65% of the words they use are not covered by a basic medical English dictionary. In comparison, the average sentence length of MedlinePlus Health Topics articles is 61% shorter, their vocabulary size is 95% smaller, and their dictionary coverage is 46% higher. All 5 scoring algorithms consistently rated ClinicalTrials.gov trial descriptions the most difficult corpus to read, even harder than clinician notes. According to the readability assessment algorithms, an average of 18 years of education is required to properly understand these trial descriptions. Discussion and Conclusion: Trial descriptions at ClinicalTrials.gov are extremely difficult to read. Significant work is warranted to improve their readability in order to achieve ClinicalTrials.gov’s goal of facilitating information dissemination and subject recruitment.
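
    One readability formula typically applied in such assessments is the Flesch-Kincaid grade level. The self-contained sketch below uses a crude vowel-group syllable heuristic, which is an assumption of this sketch rather than the study's actual implementation.

        # Flesch-Kincaid grade level: 0.39 * (words/sentence) + 11.8 * (syllables/word) - 15.59
        import re

        def count_syllables(word: str) -> int:
            # Count groups of consecutive vowels as a rough syllable estimate.
            return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

        def flesch_kincaid_grade(text: str) -> float:
            sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
            words = re.findall(r"[A-Za-z']+", text)
            syllables = sum(count_syllables(w) for w in words)
            return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59

        description = ("This randomized, double-blind, placebo-controlled study evaluates "
                       "pharmacokinetic and safety parameters in adult participants.")
        print(f"Estimated grade level: {flesch_kincaid_grade(description):.1f}")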