73 research outputs found

    A Comparative Analysis of Text Mining Methodologies for Online Consumer Reviews

    Extracting meaningful insights from the sheer volume of Online Consumer Reviews (OCRs) has been challenging. We aim to explore the most effective methodologies for text mining of OCRs, covering topic extraction, topic classification, and sentiment analysis. Through a comprehensive review of recent research on text mining applied to OCRs, we found that LDA2Vec can enhance the effectiveness of conventional LDA for topic extraction. Additionally, the combination of Convolutional Neural Networks (CNN) and GloVe demonstrates the best performance for topic classification, while CNN and SVM outperform other algorithms for sentiment analysis. Furthermore, the spaCy Natural Language Processing (NLP) library proves to be a more effective choice for text pre-processing than the Natural Language Toolkit (NLTK). Subsequently, we applied these refined models to a Yelp reviews dataset, assessed their performance against conventional models, and provided a comprehensive discussion of the results and limitations. The insights gained from this study can be valuable for developing effective models in OCR analysis.

    Resource utilization and costs during the initial years of lung cancer screening with computed tomography in Canada

    Background It is estimated that millions of North Americans would qualify for lung cancer screening and that billions of dollars of national health expenditures would be required to support population-based computed tomography lung cancer screening programs. The decision to implement such programs should be informed by data on resource utilization and costs. Methods Resource utilization data were collected prospectively from 2059 participants in the Pan-Canadian Early Detection of Lung Cancer Study using low-dose computed tomography (LDCT). Participants who had 2% or greater lung cancer risk over 3 years using a risk prediction tool were recruited from seven major cities across Canada. A cost analysis was conducted from the Canadian public payer's perspective for resources that were used for the screening and treatment of lung cancer in the initial years of the study. Results The average per-person cost for screening individuals with LDCT was USD453 (95% confidence interval [CI], USD400–USD505) for the initial 18 months of screening following a baseline scan. The screening costs were highly dependent on the detected lung nodule size, presence of cancer, screening intervention, and the screening center. The mean per-person cost of treating lung cancer with curative surgery was USD33,344 (95% CI, USD31,553–USD34,935) over 2 years. This was lower than the cost of treating advanced-stage lung cancer with chemotherapy, radiotherapy, or supportive care alone (USD47,792; 95% CI, USD43,254–USD52,200; p = 0.061). Conclusion In the Pan-Canadian study, the average cost to screen individuals with a high risk for developing lung cancer using LDCT and the average initial cost of curative-intent treatment were lower than the average per-person cost of treating advanced-stage lung cancer, which infrequently results in a cure.

    Genome-wide association identifies ATOH7 as a major gene determining human optic disc size

    Optic nerve assessment is important for many blinding diseases, with cup-to-disc ratio (CDR) assessments commonly used in both diagnosis and progression monitoring of glaucoma patients. Optic disc, cup, rim area and CDR measurements all show substantial variation between human populations and high heritability estimates within populations. To identify loci underlying these quantitative traits, we performed a genome-wide association study in two Australian twin cohorts and identified rs3858145, P = 6.2 × 10⁻¹⁰, near the ATOH7 gene as associated with the mean disc area. ATOH7 is known from studies in model organisms to play a key role in retinal ganglion cell formation. The association with rs3858145 was replicated in a cohort of UK twins, with a meta-analysis of the combined data yielding P = 3.4 × 10⁻¹⁰. Imputation further increased the evidence for association for several SNPs in and around ATOH7 (P = 1.3 × 10⁻¹⁰ to 4.3 × 10⁻¹¹, top SNP rs1900004). The meta-analysis also provided suggestive evidence for association for the cup area at rs690037, P = 1.5 × 10⁻⁷, in the gene RFTN1. Direct sequencing of ATOH7 in 12 patients with optic nerve hypoplasia, one of the leading causes of blindness in children, revealed two novel non-synonymous mutations (Arg65Gly, Ala47Thr) which were not found in 90 unrelated controls (combined Fisher's exact P = 0.0136). Furthermore, the Arg65Gly variant was found to have very low frequency (0.00066) in an additional set of 672 controls.

    The genetic architecture of the human cerebral cortex

    The cerebral cortex underlies our complex cognitive capabilities, yet little is known about the specific genetic loci that influence human cortical structure. To identify genetic variants that affect cortical structure, we conducted a genome-wide association meta-analysis of brain magnetic resonance imaging data from 51,665 individuals. We analyzed the surface area and average thickness of the whole cortex and 34 regions with known functional specializations. We identified 199 significant loci and found significant enrichment for loci influencing total surface area within regulatory elements that are active during prenatal cortical development, supporting the radial unit hypothesis. Loci that affect regional surface area cluster near genes in Wnt signaling pathways, which influence progenitor expansion and areal identity. Variation in cortical structure is genetically correlated with cognitive function, Parkinson's disease, insomnia, depression, neuroticism, and attention deficit hyperactivity disorder.

    Worldwide trends in underweight and obesity from 1990 to 2022: a pooled analysis of 3663 population-representative studies with 222 million children, adolescents, and adults

    Background Underweight and obesity are associated with adverse health outcomes throughout the life course. We estimated the individual and combined prevalence of underweight or thinness and obesity, and their changes, from 1990 to 2022 for adults and school-aged children and adolescents in 200 countries and territories. Methods We used data from 3663 population-based studies with 222 million participants that measured height and weight in representative samples of the general population. We used a Bayesian hierarchical model to estimate trends in the prevalence of different BMI categories, separately for adults (age ≥20 years) and school-aged children and adolescents (age 5–19 years), from 1990 to 2022 for 200 countries and territories. For adults, we report the individual and combined prevalence of underweight (BMI <18·5 kg/m²) and obesity (BMI ≥30 kg/m²). For school-aged children and adolescents, we report thinness (BMI <2 SD below the median of the WHO growth reference) and obesity (BMI >2 SD above the median). Findings From 1990 to 2022, the combined prevalence of underweight and obesity in adults decreased in 11 countries (6%) for women and 17 (9%) for men with a posterior probability of at least 0·80 that the observed changes were true decreases. The combined prevalence increased in 162 countries (81%) for women and 140 countries (70%) for men with a posterior probability of at least 0·80. In 2022, the combined prevalence of underweight and obesity was highest in island nations in the Caribbean and Polynesia and Micronesia, and countries in the Middle East and north Africa. Obesity prevalence was higher than underweight with posterior probability of at least 0·80 in 177 countries (89%) for women and 145 (73%) for men in 2022, whereas the converse was true in 16 countries (8%) for women, and 39 (20%) for men.
    From 1990 to 2022, the combined prevalence of thinness and obesity decreased among girls in five countries (3%) and among boys in 15 countries (8%) with a posterior probability of at least 0·80, and increased among girls in 140 countries (70%) and boys in 137 countries (69%) with a posterior probability of at least 0·80. The countries with highest combined prevalence of thinness and obesity in school-aged children and adolescents in 2022 were in Polynesia and Micronesia and the Caribbean for both sexes, and Chile and Qatar for boys. Combined prevalence was also high in some countries in south Asia, such as India and Pakistan, where thinness remained prevalent despite having declined. In 2022, obesity in school-aged children and adolescents was more prevalent than thinness with a posterior probability of at least 0·80 among girls in 133 countries (67%) and boys in 125 countries (63%), whereas the converse was true in 35 countries (18%) and 42 countries (21%), respectively. In almost all countries for both adults and school-aged children and adolescents, the increases in double burden were driven by increases in obesity, and decreases in double burden by declining underweight or thinness. Interpretation The combined burden of underweight and obesity has increased in most countries, driven by an increase in obesity, while underweight and thinness remain prevalent in south Asia and parts of Africa. A healthy nutrition transition that enhances access to nutritious foods is needed to address the remaining burden of underweight while curbing and reversing the increase in obesity.

    The evolution of lung cancer and impact of subclonal selection in TRACERx

    Lung cancer is the leading cause of cancer-associated mortality worldwide. Here we analysed 1,644 tumour regions sampled at surgery or during follow-up from the first 421 patients with non-small cell lung cancer prospectively enrolled into the TRACERx study. This project aims to decipher lung cancer evolution and address the primary study endpoint: determining the relationship between intratumour heterogeneity and clinical outcome. In lung adenocarcinoma, mutations in 22 out of 40 common cancer genes were under significant subclonal selection, including classical tumour initiators such as TP53 and KRAS. We defined evolutionary dependencies between drivers, mutational processes and whole genome doubling (WGD) events. Despite patients having a history of smoking, 8% of lung adenocarcinomas lacked evidence of tobacco-induced mutagenesis. These tumours also had similar detection rates for EGFR mutations and for RET, ROS1, ALK and MET oncogenic isoforms compared with tumours in never-smokers, which suggests that they have a similar aetiology and pathogenesis. Large subclonal expansions were associated with positive subclonal selection. Patients with tumours harbouring recent subclonal expansions, on the terminus of a phylogenetic branch, had significantly shorter disease-free survival. Subclonal WGD was detected in 19% of tumours, and 10% of tumours harboured multiple subclonal WGDs in parallel. Subclonal, but not truncal, WGD was associated with shorter disease-free survival. Copy number heterogeneity was associated with extrathoracic relapse within 1 year after surgery. These data demonstrate the importance of clonal expansion, WGD and copy number instability in determining the timing and patterns of relapse in non-small cell lung cancer and provide a comprehensive clinical cancer evolutionary data resource.

    The evolution of non-small cell lung cancer metastases in TRACERx

    Metastatic disease is responsible for the majority of cancer-related deaths. We report the longitudinal evolutionary analysis of 126 non-small cell lung cancer (NSCLC) tumours from 421 prospectively recruited patients in TRACERx who developed metastatic disease, compared with a control cohort of 144 non-metastatic tumours. In 25% of cases, metastases diverged early, before the last clonal sweep in the primary tumour, and early divergence was enriched for patients who were smokers at the time of initial diagnosis. Simulations suggested that early metastatic divergence more frequently occurred at smaller tumour diameters (less than 8 mm). Single-region primary tumour sampling resulted in 83% of late divergence cases being misclassified as early, highlighting the importance of extensive primary tumour sampling. Polyclonal dissemination, which was associated with extrathoracic disease recurrence, was found in 32% of cases. Primary lymph node disease contributed to metastatic relapse in less than 20% of cases, representing a hallmark of metastatic potential rather than a route to subsequent recurrences/disease progression. Metastasis-seeding subclones exhibited subclonal expansions within primary tumours, probably reflecting positive selection. Our findings highlight the importance of selection in metastatic clone evolution within untreated primary tumours, the distinction between monoclonal versus polyclonal seeding in dictating site of recurrence, the limitations of current radiological screening approaches for early diverging tumours and the need to develop strategies to target metastasis-seeding subclones before relapse.