This project explores the application of Big Data technologies to large-scale mental health analysis, focusing on the prevalence of depressive disorder symptoms across demographic and geographic subgroups. Using Apache Spark on Google Cloud Dataproc, the system processed millions of survey records stored in the Hadoop Distributed File System (HDFS). Through data preprocessing, aggregation, and visualization, the analysis revealed trends and disparities in mental health outcomes by age, race, education level, gender, and state. Seasonal variations and subgroup-specific confidence intervals were also examined to identify high-risk populations and areas of measurement uncertainty. The results offer actionable insights for public health decision-makers, supporting targeted interventions and equitable resource allocation. This work demonstrates the potential of scalable data processing frameworks to inform data-driven mental health strategies and highlights the role of computational tools in addressing public health challenges.