
    Analysis and Synthesis of Metadata Goals for Scientific Data

    The proliferation of discipline-specific metadata schemes contributes to artificial barriers that can impede interdisciplinary and transdisciplinary research. The authors considered this problem by examining the domains, objectives, and architectures of nine metadata schemes used to document scientific data in the physical, life, and social sciences. They used a mixed-methods content analysis and Greenberg’s (2005) metadata objectives, principles, domains, and architectural layout (MODAL) framework, and derived 22 metadata-related goals from textual content describing each metadata scheme. Relationships are identified between the domains (e.g., scientific discipline and type of data) and the categories of scheme objectives. For each strong correlation (> 0.6), a Fisher’s exact test for nonparametric data was used to determine significance (p < .05). Significant relationships were found between the domains and objectives of the schemes. Schemes describing observational data are more likely to have “scheme harmonization” (compatibility and interoperability with related schemes) as an objective; schemes with the objective “abstraction” (a conceptual model exists separate from the technical implementation) also have the objective “sufficiency” (the scheme defines a minimal amount of information to meet the needs of the community); and schemes with the objective “data publication” do not have the objective “element refinement.” The analysis indicates that many metadata-driven goals expressed by communities are independent of scientific discipline or the type of data, although they are constrained by historical community practices and workflows as well as the technological environment at the time of scheme creation. The analysis reveals 11 fundamental metadata goals for metadata documenting scientific data in support of sharing research data across disciplines and domains. The authors report these results and highlight the need for more metadata-related research, particularly in the context of recent funding agency policy changes.

    Overcoming data scarcity of Twitter: using tweets as bootstrap with application to autism-related topic content analysis

    Notwithstanding recent work which has demonstrated the potential of using Twitter messages for content-specific data mining and analysis, the depth of such analysis is inherently limited by the scarcity of data imposed by the 140-character tweet limit. In this paper we describe a novel approach for targeted knowledge exploration which uses tweet content analysis as a preliminary step. This step is used to bootstrap more sophisticated data collection from directly related but much richer content sources. In particular, we demonstrate that valuable information can be collected by following URLs included in tweets. We automatically extract content from the corresponding web pages and, treating each web page as a document linked to the original tweet, show how a temporal topic model based on a hierarchical Dirichlet process can be used to track the evolution of a complex topic structure of a Twitter community. Using autism-related tweets, we demonstrate that our method is capable of capturing a much more meaningful picture of information exchange than user-chosen hashtags. Comment: IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, 201

    Online social capital : mood, topical and psycholinguistic analysis

    Social media provides rich sources of personal information and community interaction which can be linked to aspects of mental health. In this paper we investigate manifest properties of textual messages, including latent topics, psycholinguistic features, and authors' mood, in a large corpus of blog posts, to analyze the aspect of social capital in social media communities. Using data collected from Live Journal, we find that bloggers with lower social capital have fewer positive moods and more negative moods than those with higher social capital. We also find that people with low social capital have more random mood swings over time than people with high social capital. Significant differences are found between low and high social capital groups when characterized by a set of latent topics and psycholinguistic features derived from blog posts, suggesting that these discriminative features are useful for classification tasks. Good prediction is achieved when classifying among social capital groups using topic and linguistic features, with linguistic features found to have greater predictive power than latent topics. The significance of our work lies in the importance of online social capital to the potential construction of automatic healthcare monitoring systems. We further establish the link between mood and social capital in online communities, suggesting the foundation of new systems to monitor online mental well-being.

    Integrated Serologic Surveillance of Population Immunity and Disease Transmission.

    Antibodies are unique among biomarkers in their ability to identify persons with protective immunity to vaccine-preventable diseases and to measure past exposure to diverse pathogens. Most infectious disease surveillance maintains a single-disease focus, but broader testing of existing serologic surveys with multiplex antibody assays would create new opportunities for integrated surveillance. In this perspective, we highlight multiple areas for potential synergy where integrated surveillance could add more value to public health efforts than the current trend of independent disease monitoring through vertical programs. We describe innovations in laboratory and data science that should accelerate integration and identify remaining challenges with respect to specimen collection, testing, and analysis. Throughout, we illustrate how information generated through integrated surveillance platforms can create new opportunities to more quickly and precisely identify global health program gaps that range from undervaccination to emerging pathogens to multilayered health disparities that span diverse communicable diseases.

    A Simultaneous Extraction of Context and Community from pervasive signals using nested Dirichlet process

    Understanding user contexts and group structures plays a central role in pervasive computing. These contexts and community structures are complex to mine from data collected in the wild due to the unprecedented growth of data, noise, uncertainties and complexities. Typical existing approaches first extract the latent patterns that explain human dynamics or behaviors and then use them to formulate consistent numerical representations for community detection, often via a clustering method. While able to capture high-order and complex representations, these approaches perform the two steps separately. More importantly, they face a fundamental difficulty in determining the correct number of latent patterns and communities. This paper presents an approach that addresses these challenges by simultaneously discovering latent patterns and communities in a unified Bayesian nonparametric framework. Our Simultaneous Extraction of Context and Community (SECC) model is rooted in nested Dirichlet process theory, which allows a nested structure to be built to summarize data at multiple levels. We demonstrate our framework on five datasets where the advantages of the proposed approach are validated.

    Triaging Content Severity in Online Mental Health Forums

    Mental health forums are online communities where people express their issues and seek help from moderators and other users. In such forums, there are often posts with severe content indicating that the user is in acute distress and there is a risk of attempted self-harm. Moderators need to respond to these severe posts in a timely manner to prevent potential self-harm. However, the large volume of daily posted content makes it difficult for the moderators to locate and respond to these critical posts. We present a framework for triaging user content into four severity categories which are defined based on indications of self-harm ideation. Our models are based on a feature-rich classification framework which includes lexical, psycholinguistic, contextual and topic modeling features. Our approaches improve the state of the art in triaging the content severity in mental health forums by large margins (up to 17% improvement over the F-1 scores). Using the proposed model, we analyze the mental state of users and we show that overall, long-term users of the forum demonstrate a decreased severity of risk over time. Our analysis on the interaction of the moderators with the users further indicates that without an automatic way to identify critical content, it is indeed challenging for the moderators to provide timely response to the users in need. Comment: Accepted for publication in Journal of the Association for Information Science and Technology (2017)

    2020 - The First Annual Fall Symposium of Student Scholars

    The full program book from the Fall 2020 Symposium of Student Scholars, held on December 3, 2020. Includes abstracts from the presentations and posters.

    Network analysis of inflammation and symptoms in recent onset schizophrenia and the influence of minocycline during a clinical trial

    Attempts to delineate an immune subtype of schizophrenia have not yet led to the clear identification of potential treatment targets. An unbiased informatic approach at the level of individual immune cytokines and symptoms may reveal organisational structures underlying heterogeneity in schizophrenia, and potential for future therapies. The aim was to determine the network and relative influence of pro- and anti-inflammatory cytokines on depressive, positive, and negative symptoms. We further aimed to determine the effect of exposure to minocycline or placebo for 6 months on cytokine-symptom network connectivity and structure. Network analysis was applied to baseline and 6-month data from the large multi-center BeneMin trial of minocycline (N = 207) in schizophrenia. Pro-inflammatory cytokines IL-6, TNF-α, and IFN-γ had the greatest influence in the inflammatory network and were associated with depressive symptoms and suspiciousness at baseline. At 6 months, the placebo group network connectivity was 57% stronger than the minocycline group, due to significantly greater influence of TNF-α, early wakening, and pathological guilt. IL-6 and its downstream impact on TNF-α and IFN-γ could offer novel targets for treatment if offered at the relevant phenotypic profile including those with depression. Future targeted experimental studies of immune-based therapies are now needed.

    Opportunities for mental health interventions in rural Mississippi communities during the COVID-19 pandemic: A quantitative analysis

    COVID-19 presented unique challenges for rural Mississippi communities, including impacts on the mental health of rural individuals. This research study aimed to identify opportunities for mental health interventions to provide health promotion professionals with quantitative data on the accessibility and the likelihood of engagement with mental health-fostering behaviors. A secondary objective of this research was to categorize these behaviors within the constructs of the Social Ecological Model. Demographics for rural Mississippi communities were collected and displayed, and, using multivariate analyses including Spearman’s correlation and a Mann-Whitney U test, the correlation between mental health-fostering behaviors and demographic factors was obtained. Results showed differences in self-reported accessibility and likelihood of engagement when broken down by race, age, and gender.