48 research outputs found

    2023 SDSU Data Science Symposium Presentation Abstracts

    This document contains abstracts for presentations and posters at the 2023 SDSU Data Science Symposium

    Repeatable and reusable research - Exploring the needs of users for a Data Portal for Disease Phenotyping

    Background: Big data research in the field of health sciences is hindered by a lack of agreement on how to identify and define different conditions and their medications. This means that researchers and health professionals often have different phenotype definitions for the same condition. This lack of agreement makes it hard to compare different study findings and hinders the ability to conduct repeatable and reusable research. Objective: This thesis aims to examine the requirements of various users, such as researchers, clinicians, machine learning experts, and managers, for both new and existing data portals for phenotypes (concept libraries). Methods: Exploratory sequential mixed methods were used in this thesis to look at which concept libraries are available, how they are used, what their characteristics are, where there are gaps, and what needs to be done in the future from the point of view of the people who use them. This thesis consists of three phases: 1) two qualitative studies, including one-to-one interviews with researchers, clinicians, machine learning experts, and senior research managers in health data science, as well as focus group discussions with researchers working with the Secured Anonymized Information Linkage databank, 2) the creation of an email survey (i.e., the Concept Library Usability Scale), and 3) a quantitative study with researchers, health professionals, and clinicians. Results: Most of the participants thought that the prototype concept library would be a very helpful resource for conducting repeatable research, but they specified that many requirements are needed before its development. Although all the participants stated that they were aware of some existing concept libraries, most of them expressed negative perceptions about them. 
The participants mentioned several facilitators that would encourage them to: 1) share their work, such as receiving citations from other researchers; and 2) reuse the work of others, such as saving the considerable time and effort they frequently spend creating new code lists from scratch. They also pointed out several barriers that could inhibit them from: 1) sharing their work, such as concerns about intellectual property (e.g., if they shared their methods before publication, other researchers might use them as their own); and 2) reusing others' work, such as a lack of confidence in the quality and validity of others' code lists. Participants suggested developments that would make research done with routine data more reproducible, such as a drive for more transparency in research methods documentation, including publishing complete phenotype definitions and clear code lists. Conclusions: The findings of this thesis indicate that most participants valued a concept library for phenotypes. However, only half of the participants felt that they would contribute definitions to the concept library, and they reported many barriers to sharing their work on a publicly accessible platform such as the CALIBER research platform. Analysis of the interviews, focus group discussions, and qualitative studies revealed that different users have different requirements, facilitators, barriers, and concerns about concept libraries. This work set out to investigate whether concept libraries should be developed in Kuwait to facilitate improved data sharing. However, the recommendation at the end of this thesis is that this would be unlikely to be cost-effective or highly valued by users, and that investment in open access research publications may be of more value to the Kuwait research/academic community.

    Towards Artificial General Intelligence (AGI) in the Internet of Things (IoT): Opportunities and Challenges

    Artificial General Intelligence (AGI), possessing the capacity to comprehend, learn, and execute tasks with human cognitive abilities, engenders significant anticipation and intrigue across scientific, commercial, and societal arenas. This fascination extends particularly to the Internet of Things (IoT), a landscape characterized by the interconnection of countless devices, sensors, and systems, collectively gathering and sharing data to enable intelligent decision-making and automation. This research embarks on an exploration of the opportunities and challenges towards achieving AGI in the context of the IoT. Specifically, it starts by outlining the fundamental principles of IoT and the critical role of Artificial Intelligence (AI) in IoT systems. Subsequently, it delves into AGI fundamentals, culminating in the formulation of a conceptual framework for AGI's seamless integration within IoT. The application spectrum for AGI-infused IoT is broad, encompassing domains ranging from smart grids, residential environments, manufacturing, and transportation to environmental monitoring, agriculture, healthcare, and education. However, adapting AGI to resource-constrained IoT settings necessitates dedicated research efforts. Furthermore, the paper addresses constraints imposed by limited computing resources, intricacies associated with large-scale IoT communication, as well as the critical concerns pertaining to security and privacy

    Mining Behavioral Patterns from Mobile Big Data

    Mobile devices connected to the Internet are a ubiquitous platform that can easily record a large amount of data describing human behavior. Specifically, the data collected from mobile devices, referred to as mobile big data, reveal important social and economic information. Therefore, analyzing mobile big data is valuable for several stakeholders, ranging from smartphone manufacturers to network operators and app developers. This thesis aims to discover and understand behavioral patterns from mobile big data based on large real-world datasets. Specifically, this thesis reveals patterns from three domains: people, time, and location. First, we explore mobile big data from the people domain and propose a framework to discover users' daily activity patterns from their mobile app usage. By applying the framework to a real-world dataset consisting of 653,092 users, we successfully extract five common patterns among millions of people, including commuting, pervasive socializing, nightly entertainment, afternoon reading, and nightly socializing. Second, still from the people domain, we derive group health conditions from smartphone usage data. In particular, we collect mobile usage records of 452 users in North America. We then demonstrate the potential for inferring group health conditions (i.e., COVID-19 outbreak stages) by leveraging less privacy-sensitive smartphone data, including CPU usage, memory usage, and network connections. Third, we mine behavior patterns from the time domain. We reveal the evolution of mobile app usage by conducting a longitudinal study on 1,465 users from 2012 to 2017. The results show that users' app usage changes significantly over time. However, the evolution of app-category usage and that of individual app usage differ in terms of popularity distribution, usage diversity, and correlations. 
Last, with respect to the location domain, we leverage city-scale spatiotemporal mobile app usage data to reveal urban land usage patterns. We demonstrate a strong correlation between mobile usage behavior and location features, which brings a new angle to urban analytics.

    Modern Views of Machine Learning for Precision Psychiatry

    In light of the NIMH's Research Domain Criteria (RDoC), the advent of functional neuroimaging and of novel technologies and methods provides new opportunities to develop precise and personalized prognosis and diagnosis of mental disorders. Machine learning (ML) and artificial intelligence (AI) technologies are playing an increasingly critical role in the new era of precision psychiatry. Combining ML/AI with neuromodulation technologies can potentially provide explainable solutions in clinical practice and effective therapeutic treatment. Advanced wearable and mobile technologies also call for a new role of ML/AI in digital phenotyping for mobile mental health. In this article, we provide a comprehensive review of ML methodologies and applications that combine neuroimaging, neuromodulation, and advanced mobile technologies in psychiatric practice. Additionally, we review the role of ML in molecular phenotyping and cross-species biomarker identification in precision psychiatry. We further discuss explainable AI (XAI) and causality testing in a closed-human-in-the-loop manner, and highlight the potential of ML in multimedia information extraction and multimodal data fusion. Finally, we discuss conceptual and practical challenges in precision psychiatry and highlight ML opportunities in future research.