8 research outputs found

    Data Reduction and Deep-Learning Based Recovery for Geospatial Visualization and Satellite Imagery

    The storage, retrieval, and distribution of data are critical aspects of big data management. Data scientists and decision-makers often need to share large datasets and decide whether to archive or delete historical data to cope with resource constraints. As a consequence, there is an urgent need to reduce storage and transmission requirements. A potential approach is to reduce big datasets into smaller ones, which not only lowers storage requirements but also lightens the load of transfers over the network. High-dimensional data often exhibit high repetitiveness and recurring patterns across different dimensions. Data carefully prepared by removing redundancies, along with a machine learning model capable of reconstructing the whole dataset from its reduced version, can improve storage scalability and data transfer, and speed up the overall data management pipeline. In this thesis, we explore data reduction strategies for big datasets while ensuring that the data can be transferred and used ubiquitously by all stakeholders, i.e., the entire dataset can be reconstructed with high quality whenever necessary. One of our data reduction strategies follows a straightforward uniform pattern, which guarantees a minimum of 75% data size reduction. We also propose a novel variance-based reduction technique, which focuses on removing only redundant data and offers an additional 1% to 2% deletion rate. We adopted various traditional machine learning and deep learning approaches for high-quality reconstruction. We evaluated our pipelines on big geospatial data and satellite imagery. Among them, our deep learning approaches performed very well both quantitatively and qualitatively, with the capability of reconstructing high-quality features. We also show how to leverage temporal data for better reconstruction. 
For uniform deletion, the observed reconstruction accuracy is as high as 98.75% on average for spatial meteorological data (e.g., soil moisture and albedo) and 99.09% for satellite imagery. When the deletion rate is pushed further with the variance-based deletion method, the decrease in accuracy remains within 1% for spatial meteorological data and 7% for satellite imagery.
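
The abstract does not spell out the uniform pattern, so the following is only a minimal sketch under an assumption: one cell is kept from every full 2 x 2 block of a grid, which removes at least 75% of the cells. The function names (`uniform_reduce`, `nearest_reconstruct`, `reduction_rate`) are hypothetical, and the nearest-neighbour fill is a naive stand-in for the learned reconstruction models the thesis describes, not the authors' method.

```python
def uniform_reduce(grid, step=2):
    """Keep the last cell of every full step x step block (assumed pattern)."""
    return [row[step - 1::step] for row in grid[step - 1::step]]


def nearest_reconstruct(reduced, height, width, step=2):
    """Naive baseline: copy each kept cell over its block.

    Cells in a trailing partial block reuse the last kept row or column.
    """
    last_r = len(reduced) - 1
    last_c = len(reduced[0]) - 1
    return [
        [reduced[min(r // step, last_r)][min(c // step, last_c)]
         for c in range(width)]
        for r in range(height)
    ]


def reduction_rate(grid, reduced):
    """Fraction of cells removed by the reduction."""
    total = len(grid) * len(grid[0])
    kept = len(reduced) * len(reduced[0])
    return 1 - kept / total


# Hypothetical 4 x 4 grid holding the values 0..15.
grid = [[r * 4 + c for c in range(4)] for r in range(4)]
reduced = uniform_reduce(grid)
restored = nearest_reconstruct(reduced, height=4, width=4)
rate = reduction_rate(grid, reduced)
```

Because only full blocks contribute a kept cell, the kept count is at most a quarter of the grid, so the reduction rate in this sketch never drops below 75% for grids of at least 2 x 2 cells.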

    Distinguishing Human Generated Text From ChatGPT Generated Text Using Machine Learning

    ChatGPT is a conversational artificial intelligence that belongs to the generative pre-trained transformer (GPT) family of large language models. This text-generation model was fine-tuned with both supervised learning and reinforcement learning so that it can produce text that appears to have been written by a human. Although this generative model has numerous advantages, it also raises some reasonable concerns. This paper presents a machine learning-based solution that can distinguish ChatGPT-generated text from human-written text, along with a comparative analysis of a total of 11 machine learning and deep learning algorithms in the classification process. We tested the proposed model on a Kaggle dataset consisting of 10,000 texts, of which 5,204 were written by humans and collected from news and social media. On the corpus generated by GPT-3.5, the proposed algorithm achieves an accuracy of 77%.
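
To put the reported accuracy in context, the short calculation below works out the class balance and the accuracy of a trivial majority-class classifier. It rests on two assumptions that the abstract does not confirm: that every text not written by a human was generated by ChatGPT, and that the 77% figure was measured on this same dataset. The variable names are illustrative only.

```python
total_texts = 10_000          # dataset size stated in the abstract
human_texts = 5_204           # human-written texts stated in the abstract

# Assumption: all remaining texts are ChatGPT-generated.
chatgpt_texts = total_texts - human_texts

# A classifier that always predicts the larger class scores this accuracy.
majority_baseline = max(human_texts, chatgpt_texts) / total_texts

reported_accuracy = 0.77      # accuracy stated in the abstract
gain_over_baseline = reported_accuracy - majority_baseline

print(f"ChatGPT texts: {chatgpt_texts}")
print(f"Majority-class baseline: {majority_baseline:.2%}")
print(f"Gain over baseline: {gain_over_baseline:.2%}")
```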

    Understanding complex causes of suicidal behaviour among graduates in Bangladesh

    This study uses both fieldwork and desk-based discourse analysis of newspaper reports to investigate the concerning number of suicides among graduates in Bangladesh. According to some reports, a majority of suicide cases involve young adults between the ages of 20 and 32 who are either currently studying at university or have recently completed their degree. This research contends that patriarchal social expectations in Bangladesh place significant pressure on young adults to secure well-paying jobs to support their families and uphold their family’s status, which can have a negative impact on their mental health. Furthermore, this article identifies additional risk factors that contribute to the high suicide rates among graduates in Bangladesh. These factors include unemployment, poverty, relationship problems, drug addiction, political marginalization, and the stigma of shame, all of which can cause low self-esteem and suicidal thoughts. Moreover, the research suggests that families in Bangladesh have not been providing adequate support to their young members when they face challenges in life. On the contrary, families have added to the pressure on young adults, which can be attributed to Joiner’s theory of the effect of industrialization on family norms and values.

    How does quality deviate in stable releases by backporting?

    The sample data (the list of pull requests considered) is provided to visualize the file evolution of Ansible, Kibana, and Salt Project in terms of metric values. The change patches of the files changed by backports are documented in the change_patch_data_code_stable/Codes/ansible::ansible/ directory (we provide change patch data for the Ansible project).

    Prevalence of latent tuberculosis infection in Asian nations: A systematic review and meta‐analysis

    Background: Tuberculosis (TB) is a serious public health concern around the world, including Asia. The TB burden is high in Asian countries, and a significant share of the population harbors latent tuberculosis infection (LTBI). Aim: This systematic review and meta‐analysis aims to evaluate the prevalence of LTBI in Asian countries. Method: We performed a systematic literature search on PubMed, Embase, and ScienceDirect to identify relevant articles published between January 1, 2005, and January 1, 2023, investigating the overall prevalence of latent TB among people of Asia. Subgroup analysis was done for Asian subregions during the study periods of 2011 to 2016 and 2017 to 2023, for tuberculin skin test (TST) and interferon gamma release assay (IGRA), respectively, as well as for QuantiFERON‐TB (QFT) and TSPOT TB tests. DerSimonian and Laird's random‐effects model was used to pool the prevalence of LTBI found using TST and IGRA. Result: A total of 15 studies were included after a systematic search of standard electronic databases. The analysis showed that the prevalence of latent TB in Asia was 21% (95% confidence interval [CI]: 19%–23%) and 36% (95% CI: 12%–59%) according to IGRA and TST (cut off 10 mm) results, respectively. Based on IGRA, the prevalence of latent TB was 20% (95% CI: 13%–25%) in 2011 to 2016 and 21% (95% CI: 18%–24%) in 2017 to 2023. Using QFT, the prevalence was 19% (95% CI: 17%–22%), and using TSPOT, the prevalence was 26% (95% CI: 21%–31%). According to the United Nations division of Asia, the prevalence was highest in the Southern region and lowest in the Western region using TST, and highest in the South‐Eastern region and lowest in the Western region using IGRA. Conclusion: Almost a quarter of the Asian population has LTBI. Its diagnosis is often challenging because standard tests are unavailable in certain areas. 
Given this prevalence, a mass screening program with the available standard tests is suggested, together with public awareness efforts, and an anti‐TB regimen should be considered for individuals who test positive. However, for this to be implemented effectively, the affordability, availability, and cost‐effectiveness of such interventions need to be taken into account.
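
The pooling step named in the Method can be illustrated with a minimal sketch of the DerSimonian and Laird random-effects estimator. It assumes raw (untransformed) proportions with binomial variances, which the abstract does not specify; the review may have used a transformation or dedicated meta-analysis software. The function name `dersimonian_laird` and the study figures below are hypothetical and are not data from the review.

```python
import math


def dersimonian_laird(proportions, sizes, z=1.96):
    """Pool study proportions with a DerSimonian-Laird random-effects model.

    Assumes every proportion lies strictly between 0 and 1, since a value
    of 0 or 1 gives a zero variance and an undefined weight.
    Returns (pooled estimate, (ci_low, ci_high), tau squared).
    """
    # Within-study variance of each raw proportion (binomial).
    variances = [p * (1 - p) / n for p, n in zip(proportions, sizes)]
    weights = [1 / v for v in variances]
    w_sum = sum(weights)

    # Fixed-effect estimate, needed for Cochran's Q.
    fixed = sum(w * p for w, p in zip(weights, proportions)) / w_sum
    q = sum(w * (p - fixed) ** 2 for w, p in zip(weights, proportions))

    # Method-of-moments estimate of the between-study variance tau^2.
    k = len(proportions)
    c = w_sum - sum(w ** 2 for w in weights) / w_sum
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    re_weights = [1 / (v + tau2) for v in variances]
    re_sum = sum(re_weights)
    pooled = sum(w * p for w, p in zip(re_weights, proportions)) / re_sum
    se = math.sqrt(1 / re_sum)
    return pooled, (pooled - z * se, pooled + z * se), tau2


# Hypothetical studies: observed LTBI prevalence and sample size.
example_prevalence = [0.18, 0.22, 0.25, 0.20]
example_sizes = [400, 650, 300, 1200]
pooled, ci, tau2 = dersimonian_laird(example_prevalence, example_sizes)
print(f"pooled={pooled:.3f}, 95% CI=({ci[0]:.3f}, {ci[1]:.3f}), tau2={tau2:.5f}")
```

When the studies agree closely, tau squared is truncated to zero and the result coincides with the fixed-effect estimate; larger between-study spread widens the confidence interval.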