15 research outputs found

    Secure Deduplication Based on Rabin Fingerprinting over Wireless Sensing Data in Cloud Computing

    The rapid advancements in the Internet of Things (IoT) and cloud computing technologies have significantly promoted the collection and sharing of various data. To reduce the communication cost and the storage overhead, it is necessary to exploit data deduplication mechanisms. However, existing data deduplication technologies still suffer from security and efficiency drawbacks. In this paper, we propose two secure data deduplication schemes based on Rabin fingerprinting over wireless sensing data in cloud computing. The first scheme is based on deterministic tags and the other adopts random tags. The proposed schemes perform data deduplication before the data is outsourced to the cloud storage server, and hence both the communication cost and the computation cost are reduced. In particular, variable-size block-level deduplication is enabled by Rabin fingerprinting, which generates data blocks based on the content of the data. Before outsourcing data to the cloud, users encrypt the data with convergent encryption, which protects the data from being accessed by unauthorized users. Our security analysis shows that the proposed schemes are secure against offline brute-force dictionary attacks. In addition, the random tag makes the second scheme more reliable. Extensive experimental results indicate that the proposed data deduplication schemes are efficient in terms of the deduplication rate, the system operation time, and the tag generation time.
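
As a rough illustration of the content-defined, variable-size chunking and deterministic tagging described in this abstract, here is a minimal Python sketch. It is not the authors' scheme. It assumes a simple polynomial rolling hash as a stand-in for true Rabin fingerprinting, derives a convergent-style key as the SHA-256 of each block, and omits the encryption step entirely. All constants and function names (`chunk`, `convergent_key`, `tag_of`, `upload`) are hypothetical.

```python
import hashlib

# All parameters below are illustrative assumptions, not values from the paper.
WINDOW = 16            # bytes covered by the rolling hash window
MASK = (1 << 6) - 1    # cut when the low 6 bits of the hash are zero
BASE = 257
MOD = (1 << 31) - 1
MIN_CHUNK = 32         # never cut before this many bytes
MAX_CHUNK = 256        # always cut at this many bytes


def chunk(data: bytes) -> list[bytes]:
    """Split data at content-defined boundaries using a rolling hash."""
    chunks = []
    start = 0
    h = 0
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    for i, b in enumerate(data):
        pos = i - start  # offset of this byte inside the current chunk
        if pos >= WINDOW:
            # Drop the oldest byte so the hash covers only the last WINDOW bytes.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + b) % MOD
        length = pos + 1
        if (length >= MIN_CHUNK and (h & MASK) == 0) or length >= MAX_CHUNK:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def convergent_key(block: bytes) -> bytes:
    """Key derived from the block content, so equal blocks yield equal keys."""
    return hashlib.sha256(block).digest()


def tag_of(block: bytes) -> str:
    """Deterministic tag: hash of the convergent key (an assumed construction)."""
    return hashlib.sha256(convergent_key(block)).hexdigest()


def upload(data: bytes, store: dict[str, int]) -> int:
    """Record only blocks whose tag is not yet known; return how many were new."""
    new_blocks = 0
    for block in chunk(data):
        tag = tag_of(block)
        if tag not in store:
            store[tag] = len(block)  # a real system would store the ciphertext
            new_blocks += 1
    return new_blocks
```

Because the cut points depend only on the most recent `WINDOW` bytes, an insertion near the start of a file tends to shift only nearby boundaries, which is what makes variable-size chunking attractive for deduplication. Uploading the same data a second time with `upload` adds no new blocks to `store`.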

    Identification of Smoking-Associated Transcriptome Aberration in Blood with Machine Learning Methods

    Long-term cigarette smoking causes various human diseases, including respiratory disease, cancer, and gastrointestinal (GI) disorders. Alterations in gene expression and variable splicing processes induced by smoking are associated with the development of diseases. This study applied advanced machine learning methods to identify the isoforms with important roles in distinguishing current smokers from former smokers, based on the isoform expression profiles of current and former smokers collected in one previous study. These isoforms were treated as features, which were first analyzed with Boruta to select features highly correlated with the target variables. Then, the selected features were evaluated by four feature ranking algorithms, resulting in four feature lists. The incremental feature selection method was applied to each list to obtain the optimal feature subsets and build high-performance classification models. Furthermore, a series of classification rules were extracted from the decision tree with the highest performance. Finally, the rationality of the mined isoforms (features) and classification rules was verified by reviewing previous research. Features such as isoforms ENST00000464835 (expressed by LRRN3), ENST00000622663 (expressed by SASH1), and ENST00000284311 (expressed by GPR15), and pathways (cytotoxicity mediated by natural killer cell and cytokine–cytokine receptor interaction) revealed by the enrichment analysis, were highly relevant to smoking response, suggesting the robustness of our analysis pipeline.
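
The incremental feature selection step mentioned in this abstract can be sketched as follows: given a ranked feature list, a classifier is evaluated on the top 1, top 2, ..., top N features, and the smallest subset with the best score is kept. The sketch below is a simplified illustration under stated assumptions, not the study's pipeline. It swaps in a nearest-centroid classifier with leave-one-out accuracy in place of the classifiers and evaluation used in the paper, and all function names are hypothetical.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Predict the label whose class centroid is closest to x."""
    sums, counts = {}, {}
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label in sorted(sums):
        centroid = [s / counts[label] for s in sums[label]]
        dist = sum((a - c) ** 2 for a, c in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_accuracy(X, y):
    """Leave-one-out accuracy of the stand-in classifier."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += nearest_centroid_predict(train_X, train_y, X[i]) == y[i]
    return hits / len(X)


def incremental_feature_selection(X, y, ranked):
    """Grow the feature set along the ranking; keep the first best-scoring prefix."""
    best_k, best_acc = 0, -1.0
    for k in range(1, len(ranked) + 1):
        cols = ranked[:k]
        subset = [[row[c] for c in cols] for row in X]
        acc = loo_accuracy(subset, y)
        if acc > best_acc:  # strict comparison prefers the smaller subset on ties
            best_k, best_acc = k, acc
    return ranked[:best_k], best_acc
```

In the study, this loop would be run once per ranked list (four lists in total), with each list produced by a different ranking algorithm.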

    Metabonomics reveals the main small molecules differences between green and white egg shells in ducks

    Eggshell colour is related to the biological functions of birds, and the colour of poultry eggshells can affect consumers' choices. This study explored the difference in the metabolite composition of duck eggshells to screen the key substances that affect eggshell colour. Green and white duck eggshells were selected for non-targeted metabolomics analysis. We screened 402 and 512 differentially expressed metabolites (DEMs) in the negative and positive ion modes, respectively. Among them, 40 DEMs were annotated with specific names and related functions, of which the expression levels of 8 metabolites showed extremely significant differences. They were 2-heptanone, 12-hydroxydodecanoic acid, D-fructose, dodecanedioic acid, L-leucine, methyl jasmonate, palmitoleic acid, and styrene oxide. Additionally, the annotated DEMs were enriched in 33 metabolic pathways, including aminoacyl-tRNA biosynthesis, amino acid metabolism, and galactose metabolism. The results showed that the expression of metabolites differed between green and white duck eggshells; among metabolites with extremely significant differences in expression, the expression level of L-leucine in green eggshells was higher than that in white eggshells. Therefore, we speculated that the increased expression of L-leucine promoted the response of related metabolic pathways, enhanced the expression of antioxidants, and changed the eggshell colour.
    Highlights: The metabolites of green and white shells were different. 40 DEMs were annotated with specific names and related functions. The annotated DEMs were enriched in 33 metabolic pathways.

    Using Machine Learning Methods in Identifying Genes Associated with COVID-19 in Cardiomyocytes and Cardiac Vascular Endothelial Cells

    Coronavirus disease 2019 (COVID-19) not only causes respiratory system damage, but also imposes strain on the cardiovascular system. Vascular endothelial cells and cardiomyocytes play an important role in cardiac function. The aberrant expression of genes in vascular endothelial cells and cardiomyocytes can lead to cardiovascular diseases. In this study, we sought to explain the influence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection on the gene expression levels of vascular endothelial cells and cardiomyocytes. We designed an advanced machine learning-based workflow to analyze the gene expression profile data of vascular endothelial cells and cardiomyocytes from patients with COVID-19 and healthy controls. An incremental feature selection method with a decision tree was used to build efficient classifiers and to summarize quantitative classification genes and rules. Some key genes that exert important effects on cardiac function, such as MALAT1, MT-CO1, and CD36, were extracted from the gene expression matrix of 104,182 cardiomyocytes (12,007 cells from patients with COVID-19 and 92,175 cells from healthy controls) and 22,438 vascular endothelial cells (10,812 cells from patients with COVID-19 and 11,626 cells from healthy controls). The findings reported in this study may provide insights into the effect of COVID-19 on cardiac cells and further explain the pathogenesis of COVID-19, and they may facilitate the identification of potential therapeutic targets.

    Immune responses of different COVID-19 vaccination strategies by analyzing single-cell RNA sequencing data from multiple tissues using machine learning methods

    Multiple types of COVID-19 vaccines have been shown to be highly effective in preventing SARS-CoV-2 infection and in reducing post-infection symptoms. Almost all of these vaccines induce systemic immune responses, but differences in immune responses induced by different vaccination regimens are evident. This study aimed to reveal the differences in immune gene expression levels of different target cells under different vaccine strategies after SARS-CoV-2 infection in hamsters. A machine learning-based process was designed to analyze single-cell transcriptomic data of different cell types from the blood, lung, and nasal mucosa of hamsters infected with SARS-CoV-2, including B and T cells from the blood and nasal cavity, macrophages from the lung and nasal cavity, and alveolar epithelial and lung endothelial cells. The cohort was divided into five groups: non-vaccinated (control), 2*adenovirus (two doses of adenovirus vaccine), 2*attenuated (two doses of attenuated virus vaccine), 2*mRNA (two doses of mRNA vaccine), and mRNA/attenuated (primed by mRNA vaccine, boosted by attenuated vaccine). All genes were ranked using five feature ranking methods (LASSO, LightGBM, Monte Carlo feature selection, mRMR, and permutation feature importance). Some key genes that contributed to the analysis of immune changes, such as RPS23, DDX5, and PFN1 in immune cells, and IRF9 and MX1 in tissue cells, were screened. Afterward, the five ranked feature lists were fed into the incremental feature selection framework, which contained two classification algorithms (decision tree [DT] and random forest [RF]), to construct optimal classifiers and generate quantitative rules. Results showed that RF classifiers provided relatively higher performance than DT classifiers, whereas the DT classifiers provided quantitative rules that indicated specific gene expression levels under different vaccine strategies. These findings may help us to develop better protective vaccination programs and new vaccines.

    Clinical and Genomic Analysis of Liver Abscess-Causing Klebsiella pneumoniae Identifies New Liver Abscess-Associated Virulence Genes

    Hypervirulent variants of Klebsiella pneumoniae (hvKp) that cause invasive community-acquired pyogenic liver abscess have emerged globally. Little is known about the virulence determinants associated with hvKp, except for the virulence genes rmpA/A2 and siderophores (iroBCD/iucABCD) carried by the pK2044-like large virulence plasmid. Here, we collected the most recent clinical isolates of hvKp from pyogenic liver abscess (PLA) samples in China, and performed clinical, molecular, and genomic sequencing analyses. We found that 90.9% (40/44) of the pathogens causing PLA were K. pneumoniae. Among the 40 liver abscess-causing K. pneumoniae (LA-Kp) isolates, K1 (62.5%) and K2 (17.5%) were the dominant serotypes, and ST23 (47.5%) was the major sequence type. S1-PFGE analyses demonstrated that although 77.5% (31/40) of the LA-Kp isolates harbored a single large virulence plasmid that varied in size, 5 (12.5%) isolates had no plasmid and 4 (10%) had two or three plasmids. Whole genome sequencing and comparative analysis of 3 LA-Kp and 3 non-LA-Kp identified 133 genes present only in LA-Kp. Further, large-scale screening of the 133 genes in 45 LA-Kp and 103 non-LA-Kp genome sequences from public databases identified 30 genes that were highly associated with LA-Kp, including iroBCD, iucABCD, and rmpA/A2, as well as 21 new genes. These 21 new genes were then analyzed by PCR in 40 LA-Kp and 86 non-LA-Kp clinical isolates collected in this study, showing that the new genes were present in 80-100% of LA-Kp isolates but in only 2-11% of K. pneumoniae isolates from sputum and urine. Several of the 21 genes have been proposed as virulence factors in other bacteria, such as the gene encoding SAM-dependent methyltransferase and pagO, which protects bacteria from phagocytosis. Taken together, these genes are likely new virulence factors contributing to the hypervirulence phenotype of hvKp, and may deepen our understanding of the virulence mechanisms of hvKp.