49 research outputs found

    BClean: A Bayesian Data Cleaning System

    There is a considerable body of work on data cleaning which employs various principles to rectify erroneous data and transform a dirty dataset into a cleaner one. One of the prevalent approaches is the use of probabilistic methods, including Bayesian methods. However, existing probabilistic methods often assume a simplistic distribution (e.g., a Gaussian distribution), which frequently underfits in practice, or they require experts to provide a complex prior distribution (e.g., via a programming language). This requirement is both labor-intensive and costly, rendering these methods less suitable for real-world applications. In this paper, we propose BClean, a Bayesian Cleaning system that features automatic Bayesian network construction and user interaction. We recast the data cleaning problem as a Bayesian inference that fully exploits the relationships between attributes in the observed dataset and any prior information provided by users. To this end, we present an automatic Bayesian network construction method that extends a structure learning-based functional dependency discovery method with similarity functions to capture the relationships between attributes. Furthermore, our system allows users to modify the generated Bayesian network in order to specify prior information or correct inaccuracies introduced by the automatic generation process. We also design an effective scoring model (called the compensative scoring model) necessary for the Bayesian inference. To enhance the efficiency of data cleaning, we propose several approximation strategies for the Bayesian inference, including graph partitioning, domain pruning, and pre-detection. By evaluating on both real-world and synthetic datasets, we demonstrate that BClean is capable of achieving an F-measure of up to 0.9 in data cleaning, outperforming existing Bayesian methods by 2% and other data cleaning methods by 15%. Comment: Our source code is available at https://github.com/yyssl88/BClea
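
The idea of treating a repair as Bayesian inference over related attributes can be illustrated with a toy sketch. This is not BClean's algorithm: it assumes a naive Bayes model in which every other attribute depends only on the target attribute, with Laplace smoothing, and the function name `repair_cell` and its parameters are hypothetical.

```python
import math
from collections import Counter

def repair_cell(rows, target, row, alpha=1.0):
    """Return the most probable value of row[target] given row's other attributes.

    Toy naive Bayes sketch (an assumption, not the BClean method):
    score(v) = log P(v) + sum over other attributes a of log P(row[a] | v),
    with all probabilities estimated from `rows` using Laplace smoothing.
    """
    others = [a for a in row if a != target]
    prior = Counter(r[target] for r in rows)   # counts of each target value
    domain = sorted(prior)                     # candidate repairs seen in the data
    # Co-occurrence counts of (target value, attribute value) per attribute.
    cooc = {a: Counter((r[target], r[a]) for r in rows) for a in others}
    # Number of distinct values per attribute, used for smoothing.
    sizes = {a: len({r[a] for r in rows}) for a in others}

    best, best_score = None, float("-inf")
    for v in domain:
        score = math.log((prior[v] + alpha) / (len(rows) + alpha * len(domain)))
        for a in others:
            score += math.log(
                (cooc[a][(v, row[a])] + alpha) / (prior[v] + alpha * sizes[a])
            )
        if score > best_score:
            best, best_score = v, score
    return best

# Hypothetical dirty table: the last row pairs a New York zip code with the wrong city.
rows = (
    [{"zip": "10001", "city": "New York"}] * 3
    + [{"zip": "94105", "city": "San Francisco"}] * 3
    + [{"zip": "10001", "city": "San Francisco"}]
)
suggested = repair_cell(rows, "city", rows[-1])
```

In this toy table the zip code co-occurs mostly with one city, so the inferred repair follows the majority pattern. A full system would learn the dependency structure rather than assume it.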

    Atlas of Transcription Factor Binding Sites from ENCODE DNase Hypersensitivity Data across 27 Tissue Types.

    Characterizing the tissue-specific binding sites of transcription factors (TFs) is essential to reconstruct gene regulatory networks and predict functions for non-coding genetic variation. DNase-seq footprinting enables the prediction of genome-wide binding sites for hundreds of TFs simultaneously. Despite the public availability of high-quality DNase-seq data from hundreds of samples, a comprehensive, up-to-date resource for the locations of genomic footprints is lacking. Here, we develop a scalable footprinting workflow using two state-of-the-art algorithms: Wellington and HINT. We apply our workflow to detect footprints in 192 ENCODE DNase-seq experiments and predict the genomic occupancy of 1,515 human TFs in 27 human tissues. We validate that these footprints overlap true-positive TF binding sites from ChIP-seq. We demonstrate that the locations, depth, and tissue specificity of footprints predict effects of genetic variants on gene expression and capture a substantial proportion of genetic risk for complex traits

    One-pot aqueous synthesis of cysteine-capped CdTe/CdS core-shell nanowires

    Highly fluorescent cysteine-capped CdTe/CdS core-shell nanowires were successfully synthesized by reacting CdCl2 with NaHTe in aqueous solution under reflux at 100 °C for 140 min. On increasing the reaction time from 10 to 140 min, CdTe/CdS nanocrystals gradually grew into nanorods and eventually evolved completely into nanowires. The nanowires have amino and carboxyl functional groups on their surfaces and can be well dispersed in aqueous solution. The as-prepared CdTe/CdS nanowires show a fluorescence quantum yield (QY) of 7.25% owing to the unique nature of cysteine and the formation of a CdS shell on the surface of the CdTe core. They have a narrower diameter distribution (d = ~5 nm) and a length in the range of 175-275 nm, and their aspect ratio is between 1/35 and 1/55.

    Fluid Field Modulation in Mass Transfer for Efficient Photocatalysis

    Mass transfer is an essential factor determining photocatalytic performance, and it can be modulated by the fluid field through manipulation of the kinetic characteristics of photocatalysts and photocatalytic intermediates. Past decades have witnessed efforts and achievements in manipulating mass transfer through photocatalyst structure and composition design, and thus a critical survey of recent progress on this topic is urgently needed. This review examines the basic principles of how mass transfer behavior impacts photocatalytic activity, along with a discussion of theoretical simulation calculations covering fluid flow speed and pattern. It also covers newly emerged photocatalytic micro/nanomotors with self-thermophoresis, self-diffusiophoresis, and bubble-propulsion mechanisms, as well as magnet-actuated photocatalytic artificial cilia for facilitating mass transfer. Furthermore, their applications in photocatalytic hydrogen evolution, carbon dioxide reduction, organic pollutant degradation, bacterial disinfection, and so forth are scrutinized. Finally, a brief summary and future outlook are presented, providing a viable guideline for those working in photocatalysis, mass transfer, and other related fields.

    The application advances of dendrimers in biomedical field

    Dendrimers are a family of nano-sized three-dimensional polymers with unique dendritic branching structures and compact spherical geometries. In recent years, dendrimers have achieved a series of breakthroughs in the biomedical field. In this review, we introduce the synthesis principles, modification methods, and new materials designed based on dendrimers; discuss the importance of the cytotoxicity of dendrimers for applications; and elaborate on their applications in molecular assembly and in cancer diagnosis and treatment. We speculate that in the near future, more new materials based on dendrimers will be applied in the biomedical field.

    Protective effects of Sapindus mukorossi Gaertn against fatty liver disease induced by high fat diet in rats

    OBJECTIVES: To study the effects of an alcohol extract of Sapindus mukorossi Gaertn (AESM) on blood fat metabolism, the morphology of fenestrated liver sinusoidal endothelial cells (LSEC), and the ultrastructure of liver cells in rats with non-alcoholic fatty liver disease (NAFLD). METHODS: SD rats were divided into a control group, a model group, a simvastatin (7.2 mg/kg) group, and S. mukorossi Gaertn groups at high dosage (0.5 g/kg), moderate dosage (0.1 g/kg), and low dosage (0.05 g/kg). After 3 weeks of feeding with fat-rich nutrients to establish the hepatic adipose model, intragastric administration was carried out while the rats continued to receive fat-rich nutrients. On the 43rd day, blood samples were taken to measure aminotransferase and the various blood fat indexes; hepatic tissue was taken for pathological sectioning, and hepatic morphology was observed under a light microscope; hepatic tissue was obtained and fixed after perfusion, and changes in the fenestrated LSEC were observed under a scanning electron microscope; the ultrastructure of liver cells was observed under a transmission electron microscope. RESULTS: High-dosage alcohol extract of S. mukorossi Gaertn lowered the AST, ALT, TC, TG, LDL, γ-GT, and ALP levels and raised the HDL and APN levels in the serum of the NAFLD rat model. In addition, observation under the light and electron microscopes showed that the morphology of the hepatic tissue and liver cells, as well as the fenestrated LSEC, returned to normal in the treatment group. CONCLUSIONS: Alcohol extract of S. mukorossi Gaertn can regulate blood fat levels and improve the pathological changes of hepatic tissue in the NAFLD rat model, demonstrating fat-lowering and liver-protecting effects.