66 research outputs found

    Archiving the Relaxed Consistency Web

    The historical, cultural, and intellectual importance of archiving the web has been widely recognized. Today, all countries with high Internet penetration rates have established high-profile archiving initiatives to crawl and archive fast-disappearing web content for long-term use. As web technologies evolve, established web archiving techniques face challenges. This paper focuses on the potential impact of relaxed consistency web design on crawler-driven web archiving. Relaxed consistency websites may disseminate, albeit ephemerally, inaccurate and even contradictory information. If captured and preserved in web archives as historical records, such information will degrade the overall archival quality. To assess the extent of this quality degradation, we build a simplified feed-following application and simulate its operation with synthetic workloads. The results indicate that a non-trivial portion of a relaxed consistency web archive may contain observable inconsistency, and that the inconsistency window may extend significantly longer than that observed at the data store. We discuss the nature of such quality degradation and propose a few possible remedies. Comment: 10 pages, 6 figures, CIKM 201

    The effects of local voids and imperfections of surrounding rock on the performance of existing tunnel lining

    Local voids and imperfections may exist around a tunnel for reasons such as inadequate backfill behind the lining, insufficient local lining thickness, groundwater erosion, and other construction defects. Such local voids and imperfections generally lead to local contact loss and discontinuity in the ground-lining interaction. This paper evaluates the effect of local voids and imperfections developing around the tunnel vault area on the mechanical performance of the tunnel lining. Based on field investigation results, a series of voids and imperfections with different geometries are defined to reflect cases resulting from different causes. Numerical parametric analyses were performed to investigate how those voids and imperfections influence the internal force and the safety factor of the lining, and the reinforced concrete lining was modelled with the smeared crack model to examine the development of cracking directions and patterns. Furthermore, the numerical approach was verified by comparison with field investigations and measurements. This study aims to investigate the most unsafe situation due to local voids and imperfections around the tunnel, and the modelled cracking features show a way to preliminarily evaluate the possible local voids and imperfections behind the tunnel lining based on field observation

    BONE MORPHOGENETIC PROTEIN-2 AND COLLAGEN TYPE 1 FROM DIFFERENT SOURCES OF DEMINERALIZED DENTINE MATRIX: RELEASE KINETIC AND CHEMOTAXIS POTENTIAL FOR OSTEOPROGENITOR CELLS

    Objective: To investigate the release of bone morphogenetic protein-2 (BMP-2) and collagen type I (COL1) proteins from different sources of demineralized dentine matrix (DDM) and their chemotaxis to mouse osteoprogenitor cells. Methods: The release kinetics of BMP-2 and COL1 were measured from custom-made DDM (CMDDM) and commercially available DDM (CADDM). Using the Urist physicochemical method, CMDDM was collected from extracted teeth in a certified dental clinic. Levels of BMP-2 and COL1 released were measured at days 1, 2, 3, 5, 7, 9, 11, and 13. Next, mouse osteoprogenitor cells, MC3T3-E1, were cultured with the following materials in a transwell system: CMDDM, CADDM, Bio-Oss®, and a blank control. The number of migrated cells was determined by crystal violet staining to explore the chemotaxis of the different DDMs to mouse osteoprogenitor cells. Results: BMP-2 was detected at 588.32 ± 14.53 pg/ml from 5 g of CMDDM. In the release kinetic assay, the concentration of BMP-2 in the CMDDM group increased rapidly and peaked at 113.9 pg/ml on day 5, almost four times higher than that of CADDM. The release of COL1 showed a similar pattern in both CMDDM and CADDM; however, the amount was significantly higher in the CMDDM group. In the cell culture experiment, the number of migrated MC3T3-E1 cells was highest in CMDDM, followed by CADDM and Bio-Oss® (p<0.05). Conclusion: CMDDM released more BMP-2 and COL1 than CADDM, which can induce more osteoblast-like cell migration. These results demonstrated the release kinetics of proteins and the osteoinductivity of CMDDM, which supports a benefit of using autogenous bone graft

    Effects of Ultra-high Pressure Assisted Enzymatic Hydrolysis on Structure and Antioxidant Activity of Hemp Protein Isolate

    Hemp protein isolate (HPI) was modified through ultra-high pressure assisted enzymatic hydrolysis. The SDS-PAGE electrophoresis characteristics, surface hydrophobicity, sulfhydryl content, FTIR and endogenous fluorescence of the hydrolysate of hemp protein isolate (HPIH) were determined under different pressures to investigate the structural changes of HPI before and after modification. The results showed that ultra-high pressure (UHP) treatment (0.1, 100, 200, 300 MPa) had an auxiliary effect on the HPI enzymolysis reaction: as pressure increased, the degree of enzymolysis gradually increased and the molecular weight gradually decreased. After HPI modification, the hydrophobic groups were gradually exposed, and the surface hydrophobicity first increased and then decreased with increasing pressure; the differences were significant (P<0.05). The surface hydrophobicity reached its maximum at 200 MPa. After enzymolysis, the free sulfhydryl content of HPIH decreased significantly (P<0.05), while the surface sulfhydryl content first increased and then decreased with increasing pressure. The determination of amino acid composition and content before and after modification showed that the amino acid composition of HPI remained unchanged, but the contents of various amino acids decreased to varying degrees. According to Fourier transform infrared (FTIR) spectroscopy, the absorption peak intensity, peak shape and peak area of HPIH changed to different degrees compared with HPI, indicating that the secondary structure of the protein was changed by the ultra-high pressure assisted enzymatic hydrolysis reaction. The endogenous fluorescence spectra showed that the fluorescence intensity of HPIH increased and the maximum emission wavelength was redshifted, indicating that the tertiary structure of HPI was changed by the enzymatic hydrolysis reaction. The antioxidant activity results showed that appropriate pressure treatment could effectively improve the antioxidant capacity of the enzymatic hydrolysis products. At 200 MPa, the reducing power of HPIH toward DPPH· and ABTS+· was highest. In conclusion, ultra-high pressure assisted enzymatic hydrolysis modification can significantly change the secondary and tertiary structure of HPI, exposing hydrophobic groups and other active groups, thereby improving its antioxidant properties

    Data-Juicer: A One-Stop Data Processing System for Large Language Models

    The immense evolution in Large Language Models (LLMs) has underscored the importance of massive, diverse, and high-quality data. Despite this, existing open-source tools for LLM data processing remain limited and mostly tailored to specific datasets, with an emphasis on the reproducibility of released data over adaptability and usability, inhibiting potential applications. In response, we propose a one-stop, powerful yet flexible and user-friendly LLM data processing system named Data-Juicer. Our system offers over 50 built-in versatile operators and pluggable tools, which synergize modularity, composability, and extensibility dedicated to diverse LLM data processing needs. By incorporating visualized and automatic evaluation capabilities, Data-Juicer enables a timely feedback loop to accelerate data processing and gain data insights. To enhance usability, Data-Juicer provides out-of-the-box components for users with various backgrounds, and fruitful data recipes for LLM pre-training and post-tuning usages. Further, we employ multi-facet system optimization and seamlessly integrate Data-Juicer with both LLM and distributed computing ecosystems, to enable efficient and scalable data processing. Empirical validation of the generated data recipes reveals considerable improvements in LLaMA performance for various pre-training and post-tuning cases, demonstrating up to 7.45% relative improvement of averaged score across 16 LLM benchmarks and 16.25% higher win rate using pair-wise GPT-4 evaluation. The system's efficiency and scalability are also validated, supported by up to 88.7% reduction in single-machine processing time, 77.1% and 73.1% less memory and CPU usage respectively, and 7.91x processing acceleration when utilizing distributed computing ecosystems. Our system, data recipes, and multiple tutorial demos are released, calling for broader research centered on LLM data. Comment: Under continuous maintenance and updating; The system, refined data recipes, and demos are at https://github.com/alibaba/data-juice

    The Invasive MED/Q Bemisia tabaci Genome: A Tale of Gene Loss and Gene Gain

    Background: Sweetpotato whitefly, Bemisia tabaci MED/Q and MEAM1/B, are two economically important invasive species that cause considerable damage to agricultural crops through direct feeding and indirect vectoring of plant pathogens. Recently, a draft genome of B. tabaci MED/Q has been assembled. In this study, we focus on the genomic comparison between MED/Q and MEAM1/B, with a special interest in MED/Q’s genomic signatures that may contribute to the highly invasive nature of this emerging insect pest. Results: The genomes of both species share similarity in syntenic blocks, but have significant divergence in the gene coding sequence. Expansion of cytochrome P450 monooxygenases and UDP glycosyltransferases in the MED/Q and MEAM1/B genomes is functionally validated for mediating insecticide resistance in MED/Q using in vivo RNAi. The amino acid biosynthesis pathways in the MED/Q genome are partitioned among the host and endosymbiont genomes in a manner distinct from other hemipterans. Evidence of horizontal gene transfer to the host genome may explain their obligate relationship. Putative loss-of-function in the immune deficiency-signaling pathway due to gene loss is a shared ancestral trait among hemipteran insects. Conclusions: The expansion of detoxification gene families, such as P450s, may contribute to the development of insecticide resistance traits and a broad host range in MED/Q and MEAM1/B, and facilitate the species’ invasions into intensively managed cropping systems. Numerical and compositional changes in multiple gene families (gene loss and gene gain) in the MED/Q genome set a foundation for future hypothesis testing that will advance our understanding of adaptation, viral transmission, symbiosis, and plant-insect-pathogen tritrophic interactions

    A comprehensive update on CIDO: the community-based coronavirus infectious disease ontology

    The current COVID-19 pandemic and the previous SARS/MERS outbreaks of 2003 and 2012 have resulted in a series of major global public health crises. We argue that in the interest of developing effective and safe vaccines and drugs and to better understand coronaviruses and associated disease mechanisms it is necessary to integrate the large and exponentially growing body of heterogeneous coronavirus data. Ontologies play an important role in standard-based knowledge and data representation, integration, sharing, and analysis. Accordingly, we initiated the development of the community-based Coronavirus Infectious Disease Ontology in early 2020. As an Open Biomedical Ontology (OBO) library ontology, CIDO is open source and interoperable with other existing OBO ontologies. CIDO is aligned with the Basic Formal Ontology and Viral Infectious Disease Ontology. CIDO has imported terms from over 30 OBO ontologies. For example, CIDO imports all SARS-CoV-2 protein terms from the Protein Ontology, COVID-19-related phenotype terms from the Human Phenotype Ontology, and over 100 COVID-19 terms for vaccines (both authorized and in clinical trial) from the Vaccine Ontology. CIDO systematically represents variants of SARS-CoV-2 viruses and over 300 amino acid substitutions therein, along with over 300 diagnostic kits and methods. CIDO also describes hundreds of host-coronavirus protein-protein interactions (PPIs) and the drugs that target proteins in these PPIs. CIDO has been used to model COVID-19 related phenomena in areas such as epidemiology. The scope of CIDO was evaluated by visual analysis supported by a summarization network method. CIDO has been used in various applications such as term standardization, inference, natural language processing (NLP) and clinical data integration. We have applied the amino acid variant knowledge present in CIDO to analyze differences between SARS-CoV-2 Delta and Omicron variants. CIDO's integrative host-coronavirus PPIs and drug-target knowledge have also been used to support drug repurposing for COVID-19 treatment. CIDO represents entities and relations in the domain of coronavirus diseases with a special focus on COVID-19. It supports shared knowledge representation, data and metadata standardization and integration, and has been used in a range of applications