Reordering Columns for Smaller Indexes
Column-oriented indexes, such as projection or bitmap indexes, are compressed
by run-length encoding to reduce storage and increase speed. Sorting the tables
improves compression. On realistic data sets, permuting the columns in the
right order before sorting can reduce the number of runs by a factor of two or
more. Unfortunately, determining the best column order is NP-hard. For many
cases, we prove that the number of runs in table columns is minimized if we
sort columns by increasing cardinality. Experimentally, sorting based on
Hilbert space-filling curves is poor at minimizing the number of runs.
Comment: to appear in Information Science
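
As a rough illustration of the column-order effect described in this abstract, the Python sketch below counts runs in a small, made-up table after lexicographic sorting under two column orders. The helper names (count_runs, total_runs, order_by_cardinality) and the toy table are assumptions for illustration only; they are not taken from the paper, and the sketch does not reproduce its proofs or experiments.

```python
def count_runs(column):
    """Count maximal runs of equal consecutive values in one column."""
    runs = 0
    prev = object()  # sentinel that compares unequal to any table value
    for value in column:
        if value != prev:
            runs += 1
            prev = value
    return runs


def total_runs(rows, order):
    """Permute columns by `order`, sort rows lexicographically, and
    return the total number of runs summed over all columns."""
    permuted = sorted(tuple(row[i] for i in order) for row in rows)
    if not permuted:
        return 0
    return sum(count_runs(col) for col in zip(*permuted))


def order_by_cardinality(rows):
    """Column indexes sorted by increasing number of distinct values."""
    n_cols = len(rows[0])
    return sorted(range(n_cols), key=lambda i: len({row[i] for row in rows}))


# Hypothetical toy table: column 0 has 4 distinct values, column 1 has 2.
rows = [(a, b) for a in range(4) for b in range(2)]

original_order = [0, 1]
cardinality_order = order_by_cardinality(rows)

print(total_runs(rows, original_order))     # high-cardinality column first
print(total_runs(rows, cardinality_order))  # low-cardinality column first
```

On this toy table, placing the low-cardinality column first yields fewer total runs, which is consistent with the heuristic stated in the abstract. Real tables with skewed or correlated columns may behave differently.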
Rescue of replication failure by Fanconi anaemia proteins
Chromosomal aberrations are often associated with incomplete genome duplication, for instance at common fragile sites, or as a consequence of chemical alterations in the DNA template that block replication forks. Studies of the cancer-prone disease Fanconi anaemia (FA) have provided important insights into the resolution of replication problems. The repair of interstrand DNA crosslinks induced by chemotherapy drugs is coupled with DNA replication and controlled by FA proteins. We discuss here the recent discovery of new FA-associated proteins and the development of new tractable repair systems that have dramatically improved our understanding of crosslink repair. We also focus on how FA proteins protect against replication failure in the context of fragile sites and on the identification of reactive metabolites that account for the development of Fanconi anaemia symptoms.
Robust estimation of bacterial cell count from optical density
Optical density (OD) is widely used to estimate the density of cells in liquid culture, but cannot be compared between instruments without a standardized calibration protocol and is challenging to relate to actual cell count. We address this with an interlaboratory study comparing three simple, low-cost, and highly accessible OD calibration protocols across 244 laboratories, applied to eight strains of constitutive GFP-expressing E. coli. Based on our results, we recommend calibrating OD to estimated cell count using serial dilution of silica microspheres. This protocol produces highly precise calibration (95.5% of residuals <1.2-fold), is easily assessed for quality control, also assesses the instrument's effective linear range, and can be combined with fluorescence calibration to obtain units of Molecules of Equivalent Fluorescein (MEFL) per cell. This allows direct comparison and data fusion with flow cytometry measurements: in our study, fluorescence per cell measurements showed only a 1.07-fold mean difference between plate reader and flow cytometry data.
Computational Methods for Pigmented Skin Lesion Classification in Images: Review and Future Trends
Skin cancer is considered one of the most common types of cancer in several countries, and its incidence rate has increased in recent years. Melanoma cases have caused an increasing number of deaths worldwide, since this type of skin cancer is the most aggressive compared to other types. Computational methods have been developed to assist dermatologists in early diagnosis of skin cancer. This review presents an overview of the main current computational methods that have been proposed for pattern analysis and pigmented skin lesion classification. In addition, it discusses the application of such methods, as well as future trends. Several methods for feature extraction from both macroscopic and dermoscopic images and models for feature selection are introduced and discussed. Furthermore, classification algorithms and evaluation procedures are described, and performance results for lesion classification and pattern analysis are given.