Detection of anomalous patterns in water consumption: an overview of approaches
Water distribution companies constantly aim to improve the efficiency with which water is distributed to the city. Understanding the nature of irregularities that may interrupt or degrade the service is therefore at the core of their business model. The detection of technical and non-technical losses allows water companies to improve the sustainability and affordability of the service. Anomaly detection in water consumption is at present a challenging task. Manual inspection of data is tedious and requires a large workforce. Fortunately, the sector may benefit from automated and intelligent workflows to reduce the amount of time required to identify abnormal water consumption. The aim of this research work is to develop a methodology to detect anomalies and irregular patterns of water consumption. We propose the use of algorithms of different natures that approach anomaly detection from different perspectives, ranging from searching for deviations from typical behavior to identifying anomalous pattern changes over prolonged periods of time. The experiments reveal that different approaches to the problem of anomaly detection provide complementary clues to contextualize household water consumption. In addition, all the information extracted from each approach can be used in conjunction to provide insights for decision-making. This research work is co-funded by the European Regional Development Fund (FEDER) under the FEDER Catalonia Operative Programme 2014–2020 as part of the R+D Project from RIS3CAT Utilities 4.0 Community with reference code COMRDI16-1-0057. Peer reviewed. Postprint (author's final draft).
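One of the perspectives mentioned above, searching for deviations from typical behavior, can be illustrated with a robust outlier score. The following is a minimal sketch and not the authors' methodology: the function name `mad_anomalies`, the threshold of 3.5, and the sample readings are all assumptions made for illustration. It scores each reading by its distance from the median, scaled by the median absolute deviation (MAD).

```python
from statistics import median


def mad_anomalies(readings, threshold=3.5):
    """Return indices of readings whose modified z-score exceeds `threshold`.

    Hypothetical helper for illustration; the threshold is an assumed default.
    """
    med = median(readings)
    abs_dev = [abs(x - med) for x in readings]
    mad = median(abs_dev)
    if mad == 0:
        # No spread around the median, so the score is undefined; flag nothing.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # when the data are roughly normally distributed.
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


# Invented daily household consumption in litres; one day is far above the rest.
daily_litres = [310, 295, 305, 300, 298, 302, 1250, 307, 299, 301]
print(mad_anomalies(daily_litres))  # prints [6]
```

A median-based score is used here because a single extreme reading barely shifts the median, whereas it would inflate a mean and standard deviation. Detecting pattern changes over prolonged periods, the other perspective named in the abstract, would need a different technique.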
Novel survey method finds dramatic decline of wild cotton-top tamarin population
For conservation purposes, accurate methods are required to track cotton-top tamarins in their natural habitat. As existing census methods are not appropriate for surveying these monkeys, a lure-transect method combined with playback vocalization was used here to allow accurate counting of the animals.
Probe-level linear model fitting and mixture modeling results in high accuracy detection of differential gene expression
BACKGROUND: The identification of differentially expressed genes (DEGs) from Affymetrix GeneChip arrays is currently done by first computing expression levels from the low-level probe intensities, then deriving significance by comparing these expression levels between conditions. The proposed PL-LM (Probe-Level Linear Model) method implements a linear model applied to the probe-level data to directly estimate the treatment effect. A finite mixture of Gaussian components is then used to identify DEGs using the coefficients estimated by the linear model. This approach can readily be applied to experimental designs with or without replication. RESULTS: On a wholly defined dataset, the PL-LM method was able to identify 75% of the differentially expressed genes with 10% false positives. This accuracy was achieved both using the three replicates per condition available in the dataset and using only one replicate per condition. CONCLUSION: The method achieves, on this dataset, a higher accuracy than the best set of tools identified by the authors of the dataset, and does so using only one replicate per condition.
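To make the idea of estimating a treatment effect directly from probe-level data concrete, here is a minimal sketch under a strong simplifying assumption: an additive model `y = probe_effect + beta * is_treated + noise` on log intensities, with one array per condition. The abstract does not spell out the PL-LM model, so this form, the function name `treatment_effect`, and the sample values are all assumptions. For this balanced design, the least-squares estimate of `beta` is the mean of the per-probe differences.

```python
from statistics import mean


def treatment_effect(control, treated):
    """Least-squares estimate of beta for one gene (hypothetical helper).

    `control` and `treated` hold log intensities of the same probes, in the
    same order, from one control array and one treated array.
    """
    if len(control) != len(treated):
        raise ValueError("both arrays must report the same probes")
    # With one additive effect per probe, the probe terms cancel in each
    # paired difference, leaving only the treatment effect plus noise.
    return mean([t - c for c, t in zip(control, treated)])


# Invented log2 intensities for the probes of a single gene.
control = [7.1, 8.4, 6.9, 9.0]
treated = [8.0, 9.5, 7.8, 10.1]
print(round(treatment_effect(control, treated), 3))
```

The second stage described in the abstract, fitting a finite mixture of Gaussian components to the estimated coefficients to separate DEGs from unchanged genes, is not shown here.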
Analysis of the Trajectory of Drosophila melanogaster in a Circular Open Field Arena
BACKGROUND: Obtaining a complete phenotypic characterization of a freely moving organism is a difficult task, yet such a description is desired in many neuroethological studies. Many metrics currently used in the literature to describe locomotor and exploratory behavior are typically based on average quantities or subjectively chosen spatial and temporal thresholds. All of these measures are relatively coarse-grained in the time domain. It is advantageous, however, to employ metrics based on the entire trajectory that an organism takes while exploring its environment. METHODOLOGY/PRINCIPAL FINDINGS: To characterize the locomotor behavior of Drosophila melanogaster, we used a video tracking system to record the trajectory of a single fly walking in a circular open field arena. The fly was tracked for two hours. Here, we present techniques with which to analyze the motion of the fly in this paradigm, and we discuss the methods of calculation. The measures we introduce are based on spatial and temporal probability distributions and utilize the entire time-series trajectory of the fly, thus emphasizing the dynamic nature of locomotor behavior. Marginal and joint probability distributions of speed, position, segment duration, path curvature, and reorientation angle are examined and related to the observed behavior. CONCLUSIONS/SIGNIFICANCE: The measures discussed in this paper provide a detailed profile of the behavior of a single fly and highlight the interaction of the fly with the environment. Such measures may serve as useful tools in any behavioral study in which the movement of a fly is an important variable and can be incorporated easily into many setups, facilitating high-throughput phenotypic characterization
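As a rough illustration of trajectory-based measures such as speed and reorientation angle, the sketch below derives both from a sequence of tracked positions and bins the values into an empirical probability distribution. It is not the paper's procedure: the function names, the fixed sampling interval `dt`, the bin width, and the sample track are assumptions made for this example.

```python
import math
from collections import Counter


def speeds(points, dt):
    """Speed of each step between consecutive (x, y) positions sampled every dt."""
    return [math.dist(p, q) / dt for p, q in zip(points, points[1:])]


def reorientation_angles(points):
    """Signed change of heading between consecutive steps, in radians.

    Values lie in [-pi, pi]. A zero-length step has no defined heading;
    math.atan2(0, 0) returns 0.0, so filter stationary steps beforehand.
    """
    headings = [math.atan2(q[1] - p[1], q[0] - p[0])
                for p, q in zip(points, points[1:])]
    # atan2(sin d, cos d) wraps the raw heading difference into [-pi, pi].
    return [math.atan2(math.sin(b - a), math.cos(b - a))
            for a, b in zip(headings, headings[1:])]


def empirical_distribution(values, bin_width):
    """Map each bin index (floor(value / bin_width)) to its relative frequency."""
    counts = Counter(math.floor(v / bin_width) for v in values)
    return {b: c / len(values) for b, c in sorted(counts.items())}


# Invented track: three unit steps, each followed by a left turn of 90 degrees.
track = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
step_speeds = speeds(track, dt=0.5)
turns = reorientation_angles(track)
print(step_speeds)
print([round(math.degrees(a)) for a in turns])
print(empirical_distribution(step_speeds, bin_width=1.0))
```

Joint distributions, for example of speed and position, could be built the same way by binning tuples instead of single values.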
Adherent Monomer-Misfolded SOD1
Background: Multiple cellular functions are compromised in amyotrophic lateral sclerosis (ALS). In familial ALS (FALS) with Cu/Zn superoxide dismutase (SOD1) mutations, the mechanisms by which the mutation in SOD1 leads to such a wide range of abnormalities remain elusive. Methodology/Principal Findings: To investigate underlying cellular conditions caused by the SOD1 mutation, we explored mutant SOD1-interacting proteins in the spinal cord of symptomatic transgenic mice expressing a mutant SOD1, SOD1 Leu126delTT with a FLAG sequence (DF mice). This gene product is structurally unable to form a functional homodimer. Tissues were obtained from both DF mice and disease-free mice expressing wild-type SOD1 with FLAG (WF mice). Both FLAG-tagged SOD1 and cross-linking proteins were enriched and subjected to a shotgun proteomic analysis. We identified 34 proteins (or protein subunits) in DF preparations, while in WF preparations, interactions were detected with only 4 proteins. Conclusions/Significance: These results indicate that disease-causing mutant SOD1 likely leads to inadequate protein-protein interactions. This could be an early and crucial process in the pathogenesis of FALS.
Absence of the common Insulin-like growth factor-1 19-repeat allele is associated with early age at breast cancer diagnosis in multiparous women
Multiparity decreases the risk of breast cancer in white women, whereas it is a risk factor in black women <50 years. Early-onset breast cancer (<50 years) has been associated with high insulin-like growth factor-1 (IGF-1) levels. Absence of the common IGF1 19 cytosine-adenine (CA)-repeat allele (IGF1-19/-19) inverts the effect of several non-genetic factors on breast cancer risk but the interaction between IGF1-19/-19 and multiparity on breast cancer risk is unknown. As IGF1-19/-19, multiparity and early-onset breast cancer are more common in black than in white women, we aimed to study whether multiparity combined with IGF1-19/-19 increases the risk of early-onset breast cancer. Four hundred and three breast cancer patients diagnosed in Lund, Sweden, at age 25–99 years were genotyped for the IGF1 CA-repeat length using fragment analysis. Overall, 12.9% carried the IGF1-19/-19 genotype. There was a highly significant interaction between multiparity and IGF1-19/-19 on age at breast cancer diagnosis (P=0.007). Among IGF1-19/-19 patients, multiparity was associated with a 9.2 year earlier age at diagnosis compared with uniparity or nulliparity (P=0.006). Multiparity combined with IGF1-19/-19 was associated with an early age at breast cancer diagnosis. If confirmed, IGF1-19/-19 may help identify a subgroup of women for earlier breast cancer screening
SNP Haplotype Mapping in a Small ALS Family
The identification of genes for monogenic disorders has proven to be highly effective for understanding disease mechanisms, pathways and gene function in humans. Nevertheless, while thousands of Mendelian disorders have not yet been mapped, there has been a trend away from studying single-gene disorders. In part, this is because many of the remaining single-gene families are not large enough to map the disease locus to a single site in the genome. New tools and approaches are needed to allow researchers to effectively tap into this genetic gold mine. Towards this goal, we have used haploid cell lines to experimentally validate the use of high-density single nucleotide polymorphism (SNP) arrays to define genome-wide haplotypes and candidate regions, using a small amyotrophic lateral sclerosis (ALS) family as a prototype. Specifically, we used haploid cell lines to determine whether high-density SNP arrays accurately predict haplotypes across entire chromosomes and show that haplotype information significantly enhances the genetic information in small families. Panels of haploid cell lines were generated and a 5 centimorgan (cM) short tandem repeat polymorphism (STRP) genome scan was performed. Experimentally derived haplotypes for entire chromosomes were used to directly identify regions of the genome identical-by-descent in 5 affected individuals. Comparisons between experimentally determined and in silico haplotypes predicted from SNP arrays demonstrate that SNP analysis of diploid DNA accurately predicted chromosomal haplotypes. These methods precisely identified 12 candidate intervals, which are shared by all 5 affected individuals. Our study illustrates how genetic information can be maximized using readily available tools as a first step in mapping single-gene disorders in small families.
Peak intensity prediction in MALDI-TOF mass spectrometry: A machine learning study to support quantitative proteomics
Timm W, Scherbart A, Boecker S, Kohlbacher O, Nattkemper TW. Peak intensity prediction in MALDI-TOF mass spectrometry: A machine learning study to support quantitative proteomics. BMC Bioinformatics. 2008;9(1):443. Background: Mass spectrometry is a key technique in proteomics and can be used to analyze complex samples quickly. One key problem with the mass spectrometric analysis of peptides and proteins, however, is that absolute quantification is severely hampered by the unclear relationship between the observed peak intensity and the peptide concentration in the sample. While there are numerous approaches to circumvent this problem experimentally (e.g. labeling techniques), reliable prediction of the peak intensities from peptide sequences could provide a peptide-specific correction factor. Thus, it would be a valuable tool towards label-free absolute quantification. Results: In this work we present machine learning techniques for peak intensity prediction for MALDI mass spectra. Features encoding the peptides' physico-chemical properties as well as string-based features were extracted. A feature subset was obtained from multiple forward feature selections on the extracted features. Based on these features, two advanced machine learning methods (support vector regression and local linear maps) are shown to yield good results for this problem (Pearson correlation of 0.68 in a ten-fold cross validation). Conclusion: The techniques presented here are a useful first step going beyond the binary prediction of proteotypic peptides towards a more quantitative prediction of peak intensities. These predictions will in turn be beneficial for mass spectrometry-based quantitative proteomics.
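The evaluation metric quoted above, a Pearson correlation obtained in ten-fold cross validation, can be sketched as follows. This is an illustrative outline only: the helper names `pearson` and `k_fold_indices` and the interleaved fold assignment are assumptions, and the regression models themselves (support vector regression, local linear maps) are not reproduced.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def k_fold_indices(n, k=10):
    """Yield (train, test) index lists; fold i tests indices i, i+k, i+2k, ..."""
    for fold in range(k):
        test = list(range(fold, n, k))
        held_out = set(test)
        train = [i for i in range(n) if i not in held_out]
        yield train, test


# Each sample is held out exactly once across the ten folds.
folds = list(k_fold_indices(20, k=10))
print(len(folds), folds[0][1])

# Invented observed and predicted intensities, only to exercise the metric.
observed = [0.2, 0.5, 0.9, 1.4, 1.8]
predicted = [0.3, 0.4, 1.0, 1.2, 2.0]
print(round(pearson(observed, predicted), 3))
```

In a full evaluation, a model would be trained on each `train` split, predictions collected for the corresponding `test` split, and the correlation computed between predicted and observed intensities. In practice the samples are usually shuffled before folds are assigned.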
Gene Expression Profiling of Soft and Firm Atlantic Salmon Fillet
Texture of salmon fillets is an important quality trait for consumer acceptance as well as for the suitability for processing. In the present work we measured fillet firmness in a population of farmed Atlantic salmon with known pedigree and investigated the relationship between this trait and gene expression. Transcriptomic analyses performed with a 21 K oligonucleotide microarray revealed strong correlations between firmness and a large number of genes. Highly similar expression profiles were observed in several functional groups. Positive regression was found between firmness and genes encoding proteasome components (41 genes) and mitochondrial proteins (129 genes), proteins involved in stress responses (12 genes), and lipid metabolism (30 genes). Coefficients of determination (R²) were in the range of 0.64–0.74. A weaker though highly significant negative regression was seen in sugar metabolism (26 genes, R² = 0.66) and myofiber proteins (42 genes, R² = 0.54). Among individual genes that showed a strong association with firmness, there were extracellular matrix proteins (negative correlation), immune genes, and intracellular proteases (positive correlation). Several genes can be regarded as candidate markers of flesh quality (coiled-coil transcriptional coactivator b, AMP deaminase 3, and oligopeptide transporter 15), though their functional roles are unclear. To conclude, fillet firmness of Atlantic salmon depends largely on metabolic properties of the skeletal muscle, where aerobic metabolism using lipids as fuel and the rapid removal of damaged proteins appear to play a major role.