37 research outputs found

    Least Dependent Component Analysis Based on Mutual Information

    We propose to use precise estimators of mutual information (MI) to find least dependent components in a linearly mixed signal. On the one hand this seems to lead to better blind source separation than with any other presently available algorithm. On the other hand it has the advantage, compared to other implementations of "independent" component analysis (ICA), some of which are based on crude approximations for MI, that the numerical values of the MI can be used for: (i) estimating residual dependencies between the output components; (ii) estimating the reliability of the output, by comparing the pairwise MIs with those of re-mixed components; (iii) clustering the output according to the residual interdependencies. For the MI estimator we use a recently proposed k-nearest neighbor based algorithm. For time sequences we combine this with delay embedding, in order to take into account non-trivial time correlations. After several tests with artificial data, we apply the resulting MILCA (Mutual Information based Least dependent Component Analysis) algorithm to a real-world dataset, the ECG of a pregnant woman. The software implementation of the MILCA algorithm is freely available at http://www.fz-juelich.de/nic/cs/software
    Comment: 18 pages, 20 figures, Phys. Rev. E (in press)
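The idea of searching for the least dependent components can be illustrated with a toy sketch. This is not the authors' MILCA implementation: it assumes two already whitened scalar signals, replaces their minimization with a brute-force scan over rotation angles, and uses a simple O(N^2) k-nearest-neighbour MI estimate. All function names (`digamma_int`, `knn_mi`, `rotate`, `least_dependent_rotation`) are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma function psi(n) for a positive integer n (harmonic-sum form)."""
    return -EULER_GAMMA + sum(1.0 / m for m in range(1, n))


def knn_mi(xs, ys, k=3):
    """Brute-force k-nearest-neighbour MI estimate (nats) for scalar samples."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        # Max-norm distances from point i to every other point in the joint space.
        dists = sorted(
            max(abs(xs[i] - xs[j]), abs(ys[i] - ys[j]))
            for j in range(n) if j != i
        )
        eps = dists[k - 1]  # distance to the k-th nearest neighbour
        # Count marginal neighbours strictly closer than eps.
        nx = sum(1 for j in range(n) if j != i and abs(xs[i] - xs[j]) < eps)
        ny = sum(1 for j in range(n) if j != i and abs(ys[i] - ys[j]) < eps)
        total += digamma_int(nx + 1) + digamma_int(ny + 1)
    return digamma_int(k) + digamma_int(n) - total / n


def rotate(xs, ys, angle):
    """Rotate the 2-D signal (xs, ys) by the given angle in radians."""
    c, s = math.cos(angle), math.sin(angle)
    return ([c * x + s * y for x, y in zip(xs, ys)],
            [-s * x + c * y for x, y in zip(xs, ys)])


def least_dependent_rotation(xs, ys, steps=18, k=3):
    """Scan angles in [0, pi/2) and return (angle, MI) with the smallest MI.

    MI is unchanged by swapping or sign-flipping the two components,
    so a quarter turn covers all distinct rotations in two dimensions.
    """
    best = None
    for step in range(steps):
        angle = step * (math.pi / 2) / steps
        mi = knn_mi(*rotate(xs, ys, angle), k=k)
        if best is None or mi < best[1]:
            best = (angle, mi)
    return best
```

Because the scan keeps the numerical MI value at the chosen angle, that value can be read as an estimate of the residual dependence between the two output components, which is the use the abstract lists under (i).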

    Estimating Mutual Information

    We present two classes of improved estimators for mutual information $M(X,Y)$, from samples of random points distributed according to some joint probability density $\mu(x,y)$. In contrast to conventional estimators based on binnings, they are based on entropy estimates from $k$-nearest neighbour distances. This means that they are data efficient (with $k=1$ we resolve structures down to the smallest possible scales), adaptive (the resolution is higher where data are more numerous), and have minimal bias. Indeed, the bias of the underlying entropy estimates is mainly due to non-uniformity of the density at the smallest resolved scale, giving typically systematic errors which scale as functions of $k/N$ for $N$ points. Numerically, we find that both families become exact for independent distributions, i.e. the estimator $\hat M(X,Y)$ vanishes (up to statistical fluctuations) if $\mu(x,y) = \mu(x) \mu(y)$. This holds for all tested marginal distributions and for all dimensions of $x$ and $y$. In addition, we give estimators for redundancies between more than 2 random variables. We compare our algorithms in detail with existing algorithms. Finally, we demonstrate the usefulness of our estimators for assessing the actual independence of components obtained from independent component analysis (ICA), for improving ICA, and for estimating the reliability of blind source separation.
    Comment: 16 pages, including 18 figures
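As a rough illustration of a k-nearest-neighbour MI estimate, the sketch below uses the commonly cited form $\hat M(X,Y) = \psi(k) + \psi(N) - \langle \psi(n_x+1) + \psi(n_y+1) \rangle$, where $\psi$ is the digamma function and $n_x$, $n_y$ count the points whose marginal distance is strictly smaller than the max-norm distance to the $k$-th neighbour in the joint space. It is a minimal brute-force O(N^2) version for scalar $x$ and $y$, not the authors' code; the function names are hypothetical, and it assumes continuous data without ties.

```python
import math

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma function psi(n) for a positive integer n (harmonic-sum form)."""
    return -EULER_GAMMA + sum(1.0 / m for m in range(1, n))


def knn_mutual_information(xs, ys, k=3):
    """Brute-force k-nearest-neighbour MI estimate (in nats) for scalar samples."""
    n = len(xs)
    if n != len(ys) or n <= k:
        raise ValueError("need equally long samples with more than k points")
    total = 0.0
    for i in range(n):
        # Max-norm distances from point i to every other point in the joint space.
        dists = sorted(
            max(abs(xs[i] - xs[j]), abs(ys[i] - ys[j]))
            for j in range(n) if j != i
        )
        eps = dists[k - 1]  # distance to the k-th nearest neighbour
        # Count marginal neighbours strictly closer than eps.
        nx = sum(1 for j in range(n) if j != i and abs(xs[i] - xs[j]) < eps)
        ny = sum(1 for j in range(n) if j != i and abs(ys[i] - ys[j]) < eps)
        total += digamma_int(nx + 1) + digamma_int(ny + 1)
    return digamma_int(k) + digamma_int(n) - total / n
```

All digamma arguments here are integers, so the harmonic-sum form avoids any dependency beyond the standard library. For independent samples the estimate should fluctuate around zero, in line with the behaviour the abstract reports.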

    Two Novel Methods For The Determination Of The Number Of Components In Independent Components Analysis Models

    Independent Components Analysis is a Blind Source Separation method that aims to find the pure source signals mixed together in unknown proportions in the observed signals under study. It does this by searching for factors which are mutually statistically independent. It can thus be classified among the latent-variable based methods. As with other methods based on latent variables, a careful investigation has to be carried out to find out which factors are significant and which are not. Therefore, it is important to have a validation procedure for deciding on the optimal number of independent components (ICs) to include in the final model. This can be complicated by the fact that two consecutive models may differ in the order and signs of similarly-indexed ICs. In addition, the structure of the extracted sources can change as a function of the number of factors calculated. Two methods for determining the optimal number of ICs are proposed in this article and applied to simulated and real datasets to demonstrate their performance.

    Rivers of North Rhine Westphalia revisited.

    Three rivers in North-Rhine Westphalia, Germany, were investigated for their hydrochemical properties including their stable isotopic composition of water (δ2H, δ18O) and dissolved river compounds (δ13CDIC, δ34SSO4 and δ18OSO4, and δ15NNO3 and δ18ONO3). The study focused on two objectives: an assessment of potential sources for river solutes (anthropogenic vs. natural sources), and the quantification of changes in river chemistry over the past 15 a (for the rivers Lippe and Ruhr). Decreasing concentrations were found for most of those river constituents that are commonly linked to anthropogenic activities, such as , [Cl-], [K+], and [Na+]. An observed increase in for the river Lippe reflects most likely varying discharges from mining activities. Variations in the isotopic composition of water display the influence of ocean water (river Ems) or of evaporation that occurred either in channels (river Ems), in reservoirs (river Ruhr) or due to the use of river water for cooling purposes (river Lippe). δ13CDIC values around -11‰ point to carbonate dissolution by carbonic acid as the major source for dissolved inorganic C. Modifications of this average δ13CDIC resulted from enhanced agricultural use, sewage inputs, and gas exchange with the atmosphere in reservoirs and channels. The isotopic composition of dissolved reveals atmospheric deposition and sulphide oxidation as its major sources. Sulphate from sulphide oxidation in parts reflects the local geology (river Ruhr); in the Kreidebecken leaching of sulphide seems to be linked to agriculture and drainage (rivers Lippe and Ems). However, introduced from mining activities into the Lippe and the Ems does not alter the isotopic composition of riverine , despite rather high discharges. Nitrogen and O isotopes reveal that manure and sewage are major sources of NO3 in most parts of the river Ruhr. Only a single value from the headwaters displays the signature of soil NO3. Downstream increasing δ15NNO3 and δ18ONO3 values (both by 2‰ on average) point to denitrification and to additional inputs from atmospheric deposition.

    Evidence for denitrification regulated by pyrite oxidation in a heterogeneous porous groundwater system.

    Denitrification is an important natural attenuation process that has been observed in many fissured and porous aquifers. However, an important factor limiting denitrification in aquatic systems is the microbial availability of electron donors. Pyrite as the most abundant sulfide mineral in nature represents one of the potential electron sources for denitrifiers to reduce nitrate, but the reaction mechanisms coupling denitrification processes to pyrite oxidation are still questionable. We utilized hydrochemical data and stable isotopes of nitrate and sulfate in groundwater, isotope ratios of sulfur compounds in aquifer sediments and tritium based groundwater dating for assessing denitrification processes in a pyrite-bearing porous groundwater system. The oxic part of the aquifer with mean water transit times of approximately 60 years was characterized by nitrate concentrations of around 15 mg/l and δ15N values were similar to those typical for nitrification. In contrast, in the anoxic part with mean water transit times of up to 100 years, low nitrate concentrations accompanied by elevated δ15N values were observed. Furthermore, isotope data of groundwater sulfate and sulfur compounds in the aquifer sediment suggest that pyrite oxidation is the dominant source of sulfate in the aquifer. The trend of increasing δ15N values and decreasing nitrate concentrations in concert with depleted δ34S values of groundwater sulfate similar to δ34S values of pyrite, FeS2, suggests that denitrification is coupled to pyrite oxidation, particularly when water mean transit time is elevated.

    A sequence-ready BAC/PAC contig and partial transcript map of approximately 1.5 Mb in human chromosome 17q25 comprising multiple disease genes.

    Hereditary neuralgic amyotrophy (HNA) is an autosomal dominant recurrent neuropathy mapped to a 4-cM interval on chromosome 17q25 between the short tandem repeat (STR) markers D17S1603 and D17S802. Chromosome 17q25 in general and the 4-cM HNA region in particular are also implicated in the pathogenesis of a number of tumors (tylosis with esophageal cancer, sporadic breast and ovarian tumors) and harbor a psoriasis susceptibility locus. Initial attempts to construct a yeast artificial chromosome contig failed. Therefore, we have now constructed a complete P1 artificial chromosome (PAC) and bacterial artificial chromosome (BAC) contig of the region flanked by the STR markers D17S1603 and D17S802. The contig contains 22 PAC and 64 BAC clones and covers a physical distance of approximately 1.5 Mb. A total of 83 sequence-tagged site (STS) markers (10 known STSs and STRs, 56 STSs generated from clone end-fragments, 12 expressed sequence tags, and 5 known genes) were mapped on the contig, resulting in an extremely dense physical map with approximately 1 STS per 20 kb. This sequence-ready PAC and BAC contig will be pivotal for the positional cloning of the HNA gene as well as other disease genes mapping to this region.

    Evaluation of single nucleotide polymorphisms in the phosphodiesterase 4D gene (PDE4D) and their association with ischaemic stroke in a large German cohort

    Genetic fine mapping of the first locus identified for genetically complex forms of stroke, STRK1 (which has been mapped to chromosome 5q12 in Icelandic families), has identified the phosphodiesterase 4D gene (PDE4D) as a good candidate gene. Association analysis of single nucleotide polymorphisms (SNPs) in the PDE4D gene in an Icelandic stroke cohort demonstrated genetic association between six SNPs in the 5′ region of PDE4D and ischaemic stroke. The present study aimed to test whether the same six SNPs in PDE4D were also associated with stroke in a large stroke cohort from northern Germany (stroke patients with acute completed ischaemic stroke: n = 1181; population based controls: n = 1569). None of the six SNPs showed significant association with ischaemic stroke in the whole stroke sample before and after adjustment for conventional stroke risk factors (age, sex, hypertension, diabetes, and hypercholesterolaemia). Haplotype analysis also did not reveal any significant association. Marginally positive statistical measures of association in the subgroup with cardioembolic stroke did not remain significant after correction for multiple testing. In conclusion, this study was unable to demonstrate an association between ischaemic stroke in a large German cohort and the six SNPs which had shown significant single marker association with stroke in the Icelandic stroke cohort.