MIC values calculated by ApproxMaxMI and ChiMIC for random data with different sample sizes.
<p>The scores are reported as means over 500 replicates.</p>
Statistical power of MIC from ApproxMaxMI, ChiMIC and dCor with different levels of noise, for five kinds of functional relationships.
<p>The statistical power was estimated via 500 simulations, with sample size <i>n</i> = 400.</p>
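The caption does not say how power is computed from the 500 simulations. A minimal sketch of one common convention follows: power is the fraction of replicates under the dependent model whose score exceeds the 95th percentile of scores under independence. Everything here is an assumption for illustration: the function names, uniform x on [0, 1), Gaussian noise, and the absolute Pearson correlation used as a stand-in statistic (it is not MIC or dCor).

```python
import random


def abs_pearson(xs, ys):
    """Stand-in dependence statistic: absolute Pearson correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return abs(sxy) / (sxx * syy) ** 0.5


def estimate_power(statistic, relationship, noise_sd,
                   n=400, reps=500, alpha=0.05, seed=0):
    """Fraction of dependent replicates scoring above the null threshold."""
    rng = random.Random(seed)
    null_scores, alt_scores = [], []
    for _ in range(reps):
        xs = [rng.random() for _ in range(n)]
        ys = [relationship(x) + rng.gauss(0.0, noise_sd) for x in xs]
        alt_scores.append(statistic(xs, ys))
        # Null replicate: pair the same y values with freshly drawn,
        # independent x values.
        xs_null = [rng.random() for _ in range(n)]
        null_scores.append(statistic(xs_null, ys))
    null_scores.sort()
    # Empirical (1 - alpha) quantile of the null scores.
    threshold = null_scores[min(reps - 1, int((1 - alpha) * reps))]
    return sum(s > threshold for s in alt_scores) / reps
```

Calling `estimate_power(abs_pearson, lambda x: x, noise_sd=0.5)` would give one point on a power curve for a linear relationship; sweeping `noise_sd` traces the curve.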
A New Algorithm to Optimize Maximal Information Coefficient
<div><p>The maximal information coefficient (MIC) captures dependences between paired variables, including both functional and non-functional relationships. In this paper, we develop a new method, ChiMIC, to calculate MIC values. The ChiMIC algorithm uses the chi-square test to terminate grid optimization and thereby removes the maximal grid size restriction of the original ApproxMaxMI algorithm. Computational experiments show that ChiMIC maintains the same MIC values for noiseless functional relationships but gives much smaller MIC values for independent variables. For noisy functional relationships, ChiMIC reaches the optimal partition much faster. Furthermore, the MCN values based on MIC calculated by ChiMIC better capture the complexity of functional relationships, and the statistical power of MIC calculated by ChiMIC is higher than that of MIC calculated by ApproxMaxMI. Moreover, the computational cost of ChiMIC is much lower than that of ApproxMaxMI. We apply the MIC values to feature selection and obtain better classification accuracy using features selected by the MIC values from ChiMIC.</p></div>
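For context, MIC is conventionally defined as the largest value, over candidate grids, of the grid's mutual information divided by log2 of the smaller grid dimension. The sketch below scores a single fixed grid of cell counts. It does not perform the grid search that ApproxMaxMI or ChiMIC carry out, and the function names are hypothetical.

```python
import math


def grid_mutual_information(counts):
    """Mutual information (in bits) of a grid given as rows of cell counts."""
    n = sum(sum(row) for row in counts)
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    mi = 0.0
    for i, row in enumerate(counts):
        for j, c in enumerate(row):
            if c > 0:  # empty cells contribute nothing
                mi += (c / n) * math.log2(c * n / (row_totals[i] * col_totals[j]))
    return mi


def normalized_grid_score(counts):
    """Grid mutual information divided by log2(min(rows, columns))."""
    smaller = min(len(counts), len(counts[0]))
    if smaller < 2:
        raise ValueError("grid needs at least 2 rows and 2 columns")
    return grid_mutual_information(counts) / math.log2(smaller)
```

A grid whose points all fall on the diagonal cells scores 1, while a grid with identical counts in every cell scores 0, matching the intended behaviour for noiseless functions and independent variables.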
Grid partition of ApproxMaxMI and ChiMIC for parabolic function.
<p>1000 data points were simulated from a functional relationship of the form <i>y</i> = 4<i>x</i><sup><i>2</i></sup>+<i>η</i>, where <i>η</i> is noise drawn uniformly from (−0.25, 0.25). A: Grid partition for the noiseless parabolic function. B: Grid partition based on ApproxMaxMI for the noisy parabolic function. C: Grid partition based on ChiMIC for the noisy parabolic function.</p>
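The simulated data described in this caption can be sketched as follows. The caption does not give the range of x, so the default interval (−0.5, 0.5) is an assumption, as are the function name and the fixed seed.

```python
import random


def simulate_parabola(n=1000, noise_half_width=0.25,
                      x_range=(-0.5, 0.5), seed=0):
    """Draw n points from y = 4x^2 + eta, eta uniform on +/- noise_half_width.

    x_range is an assumed sampling interval; the caption does not state it.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(*x_range) for _ in range(n)]
    ys = [4 * x * x + rng.uniform(-noise_half_width, noise_half_width)
          for x in xs]
    return xs, ys
```

Setting `noise_half_width=0` gives the noiseless case of panel A; the default corresponds to the noisy data of panels B and C.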
Illustration of x-axis partition of ChiMIC.
<p>Colored <i>r</i>×2 contingency tables are used for chi-square test.</p>
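A minimal sketch of a chi-square test on an r×2 contingency table, as mentioned in this caption. One plausible reading is that a candidate x-axis endpoint splits a column in two and is kept only if the resulting r×2 table of counts departs significantly from independence. That decision rule, the 0.05 significance level, and the function names are assumptions, not details confirmed by the listing.

```python
# Chi-square critical values at the 0.05 level for 1 to 5 degrees of freedom.
CHI2_CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def chi_square_r_by_2(table):
    """Pearson chi-square statistic and degrees of freedom for an r x 2 table.

    `table` is a list of r rows, each a pair of non-negative counts.
    """
    row_totals = [a + b for a, b in table]
    col_totals = [sum(a for a, _ in table), sum(b for _, b in table)]
    n = sum(row_totals)
    stat = 0.0
    for (a, b), row_total in zip(table, row_totals):
        for observed, col_total in ((a, col_totals[0]), (b, col_totals[1])):
            expected = row_total * col_total / n
            if expected > 0:  # skip cells in empty rows or columns
                stat += (observed - expected) ** 2 / expected
    return stat, len(table) - 1  # dof = (r - 1) * (2 - 1)


def keep_new_endpoint(table, critical=CHI2_CRITICAL_05):
    """Assumed rule: keep the split only if the test is significant.

    Raises KeyError for tables with more than 6 rows, since only the
    critical values listed above are included.
    """
    stat, dof = chi_square_r_by_2(table)
    return stat > critical[dof]
```

For example, `keep_new_endpoint([[30, 10], [10, 30]])` evaluates a 2×2 table whose two columns have clearly different row distributions, while a table with identical columns yields a statistic of zero and the split is rejected.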
Density distribution of ApproxMaxMI, ChiMIC and dCor scores for two independent variables.
<p>ApproxMaxMI, ChiMIC and dCor estimates were computed for sample size <i>n</i> = 400 over 1000 replicates.</p>
Grid partition of ApproxMaxMI and ChiMIC for linear function.
<p>1000 data points were simulated from a functional relationship of the form <i>y</i> = <i>x</i>+<i>η</i>, where <i>η</i> is noise drawn uniformly from (−0.25, 0.25). A: Grid partition for the noiseless linear function. B: Grid partition based on ApproxMaxMI for the noisy linear function. C: Grid partition based on ChiMIC for the noisy linear function.</p>
Retained features and independent test accuracy based on MIC and ChiMIC.
<p>Retained features and independent test accuracy based on MIC and ChiMIC.</p>