
    A New Algorithm to Optimize Maximal Information Coefficient

    <div><p>The maximal information coefficient (MIC) captures dependences between paired variables, including both functional and non-functional relationships. In this paper, we develop a new method, ChiMIC, to calculate MIC values. The ChiMIC algorithm uses the chi-square test to terminate grid optimization and thereby removes the maximal grid size restriction of the original ApproxMaxMI algorithm. Computational experiments show that ChiMIC maintains the same MIC values for noiseless functional relationships but gives much smaller MIC values for independent variables. For noisy functional relationships, ChiMIC reaches the optimal partition much faster. Furthermore, the MCN values based on MIC calculated by ChiMIC better capture the complexity of functional relationships, and the statistical power of MIC calculated by ChiMIC is higher than that of MIC calculated by ApproxMaxMI. Moreover, the computational cost of ChiMIC is much lower than that of ApproxMaxMI. We apply the MIC values to feature selection and obtain better classification accuracy using features selected by the MIC values from ChiMIC.</p></div>
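
As a rough illustration of the quantity being optimized, the sketch below computes the mutual information of a sample on one fixed grid and normalizes it by log2(min(nx, ny)), which is the standard normalization used in the definition of MIC. The equal-width grid and the function name `normalized_grid_mi` are assumptions made for illustration only; both ApproxMaxMI and ChiMIC search over grid partitions, and that search is not attempted here.

```python
import math
from collections import Counter


def normalized_grid_mi(xs, ys, nx, ny):
    """Mutual information on an nx-by-ny equal-width grid, divided by
    log2(min(nx, ny)). Requires nx >= 2 and ny >= 2.

    Illustrative only: the grid is fixed, not optimized.
    """
    if min(nx, ny) < 2:
        raise ValueError("nx and ny must both be at least 2")

    def bin_index(v, lo, hi, k):
        # Map v in [lo, hi] to a bin in 0..k-1 (hi falls in the last bin).
        if hi == lo:
            return 0
        return min(int((v - lo) / (hi - lo) * k), k - 1)

    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)

    # Count points per grid cell, then per column and per row.
    cells = Counter(
        (bin_index(x, x_lo, x_hi, nx), bin_index(y, y_lo, y_hi, ny))
        for x, y in zip(xs, ys)
    )
    n = len(xs)
    cols, rows = Counter(), Counter()
    for (i, j), c in cells.items():
        cols[i] += c
        rows[j] += c

    # I(X;Y) = sum over cells of p(i,j) * log2(p(i,j) / (p(i) * p(j))).
    mi = sum(
        c / n * math.log2(c * n / (cols[i] * rows[j]))
        for (i, j), c in cells.items()
    )
    return mi / math.log2(min(nx, ny))
```

On a noiseless linear sample with a 2-by-2 grid, every point lands in one of the two diagonal cells, so the normalized value reaches its maximum of 1.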

    Grid partition of ApproxMaxMI and ChiMIC for parabolic function.

    <p>1000 data points simulated for a functional relationship of the form <i>y</i> = 4<i>x</i><sup><i>2</i></sup> + <i>η</i>, where <i>η</i> is noise drawn uniformly from (−0.25, 0.25). A: Grid partition for noiseless parabolic function. B: Grid partition based on ApproxMaxMI for noisy parabolic function. C: Grid partition based on ChiMIC for noisy parabolic function.</p>

    Grid partition of ApproxMaxMI and ChiMIC for linear function.

    <p>1000 data points simulated for a functional relationship of the form <i>y</i> = <i>x</i> + <i>η</i>, where <i>η</i> is noise drawn uniformly from (−0.25, 0.25). A: Grid partition for noiseless linear function. B: Grid partition based on ApproxMaxMI for noisy linear function. C: Grid partition based on ChiMIC for noisy linear function.</p>
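
The simulated data described in the parabolic and linear captions can be reproduced approximately with a few lines of Python. This is a minimal sketch: the captions give the functional forms, the sample size, and the noise range, but not the range of x, so drawing x uniformly from [0, 1] is an assumption, as are the helper name `simulate` and the fixed seed.

```python
import random


def simulate(f, n=1000, noise=0.25, seed=0):
    """Return n points (x, f(x) + eta) with eta ~ Uniform(-noise, noise).

    Assumption: x is drawn uniformly from [0, 1]; the captions do not
    state the x range.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [f(x) + rng.uniform(-noise, noise) for x in xs]
    return xs, ys


# Noisy parabolic relationship, y = 4x^2 + eta.
px, py = simulate(lambda x: 4 * x ** 2)

# Noisy linear relationship, y = x + eta.
lx, ly = simulate(lambda x: x)

# Noiseless variants (panel A in each figure) use noise=0.
nx, ny = simulate(lambda x: 4 * x ** 2, noise=0.0)
```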

    Illustration of x-axis partition of ChiMIC.

    <p>Colored <i>r</i>×2 contingency tables are used for the chi-square test.</p>
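
To make the r×2 test concrete, the sketch below computes the Pearson chi-square statistic and its degrees of freedom for a table with r rows and 2 columns. How ChiMIC builds these tables from a candidate x-axis split, and which significance threshold it applies, is not stated in this listing; the function name `chi_square_statistic` and the example counts are hypothetical.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an
    r-by-2 contingency table given as a list of [left, right] counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(2)]
    n = sum(row_totals)
    if n == 0:
        raise ValueError("table has no observations")

    stat = 0.0
    for row, row_total in zip(table, row_totals):
        for j in range(2):
            expected = row_total * col_totals[j] / n
            if expected > 0:  # skip cells in empty rows or columns
                stat += (row[j] - expected) ** 2 / expected

    # Empty rows or columns carry no information and do not count.
    rows_used = sum(1 for t in row_totals if t > 0)
    cols_used = sum(1 for t in col_totals if t > 0)
    df = (rows_used - 1) * (cols_used - 1)
    return stat, df


# Hypothetical counts: two y-bins, points left and right of a split.
stat, df = chi_square_statistic([[20, 0], [0, 20]])
```

The statistic is then compared with the chi-square critical value for `df` degrees of freedom (for example, 3.841 at the 0.05 level when `df` is 1). A value below the critical value would suggest that the split does not separate the rows, which is the kind of evidence a stopping rule could use.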

    Density distribution of ApproxMaxMI, ChiMIC and dCor scores for two independent variables.

    <p>ApproxMaxMI, ChiMIC and dCor estimates were computed for sample size <i>n</i> = 400 over 1000 replicates.</p>

    Retained features and independent test accuracy based on MIC and ChiMIC.

    <p>Retained features and independent test accuracy based on MIC and ChiMIC.</p>