
    Automatic identification of variables in epidemiological datasets using logic regression

    Background: For an individual participant data (IPD) meta-analysis, multiple datasets must be transformed into a consistent format, e.g. using uniform variable names. When large numbers of datasets have to be processed, this can be a time-consuming and error-prone task. Automated or semi-automated identification of variables can help to reduce the workload and improve data quality. For semi-automation, high sensitivity in recognizing matching variables is particularly important: it allows software to present, for each target variable, a shortlist of candidate source variables from which a user picks the matching one, with only a low risk that the correct source variable is missing from the shortlist. Methods: For each variable in a set of target variables, a number of simple rules were created manually. With logic regression, an optimal Boolean combination of these rules was searched for every target variable, using a random subset of a large database of epidemiological and clinical cohort data (construction subset). These optimal rule combinations were then validated in a second subset of the database (validation subset). Results: In the construction sample, 41 target variables were allocated on average with a positive predictive value (PPV) of 34% and a negative predictive value (NPV) of 95%. In the validation sample, PPV was 33%, whereas NPV remained at 94%. PPV was 50% or less in 63% of all variables in the construction sample and in 71% of all variables in the validation sample. Conclusions: We demonstrated that the application of logic regression to a complex data management task in large epidemiological IPD meta-analyses is feasible. However, the performance of the algorithm is poor, which may require backup strategies.
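To make the terms in the abstract above concrete, the following is a minimal sketch of how a Boolean combination of simple name-matching rules could be scored by PPV and NPV. Everything in it is an illustrative assumption and not taken from the paper: the target variable (systolic blood pressure), the three rules, the hand-picked combination, and the toy labelled data are all hypothetical. Logic regression would search for the combination automatically, whereas here it is fixed by hand.

```python
from typing import Callable, Iterable, Tuple

Rule = Callable[[str], bool]


# Hypothetical hand-written rules for the target variable
# "systolic blood pressure" (not the rules used in the paper).
def contains_sbp(name: str) -> bool:
    return "sbp" in name.lower()


def contains_sys(name: str) -> bool:
    return "sys" in name.lower()


def contains_dia(name: str) -> bool:
    return "dia" in name.lower()


def combined_rule(name: str) -> bool:
    # One possible Boolean combination, chosen by hand for illustration:
    # (sbp OR sys) AND NOT dia.
    return (contains_sbp(name) or contains_sys(name)) and not contains_dia(name)


def ppv_npv(rule: Rule, labelled: Iterable[Tuple[str, bool]]) -> Tuple[float, float]:
    """Return (PPV, NPV) of a rule on (variable name, is true match) pairs."""
    tp = fp = tn = fn = 0
    for name, is_match in labelled:
        predicted = rule(name)
        if predicted and is_match:
            tp += 1
        elif predicted and not is_match:
            fp += 1
        elif not predicted and is_match:
            fn += 1
        else:
            tn += 1
    # PPV = TP / (TP + FP); NPV = TN / (TN + FN); NaN when undefined.
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Toy labelled source variable names (invented for this sketch).
TOY_DATA = [
    ("SBP_1", True),
    ("sys_bp", True),
    ("system_id", False),  # matched by the "sys" rule: a false positive
    ("dia_bp", False),
    ("bp_upper", True),    # matched by no rule: a false negative
    ("age", False),
]

if __name__ == "__main__":
    ppv, npv = ppv_npv(combined_rule, TOY_DATA)
    print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

In a semi-automated workflow of the kind the abstract describes, a high NPV is what matters most: variables the rule rejects are rarely true matches, so a user reviewing only the accepted candidates is unlikely to miss the correct source variable.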

    Solubility correlation of monocarboxylic acids in one-component solvents


    Densities and Excess Volumes of the 1-Chlorobutane + n


    Excess Enthalpies in Binary Systems of Isomeric C₈ Aliphatic Monoethers with Acetonitrile and Their Description by the COSMO-SAC Model

    The excess enthalpies at 298.15 and 308.15 K for six binary mixtures of acetonitrile + C₈ aliphatic ether {heptyl methyl ether CH₃OⁿC₇H₁₅, or ethyl hexyl ether C₂H₅OⁿC₆H₁₃, or pentyl propyl ether ⁿC₃H₇OⁿC₅H₁₁, or isopentyl propyl ether ⁿC₃H₇OⁱC₅H₁₁, or dibutyl ether ⁿC₄H₉OⁿC₄H₉, or butyl isobutyl ether ⁿC₄H₉OⁱC₄H₉} have been determined by isothermal titration calorimetry using a TA Instruments calorimeter (model TAM III). The ability of the COSMO-SAC model to account for the thermodynamic differences between these systems has been tested.