124 research outputs found
Data abstractions for decision tree induction
Abstract. When the descriptions of data values in a database are too concrete or too detailed, the computational cost of discovering useful knowledge from the database generally increases. Furthermore, the discovered knowledge tends to become complicated. Data abstraction seems useful for resolving this kind of problem: after abstraction we obtain a smaller and more general database, from which we can quickly extract more abstract knowledge that is expected to be easier to understand. In general, however, several abstractions are possible, so we have to carefully select the one according to which the original database is generalized. An inadequate selection would reduce the accuracy of the extracted knowledge. From this point of view, we propose in this paper a method of selecting an appropriate abstraction from the possible ones, assuming that our task is to construct a decision tree from a relational database. Suppose that, for each attribute in a relational database, we have a class of possible abstractions for the attribute values. As an appropriate abstraction for each attribute, we prefer one such that, even after the abstraction, the distribution of target classes necessary to perform our classification task is preserved within an acceptable error range given by the user. With the selected abstractions, the original database can be transformed into a small generalized database written in abstract values. Therefore, we can expect to construct from the generalized database a decision tree much smaller than one constructed from the original database. Furthermore, such a size reduction can be justified under some theoretical assumptions. The appropriateness of an abstraction is precisely defined in terms of standard information theory; we therefore call our abstraction framework Information Theoretical Abstraction. We show some experimental results obtained by ITA, a system that implements our abstraction method. These results confirm that our method is very effective in reducing the size of the resulting decision tree without substantially worsening classification errors.
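The selection idea described in this abstract can be illustrated with a small sketch. The abstract does not give the exact criterion used by ITA, so the code below is an assumption: it measures how much an abstraction blurs the class distribution as the increase in the conditional entropy H(class | attribute), and among the candidates whose increase stays within a user-given tolerance it picks the one with the fewest abstract values. All function names, the toy data, and the tolerance value are hypothetical.

```python
from collections import Counter
from math import log2


def conditional_entropy(pairs):
    """Return H(class | value) in bits for a list of (value, class) pairs."""
    n = len(pairs)
    value_counts = Counter(value for value, _ in pairs)
    joint_counts = Counter(pairs)
    return -sum(
        (count / n) * log2(count / value_counts[value])
        for (value, _), count in joint_counts.items()
    )


def information_loss(pairs, abstraction):
    """Increase in H(class | attribute) after replacing values by abstract ones.

    `abstraction` maps each concrete value to an abstract value.
    Merging values can never decrease the conditional entropy, so this is >= 0.
    """
    abstracted = [(abstraction[value], label) for value, label in pairs]
    return conditional_entropy(abstracted) - conditional_entropy(pairs)


def select_abstraction(pairs, candidates, tolerance):
    """Pick the coarsest candidate whose information loss is within tolerance.

    Assumed selection rule (not taken from the paper): fewest abstract values wins.
    Returns the candidate's name, or None if no candidate is acceptable.
    """
    acceptable = [
        (name, mapping)
        for name, mapping in candidates.items()
        if information_loss(pairs, mapping) <= tolerance
    ]
    if not acceptable:
        return None
    return min(acceptable, key=lambda item: len(set(item[1].values())))[0]


# Toy data (hypothetical): one attribute "city" and a binary target class.
pairs = (
    [("tokyo", "yes")] * 3 + [("osaka", "yes")] * 3
    + [("paris", "no")] * 3 + [("lyon", "no")] * 3
)
candidates = {
    # Groups cities with the same class behaviour: no class information is lost.
    "by_country": {"tokyo": "japan", "osaka": "japan", "paris": "france", "lyon": "france"},
    # Groups cities with opposite class behaviour: the class becomes unpredictable.
    "mixed": {"tokyo": "g1", "paris": "g1", "osaka": "g2", "lyon": "g2"},
}
chosen = select_abstraction(pairs, candidates, tolerance=0.1)
```

In this toy case `by_country` preserves the class distribution exactly, while `mixed` loses one full bit, so only the former falls within the tolerance.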
Homogeneous Expansion of Human T-Regulatory Cells Via Tumor Necrosis Factor Receptor 2
T-regulatory cells (Tregs) are a rare lymphocyte subtype that shows promise for treating infectious disease, allergy, graft-versus-host disease, autoimmunity, and asthma. Clinical applications of Tregs have not been fully realized because standard methods of ex vivo expansion produce heterogeneous progeny consisting of mixed populations of CD4+ T cells. Heterogeneous progeny are risky for human clinical trials and face significant regulatory hurdles. With the goal of producing homogeneous Tregs, we developed a novel expansion protocol targeting tumor necrosis factor receptors (TNFR) on Tregs. In in vitro studies, a TNFR2 agonist was found superior to standard methods in proliferating human Tregs into a phenotypically homogeneous population characterized by 14 cell surface markers. The TNFR2 agonist-expanded Tregs also were functionally superior in suppressing a key Treg target cell, cytotoxic T-lymphocytes. Targeting the TNFR2 receptor during ex vivo expansion is a new means for producing homogeneous and potent human Tregs for clinical opportunities.
Discovery of hidden correlations in a local transaction database based on differences of correlations
Abstract. Given a transaction database as a global set of transactions and a sub-database of it regarded as a local one, we consider pairs of itemsets whose degrees of correlation are higher in the local database than in the global one. If a pair shows high correlation in the local database, it is detectable by the search methods of previous studies. On the other hand, there exists another kind of paired itemsets that are not regarded as characteristic and cannot be found by those methods, but whose degrees of correlation become drastically higher when conditioned on the local database. We pay particular attention to this latter kind, as such pairs can be implicit, hidden evidence that something particular to the local database is occurring, even though they are not yet recognized as characteristic. From this viewpoint, we measure paired itemsets by the difference between two correlations, before and after conditioning on the local database, and define the notion of DC pairs, whose differences of correlations are high. As the measure is non-monotonic, we present an algorithm that searches for DC pairs using new pruning rules for cutting off hopeless itemsets. We show experimentally that potentially significant DC pairs can actually be found for a given database and that the algorithm successfully detects them.
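As a rough illustration of the measure described in this abstract, the sketch below scores a pair of itemsets by the difference between its correlation in the local database and in the global one. The abstract does not name the correlation measure, so lift, P(X and Y) / (P(X) P(Y)), is used here as an assumption. The function names and the toy transactions are hypothetical, and the paper's search algorithm and pruning rules are not reproduced: this only evaluates one given pair.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def lift(transactions, x, y):
    """Lift of itemsets x and y (assumed correlation measure); 0.0 if undefined."""
    support_x = support(transactions, x)
    support_y = support(transactions, y)
    if support_x == 0 or support_y == 0:
        return 0.0
    return support(transactions, set(x) | set(y)) / (support_x * support_y)


def correlation_difference(global_db, local_db, x, y):
    """Correlation in the local database minus correlation in the global one."""
    return lift(local_db, x, y) - lift(global_db, x, y)


# Toy data (hypothetical): the local database is a subset of the global one.
local_db = [frozenset("ab"), frozenset("ab"), frozenset("c"), frozenset("c")]
global_db = local_db + [frozenset("a"), frozenset("b"), frozenset("ac"), frozenset("bc")]

difference = correlation_difference(global_db, local_db, {"a"}, {"b"})
```

Here items `a` and `b` are independent in the global database (lift 1.0) but always co-occur when either appears in the local one (lift 2.0), so the pair scores a difference of 1.0.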
Proof-of-Concept, Randomized, Controlled Clinical Trial of Bacillus-Calmette-Guerin for Treatment of Long-Term Type 1 Diabetes
Background: No targeted immunotherapies reverse type 1 diabetes in humans. However, in a rodent model of type 1 diabetes, Bacillus Calmette-Guerin (BCG) reverses disease by restoring insulin secretion. Specifically, it stimulates innate immunity by inducing the host to produce tumor necrosis factor (TNF), which, in turn, kills disease-causing autoimmune cells and restores pancreatic beta-cell function through regeneration. Methodology/Principal Findings: Translating these findings to humans, we administered BCG, a generic vaccine, in a proof-of-principle, double-blind, placebo-controlled trial of adults with long-term type 1 diabetes (mean: 15.3 years) at one clinical center in North America. Six subjects were randomly assigned to BCG or placebo and compared to self, healthy paired controls (n = 6) or reference subjects with (n = 57) or without (n = 16) type 1 diabetes, depending upon the outcome measure. We monitored weekly blood samples for 20 weeks for insulin-autoreactive T cells, regulatory T cells (Tregs), glutamic acid decarboxylase (GAD) and other autoantibodies, and C-peptide, a marker of insulin secretion. BCG-treated patients and one placebo-treated patient who, after enrollment, unexpectedly developed acute Epstein-Barr virus (EBV) infection, a known TNF inducer, exclusively showed increases in dead insulin-autoreactive T cells and induction of Tregs. C-peptide levels (pmol/L) significantly rose transiently in two BCG-treated subjects (means: 3.49 pmol/L [95% CI 2.95–3.8], 2.57 [95% CI 1.65–3.49]) and the EBV-infected subject (3.16 [95% CI 2.54–3.69]) vs. 1.65 [95% CI 1.55–3.2] in reference diabetic subjects. BCG-treated subjects each had more than 50% of their C-peptide values above the 95th percentile of the reference subjects. The EBV-infected subject had 18% of C-peptide values above this level.
Conclusions/Significance: We conclude that BCG treatment or EBV infection transiently modified the autoimmunity that underlies type 1 diabetes by stimulating the host innate immune response. This suggests that BCG or other stimulators of host innate immunity may have value in the treatment of long-term diabetes. Trial Registration: ClinicalTrials.gov NCT0060723
Novel Automated Blood Separations Validate Whole Cell Biomarkers
Progress in clinical trials in infectious disease, autoimmunity, and cancer is stymied by a dearth of successful whole cell biomarkers for peripheral blood lymphocytes (PBLs). Successful biomarkers could help to track drug effects at early time points in clinical trials to prevent costly trial failures late in development. One major obstacle is the inaccuracy of Ficoll density centrifugation, the decades-old method of separating PBLs from the abundant red blood cells (RBCs) of fresh blood samples. To replace the Ficoll method, we developed and studied a novel blood-based magnetic separation method. The magnetic method strikingly surpassed Ficoll in viability, purity and yield of PBLs. To reduce labor, we developed an automated platform and compared two magnet configurations for cell separations. These more accurate and labor-saving magnet configurations allowed the lymphocytes to be tested in bioassays for rare antigen-specific T cells. The automated method succeeded at identifying 79% of patients with the rare PBLs of interest as compared with Ficoll's uniform failure. We validated improved upfront blood processing and show accurate detection of rare antigen-specific lymphocytes. Improving, automating and standardizing lymphocyte detections from whole blood may facilitate development of new cell-based biomarkers for human diseases. Improved upfront blood processes may lead to broad improvements in monitoring early trial outcome measurements in human clinical trials.
Multimodality imaging to identify lipid-rich coronary plaques and predict periprocedural myocardial injury: Association between near-infrared spectroscopy and coronary computed tomography angiography
Background: This study compares the efficacy of coronary computed tomography angiography (CCTA) and near-infrared spectroscopy intravascular ultrasound (NIRS–IVUS) in patients with significant coronary stenosis for predicting periprocedural myocardial injury during percutaneous coronary intervention (PCI). Methods: We prospectively enrolled 107 patients who underwent CCTA before PCI and performed NIRS–IVUS during PCI. Based on the maximal lipid core burden index for any 4-mm longitudinal segments (maxLCBI4mm) in the culprit lesion, we divided the patients into two groups: lipid-rich plaque (LRP) group (maxLCBI4mm ≥ 400; n = 48) and no-LRP group (maxLCBI4mm < 400; n = 59). Periprocedural myocardial injury was a postprocedural cardiac troponin T (cTnT) elevation of ≥5 times the upper limit of normal. Results: The LRP group had a significantly higher cTnT (p = 0.026), lower CT density (p < 0.001), larger percentage atheroma volume (PAV) by NIRS–IVUS (p = 0.036), and larger remodeling index measured by both CCTA (p = 0.020) and NIRS–IVUS (p < 0.001). A significant negative linear correlation was found between maxLCBI4mm and CT density (rho = −0.552, p < 0.001). Multivariable logistic regression analysis identified maxLCBI4mm [odds ratio (OR): 1.006, p = 0.003] and PAV (OR: 1.125, p = 0.014) as independent predictors of periprocedural myocardial injury, while CT density was not an independent predictor (OR: 0.991, p = 0.22). Conclusion: CCTA and NIRS–IVUS correlated well to identify LRP in culprit lesions. However, NIRS–IVUS was more competent in predicting the risk of periprocedural myocardial injury.
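An odds ratio close to 1 for a predictor measured on a wide scale can still describe a sizeable effect. The sketch below shows the standard rescaling of a logistic-regression odds ratio to a larger increment. It assumes the reported OR of 1.006 for maxLCBI4mm is per one-unit increase, which the abstract does not state, and because the published value is rounded the result is only indicative. The function name and the 100-unit increment are hypothetical choices for illustration.

```python
from math import exp, log


def rescale_odds_ratio(or_per_unit, delta):
    """Odds ratio for a `delta`-unit increase, given the per-unit odds ratio.

    In logistic regression the log-odds are linear in the predictor,
    so the odds ratio for `delta` units is or_per_unit ** delta.
    """
    return exp(log(or_per_unit) * delta)


# Assumption: OR 1.006 is per one-unit increase in maxLCBI4mm.
or_per_100_units = rescale_odds_ratio(1.006, 100)
```

Under that assumption, a 100-unit difference in maxLCBI4mm would correspond to roughly 1.8 times the odds of periprocedural myocardial injury, with the other model covariates held fixed.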