140 research outputs found

    Learning from Multi-Class Imbalanced Big Data with Apache Spark

    With data becoming a new form of currency, its analysis has become a top priority in both academia and industry, furthering advancements in high-performance computing and machine learning. However, these large, real-world datasets come with additional complications such as noise and class overlap. Problems are magnified when multi-class data is presented, especially since many of the popular algorithms were originally designed for binary data. Another challenge arises when the examples are not evenly distributed across all classes in a dataset. This often causes classifiers to favor the majority class over the minority classes, leading to undesirable results, as learning from the rare cases may be the primary goal. Many of the classic machine learning algorithms were not designed for multi-class, imbalanced data or for parallelism, and so their effectiveness has been hindered. This dissertation addresses some of these challenges through in-depth experimentation with novel implementations of machine learning algorithms in Apache Spark, a distributed computing framework based on the MapReduce model and designed to handle very large datasets. Experimentation showed that many of the traditional classifier algorithms do not translate well to a distributed computing environment, indicating the need for a new generation of algorithms targeting modern high-performance computing. A collection of popular oversampling methods, originally designed for small binary-class datasets, has been implemented using Apache Spark for the first time to improve parallelism and add multi-class support. An extensive study on how instance-level difficulty affects learning from large datasets was also performed.

    Effect of variations in atelectasis on tumor displacement during radiation therapy for locally advanced lung cancer

    Purpose: Atelectasis (AT), or collapsed lung, is frequently associated with central lung tumors. We investigated the variation of atelectasis volumes during radiation therapy and analyzed the effect of AT volume changes on the reproducibility of the primary tumor (PT) position. Methods and materials: Twelve patients with lung cancer who had AT and 10 patients without AT underwent repeated 4-dimensional fan beam computed tomography (CT) scans during radiation therapy per protocols that were approved by the institutional review board. Interfraction volume changes of AT and PT were correlated with PT displacements relative to bony anatomy using both a bounding box (BB) method and change in center of mass (COM). Linear regression modeling was used to determine whether PT and AT volume changes were independently associated with PT displacement. PT displacement was compared between patients with and without AT. Results: The mean initial AT volume on the planning CT was 189 cm³ (37-513 cm³), and the mean PT volume was 93 cm³ (12-176 cm³). During radiation therapy, AT and PT volumes decreased on average 136.7 cm³ (20-369 cm³) for AT and 40 cm³ (−7 to 131 cm³) for PT. Eighty-three percent of patients with AT had at least one unidirectional PT shift that was greater than 0.5 cm outside of the initial BB during treatment. In patients with AT, the maximum PT COM shift was ≥0.5 cm in all patients and >1 cm in 58% of patients (0.5-2.4 cm). Changes in PT and AT volumes were independently associated with PT displacement (P < .01), and the correlation was smaller with COM (R² = 0.58) compared with the BB method (R² = 0.80). The median root mean squared PT displacement with the BB method was significantly less for patients without AT (0.45 cm) compared with those with AT (0.8 cm, P = .002). Conclusions: Changes in AT and PT volumes during radiation treatment were significantly associated with PT displacements that often exceeded standard setup margins. Repeated 3-dimensional imaging is recommended in patients with AT to evaluate for PT displacements during treatment. Summary: This study analyzed 12 patients with atelectasis and 10 patients without atelectasis who underwent repeat 4-dimensional fan beam computed tomography during radiation therapy. Patients with atelectasis had significantly greater tumor displacements than patients without atelectasis, and these tumor displacements often exceeded standard setup margins. Patients with atelectasis may benefit from repeated 3-dimensional imaging during radiation therapy and possible replanning for large tumor displacements.
