
    Elliptical cost-sensitive decision tree algorithm - ECSDT

    Cost-sensitive multiclass classification, in which the task is to assess the impact of the costs associated with different misclassification errors, continues to be one of the major challenges in data mining and machine learning. Literature reviews in this area show that most of the cost-sensitive algorithms developed during the last decade were designed for binary classification problems, where each example in the dataset is assigned to one of only two available classes. Much of the research on cost-sensitive learning has focused on inducing decision trees, which are among the most widely used classification methods because they are simple to construct, transparent and comprehensible. A review of the literature shows that inducing non-linear multiclass cost-sensitive decision trees is still in its early stages and that further research could improve on the current state of the art. Hence, this research addresses the following question: how can non-linear regions be identified for multiclass problems and utilized to construct decision trees so as to maximize the accuracy of classification and minimize misclassification costs? This research addresses the problem by developing a new algorithm, the Elliptical Cost-Sensitive Decision Tree algorithm (ECSDT), which induces cost-sensitive non-linear (elliptical) decision trees for multiclass classification problems using evolutionary optimization methods such as particle swarm optimization (PSO) and genetic algorithms (GAs). Ellipses are used as non-linear separators because they are simple and flexible: a non-linear boundary can be drawn by adjusting an ellipse's size, location and rotation towards an optimal result. The new algorithm was developed, tested and evaluated in three different settings, each with a different objective function. The first considered maximizing the accuracy of classification only; the second focused on minimizing misclassification costs only; the third considered both accuracy and misclassification cost together. ECSDT was applied to fourteen binary-class and multiclass datasets, and the results were compared with those obtained by applying common algorithms from Weka, such as J48, NBTree, MetaCost and the CostSensitiveClassifier, to the same datasets. The primary contribution of this research is the development of a new algorithm that shows the benefits of utilizing elliptical boundaries for cost-sensitive decision tree learning. The new algorithm is capable of handling multiclass problems, and an empirical evaluation shows good results. More specifically, when considering accuracy only, ECSDT achieves higher accuracy on 10 of the 14 datasets; when considering misclassification costs only, ECSDT performs better on 10 of the 14 datasets; and when considering both accuracy and misclassification costs, ECSDT obtains higher accuracy on 10 of the 14 datasets and lower misclassification costs on 5 of the 14 datasets. ECSDT was also able to produce smaller trees than J48, LADTree and ADTree.
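    The two ingredients named in the abstract above, an ellipse defined by its size, location and rotation and an objective based on accuracy or misclassification cost, can be illustrated with a minimal Python sketch. This is not the ECSDT implementation: the function names, the parameterization (centre, semi-axes, rotation angle) and the example cost matrix are assumptions made purely for illustration, and the evolutionary search (PSO or GA) that would tune the ellipse parameters is omitted.

```python
import math


def inside_ellipse(x, y, cx, cy, a, b, theta):
    """Return True if (x, y) lies inside or on a rotated ellipse.

    Hypothetical parameterization: centre (cx, cy), semi-axes a and b,
    rotation theta in radians (counter-clockwise).
    """
    dx, dy = x - cx, y - cy
    # Rotate the offset into the ellipse's own axis-aligned frame.
    u = dx * math.cos(theta) + dy * math.sin(theta)
    v = -dx * math.sin(theta) + dy * math.cos(theta)
    return (u / a) ** 2 + (v / b) ** 2 <= 1.0


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted class equals the actual class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def total_cost(y_true, y_pred, cost):
    """Sum cost[actual][predicted] over all examples.

    cost is a square matrix with zeros on the diagonal
    (correct predictions are assumed to cost nothing).
    """
    return sum(cost[t][p] for t, p in zip(y_true, y_pred))


# Illustrative three-class cost matrix (made-up values).
example_cost = [
    [0, 1, 5],
    [2, 0, 1],
    [10, 3, 0],
]
```

    An optimizer could score a candidate ellipse by labelling the points inside it with one class and then evaluating `accuracy`, `total_cost`, or a combination of the two, which mirrors the three objective settings the abstract describes.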

    A powerful bursting radio source towards the Galactic Centre

    Transient astronomical sources are typically powered by compact objects and usually signify highly explosive or dynamic events. While radio astronomy has an impressive record of obtaining high time resolution observations, this is usually achieved in quite narrow fields of view. Consequently, the dynamic radio sky is poorly sampled, in contrast to the situation in the X- and gamma-ray bands, in which wide-field instruments routinely detect transient sources. Here we report a new transient source, GCRT J1745-3009, detected in 2002 during a moderately wide-field radio transient monitoring program of the Galactic center (GC) region at 0.33 GHz. The characteristics of its bursts are unlike those known for any other class of radio transient. If located in or near the GC, its brightness temperature (~10^16 K) and the implied energy density within GCRT J1745-3009 vastly exceed those observed in most other classes of radio astronomical sources, and are consistent with coherent emission processes rarely observed. We conclude that GCRT J1745-3009 is the first member of a new class of radio transient sources, the first of possibly many new classes to be identified through current and upcoming radio surveys. Comment: 16 pages including 3 figures. Appears in Nature, 3 March 200

    The projection score - an evaluation criterion for variable subset selection in PCA visualization

    Background: In many scientific domains, it is becoming increasingly common to collect high-dimensional data sets, often with an exploratory aim, to generate new and relevant hypotheses. The exploratory perspective often makes statistically guided visualization methods, such as Principal Component Analysis (PCA), the methods of choice. However, the clarity of the obtained visualizations, and thereby the potential to use them to formulate relevant hypotheses, may be confounded by the presence of many non-informative variables. For microarray data, more easily interpretable visualizations are often obtained by filtering the variable set, for example by removing the variables with the smallest variances or by only including the variables most highly related to a specific response. The resulting visualization may depend heavily on the inclusion criterion, that is, effectively the number of retained variables. To our knowledge, there exists no objective method for determining the optimal inclusion criterion in the context of visualization.
    Results: We present the projection score, which is a straightforward, intuitively appealing measure of the informativeness of a variable subset with respect to PCA visualization. This measure can be universally applied to find suitable inclusion criteria for any type of variable filtering. We apply the presented measure to find optimal variable subsets for different filtering methods in both microarray data sets and synthetic data sets. We note also that the projection score can be applied in general contexts, to compare the informativeness of any variable subsets with respect to visualization by PCA.
    Conclusions: We conclude that the projection score provides an easily interpretable and universally applicable measure of the informativeness of a variable subset with respect to visualization by PCA, which can be used to systematically find the most interpretable PCA visualization in practical exploratory analysis.

    Discerning natural and anthropogenic organic matter inputs to salt marsh sediments of Ria Formosa lagoon (South Portugal)

    Sedimentary organic matter (OM) origin and molecular composition provide useful information for understanding carbon cycling in coastal wetlands. Core sediments from three transects along the Ria Formosa lagoon intertidal zone were analysed using analytical pyrolysis (Py-GC/MS) to determine the composition, distribution and origin of sedimentary OM. The distribution of alkyl compounds (alkanes, alkanoic acids and alkan-2-ones), polycyclic aromatic hydrocarbons (PAHs), lignin-derived methoxyphenols, linear alkylbenzenes (LABs), steranes and hopanes indicated OM inputs to the intertidal environment from natural (autochthonous and allochthonous) as well as anthropogenic sources. Several n-alkane geochemical indices used to assess the distribution of the main OM sources (terrestrial and marine) in the sediments indicate that algal and aquatic macrophyte derived OM inputs dominated over terrigenous plant sources. The lignin-derived methoxyphenol assemblage, dominated by vinylguaiacol and vinylsyringol derivatives in all sediments, points to a large OM contribution from higher plants. The spatial distributions of PAHs showed that most pollution came from mixed sources, both pyrogenic and petrogenic. Low carbon preference indexes (CPI > 1) for n-alkanes, the presence of an unresolved complex mixture (UCM) and the distribution of hopanes (C-29-C-36) and steranes (C-27-C-29) suggested localized petroleum-derived hydrocarbon inputs to the core sediments. Series of LABs were found in most sediment samples, also pointing to anthropogenic contributions from domestic sewage to the sediment OM. Funding: EU Erasmus Mundus Joint Doctorate fellowship (FUECA, University of Cadiz, Spain); European Commission [FP7-ENV-2011, 282845, FP7-534 ENV-2012, 308392]; MINECO project INTERCARBON [CGL2016-78937-R].

    Reduced expression of BAX is associated with poor prognosis in patients with epithelial ovarian cancer: a multifactorial analysis of TP53, p21, BAX and BCL-2

    Traditional clinicopathological features do not predict which patients will develop chemotherapy resistance. The TP53 gene is frequently altered in ovarian cancer but its prognostic implications are controversial. Little is known about the impact of TP53-downstream genes on prognosis. Using molecular and immunohistochemical analyses, we examined TP53 and its downstream genes p21, BAX and BCL-2 in ovarian tumour tissues and evaluated the results in relation to clinicopathological parameters, clinical outcome and response to platinum-based chemotherapy. Associations between the tested factors and patient and tumour characteristics were studied by Spearman rank correlation and Pearson's χ² test. The Cox proportional hazard model was used for univariate and multivariate analysis. The associations of the tested factors with response were tested using logistic regression analysis. TP53 mutation, p21 and BCL-2 expression were not associated with increased rates of progression and death. Expression of TP53 was associated with a shorter overall survival only (relative hazard rate [RHR] 2.01, P = 0.03). Interestingly, combining TP53 mutation and expression data resulted in a stronger association with overall survival (P = 0.008). BAX expression was found to be associated with both progression-free (RHR 0.44, P = 0.05) and overall survival (RHR 0.42, P = 0.03). Those patients who simultaneously expressed BAX and BCL-2 had a longer progression-free and overall survival compared to patients whose tumours did not express BCL-2 (P = 0.05 and 0.015, respectively). No relations were observed between the tested factors and response to platinum-based chemotherapy. We conclude that BAX expression may represent a prognostic indicator for patients with ovarian cancer and that the combined evaluation of BAX and BCL-2 may provide additional prognostic significance. http://www.bjcancer.com © 2001 Cancer Research Campaig

    SPARC 2016 Salford postgraduate annual research conference book of abstracts


    Significance of vascular endothelial growth factor in growth and peritoneal dissemination of ovarian cancer

    Vascular endothelial growth factor (VEGF) is a key regulator of angiogenesis which drives endothelial cell survival, proliferation, and migration while increasing vascular permeability. Playing an important role in the physiology of normal ovaries, VEGF has also been implicated in the pathogenesis of ovarian cancer. Essentially by promoting tumor angiogenesis and enhancing vascular permeability, VEGF contributes to the development of peritoneal carcinomatosis associated with malignant ascites formation, the characteristic feature of advanced ovarian cancer at diagnosis. In both experimental and clinical studies, VEGF levels have been inversely correlated with survival. Moreover, VEGF inhibition has been shown to inhibit tumor growth and ascites production and to suppress tumor invasion and metastasis. These findings have laid the basis for the clinical evaluation of agents targeting the VEGF signaling pathway in patients with ovarian cancer. In this review, we focus on VEGF involvement in the pathophysiology of ovarian cancer and its contribution to disease progression and dissemination.

    Multi-messenger observations of a binary neutron star merger

    On 2017 August 17 a binary neutron star coalescence candidate (later designated GW170817) with merger time 12:41:04 UTC was observed through gravitational waves by the Advanced LIGO and Advanced Virgo detectors. The Fermi Gamma-ray Burst Monitor independently detected a gamma-ray burst (GRB 170817A) with a time delay of ~1.7 s with respect to the merger time. From the gravitational-wave signal, the source was initially localized to a sky region of 31 deg$^2$ at a luminosity distance of $40^{+8}_{-8}$ Mpc and with component masses consistent with neutron stars. The component masses were later measured to be in the range 0.86 to 2.26 $M_\odot$. An extensive observing campaign was launched across the electromagnetic spectrum leading to the discovery of a bright optical transient (SSS17a, now with the IAU identification of AT 2017gfo) in NGC 4993 (at ~40 Mpc) less than 11 hours after the merger by the One-Meter, Two Hemisphere (1M2H) team using the 1 m Swope Telescope. The optical transient was independently detected by multiple teams within an hour. Subsequent observations targeted the object and its environment. Early ultraviolet observations revealed a blue transient that faded within 48 hours. Optical and infrared observations showed a redward evolution over ~10 days. Following early non-detections, X-ray and radio emission were discovered at the transient's position ~9 and ~16 days, respectively, after the merger. Both the X-ray and radio emission likely arise from a physical process that is distinct from the one that generates the UV/optical/near-infrared emission. No ultra-high-energy gamma-rays and no neutrino candidates consistent with the source were found in follow-up searches. These observations support the hypothesis that GW170817 was produced by the merger of two neutron stars in NGC 4993 followed by a short gamma-ray burst (GRB 170817A) and a kilonova/macronova powered by the radioactive decay of r-process nuclei synthesized in the ejecta.