24 research outputs found

    Quantitative Discourse Cohesion Analysis of Scientific Scholarly Texts using Multilayer Networks

    Discourse cohesion facilitates text comprehension and helps the reader form a coherent narrative. In this study, we aim to computationally analyze the discourse cohesion in scientific scholarly texts using multilayer network representation and quantify the writing quality of the document. Exploiting the hierarchical structure of scientific scholarly texts, we design section-level and document-level metrics to assess the extent of lexical cohesion in text. We use a publicly available dataset along with a curated set of contrasting examples to validate the proposed metrics by comparing them against select indices computed using existing cohesion analysis tools. We observe that the proposed metrics correlate as expected with the existing cohesion indices. We also present an analytical framework, CHIAA (CHeck It Again, Author), to provide pointers to the author for potential improvements in the manuscript with the help of the section-level and document-level metrics. The proposed CHIAA framework furnishes a clear and precise prescription to the author for improving writing by localizing regions in text with cohesion gaps. We demonstrate the efficacy of the CHIAA framework using succinct examples from cohesion-deficient text excerpts in the experimental dataset. Comment: 26 pages, 8 figures, 4 tables

    NOVEL ACCESSIBILITY METRICS BASED ON HIERARCHICAL DECOMPOSITION OF TRANSPORT NETWORKS

    Scientific analysis of public transport systems at the urban, regional, and national levels is vital in this contemporary, highly connected world. Quantifying the accessibility of nodes (locations) in a transport network is considered a holistic measure of transportation and land use and an important research area. In recent years, complex networks have been employed for modeling and analyzing the topology of transport systems and services networks. However, the design of network hierarchy-based accessibility measures has not been fully explored in transport research. Thus, we propose a set of three novel accessibility metrics based on the k-core decomposition of the transport network. Core-based accessibility metrics leverage the network topology by eliciting the hierarchy while accommodating variations like travel cost, travel time, distance, and frequency of service as edge weights. The proposed metrics quantify the accessibility of nodes at different geographical scales, ranging from local to global. We use these metrics to compute the accessibility of geographical locations connected by air transport services in India. Finally, we show that the measures are responsive to changes in the topology of the transport network by analyzing the changes in accessibility for the domestic air services network between the pre-COVID and post-COVID periods.
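
    The abstract above does not define its three accessibility metrics, so the following is only a minimal sketch of the underlying k-core decomposition on an unweighted toy graph. The function name `core_numbers`, the node names, and the edge list are hypothetical illustrations, not data or code from the paper.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {node: core number} for an undirected, unweighted graph.

    Uses the standard peeling procedure: repeatedly remove the node with
    the smallest remaining degree. A node's core number is the largest
    minimum degree seen up to the moment it is removed.
    """
    adjacency = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adjacency[u].add(v)
            adjacency[v].add(u)

    degree = {node: len(nbrs) for node, nbrs in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=lambda n: degree[n])
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for nbr in adjacency[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


# Hypothetical toy network: a triangle A-B-C with a chain A-D-E attached.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("A", "D"), ("D", "E")]
toy_cores = core_numbers(toy_edges)
```

    In this toy graph the triangle forms the 2-core, while the chain nodes D and E sit in the 1-core. Weighted variants (travel time, cost, frequency) would need a different degree definition, which is not attempted here.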

    Results from the Supernova Photometric Classification Challenge

    We report results from the Supernova Photometric Classification Challenge (SNPCC), a publicly released mix of simulated supernovae (SNe), with types (Ia, Ibc, and II) selected in proportion to their expected rate. The simulation was realized in the griz filters of the Dark Energy Survey (DES) with realistic observing conditions (sky noise, point-spread function and atmospheric transparency) based on years of recorded conditions at the DES site. Simulations of non-Ia type SNe are based on spectroscopically confirmed light curves that include unpublished non-Ia samples donated from the Carnegie Supernova Project (CSP), the Supernova Legacy Survey (SNLS), and the Sloan Digital Sky Survey-II (SDSS-II). A spectroscopically confirmed subset was provided for training. We challenged scientists to run their classification algorithms and report a type and photo-z for each SN. Participants from 10 groups contributed 13 entries for the sample that included a host-galaxy photo-z for each SN, and 9 entries for the sample that had no redshift information. Several different classification strategies resulted in similar performance, and for all entries the performance was significantly better for the training subset than for the unconfirmed sample. For the spectroscopically unconfirmed subset, the entry with the highest average figure of merit for classifying SNe Ia has an efficiency of 0.96 and an SN Ia purity of 0.79. As a public resource for the future development of photometric SN classification and photo-z estimators, we have released updated simulations with improvements based on our experience from the SNPCC, added samples corresponding to the Large Synoptic Survey Telescope (LSST) and the SDSS, and provided the answer keys so that developers can evaluate their own analysis. Comment: accepted by PAS
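
    The efficiency and purity quoted above can be illustrated with a short sketch. It assumes the usual definitions (efficiency: fraction of true SNe Ia that are tagged as Ia; purity: fraction of Ia-tagged objects that are truly Ia), which the abstract itself does not spell out. The function name and the toy labels are hypothetical, and the challenge's figure of merit is not reproduced here.

```python
def efficiency_and_purity(true_types, predicted_types, target="Ia"):
    """Return (efficiency, purity) for one target class.

    efficiency = correctly tagged targets / all true targets
    purity     = correctly tagged targets / all objects tagged as target
    Either value is NaN when its denominator is zero.
    """
    n_true_target = sum(1 for t in true_types if t == target)
    n_tagged = sum(1 for p in predicted_types if p == target)
    n_correct = sum(
        1 for t, p in zip(true_types, predicted_types) if t == target and p == target
    )
    efficiency = n_correct / n_true_target if n_true_target else float("nan")
    purity = n_correct / n_tagged if n_tagged else float("nan")
    return efficiency, purity


# Hypothetical toy sample of eight supernovae (not SNPCC data).
toy_true = ["Ia", "Ia", "Ia", "Ia", "II", "Ibc", "II", "II"]
toy_pred = ["Ia", "Ia", "Ia", "II", "Ia", "Ibc", "II", "II"]
toy_eff, toy_pur = efficiency_and_purity(toy_true, toy_pred)
```

    Here three of the four true SNe Ia are recovered and one of the four Ia tags is a contaminant, so both ratios come out to 3/4.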

    Not Available

    The cluster ensemble technique has attracted serious attention in the area of unsupervised learning. It aims at improving the robustness and quality of a clustering scheme, particularly in scenarios where either randomization or sampling is part of the clustering algorithm. In this paper, we address the problem of instability and non-robustness in K-means clustering. These problems arise naturally because of random seed selection by the algorithm, order sensitivity of the algorithm, and the presence of noise and outliers in data. We propose a cluster ensemble method based on Discriminant Analysis to obtain robust clustering using the K-means clusterer. The proposed algorithm operates in three phases. The first phase is preparatory, in which multiple clustering schemes are generated and the cluster correspondence is obtained. The second phase uses discriminant analysis and constructs a label matrix. In the final stage, the consensus partition is generated and noise, if any, is segregated. Experimental analysis using standard public data sets provides strong empirical evidence of the high quality of the resultant clustering scheme.
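
    As a rough illustration of the cluster correspondence problem mentioned above, the sketch below aligns the arbitrary label names of several partitions and then takes a simple per-point majority vote. This voting scheme is a swapped-in stand-in, not the discriminant-analysis method of the paper, and the function names and toy partitions are hypothetical. The brute-force search over label permutations is only practical for small k.

```python
from collections import Counter
from itertools import permutations


def align_labels(reference, labels, k):
    """Relabel `labels` with the permutation of 0..k-1 that agrees most with `reference`."""
    best_map, best_agreement = None, -1
    for perm in permutations(range(k)):
        agreement = sum(1 for r, l in zip(reference, labels) if perm[l] == r)
        if agreement > best_agreement:
            best_map, best_agreement = perm, agreement
    return [best_map[l] for l in labels]


def majority_vote(partitions, k):
    """Align every partition to the first one, then vote per data point."""
    reference = partitions[0]
    aligned = [list(reference)]
    aligned += [align_labels(reference, p, k) for p in partitions[1:]]
    return [Counter(column).most_common(1)[0][0] for column in zip(*aligned)]


# Hypothetical labels from three K-means runs on six points (k = 3).
# Runs 1 and 2 agree up to label names; run 3 misplaces the last point.
toy_partitions = [
    [0, 0, 1, 1, 2, 2],
    [1, 1, 2, 2, 0, 0],
    [2, 2, 0, 0, 1, 0],
]
toy_consensus = majority_vote(toy_partitions, k=3)
```

    After alignment, the third run disagrees with the other two only on the last point, so the vote recovers the partition found by the first two runs.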

    Big Data Analytics
