Quantitative Discourse Cohesion Analysis of Scientific Scholarly Texts using Multilayer Networks
Discourse cohesion facilitates text comprehension and helps the reader form a
coherent narrative. In this study, we aim to computationally analyze the
discourse cohesion in scientific scholarly texts using multilayer network
representation and quantify the writing quality of the document. Exploiting the
hierarchical structure of scientific scholarly texts, we design section-level
and document-level metrics to assess the extent of lexical cohesion in text. We
use a publicly available dataset along with a curated set of contrasting
examples to validate the proposed metrics by comparing them against select
indices computed using existing cohesion analysis tools. We observe that the
proposed metrics correlate as expected with the existing cohesion indices.
We also present an analytical framework, CHIAA (CHeck It Again, Author), to
provide pointers to the author for potential improvements in the manuscript
with the help of the section-level and document-level metrics. The proposed
CHIAA framework furnishes a clear and precise prescription to the author for
improving writing by localizing regions in text with cohesion gaps. We
demonstrate the efficacy of the CHIAA framework using succinct examples from
cohesion-deficient text excerpts in the experimental dataset. Comment: 26 pages, 8 figures, 4 tables
Novel Accessibility Metrics Based on Hierarchical Decomposition of Transport Networks
Scientific analysis of public transport systems at the urban, regional, and national levels is vital in this contemporary, highly connected world. Quantifying the accessibility of nodes (locations) in a transport network is considered a holistic measure of transportation and land use, and it is an important research area. In recent years, complex networks have been employed for modeling and analyzing the topology of transport systems and services networks. However, the design of network hierarchy-based accessibility measures has not been fully explored in transport research. Thus, we propose a set of three novel accessibility metrics based on the k-core decomposition of the transport network. Core-based accessibility metrics leverage the network topology by eliciting the hierarchy while accommodating variations like travel cost, travel time, distance, and frequency of service as edge weights. The proposed metrics quantify the accessibility of nodes at different geographical scales, ranging from local to global. We use these metrics to compute the accessibility of geographical locations connected by air transport services in India. Finally, we show that the measures are responsive to changes in the topology of the transport network by analyzing the changes in accessibility for the domestic air services network for both pre-COVID and post-COVID times.
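As an illustration of the k-core decomposition that the abstract builds on, the following is a minimal sketch that computes core numbers for a small undirected, unweighted graph by repeatedly peeling off the lowest-degree node. It does not reproduce the three accessibility metrics proposed in the paper, whose definitions are not given in the abstract, and it ignores edge weights. The function name `core_numbers` and the toy node labels are assumptions made for this example.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {node: core number} for an undirected, unweighted graph.

    Hypothetical helper for illustration only: it peels nodes in order of
    smallest remaining degree, which yields the standard k-core hierarchy.
    """
    adjacency = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adjacency[u].add(v)
            adjacency[v].add(u)

    degree = {node: len(nbrs) for node, nbrs in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Remove the node with the smallest remaining degree.
        node = min(remaining, key=lambda n: degree[n])
        # The core level never decreases as peeling proceeds.
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for nbr in adjacency[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


# Toy network (not real data): a triangle A-B-C with two pendant nodes.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("A", "E")]
toy_cores = core_numbers(toy_edges)
```

In this toy network the triangle nodes land in the 2-core while the pendant nodes stay in the 1-core, which is the kind of hierarchy a core-based accessibility measure could draw on.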
Results from the Supernova Photometric Classification Challenge
We report results from the Supernova Photometric Classification Challenge
(SNPCC), a publicly released mix of simulated supernovae (SNe), with types (Ia,
Ibc, and II) selected in proportion to their expected rate. The simulation was
realized in the griz filters of the Dark Energy Survey (DES) with realistic
observing conditions (sky noise, point-spread function and atmospheric
transparency) based on years of recorded conditions at the DES site.
Simulations of non-Ia type SNe are based on spectroscopically confirmed light
curves that include unpublished non-Ia samples donated from the Carnegie
Supernova Project (CSP), the Supernova Legacy Survey (SNLS), and the Sloan
Digital Sky Survey-II (SDSS-II). A spectroscopically confirmed subset was
provided for training. We challenged scientists to run their classification
algorithms and report a type and photo-z for each SN. Participants from 10
groups contributed 13 entries for the sample that included a host-galaxy
photo-z for each SN, and 9 entries for the sample that had no redshift
information. Several different classification strategies resulted in similar
performance, and for all entries the performance was significantly better for
the training subset than for the unconfirmed sample. For the spectroscopically
unconfirmed subset, the entry with the highest average figure of merit for
classifying SNe Ia has an efficiency of 0.96 and an SN Ia purity of 0.79. As a
public resource for the future development of photometric SN classification and
photo-z estimators, we have released updated simulations with improvements
based on our experience from the SNPCC, added samples corresponding to the
Large Synoptic Survey Telescope (LSST) and the SDSS, and provided the answer
keys so that developers can evaluate their own analysis. Comment: accepted by PAS
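The abstract reports an efficiency and a purity for SN Ia classification without defining them. The sketch below assumes the commonly used definitions: efficiency is the fraction of true SNe Ia that are classified as Ia, and purity is the fraction of objects classified as Ia that truly are Ia. The function name and the toy labels are assumptions for this example, and the challenge's own figure of merit is not reproduced here.

```python
def efficiency_and_purity(true_types, predicted_types, target="Ia"):
    """Return (efficiency, purity) for one target class.

    Assumed definitions (not taken from the abstract):
      efficiency = correctly tagged targets / all true targets
      purity     = correctly tagged targets / all objects tagged as target
    """
    pairs = list(zip(true_types, predicted_types))
    n_target = sum(1 for t, _ in pairs if t == target)
    n_true = sum(1 for t, p in pairs if t == target and p == target)
    n_false = sum(1 for t, p in pairs if t != target and p == target)

    n_selected = n_true + n_false
    efficiency = n_true / n_target if n_target else float("nan")
    purity = n_true / n_selected if n_selected else float("nan")
    return efficiency, purity


# Toy labels (not real data): four true Ia, one missed, one contaminant.
toy_true = ["Ia", "Ia", "Ia", "Ia", "II", "Ibc", "II", "II"]
toy_pred = ["Ia", "Ia", "Ia", "II", "Ia", "Ibc", "II", "II"]
toy_eff, toy_pur = efficiency_and_purity(toy_true, toy_pred)
```

With these toy labels, three of the four true Ia are recovered and one of the four objects tagged as Ia is a contaminant, so both quantities come out to 0.75.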
Not Available
The cluster ensemble technique has attracted serious attention in the area of unsupervised learning. It aims at improving the robustness and quality of a clustering scheme, particularly in scenarios where either randomization or sampling is part of the clustering algorithm. In this paper, we address the problem of instability and non-robustness in K-means clustering. These problems arise naturally because of random seed selection by the algorithm, the order sensitivity of the algorithm, and the presence of noise and outliers in the data. We propose a cluster ensemble method based on discriminant analysis to obtain robust clustering using the K-means clusterer. The proposed algorithm operates in three phases. The first phase is preparatory: multiple clustering schemes are generated and the cluster correspondence is obtained. The second phase uses discriminant analysis and constructs a label matrix. In the final phase, the consensus partition is generated and noise, if any, is segregated. Experimental analysis using standard public data sets provides strong empirical evidence of the high quality of the resultant clustering scheme.
Big Data Analytics: First International Conference, BDA 2012, New Delhi, India, December 24-26, 2012. Proceedings
XIV, 181 p. 83 illus. Online resource