84 research outputs found
On the relationship between Gaussian stochastic blockmodels and label propagation algorithms
The problem of community detection has received considerable attention in recent years.
Many methods have been proposed to discover communities in networks. In this
paper, we propose a Gaussian stochastic blockmodel that uses Gaussian
distributions to fit the weights of edges in networks for non-overlapping community
detection. The maximum likelihood estimation of this model has the same
objective function as general label propagation with node preference. The node
preference of a specific vertex turns out to be a value proportional to the
intra-community eigenvector centrality (the corresponding entry in principal
eigenvector of the adjacency matrix of the subgraph inside that vertex's
community) under maximum likelihood estimation. Additionally, the maximum
likelihood estimation of a constrained version of our model is highly related
to another extension of label propagation algorithm, namely, the label
propagation algorithm under constraint. Experiments show that the proposed
Gaussian stochastic blockmodel performs well on various benchmark networks.
Comment: 22 pages, 17 figures
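The abstract's observation that a node's preference is proportional to its intra-community eigenvector centrality (the corresponding entry of the principal eigenvector of the community subgraph's adjacency matrix) can be sketched with power iteration. This is a minimal toy illustration, not the paper's implementation; the function and graph below are hypothetical:

```python
import numpy as np

def intra_community_centrality(adj, members, iters=200):
    """Principal eigenvector of the subgraph induced by one community,
    computed by power iteration; its entries act as node preferences."""
    sub = adj[np.ix_(members, members)].astype(float)
    v = np.ones(len(members))
    for _ in range(iters):
        w = sub @ v
        norm = np.linalg.norm(w)
        if norm == 0:       # isolated community: no edges to iterate on
            break
        v = w / norm
    return v

# Toy graph: two triangles joined by a single edge; take the first
# triangle {0, 1, 2} as one community.
adj = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
])
pref = intra_community_centrality(adj, [0, 1, 2])
# The triangle subgraph is fully symmetric, so all three preferences
# come out equal: every member is equally central inside its community.
```

Note that the centrality is computed on the subgraph inside the community only; the bridge edge from node 2 to node 3 does not contribute.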
Facile Fabrication of Porous Conductive Thermoplastic Polyurethane Nanocomposite Films via Solution Casting
Content of Dataset
1. FTIR spectra
2. Tensile Properties
3. Conductivity
4. Piezoresistive Properties
5. Resistance vs. Strain
6. Porosity
Notes: This dataset is linked to the paper: Scientific Reports, 2017, DOI: 10.1038/s41598-017-17647-
Ranking and Retrieval under Semantic Relevance
This thesis presents a series of conceptual and empirical developments on the ranking and retrieval of candidates under semantic relevance. Part I of the thesis introduces the concept of uncertainty in various semantic tasks (such as recognizing textual entailment) in natural language processing, and the machine learning techniques commonly employed to model these semantic phenomena. A unified view of ranking and retrieval will be presented, and the trade-off between model expressiveness, performance, and scalability in model design will be discussed.
Part II of the thesis focuses on applying these ranking and retrieval techniques to text: Chapter 3 examines the feasibility of ranking hypotheses given a premise with respect to a human's subjective probability of the hypothesis happening, effectively extending the traditional categorical task of natural language inference. Chapter 4 focuses on detecting situation frames for documents using ranking methods. Then we extend the ranking notion to retrieval, and develop both sparse (Chapter 5) and dense (Chapter 6) vector-based methods to facilitate scalable retrieval for potential answer paragraphs in question answering.
Part III turns the focus to mentions and entities in text, while continuing the theme of ranking and retrieval: Chapter 7 discusses the ranking of fine-grained types that an entity mention could belong to, leading to state-of-the-art performance on hierarchical multi-label fine-grained entity typing. Chapter 8 extends the semantic relation of coreference to a cross-document setting, enabling models to retrieve from a large corpus, instead of a single document, when resolving coreferent entity mentions.
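The dense-retrieval idea in Part II, embedding the query and candidate paragraphs as vectors and retrieving by inner-product score, can be sketched as follows. This is a toy illustration with handcrafted two-dimensional vectors standing in for a learned encoder, not the thesis's system:

```python
import numpy as np

def retrieve_top_k(query_vec, paragraph_vecs, k=2):
    """Score each paragraph by its inner product with the query
    embedding and return the indices of the k best-scoring ones."""
    scores = paragraph_vecs @ query_vec
    return np.argsort(-scores)[:k]

# Hypothetical embeddings: paragraph 0 points along the query's main
# direction, paragraph 2 partially, paragraph 3 in the opposite direction.
paragraphs = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [0.7, 0.7],
    [-1.0, 0.0],
])
query = np.array([0.9, 0.1])
top = retrieve_top_k(query, paragraphs)  # indices 0 and 2 score highest
```

With precomputed paragraph vectors, this reduces ranking to a single matrix-vector product, which is what makes the dense approach scalable to large corpora (in practice via approximate nearest-neighbor indexes rather than exhaustive scoring).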
Learning to Rank for Plausible Plausibility
Researchers illustrate improvements in contextual encoding strategies via
resultant performance on a battery of shared Natural Language Understanding
(NLU) tasks. Many of these tasks are of a categorical prediction variety: given
a conditioning context (e.g., an NLI premise), provide a label based on an
associated prompt (e.g., an NLI hypothesis). The categorical nature of these
tasks has led to common use of a cross entropy log-loss objective during
training. We suggest this loss is intuitively wrong when applied to
plausibility tasks, where the prompt by design is neither categorically
entailed nor contradictory given the context. Log-loss naturally drives models
to assign scores near 0.0 or 1.0, in contrast to our proposed use of a
margin-based loss. Following a discussion of our intuition, we describe a
confirmation study based on an extreme, synthetically curated task derived from
MultiNLI. We find that a margin-based loss leads to a more plausible model of
plausibility. Finally, we illustrate improvements on the Choice Of Plausible
Alternative (COPA) task through this change in loss.
Comment: To appear in ACL 201
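The contrast the abstract draws, log-loss driving scores toward 0.0 or 1.0 versus a margin-based loss that only asks the plausible candidate to outrank the implausible one, can be sketched numerically. These are hypothetical scores, not outputs of the paper's model:

```python
import math

def log_loss(score_pos, score_neg):
    # Cross-entropy treats each score as an independent probability, so
    # it is only minimized as scores approach 1.0 and 0.0.
    return -math.log(score_pos) - math.log(1.0 - score_neg)

def margin_loss(score_pos, score_neg, margin=0.2):
    # A margin-based ranking loss only requires the plausible candidate
    # to outscore the implausible one by `margin`; mid-range scores that
    # rank correctly incur no penalty.
    return max(0.0, margin - (score_pos - score_neg))

# Mid-range scores with the correct ranking: the margin loss is already
# zero, while log-loss keeps pushing scores toward the extremes.
assert margin_loss(0.6, 0.3) == 0.0
assert log_loss(0.6, 0.3) > 0.0
```

This mirrors the paper's intuition for plausibility tasks: when the prompt is by design neither categorically entailed nor contradicted, relative ordering is the meaningful training signal, not extreme probabilities.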
A Unified View of Evaluation Metrics for Structured Prediction
We present a conceptual framework that unifies a variety of evaluation
metrics for different structured prediction tasks (e.g. event and relation
extraction, syntactic and semantic parsing). Our framework requires
representing the outputs of these tasks as objects of certain data types, and
derives metrics through matching of common substructures, possibly followed by
normalization. We demonstrate how commonly used metrics for a number of tasks
can be succinctly expressed by this framework, and show that new metrics can be
naturally derived in a bottom-up way based on an output structure. We release a
library that enables this derivation to create new metrics. Finally, we
consider how specific characteristics of tasks motivate metric design
decisions, and suggest possible modifications to existing metrics in line with
those motivations.
Comment: Accepted at EMNLP 2023 Main Track
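The framework's bottom-up recipe, represent outputs as collections of substructures, match the common ones, then normalize, can be illustrated with a toy relation-extraction F1. This is a sketch of the general idea, not the released library's API; the triples below are invented:

```python
def prf1(predicted, gold):
    """Derive precision/recall/F1 by matching common substructures;
    here the substructures are (head, relation, tail) triples."""
    matched = len(set(predicted) & set(gold))   # substructure matching
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)       # normalization step
    return precision, recall, f1

pred = [("Marie", "born_in", "Warsaw"), ("Marie", "works_at", "Sorbonne")]
gold = [("Marie", "born_in", "Warsaw"), ("Pierre", "born_in", "Paris")]
p, r, f = prf1(pred, gold)  # one triple matches: p = r = f = 0.5
```

Swapping in a different data type for the outputs (spans, dependency arcs, event arguments) changes only the substructure definition and the matching rule, which is what lets one framework express metrics for many structured prediction tasks.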
- …