Adposition and Case Supersenses v2.5: Guidelines for English
This document offers a detailed linguistic description of SNACS (Semantic
Network of Adposition and Case Supersenses; Schneider et al., 2018), an
inventory of 50 semantic labels ("supersenses") that characterize the use of
adpositions and case markers at a somewhat coarse level of granularity, as
demonstrated in the STREUSLE corpus (https://github.com/nert-gu/streusle/;
version 4.3 tracks guidelines version 2.5). Though the SNACS inventory aspires
to be universal, this document is specific to English; documentation for other
languages will be published separately.
Version 2 is a revision of the supersense inventory proposed for English by
Schneider et al. (2015, 2016) (henceforth "v1"), which in turn was based on
previous schemes. The present inventory was developed after extensive review of
the v1 corpus annotations for English, plus previously unanalyzed genitive case
possessives (Blodgett and Schneider, 2018), as well as consideration of
adposition and case phenomena in Hebrew, Hindi, Korean, and German. Hwang et
al. (2017) present the theoretical underpinnings of the v2 scheme. Schneider et
al. (2018) summarize the scheme, its application to English corpus data, and an
automatic disambiguation task
Contextual Object Detection with a Few Relevant Neighbors
A natural way to improve the detection of objects is to consider the
contextual constraints imposed by the detection of additional objects in a
given scene. In this work, we exploit the spatial relations between objects in
order to improve detection capacity, as well as analyze various properties of
the contextual object detection problem. To precisely calculate context-based
probabilities of objects, we developed a model that examines the interactions
between objects in an exact probabilistic setting, in contrast to previous
methods that typically utilize approximations based on pairwise interactions.
Such a scheme is facilitated by the realistic assumption that the existence of
an object in any given location is influenced by only a few informative locations
in space. Based on this assumption, we suggest a method for identifying these
relevant locations and integrating them into a mostly exact calculation of
probability based on their raw detector responses. This scheme is shown to
improve detection results and provides unique insights about the process of
contextual inference for object detection. We show that it is generally
difficult to learn that a particular object reduces the probability of another,
and that in cases when the context and detector strongly disagree this learning
becomes virtually impossible for the purposes of improving the results of an
object detector. Finally, we demonstrate improved detection results through use
of our approach as applied to the PASCAL VOC and COCO datasets
Scalable and Interpretable One-class SVMs with Deep Learning and Random Fourier features
One-class support vector machine (OC-SVM) has long been one of the
most effective anomaly detection methods and is extensively adopted in both
research and industrial applications. The biggest remaining issue for OC-SVM is
its limited capability to operate with large and high-dimensional datasets due to
optimization complexity. These problems might be mitigated via dimensionality
reduction techniques such as manifold learning or autoencoders. However,
previous work often treats representation learning and anomaly prediction
separately. In this paper, we propose an autoencoder-based one-class support
vector machine (AE-1SVM) that brings OC-SVM, with the aid of random Fourier
features to approximate the radial basis kernel, into the deep learning context by
combining it with a representation learning architecture and jointly exploiting
stochastic gradient descent to obtain end-to-end training. Interestingly, this
also opens up the possible use of gradient-based attribution methods to explain
the decision making for anomaly detection, which has long been challenging as a
result of the implicit mappings between the input space and the kernel space.
To the best of our knowledge, this is the first work to study the
interpretability of deep learning in anomaly detection. We evaluate our method
on a wide range of unsupervised anomaly detection tasks in which our end-to-end
training architecture achieves a performance significantly better than the
previous work using separate training. Comment: Accepted at European Conference on Machine Learning and Principles
and Practice of Knowledge Discovery in Databases (ECML-PKDD) 201
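The abstract above mentions random Fourier features as a way to approximate the radial basis kernel. The following is a minimal, standard-library sketch of that general idea only; it is not the AE-1SVM architecture or its training procedure. The function names (`rbf_kernel`, `make_rff_map`, `dot`) and the kernel parameterization k(x, y) = exp(-gamma * ||x - y||^2) are assumptions chosen for illustration.

```python
import math
import random


def rbf_kernel(x, y, gamma):
    """Exact RBF kernel exp(-gamma * ||x - y||^2) (assumed parameterization)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def make_rff_map(dim, n_features, gamma, seed=0):
    """Build a random feature map z such that z(x) . z(y) approximates the RBF kernel.

    Frequencies are drawn from N(0, 2 * gamma * I) and phase offsets from
    Uniform(0, 2 * pi); both are fixed once so the same map is used for all inputs.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, sigma) for _ in range(dim)]
               for _ in range(n_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def feature_map(x):
        return [
            scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(weights, offsets)
        ]

    return feature_map


def dot(u, v):
    """Plain inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))
```

With such an explicit feature map, the kernel value is replaced by an ordinary inner product, `dot(phi(x), phi(y))`, which is what allows a kernel method to be expressed with finite-dimensional features and trained with gradient-based optimizers. The approximation error shrinks as `n_features` grows.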
‘Together … for only a moment’ British newspaper constructions of altruistic non-commercial surrogate motherhood
Objectives: To explore how national altruistic surrogacy is framed in a representative selection of the British press.
Methods: A study of 90 British national newspaper articles was carried out using the Lexis-Nexis database to search for articles on altruistic surrogacy. Content analysis of gain, loss, neutral frames and high or low alarm and vulnerability frames in the titles and the body of the text was carried out. The type of construction used in the article content was also analysed. Data were coded and consensus reached using a coding strategy specifically developed for the purposes of this study.
Results: Titles and content were predominantly loss, high alarm and high vulnerability framed. The content was also gain framed, and written with a focus on the social and legal aspects differentially between the newspaper types.
Discussion: The tabloid press emphasizes social issues, and the middle-market and serious press focus on legal issues of altruistic surrogacy. Selectively framed and reinforced information provided by the different newspapers reflects their different readerships, with tabloid readers likely to be surrogates (mostly from lower socioeconomic strata) and serious/middle-market readers likely to be commissioning parents (mostly professionals)
Primitive Words, Free Factors and Measure Preservation
Let F_k be the free group on k generators. A word w \in F_k is called
primitive if it belongs to some basis of F_k. We investigate two criteria for
primitivity, and consider more generally, subgroups of F_k which are free
factors.
The first criterion is graph-theoretic and uses Stallings core graphs: given
subgroups of finite rank H \le J \le F_k we present a simple procedure to
determine whether H is a free factor of J. This yields, in particular, a
procedure to determine whether a given element in F_k is primitive.
Again let w \in F_k and consider the word map w:G x G x ... x G \to G (from
the direct product of k copies of G to G), where G is an arbitrary finite
group. We call w measure preserving if given uniform measure on G x G x ... x
G, w induces uniform measure on G (for every finite G). This is the second
criterion we investigate: it is not hard to see that primitivity implies
measure preservation and it was conjectured that the two properties are
equivalent. Our combinatorial approach to primitivity allows us to make
progress on this problem and in particular prove the conjecture for k=2.
It was asked whether the primitive elements of F_k form a closed set in the
profinite topology of free groups. Our results provide a positive answer for
F_2. Comment: This is a unified version of two manuscripts: "On Primitive words I:
A New Algorithm", and "On Primitive Words II: Measure Preservation". 42
pages, 14 figures. Some parts of the paper reorganized towards publication in
the Israel J. of Mat
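The measure-preservation criterion described above can be checked by brute force on one small group. The sketch below enumerates the word map on G = S_3 for k = 2 and tests whether every element of G is hit equally often. It is only an illustration of the definition: a word is measure preserving only if this holds for every finite G, so passing the check on S_3 proves nothing in general, while failing it does rule the word out. The word encoding (a list of `(generator_index, exponent)` pairs with exponent +1 or -1) and all function names are assumptions made for this example.

```python
from collections import Counter
from itertools import permutations, product


def compose(p, q):
    """Composition of permutations given as tuples: (p o q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)


def inverse(p):
    """Inverse of a permutation given as a tuple."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def evaluate(word, assignment):
    """Evaluate a word on a tuple of group elements (one per generator)."""
    result = tuple(range(len(assignment[0])))  # identity permutation
    for gen, exp in word:
        g = assignment[gen] if exp == 1 else inverse(assignment[gen])
        result = compose(result, g)
    return result


def is_uniform_on(word, k, group):
    """True if the word map G^k -> G hits each element exactly |G|^(k-1) times."""
    counts = Counter(evaluate(word, a) for a in product(group, repeat=k))
    expected = len(group) ** (k - 1)
    return all(counts[g] == expected for g in group)


S3 = list(permutations(range(3)))
XY = [(0, 1), (1, 1)]                            # x y, part of the basis {xy, y}
X_SQUARED = [(0, 1), (0, 1)]                     # x^2
COMMUTATOR = [(0, 1), (1, 1), (0, -1), (1, -1)]  # x y x^-1 y^-1
```

Here `is_uniform_on(XY, 2, S3)` holds, since for each fixed y the map x -> xy is a bijection, whereas the word maps of `X_SQUARED` and `COMMUTATOR` are not uniform on S_3.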
Generalization Error in Deep Learning
Deep learning models have lately shown great performance in various fields
such as computer vision, speech recognition, speech translation, and natural
language processing. However, alongside their state-of-the-art performance, it
is still generally unclear what the source of their generalization ability is.
Thus, an important question is what makes deep neural networks able to
generalize well from the training set to new data. In this article, we provide
an overview of the existing theory and bounds for the characterization of the
generalization error of deep neural networks, combining both classical and more
recent theoretical and empirical results
Coarse-Graining and Self-Dissimilarity of Complex Networks
Can complex engineered and biological networks be coarse-grained into smaller
and more understandable versions in which each node represents an entire
pattern in the original network? To address this, we define coarse-graining
units (CGU) as connectivity patterns which can serve as the nodes of a
coarse-grained network, and present algorithms to detect them. We use this
approach to systematically reverse-engineer electronic circuits, forming
understandable high-level maps from incomprehensible transistor wiring: first,
a coarse-grained version in which each node is a gate made of several
transistors is established. Then, the coarse-grained network is itself
coarse-grained, resulting in a high-level blueprint in which each node is a
circuit-module made of multiple gates. We apply our approach also to a
mammalian protein-signaling network, to find a simplified coarse-grained
network with three main signaling channels that correspond to cross-interacting
MAP-kinase cascades. We find that both biological and electronic networks are
'self-dissimilar', with different network motifs found at each level. The
present approach can be used to simplify a wide variety of directed and
nondirected, natural and designed networks. Comment: 11 pages, 11 figure
Private Incremental Regression
Data is continuously generated by modern data sources, and a recent challenge
in machine learning has been to develop techniques that perform well in an
incremental (streaming) setting. In this paper, we investigate the problem of
private machine learning, where, as is common in practice, the data is not given at
once, but rather arrives incrementally over time.
We introduce the problems of private incremental ERM and private incremental
regression where the general goal is to always maintain a good empirical risk
minimizer for the history observed under differential privacy. Our first
contribution is a generic transformation of private batch ERM mechanisms into
private incremental ERM mechanisms, based on a simple idea of invoking the
private batch ERM procedure at some regular time intervals. We take this
construction as a baseline for comparison. We then provide two mechanisms for
the private incremental regression problem. Our first mechanism is based on
privately constructing a noisy incremental gradient function, which is then
used in a modified projected gradient procedure at every timestep. This
mechanism has an excess empirical risk of \approx \sqrt{d}, where d is the
dimensionality of the data. While from the results of [Bassily et al. 2014]
this bound is tight in the worst-case, we show that certain geometric
properties of the input and constraint set can be used to derive significantly
better results for certain interesting regression problems. Comment: To appear in PODS 201
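The baseline described in the abstract above, invoking a batch ERM procedure at regular time intervals, can be sketched schematically as follows. This shows only the refresh schedule. The class name `IncrementalFromBatch`, its parameters, and the `batch_erm` callable are hypothetical, and the sketch does not address how a privacy budget would be divided across invocations or how the batch mechanism adds noise; those details are in the paper, not here.

```python
class IncrementalFromBatch:
    """Wrap a batch procedure so it can be queried after every arriving point.

    The batch procedure is re-run on the full history every `interval` points;
    between refreshes the most recent output is returned unchanged.
    """

    def __init__(self, batch_erm, interval, initial):
        if interval < 1:
            raise ValueError("interval must be a positive integer")
        self.batch_erm = batch_erm  # hypothetical stand-in for a private batch ERM
        self.interval = interval
        self.current = initial      # estimate reported before the first refresh
        self.history = []

    def observe(self, point):
        """Record a new data point and return the current estimate."""
        self.history.append(point)
        if len(self.history) % self.interval == 0:
            # Refresh on the whole history observed so far.
            self.current = self.batch_erm(list(self.history))
        return self.current


def run_stream(mechanism, stream):
    """Feed a stream through the mechanism and collect the estimate at each step."""
    return [mechanism.observe(point) for point in stream]
```

For example, with a non-private sample mean standing in for the batch procedure and `interval=3`, the estimate changes only at timesteps 3, 6, 9, and so on. A longer interval means fewer invocations of the batch procedure but staler estimates in between.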
Male frequent attenders of general practice and their help seeking preferences
Background: Low rates of health service usage by men are commonly linked to masculine values and traditional male gender roles. However, not all men conform to these stereotypical notions of masculinity, with some men choosing to attend health services on a frequent basis, for a variety of different reasons. This study draws upon the accounts of male frequent attenders of the General Practitioner's (GP) surgery, examining their help-seeking preferences and their reasons for choosing services within general practice over other sources of support.
Methods: The study extends thematic analysis of interview data from the Self Care in Primary Care study (SCinPC), a large-scale multi-method evaluation study of a self care programme delivered to frequent attenders of general practice. Data were collected from 34 semi-structured interviews conducted with men prior to their exposure to the intervention.
Results: The ages of interviewed men ranged from 16 to 72 years, and 91% of the sample (n = 31) stated that they had a current health condition. The thematic analysis exposed diverse perspectives within male help-seeking preferences and the decision-making behind men's choice of services. The study also draws attention to the large variation in men's knowledge of available health services, particularly alternatives to general practice. Furthermore, the data revealed some men's lack of confidence in existing alternatives to general practice.
Conclusions: The study highlights the complex nature of male help-seeking preferences, and provides evidence that there should be no 'one size fits all' approach to male service provision. It also provides impetus for conducting further studies into this under-researched area of interest. © 2011 WPMH GmbH
Identification of novel subgroup A variants with enhanced receptor binding and replicative capacity in primary isolates of anaemogenic strains of feline leukaemia virus
Background: The development of anaemia in feline leukaemia virus (FeLV)-infected cats is associated with the emergence of a novel viral subgroup, FeLV-C. FeLV-C arises from the subgroup that is transmitted, FeLV-A, through alterations in the amino acid sequence of the receptor binding domain (RBD) of the envelope glycoprotein that result in a shift in the receptor usage and the cell tropism of the virus. The factors that influence the transition from subgroup A to subgroup C remain unclear; one possibility is that a selective pressure in the host drives the acquisition of mutations in the RBD, creating A/C intermediates with enhanced abilities to interact with the FeLV-C receptor, FLVCR. In order to understand further the emergence of FeLV-C in the infected cat, we examined primary isolates of FeLV-C for evidence of FeLV-A variants that bore mutations consistent with a gradual evolution from FeLV-A to FeLV-C.
Results: Within each isolate of FeLV-C, we identified variants that were ostensibly subgroup A by nucleic acid sequence comparisons, but which bore mutations in the RBD. One such mutation, N91D, was present in multiple isolates and when engineered into a molecular clone of the prototypic FeLV-A (Glasgow-1), enhanced replication was noted in feline cells. Expression of the N91D Env on murine leukaemia virus (MLV) pseudotypes enhanced viral entry mediated by the FeLV-A receptor THTR1 while soluble FeLV-A Env bearing the N91D mutation bound more efficiently to mouse or guinea pig cells bearing the FeLV-A and -C receptors. Long-term in vitro culture of variants bearing the N91D substitution in the presence of anti-FeLV gp70 antibodies did not result in the emergence of FeLV-C variants, suggesting that additional selective pressures in the infected cat may drive the subsequent evolution from subgroup A to subgroup C.
Conclusions: Our data support a model in which variants of FeLV-A, bearing subtle differences in the RBD of Env, may be predisposed towards enhanced replication in vivo and subsequent conversion to FeLV-C. The selection pressures in vivo that drive the emergence of FeLV-C in a proportion of infected cats remain to be established
