Thematic Annotation: extracting concepts out of documents
Contrary to standard approaches to topic annotation, the technique used in
this work does not centrally rely on some sort of -- possibly statistical --
keyword extraction. In fact, the proposed annotation algorithm uses a large
scale semantic database -- the EDR Electronic Dictionary -- that provides a
concept hierarchy based on hyponym and hypernym relations. This concept
hierarchy is used to generate a synthetic representation of the document by
aggregating the words present in topically homogeneous document segments into a
set of concepts best preserving the document's content.
This new extraction technique uses an unexplored approach to topic selection.
Instead of using semantic similarity measures based on a semantic resource, the
latter is processed to extract the part of the conceptual hierarchy relevant to
the document content. Then this conceptual hierarchy is searched to extract the
most relevant set of concepts to represent the topics discussed in the
document. Notice that this algorithm is able to extract generic concepts that
are not directly present in the document.
Comment: Technical report EPFL/LIA. 81 pages, 16 figures
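The aggregation idea described above can be sketched as follows. This is a minimal illustration only: the tiny `HYPERNYM` table, the `min_cover` threshold, and the selection rule are all assumptions made for the example, and none of them reproduces the EDR dictionary or the report's actual algorithm.

```python
from collections import defaultdict

# Hypothetical toy hypernym links (child -> parent); not taken from the EDR dictionary.
HYPERNYM = {
    "dog": "mammal", "cat": "mammal", "mammal": "animal",
    "sparrow": "bird", "bird": "animal", "animal": "entity",
    "car": "vehicle", "truck": "vehicle", "vehicle": "entity",
}

def ancestors(concept):
    """Return the chain of hypernyms from `concept` up to the root."""
    chain = []
    while concept in HYPERNYM:
        concept = HYPERNYM[concept]
        chain.append(concept)
    return chain

def depth(concept):
    """Distance from the root of the hierarchy (the root has depth 0)."""
    return len(ancestors(concept))

def annotate(words, min_cover=2):
    """Select concepts that subsume at least `min_cover` words of a segment,
    preferring deeper (more specific) concepts over their generalisations."""
    covered = defaultdict(set)
    for w in set(words):
        for c in ancestors(w):
            covered[c].add(w)
    candidates = {c: ws for c, ws in covered.items() if len(ws) >= min_cover}
    chosen = []
    for c, ws in candidates.items():
        # Words already accounted for by strictly deeper candidate concepts.
        deeper = set()
        for c2, ws2 in candidates.items():
            if depth(c2) > depth(c):
                deeper |= ws2
        if not ws <= deeper:
            chosen.append(c)
    return sorted(chosen)

print(annotate(["dog", "cat", "car", "truck"]))
print(annotate(["dog", "sparrow"]))
```

In the second call the selected concept does not occur in the input words, which mirrors the abstract's remark that generic concepts absent from the document can be extracted.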
High-SIR Transmission Capacity of Wireless Networks with General Fading and Node Distribution
In many wireless systems, interference is the main performance-limiting
factor, and is primarily dictated by the locations of concurrent transmitters.
In many earlier works, the locations of the transmitters are modeled as a
Poisson point process (PPP) for analytical tractability. While analytically
convenient, the PPP only accurately models networks whose nodes are placed
independently and use ALOHA as the channel access protocol, which preserves the
independence. Correlations between transmitter locations in non-Poisson
networks, which model intelligent access protocols, make the outage analysis
extremely difficult. In this paper, we take an alternative approach and focus
on an asymptotic regime where the density of interferers goes to 0. We
prove for general node distributions and fading statistics that the success
probability $P_s \sim 1-\gamma \eta^{\kappa}$ for $\eta \to 0$, and
provide values of $\gamma$ and $\kappa$ for a number of important special
cases. We show that $\kappa$ is lower bounded by 1 and upper bounded by a value
that depends on the path loss exponent and the fading. This new analytical
framework is then used to characterize the transmission capacity of a very
general class of networks, defined as the maximum spatial density of active
links given an outage constraint.
Comment: Submitted to IEEE Trans. Info Theory special issue
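To make the link between the asymptotic success probability and an outage constraint concrete, here is a small sketch. It assumes the low-density approximation 1 - gamma * eta**kappa is used as if it were exact and inverts it for a target outage `eps`; the function names and numerical values are illustrative and do not come from the paper.

```python
def success_probability(eta, gamma, kappa):
    """Low-density approximation of the success probability: 1 - gamma * eta**kappa."""
    return 1.0 - gamma * eta ** kappa

def max_density(eps, gamma, kappa):
    """Largest eta for which the approximate outage gamma * eta**kappa stays at or below eps.

    Only meaningful for small eps, where the asymptotic form is a reasonable proxy.
    """
    if eps <= 0 or gamma <= 0 or kappa <= 0:
        raise ValueError("eps, gamma and kappa must be positive")
    return (eps / gamma) ** (1.0 / kappa)

# Illustrative values only (not results from the paper).
eta_star = max_density(eps=0.01, gamma=2.0, kappa=1.0)
print(eta_star, success_probability(eta_star, gamma=2.0, kappa=1.0))
```

Under this approximation a larger exponent kappa allows a higher density for the same small outage target, since the outage term then decays faster as eta shrinks.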
Nonparametric analysis of unbalanced paired-comparison or ranked data
Suppose we have $t$ objects $C_1, \ldots, C_t$, and that objects $C_i$ and $C_j$ are judged pairwise in $n_{ij}$ independent comparisons, for $i, j = 1, \ldots, t$; $i \neq j$. In the simplest of such 'paired-comparison' experiments, all pairs of objects are compared an equal number of times (i.e., all $n_{ij} = n$); much of the paired-comparison literature pertains to the design and analysis of such 'completely balanced' experiments. Yet it is often inconvenient or impractical to carry out such a design: some pairs of objects might be compared more often than others, and some pairs might not be compared at all. Most of the available methods for analysis of unbalanced paired-comparison data are parametric, in the sense that a (paired-comparison) linear model generates, for each pair of objects, the 'preference probability' $\pi_{ij}$ with which $C_i$ is preferred to $C_j$. The few existing nonparametric approaches are critically examined. David (1987) proposes a simple method of scoring objects from unbalanced paired-comparison data that takes into account differences in the strength of the competition encountered by each object as well as possible differences in the number of comparisons on each pair of objects. Statistical properties of the proposed scores are developed for the general unstructured case and for special cases of partial balance, such as when objects are arranged in a group divisible design. The asymptotic distribution of these scores leads to several approximate tests of hypotheses, including a test for equality of the objects. Through some numerical examples, the proposed method is compared with the few other nonparametric methods designed for unbalanced data. The approach is then extended to unbalanced ranked data. It is shown that the previous nonparametric rank approaches fail to account adequately for the aspects of unbalanced data of concern in this dissertation.
Numerical examples of unbalanced ranked data illustrate the comparison between the proposed method and the existing rank methods.
Reference: David, H. A. (1987). Ranking from unbalanced paired-comparison data. Biometrika, 74(2), 432-436.
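As a rough illustration of scoring objects from unbalanced paired comparisons, the sketch below computes a score of the form w + w2 - l - l2, a variant often attributed to David in later literature. The abstract does not state the formula, so the use of win proportions, the function name `davids_scores`, and the example data are assumptions rather than the dissertation's definition.

```python
def davids_scores(wins):
    """wins[i][j] = number of times object i was preferred to object j.

    Returns one score per object: w + w2 - l - l2, where w and l are summed
    win and loss proportions, and w2 and l2 weight them by the opponents'
    own w and l (a proxy for the strength of the competition faced).
    """
    t = len(wins)
    # P[i][j]: proportion of comparisons between i and j won by i (0 if never compared).
    P = [[0.0] * t for _ in range(t)]
    for i in range(t):
        for j in range(t):
            n_ij = wins[i][j] + wins[j][i]
            if i != j and n_ij > 0:
                P[i][j] = wins[i][j] / n_ij
    w = [sum(P[i][j] for j in range(t)) for i in range(t)]
    l = [sum(P[j][i] for j in range(t)) for i in range(t)]
    w2 = [sum(P[i][j] * w[j] for j in range(t)) for i in range(t)]
    l2 = [sum(P[j][i] * l[j] for j in range(t)) for i in range(t)]
    return [w[i] + w2[i] - l[i] - l2[i] for i in range(t)]

# Unbalanced toy data: objects 0 and 1 compared twice, 1 and 2 once, 0 and 2 never.
wins = [
    [0, 2, 0],
    [0, 0, 1],
    [0, 0, 0],
]
print(davids_scores(wins))
```

Object 0 is credited for beating an object that itself won a comparison, even though objects 0 and 2 never met directly.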
Synthesis of conformationally restrained peptides
The synthesis of an artificial amino acid residue, bearing two α-amino acid centres, is detailed. The residue has been designed to act as a conformational restraint when incorporated into peptides. The intended target structural motif is the α-helix, and the restraint takes the form of a macrocycle in a central position in the peptide chain, which is intended to nucleate helix formation. The synthesis has been achieved by the use of two different asymmetric methodologies. Details of the final synthetic route to the residue are included, as well as details of several other synthetic routes which proved unsuccessful. The final route involves the use of an octanoic acid derivative. This is initially reacted with a chiral lithiated pyrazine cyanocuprate complex to generate the R-chiral centre, followed by the introduction of the S-chiral centre using an asymmetric azidation methodology. These reactions have been employed in sequence to give optimum yield and efficiency. The sequence of reactions followed also simplifies the differentiation of the two chiral centres, giving the molecule in a form suitable for solid phase peptide synthesis. The attempted syntheses of peptides bearing this residue are also detailed.
This process has been performed by standard Fmoc methodology, using the triply orthogonal allyl-based protecting group, cleaved by palladium catalysis, to allow selective reaction to form the macrocycle. This loop is arranged in an i-(i+4) substitution pattern, suggested in the literature to be the most effective spacing for performing this task. Other sections of this thesis describe the general background to helical structure stabilisation, the asymmetric synthesis of amino acids and the solid phase synthesis of peptides.
Unsupervised Natural Question Answering with a Small Model
The recent (2019-02) demonstration of the power of huge language models such
as GPT-2 to memorise the answers to factoid questions raises questions about
the extent to which knowledge is being embedded directly within these large
models. This short paper describes an architecture through which much smaller
models can also answer such questions - by making use of 'raw' external
knowledge. The contribution of this work is that the methods presented here
rely on unsupervised learning techniques, complementing the unsupervised
training of the Language Model. The goal of this line of research is to be able
to add knowledge explicitly, without extensive training.
Comment: Accepted paper for FEVER workshop at EMNLP-IJCNLP 2019. (4 pages + references)
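One way to picture "making use of raw external knowledge" without supervised training is to retrieve a relevant passage and place it in front of the question before handing the text to a language model. The sketch below is only that picture: the TF-IDF-style scoring, the prompt layout, and the names `retrieve` and `build_prompt` are assumptions for illustration and are not the architecture described in the paper.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace after stripping basic punctuation."""
    for ch in "?.,!":
        text = text.replace(ch, " ")
    return text.lower().split()

def retrieve(question, passages):
    """Return the passage with the highest TF-IDF-weighted overlap with the question."""
    docs = [Counter(tokenize(p)) for p in passages]
    n = len(docs)

    def idf(term):
        df = sum(1 for d in docs if term in d)
        return math.log((1 + n) / (1 + df)) + 1.0

    q_terms = set(tokenize(question))
    scores = [sum(d[t] * idf(t) for t in q_terms) for d in docs]
    best = max(range(n), key=scores.__getitem__)
    return passages[best]

def build_prompt(question, passages):
    """Hypothetical prompt layout: retrieved context, then the question."""
    return f"{retrieve(question, passages)}\nQ: {question}\nA:"

passages = [
    "Paris is the capital of France.",
    "The Nile is a river in Africa.",
]
print(build_prompt("What is the capital of France?", passages))
```

No labelled question-answer pairs are needed for this step, since the passage scores depend only on word statistics of the raw text.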