    Thematic Annotation: extracting concepts out of documents

    Contrary to standard approaches to topic annotation, the technique used in this work does not rely centrally on keyword extraction, statistical or otherwise. Instead, the proposed annotation algorithm uses a large-scale semantic database, the EDR Electronic Dictionary, that provides a concept hierarchy based on hyponym and hypernym relations. This concept hierarchy is used to generate a synthetic representation of the document by aggregating the words present in topically homogeneous document segments into a set of concepts that best preserves the document's content. This new extraction technique uses an unexplored approach to topic selection. Instead of applying semantic similarity measures based on a semantic resource, the latter is processed to extract the part of the conceptual hierarchy relevant to the document content. This conceptual hierarchy is then searched to extract the most relevant set of concepts to represent the topics discussed in the document. Notice that this algorithm is able to extract generic concepts that are not directly present in the document. Comment: Technical report EPFL/LIA. 81 pages, 16 figures.
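
    The aggregation idea in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy `HYPERNYM` table stands in for the EDR hierarchy, and the `min_cover` selection rule is a hypothetical stand-in for the report's actual search procedure, which the abstract does not specify.

```python
from collections import Counter

# Hypothetical toy hierarchy (child -> hypernym). This is NOT EDR data.
HYPERNYM = {
    "dog": "mammal", "cat": "mammal", "sparrow": "bird",
    "mammal": "animal", "bird": "animal", "animal": "entity",
}

def ancestors(term):
    """Return the chain of hypernyms above `term`, nearest first."""
    chain = []
    while term in HYPERNYM:
        term = HYPERNYM[term]
        chain.append(term)
    return chain

def relevant_subhierarchy(words):
    """Count, for each concept, how many segment words it subsumes."""
    cover = Counter()
    for word in words:
        for concept in ancestors(word):
            cover[concept] += 1
    return cover

def pick_concepts(words, min_cover=2):
    """For each word, keep its most specific ancestor covering >= min_cover words.

    The threshold rule is an illustrative assumption, not the report's criterion.
    """
    cover = relevant_subhierarchy(words)
    chosen = set()
    for word in words:
        for concept in ancestors(word):
            if cover[concept] >= min_cover:
                chosen.add(concept)
                break
    return chosen

segment = ["dog", "cat", "sparrow"]
print(pick_concepts(segment))
```

    For this segment the sketch selects "mammal" and "animal", neither of which occurs in the segment itself, which mirrors the abstract's remark about extracting generic concepts not directly present in the document.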

    High-SIR Transmission Capacity of Wireless Networks with General Fading and Node Distribution

    In many wireless systems, interference is the main performance-limiting factor, and is primarily dictated by the locations of concurrent transmitters. In many earlier works, the locations of the transmitters are modeled as a Poisson point process (PPP) for analytical tractability. While analytically convenient, the PPP only accurately models networks whose nodes are placed independently and use ALOHA as the channel access protocol, which preserves the independence. Correlations between transmitter locations in non-Poisson networks, which model intelligent access protocols, make the outage analysis extremely difficult. In this paper, we take an alternative approach and focus on an asymptotic regime where the density of interferers $\eta$ goes to 0. We prove for general node distributions and fading statistics that the success probability $p \sim 1-\gamma \eta^{\kappa}$ for $\eta \rightarrow 0$, and provide values of $\gamma$ and $\kappa$ for a number of important special cases. We show that $\kappa$ is lower bounded by 1 and upper bounded by a value that depends on the path loss exponent and the fading. This new analytical framework is then used to characterize the transmission capacity of a very general class of networks, defined as the maximum spatial density of active links given an outage constraint. Comment: Submitted to IEEE Trans. Info Theory special issue.
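
    As a rough illustration of how the asymptotic form 1 − γη^κ relates to an outage constraint, the sketch below inverts the approximation for a target outage ε. The function names and the direct inversion are assumptions made here; the approximation only holds for small η, and the paper's exact definition of transmission capacity may include further factors not modeled in this sketch.

```python
def success_probability(eta, gamma, kappa):
    """Asymptotic approximation 1 - gamma * eta**kappa (meaningful only for small eta)."""
    return 1.0 - gamma * eta ** kappa

def max_interferer_density(epsilon, gamma, kappa):
    """Largest eta satisfying gamma * eta**kappa <= epsilon.

    Hypothetical helper: inverts the approximation above for an outage
    constraint epsilon. The abstract states kappa >= 1.
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    if gamma <= 0.0 or kappa < 1.0:
        raise ValueError("expected gamma > 0 and kappa >= 1")
    return (epsilon / gamma) ** (1.0 / kappa)

# Illustrative values only; gamma and kappa depend on the network model.
eta_max = max_interferer_density(epsilon=0.01, gamma=4.0, kappa=2.0)
print(eta_max, success_probability(eta_max, gamma=4.0, kappa=2.0))
```

    With κ = 1 the admissible density scales linearly in ε, while larger κ gives the gentler scaling ε^(1/κ).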

    Nonparametric analysis of unbalanced paired-comparison or ranked data

    Suppose we have $t$ objects $C_1, \ldots, C_t$, and that objects $C_i$ and $C_j$ are judged pairwise in $n_{ij}$ independent comparisons, for $i, j = 1, \ldots, t$; $i \neq j$. In the simplest of such 'paired-comparison' experiments, all pairs of objects are compared an equal number of times (i.e., all $n_{ij} = n$); much of the paired-comparison literature pertains to the design and analysis of such 'completely balanced' experiments. Yet it is often inconvenient or impractical to carry out such a design: some pairs of objects might be compared more often than others, and some pairs might not be compared at all. Most of the available methods for analysis of unbalanced paired-comparison data are parametric, in the sense that a (paired-comparison) linear model generates, for each pair of objects, the 'preference probability' $\pi_{ij}$ with which $C_i$ is preferred to $C_j$. The few existing nonparametric approaches are critically examined. David (1987) proposes a simple method of scoring objects from unbalanced paired-comparison data that takes into account differences in the strength of the competition encountered by each object as well as possible differences in the number of comparisons on each pair of objects. Statistical properties of the proposed scores are developed for the general unstructured case and for special cases of partial balance, such as when objects are arranged in a group divisible design. The asymptotic distribution of these scores leads to several approximate tests of hypotheses, including a test for equality of the objects. Through some numerical examples, the proposed method is compared with the few other nonparametric methods designed for unbalanced data. The approach is then extended to unbalanced ranked data. It is shown that the previous nonparametric rank approaches fail to account adequately for the aspects of unbalanced data of concern in this dissertation. Numerical examples of unbalanced ranked data illustrate the comparison between the proposed method and the existing rank methods. Reference: David, H. A. (1987). Ranking from unbalanced paired-comparison data. Biometrika 74, 2, 432-6.
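
    A minimal sketch of a David-style score may help make the scoring idea concrete. It assumes one commonly cited formulation, S = w + w2 − l − l2 computed from win proportions; whether this matches the exact definition used in the dissertation is an assumption, and the function name `david_scores` is hypothetical.

```python
def david_scores(wins):
    """Score objects from unbalanced paired-comparison counts.

    wins[i][j] = number of times object i was preferred to object j.
    Assumed formulation: S = w + w2 - l - l2 using win proportions,
    so pairs with unequal numbers of comparisons are put on one scale.
    """
    t = len(wins)
    # P[i][j]: proportion of i-vs-j comparisons won by i (0 if never compared).
    P = [[0.0] * t for _ in range(t)]
    for i in range(t):
        for j in range(t):
            n_ij = wins[i][j] + wins[j][i]
            if i != j and n_ij > 0:
                P[i][j] = wins[i][j] / n_ij
    w = [sum(P[i]) for i in range(t)]                       # wins
    l = [sum(P[j][i] for j in range(t)) for i in range(t)]  # losses
    # Second-order terms weight each win (loss) by the opponent's own record.
    w2 = [sum(P[i][j] * w[j] for j in range(t)) for i in range(t)]
    l2 = [sum(P[j][i] * l[j] for j in range(t)) for i in range(t)]
    return [w[i] + w2[i] - l[i] - l2[i] for i in range(t)]

# Unbalanced toy data: A beat B three times, B beat C twice, A and C never met.
wins = [[0, 3, 0],
        [0, 0, 2],
        [0, 0, 0]]
print(david_scores(wins))  # [2.0, 0.0, -2.0]
```

    In this toy example A never meets C, yet A still outranks B because A's win came against an opponent that itself won a comparison, which is the "strength of competition" adjustment the abstract describes.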

    Synthesis of conformationally restrained peptides

    The synthesis of an artificial amino acid residue, bearing two α-amino acid centres, is detailed. The residue has been designed to act as a conformational restraint when incorporated into peptides. The intended target structural motif is the α-helix, and the restraint takes the form of a macrocycle in a central position in the peptide chain, which is intended to nucleate helix formation. The synthesis has been achieved by the use of two different asymmetric methodologies. Details of the final synthetic route to the residue are included, as well as details of several other synthetic routes which proved unsuccessful. The final route involves the use of an octanoic acid derivative. This is initially reacted with a chiral lithiated pyrazine cyanocuprate complex to generate the R-chiral centre, followed by the introduction of the S-chiral centre using an asymmetric azidation methodology. These reactions have been employed in sequence to give optimum yield and efficiency. The sequence of reactions followed also simplifies the differentiation of the two chiral centres, giving the molecule in a form suitable for solid phase peptide synthesis. The attempted syntheses of peptides bearing this residue are also detailed. This process has been performed by standard Fmoc methodology, using the triply orthogonal allyl-based protecting group, cleaved by palladium catalysis, to allow selective reaction to form the macrocycle. This loop is arranged in an i-(i+4) substitution pattern, suggested in the literature to be the most effective spacing for performing this task. Other sections of this thesis describe the general background to helical structure stabilisation, the asymmetric synthesis of amino acids and the solid phase synthesis of peptides.

    Unsupervised Natural Question Answering with a Small Model

    The recent (2019-02) demonstration of the power of huge language models such as GPT-2 to memorise the answers to factoid questions raises questions about the extent to which knowledge is being embedded directly within these large models. This short paper describes an architecture through which much smaller models can also answer such questions by making use of 'raw' external knowledge. The contribution of this work is that the methods presented here rely on unsupervised learning techniques, complementing the unsupervised training of the Language Model. The goal of this line of research is to be able to add knowledge explicitly, without extensive training. Comment: Accepted paper for FEVER workshop at EMNLP-IJCNLP 2019. (4 pages + references)