1,950,207 research outputs found

    An integrated architecture for shallow and deep processing

    Get PDF
    We present an architecture for the integration of shallow and deep NLP components which is aimed at flexible combination of different language technologies for a range of practical current and future applications. In particular, we describe the integration of a high-level HPSG parsing system with different high-performance shallow components, ranging from named entity recognition to chunk parsing and shallow clause recognition. The NLP components enrich a representation of natural language text with layers of new XML meta-information using a single shared data structure, called the text chart. We describe details of the integration methods, and show how information extraction and language checking applications for realworld German text benefit from a deep grammatical analysis

    Generating multimedia presentations: from plain text to screenplay

    Get PDF
    In many Natural Language Generation (NLG) applications, the output is limited to plain text – i.e., a string of words with punctuation and paragraph breaks, but no indications for layout, or pictures, or dialogue. In several projects, we have begun to explore NLG applications in which these extra media are brought into play. This paper gives an informal account of what we have learned. For coherence, we focus on the domain of patient information leaflets, and follow an example in which the same content is expressed first in plain text, then in formatted text, then in text with pictures, and finally in a dialogue script that can be performed by two animated agents. We show how the same meaning can be mapped to realisation patterns in different media, and how the expanded options for expressing meaning are related to the perceived style and tone of the presentation. Throughout, we stress that the extra media are not simple added to plain text, but integrated with it: thus the use of formatting, or pictures, or dialogue, may require radical rewording of the text itself

    Internal Pattern Matching Queries in a Text and Applications

    Full text link
    We consider several types of internal queries: questions about subwords of a text. As the main tool we develop an optimal data structure for the problem called here internal pattern matching. This data structure provides constant-time answers to queries about occurrences of one subword xx in another subword yy of a given text, assuming that y=O(x)|y|=\mathcal{O}(|x|), which allows for a constant-space representation of all occurrences. This problem can be viewed as a natural extension of the well-studied pattern matching problem. The data structure has linear size and admits a linear-time construction algorithm. Using the solution to the internal pattern matching problem, we obtain very efficient data structures answering queries about: primitivity of subwords, periods of subwords, general substring compression, and cyclic equivalence of two subwords. All these results improve upon the best previously known counterparts. The linear construction time of our data structure also allows to improve the algorithm for finding δ\delta-subrepetitions in a text (a more general version of maximal repetitions, also called runs). For any fixed δ\delta we obtain the first linear-time algorithm, which matches the linear time complexity of the algorithm computing runs. Our data structure has already been used as a part of the efficient solutions for subword suffix rank & selection, as well as substring compression using Burrows-Wheeler transform composed with run-length encoding.Comment: 31 pages, 9 figures; accepted to SODA 201

    Liouville Type Theorem For A Nonlinear Neumann Problem

    Full text link
    Consider the following nonlinear Neumann problem {div(yau(x,y))=0,for (x,y)R+n+1limy0+yauy=f(u),on R+n+1,u0in R+n+1, \begin{cases} \text{div}\left(y^{a}\nabla u(x,y)\right)=0, & \text{for }(x,y)\in\mathbb{R}_{+}^{n+1}\\ \lim_{y\rightarrow0+}y^{a}\frac{\partial u}{\partial y}=-f(u), & \text{on }\partial\mathbb{R}_{+}^{n+1},\\ u\ge0 & \text{in }\mathbb{R}_{+}^{n+1}, \end{cases} a(1,1)a\in(-1,1). A Liouville type theorem and its applications are given under suitable conditions on ff. Our tool is the famous moving plane method.Comment: This paper has been withdrawn by the author due to a poor writin

    Combining textual and visual information processing for interactive video retrieval: SCHEMA's participation in TRECVID 2004

    Get PDF
    In this paper, the two different applications based on the Schema Reference System that were developed by the SCHEMA NoE for participation to the search task of TRECVID 2004 are illustrated. The first application, named ”Schema-Text”, is an interactive retrieval application that employs only textual information while the second one, named ”Schema-XM”, is an extension of the former, employing algorithms and methods for combining textual, visual and higher level information. Two runs for each application were submitted, I A 2 SCHEMA-Text 3, I A 2 SCHEMA-Text 4 for Schema-Text and I A 2 SCHEMA-XM 1, I A 2 SCHEMA-XM 2 for Schema-XM. The comparison of these two applications in terms of retrieval efficiency revealed that the combination of information from different data sources can provide higher efficiency for retrieval systems. Experimental testing additionally revealed that initially performing a text-based query and subsequently proceeding with visual similarity search using one of the returned relevant keyframes as an example image is a good scheme for combining visual and textual information

    Leveraging Deep Graph-Based Text Representation for Sentiment Polarity Applications

    Full text link
    Over the last few years, machine learning over graph structures has manifested a significant enhancement in text mining applications such as event detection, opinion mining, and news recommendation. One of the primary challenges in this regard is structuring a graph that encodes and encompasses the features of textual data for the effective machine learning algorithm. Besides, exploration and exploiting of semantic relations is regarded as a principal step in text mining applications. However, most of the traditional text mining methods perform somewhat poor in terms of employing such relations. In this paper, we propose a sentence-level graph-based text representation which includes stop words to consider semantic and term relations. Then, we employ a representation learning approach on the combined graphs of sentences to extract the latent and continuous features of the documents. Eventually, the learned features of the documents are fed into a deep neural network for the sentiment classification task. The experimental results demonstrate that the proposed method substantially outperforms the related sentiment analysis approaches based on several benchmark datasets. Furthermore, our method can be generalized on different datasets without any dependency on pre-trained word embeddings.Comment: 33 pages, 6 figures, 6 Tables, Accepted for publication in Expert Systems With Applications Journa
    corecore