157,956 research outputs found

Sentence Complexity in French: a Corpus-Based Approach

Author: Tanguy Ludovic
Tulechki Nikola
Publication venue: HAL CCSD
Publication date: 01/01/2009

Language complexity is a notion widely used in a number of linguistic fields and language applications, and can be described by a number of linguistic features and practical measures. This work proposes a closer, data-oriented look at sentence complexity. Starting from a number of different studies, we selected and implemented 52 linguistic features and measured them on a corpus of varied French texts. Using statistical methods, we identify five underlying dimensions of sentence complexity. In addition to providing a better understanding of the phenomenon, these dimensions have been used in some information retrieval experiments.

CiteSeerX

Scientific Publications of the University of Toulouse II Le Mirail

HAL Descartes

Text Augmentation: Inserting markup into natural language text with PPM Models

Author: Yeates Stuart Andrew
Publication venue: The University of Waikato
Publication date: 01/01/2006

This thesis describes a new optimisation and new heuristics for automatically marking up XML documents. These are implemented in CEM, using PPM models. CEM is significantly more general than previous systems, marking up large numbers of hierarchical tags, using n-gram models for large n and a variety of escape methods. Four corpora are discussed, including the bibliography corpus of 14682 bibliographies laid out in seven standard styles using the BIBTEX system and marked up in XML with every field from the original BIBTEX. Other corpora include the ROCLING Chinese text segmentation corpus, the Computists’ Communique corpus and the Reuters’ corpus. A detailed examination is presented of the methods of evaluating markup algorithms, including computational complexity measures and correctness measures from the fields of information retrieval, string processing, machine learning and information theory. A new taxonomy of markup complexities is established and the properties of each taxon are examined in relation to the complexity of marked-up documents. The performance of the new heuristics and optimisation is examined using the four corpora.

CiteSeerX

Research Commons@Waikato

Musical Deep Learning: Stylistic Melodic Generation with Complexity Based Similarity

Author: Smith Benjamin D.
Publication date: 01/01/2017

The wide-ranging impact of deep learning models implies significant application in music analysis, retrieval, and generation. Initial findings from musical application of a conditional restricted Boltzmann machine (CRBM) show promise towards informing creative computation. Taking advantage of the CRBM’s ability to model temporal dependencies, full reconstructions of pieces are achievable given a few starting seed notes. The generation of new material using figuration from the training corpus requires restrictions on the size and memory space of the CRBM, forcing associative rather than perfect recall. Musical analysis and information complexity measures show the musical encoding to be the primary determinant of the nature of the generated results.

IUPUIScholarWorks

Video matching using DC-image and local features

Author: Ahmed Amr
Bekhet Saddam
Hunter Andrew
Publication venue: Newswood Limited/International Association of Engineers
Publication date: 01/01/2013

This paper presents a suggested framework for video matching based on local features extracted from the DC-image of MPEG compressed videos, without decompression. The relevant arguments and supporting evidence are discussed for developing video similarity techniques that work directly on compressed videos, without decompression, and especially utilising small size images. Two experiments are carried out to support the above. The first compares the DC-image and the I-frame in terms of matching performance and the corresponding computational complexity. The second experiment compares local features and global features in video matching, especially in the compressed domain and with the small size images. The results confirmed that the use of the DC-image, despite its highly reduced size, is promising, as it produces at least similar (if not better) matching precision compared to the full I-frame. Also, SIFT, as a local feature, outperforms most of the standard global features in precision. On the other hand, its computational complexity is relatively higher, but it is still within the real-time margin. There are also various optimisations that can be done to improve this computational complexity.

University of Lincoln Institutional Repository

Edge Hill University Research Information Repository

DC-image for real time compressed video matching

Author: Ahmed Amr
Bekhet Saddam
Hunter Andrew
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/01/2014

This chapter presents a suggested framework for video matching based on local features extracted from the DC-image of MPEG compressed videos, without full decompression. In addition, the relevant arguments and supporting evidence are discussed. Several local feature detectors will be examined to select the best for matching using the DC-image. Two experiments are carried out to support the above. The first compares the DC-image and the I-frame in terms of matching performance and computational complexity. The second experiment compares local features and global features for compressed video matching with respect to the DC-image. The results confirmed that the use of the DC-image, despite its highly reduced size, is promising, as it produces higher matching precision compared to the full I-frame. Also, SIFT, as a local feature, outperforms most of the standard global features. On the other hand, its computational complexity is relatively higher, but it is still within the real-time margin, which leaves space for further optimizations to improve this computational complexity.

University of Lincoln Institutional Repository

Ordinary Search Engine Users Carrying Out Complex Search Tasks

Author: Lewandowski Dirk
Norbisrath Ulrich
Singer Georg
Publication date: 04/07/2012

Web search engines have become the dominant tools for finding information on the Internet. Due to their popularity, users apply them to a wide range of search needs, from simple look-ups to rather complex information tasks. This paper presents the results of a study to investigate the characteristics of these complex information needs in the context of Web search engines. The aim of the study is to find out more about (1) what makes complex search tasks distinct from simple tasks and if it is possible to find simple measures for describing their complexity, (2) if search success for a task can be predicted by means of unique measures, and (3) if successful searchers show a different behavior than unsuccessful ones. The study includes 60 people who carried out a set of 12 search tasks with current commercial search engines. Their behavior was logged with the Search-Logger tool. The results confirm that complex tasks show significantly different characteristics than simple tasks. Yet it seems to be difficult to distinguish successful from unsuccessful search behaviors. Good searchers can be differentiated from bad searchers by means of measurable parameters. The implications of these findings for search engine vendors are discussed. Comment: 60 pages

arXiv.org e-Print Archive

REPOSIT

Stylistic Variation in an Information Retrieval Experiment

Author: Karlgren Jussi
Publication date: 01/01/1996

Texts exhibit considerable stylistic variation. This paper reports an experiment where a corpus of documents (N = 75 000) is analyzed using various simple stylistic metrics. A subset (n = 1000) of the corpus has been previously assessed to be relevant for answering given information retrieval queries. The experiment shows that this subset differs significantly from the rest of the corpus in terms of the stylistic metrics studied. Comment: Proceedings of NEMLAP-

arXiv.org e-Print Archive

CiteSeerX

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

Sequential Complexity as a Descriptor for Musical Similarity

Author: Dixon S
Foster P
Mauch M
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2014

We propose string compressibility as a descriptor of temporal structure in audio, for the purpose of determining musical similarity. Our descriptors are based on computing track-wise compression rates of quantised audio features, using multiple temporal resolutions and quantisation granularities. To verify that our descriptors capture musically relevant information, we incorporate our descriptors into similarity rating prediction and song year prediction tasks. We base our evaluation on a dataset of 15500 track excerpts of Western popular music, for which we obtain 7800 web-sourced pairwise similarity ratings. To assess the agreement among similarity ratings, we perform an evaluation under controlled conditions, obtaining a rank correlation of 0.33 between intersected sets of ratings. Combined with bag-of-features descriptors, we obtain performance gains of 31.1% and 10.9% for similarity rating prediction and song year prediction. For both tasks, analysis of selected descriptors reveals that representing features at multiple time scales benefits prediction accuracy. Comment: 13 pages, 9 figures, 8 tables. Accepted version

arXiv.org e-Print Archive

CiteSeerX

Crossref

Queen Mary Research Online

Affective feedback: an investigation into the role of emotions in the information seeking process

Author: Arapakis I.
Gray P.D.G.
Jose J.M.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2008

User feedback is considered to be a critical element in the information seeking process, especially in relation to relevance assessment. Current feedback techniques determine content relevance with respect to the cognitive and situational levels of interaction that occur between the user and the retrieval system. However, apart from real-life problems and information objects, users interact with intentions, motivations and feelings, which can be seen as critical aspects of cognition and decision-making. The study presented in this paper serves as a starting point for exploring the role of emotions in the information seeking process. Results show that the latter not only interweave with different physiological, psychological and cognitive processes, but also form distinctive patterns according to the specific task and the specific user.

Enlighten

A Semantic Similarity Measure for Expressive Description Logics

Author: d'Amato Claudia
Esposito Floriana
Fanizzi Nicola
Publication date: 01/01/2009

A totally semantic measure is presented which is able to calculate a similarity value between concept descriptions, between a concept description and an individual, or between individuals expressed in an expressive description logic. It is applicable to symbolic descriptions although it uses a numeric approach for the calculus. Considering that Description Logics stand as the theoretic framework for ontological knowledge representation and reasoning, the proposed measure can be effectively used for agglomerative and divisional clustering tasks applied to the semantic web domain. Comment: 13 pages, Appeared at CILC 2005, Convegno Italiano di Logica Computazionale also available at http://www.disp.uniroma2.it/CILC2005/downloads/papers/15.dAmato_CILC05.pd

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Bari