LaTeX, metadata, and publishing workflows
The field of scientific publishing served by LaTeX is increasingly
dependent on the availability of metadata about publications. We discuss how to
use LaTeX classes and BibTeX styles to curate metadata throughout the life
cycle of a published article. Our focus is on streamlining and automating much
of the publishing workflow. We survey the options and drawbacks of existing
approaches and outline our own approach, as applied in a new LaTeX style file
whose main goal is to let authors specify their metadata only once and reuse it
throughout the entire publishing pipeline. We believe this can help reduce the
cost of publishing by reducing the human effort required to edit and provide
publication metadata.
A fast portable implementation of the Secure Hash Algorithm, III.
In 1992, NIST announced a proposed standard for a collision-free hash function. The algorithm for producing the hash value is known as the Secure Hash Algorithm (SHA), and the standard using the algorithm is known as the Secure Hash Standard (SHS). Later, an announcement was made that a scientist at NSA had discovered a weakness in the original algorithm. A revision to this standard was then announced as FIPS 180-1, and includes a slight change to the algorithm that eliminates the weakness. This new algorithm is called SHA-1. In this report we describe a portable and efficient implementation of SHA-1 in the C language. Performance information is given, as well as tips for porting the code to other architectures. We conclude with some observations on the efficiency of the algorithm, and a discussion of how the efficiency of SHA might be improved.
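The report's C implementation is not reproduced here, but the input/output behavior it describes can be illustrated with Python's standard `hashlib`, which exposes the same SHA-1 algorithm. This is only a sketch of what the function computes, not the paper's code:

```python
import hashlib

# SHA-1 maps an arbitrary-length message to a fixed 160-bit (20-byte) digest.
# "abc" is the well-known FIPS 180-1 test vector.
digest = hashlib.sha1(b"abc").hexdigest()
print(digest)  # a9993e364706816aba3e25717850c26c9cd0d89d

# Flipping even one byte of the input yields an unrelated digest,
# which is the behavior a collision-free hash function is meant to provide.
print(hashlib.sha1(b"abd").hexdigest())
```

A portable C implementation, as in the report, would implement the same 80-round compression function directly over 512-bit message blocks.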
Geospatial Mapping and Navigation of the Web
Web pages may be organized, indexed, searched, and navigated along several different feature dimensions. We investigate different approaches to discovering geographic context for web pages, and describe a navigational tool for browsing web resources by geographic proximity.
Language Modeling and Encryption on Packet Switched Networks
The holy grail of a mathematical model of secure encryption is to devise a model that is both faithful in its description of the real world, and yet admits a construction for an encryption system that fulfills a meaningful definition of security against a realistic adversary. While enormous progress has been made during the last 60 years toward this goal, existing models of security still overlook features that are closely related to the fundamental nature of communication. As a result there is substantial doubt in this author’s mind as to whether there is any reasonable definition of “secure encryption” on the Internet.
Analysis of Anchor Text for Web Search
It has been observed that anchor text in web documents is very useful in improving the quality of web text search for some classes of queries. By examining properties of anchor text in a large intranet, we hope to shed light on why this is the case. Our main premise is that anchor text behaves very much like real user queries and consensus titles. Thus an understanding of how anchor text is related to a document will likely lead to better understanding of how to translate a user's query into high quality search results. Our approach is experimental, based on a study of a large corporate intranet, including the content as well as a large stream of queries against that content. We conduct experiments to investigate several aspects of anchor text, including their relationship to titles, the frequency of queries that can be satisfied by anchor text alone, and the homogeneity of results fetched by anchor text.
Untangling Compound Documents on the Web
Most text analysis is designed to deal with the concept of a "document", namely a cohesive presentation of thought on a unifying subject. By contrast, individual nodes on the World Wide Web tend to have a much smaller granularity than text documents. We claim that the notions of "document" and "web node" are not synonymous, and that authors often tend to deploy documents as collections of URLs, which we call "compound documents". In this paper we present new techniques for identifying and working with such compound documents, and the results of some large-scale studies on such web documents. The primary motivation for this work stems from the fact that information retrieval techniques are better suited to working on documents than individual hypertext nodes.
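The abstract does not spell out its identification techniques, but one simple heuristic in this spirit is to treat URLs that share a directory prefix as candidate members of one compound document. The following sketch is an illustrative assumption, not the paper's actual method:

```python
from collections import defaultdict
from urllib.parse import urlparse
import posixpath

def group_by_directory(urls):
    """Group URLs that share (host, directory) -- a naive heuristic for
    proposing candidate compound documents from a crawl's URL list."""
    groups = defaultdict(list)
    for url in urls:
        parsed = urlparse(url)
        directory = posixpath.dirname(parsed.path)
        groups[(parsed.netloc, directory)].append(url)
    # Only multi-node groups are interesting as compound-document candidates.
    return {key: members for key, members in groups.items() if len(members) > 1}

# Hypothetical example URLs: the /manual/ pages form one candidate group.
urls = [
    "http://example.com/manual/intro.html",
    "http://example.com/manual/ch1.html",
    "http://example.com/blog/post.html",
]
print(group_by_directory(urls))
```

A real system would combine such structural cues with link patterns and content analysis, since directory layout alone is an unreliable signal of authorial intent.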