    Viewing the Dictionary as a Classification System

    Information retrieval is one of the earliest applications of computers. Starting with the speculative work of Vannevar Bush on Memex [Bush 45], through the development of Key Word in Context (KWIC) indexing by H.P. Luhn [Luhn 60] and Boolean retrieval by John Horty [Horty 62], to the statistical techniques for automatic indexing and document retrieval developed in the 1960s and continuing to the present [Salton and McGill 83], information retrieval has continued to develop and progress. However, there is a growing consensus that current-generation statistical techniques have gone about as far as they can go, and that further improvement requires natural language processing and knowledge representation. We believe the best place to start is the lexicon: indexing documents not by words but by word senses. Why use word senses? Conventional approaches advocate either indexing by the words themselves or manual indexing with a controlled vocabulary. Manual indexing offers some of the advantages of word senses, in that the terms are not ambiguous, but it suffers from problems of consistency. In addition, as text databases continue to grow, it will only be possible to index a fraction of them by hand. In advocating word senses as indices we are not suggesting that they are the ultimate answer. There is much more to the meaning of a document than the senses of the words it contains; we are only saying that senses are a good start. Any approach to providing a semantic analysis must deal with the problem of word meaning. Existing retrieval systems try to go beyond single words by using a thesaurus, but this has the problem that words are not synonymous in all contexts. The word 'term' may be synonymous with 'word' (as in a vocabulary term), 'sentence' (as in a prison term), or 'condition' (as in 'terms of agreement'). If we expand the query with words from a thesaurus, we must be careful to use the right senses of those words. We not only have to know the sense of the word in the query (here, the sense of 'term'), but also the sense of the word used to augment it (e.g., the appropriate sense of 'sentence'). The thesaurus we use should be one in which the senses of words are explicitly indicated [Chodorow et al. 88]. We contend that the best place to obtain word senses is a machine-readable dictionary. Although another list of senses could be constructed by hand, that strategy risks overlooking some senses and entails a great deal of effort.
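The idea of indexing by sense rather than surface word can be sketched in a few lines. This is a hypothetical toy, not the authors' system: the sense labels and the `SENSES` inventory stand in for a machine-readable dictionary, and disambiguation is assumed to have already happened.

```python
# Toy sketch of sense-based indexing: tokens are (word, sense) pairs,
# so a query on the vocabulary sense of 'term' does not match documents
# that use 'term' in its prison-sentence sense.
from collections import defaultdict

# Hypothetical sense inventory standing in for a machine-readable dictionary.
SENSES = {
    ("term", "vocabulary"): "term#1",
    ("term", "prison"): "term#2",
    ("sentence", "grammar"): "sentence#1",
    ("sentence", "prison"): "sentence#2",
}

def build_index(docs):
    """docs: {doc_id: [(word, sense_label), ...]} -> inverted index on sense ids."""
    index = defaultdict(set)
    for doc_id, tokens in docs.items():
        for word, sense in tokens:
            index[SENSES[(word, sense)]].add(doc_id)
    return index

docs = {
    "d1": [("term", "vocabulary")],
    "d2": [("term", "prison"), ("sentence", "prison")],
}
index = build_index(docs)
# Both documents contain the surface word 'term', but each sense id
# retrieves only the document that uses that sense.
```

Thesaurus-based query expansion would then add only sense ids synonymous with the query sense (e.g., the prison sense of 'sentence' expands term#2, not term#1).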

    Viewing morphology as an inference process

    Morphology is the area of linguistics concerned with the internal structure of words. Information retrieval has generally not paid much attention to word structure, other than to account for some of the variability in word forms via the use of stemmers. We report on our experiments to determine the importance of morphology, and the effect that it has on performance. We found that grouping morphological variants makes a significant improvement in retrieval performance. Improvements are seen by grouping inflectional as well as derivational variants. We also found that performance was enhanced by recognizing lexical phrases. We describe the interaction between morphology and lexical ambiguity, and how resolving that ambiguity will lead to further improvements in performance.
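The grouping the abstract measures can be illustrated with a minimal conflation sketch. The suffix table below is a hand-written assumption, not the authors' morphological analyzer; it merely shows inflectional ('computes', 'computed') and derivational ('computation') variants collapsing to one index term.

```python
# Hypothetical suffix-stripping conflation, longest suffix first.
SUFFIXES = ["ation", "ing", "ed", "es", "s", "e"]

def conflate(word):
    """Map a word to a crude stem by stripping the first matching suffix."""
    for suf in SUFFIXES:
        if word.endswith(suf) and len(word) > len(suf) + 2:
            return word[: -len(suf)]
    return word

variants = ["computes", "computed", "computation", "compute"]
groups = {conflate(w) for w in variants}  # all four conflate to one stem
```

A real stemmer (and the lexical-phrase recognition the abstract mentions) is far more involved; the point is only that one index term now covers the whole variant group.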

    Wave Attenuation in Salt Marshes

    Salt marshes can provide several ecosystem services, including shoreline protection. One of the ways marshes do this is by attenuating waves. However, the rate at which waves are attenuated by marshes is highly variable. Previous studies have found that the rate of wave attenuation varies with water depth, vegetation characteristics, bottom profile, and wave conditions. This study examined how these factors affect wave attenuation rate at four marsh sites in coastal North Carolina with different vegetation and bottom profile characteristics. The rate of wave attenuation was found to increase with decreasing water depth and with increasing vegetation height and density. Sites with short vegetation, which is submerged more of the time, had larger rates of decrease in wave attenuation with increasing depth than sites with tall vegetation. Waves with long periods were attenuated less than waves with short periods. Attenuation rates were much lower, and decreased more with increasing water depth, during storm conditions than during normal conditions at the same site.
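Wave attenuation over a marsh is commonly modeled as exponential decay of wave height with distance; that model and the numbers below are assumptions for illustration, not values from this study.

```python
import math

def wave_height(H0, k, x):
    """Wave height after travelling distance x (m) through the marsh,
    under an assumed exponential decay model H(x) = H0 * exp(-k * x)."""
    return H0 * math.exp(-k * x)

# Illustrative numbers only: a larger decay coefficient k (shallower
# water, taller or denser vegetation) attenuates the wave faster.
H0 = 0.5  # incident wave height, m
sparse = wave_height(H0, k=0.01, x=50)  # weakly attenuating marsh
dense = wave_height(H0, k=0.05, x=50)   # strongly attenuating marsh
```

The study's depth and vegetation effects would appear here as changes in k from site to site and tide to tide.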

    The universality of iterated hashing over variable-length strings

    Iterated hash functions process strings recursively, one character at a time. At each iteration, they compute a new hash value from the preceding hash value and the next character. We prove that iterated hashing can be pairwise independent, but never 3-wise independent. We show that it can be almost universal over strings much longer than the number of hash values, and we bound the maximal string length given the collision probability.
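The recursive shape the abstract describes can be sketched directly. The mixing step below is a hypothetical linear recurrence chosen for brevity; real almost-universal families choose the per-character function and parameters carefully, and the abstract's point is that no such choice yields 3-wise independence.

```python
def iterated_hash(s, a, b, m=2**31 - 1):
    """Iterated hashing: h_{i+1} = (a * h_i + b * c_i) mod m,
    consuming one character c_i of the string per step."""
    h = 0
    for ch in s:
        h = (a * h + b * ord(ch)) % m
    return h

h1 = iterated_hash("hello", a=31, b=7)
h2 = iterated_hash("world", a=31, b=7)
```

With (a, b) drawn at random one gets a hash family; the paper's bounds concern how long strings can get before the family's collision probability degrades.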

    Inferring hierarchical descriptions


    Robust Authenticated-Encryption: AEZ and the Problem that it Solves

    With a scheme for robust authenticated-encryption a user can select an arbitrary value λ ≥ 0 and then encrypt a plaintext of any length into a ciphertext that is λ characters longer. The scheme must provide all the privacy and authenticity possible for the requested λ. We formalize and investigate this idea, and construct a well-optimized solution, AEZ, from the AES round function. Our scheme encrypts strings at almost the same rate as OCB-AES or CTR-AES (on Haswell, AEZ has a peak speed of about 0.7 cpb). To accomplish this we employ an approach we call prove-then-prune: prove security and then instantiate with a scaled-down primitive (e.g., reducing rounds for blockcipher calls).
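The defining interface property, ciphertext exactly λ bytes longer than the plaintext, can be made concrete with a toy. This is emphatically not AEZ: keystream and tag here come from HMAC-SHA256 purely to demonstrate the length behaviour, and a λ-byte truncated tag is all the authenticity λ buys.

```python
# Toy robust-AE *interface* sketch (not AEZ): caller picks expansion
# lam >= 0; ciphertext is len(msg) + lam bytes.
import hmac, hashlib

def encrypt(key, nonce, msg, lam):
    # Derive a keystream by chaining HMAC outputs (illustrative only).
    stream = hmac.new(key, b"enc" + nonce, hashlib.sha256).digest()
    while len(stream) < len(msg):
        stream += hmac.new(key, stream[-32:], hashlib.sha256).digest()
    body = bytes(m ^ s for m, s in zip(msg, stream))
    # Authenticity comes from a tag truncated to exactly lam bytes.
    tag = hmac.new(key, b"tag" + nonce + body, hashlib.sha256).digest()[:lam]
    return body + tag

ct = encrypt(b"k" * 16, b"n", b"attack at dawn", lam=4)
```

In the paper's framing, the scheme must squeeze the best achievable privacy and authenticity out of whatever λ the caller requested, including λ = 0.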

    OCB Mode

    This paper was prepared for NIST, which is considering new block-cipher modes of operation. It describes a parallelizable mode of operation that simultaneously provides both privacy and authenticity. OCB mode encrypts-and-authenticates an arbitrary message M ∈ {0,1}* using only ⌈|M|/n⌉ + 2 block-cipher invocations, where n is the block length of the underlying block cipher. Additional overhead is small. OCB refines a scheme, IAPM, suggested by Jutla [IACR-2000/39], who was the first to devise an authenticated-encryption mode with minimal overhead compared to standard modes. Desirable new properties of OCB include: very cheap offset calculations; operating on an arbitrary message M ∈ {0,1}*; producing ciphertexts of minimal length; using a single underlying cryptographic key; making a nearly optimal number of block-cipher calls; avoiding the need for a random IV; and rendering it infeasible for an adversary to find pretag collisions. The paper provides a full proof of security for OCB.
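The ⌈|M|/n⌉ + 2 call count is easy to check as arithmetic; the helper below is only that arithmetic, with |M| in bits and n the cipher's block length.

```python
import math

def ocb_calls(msg_bits, n=128):
    """Block-cipher invocations OCB needs for a msg_bits-bit message:
    ceil(|M|/n) + 2, per the abstract."""
    return math.ceil(msg_bits / n) + 2

# e.g. a 1000-bit message with a 128-bit block cipher such as AES:
# ceil(1000/128) + 2 = 8 + 2 = 10 invocations.
```

A generic encrypt-then-MAC composition costs roughly twice the per-block work, which is why the "+2" overhead is notable.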

    On the Molecular Basis of Ion Permeation in the Epithelial Na+ Channel

    The epithelial Na+ channel (ENaC) is highly selective for Na+ and Li+ over K+ and is blocked by the diuretic amiloride. ENaC is a heterotetramer made of two α, one β, and one γ homologous subunits, each subunit comprising two transmembrane segments. Amino acid residues involved in binding of the pore blocker amiloride are located in the pre-M2 segment of β and γ subunits, which precedes the second putative transmembrane α helix (M2). A residue in the α subunit (αS589) at the NH2 terminus of M2 is critical for the molecular sieving properties of ENaC. ENaC is more permeable to Li+ than Na+ ions. The concentration of half-maximal unitary conductance is 38 mM for Na+ and 118 mM for Li+, a kinetic property that can account for the differences in Li+ and Na+ permeability. We show here that mutation of amino acid residues at homologous positions in the pre-M2 segment of α, β, and γ subunits (αG587, βG529, γS541) decreases the Li+/Na+ selectivity by changing the apparent channel affinity for Li+ and Na+. Fitting single-channel data of the Li+ permeation to a discrete-state model including three barriers and two binding sites revealed that these mutations increased the energy needed for the translocation of Li+ from an outer ion binding site through the selectivity filter. Mutation of βG529 to Ser, Cys, or Asp made ENaC partially permeable to K+ and larger ions, similar to the previously reported αS589 mutations. We conclude that the residues αG587 to αS589 and homologous residues in the β and γ subunits form the selectivity filter, which tightly accommodates Na+ and Li+ ions and excludes larger ions like K+.
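The half-maximal concentrations quoted in the abstract imply saturating conductance curves. The simple one-site binding form below is an illustrative assumption, not the paper's three-barrier/two-site model, but it shows how a higher half-saturation concentration (118 mM for Li+ vs 38 mM for Na+) translates into slower saturation.

```python
def conductance(conc_mM, k_half_mM, g_max=1.0):
    """Assumed one-site saturation: g = g_max * C / (K_half + C),
    where K_half is the concentration of half-maximal conductance."""
    return g_max * conc_mM / (k_half_mM + conc_mM)

# At 38 mM, Na+ (K_half = 38 mM) is exactly half-saturated,
# while Li+ (K_half = 118 mM) is still well below half-maximal.
na = conductance(38, 38)
li = conductance(38, 118)
```

The paper's discrete-state fit plays the same role quantitatively: mutations that raise the translocation barrier for Li+ shift its apparent affinity and hence the Li+/Na+ selectivity.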

    A Cooperative Paradigm for Fighting Information Overload
