43 research outputs found
Partial fillup and search time in LC tries
In 1993, Andersson and Nilsson introduced the level-compressed trie (LC trie
for short), in which a full subtree of a node is compressed to a single node
whose degree equals the size of the subtree. Recent experimental results indicated a
'dramatic improvement' when full subtrees are replaced by partially filled
subtrees. In this paper, we provide a theoretical justification of these
experimental results showing, among other things, a rather moderate improvement of
the search time over the original LC tries. For such an analysis, we assume
that n strings are generated independently by a binary memoryless source with p
denoting the probability of emitting a 1. We first prove that the so-called
alpha-fillup level (i.e., the largest level in a trie with alpha fraction of
nodes present at this level) is concentrated on two values with high
probability. We give these values explicitly up to O(1), and observe that the
value of alpha (strictly between 0 and 1) does not affect the leading term.
This result directly yields the typical depth (search time) in the alpha-LC
tries with p not equal to 1/2, which turns out to be C loglog n for an
explicitly given constant C (depending on p but not on alpha). This should be
compared with the recently found typical depth in the original LC tries, which is C'
loglog n for a larger constant C'. The search time in alpha-LC tries is thus
smaller but of the same order as in the original LC tries.
Comment: 13 pages
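The alpha-fillup level defined in the abstract above can be illustrated with a small sketch. It is not taken from the paper: the function names are hypothetical, and it assumes that a trie node at level k exists when some string has that length-k prefix and at least two strings share the parent prefix (so that the trie actually branches or continues there). "Alpha fraction present" is read as "at least alpha times 2^k nodes".

```python
import random
from collections import Counter


def random_strings(n, length, p, rng):
    """Draw n binary strings from a memoryless source with P(bit = 1) = p."""
    return ["".join("1" if rng.random() < p else "0" for _ in range(length))
            for _ in range(n)]


def nodes_at_level(strings, k):
    """Count trie nodes at depth k (assumed definition, see lead-in)."""
    if k == 0:
        return 1  # the root
    # A node at depth k exists only if its parent prefix is shared by >= 2 strings.
    parent_counts = Counter(s[:k - 1] for s in strings)
    return len({s[:k] for s in strings if parent_counts[s[:k - 1]] >= 2})


def alpha_fillup_level(strings, alpha):
    """Largest level k with at least alpha * 2**k nodes present."""
    max_len = min(len(s) for s in strings)
    level = 0
    for k in range(1, max_len + 1):
        if nodes_at_level(strings, k) < alpha * 2 ** k:
            break  # the filled fraction never increases with depth
        level = k
    return level


if __name__ == "__main__":
    rng = random.Random(0)
    sample = random_strings(n=1000, length=64, p=0.3, rng=rng)
    for alpha in (0.25, 0.5, 1.0):
        print(alpha, alpha_fillup_level(sample, alpha))
```

Running the script for several values of alpha on the same sample gives an empirical feel for how weakly the level depends on alpha; it does not reproduce the paper's asymptotic constants.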
Dynamic topic adaptation for improved contextual modelling in statistical machine translation
In recent years there has been an increased interest in domain adaptation techniques
for statistical machine translation (SMT) to deal with the growing amount of data from
different sources. Topic modelling techniques applied to SMT are closely related to the
field of domain adaptation but more flexible in dealing with unstructured text. Topic
models can capture latent structure in texts and are therefore particularly suitable for
modelling structure in between and beyond corpus boundaries, which are often arbitrary.
In this thesis, the main focus is on dynamic translation model adaptation to texts of
unknown origin, which is a typical scenario for an online MT engine translating web
documents. We introduce a new bilingual topic model for SMT that takes the entire
document context into account and for the first time directly estimates topic-dependent
phrase translation probabilities in a Bayesian fashion. We demonstrate our model’s
ability to improve over several domain adaptation baselines and further provide evidence
for the advantages of bilingual topic modelling for SMT over the more common
monolingual topic modelling. We also show improved performance when deriving further
adapted translation features from the same model which measure different aspects
of topical relatedness.
We introduce another new topic model for SMT which exploits the distributional
nature of phrase pair meaning by modelling topic distributions over phrase pairs using
their distributional profiles. Using this model, we explore combinations of local and
global contextual information and demonstrate the usefulness of different levels of contextual
information, which had not been previously examined for SMT. We also show
that combining this model with a topic model trained at the document level further improves
performance. Our dynamic topic adaptation approach performs competitively
in comparison with two supervised domain-adapted systems.
Finally, we shed light on the relationship between domain adaptation and topic
adaptation and propose to combine multi-domain adaptation and topic adaptation in a
framework that entails automatic prediction of domain labels at the document level.
We show that while each technique provides complementary benefits to the overall
performance, there is an amount of overlap between domain and topic adaptation. This
can be exploited to build systems that require less adaptation effort at runtime.
The Ithacan, 1973-05-03
https://digitalcommons.ithaca.edu/ithacan_1972-73/1025/thumbnail.jp
The Ithacan, 1973-04-26
https://digitalcommons.ithaca.edu/ithacan_1972-73/1024/thumbnail.jp
Reducing Router Forwarding Table Size Using Aggregation and Caching
The fast growth of global routing table size has been causing concerns that the Forwarding Information Base (FIB) will not be able to fit in existing routers' expensive line-card memory, and upgrades will lead to a higher cost for network operators and customers. FIB Aggregation, a technique that merges multiple FIB entries into one, is probably the most practical solution since it is a software solution local to a router, and does not require any changes to routing protocols or network operations. While previous work on FIB aggregation mostly focuses on reducing table size, this work focuses on algorithms that can update compressed FIBs quickly and incrementally. Quick updates are critical to routers because they have very limited time to process routing updates without impacting packet delivery performance. We have designed three algorithms: FIFA-S for the smallest table size, FIFA-T for the shortest running time, and FIFA-H for both small tables and short running time, and operators can use the one best suited to their needs. These algorithms significantly improve over existing work in terms of reducing routers' computation overhead and limiting impact on the forwarding plane while maintaining a good compression ratio. Another potential solution is to install only the most popular FIB entries into the fast memory (e.g., an FIB cache), while storing the complete FIB in slow memory. In this paper, we propose an effective FIB caching scheme that achieves a considerably higher hit ratio than previous approaches while preventing the cache-hiding problem. Our experimental results using data traffic from a regional network show that with only 20K prefixes in the cache (5.36% of the actual FIB size), the hit ratio of our scheme is higher than 99.95%. Our scheme can also efficiently handle cache misses, cache replacement and routing updates.
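The general idea of FIB aggregation, merging entries without changing forwarding behaviour, can be sketched as follows. This is a generic illustration, not the FIFA-S, FIFA-T or FIFA-H algorithms, and it is neither incremental nor optimised. Prefixes are modelled as binary strings, next hops as non-empty labels, and the names `lookup` and `aggregate` are hypothetical.

```python
def lookup(fib, addr):
    """Longest-prefix match of a binary address string; None if nothing matches."""
    for k in range(len(addr), -1, -1):
        hop = fib.get(addr[:k])
        if hop is not None:
            return hop
    return None


def nearest_ancestor_hop(table, prefix):
    """Next hop of the longest strictly shorter prefix present in the table."""
    for k in range(len(prefix) - 1, -1, -1):
        if prefix[:k] in table:
            return table[prefix[:k]]
    return None


def aggregate(fib):
    """Return a smaller table with the same longest-prefix-match results.

    Two rules are applied until nothing changes:
      1. Sibling prefixes with the same next hop are merged into their parent.
      2. An entry is dropped if its nearest covering entry has the same next hop.
    """
    table = dict(fib)
    changed = True
    while changed:
        changed = False
        # Visit longer prefixes first so merges can propagate upwards.
        for prefix in sorted(table, key=len, reverse=True):
            if prefix == "" or prefix not in table:
                continue
            hop = table[prefix]
            sibling = prefix[:-1] + ("0" if prefix[-1] == "1" else "1")
            if table.get(sibling) == hop:
                # The two siblings cover the whole parent range between them.
                del table[prefix], table[sibling]
                table[prefix[:-1]] = hop
                changed = True
            elif nearest_ancestor_hop(table, prefix) == hop:
                del table[prefix]
                changed = True
    return table


if __name__ == "__main__":
    fib = {"": "A", "00": "B", "01": "B", "10": "A", "110": "C", "111": "C"}
    print(aggregate(fib))
```

In the example table, six entries shrink to three while every address still resolves to the same next hop. A production scheme also has to apply route updates to the compressed table quickly, which is the problem the abstract addresses.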
The Ithacan, 1972-10-26
https://digitalcommons.ithaca.edu/ithacan_1972-73/1007/thumbnail.jp
Extension and hardware implementation of the comprehensive integrated security system concept
Merged with duplicate record (10026.1/700) on 03.01.2017 by CS (TIS). This is a digitised version of a thesis that was deposited in the University Library. If you are the author please contact PEARL Admin to discuss options.
The current strategy in computer networking is to increase the accessibility that legitimate
users have to their respective systems and to distribute functionality. This creates a more
efficient working environment: users may work from home, and organisations can make better
use of their computing power. Unfortunately, a side effect of opening up computer systems
and placing them on potentially global networks is that they face increased threats from
uncontrolled access points, and from eavesdroppers listening to the data communicated
between systems. Along with these increased threats the traditional ones such as
disgruntled employees, malicious software, and accidental damage must still be countered.
A comprehensive integrated security system (CISS) has been developed to provide
security within the Open Systems Interconnection (OSI) and Open Distributed Processing
(ODP) environments. The research described in this thesis investigates alternative methods
for its implementation and its optimisation through partial implementation within hardware
and software and the investigation of mechanisms to improve its security.
A new deployment strategy for CISS is described where functionality is divided amongst
computing platforms of increasing capability within a security domain. Definitions are given
of local security units, which provide terminal security; local security servers, which serve the
local security units; and domain management centres, which provide security service coordination
within a domain.
New hardware that provides RSA and DES functionality capable of being connected to Sun
microsystems is detailed. The board can be used as a basic building block of CISS,
providing fast cryptographic facilities, or in isolation for discrete cryptographic services.
Software written for UNIX in C/C++ is described, which provides optimised security
mechanisms on computer systems that do not have SBus connectivity.
A new identification/authentication mechanism is investigated that can be added to existing
systems with the potential for extension into a real time supervision scenario. The
mechanism uses keystroke analysis through the application of neural networks and genetic
algorithms and has produced very encouraging results.
Finally, a new conceptual model for intrusion detection capable of dealing with real time
and historical evaluation is discussed, which further enhances the CISS concept.