16,923 research outputs found
Lower Tanana flashcards
Master's Project (M.A.), University of Alaska Fairbanks, 2019. As part of a study of Lower Tanana, I found it expedient to create a learning tool to help myself gain familiarity with the language. I chose Anki, an open-source tool for building digital flashcard-based learning systems, and used it to create cards for individual Lower Tanana words and phrases. In producing these computer flashcards, I realized that they could serve as a highly flexible system for both preserving and learning Lower Tanana. Further, because of this built-in flexibility, similar systems can be created to aid in preserving and teaching other endangered languages.
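Anki's scheduler descends from the classic SM-2 spaced-repetition algorithm. The following is a minimal sketch of SM-2-style scheduling, not Anki's actual code; the card contents are hypothetical placeholders:

```python
from dataclasses import dataclass

@dataclass
class Card:
    front: str          # prompt, e.g. a Lower Tanana word (placeholder here)
    back: str           # its gloss
    interval: int = 1   # days until the next review
    ease: float = 2.5   # SM-2 ease factor
    reps: int = 0       # consecutive successful reviews

def review(card: Card, quality: int) -> Card:
    """Update a card after a review graded 0 (blackout) to 5 (perfect),
    following the classic SM-2 update rules."""
    if quality < 3:                      # failed recall: restart the schedule
        card.reps = 0
        card.interval = 1
    else:
        card.reps += 1
        if card.reps == 1:
            card.interval = 1
        elif card.reps == 2:
            card.interval = 6
        else:
            card.interval = round(card.interval * card.ease)
        # adjust the ease factor, clamped at SM-2's floor of 1.3
        card.ease = max(1.3, card.ease + 0.1
                        - (5 - quality) * (0.08 + (5 - quality) * 0.02))
    return card
```

Cards answered well get exponentially longer review intervals, which is what makes such flashcard systems practical for long-term vocabulary retention.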
Memory Networks
We describe a new class of learning models called memory networks. Memory
networks reason with inference components combined with a long-term memory
component; they learn how to use these jointly. The long-term memory can be
read and written to, with the goal of using it for prediction. We investigate
these models in the context of question answering (QA) where the long-term
memory effectively acts as a (dynamic) knowledge base, and the output is a
textual response. We evaluate them on a large-scale QA task, and a smaller, but
more complex, toy task generated from a simulated world. In the latter, we show
the reasoning power of such models by chaining multiple supporting sentences to
answer questions that require understanding the intension of verbs.
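As a toy illustration of the retrieval-and-chaining idea only (not the paper's learned model: word overlap stands in for the learned match score, and a recency tie-break stands in for the paper's temporal features; the memory sentences are hypothetical):

```python
def tokens(s: str) -> set[str]:
    """Crude tokenizer: lowercase words, punctuation stripped."""
    return {w.strip(".,?!").lower() for w in s.split()}

def score(query: str, sentence: str) -> int:
    """Stand-in for a learned match score: word overlap."""
    return len(tokens(query) & tokens(sentence))

def answer(memory: list[str], question: str, hops: int = 2) -> str:
    """Chain `hops` supporting memories: each hop retrieves the best-matching
    sentence and folds it into the query for the next hop. Ties are broken
    toward more recent memories."""
    query, support = question, []
    for _ in range(hops):
        best = max((m for m in memory if m not in support),
                   key=lambda m: (score(query, m), memory.index(m)))
        support.append(best)
        query = question + " " + best
    # read the answer out of the final supporting sentence (its last word here)
    return support[-1].rstrip(".").split()[-1]

memory = [
    "Joe went to the kitchen.",
    "Joe picked up the milk.",
    "Joe travelled to the office.",
]
```

Asking `answer(memory, "Where is the milk?")` first retrieves the milk sentence, then chains to Joe's latest location, mimicking the two-supporting-fact reasoning evaluated in the simulated-world task.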
A Hybrid Approach to Domain-Specific Entity Linking
The current state-of-the-art Entity Linking (EL) systems are geared towards
corpora that are as heterogeneous as the Web, and therefore perform
sub-optimally on domain-specific corpora. A key open problem is how to
construct effective EL systems for specific domains, as knowledge of the local
context should in principle increase, rather than decrease, effectiveness. In
this paper we propose the hybrid use of simple specialist linkers in
combination with an existing generalist system to address this problem. Our
main findings are the following. First, we construct a new reusable benchmark
for EL on a corpus of domain-specific conversations. Second, we test the
performance of a range of approaches under the same conditions, and show that
specialist linkers obtain high precision in isolation, and high recall when
combined with generalist linkers. Hence, we can effectively exploit local
context and get the best of both worlds.
Comment: SEM'1
Prospects and limitations of full-text index structures in genome analysis
The combination of incessant advances in sequencing technology, producing large amounts of data, and innovative bioinformatics approaches designed to cope with this data flood has led to interesting new results in the life sciences. Given the magnitude of sequence data to be processed, many bioinformatics tools rely on efficient solutions to a variety of complex string problems. These solutions include fast heuristic algorithms and advanced data structures, generally referred to as index structures. Although the importance of index structures is generally known to the bioinformatics community, the design and potency of these data structures, as well as their properties and limitations, are less well understood. Moreover, the last decade has seen a boom in the number of variant index structures featuring complex and diverse memory-time trade-offs. This article gives a comprehensive state-of-the-art overview of the most popular index structures and their recently developed variants. Their features, their interrelationships, the trade-offs they impose, and also their practical limitations are explained and compared.
Approximating Persistent Homology in Euclidean Space Through Collapses
The \v{C}ech complex is one of the most widely used tools in applied
algebraic topology. Unfortunately, due to the inclusive nature of the \v{C}ech
filtration, the number of simplices grows exponentially in the number of input
points. A practical consequence is that computations may have to terminate at
smaller scales than what the application calls for.
In this paper we propose two methods to approximate the \v{C}ech persistence
module. Both are constructed on the level of spaces, i.e. as sequences of
simplicial complexes induced by nerves. We also show how the bottleneck
distance between such persistence modules can be understood by how tightly they
are sandwiched on the level of spaces. In turn, this implies the correctness of
our approximation methods.
Finally, we implement our methods and apply them to some example point clouds
in Euclidean space.
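The growth in the number of simplices is easy to see concretely. A minimal sketch using the Vietoris-Rips complex, a standard pairwise-distance proxy that sandwiches the \v{C}ech complex (an illustration of the filtration's inclusiveness, not the paper's approximation method):

```python
from itertools import combinations
from math import dist

def rips_simplices(points, radius, max_dim=2):
    """All Vietoris-Rips simplices up to dimension `max_dim`: a vertex
    subset spans a simplex iff every pairwise distance is at most
    2 * radius (balls of the given radius intersect pairwise)."""
    simplices = []
    for k in range(1, max_dim + 2):            # k vertices -> (k-1)-simplex
        for subset in combinations(range(len(points)), k):
            if all(dist(points[i], points[j]) <= 2 * radius
                   for i, j in combinations(subset, 2)):
                simplices.append(subset)
    return simplices
```

Even on three points the simplex count jumps as the scale grows, and in general every subset of sufficiently close points contributes a simplex, which is the exponential blow-up the abstract refers to.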
Kolmogorov Complexity in perspective. Part II: Classification, Information Processing and Duality
We survey diverse approaches to the notion of information, from Shannon entropy to Kolmogorov complexity. Two of the main applications of Kolmogorov complexity are presented: randomness and classification. The survey is divided into two parts, published in the same volume. Part II is dedicated to the relation between logic and information systems, within the scope of Kolmogorov algorithmic information theory. We present a recent application of Kolmogorov complexity: classification using compression, an idea provocatively implemented by authors such as Bennett, Vitanyi and Cilibrasi. This stresses how Kolmogorov complexity, besides being a foundation for randomness, is also related to classification. Another approach to classification is also considered: the so-called "Google classification". It uses another original and attractive idea which is connected, from a conceptual point of view, to classification using compression and to Kolmogorov complexity. We present and unify these different approaches to classification in terms of Bottom-Up versus Top-Down operational modes, whose fundamental principles and underlying duality we point out. We look at the way these two dual modes are used in different approaches to information systems, particularly the relational model for databases introduced by Codd in the 1970s. This allows us to point out diverse forms of a fundamental duality. These operational modes are also reinterpreted in the context of the comprehension schema of axiomatic set theory ZF. This leads us to show how Kolmogorov complexity is linked to intensionality, abstraction, classification and information systems.
Comment: 43 pages
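The compression-based classification idea rests on the Normalized Compression Distance of Cilibrasi and Vitanyi, which approximates the (uncomputable) Kolmogorov complexity C(x) by the length of a compressed string. A minimal sketch using zlib as the compressor:

```python
import zlib

def C(s: bytes) -> int:
    """Approximate Kolmogorov complexity by compressed length
    (zlib at its maximum compression level)."""
    return len(zlib.compress(s, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance:
    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)).
    Near 0 when one string contains most of the information of the
    other; near 1 when they share little."""
    cx, cy = C(x), C(y)
    return (C(x + y) - min(cx, cy)) / max(cx, cy)
```

Clustering objects by pairwise NCD is the parameter-free classification-by-compression scheme the survey discusses; any real compressor can stand in for C.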
Information Outlook, July 2006
Volume 10, Issue 7