12,704 research outputs found
Using string-matching to analyze hypertext navigation
A method of using string-matching to analyze hypertext navigation was developed, and evaluated using two weeks of website logfile data. The method is divided into phases that use: (i) exact string-matching to calculate subsequences of links that were repeated in different navigation sessions (common trails through the website), and then (ii) inexact matching to find other similar sessions (a community of users with a similar interest). The evaluation showed how subsequences could be used to understand the information pathways users chose to follow within a website, and that exact and inexact matching provided complementary ways of identifying information that may have been of interest to a whole community of users, but which was only found by a minority. This illustrates how string-matching could be used to improve the structure of hypertext collections
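The two phases described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes each session is a list of page identifiers, treats a "common trail" as a contiguous run of links seen in at least two sessions, and uses edit distance as the inexact measure. The function names `common_trails` and `best_match_distance` are hypothetical.

```python
from collections import defaultdict


def common_trails(sessions, length):
    """Phase (i): contiguous link subsequences of `length` seen in >= 2 sessions."""
    seen_in = defaultdict(set)
    for session_id, pages in enumerate(sessions):
        for start in range(len(pages) - length + 1):
            seen_in[tuple(pages[start:start + length])].add(session_id)
    return {trail: ids for trail, ids in seen_in.items() if len(ids) >= 2}


def best_match_distance(trail, session):
    """Phase (ii): smallest edit distance between `trail` and any window of `session`.

    Approximate substring matching: the first row of the table is all
    zeros, so a match may start anywhere in the session.
    """
    col = list(range(len(trail) + 1))  # distances against an empty window
    best = col[-1]
    for page in session:
        new = [0]  # an empty trail prefix matches at any position for free
        for i, link in enumerate(trail, 1):
            new.append(min(col[i] + 1,                     # extra page in the session
                           new[i - 1] + 1,                 # trail link skipped
                           col[i - 1] + (link != page)))   # match or substitution
        col = new
        best = min(best, col[-1])
    return best


sessions = [
    ["home", "news", "sport", "scores"],
    ["home", "news", "sport", "tables"],
    ["search", "news", "weather", "sport"],
]
trails = common_trails(sessions, 3)
```

Sessions whose `best_match_distance` to a common trail falls under a chosen threshold could then be grouped as a candidate community; the threshold itself is an assumption the abstract does not specify.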
On the Complexity of Exact Pattern Matching in Graphs: Binary Strings and Bounded Degree
Exact pattern matching in labeled graphs is the problem of searching paths of
a graph G=(V,E) that spell the same string as the pattern P[1..m]. This
basic problem can be found at the heart of more complex operations on variation
graphs in computational biology, of query operations in graph databases, and of
analysis operations in heterogeneous networks, where the nodes of some paths
must match a sequence of labels or types. We describe a simple conditional
lower bound that, for any constant ε > 0, an O(|E|^{1-ε} m)-time or an O(|E| m^{1-ε})-time algorithm for exact pattern
matching on graphs, with node labels and patterns drawn from a binary alphabet,
cannot be achieved unless the Strong Exponential Time Hypothesis (SETH) is
false. The result holds even if restricted to undirected graphs of maximum
degree three or directed acyclic graphs of maximum sum of indegree and
outdegree three. Although a conditional lower bound of this kind can be somehow
derived from previous results (Backurs and Indyk, FOCS'16), we give a direct
reduction from SETH for dissemination purposes, as the result might interest
researchers from several areas, such as computational biology, graph database,
and graph mining, as mentioned before. Indeed, as approximate pattern matching
on graphs can be solved in O(|E| m) time, exact and approximate matching are
thus equally hard (quadratic time) on graphs under the SETH assumption. In
comparison, the same problems restricted to strings have linear time vs
quadratic time solutions, respectively, where the latter ones have a matching
SETH lower bound on computing the edit distance of two strings (Backurs and
Indyk, STOC'15).
Comment: Using Lemma 12 and Lemma 13 might not be enough to prove Lemma 14.
However, the proof of Lemma 14 is correct if you assume that the graph used
in the reduction is a DAG. Hence, since the problem is already quadratic for
a DAG and a binary alphabet, it has to be quadratic also for a general graph
and a binary alphabet
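For orientation, the quadratic upper bound that the lower bound is measured against can be illustrated with a simple frontier-propagation sketch. It assumes a directed graph given as an edge list, one character label per node, and that nodes may be revisited along the matched path; the function name `graph_matches` and the input layout are hypothetical and not taken from the paper. Each pattern character triggers one pass over the edges, so the running time is proportional to the number of edges times the pattern length.

```python
def graph_matches(labels, edges, pattern):
    """Return True if some path in the graph spells `pattern`.

    labels:  dict mapping node -> single-character label
    edges:   list of directed (u, v) pairs
    pattern: string over the same alphabet
    """
    if not pattern:
        return True
    # Nodes at which a path spelling the first character can end.
    frontier = {v for v, c in labels.items() if c == pattern[0]}
    for ch in pattern[1:]:
        # Extend every partial match along one edge to a node labelled `ch`.
        frontier = {v for u, v in edges if u in frontier and labels[v] == ch}
        if not frontier:
            return False
    return bool(frontier)


# A small binary-labelled path graph: 0 -> 1 -> 2 -> 3 spelling "0101".
labels = {0: "0", 1: "1", 2: "0", 3: "1"}
edges = [(0, 1), (1, 2), (2, 3)]
```

On this example, `graph_matches(labels, edges, "010")` succeeds while `"011"` does not, because node 1 has no successor labelled `1`.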
The Conference Review Process
This presentation is for students on the third-year ECS Multimedia course, in which students run their own conference and submit and review papers.
In this presentation we explain the academic review process, look at the structure of a review, and give some examples of positive and negative reviews
Order preserving pattern matching on trees and DAGs
The order preserving pattern matching (OPPM) problem is, given a pattern
string p and a text string t, find all substrings of t which have the
same relative orders as p. In this paper, we consider two variants of the
OPPM problem where a set of text strings is given as a tree or a DAG. We show
that the OPPM problem for a single pattern p of length m and a text tree T
of size N can be solved in O(m+N) time if the characters of p are
drawn from an integer alphabet of polynomial size. The time complexity becomes
O(m log m + N) if the pattern p is over a general ordered alphabet. We
then show that the OPPM problem for a single pattern and a text DAG is
NP-complete
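To make the matching criterion concrete, here is a naive sketch of order-preserving matching on a single text string. It is not the algorithm of the paper and does not handle trees or DAGs: it simply compares the rank sequence of every text window with that of the pattern. The function name `order_preserving_matches` is hypothetical.

```python
def order_preserving_matches(pattern, text):
    """Return start indices of text windows with the same relative order as pattern."""
    m = len(pattern)

    def shape(seq):
        # Replace each value by its rank among the distinct values of seq,
        # so equal values receive equal ranks.
        ranks = {value: rank for rank, value in enumerate(sorted(set(seq)))}
        return [ranks[value] for value in seq]

    target = shape(pattern)
    return [i for i in range(len(text) - m + 1)
            if shape(text[i:i + m]) == target]


# The pattern "low, high, middle" occurs at positions 0 and 2.
matches = order_preserving_matches([1, 3, 2], [10, 20, 15, 30, 25, 5])
```

This brute-force version re-sorts every window, so it is far slower than the bounds stated in the abstract; it is meant only to show what "same relative orders" means.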
Generating trails automatically, to aid navigation when you revisit an environment
A new method for generating trails from a person’s movement through a virtual environment (VE) is described. The method is entirely automatic (no user input is needed), and uses string-matching to identify similar sequences of movement and derive the person’s primary trail. The method was evaluated in a virtual building, and generated trails that substantially reduced the distance participants traveled when they searched for target objects in the building 5-8 weeks after a set of familiarization sessions. Only a modest amount of data (typically five traversals of the building) was required to generate trails that were both effective and stable, and the method was not affected by the order in which objects were visited. The trail generation method models an environment as a graph and, therefore, may be applied to aiding navigation in the real world and information spaces, as well as VEs
An introduction to Graph Data Management
A graph database is a database where the data structures for the schema
and/or instances are modeled as a (labeled) (directed) graph or generalizations
of it, and where querying is expressed by graph-oriented operations and type
constructors. In this article we present the basic notions of graph databases,
give a historical overview of their main developments, and study the main current
systems that implement them
Extending the 5S Framework of Digital Libraries to support Complex Objects, Superimposed Information, and Content-Based Image Retrieval Services
Advanced services in digital libraries (DLs) have been developed and widely used to address the required capabilities of an assortment of systems as DLs expand into diverse application domains. These systems may require support for images (e.g., Content-Based Image Retrieval), Complex (information) Objects, and use of content at fine grain (e.g., Superimposed Information). Due to the lack of consensus on precise theoretical definitions for those services, implementation efforts often involve ad hoc development, leading to duplication and interoperability problems. This article presents a methodology to address those problems by extending a precisely specified minimal digital library (in the 5S framework) with formal definitions of the aforementioned services. The theoretical extensions of digital library functionality presented here are reinforced with practical case studies as well as scenarios for the individual and integrative use of services to balance theory and practice. The methodology implies that other advanced services can be integrated into the extended framework as they are identified. The theoretical definitions and case studies we present may inform future development efforts and a wide range of digital library researchers, designers, and developers
Computer technologies and institutional memory
NASA programs for manned space flight are in their 27th year. Scientists and engineers who worked continuously on the development of aerospace technology during that period are approaching retirement. The resulting loss to the organization will be considerable. Although this problem is general to the NASA community, it was explored in terms of the institutional memory and technical expertise of a single individual in the Man-Systems division. The main domain of the expert was spacecraft lighting, which became the subject area for analysis in these studies. The report starts with an analysis of the cumulative expertise and institutional memory of technical employees of organizations such as NASA. A set of solutions to this problem is examined and found inadequate. Two solutions were investigated at length: hypertext and expert systems. Illustrative examples were provided of hypertext and expert system representation of spacecraft lighting. These computer technologies can be used to ameliorate the problem of the loss of invaluable personnel