6,107 research outputs found
Reformulating Space Syntax: The Automatic Definition and Generation of Axial Lines and Axial Maps
Space syntax is a technique for measuring the relative accessibility of different locations in a spatial system which has been loosely partitioned into convex spaces.These spaces are approximated by straight lines, called axial lines, and the topological graph associated with their intersection is used to generate indices of distance, called integration, which are then used as proxies for accessibility. The most controversial problem in applying the technique involves the definition of these lines. There is no unique method for their generation, hence different users generate different sets of lines for the same application. In this paper, we explore this problem, arguing that to make progress, there need to be unambiguous, agreed procedures for generating such maps. The methods we suggest for generating such lines depend on defining viewsheds, called isovists, which can be approximated by their maximum diameters,these lengths being used to form axial maps similar to those used in space syntax. We propose a generic algorithm for sorting isovists according to various measures,approximating them by their diameters and using the axial map as a summary of the extent to which isovists overlap (intersect) and are accessible to one another. We examine the fields created by these viewsheds and the statistical properties of the maps created. We demonstrate our techniques for the small French town of Gassin used originally by Hillier and Hanson (1984) to illustrate the theory, exploring different criteria for sorting isovists, and different axial maps generated by changing the scale of resolution. This paper throws up as many problems as it solves but we believe it points the way to firmer foundations for space syntax
Efficiently Clustering Very Large Attributed Graphs
Attributed graphs model real networks by enriching their nodes with
attributes accounting for properties. Several techniques have been proposed for
partitioning these graphs into clusters that are homogeneous with respect to
both semantic attributes and to the structure of the graph. However, time and
space complexities of state of the art algorithms limit their scalability to
medium-sized graphs. We propose SToC (for Semantic-Topological Clustering), a
fast and scalable algorithm for partitioning large attributed graphs. The
approach is robust, being compatible both with categorical and with
quantitative attributes, and it is tailorable, allowing the user to weight the
semantic and topological components. Further, the approach does not require the
user to guess in advance the number of clusters. SToC relies on well known
approximation techniques such as bottom-k sketches, traditional graph-theoretic
concepts, and a new perspective on the composition of heterogeneous distance
measures. Experimental results demonstrate its ability to efficiently compute
high-quality partitions of large scale attributed graphs.Comment: This work has been published in ASONAM 2017. This version includes an
appendix with validation of our attribute model and distance function,
omitted in the converence version for lack of space. Please refer to the
published versio
Incremental Entity Resolution from Linked Documents
In many government applications we often find that information about
entities, such as persons, are available in disparate data sources such as
passports, driving licences, bank accounts, and income tax records. Similar
scenarios are commonplace in large enterprises having multiple customer,
supplier, or partner databases. Each data source maintains different aspects of
an entity, and resolving entities based on these attributes is a well-studied
problem. However, in many cases documents in one source reference those in
others; e.g., a person may provide his driving-licence number while applying
for a passport, or vice-versa. These links define relationships between
documents of the same entity (as opposed to inter-entity relationships, which
are also often used for resolution). In this paper we describe an algorithm to
cluster documents that are highly likely to belong to the same entity by
exploiting inter-document references in addition to attribute similarity. Our
technique uses a combination of iterative graph-traversal, locality-sensitive
hashing, iterative match-merge, and graph-clustering to discover unique
entities based on a document corpus. A unique feature of our technique is that
new sets of documents can be added incrementally while having to re-resolve
only a small subset of a previously resolved entity-document collection. We
present performance and quality results on two data-sets: a real-world database
of companies and a large synthetically generated `population' database. We also
demonstrate benefit of using inter-document references for clustering in the
form of enhanced recall of documents for resolution.Comment: 15 pages, 8 figures, patented wor
Two exponential neighborhoods for single machine scheduling
We study the problem of minimizing total completion time on a single machine with the presence of release dates. We present two different approaches leading to exponential neighborhoods in which the best improving neighbor can be determined in polynomial time. Furthermore, computational results are presented to get insight in the performance of the developed neighborhoods
- …