Optimizing Spatiotemporal Analysis Using Multidimensional Indexing with GeoWave
The open source software GeoWave bridges the gap between geographic information systems and distributed computing. It does so by preserving the locality of multidimensional data when indexing it into a single-dimensional key-value store, using space-filling curves, so that values that are close in each dimension are stored physically close together in the datastore. We demonstrate the efficiency and benefits of the GeoWave indexing algorithm for storing and querying billions of spatiotemporal data points. We show how this indexing strategy can reduce query and processing times by multiple orders of magnitude, using publicly available taxi trip data published by the New York City Taxi & Limousine Commission. Furthermore, we demonstrate how this efficiency lends itself to analysis that would otherwise be infeasible.
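The core idea the abstract describes, mapping multidimensional coordinates onto a one-dimensional key so that nearby points get nearby keys, can be illustrated with the simplest space-filling curve, the Z-order (Morton) curve. This is a minimal sketch of the general technique, not GeoWave's actual API or curve implementation (GeoWave supports more sophisticated schemes such as Hilbert curves and tiered indexing); the function name is ours.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Z-order (Morton) code: interleave the bits of x and y.

    Points that are close in (x, y) tend to get numerically close
    codes, so a range scan over the 1-D key space touches mostly
    relevant rows in the key-value store.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # x's bit i -> even position
        code |= ((y >> i) & 1) << (2 * i + 1)  # y's bit i -> odd position
    return code

# Adjacent grid cells map to nearby keys:
print(interleave_bits(3, 3))  # 15
print(interleave_bits(3, 4))  # 37
```

The same interleaving generalizes to three or more dimensions (e.g. latitude, longitude, time for spatiotemporal data), which is what makes a single sorted key-value store usable for multidimensional range queries.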
The Fascinator: A lightweight, modular contribution to the Fedora-commons world
4th International Conference on Open Repositories. Session: Fedora User Group Presentations. Date: 2009-05-20, 01:30 PM – 03:00 PM.
The Australian government has supported the development of repository infrastructure for several years now. One product of this support was the ARROW project (Australian Research Repositories Online to the World). The ARROW project sponsored a hybrid commercial/open-source approach to building vendor-supported repository infrastructure with open-source underpinnings.
One of the open-source contributions, which complements the vendor-sourced product adopted by many of the ARROW partners, is an easy-to-install-and-configure front-end web service for Fedora repositories known as "The Fascinator".
The Fascinator was conceived initially as a way to prove a point in an ongoing dialogue within the ARROW project about repository architecture. The goal was to test the hypothesis that it would be possible to build a useful, fast, flexible web front end for a repository using a single fast indexing system to handle browsing via facets, full-text search, multiple 'portal' views of subsets of a large corpus, and, most importantly, easy-to-administer security that could handle the most common use cases seen in the ARROW community. This contrasted with the approach taken by ARROW's commercial partner, which used several different indices to achieve only some of the same functionality in an environment that was much more complex to manage and configure.
We will give an overview of the product in both functional and technical terms. Functionally, The Fascinator offers:
Click-to-create portals.
Easy-to-configure security based on a query-based filter system: the repository owner expresses security in terms of saved searches that define what a user or group is allowed to see.
Highly flexible, administrator-configurable indexing of a Fedora repository (and, by extension, of anything the harvesting module can scrape up).
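The saved-search security model described above can be sketched in a few lines. This is our own illustration of the general idea, not The Fascinator's actual code or configuration format; record fields, search names, and grants below are all hypothetical. Each saved search plays the role of a filter query in the index, and a user may see a record only if it matches at least one search granted to them.

```python
# Hypothetical records as they might sit in the index.
records = [
    {"id": "r1", "workflow": "published", "owner": "alice"},
    {"id": "r2", "workflow": "draft",     "owner": "alice"},
    {"id": "r3", "workflow": "published", "owner": "bob"},
]

# Saved searches expressed as predicates (in Solr these would be
# filter queries, e.g. fq=workflow:published).
saved_searches = {
    "public":     lambda rec, user: rec["workflow"] == "published",
    "own-drafts": lambda rec, user: rec["owner"] == user,
}

# Security = which saved searches each user is granted.
grants = {"guest": ["public"], "alice": ["public", "own-drafts"]}

def visible(user: str) -> list:
    """Return ids of records the user may see: anything matching at
    least one of the saved searches granted to them."""
    filters = [saved_searches[name] for name in grants.get(user, [])]
    return [r["id"] for r in records if any(f(r, user) for f in filters)]

print(visible("guest"))  # ['r1', 'r3']
print(visible("alice"))  # ['r1', 'r2', 'r3']
```

The appeal of this design is that access control reuses the same machinery as search, so a single index answers both "what matches this query?" and "what is this user allowed to see?".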
Technically, The Fascinator is a modular system, written in Java so it is easy to deploy with Fedora and Solr, consisting of:
An indexing system for Fedora, which builds on the standard GSearch service for Fedora and on work done by the Muradora team.
A configurable harvesting application which can ingest data from OAI-PMH, ORE, and local file systems.
A web portal application which can be used to build flexible front end websites or act as a service to other sites via an HTTP API.
An OAI-PMH (and Atom archive) feed system which can easily create sub-feeds from a repository without complexities like OAI-PMH sets.
An easy-to-use installer for Unix-based platforms, allowing a systems administrator to install the application along with Fedora and Solr in a few keystrokes.
While The Fascinator's goals were modest, it has been met with some enthusiasm by repository managers in Australia and beyond, and is being trialled and/or piloted at a small number of sites across the world.
ARROW project, Monash Universit
Oblivion: Mitigating Privacy Leaks by Controlling the Discoverability of Online Information
Search engines are the prevalent tools for collecting information about individuals on the Internet. Search results typically comprise a variety of sources that contain personal information, either intentionally released by the person herself or unintentionally leaked or published by third parties, often with detrimental effects on the individual's privacy. To grant individuals the ability to regain control over their disseminated personal information, the European Court of Justice recently ruled that EU citizens have a right to be forgotten, in the sense that indexing systems must offer them technical means to request removal of links from search results that point to sources violating their data protection rights. As of now, these technical means consist of a web form that requires a user to manually identify all relevant links upfront and insert them into the form, followed by a manual evaluation by employees of the indexing system to assess whether the request is eligible and lawful.
We propose Oblivion, a universal framework to support the automation of the right to be forgotten in a scalable, provable and privacy-preserving manner. First, Oblivion enables a user to automatically find and tag her disseminated personal information using natural language processing and image recognition techniques, and to file a request in a privacy-preserving manner. Second, Oblivion provides indexing systems with an automated and provable eligibility mechanism, asserting that the author of a request is indeed affected by an online resource. The automated eligibility proof ensures censorship resistance, so that only legitimately affected individuals can request the removal of the corresponding links from search results. We have conducted comprehensive evaluations, showing that Oblivion is capable of handling 278 removal requests per second and is hence suitable for large-scale deployment.
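One plausible shape of the eligibility check described above can be sketched as follows. This is our own illustration in the spirit of the abstract, not the paper's actual cryptographic construction: assume an identity provider signs the user's verified name, and a removal request is eligible only if that signature verifies and the signed name actually appears among the entities extracted from the contested page.

```python
import hashlib
import hmac

# Secret held by the (hypothetical) identity provider.
IDP_KEY = b"identity-provider-secret"

def sign_attributes(name: str) -> str:
    """Identity provider attests that `name` is the user's verified name."""
    return hmac.new(IDP_KEY, name.encode(), hashlib.sha256).hexdigest()

def eligible(name: str, signature: str, page_entities: set) -> bool:
    """A request is eligible iff the attestation verifies AND the
    attested name occurs among entities extracted from the page
    (standing in for the paper's NLP/image-recognition tagging)."""
    ok_sig = hmac.compare_digest(signature, sign_attributes(name))
    return ok_sig and name in page_entities

token = sign_attributes("Jane Doe")
print(eligible("Jane Doe", token, {"Jane Doe", "Acme Corp"}))  # True
print(eligible("John Roe", token, {"Jane Doe"}))               # False
```

The censorship-resistance property falls out of the conjunction: a forged signature fails verification, and a valid signature for a name the page never mentions fails the entity check, so only people the page is actually about can file an eligible request.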
Indexed Labels for Loop Iteration Dependent Costs
We present an extension to the labelling approach, a technique for lifting resource-consumption information from compiled code to source code. This approach, which is at the core of the annotating compiler from a large fragment of C to 8051 assembly in the CerCo project, loses precision when the cost of the same portion of code varies, whether due to code transformations such as loop optimisations or to advanced architectural features (e.g. caches). We propose to address this weakness by formally indexing cost labels with the iterations of the loops that contain them. These indexes can be transformed during compilation, and when lifted back to source code they produce dependent costs.
The proposed changes have been implemented in CerCo's untrusted prototype compiler from a large fragment of C to 8051 assembly.
Comment: In Proceedings QAPL 2013, arXiv:1306.241
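The notion of a dependent cost can be made concrete with a small sketch. This is our own notation, not CerCo's formalism: a cost label inside a loop is indexed by the iteration counter i, so lifting it back to source level yields a cost that is a function of i rather than a single constant. For instance, after loop peeling or with a cold cache, iteration 0 may cost more than the rest.

```python
def dependent_cost(n: int, cost_of_iter) -> int:
    """Total cost of an n-iteration loop whose indexed cost label
    assigns a (possibly different) cost to each iteration i."""
    return sum(cost_of_iter(i) for i in range(n))

# Hypothetical per-iteration cost after peeling the first iteration:
# iteration 0 hits a cold cache (10 cycles), later ones cost 3 cycles.
peeled = lambda i: 10 if i == 0 else 3

print(dependent_cost(5, peeled))  # 10 + 4*3 = 22
```

A plain (non-indexed) cost label would have to assign one constant to the whole loop body, either over- or under-approximating; the index is what lets the lifted annotation stay exact across such transformations.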
Changes to the Tax Exclusion of Employer-Sponsored Health Insurance Premiums: A Potential Source of Financing for Health Reform
Examines eight options for limiting the tax exclusion of employer-sponsored health insurance premiums. Compares, by income level, the estimated effects of various caps and indices on tax revenues and after-tax incomes in the first year and over ten years.