
    Extending functional databases for use in text-intensive applications

    This thesis continues research exploring the benefits of using functional databases based on the functional data model for advanced database applications, particularly those supporting investigative systems. This is a growing generic application domain covering areas such as criminal and military intelligence, which are characterised by significant data complexity, large data sets and the need for high-performance, interactive use. An experimental functional database language was developed to provide the requisite semantic richness. However, heavy use in a practical context has shown that language extensions and implementation improvements are required, especially in the crucial areas of string matching and graph traversal. In addition, an implementation on multiprocessor, parallel architectures is essential to meet the performance needs arising from existing and projected database sizes in the chosen application area. [Continues.]
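
    The experimental language itself is not shown in the abstract; the following is a minimal Python sketch (all names and data hypothetical) of the functional-data-model idea it builds on, in which relationships are functions from an entity to a set of entities and graph traversal is repeated function application.

```python
# Illustrative sketch only: the thesis's experimental functional database
# language is not reproduced here. In the functional data model, entities
# are opaque nodes and relationships are functions from an entity to a
# set of entities, so a query is composed function application.
from collections import deque

# knows: Person -> {Person}, a multi-valued "function" (hypothetical data)
knows = {
    "alice": {"bob", "carol"},
    "bob": {"dave"},
    "carol": {"dave", "erin"},
    "dave": set(),
    "erin": set(),
}

def associates(person, depth):
    """Transitive graph traversal: everyone reachable from `person`
    within `depth` applications of the `knows` function."""
    seen, frontier = set(), deque([(person, 0)])
    while frontier:
        node, d = frontier.popleft()
        if d == depth:
            continue
        for nxt in knows.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return seen

print(associates("alice", 2))  # {'bob', 'carol', 'dave', 'erin'}
```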

    Evolvable hardware system for automatic optical inspection


    Software tools for the rapid development of signal processing and communications systems on configurable platforms

    Programmers and engineers in the domains of high-performance computing (HPC) and electronic system design have a shared goal: to define a structure for coordination and communication between nodes in a highly parallel network of processing tasks. Practitioners in both of these fields have recently encountered additional constraints that motivate the use of multiple types of processing device in a hybrid or heterogeneous platform, but constructing a working "program" to be executed on such an architecture is very time-consuming with current domain-specific design methodologies. In the field of HPC, research has proposed solutions involving the use of alternative computational devices such as FPGAs (field-programmable gate arrays), since these devices can exhibit much greater performance per unit of power consumption. The appeal of integrating these devices into traditional microprocessor-based systems is tempered, however, by the greater difficulty of constructing a system for the resulting hybrid platform. In the field of electronic system design, a similar integration problem exists. Many of the highly parallel FPGA-based systems that Xilinx and its customers produce for applications such as telecommunications and video processing require the additional use of one or more microprocessors, but coordinating the interactions between existing FPGA cores and software running on the microprocessors is difficult. The aim of my project is to improve the design flow for hybrid systems by proposing, firstly, an abstract representation of these systems and their components which captures in metadata their different models of computation and communication; secondly, novel design checking, exploration and optimisation techniques based around this metadata; and finally, a novel design methodology in which component and system metadata is used to generate software simulation models. The effectiveness of this approach will be evaluated through the implementation of two physical-layer telecommunications system models that meet the requirements of the 3GPP "LTE" standard, which is commercially relevant to Xilinx and many other organisations.
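
    The abstract leaves the metadata format unspecified; the sketch below is a hypothetical illustration (all field and component names invented) of the kind of component metadata described, where each component declares its model of computation and its port rates so that a simple design check can reject incompatible connections.

```python
# Hypothetical sketch of the kind of component metadata the abstract
# describes: each core or software task declares its model of
# computation and its interface rates, so a checker can verify that
# two connected components are compatible before simulation.
from dataclasses import dataclass

@dataclass(frozen=True)
class PortSpec:
    name: str
    tokens_per_firing: int     # SDF-style rate, where applicable

@dataclass(frozen=True)
class ComponentMeta:
    name: str
    model: str                 # e.g. "SDF", "FSM", "untimed-software"
    inputs: tuple
    outputs: tuple

def rates_compatible(producer: PortSpec, consumer: PortSpec) -> bool:
    """A toy design check: in a synchronous-dataflow connection the
    producer and consumer rates must balance (here: be equal)."""
    return producer.tokens_per_firing == consumer.tokens_per_firing

fft = ComponentMeta("fft_core", "SDF",
                    (PortSpec("in", 1024),), (PortSpec("out", 1024),))
mapper = ComponentMeta("qam_mapper", "SDF",
                       (PortSpec("in", 1024),), (PortSpec("out", 1024),))
print(rates_compatible(fft.outputs[0], mapper.inputs[0]))  # True
```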

    Automatic Parallelization of Data-Driven JStar Programs

    Data-driven problems share common characteristics: a large number of small objects with complex dependencies. This makes traditional parallel programming approaches more difficult to apply, as pipelining the task dependencies may require rewriting or recompiling the program into an efficient parallel implementation. This thesis focuses on data-driven JStar programs, whose rules are triggered by tuples from a bulky CSV file or from other sources of complex data, and on making those programs run fast in parallel. JStar is a new declarative language for parallel programming that encourages programmers to write their applications with implicit parallelism. The thesis briefly introduces the JStar language and the implicit default parallelism of the JStar compiler. It describes the root causes of the poor performance of naive parallel JStar programs and defines a performance-tuning process to increase the speed of JStar programs as the number of cores increases and to minimize memory usage in the Java heap. Several graphical analysis tools were developed to allow easier analysis of bottlenecks in parallel programs. The JStar compiler and runtime were extended so that a variety of optimisations can be applied to a JStar program without changing the JStar source code. This process was applied to four case studies, which were benchmarked on different multi-core machines to measure the performance and scalability of JStar programs.
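
    JStar syntax is not shown in the abstract, so the following is a plain-Python analogue (not JStar) of the data-driven style it describes: a rule fires once per input tuple, and because the firings are independent they can be mapped over a process pool. The rule body and sample data are invented.

```python
# Not JStar: a plain-Python analogue of the data-driven style the
# abstract describes. A "rule" fires once per CSV tuple; independent
# firings are distributed over a process pool.
import csv, io
from multiprocessing import Pool

SAMPLE_CSV = "city,population\nauckland,1650000\nwellington,215000\n"

def rule(row):
    # A trivial rule body: derive a new tuple from each input tuple.
    return (row["city"], int(row["population"]) > 1_000_000)

def fire_rules(text):
    rows = list(csv.DictReader(io.StringIO(text)))
    with Pool() as pool:                 # implicit data parallelism
        return pool.map(rule, rows)

if __name__ == "__main__":
    print(fire_rules(SAMPLE_CSV))
    # [('auckland', True), ('wellington', False)]
```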

    AT-GIS: highly parallel spatial query processing with associative transducers

    Users in many domains, including urban planning, transportation, and environmental science, want to execute analytical queries over continuously updated spatial datasets. Current solutions for large-scale spatial query processing either rely on extensions to RDBMSs, which entail expensive loading and indexing phases when the data changes, or on distributed map/reduce frameworks running on resource-hungry compute clusters. Both solutions struggle with the sequential bottleneck of parsing complex, hierarchical spatial data formats, which frequently dominates query execution time. Our goal is to fully exploit the parallelism offered by modern multicore CPUs for parsing and query execution, thus providing the performance of a cluster with the resources of a single machine. We describe AT-GIS, a highly parallel spatial query processing system that scales linearly to a large number of CPU cores. AT-GIS integrates the parsing and querying of spatial data using a new computational abstraction called associative transducers (ATs). ATs can form a single data-parallel pipeline for computation without requiring the spatial input data to be split into logically independent blocks. Using ATs, AT-GIS can execute, in parallel, spatial query operators on the raw input data in multiple formats, without any pre-processing. On a single 64-core machine, AT-GIS provides 3× the performance of an 8-node Hadoop cluster with 192 cores for containment queries, and 10× for aggregation queries.
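
    The paper's AT construction is not reproduced in the abstract, so the sketch below (record format and predicate invented) only illustrates the associativity idea underneath it: each chunk of raw input reduces to a partial state, and partial states combine with an associative operator, so chunks can be processed in any grouping, and hence in parallel, without first splitting the input at record boundaries.

```python
# Sketch of the associativity idea behind ATs (not the AT-GIS code):
# each chunk reduces to (open head fragment, count of complete
# newline-delimited records matching a predicate, open tail fragment),
# and partial states combine associatively across chunk cuts.
from functools import reduce

def chunk_state(chunk):
    parts = chunk.split("\n")
    if len(parts) == 1:                  # no record boundary inside
        return (parts[0], 0, None)
    head, *complete, tail = parts
    return (head, sum("POLYGON" in rec for rec in complete), tail)

def combine(a, b):
    ah, ac, at = a
    bh, bc, bt = b
    if at is None:                       # 'a' has no boundary: merge fragments
        return (ah + bh, bc, bt)
    if bt is None:                       # 'b' has no boundary: extend a's tail
        return (ah, ac, at + bh)
    spanning = at + bh                   # record split across the chunk cut
    return (ah, ac + bc + ("POLYGON" in spanning), bt)

def count_polygons(text, nchunks=4):
    n = max(1, len(text) // nchunks)
    chunks = [text[i:i + n] for i in range(0, len(text), n)]
    # map() here could be a parallel map; combine() is associative.
    head, count, tail = reduce(combine, map(chunk_state, chunks))
    count += "POLYGON" in head           # outermost fragments are complete
    if tail is not None:
        count += "POLYGON" in tail
    return count

data = "POLYGON((0 0,1 0,1 1))\nPOINT(2 2)\nPOLYGON((3 3,4 3,4 4))\n"
print(count_polygons(data))  # 2
```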

    On the classification and evaluation of prefetching schemes


    Characterisation of the ALICE TPC Readout Chip

    This thesis deals with the characterisation of the ALTRO chip (ALICE TPC Readout), an integral and important component of the readout chain of the TPC (Time Projection Chamber) detector of ALICE (A Large Ion Collider Experiment). ALICE is an experiment at the LHC (Large Hadron Collider) at CERN, still under construction, whose central aim is the study of heavy-ion collisions. These are of particular interest because they provide experimental access to the QGP (quark-gluon plasma), the only phase transition predicted by the Standard Model that can be reached under laboratory conditions. In 2004, measurements were carried out at a test beam at the CERN PS (Proton Synchrotron). The prototype was fully equipped with FECs, corresponding to 5,400 channels, and was filled with a different gas mixture (Ne/N2/CO2 90%/5%/5%). For optimal performance of the ALICE TPC, the digital processor in the ALTRO, consisting of four processing units, must be configured with suitable values. The data flow begins with the BCS1 (Baseline Correction and Subtraction 1) module, which removes systematic perturbations and the baseline. Since the ALTRO samples the incoming signal continuously, it automatically removes slow baseline drifts, which can arise, for example, from temperature changes. It is followed by the TCF (Tail Cancellation Filter), which removes the tail of the slowly falling signal generated by the PASA. To remove non-systematic baseline perturbations, the BCS2 (Baseline Correction and Subtraction 2) follows; it is based on a moving-average calculation that excludes detector signals by means of a double threshold. The final signal-processing unit is the ZSU (Zero Suppression Unit), which removes samples below a defined threshold. This thesis describes how the TCF and BCS1 parameters can be extracted from existing detector data. During the analysis of cosmic-ray data, an additional structure in the tail was noticed for signals of high amplitude (>700 ADC counts). The monitor was therefore extended with a moving-average filter, whereupon this structure also appeared in smaller signals (>200 ADC counts). This signal is produced by ions drifting towards the cathode or the pads; until now, however, neither the spread of the electron avalanche at the anode nor the variation in the generated electron avalanches had been understood or measured. A successful measurement and characterisation is described in this thesis. Installation of the TPC gas chambers in ALICE begins in the summer of 2005, with the electronics following at the end of that year. In parallel, the TPC prototype was recommissioned, and in spring a complete sector will be equipped with the detector electronics. On these two setups the ALTRO characterisation will be continued, refined and completed.
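
    The actual filter structures and register values are given in the thesis rather than the abstract; the Python stand-ins below only illustrate the order and role of the four stages (BCS1, TCF, BCS2, ZSU), with the coefficients and the single-pole tail model invented for illustration.

```python
# Illustrative stand-ins for the ALTRO digital chain described above;
# the real chip's filter structures and coefficients are not shown in
# the abstract. Stage order matches it: BCS1 -> TCF -> BCS2 -> ZSU.

def bcs1(samples, pedestal):
    """Baseline Correction 1: subtract a known per-channel pedestal."""
    return [s - pedestal for s in samples]

def tcf(samples, k=0.3):
    """Tail cancellation, sketched as a single-pole correction
    (the ALTRO uses a more elaborate filter than this)."""
    out, tail = [], 0.0
    for s in samples:
        corrected = s - tail
        tail = tail * (1 - k) + max(corrected, 0) * k * 0.1
        out.append(corrected)
    return out

def bcs2(samples, window=8, threshold=20):
    """Baseline Correction 2: moving average that excludes samples
    above a threshold, so real pulses do not drag the baseline."""
    out, history = [], []
    for s in samples:
        base = sum(history) / len(history) if history else 0.0
        out.append(s - base)
        if abs(s - base) < threshold:
            history = (history + [s])[-window:]
    return out

def zsu(samples, threshold=5):
    """Zero suppression: drop samples below a defined threshold."""
    return [(i, s) for i, s in enumerate(samples) if s > threshold]

raw = [51, 50, 52, 50, 300, 180, 90, 60, 53, 51, 50]   # toy ADC trace
print(zsu(bcs2(tcf(bcs1(raw, pedestal=50)))))           # pulse survives
```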

    Data Management for Dynamic Multimedia Analytics and Retrieval

    Multimedia data in its various manifestations poses a unique challenge from a data storage and data management perspective, especially if search, analysis and analytics in large data corpora are considered. The inherently unstructured nature of the data itself, and the curse of dimensionality that afflicts the representations we typically work with in its stead, cause a broad range of issues that require sophisticated solutions at different levels. This has given rise to a huge corpus of research focused on techniques for effective and efficient multimedia search and exploration, many of which have led to an array of purpose-built multimedia search systems. However, recent progress in multimedia analytics and interactive multimedia retrieval has demonstrated that several of the assumptions usually made for such multimedia search workloads do not hold once a session has a human user in the loop. Firstly, many of the required query operations cannot be expressed by mere similarity search, and since the concrete requirements cannot always be anticipated, one needs a flexible and adaptable data management and query framework. Secondly, the widespread assumption that data collections are static does not hold for analytics workloads, whose purpose is to produce and store new insights and information. And finally, it is impossible even for an expert user to specify exactly how a data management system should produce and arrive at the desired outcomes of the potentially many different queries. Guided by these shortcomings, and motivated by the fact that similar questions have once been answered for structured data in classical database research, this thesis presents three contributions that seek to mitigate the aforementioned issues. We present a query model that generalises the notion of proximity-based query operations and formalises the connection between those queries and high-dimensional indexing. We complement this with a cost model that makes the often implicit trade-off between query execution speed and result quality transparent to the system and the user. And we describe a model for the transactional and durable maintenance of high-dimensional index structures. All contributions are implemented in the open-source multimedia database system Cottontail DB, on top of which we present an evaluation that demonstrates the effectiveness of the proposed models. We conclude by discussing avenues for future research in the quest to converge the fields of databases on the one hand and (interactive) multimedia retrieval and analytics on the other.
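
    Cottontail DB's query API is not shown in the abstract; the sketch below illustrates in generic Python (all names and data invented) the speed/quality trade-off that the described cost model makes explicit: a proximity query answered exactly by a full scan, or approximately and more cheaply by scanning a sample.

```python
# Minimal sketch of the speed/quality trade-off a cost model can make
# explicit; this is not Cottontail DB's API. A proximity (k-NN) query
# is answered exactly by a full scan or approximately from a sample,
# trading result quality for execution speed.
import heapq, math, random

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn(vectors, query, k, fraction=1.0):
    """fraction=1.0 -> exact full scan; fraction<1.0 -> cheaper scan
    over a random sample, which may miss true neighbours."""
    pool = vectors if fraction >= 1.0 else random.sample(
        vectors, max(k, int(len(vectors) * fraction)))
    return heapq.nsmallest(k, pool, key=lambda v: dist(v, query))

random.seed(0)
data = [[random.random() for _ in range(8)] for _ in range(10_000)]
q = [0.5] * 8
exact = knn(data, q, k=5)                 # full cost, exact result
cheap = knn(data, q, k=5, fraction=0.1)   # ~10% of the cost
print(dist(exact[0], q) <= dist(cheap[0], q))  # True: exact is never worse
```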