The proliferation of geospatial applications has tremendously increased the variety, velocity, and volume of spatial data that data stores have to manage. Traditional relational databases exhibit limitations in handling such big geospatial data, mainly due to their rigid schema requirements and limited scalability. Numerous NoSQL databases have emerged and actively serve as alternative data stores for big spatial data. This study presents a framework, called GeoYCSB, developed for benchmarking NoSQL databases with geospatial workloads. To develop GeoYCSB, we extend YCSB, the de facto benchmark framework for NoSQL systems, by integrating into its design architecture the new components necessary to support geospatial workloads. GeoYCSB supports both microbenchmarks and macrobenchmarks and facilitates the use of real datasets in both. It is extensible to evaluate any NoSQL database that supports spatial queries, using geospatial workloads performed on datasets of any geometric complexity. We use GeoYCSB to benchmark two leading document stores, MongoDB and Couchbase, and present the experimental results and analysis. Finally, we demonstrate the extensibility of GeoYCSB by including a new dataset consisting of complex geometries and using it to benchmark a system with a wide variety of geospatial queries: Apache Accumulo, a wide-column store, with the GeoMesa framework applied on top.
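As an illustration of the kind of spatial query such a workload exercises, the sketch below issues a spherical proximity query against MongoDB with pymongo. The database name, collection, field names, and coordinates are hypothetical examples, not GeoYCSB's actual workload definitions.

```python
# Sketch: the kind of spatial query a geospatial benchmark workload might
# issue against MongoDB. Names and coordinates are hypothetical.
from pymongo import MongoClient, GEOSPHERE

client = MongoClient("mongodb://localhost:27017")
places = client["geo_bench"]["places"]

# MongoDB requires a 2dsphere index for spherical geometry queries.
places.create_index([("location", GEOSPHERE)])

# Spatial proximity query: all documents within ~1 km of a point.
nearby = places.find({
    "location": {
        "$nearSphere": {
            "$geometry": {"type": "Point", "coordinates": [-73.99, 40.73]},
            "$maxDistance": 1000,  # meters
        }
    }
})
for doc in nearby:
    print(doc["_id"])
```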
Writing correct and performant low-level systems code is a notoriously demanding job, even for experienced developers. To make matters worse, formally reasoning about its correctness properties introduces yet another level of complexity to the task, requiring considerable expertise in both systems programming and formal verification. Without appropriate tools that provide abstraction and automation, development can be extremely costly due to the sheer complexity of these systems and the nuances in them.
Cogent is designed to alleviate the burden on developers when writing and verifying systems code. It is a high-level functional language with a certifying compiler, which automatically proves the correctness of the compiled code and also provides a purely functional abstraction of the low-level program to the developer. Equational reasoning techniques can then be used to prove functional correctness properties of the program on top of this abstract semantics, which is notably less laborious than directly verifying the C code.
To make Cogent a more approachable and effective tool for developing real-world systems, we further strengthen the framework by extending the core language and its ecosystem. Specifically, we enrich the language to allow users to control the memory representation of algebraic data types, while retaining the automatic proof via a data layout refinement calculus. We repurpose existing tools in a novel way and develop an intuitive foreign function interface, which provides users with a seamless experience when using Cogent in conjunction with native C. We augment the Cogent ecosystem with a property-based testing framework, which helps developers better understand the impact formal verification has on their programs and enables a progressive approach to producing high-assurance systems. Finally, we explore refinement type systems, which we plan to incorporate into Cogent for more expressiveness and better integration of systems programmers with the verification process.
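To give a flavor of property-based testing in this style, the sketch below checks a refinement-flavored property: a low-level implementation must agree with an abstract specification on randomly generated inputs. It uses Python's hypothesis library purely for illustration; the functions are hypothetical and this is not Cogent's actual testing API, which is QuickCheck-based.

```python
# Property-based testing sketch: check that an implementation refines an
# abstract specification on generated inputs. Illustrative only.
from hypothesis import given, strategies as st

def spec_sum(xs):
    # Abstract, obviously-correct specification.
    return sum(xs)

def impl_sum(xs):
    # "Low-level" implementation under test (stand-in for compiled code).
    acc = 0
    i = 0
    while i < len(xs):
        acc += xs[i]
        i += 1
    return acc

@given(st.lists(st.integers()))
def test_impl_refines_spec(xs):
    # Refinement property: implementation agrees with the specification.
    assert impl_sum(xs) == spec_sum(xs)

test_impl_refines_spec()  # runs the property over many generated lists
```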
Although existing techniques have proposed automated approaches to alleviate the path explosion problem of symbolic execution, users still need to optimize symbolic execution by carefully applying various search strategies. Because existing approaches mainly support only coarse-grained global search strategies, they cannot efficiently traverse complex code structures. In this paper, we propose Eunomia, a symbolic execution technique that allows users to specify local domain knowledge to enable fine-grained search. In Eunomia, we design an expressive DSL, Aes, that lets users precisely assign local search strategies to different parts of the target program. To further optimize local search strategies, we design an interval-based algorithm that automatically isolates the context of variables for different local search strategies, avoiding conflicts between local search strategies over the same variable. We implement Eunomia as a symbolic execution platform targeting WebAssembly, which enables us to analyze applications written in various languages (such as C and Go) that can be compiled to WebAssembly. To the best of our knowledge, Eunomia is the first symbolic execution engine that supports the full feature set of the WebAssembly runtime. We evaluate Eunomia with a dedicated microbenchmark suite for symbolic execution and six real-world applications. Our evaluation shows that Eunomia accelerates bug detection in real-world applications by up to three orders of magnitude. According to the results of a comprehensive user study, users can significantly improve the efficiency and effectiveness of symbolic execution by writing a simple and intuitive Aes script. Besides verifying six known real-world bugs, Eunomia also detected two new zero-day bugs in a popular open-source project, Collections-C.
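The following Python sketch is a hypothetical illustration of the core idea, a global search strategy that can be locally overridden for user-designated functions. It is not Eunomia's engine, and the Aes DSL itself is not shown.

```python
# Hypothetical sketch of local search strategies layered over a global one:
# a prioritized worklist whose ordering is overridden for states currently
# executing inside user-designated functions. Illustrative only; neither
# Eunomia's engine nor its Aes DSL.
import heapq
import itertools
from dataclasses import dataclass

@dataclass
class State:
    depth: int = 0
    current_function: str = "main"

    def step(self):
        # A real engine would execute one instruction and fork on symbolic
        # branches; this placeholder terminates immediately.
        return []

# Local strategies (in Eunomia these would be written in Aes):
# function name -> priority function over states.
LOCAL_STRATEGIES = {
    "parse_header": lambda s: -s.depth,  # locally prefer deeper states (DFS-like)
}

def priority(state):
    local = LOCAL_STRATEGIES.get(state.current_function)
    return local(state) if local else state.depth  # global default: BFS-like

def explore(initial):
    tie = itertools.count()  # stable tie-breaker for the heap
    worklist = [(priority(initial), next(tie), initial)]
    while worklist:
        _, _, state = heapq.heappop(worklist)
        for succ in state.step():
            heapq.heappush(worklist, (priority(succ), next(tie), succ))

explore(State())
```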
This thesis discusses an application that shows users the names of, and information about, the places surrounding them, and in particular analyzes the technologies that would make it possible to build. We first present the application, analyze similar existing ones, and observe the usefulness of smartphone applications for tourism. It is useful to build on the 3D work Google has carried out over time, of which Google Earth is one example. We therefore examine how Earth was created and then the development APIs that have been built on top of this technology, which can be described as a 3D mapping of the world. We then discuss WebGL, an API for creating and rendering three-dimensional graphics in web browsers, which in 2021 was integrated with Google Maps to enable three-dimensional visualization, and move on to AR Core, which provides the tools to blend reality and augmented reality. Having analyzed these technologies, we finally discuss how to integrate them.
In the realm of software applications in the transportation industry, Domain-Specific Languages (DSLs) have enjoyed widespread adoption due to their ease of use and various other benefits. With the ceaseless progress in computer performance and the rapid development of large-scale models, the possibility of programming in natural language for specific applications, referred to as Application-Specific Natural Language (ASNL), has emerged. ASNL exhibits greater flexibility and freedom, which in turn leads to an increase in computational complexity for parsing and a decrease in processing performance. To tackle this issue, this paper proposes a design for an intermediate representation (IR) that caters to ASNL and uniformly transforms transportation data into a graph data format, improving data processing performance. Experimental comparisons reveal that in standard data query operations, the proposed IR design can achieve a speedup of more than forty times compared with direct use of standard XML-format data.
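To make the design rationale concrete, the sketch below shows, in plain Python with a hypothetical schema (it is not the paper's IR), why a graph representation speeds up queries: the XML is parsed once into an adjacency structure, after which each connectivity query is a dictionary lookup rather than a scan of the XML document.

```python
# Illustrative sketch: convert XML transportation records into a graph so
# that queries become cheap lookups. The XML schema here is hypothetical.
import xml.etree.ElementTree as ET
from collections import defaultdict

XML = """
<network>
  <link from="StationA" to="StationB" km="12"/>
  <link from="StationB" to="StationC" km="7"/>
</network>
"""

def build_graph(xml_text):
    graph = defaultdict(list)  # adjacency list: station -> [(neighbor, km)]
    for link in ET.fromstring(xml_text).iter("link"):
        graph[link.get("from")].append((link.get("to"), float(link.get("km"))))
    return graph

graph = build_graph(XML)
# Query: direct connections from StationB -- one dictionary lookup,
# instead of re-scanning the whole XML document per query.
print(graph["StationB"])  # [('StationC', 7.0)]
```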
The newly founded company Oceanbox is creating a novel oceanographic forecasting system to provide oceanography as a service. These services use mathematical models that generate large hydrodynamic data sets as unstructured triangular grids with high-resolution model areas. Oceanbox makes the model results accessible in a web application. New visualizations are needed to accommodate land-masking and large data volumes.
In this thesis, we propose using a k-d tree to spatially partition unstructured triangular grids to provide the look-up times needed for interactive visualizations. The k-d tree is implemented in F# as a library called FsKDTree. This thesis also describes the implementation of dynamic tiling map layers to visualize current barbs, scalar fields, and particle streams. The current barb layer queries data from the data server with the help of the k-d tree and displays it in the browser. Scalar fields and particle streams are implemented using WebGL, which enables the rendering of triangular grids. Stream particle visualization effects are implemented as velocity advection computed on the GPU with textures.
The new visualizations are used in Oceanbox's production systems, and spatial indexing has been integrated into Oceanbox's archive retrieval system. FsKDTree improves tree creation times by up to 4x and search times by up to 13x over the equivalent .NET C# implementation. Finally, current barbs, scalar fields, and particle stream visualizations run at 60 FPS even for the largest model areas provided by the service.
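The sketch below illustrates the role the k-d tree plays: given the (x, y) nodes of an unstructured grid, answer "which points fall in this region?" fast enough for interactive use. It uses SciPy's cKDTree purely for illustration; the thesis implements its own tree in F# (FsKDTree).

```python
# Sketch of k-d tree spatial look-up over grid nodes, the kind of query
# backing an interactive current-barb layer. SciPy used for illustration.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
nodes = rng.uniform(0.0, 100.0, size=(1_000_000, 2))  # grid node coordinates

tree = cKDTree(nodes)  # O(n log n) build; queries are logarithmic on average

# All nodes within radius 5 of a viewport center.
idx = tree.query_ball_point([50.0, 50.0], r=5.0)
print(len(idx), "nodes in range")
```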
Edge computing enables data processing and storage closer to where the data are created. Given the largely distributed compute environment and the significantly dispersed data distribution, there are increasing demands for data sharing and collaborative processing on the edge. Since data shuffling can dominate the overall execution time of collaborative processing jobs, and considering the limited power supply and bandwidth in edge environments, it is crucial and valuable to reduce the communication overhead across edge devices. Compared with data compression, compact data structures (CDS) seem more suitable in this case because they allow data to be queried, navigated, and manipulated directly in compact form. However, existing work on applying CDS to edge computing generally focuses on the intuitive benefit of reduced data size, offers little discussion of the challenges, and rarely includes empirical investigations of real-world edge use cases. This research highlights the challenges, opportunities, and potential scenarios of CDS implementation in edge computing. Driven by the use case of shuffling-intensive data analytics, we propose a three-layer architecture for CDS-aided data processing and particularly study the feasibility and efficiency of the CDS layer. We expect this research to foster conjoint research efforts on CDS-aided edge data analytics and to make wider practical impacts.
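As a minimal illustration of what "queried directly in compact form" means, the toy sketch below implements a bit vector with a precomputed rank directory: rank queries are answered on the compact structure itself, with no decompression step. Production CDS libraries are far more refined; this example is ours, not the paper's.

```python
# Toy compact data structure: a bit vector with a block-level rank
# directory, so rank1(i) avoids scanning the whole prefix.
class RankBitVector:
    BLOCK = 64

    def __init__(self, bits):
        self.bits = bits
        # Directory: cumulative popcount at each block boundary.
        self.ranks = [0]
        for i in range(0, len(bits), self.BLOCK):
            self.ranks.append(self.ranks[-1] + sum(bits[i:i + self.BLOCK]))

    def rank1(self, i):
        """Number of 1-bits in bits[0:i]."""
        block, offset = divmod(i, self.BLOCK)
        start = block * self.BLOCK
        return self.ranks[block] + sum(self.bits[start:start + offset])

bv = RankBitVector([1, 0, 1, 1, 0] * 100)
print(bv.rank1(10))  # 1-bits among the first 10 bits -> 6
```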
This paper presents a comprehensive investigation of existing feature extraction tools for symbolic music and compares their performance to determine the set of features that best characterizes the musical style of a given music score. In this regard, we propose a novel feature extraction tool, named musif, and evaluate its efficacy on various repertoires and file formats, including MIDI, MusicXML, and **kern. Musif approaches existing tools such as jSymbolic and music21 in terms of computational efficiency while attempting to enhance usability for custom feature development. The proposed tool also enhances classification accuracy when combined with other sets of features. We demonstrate the contribution of each set of features and the computational resources they require. Our findings indicate that the optimal tool for feature extraction is a combination of the best features from each tool rather than those of a single one. To facilitate future research in music information retrieval, we release the source code of the tool and benchmarks.
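For readers unfamiliar with symbolic feature extraction, the sketch below computes two simple style features with music21, one of the tools compared in the paper. The score path is a placeholder, and musif itself exposes its own, different API.

```python
# Two toy style features extracted with music21 (one of the compared tools).
# The file path is a placeholder; musif's own API differs from this.
from collections import Counter
from music21 import converter

score = converter.parse("score.musicxml")  # also accepts MIDI and **kern
notes = list(score.flatten().notes)

# Feature 1: pitch-class histogram (chords contribute all their pitches).
pitch_classes = Counter(p.pitchClass for n in notes for p in n.pitches)

# Feature 2: mean note duration in quarter lengths.
mean_duration = sum(n.quarterLength for n in notes) / max(len(notes), 1)

print(dict(pitch_classes))
print("mean duration (quarter lengths):", mean_duration)
```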
We present an overview of our work on the SAP HANA Scale-out Extension, a novel distributed database architecture designed to support large-scale analytics over real-time data. The platform permits high-performance OLAP with massive scale-out capabilities while concurrently allowing OLTP workloads. This dual capability enables analytics over real-time changing data and allows fine-grained, user-specified service level agreements (SLAs) on data freshness. We advocate the decoupling of core database components such as query processing, concurrency control, and persistence, a design choice made possible by advances in high-throughput, low-latency networks and storage devices. We provide full ACID guarantees and build on a logical timestamp mechanism to provide MVCC-based snapshot isolation without requiring synchronous updates of replicas. Instead, we use asynchronous update propagation, guaranteeing consistency with timestamp validation. We provide a view into the design and development of a large-scale data management platform for real-time analytics, driven by the needs of modern enterprise customers.
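To illustrate the timestamp mechanism described above, here is a minimal sketch, entirely ours rather than HANA's code, of snapshot-isolation reads over a version chain: a reader sees the newest version whose logical commit timestamp does not exceed its snapshot timestamp. The same rule is what lets an asynchronously updated replica serve the read once its replay has reached that timestamp.

```python
# Minimal MVCC snapshot-read sketch over a logical-timestamp version chain.
# Illustrative only; not SAP HANA's implementation.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Version:
    value: str
    commit_ts: int                     # logical commit timestamp
    older: Optional["Version"] = None  # next-older version in the chain

def snapshot_read(chain: Optional[Version], snapshot_ts: int) -> Optional[str]:
    # Walk newest-to-oldest; return the newest version visible at the
    # reader's snapshot. No locks, no coordination with writers.
    v = chain
    while v is not None:
        if v.commit_ts <= snapshot_ts:
            return v.value
        v = v.older
    return None

# "b" committed at ts=20 on top of "a" committed at ts=10.
chain = Version("b", 20, Version("a", 10))
print(snapshot_read(chain, 15))  # a reader at snapshot_ts=15 sees "a"
print(snapshot_read(chain, 25))  # a reader at snapshot_ts=25 sees "b"
```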