Securely Outsourcing Large Scale Eigen Value Problem to Public Cloud
Cloud computing enables clients with limited computational power to
economically outsource their large-scale computations to a public cloud with
vast computational power. The cloud offers massive storage, computational
power, and software that clients can use to reduce their computational
overhead and storage limitations. When outsourcing, however, the privacy of a
client's confidential data must be maintained. We have designed a protocol for
outsourcing the large-scale eigenvalue problem to a malicious cloud that
provides input/output data security, result verifiability, and client-side
efficiency. Because direct computation of all eigenvectors is computationally
expensive for large dimensionality, we use the power iteration method to find
the largest eigenvalue and the corresponding eigenvector of a matrix. To
protect privacy, transformations are applied to the input matrix to obtain an
encrypted matrix, which is sent to the cloud; the result returned by the cloud
is then decrypted to recover the correct solution of the eigenvalue problem.
We also propose a result-verification mechanism for detecting cheating, and we
provide theoretical analysis and experimental results demonstrating the high
efficiency, correctness, security, and robust cheating resistance of the
proposed protocol.
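The abstract names power iteration as its computational building block. The privacy transformation itself is not specified in the abstract, so the sketch below shows only plain power iteration for the dominant eigenpair; the function and parameter names are illustrative, not taken from the paper.

```python
import random

def power_iteration(matvec, n, iters=200, tol=1e-10):
    """Estimate the dominant eigenvalue/eigenvector of an n x n matrix.

    matvec: function computing A @ v for a length-n vector v (list of floats).
    Returns (eigenvalue, eigenvector) after convergence or `iters` steps.
    """
    random.seed(0)
    v = [random.random() for _ in range(n)]  # random start vector
    for _ in range(iters):
        w = matvec(v)
        norm = max(abs(x) for x in w) or 1.0  # infinity-norm scaling
        w = [x / norm for x in w]
        if max(abs(a - b) for a, b in zip(w, v)) < tol:
            v = w
            break
        v = w
    # Rayleigh-quotient estimate of the dominant eigenvalue
    av = matvec(v)
    lam = sum(a * b for a, b in zip(av, v)) / sum(b * b for b in v)
    return lam, v
```

For a diagonal matrix diag(2, 1), the iteration converges to the eigenvector along the first axis and an eigenvalue estimate near 2. Repeated matrix-vector products are exactly the work a client would ship to a cloud after masking the matrix.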
Concurrency-Enhancing Transformations for Asynchronous Behavioral Specifications
State-of-the-art synthesis tools for the design of asynchronous systems rely on syntax-driven translation of behavioral specifications. While these tools provide the benefit of rapid design, they are severely limited in the performance of their resulting implementations (e.g., 10-100 MHz). This research proposes a synthesis approach that builds upon the existing state-of-the-art tools, preserving rapid design times and allowing for an order of magnitude increase in performance. In particular, this thesis proposes a powerful approach to enhance the concurrency of the original behavioral specifications. The proposed approach is a “source-to-source” transformation of the original behavioral specification into a new behavioral specification using two specific optimizations: automatic parallelization and automatic pipelining. The approach has been implemented in an automated design tool and applied to a suite of examples for validation. All examples were synthesized to the gate level after optimization and compared with the original, non-optimized versions. Results indicate improvement in throughput by a factor of up to 23X and a reduction in latency by up to 72%.
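The throughput gain from automatic pipelining can be illustrated with a simple analytical model (my sketch, not the thesis tool): sequentially, each item traverses all stages before the next starts; pipelined, stages overlap and the slowest stage sets the cycle time.

```python
def pipeline_metrics(stage_delays, items):
    """Compare sequential vs. pipelined execution of the same stages.

    stage_delays: per-stage delay of the datapath
    items:        number of data items processed
    Returns (sequential_time, pipelined_time).
    """
    total = sum(stage_delays)       # latency of one item through all stages
    slowest = max(stage_delays)     # pipelined cycle time
    sequential = total * items
    pipelined = total + slowest * (items - 1)  # fill latency, then one item per cycle
    return sequential, pipelined
```

With stages of delay 2, 3, and 1 and ten items, the sequential design takes 60 time units while the pipeline takes 33, and the advantage grows with the number of items, which is the effect the abstract's throughput numbers reflect.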
Percolation-based compiling for evaluation of parallelism and hardware design trade-offs
This thesis investigates parallelism and hardware design trade-offs of parallel and pipelined architectures. To explore these trade-offs we developed a retargetable compiler based on a set of powerful code transformations called Percolation Scheduling (PS) that map programs with real-time constraints and/or massive time requirements onto synchronous, parallel, high-performance or semi-custom architectures.
High performance is achieved through extraction of the application's inherent fine-grain parallelism and the use of a suitable architecture. Exploiting fine-grain parallelism is a critical part of exploiting all of the parallelism available in a given program, particularly since highly irregular forms of parallelism are often not visible at coarser levels and since the use of low-level parallelism has a multiplicative effect on the overall performance.
To extract substantial parallelism from both the hardware and the compiler, we use a clean, highly parallel VLIW-like architecture that is synchronous, has multiple functional units, and has a single program counter. The use of a hazard-free and homogeneous architecture not only results in a better VLSI design but also considerably increases the compiler's ability to produce better code. To further enhance parallelism, we modified the uni-cycle VLIW model and extended the transformations so that pipelined units providing extra parallelism are used.
Another approach presented is resource-constrained scheduling (RCS). Since the RCS problem is known to be NP-hard, in practice it can be solved only by a heuristic approach. We argue that applying the heuristic after extraction of the unlimited-resources schedule may yield better results than applying it at the beginning of the scheduling process.
Through a series of benchmarks we evaluate hardware design trade-offs and show that speedups of one order of magnitude on average are feasible with sufficient functional units. However, when resources are limited, we show that the number of functional units needed may be optimized for a particular suite of application programs.
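The resource-constrained step the abstract argues for can be sketched as greedy list scheduling: repack a dependence graph into cycles while issuing at most as many operations per cycle as there are functional units. This is a minimal illustration of the idea, not the thesis's PS-based algorithm; the function and its priority rule are assumptions.

```python
def list_schedule(ops, deps, units):
    """Greedy list scheduling with at most `units` ops issued per cycle.

    ops:   iterable of operation ids
    deps:  dict mapping an op to the set of ops that must finish in an
           earlier cycle (the dependence graph must be acyclic)
    units: number of functional units
    Returns a dict mapping each op to its assigned cycle.
    """
    remaining = set(ops)
    cycle_of = {}
    cycle = 0
    while remaining:
        # an op is ready once all its predecessors occupy earlier cycles
        ready = [o for o in remaining
                 if all(d in cycle_of and cycle_of[d] < cycle
                        for d in deps.get(o, ()))]
        for o in sorted(ready)[:units]:  # deterministic tie-break
            cycle_of[o] = cycle
            remaining.discard(o)
        cycle += 1
    return cycle_of
```

With four ops where 'c' depends on 'a' and 'd' on 'b', two units finish in two cycles while one unit needs four, showing how the unit count trades hardware for schedule length.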
The Dafny Integrated Development Environment
In recent years, program verifiers and interactive theorem provers have
become more powerful and more suitable for verifying large programs or proofs.
This has demonstrated the need for improving the user experience of these tools
to increase productivity and to make them more accessible to non-experts. This
paper presents an integrated development environment for Dafny, a programming
language, verifier, and proof assistant, that addresses issues present in most
state-of-the-art verifiers: low responsiveness and lack of support for
understanding non-obvious verification failures. The paper demonstrates several
new features that move the state of the art closer to a verification
environment that can provide verification feedback as the user types and can
present more helpful information about the program or failed verifications in a
demand-driven and unobtrusive way. (In Proceedings F-IDE 2014, arXiv:1404.578)
Effectiveness of abstract interpretation in automatic parallelization: a case study in logic programming
We report on a detailed study of the application and effectiveness of program analysis based on abstract interpretation to automatic program parallelization. We study the case of parallelizing logic programs using the notion of strict independence. We first propose and prove correct a methodology for the application in the parallelization task of the information inferred by abstract
interpretation, using a parametric domain. The methodology is generic in the sense of allowing the use of different analysis domains. A number of well-known approximation domains are then studied and the transformation into the parametric domain defined. The transformation directly
illustrates the relevance and applicability of each abstract domain for the application. Both local and global analyzers are then built using these domains and embedded in a complete parallelizing compiler. Then, the performance of the domains in this context is assessed through a number
of experiments. A comparatively wide range of aspects is studied, from the resources needed by the analyzers in terms of time and memory to the actual benefits obtained from the information inferred. Such benefits are evaluated both in terms of the characteristics of the parallelized code and of the actual speedups obtained from it. The results show that data flow analysis plays an important role in achieving efficient parallelizations, and that the cost of such analysis can be reasonable even for quite sophisticated abstract domains. Furthermore, the results also offer significant insight into the characteristics of the domains, the demands of the application, and the
trade-offs involved.
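The strict-independence criterion the abstract builds on says that goals whose runtime variable sets are disjoint can safely run in parallel. A minimal sketch of that grouping step, assuming goals come annotated with their variable sets (the analyzer's job in the paper), might look like this; the function name and block-building heuristic are illustrative, not the paper's compiler.

```python
def parallelize(goals):
    """Group a sequence of goals into parallel blocks.

    Consecutive goals whose variable sets are pairwise disjoint
    (strictly independent) may run concurrently; a shared variable
    forces a new sequential block.

    goals: list of (name, frozenset_of_variable_names)
    Returns a list of blocks, each a list of goal names.
    """
    blocks = []  # each block: list of (name, vars) that run in parallel
    for name, vs in goals:
        if blocks and all(vs.isdisjoint(v) for _, v in blocks[-1]):
            blocks[-1].append((name, vs))   # independent of whole block
        else:
            blocks.append([(name, vs)])     # start a new sequential block
    return [[n for n, _ in b] for b in blocks]
```

For the conjunction p(X), q(Y), r(X), the goals p and q share no variables and form one parallel block, while r reuses X and must wait, yielding [[p, q], [r]]. The quality of the abstract-interpretation analysis determines how precisely those variable sets are known, which is exactly what the paper's experiments measure.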