145 research outputs found
Recommended from our members
Duplo: A framework for OCaml post-link optimisation
We present a novel framework,
Duplo
, for the low-level post-link optimisation of OCaml programs, achieving a speedup of 7% and a reduction of at least 15% of the code size of widely-used OCaml applications. Unlike existing post-link optimisers, which typically operate on target-specific machine code, our framework operates on a Low-Level Intermediate Representation (LLIR) capable of representing both the OCaml programs and any C dependencies they invoke through the foreign-function interface (FFI). LLIR is analysed, transformed and lowered to machine code by our post-link optimiser, LLIR-OPT. Most importantly, LLIR allows the optimiser to cross the OCaml-C language boundary, mitigating the overhead incurred by the FFI and enabling analyses and transformations in a previously unavailable context. The optimised IR is then lowered to amd64 machine code through the existing target-specific code generator of LLVM, modified to handle garbage collection just as effectively as the native OCaml backend. We equip our optimiser with a suite of SSA-based transformations and points-to analyses capable of capturing the semantics and representing the memory models of both languages, along with a cross-language inliner to embed C methods into OCaml callers. We evaluate the gains of our framework, which can be attributed to both our optimiser and the more sophisticated amd64 backend of LLVM, on a wide-range of widely-used OCaml applications, as well as an existing suite of micro- and macro-benchmarks used to track the performance of the OCaml compiler.
EPSRC EP/P020011/1, Cambridge Trust
Developing Secure Software With C And C++: A Different Approach
Tez (Yüksek Lisans) -- İstanbul Teknik Üniversitesi, Fen Bilimleri Enstitüsü, 2005Thesis (M.Sc.) -- İstanbul Technical University, Institute of Science and Technology, 2005Ağa bağlı bilgisayarlar yaygınlaştıkça, günlük işlerin yürütülmesinden devlet sistemlerinin otomasyonuna kadar her seviyede rol almaya başlamışlar ve bu sistemlerin güvenliği de kritik hal almıştır. Bilgi işlem sistemlerinin güvene layık olabilmesi için bütün bileşenlerinin güvenli olması gerekir, yazılım da bu bileşenlerden belki de en önemlisidir. Yazılımların, yaşam süreçlerinin bütün aşamalarında güvenli bir yapıyla sonuçlanacak şekilde tasarlanmaları gerekmektedir. Bu makale, bir yazılımın yaşam sürecini baştan sona ele almaktadır. Güvene layık bir yazılım için her aşamada, nelere dikkat edilmesi gerektiği anlatılmış, hangi tasarım seçeneklerinin olduğu sıralanmış, farklı metotlardan hangilerinin izlenmesinin daha iyi olacağı tartışılmış ve hangi araçların kullanılabileceği incelenmiştir. Bu sayede geliştirme veya bakım gibi değişik aşamalardaki projelere referans kaynağı olarak hizmet verebilmektedir. Bu makalede ele alınan yaşam süreci, yazılım mühendisliğinde sıklıkla başvuru olarak kullanılan, süreci isteklerin tanımı, tasarım, geliştirme, kontrol etme ve bakım olarak bölümleyen “Şelale Yaşam Süreci”dir. Yeni nesil programlama dilleri çıktıkça, C/C++ ve Birleştirici gibi düşük seviye dillerin yeni öğrencilerce benimsenmesi azalmaktadır. Buna ve başka sebeplere de bağlı olarak bu dillerde tecrübeli eleman eksikliği baş gösterdikçe, zaten güvenliğin sağlanmasının göreceli olarak daha zor olduğu bu ortamlarda ciddi güvenlik açıkları oluşmaktadır. Dünya üzerindeki kod tabanının çoğunluğunun halen bu dillerden oluşması durumu daha kritik yapmaktadır. Bu makalede bahsedilen konuların çoğunluğu dilden bağımsız olsa da, ilgili bölümlerde, az önce bahsedilen sorunu göz önüne alarak C/C++ ve Birleştirici dilleri üstünde durulmuştur. Sonuç olarak, yazılım güvenliğinin etkin olarak sağlanabilmesi için, güvenliğin bütün yaşam süreci evrelerinde ele alınması gerekliliği gösterilmiştir. Ayrıca, yaşam sürecinin aşamalarından bir çoğuna, daha önce bu kapsamda uygulanmamış olan yeni yöntemler önerilmiştir.As networked computing penetrates daily life more and more, it becomes more common in every level from daily life to automation of government systems. In order computing systems to be secure, each and every of their components must be secure, too. Software is most important component among those. Each phase of software lifecycle must be implemented in a secure fashion. This thesis is inspecting lifecycle of software from beginning to the end and aligns the new ideas that it is bringing to the lifecycle. After giving necessary background information about the subject, new ideas have been presented, examples have been given and possible other options have been discussed. During explaining most of the subjects, the topics that is considered to be complimentary is either added or referred to. Thanks to that, this thesis can be a reference source to projects in different phases like implementation and maintenance. Waterfall lifecycle model, which is used frequently in software development projects and divides software projects into phases as analysis of requirements, design, implementation, verification and maintenance, is used as a template in this thesis. As new generations of programming languages emerge, adoption of low-level languages such as C/C++ and assembly by new students is decreasing. As lack of experienced staff shows up itself due to this and other causes, severe vulnerabilities are happening in such environments, where developing of secure software is already proven to be hard. The fact that majority of current code base in the world is in those languages makes the situation even more critical. Although most of the subjects in this thesis are programming language independent, C/C++ and assembler language problems are especially covered because of the reasons just mentioned. As a result, it has been shown that security countermeasures must be taken in all phases of software lifecycle in order to ensure high level of security throughout the application. Furthermore, new ideas of security countermeasures have been brought to many of the phases of software lifecycle.Yüksek LisansM.Sc
A Type System for Julia
The Julia programming language was designed to fill the needs of scientific
computing by combining the benefits of productivity and performance languages.
Julia allows users to write untyped scripts easily without needing to worry
about many implementation details, as do other productivity languages. If one
just wants to get the work done-regardless of how efficient or general the
program might be, such a paradigm is ideal. Simultaneously, Julia also allows
library developers to write efficient generic code that can run as fast as
implementations in performance languages such as C or Fortran. This combination
of user-facing ease and library developer-facing performance has proven quite
attractive, and the language has increasing adoption.
With adoption comes combinatorial challenges to correctness. Multiple
dispatch -- Julia's key mechanism for abstraction -- allows many libraries to
compose "out of the box." However, it creates bugs where one library's
requirements do not match what another provides. Typing could address this at
the cost of Julia's flexibility for scripting.
I developed a "best of both worlds" solution: gradual typing for Julia. My
system forms the core of a gradual type system for Julia, laying the foundation
for improving the correctness of Julia programs while not getting in the way of
script writers. My framework allows methods to be individually typed or
untyped, allowing users to write untyped code that interacts with typed library
code and vice versa. Typed methods then get a soundness guarantee that is
robust in the presence of both dynamically typed code and dynamically generated
definitions. I additionally describe protocols, a mechanism for typing
abstraction over concrete implementation that accommodates one common pattern
in Julia libraries, and describe its implementation into my typed Julia
framework.Comment: PhD thesi
Recommended from our members
Complete spatial safety for C and C++ using CHERI capabilities
Lack of memory safety in commonly used systems-level languages such as C and C++ results in a constant stream of new exploitable software vulnerabilities and exploit techniques. Many exploit mitigations have been proposed and deployed over the years, yet none address the root issue: lack of memory safety. Most C and C++ implementations assume a memory model based on a linear array of bytes rather than an object-centric view. Whilst more efficient on contemporary CPU architectures, linear addresses cannot encode the target object, thus permitting memory errors such as spatial safety violations (ignoring the bounds of an object). One promising mechanism to provide memory safety is CHERI
(Capability Hardware Enhanced RISC Instructions), which extends existing processor architectures with capabilities that provide hardware-enforced checks for all accesses and can be used to prevent spatial memory violations. This dissertation prototypes and evaluates a pure-capability programming model (using CHERI capabilities for all pointers) to provide complete spatial memory protection for traditionally unsafe languages.
As the first step towards memory safety, all language-visible pointers can be implemented as capabilities. I analyse the programmer-visible impact of this change and refine the pure-capability programming model to provide strong source-level compatibility with existing code. Second, to provide robust spatial safety, language-invisible pointers (mostly arising from program linkage) such as those used for functions calls and global variable accesses must also be protected. In doing so, I highlight trade-offs between performance and privilege minimization for implicit and programmer-visible pointers. Finally, I present
CheriSH, a novel and highly compatible technique that protects against buffer overflows between fields of the same object, hereby ensuring that the CHERI spatial memory protection is complete.
I find that the byte-granular spatial safety provided by CHERI pure-capability code is not only stronger than most other approaches, but also incurs almost negligible performance overheads in common cases (0.1% geometric mean) and a worst-case overhead of only 23.3% compared to the insecure MIPS baseline. Moreover, I show that the pure-capability programming model provides near-complete source-level compatibility with existing programs. I evaluate this based on porting large widely used open-source applications such as PostgreSQL and WebKit with only minimal changes: fewer than 0.1% of source lines.
I conclude that pure-capability CHERI C/C++ is an eminently viable programming environment offering strong memory protection, good source-level compatibility and low performance overheads
Doctor of Philosophy
dissertationCompilers are indispensable tools to developers. We expect them to be correct. However, compiler correctness is very hard to be reasoned about. This can be partly explained by the daunting complexity of compilers. In this dissertation, I will explain how we constructed a random program generator, Csmith, and used it to find hundreds of bugs in strong open source compilers such as the GNU Compiler Collection (GCC) and the LLVM Compiler Infrastructure (LLVM). The success of Csmith depends on its ability of being expressive and unambiguous at the same time. Csmith is composed of a code generator and a GTAV (Generation-Time Analysis and Validation) engine. They work interactively to produce expressive yet unambiguous random programs. The expressiveness of Csmith is attributed to the code generator, while the unambiguity is assured by GTAV. GTAV performs program analyses, such as points-to analysis and effect analysis, efficiently to avoid ambiguities caused by undefined behaviors or unspecifed behaviors. During our 4.25 years of testing, Csmith has found over 450 bugs in the GNU Compiler Collection (GCC) and the LLVM Compiler Infrastructure (LLVM). We analyzed the bugs by putting them into different categories, studying the root causes, finding their locations in compilers' source code, and evaluating their importance. We believe analysis results are useful to future random testers, as well as compiler writers/users
Exploring novel designs of NLP solvers: Architecture and Implementation of WORHP
Mathematical Optimization in general and Nonlinear Programming in particular, are applied by many scientific disciplines, such as the automotive sector, the aerospace industry, or the space agencies. With some established NLP solvers having been available for decades, and with the mathematical community being rather conservative in this respect, many of their programming standards are severely outdated. It is safe to assume that such usability shortcomings impede the wider use of NLP methods; a representative example is the use of static workspaces by legacy FORTRAN codes. This dissertation gives an account of the construction of the European NLP solver WORHP by using and combining software standards and techniques that have not previously been applied to mathematical software to this extent. Examples include automatic code generation, a consistent reverse communication architecture and the elimination of static workspaces. The result is a novel, industrial-grade NLP solver that overcomes many technical weaknesses of established NLP solvers and other mathematical software
Recommended from our members
Sandboxed, Online Debugging of Production Bugs for SOA Systems
Short time-to-bug localization is extremely important for any 24x7 service-oriented application. To this end, we introduce a new debugging paradigm called live debugging. There are two goals that any live debugging infrastructure must meet: Firstly, it must offer real-time insight for bug diagnosis and localization, which is paramount when errors happen in user-facing applications. Secondly, live debugging should not impact user-facing performance for normal events. In large distributed applications, bugs which impact only a small percentage of users are common. In such scenarios, debugging a small part of the application should not impact the entire system.
With the above-stated goals in mind, this thesis presents a framework called Parikshan, which leverages user-space containers (OpenVZ) to launch application instances for the express purpose of live debugging. Parikshan is driven by a live-cloning process, which generates a replica (called debug container) of production services, cloned from a production container which continues to provide the real output to the user. The debug container provides a sandbox environment, for safe execution of monitoring/debugging done by the users without any perturbation to the execution environment. As a part of this framework, we have designed customized-network proxies, which replicate inputs from clients to both the production and test-container, as well safely discard all outputs. Together the network duplicator, and the debug container ensure both compute and network isolation of the debugging environment. We believe that this piece of work provides the first of its kind practical real-time debugging of large multi-tier and cloud applications, without requiring any application downtime, and minimal performance impact
HW/SW mechanisms for instruction fusion, issue and commit in modern u-processors
In this thesis we have explored the co-designed paradigm to show alternative processor design points. Specifically, we have provided HW/SW mechanisms for instruction fusion, issue and commit for modern processors. We have implemented a co-designed virtual machine monitor that binary translates x86 instructions into RISC like micro-ops. Moreover, the translations are stored as superblocks, which are a trace of basic blocks. These superblocks are further optimized using speculative and non-speculative optimizations. Hardware mechanisms exists in-order to take corrective action in case of misspeculations. During the course of this PhD we have made following contributions.
Firstly, we have provided a novel Programmable Functional unit, in-order to speed up general-purpose applications. The PFU consists of a grid of functional units, similar to CCA, and a distributed internal register file. The inputs of the macro-op are brought from the Physical Register File to the internal register file using a set of moves and a set of loads. A macro-op fusion algorithm fuses micro-ops at runtime. The fusion algorithm is based on a scheduling step that indicates whether the current fused instruction is beneficial or not. The micro-ops corresponding to the macro-ops are stored as control signals in a configuration. The macro-op consists of a configuration ID which helps in locating the configurations. A small configuration cache is present inside the Programmable Functional unit, that holds these configurations. In case of a miss in the configuration cache configurations are loaded from I-Cache. Moreover, in-order to support bulk commit of atomic superblocks that are larger
than the ROB we have proposed a speculative commit mechanism. For this we have proposed a Speculative commit register map table that holds the mappings of the speculatively committed instructions. When all the instructions of the superblock have committed the speculative state is copied to Backend Register Rename Table.
Secondly, we proposed a co-designed in-order processor with with two kinds of accelerators. These FU based accelerators run a pair of fused instructions. We have considered two kinds of instruction fusion. First, we fused a pair of independent loads together into vector loads and execute them on vector load units. For the second kind of instruction fusion we have fused a pair of dependent simple ALU instructions and execute them in Interlock Collapsing ALUs (ICALU). Moreover, we have evaluated performance of various code optimizations such as list-scheduling, load-store telescoping and load hoisting among others. We have compared our co-designed processor with small instruction window out-of-order processors.
Thirdly, we have proposed a co-designed out-of-order processor. Specifically we have reduced complexity in two areas. First
of all, we have co-designed the commit mechanism, that enable bulk commit of atomic superblocks. In this solution we got rid of the conventional ROB, instead we introduce the Superblock Ordering Buffer (SOB). SOB ensures program order is maintained at the granularity of the superblock, by bulk committing the program state. The program state consists of the register state and the memory state. The register state is held in a per superblock register map table, whereas the memory state is held in gated store buffer and updated in bulk. Furthermore, we have tackled the complexity of Out-of-Order issue logic by using FIFOs. We have proposed an enhanced steering heuristic that fixes the inefficiencies of the existing dependence-based heuristic. Moreover, a mechanism to release the FIFO entries earlier is also proposed that further improves the performance of the steering heuristic.En aquesta tesis hem explorat el paradigma de les màquines issue i commit per processadors actuals. Hem implementat una màquina virtual que tradueix binaris x86 a micro-ops de tipus RISC. Aquestes traduccions es guarden com a superblocks, que en realitat no és més que una traça de virtuals co-dissenyades. En particular, hem proposat mecanismes hw/sw per a la fusió d’instruccions, blocs bàsics. Aquests superblocks s’optimitzen utilitzant optimizacions especualtives i d’altres no speculatives. En cas de les optimizations especulatives es consideren mecanismes per a la gestió de errades en l’especulació. Al llarg d’aquesta tesis s’han fet les següents contribucions:
Primer, hem proposat una nova unitat functional programmable (PFU) per tal de millorar l’execució d’aplicacions de proposit general. La PFU està formada per un conjunt d’unitats funcionals, similar al CCA, amb un banc de registres intern a la PFU distribuït a les unitats funcionals que la composen. Les entrades de la macro-operació que s’executa en la PFU es mouen del banc de registres físic convencional al intern fent servir un conjunt de moves i loads. Un algorisme de fusió combina més micro-operacions en temps d’execució. Aquest algorisme es basa en un pas de planificació que mesura el benefici de les decisions de fusió. Les micro operacions corresponents a la macro operació s’emmagatzemen com a senyals de control en una configuració. Les macro-operacions tenen associat un identificador de configuració que ajuda a localitzar d’aquestes. Una petita cache de configuracions està present dintre de la PFU per tal de guardar-les. En cas de que la configuració no estigui a la cache, les configuracions es carreguen de la cache d’instruccions. Per altre banda, per tal de donar support al commit atòmic dels superblocks que sobrepassen el tamany del ROB s’ha proposat un mecanisme de commit especulatiu. Per aquest mecanisme hem proposat una taula de mapeig especulativa dels registres, que es copia a la taula no especulativa quan totes les instruccions del superblock han comitejat.
Segon, hem proposat un processador en order co-dissenyat que combina dos tipus d’acceleradors. Aquests acceleradors executen un parell d’instruccions fusionades. S’han considerat dos tipus de fusió d’instructions. Primer, combinem un parell de loads independents formant loads vectorials i els executem en una unitat vectorial. Segon, fusionem parells d’instruccions simples d’alu que són dependents i que s’executaran en una Interlock Collapsing ALU (ICALU). Per altra aquestes tecniques les hem evaluat conjuntament amb diverses optimizacions com list scheduling, load-store telescoping i hoisting de loads, entre d’altres. Aquesta proposta ha estat comparada amb un processador fora d’ordre.
Tercer, hem proposat un processador fora d’ordre co-dissenyat efficient reduint-ne la complexitat en dos areas principals. En primer lloc, hem co-disenyat el mecanisme de commit per tal de permetre un eficient commit atòmic del superblocks. En aquesta solució hem substituït el ROB convencional, i en lloc hem introduït el Superblock Ordering Buffer (SOB). El SOB manté l’odre de programa a granularitat de superblock. L’estat del programa consisteix en registres i memòria. L’estat dels registres es manté en una taula per superblock, mentre que l’estat de memòria es guarda en un buffer i s’actulitza atòmicament. La segona gran area de reducció de complexitat considerarada és l’ús de FIFOs a la lògica d’issue. En aquest últim àmbit hem proposat una heurística de distribució que solventa les ineficiències de l’heurística basada en dependències anteriorment proposada. Finalment, i junt amb les FIFOs, s’ha proposat un mecanisme per alliberar les entrades de la FIFO anticipadament
Recommended from our members
Programming an interim report on the SETL project. Part I: generalities. Part II: the SETL language and examples of its use
A summary of work during the past several years on SETL, a new programming language drawing its dictions and basic concepts from the mathematical theory of sets, is presented. The work was started with the idea that a programming language modeled after an appropriate version of the formal language of mathematics might allow a programming style with some of the succinctness of mathematics, and that this might ultimately enable one to express and experiment with more complex algorithms than are now within reach. Part I discusses the general approach followed in the work. Part II focuses directly on the details of the SETL language as it is now defined. It describes the facilities of SETL, includes short libraries of miscellaneous and of code optimization algorithms illustrating the use of SETL, and gives a detailed description of the manner in which the set-theoretic primitives provided by SETL are currently implemented. (RWR
- …