
883 research outputs found

The FermiFab Toolbox for Fermionic Many-Particle Quantum Systems

Author: Anderson
Ando
Christian B. Mendl
Coleman
Dirac
Friesecke
Friesecke
Hohenberg
Intel
Intel
Intel
Kohn
Loewdin
Mazziotti
Mendl
Mendl
Peskin
von Neumann
Publication venue: 'Elsevier BV'
Publication date: 04/03/2011
Field of study

This paper introduces the FermiFab toolbox for many-particle quantum systems. It is mainly concerned with the representation of (symbolic) fermionic wavefunctions and the calculation of corresponding reduced density matrices (RDMs). The toolbox transparently handles the inherent antisymmetrization of wavefunctions and incorporates the creation/annihilation formalism. Thus, it aims at providing a solid base for a broad audience to use fermionic wavefunctions with the same ease as matrices in Matlab, say. Leveraging symbolic computation, the toolbox can greatly simplify tedious pen-and-paper calculations for concrete quantum mechanical systems, and serves as a "sandbox" for theoretical hypothesis testing. FermiFab (including full source code) is freely available as a plugin for both Matlab and Mathematica. Comment: 17 pages, 5 figures
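The antisymmetrization mentioned in the abstract can be illustrated with a minimal sketch of a creation operator acting on occupation-number basis states. This is not FermiFab's API: the names `create` and `create_many`, the bitmask encoding, and the sign convention (counting occupied orbitals with a lower index) are assumptions chosen for illustration.

```python
def create(orbital, state):
    """Apply the creation operator for `orbital` to an occupation bitmask.

    Returns (sign, new_state), or None when the orbital is already occupied
    (Pauli exclusion).
    """
    if (state >> orbital) & 1:
        return None
    # Sign convention (an assumption): count occupied orbitals with a lower index.
    occupied_below = bin(state & ((1 << orbital) - 1)).count("1")
    sign = -1 if occupied_below % 2 else 1
    return sign, state | (1 << orbital)


def create_many(orbitals, state=0):
    """Apply creation operators right to left, as in a_i^dagger a_j^dagger |state>."""
    sign = 1
    for orbital in reversed(orbitals):
        result = create(orbital, state)
        if result is None:
            return None
        step_sign, state = result
        sign *= step_sign
    return sign, state


if __name__ == "__main__":
    print(create_many([0, 1]))
    print(create_many([1, 0]))
    print(create_many([1, 1]))
```

Swapping the order of the two operators flips the sign while producing the same occupied state, and creating the same orbital twice yields `None`, which is the anticommutation behaviour a fermionic toolbox has to track automatically.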

arXiv.org e-Print Archive

Crossref

Annual report

Author: Intel Corporation
Publication venue: [S.l.] : Intel,
Publication date
Field of study

Diposit Digital de Documents de la UAB

Global citizenship report

Author: Intel Corporation
Publication venue: [S.l.] : Intel,
Publication date
Field of study

Diposit Digital de Documents de la UAB

MoonGen: A Scriptable High-Speed Packet Generator

Author: Datasheet Intel Ethernet
Datasheet Intel Ethernet
Gallenmüller Sebastian
Gigabit Ethernet Controller Datasheet Intel
IEEE
Rizzo Luigi
Salim Jamal Hadi
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 31/03/2016
Field of study

We present MoonGen, a flexible high-speed packet generator. It can saturate 10 GbE links with minimum-sized packets using only a single CPU core by running on top of the packet processing framework DPDK. Linear multi-core scaling allows for even higher rates: we have tested MoonGen with up to 178.5 Mpps at 120 Gbit/s. We move the whole packet generation logic into user-controlled Lua scripts to achieve the highest possible flexibility. In addition, we utilize hardware features of Intel NICs that have not been used for packet generators previously. A key feature is the measurement of latency with sub-microsecond precision and accuracy by using hardware timestamping capabilities of modern commodity NICs. We address timing issues with software-based packet generators and apply methods to mitigate them with both hardware support on commodity NICs and with a novel method to control the inter-packet gap in software. Features that were previously only possible with hardware-based solutions are now provided by MoonGen on commodity hardware. MoonGen is available as free software under the MIT license at https://github.com/emmericp/MoonGen. Comment: Published at IMC 201
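The packet rates quoted in the abstract can be checked with a short line-rate calculation. The sketch below assumes standard Ethernet framing, which the abstract does not spell out: a 64-byte minimum frame plus 8 bytes of preamble and start-of-frame delimiter and a 12-byte inter-frame gap. The function name `max_packets_per_second` is hypothetical.

```python
# Assumed Ethernet framing constants (not stated in the abstract).
FRAME_BYTES = 64           # minimum frame size, including the FCS
PREAMBLE_BYTES = 8         # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12  # mandatory idle time between frames


def max_packets_per_second(link_bits_per_second, frame_bytes=FRAME_BYTES):
    """Upper bound on the packet rate of a link for a given frame size."""
    wire_bits = (frame_bytes + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bits_per_second / wire_bits


if __name__ == "__main__":
    for gbit in (10, 120):
        mpps = max_packets_per_second(gbit * 1e9) / 1e6
        print(f"{gbit} Gbit/s -> {mpps:.2f} Mpps")
```

Under these assumptions each minimum-sized packet occupies 672 bits on the wire, which gives about 14.88 Mpps for a 10 Gbit/s link and about 178.57 Mpps for 120 Gbit/s, in line with the 178.5 Mpps figure reported above.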

arXiv.org e-Print Archive

Crossref

Exploiting asynchrony from exact forward recovery for DUE in iterative solvers

Author: Architectures Software Developer's Intel®
Berry M.
Degalahal V.
Family Intel® Xeon®
Kleen A.
Li X.
Manual Architecture Programmer's
Shewchuk J. R.
Sorin D.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2015
Field of study

This paper presents a method to protect iterative solvers from Detected and Uncorrected Errors (DUE) relying on error detection techniques already available in commodity hardware. Detection operates at the memory page level, which enables the use of simple algorithmic redundancies to correct errors. Such redundancies would be inapplicable under coarse-grained error detection, but become very powerful when the hardware is able to precisely detect errors. Relations straightforwardly extracted from the solver allow lost data to be recovered exactly. This method is free of the overheads of backward recoveries like checkpointing, and does not compromise the mathematical convergence properties of the solver as restarting would do. We apply this recovery to three widely used Krylov subspace methods, CG, GMRES and BiCGStab, and their preconditioned versions. We implement our resilience techniques on CG considering scenarios from small (8 cores) to large (1024 cores) scales, and demonstrate very low overheads compared to state-of-the-art solutions. We deploy our recovery techniques either by overlapping them with algorithmic computations or by forcing them to be in the critical path of the application. A trade-off exists between both approaches depending on the error rate the solver is suffering. Under realistic error rates, overlapping decreases overheads from 5.37% down to 3.59% for a non-preconditioned CG on 8 cores. This work has been partially supported by the European Research Council under the European Union's 7th FP, ERC Advanced Grant 321253, and by the Spanish Ministry of Science and Innovation under grant TIN2012-34557. L. Jaulmes has been partially supported by the Spanish Ministry of Education, Culture and Sports under grant FPU2013/06982. M. Moreto has been partially supported by the Spanish Ministry of Economy and Competitiveness under Juan de la Cierva postdoctoral fellowship JCI-2012-15047. M. Casas has been partially supported by the Secretary for Universities and Research of the Ministry of Economy and Knowledge of the Government of Catalonia and the Co-fund programme of the Marie Curie Actions of the European Union's 7th FP (contract 2013 BP B 00243). Peer Reviewed. Postprint (author's final draft)
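As an illustration of how a relation extracted from a solver can restore lost data exactly, the sketch below recomputes damaged residual entries from the defining relation r = b - A x. This is only one plausible relation of this kind. The abstract does not say which relations the authors use, and the helper name `recover_residual_entries`, the dense list-of-lists matrix, and the use of `None` to mark lost entries are assumptions for the example.

```python
def residual(A, x, b):
    """Full residual r = b - A x for a dense matrix stored as a list of rows."""
    return [b_i - sum(a_ij * x_j for a_ij, x_j in zip(row, x))
            for row, b_i in zip(A, b)]


def recover_residual_entries(A, x, b, lost):
    """Recompute only the lost entries r_i = b_i - sum_j A[i][j] * x[j]."""
    return {i: b[i] - sum(a_ij * x_j for a_ij, x_j in zip(A[i], x))
            for i in lost}


if __name__ == "__main__":
    A = [[4.0, 1.0, 0.0],
         [1.0, 3.0, 1.0],
         [0.0, 1.0, 2.0]]
    b = [1.0, 2.0, 3.0]
    x = [0.5, 0.25, 1.0]

    r = residual(A, x, b)
    damaged = list(r)
    damaged[1] = None  # pretend the page holding r[1] was reported as lost

    for index, value in recover_residual_entries(A, x, b, [1]).items():
        damaged[index] = value
    print(damaged == r)
```

Because only the rows that touch the lost entries are recomputed, the cost is proportional to the size of the damaged region, which is why fine-grained (page-level) detection matters for this style of forward recovery.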

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Disengaged Scheduling for Fair, Protected Access to Fast Computational Accelerators

Author: Dwarakinath A.
GPU
Gupta V.
Intel Corporation
Kato S.
Kato S.
Kyriazis G.
Menychtas K.
Shen K.
Soares L.
Publication venue
Publication date: 04/09/2014
Field of study

Today’s operating systems treat GPUs and other computational accelerators as if they were simple devices, with bounded and predictable response times. With accelerators assuming an increasing share of the workload on modern machines, this strategy is already problematic, and likely to become untenable soon. If the operating system is to enforce fair sharing of the machine, it must assume responsibility for accelerator scheduling and resource management. Fair, safe scheduling is a particular challenge on fast accelerators, which allow applications to avoid kernel-crossing overhead by interacting directly with the device. We propose a disengaged scheduling strategy in which the kernel intercedes between applications and the accelerator on an infrequent basis, to monitor their use of accelerator cycles and to determine which applications should be granted access over the next time interval. Our strategy assumes a well-defined, narrow interface exported by the accelerator. We build upon such an interface, systematically inferred for the latest Nvidia GPUs. We construct several example schedulers, including Disengaged Timeslice with overuse control, which guarantees fairness, and Disengaged Fair Queueing, which is effective in limiting resource idleness but probabilistic. Both schedulers ensure fair sharing of the GPU, even among uncooperative or adversarial applications; Disengaged Fair Queueing incurs a 4% overhead on average (max 18%) compared to direct devic

CiteSeerX

Crossref

Herding Cats: Modelling, Simulation, Testing, and Data Mining for Weak Memory

Author: Alglave Jade
Bertot Yves
Boudol Gérard
Burckhardt Sebastian
Collier William
Compaq Computer Corp. 2002.
Grisenthwaite Richard
Howells David
IBM Corp. 2009.
Intel Corp. 2002.
Intel Corp. 2009.
Kuperstein Michael
Ltd ARM
Ltd ARM
Nardelli Francesco Zappa
Neiger Gil
Paul
SPARC International Inc. 1992.
SPARC International Inc. 1994.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014
Field of study

We propose an axiomatic generic framework for modelling weak memory. We show how to instantiate this framework for SC, TSO, C++ restricted to release-acquire atomics, and Power. For Power, we compare our model to a preceding operational model in which we found a flaw. To do so, we define an operational model that we show equivalent to our axiomatic model. We also propose a model for ARM. Our testing on this architecture revealed a behaviour later acknowledged as a bug by ARM, and more recently 31 additional anomalies. We offer a new simulation tool, called herd, which allows the user to specify the model of his choice in a concise way. Given a specification of a model, the tool becomes a simulator for that model. The tool relies on an axiomatic description; this choice allows us to outperform all previous simulation tools. Additionally, we confirm that verification time is vastly improved, in the case of bounded model checking. Finally, we put our models in perspective, in the light of empirical data obtained by analysing the C and C++ code of a Debian Linux distribution. We present our new analysis tool, called mole, which explores a piece of code to find the weak memory idioms that it uses.
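To make the notion of a memory model concrete, the sketch below enumerates every sequentially consistent (SC) execution of the classic store-buffering litmus test. It is a brute-force interleaving enumeration, not herd's axiomatic method, and the instruction encoding and function names are assumptions made for illustration.

```python
def interleavings(first, second):
    """Yield every merge of two instruction lists that keeps program order."""
    if not first:
        yield list(second)
        return
    if not second:
        yield list(first)
        return
    for rest in interleavings(first[1:], second):
        yield [first[0]] + rest
    for rest in interleavings(first, second[1:]):
        yield [second[0]] + rest


def run(schedule):
    """Execute one interleaving against a single shared memory (SC semantics)."""
    memory = {"x": 0, "y": 0}
    registers = {}
    for op, variable, argument in schedule:
        if op == "store":
            memory[variable] = argument
        else:  # "load": argument names the destination register
            registers[argument] = memory[variable]
    return registers["r0"], registers["r1"]


# Store-buffering test: each thread writes one variable, then reads the other.
THREAD_0 = [("store", "x", 1), ("load", "y", "r0")]
THREAD_1 = [("store", "y", 1), ("load", "x", "r1")]

SC_OUTCOMES = {run(schedule) for schedule in interleavings(THREAD_0, THREAD_1)}

if __name__ == "__main__":
    print(sorted(SC_OUTCOMES))
```

The outcome r0 = 0 and r1 = 0 never appears in `SC_OUTCOMES`. Weaker models such as TSO permit it, which is the kind of difference that a simulator for memory models is meant to expose.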

arXiv.org e-Print Archive

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

Queen Mary Research Online

Towards optimal packed string matching

Author: Aho
Aho
AMD
AMD
Apostolico
Arlazarov
Baeza-Yates
Belazzougui
Ben-Kiki
Ben-Nissan
Bille
Boyer
Breslauer
Breslauer
Breslauer
Breslauer
Breslauer
Brodnik
Cole
Cole
Commentz-Walter
Crochemore
Crochemore
Crochemore
Czumaj
Césari
Dany Breslauer
Daykin
Duval
Faro
Faro
Faro
Fich
Fine
Fischer
Fredriksson
Fredriksson
Furst
Galil
Galil
Goldberg
Gusfield
Gąsieniec
Iliopoulos
Intel
Intel
Intel
Knuth
Knuth
Leszek Ga̧sieniec
Lothaire
Muthukrishnan
Muthukrishnan
Muthukrishnan
Navarro
Oren Ben-Kiki
Oren Weimann
Philip Bille
Roberto Grossi
Rytter
Tarhio
Vishkin
Vishkin
Yao
Publication venue: 'Elsevier BV'
Publication date: 01/01/2014
Field of study

Dedicated to Professor Gad M. Landau, on the occasion of his 60th birthday. Keywords: string matching, word-RAM, packed strings. In the packed string matching problem, it is assumed that each machine word can accommodate up to α characters, thus an n-character string occupies n/α memory words. The main word-size string-matching instruction wssm is available in contemporary commodity processors. The other word-size maximum-suffix instruction wslm is only required during the pattern pre-processing. Benchmarks show that our solution can be efficiently implemented, unlike some prior theoretical packed string matching work. (b) We also consider the complexity of the packed string matching problem in the classical word-RAM model in the absence of the specialized micro-level instructions wssm and wslm. We propose micro-level algorithms for the theoretically efficient emulation using parallel algorithms techniques to emulate wssm and using the Four-Russians technique to emulate wslm. Surprisingly, our bit-parallel emulation of wssm also leads to a new simplified parallel random access machine string-matching algorithm. As a byproduct to facilitate our results we develop a new algorithm for finding the leftmost (most significant) 1 bits in consecutive non-overlapping blocks of uniform size inside a word. This latter problem is not known to be reducible to finding the rightmost 1, which can be easily solved, since we do not know how to reverse the bits of a word in O(1) time.
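Two ideas from the abstract are easy to show in a few lines: packing α characters into each machine word, and the "easily solved" problem of isolating the rightmost 1 bit. The sketch assumes one-byte characters and 64-bit words (so α = 8). The names `pack` and `rightmost_one` are hypothetical, and the paper's wssm and wslm instructions and its leftmost-1 algorithm are not reproduced here.

```python
ALPHA = 8  # assumption: eight one-byte characters per 64-bit word


def pack(text, alpha=ALPHA):
    """Pack a bytes string into integers holding `alpha` characters each."""
    padding = -len(text) % alpha
    padded = text + b"\0" * padding
    return [int.from_bytes(padded[i:i + alpha], "big")
            for i in range(0, len(padded), alpha)]


def rightmost_one(word):
    """Isolate the least significant set bit with the two's-complement trick."""
    return word & -word


if __name__ == "__main__":
    words = pack(b"abracadabra")
    print(len(words))                  # number of words used for 11 characters
    print(bin(rightmost_one(0b101000)))
```

An 11-character string fits in two words here, matching the n/α bound rounded up. Comparing two packed strings then takes one integer comparison per word rather than one per character.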

CiteSeerX

Crossref

Archivio della Ricerca - Università di Pisa

Online Research Database In Technology

DELL's strategic partnership with Intel (デルとインテルの戦略的パートナーシップ)

Author: 山本雅昭
Publication venue: 広島経済大学経済学会
Publication date: 31/10/2007
Field of study

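1. About this research and this paper; 2. Astonishing growth and the mystery of "Intel Inside"; 3. Dell's enterprise solutions business; 4. Product strategy and quality control; 5. Civil action: 05-441; 6. System lock-in strategy; 7. 小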

Robot object manipulation using stereoscopic vision and conformal geometric algebra

Author: Eduardo Bayro-Corrochano
Julio Zamora-Esquivel
Publication venue
Publication date: 01/01/2011
Field of study

This paper uses geometric algebra to formulate, in a single framework, the kinematics of a three-finger robotic hand, a binocular robotic head, and the interactions between 3D objects, all of which are seen in stereo images. The main objective is the formulation of a kinematic control law that closes the loop between perception and action, allowing smooth, visually guided object manipulation.

CiteSeerX