
    ROLLED: Racetrack Memory Optimized Linear Layout and Efficient Decomposition of Decision Trees

    Modern low-power distributed systems increasingly integrate machine learning algorithms. In resource-constrained setups, model execution has to be optimized for performance and energy consumption. Racetrack memory (RTM) promises to achieve these goals by offering unprecedented integration density, small access latency, and reduced energy consumption. However, to access data in RTM, it first needs to be shifted to an access port. We investigate decision trees and develop placement strategies that reduce the total number of shifts in RTM. Decision trees can be profiled during training, yielding access probabilities for the tree paths. We map tree nodes to RTM such that the total number of shifts is minimal. Concretely, we present two placement approaches: 1) a unified organization, where tree nodes are closely packed and placed in a single RTM location, and 2) a decomposed organization, where decision tree nodes are distributed across separate RTM blocks. We discuss theoretical cost models for both approaches and formally prove upper bounds of 4× for the unified and 12× for the decomposed organization relative to the optimal placement. A thorough experimental evaluation compares our algorithms to state-of-the-art placement strategies: the unified and decomposed solutions reduce the number of shifts by 58.1% and 80.1%, respectively, leading to 53.8% and 46.3% reductions in overall runtime and 52.6% and 61.7% reductions in energy consumption, compared to a naive baseline.
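
    To make the shift cost model concrete, here is a minimal sketch in Python. It assumes a simplified setting, a single tape with one access port, where accessing an offset costs as many shifts as its distance from the previously accessed offset, and it uses an illustrative greedy rule (hot nodes near the port) rather than the paper's exact ROLLED placement; all names and probabilities are made up for illustration.

```python
# Minimal sketch: shift-aware node placement in racetrack memory (RTM).
# Simplified model: one tape, one access port, shift cost = distance
# between consecutively accessed offsets. Illustrative only.

def greedy_placement(access_prob):
    """Place nodes on the tape in descending order of access
    probability, so frequently visited nodes sit near offset 0
    (the initial port position)."""
    order = sorted(access_prob, key=access_prob.get, reverse=True)
    return {node: offset for offset, node in enumerate(order)}

def count_shifts(placement, traversal):
    """Total shifts for one inference: the port moves along the tape
    between the offsets of consecutively accessed nodes."""
    pos, shifts = 0, 0
    for node in traversal:
        shifts += abs(placement[node] - pos)
        pos = placement[node]
    return shifts

# Toy profile: the root is on every path, the left subtree is hotter.
prob = {"root": 1.0, "L": 0.7, "R": 0.3, "LL": 0.5, "LR": 0.2}
layout = greedy_placement(prob)
print(layout)
print(count_shifts(layout, ["root", "L", "LL"]))  # the hot path stays cheap
```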

    Memory Carousel: LLVM-Based Bitwise Wear-Leveling for Non-Volatile Main Memory

    Emerging non-volatile memories offer many advantages but also technical shortcomings, such as reduced cell lifetime. Although many wear-leveling approaches exist to extend the lifetime of such memories, a trade-off usually has to be made for the wear-leveling granularity. Due to iterative write schemes (repeatedly sense and write), memory wear-out in certain systems depends directly on the written bit value and thus can be highly imbalanced, requiring dedicated bit-wise wear-leveling. Such bit-wise wear-leveling has so far only been proposed together with special hardware support. However, when no dedicated hardware solution is available, especially on commercial off-the-shelf systems with non-volatile memories, a software solution can be crucial for the system lifetime. In this work, we propose entirely software-based bit-wise wear-leveling, where the position of bits within CPU words in main memory is rotated on a regular basis. We leverage the LLVM intermediate representation to adjust the load and store operations of the application with a custom compiler pass. Experimental evaluation shows that local rotation within the CPU word can extend the lifetime by a factor of up to 21×. We also show that our method can be combined with coarser-grained wear-leveling, e.g., at block granularity, to achieve even higher lifetime improvements.
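
    The core rotation idea can be sketched in a few lines of Python. This is a simplified model, assuming 64-bit words and a per-epoch rotation offset; in the paper, the equivalent rotations are inserted around load and store instructions by an LLVM pass, and advancing the offset in practice requires re-rotating the affected memory.

```python
# Minimal sketch of bitwise rotation for wear-leveling. A bit that is
# logically always 0 (or 1) lands on a different physical cell in each
# epoch, spreading the per-cell write wear.

WORD_BITS = 64
MASK = (1 << WORD_BITS) - 1

def rotl(value, k):
    k %= WORD_BITS
    return ((value << k) | (value >> (WORD_BITS - k))) & MASK

def rotr(value, k):
    return rotl(value, WORD_BITS - (k % WORD_BITS))

def store(memory, addr, value, epoch):
    # Rotate before writing; the rotation offset advances with the epoch.
    memory[addr] = rotl(value, epoch)

def load(memory, addr, epoch):
    # Rotate back so the program observes the original value.
    return rotr(memory[addr], epoch)

mem = {}
store(mem, 0x1000, 0x00000000DEADBEEF, epoch=3)
assert load(mem, 0x1000, epoch=3) == 0x00000000DEADBEEF
```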

    Efficient Realization of Decision Trees for Real-Time Inference

    For timing-sensitive edge applications, the demand for efficient, lightweight machine learning solutions has increased recently. Tree ensembles are among the state of the art in many machine learning applications. While a single decision tree is comparably small, an ensemble of trees can have a significant memory footprint, leading to cache locality issues, which are crucial to execution-time performance. In this work, we analyze the memory-locality issues of the two most common realizations of decision trees, i.e., native trees and if-else trees. We highlight that both realizations demand a more careful memory layout to improve caching behavior and maximize performance. We adopt a probabilistic model of decision tree inference to find the best memory layout for each tree at the application layer. Further, we present an efficient heuristic that takes architecture-dependent information into account, thereby optimizing the given ensemble for a target computer architecture. Our code-generation framework, which is freely available in an open-source repository, produces optimized code while preserving the structure and accuracy of the trees. Using several real-world data sets, we evaluate the elapsed time of various tree realizations on server hardware as well as embedded systems, for Intel and ARM processors. Our optimized memory layout reduces execution time by up to 75% on server-class systems and up to 70% on embedded systems.
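
    As an illustration of a probability-guided layout, the following Python sketch lays out a native (array-based) tree depth-first, always following the more probable child first, so that the likeliest root-to-leaf path occupies consecutive memory. The node structure and the greedy rule are illustrative assumptions, not the paper's exact cost model or heuristic.

```python
# Minimal sketch: hot-path-first layout for a native decision tree,
# assuming per-node branch probabilities obtained from profiling.

class Node:
    def __init__(self, left=None, right=None, p_left=0.5):
        self.left, self.right = left, right
        self.p_left = p_left  # profiled probability of taking the left child

def hot_path_layout(root):
    """Emit nodes depth-first, visiting the more probable child first,
    so the likeliest root-to-leaf path lands in consecutive memory
    and shares cache lines."""
    layout, stack = [], [root]
    while stack:
        node = stack.pop()
        if node is None:
            continue
        layout.append(node)
        hot, cold = ((node.left, node.right) if node.p_left >= 0.5
                     else (node.right, node.left))
        stack.append(cold)  # the cold child is laid out later
        stack.append(hot)   # the hot child comes right after this node
    return layout

tree = Node(Node(Node(), Node(), 0.9), Node(), p_left=0.8)
print(len(hot_path_layout(tree)))  # 5 nodes, hot path emitted first
```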

    Immediate Split Trees: Immediate Encoding of Floating Point Split Values in Random Forests

    Random forests and decision trees are increasingly interesting candidates for resource-constrained machine learning models. To make the execution of these models efficient under resource limitations, various optimized implementations have been proposed in the literature, usually implementing either native trees or if-else trees. While a key motivation for optimizing if-else trees is to benefit from dedicated instruction caches, in this work we highlight that if-else trees may also depend strongly on data caches. We identify one crucial issue of if-else tree implementations and propose an optimized implementation that keeps the logical tree structure untouched, and thus does not influence accuracy, but eliminates the need to load comparison values from the data caches. Experimental evaluation of this implementation shows that we can reduce the number of data cache misses by up to 99%, while not increasing the number of instruction cache misses compared to the state of the art. We additionally highlight various scenarios where the reduction of data cache misses yields an important benefit for the overall execution time.
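
    The following Python sketch illustrates the underlying contrast: generated if-else code in which each split value appears as a literal constant is materialized from the instruction stream (instruction cache), whereas a conventional native tree loads its thresholds from a data array (data cache). The generator and node format are hypothetical, simplified for illustration.

```python
# Minimal sketch: emit C if-else code with split values inlined as
# literals, so no thresholds array has to be loaded from data memory.

def gen_if_else(node, indent="    "):
    """Recursively emit C code where each split value is a literal
    constant rather than an element of a thresholds array."""
    if node["leaf"] is not None:
        return f"{indent}return {node['leaf']};\n"
    code = f"{indent}if (x[{node['feature']}] <= {node['split']}f) {{\n"
    code += gen_if_else(node["left"], indent + "    ")
    code += f"{indent}}} else {{\n"
    code += gen_if_else(node["right"], indent + "    ")
    code += f"{indent}}}\n"
    return code

tree = {"leaf": None, "feature": 2, "split": 0.75,
        "left": {"leaf": 0}, "right": {"leaf": 1}}
print("int predict(const float *x) {\n" + gen_if_else(tree) + "}")
```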