Threshold Encrypted Mempools: Limitations and Considerations
Encrypted mempools are a class of solutions aimed at preventing or reducing
negative externalities of MEV extraction using cryptographic privacy. Mempool
encryption aims to hide information related to pending transactions until a
block including the transactions is committed, targeting the prevention of
frontrunning and similar behaviour. Among the various methods of encryption,
threshold schemes are particularly interesting for the design of MEV mitigation
mechanisms, as their distributed nature and minimal hardware requirements
harmonize with a broader goal of decentralization.
This work looks beyond the formal and technical cryptographic aspects of
threshold encryption schemes to focus on the market and incentive implications
of implementing encrypted mempools as MEV mitigation techniques. In particular,
this paper argues that deploying such protocols without proper consideration
and understanding of their market impact invites several undesired outcomes;
the ultimate goal is to stimulate further analysis of this class of solutions
beyond purely cryptographic considerations. Included in the paper
is an overview of a series of problems, various candidate solutions in the form
of mempool encryption techniques with a focus on threshold encryption,
potential drawbacks to these solutions, and Osmosis as a case study. The paper
targets a broad audience and remains agnostic to blockchain design where
possible, while drawing mostly from financial examples.
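To make the distributed-trust property concrete, here is a minimal sketch of (t, n) threshold secret sharing in Python, the primitive underlying threshold decryption: a transaction is encrypted under a one-time key whose shares are held by a committee, and no subset smaller than the threshold can recover it. This is a toy Shamir construction for illustration, not the paper's scheme; real encrypted mempools use threshold public-key encryption with distributed key generation.

```python
# Toy (t, n) Shamir secret sharing over a prime field; illustrative only.
# Real deployments use a cryptographically secure RNG (secrets, not random)
# and threshold public-key encryption rather than sharing a raw key.
import random

PRIME = 2**127 - 1  # a Mersenne prime, large enough for a toy key

def split_key(secret: int, t: int, n: int) -> list[tuple[int, int]]:
    """Split `secret` into n shares; any t of them reconstruct it."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(t - 1)]
    return [(x, sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME)
            for x in range(1, n + 1)]

def recover_key(shares: list[tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 recovers the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret

# A pending transaction is encrypted under a one-time key held in shares by
# a committee; once the block is committed, any t members publish shares.
key = random.randrange(PRIME)
shares = split_key(key, t=3, n=5)
assert recover_key(random.sample(shares, 3)) == key  # any 3 of 5 suffice
assert recover_key(shares[:2]) != key  # 2 shares reveal nothing useful
                                       # (with overwhelming probability)
```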
Mysticeti: Low-Latency DAG Consensus with Fast Commit Path
We introduce Mysticeti-C, a Byzantine consensus protocol with low latency and
high resource efficiency. It leverages a DAG based on Threshold Clocks and
incorporates innovations in pipelining and multiple leaders to reduce latency
in the steady state and under crash failures. Mysticeti-FPC incorporates a fast
commit path that has even lower latency. We prove the safety and liveness of
the protocols in a Byzantine context. We evaluate Mysticeti and compare it with
state-of-the-art consensus and fast path protocols to demonstrate its low
latency and resource efficiency, as well as more graceful degradation under
crash failures. Mysticeti is the first Byzantine protocol to achieve a WAN
latency of 0.5 s for consensus commit at a throughput of over 50k TPS,
matching the state of the art.
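As a rough illustration of the threshold-clock mechanism the DAG builds on, the sketch below advances a validator's logical round once blocks from a quorum of the committee have been received for the current round. The class and quorum arithmetic are illustrative assumptions, not taken from the paper.

```python
# Toy threshold clock: a validator moves past round r once it has received
# blocks from a quorum (2f+1 of n = 3f+1 validators) for round r.
from collections import defaultdict

class ThresholdClock:
    def __init__(self, n_validators: int):
        assert n_validators % 3 == 1, "toy model assumes n = 3f + 1"
        self.f = (n_validators - 1) // 3
        self.quorum = 2 * self.f + 1
        self.round = 0
        self.seen = defaultdict(set)  # round -> validator ids seen

    def on_block(self, author: int, block_round: int) -> int:
        """Record a received block; advance the clock when a quorum is seen."""
        self.seen[block_round].add(author)
        while len(self.seen[self.round]) >= self.quorum:
            self.round += 1  # enough of the committee reached this round
        return self.round

clock = ThresholdClock(n_validators=4)  # f = 1, quorum = 3
for v in (0, 1):
    clock.on_block(v, 0)
assert clock.round == 0                 # only 2 blocks: below quorum
clock.on_block(2, 0)
assert clock.round == 1                 # quorum of 3 reached for round 0
```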
Towards A Practical High-Assurance Systems Programming Language
Writing correct and performant low-level systems code is a notoriously demanding job, even for experienced developers. To make matters worse, formally reasoning about its correctness properties introduces yet another level of complexity, requiring considerable expertise in both systems programming and formal verification. Without appropriate tools that provide abstraction and automation, development can be extremely costly due to the sheer complexity of such systems and the nuances within them.
Cogent is designed to alleviate the burden on developers when writing and verifying systems code. It is a high-level functional language with a certifying compiler, which automatically proves the correctness of the compiled code and also provides a purely functional abstraction of the low-level program to the developer. Equational reasoning techniques can then be used to prove functional correctness properties of the program on top of this abstract semantics, which is notably less laborious than directly verifying the C code.
To make Cogent a more approachable and effective tool for developing real-world systems, we further strengthen the framework by extending the core language and its ecosystem. Specifically, we enrich the language to allow users to control the memory representation of algebraic data types, while retaining the automatic proof via a data layout refinement calculus. We repurpose existing tools in a novel way and develop an intuitive foreign function interface, which provides users with a seamless experience when using Cogent in conjunction with native C. We augment the Cogent ecosystem with a property-based testing framework, which helps developers better understand the impact formal verification has on their programs and enables a progressive approach to producing high-assurance systems. Finally, we explore refinement type systems, which we plan to incorporate into Cogent for more expressiveness and better integration of systems programmers into the verification process.
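The refinement idea behind both the certifying compiler and the property-based testing framework can be illustrated outside Cogent. The sketch below, written in Python with the Hypothesis library rather than Cogent's own tooling, checks that an imperative implementation agrees with a purely functional specification on randomly generated inputs; all function names are illustrative.

```python
# Property-based refinement check: the imperative implementation must agree
# with the functional specification on every generated input.
from hypothesis import given, strategies as st

def spec_reverse(xs: list[int]) -> list[int]:
    """Functional specification: the abstract semantics."""
    return list(reversed(xs))

def impl_reverse(xs: list[int]) -> list[int]:
    """Imperative implementation: swaps elements in place."""
    ys = xs[:]  # copy, mimicking a buffer the low-level code would own
    i, j = 0, len(ys) - 1
    while i < j:
        ys[i], ys[j] = ys[j], ys[i]
        i, j = i + 1, j - 1
    return ys

@given(st.lists(st.integers()))
def test_impl_refines_spec(xs):
    # The correspondence a certifying compiler would prove once and for all;
    # property-based testing instead checks it on many random inputs.
    assert impl_reverse(xs) == spec_reverse(xs)

test_impl_refines_spec()  # Hypothesis runs this across generated inputs
```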
Buying Time: Latency Racing vs. Bidding in Fair Transaction Ordering
We design a practical algorithm for transaction ordering that takes into
account both transaction timestamps and bids. The algorithm guarantees that
users get their transactions published with bounded delay against a bid, while
it extracts a fair value from sophisticated users who have an edge in latency,
by moving expenditure from investment in latency improvement technology to
bidding. The algorithm creates a score from timestamps and bids, and orders
transactions based on the score. We first show that a scoring rule is the only
type of rule that satisfies the independence of latency races. We provide an
economic analysis of the protocol in an environment of private information,
where investment in latency is made at the ex-ante or interim stage, while
bidding happens at the interim stage, once private signals have been observed. The
algorithm is useful for transaction sequencing in rollups or in other
environments where the sequencer has privileged access to order flows.
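As a rough sketch of how a score-based ordering policy might look, the snippet below discounts each transaction's timestamp by a capped, increasing function of its bid and sorts by the resulting score. The discount function, its rate, and its cap are illustrative assumptions; the paper derives the actual scoring rule and its guarantees.

```python
# Toy score-based ordering: lower score = earlier position in the block.
from dataclasses import dataclass

@dataclass
class Tx:
    sender: str
    timestamp: float  # arrival time observed by the sequencer, in seconds
    bid: float        # payment offered for earlier inclusion

def score(tx: Tx, rate: float = 0.001, cap: float = 0.5) -> float:
    """A bid discounts the timestamp, but can buy at most `cap` seconds,
    which bounds the delay any honest transaction can suffer."""
    return tx.timestamp - min(rate * tx.bid, cap)

def order_transactions(txs: list[Tx]) -> list[Tx]:
    return sorted(txs, key=score)

txs = [
    Tx("slow_user", timestamp=10.30, bid=400.0),  # bids to offset latency
    Tx("fast_trader", timestamp=10.05, bid=0.0),  # wins the latency race
    Tx("slow_user2", timestamp=10.20, bid=50.0),
]
for tx in order_transactions(txs):
    print(tx.sender, round(score(tx), 3))
# slow_user's bid moves it ahead of fast_trader, shifting expenditure from
# latency-improvement technology to bidding.
```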
Adaptive Automated Machine Learning
The ever-growing demand for machine learning has led to the development of automated machine learning (AutoML) systems that can be used off the shelf by non-experts. Moreover, the demand for ML applications with high predictive performance exceeds the number of available machine learning experts, making the development of AutoML systems necessary. Automated machine learning tackles the problem of finding machine learning models with high predictive performance. Existing approaches incorporating deep learning techniques assume that all data is available at the beginning of the training process (offline learning). They configure and optimise a pipeline of preprocessing, feature engineering, and model selection by choosing suitable hyperparameters in each step of the pipeline. Furthermore, they assume that the user is fully aware of the choice, and thus the consequences, of the underlying metric (such as precision, recall, or the F1-measure). By varying this metric, the search for suitable configurations, and thus the adaptation of algorithms, can be tailored to the user’s needs. With vast amounts of data created from all kinds of sources every day, processing and understanding these data sets in a single batch is no longer viable. By training machine learning models incrementally (i.e., online learning), the flood of data can be processed sequentially within data streams. However, in an online learning scenario, where an AutoML instance executes on evolving data streams, the question of the best model and its configuration remains open.
In this work, we address the adaptation of AutoML in an offline learning scenario toward a certain utility an end-user might pursue as well as the adaptation of AutoML towards evolving data streams in an online learning scenario with three main contributions:
1. We propose a system that allows the adaptation of AutoML and the search for neural architectures towards a particular utility an end-user might pursue.
2. We introduce an online deep learning framework that fosters the research of deep learning models under the online learning assumption and enables the automated search for neural architectures.
3. We introduce an online AutoML framework that allows the incremental adaptation of ML models.
We evaluate the contributions individually, in accordance with predefined requirements and against state-of-the-art evaluation setups. The outcomes lead us to conclude that (i) AutoML systems, as well as systems for neural architecture search, can be steered towards individual utilities by learning a designated ranking model from pairwise preferences and using the latter as the target function in the offline learning scenario; (ii) architecturally small neural networks are generally suitable in an online learning scenario; and (iii) the configuration of machine learning pipelines can automatically be adapted to ever-evolving data streams, leading to better performance.
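The following sketch illustrates the online learning setting the last contribution targets: a model updated one instance at a time under prequential (test-then-train) evaluation. It is a plain SGD logistic regression in pure Python, not the framework introduced in this work; an online AutoML system would additionally adapt the pipeline and hyperparameters (here just the learning rate) as the stream evolves.

```python
# Minimal online learner: incremental updates over a data stream, evaluated
# prequentially (predict on each instance before training on it).
import math
from collections import defaultdict

class OnlineLogisticRegression:
    def __init__(self, lr: float = 0.1):
        self.lr = lr
        self.w = defaultdict(float)  # weights grow with observed features

    def predict_one(self, x: dict[str, float]) -> float:
        z = sum(self.w[k] * v for k, v in x.items())
        return 1.0 / (1.0 + math.exp(-z))

    def learn_one(self, x: dict[str, float], y: int) -> None:
        err = self.predict_one(x) - y  # gradient of the log-loss
        for k, v in x.items():
            self.w[k] -= self.lr * err * v

model = OnlineLogisticRegression()
stream = [({"bias": 1.0, "f": float(i % 2)}, i % 2) for i in range(1000)]
correct = 0
for x, y in stream:
    correct += (model.predict_one(x) > 0.5) == y  # test first ...
    model.learn_one(x, y)                         # ... then train
print(f"prequential accuracy: {correct / len(stream):.2f}")
```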
A Survey of Large Language Models
Language is essentially a complex, intricate system of human expressions
governed by grammatical rules. Developing capable AI algorithms for
comprehending and mastering a language thus poses a significant challenge. As a major
approach, language modeling has been widely studied for language understanding
and generation in the past two decades, evolving from statistical language
models to neural language models. Recently, pre-trained language models (PLMs)
have been proposed by pre-training Transformer models over large-scale corpora,
showing strong capabilities in solving various NLP tasks. Since researchers
have found that model scaling can lead to performance improvements, they have
further studied this scaling effect by increasing models to even larger sizes.
Interestingly, when the parameter scale exceeds a certain level, these enlarged
language models not only achieve a significant performance improvement but also
show some special abilities that are not present in small-scale language
models. To distinguish models by parameter scale, the research community has
coined the term large language models (LLMs) for PLMs of significant size.
Recently, the research on LLMs has been largely advanced by
both academia and industry, and a remarkable milestone is the launch of
ChatGPT, which has attracted widespread attention from society. The technical
evolution of LLMs is having an important impact on the entire AI community and
may revolutionize the way we develop and use AI algorithms. In this
survey, we review the recent advances of LLMs by introducing the background,
key findings, and mainstream techniques. In particular, we focus on four major
aspects of LLMs, namely pre-training, adaptation tuning, utilization, and
capacity evaluation. In addition, we summarize the available resources for
developing LLMs and discuss remaining issues and future directions.
Approaches to Conflict-free Replicated Data Types
Conflict-free Replicated Data Types (CRDTs) allow optimistic replication in a
principled way. Different replicas can proceed independently, being available
even under network partitions, and always converging deterministically:
replicas that have received the same updates will have equivalent state, even
if received in different orders. After a historical tour of the evolution from
sequential data types to CRDTs, we present in detail the two main approaches to
CRDTs, operation-based and state-based, including two important variations, the
pure operation-based and the delta-state based. Intended as a tutorial for
prospective CRDT researchers and designers, it provides solid coverage of the
essential concepts, clarifying some misconceptions which frequently occur, but
also presents some novel insights gained from considerable experience in
designing both specific CRDTs and approaches to CRDTs.
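For a flavour of the state-based approach, here is a minimal grow-only counter (G-Counter), a classic state-based CRDT: each replica increments only its own entry, and merging takes the pointwise maximum, a join that is commutative, associative, and idempotent, so replicas converge regardless of delivery order.

```python
# G-Counter: a grow-only, state-based CRDT.
from collections import defaultdict

class GCounter:
    def __init__(self, replica_id: str):
        self.replica_id = replica_id
        self.counts = defaultdict(int)  # replica id -> local increments

    def increment(self, n: int = 1) -> None:
        self.counts[self.replica_id] += n

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        """Join of the two states: pointwise max, which is commutative,
        associative, and idempotent, hence order-independent convergence."""
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts[rid], n)

a, b = GCounter("A"), GCounter("B")
a.increment(3)          # updates proceed independently at each replica ...
b.increment(2)
a.merge(b); b.merge(a)  # ... and any exchange of states converges
assert a.value() == b.value() == 5
```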
Privacy-aware Biometric Blockchain based e-Passport System for Automatic Border Control
In the mid-1990s, World Wide Web technology first stepped into our lives. Now, some 30 years later, widespread internet access and mature computing technology are bringing embodied real life into the Metaverse through digital twins. The internet is blurring not only the concept of physical distance but also the boundary between the real and the virtual world. Another breakthrough in computing is the blockchain, which shifts the root of trust from a system administrator to the computational power of the system. Furthermore, its favourable properties, such as an immutable time-stamped transaction history and atomic smart contracts, have triggered the development of decentralized autonomous organizations (DAOs). Combining the above two, this thesis presents a privacy-aware biometric blockchain based e-passport system for automatic border control (ABC), which aims to improve the efficiency of existing ABC systems. Specifically, by constructing a border control Metaverse DAO, the border control workload can be autonomously executed by atomic smart contracts as transactions and then immutably recorded on the blockchain. Moreover, to digitize border-crossing documentation, a biometric blockchain based e-passport system (BBCVID) is created that generates an immutable real-world identity digital twin in the border control Metaverse DAO through blockchain and biometric identity authentication. That is, by digitizing border-crossing documentation and automating both biometric identity authentication and documentation verification, our proposal is able to significantly improve the efficiency of existing border control. System simulation and performance evaluation with Hyperledger Caliper show that the proposed system improves existing border control efficiency by 3.5 times on average. Moreover, the dynamic digital twin constructed by BBCVID makes computing techniques such as machine learning and big data analysis applicable to real-world entities, which has huge potential for creating more value by enabling smarter ABC systems.
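The immutable, time-stamped record-keeping the proposal depends on can be illustrated with a toy hash chain: each border-crossing event links to the hash of the previous record, so any retroactive edit invalidates every later link. This stand-in is for illustration only; the actual system uses a Hyperledger-based blockchain and smart contracts rather than a local list.

```python
# Toy append-only hash chain of border-crossing records.
import hashlib, json, time

def record_crossing(chain: list[dict], traveller_id: str, gate: str) -> dict:
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = {"traveller": traveller_id, "gate": gate,
            "time": time.time(), "prev": prev_hash}
    body["hash"] = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()).hexdigest()
    chain.append(body)
    return body

def verify(chain: list[dict]) -> bool:
    prev = "0" * 64
    for rec in chain:
        expected = {k: v for k, v in rec.items() if k != "hash"}
        if rec["prev"] != prev or rec["hash"] != hashlib.sha256(
                json.dumps(expected, sort_keys=True).encode()).hexdigest():
            return False
        prev = rec["hash"]
    return True

ledger: list[dict] = []
record_crossing(ledger, "P1234567", "gate-3")
record_crossing(ledger, "P7654321", "gate-1")
assert verify(ledger)
ledger[0]["gate"] = "gate-9"   # tampering with history ...
assert not verify(ledger)      # ... is detected by re-verification
```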
Fine-Grained Access Control on Android Component
The pervasiveness of Android devices in today’s interconnected world emphasizes the importance of mobile security in protecting user privacy and digital assets. Android’s current security model primarily enforces application-level mechanisms, which fail to address component-level (e.g., Activity, Service, and Content Provider) security concerns. Consequently, third-party code may exploit an application’s permissions, and security features like MDM (mobile device management) or BYOD (bring your own device) face limitations in their implementation. To address these concerns, we propose a novel context-aware access control mechanism for Android components that enforces layered security at multiple Exception Levels (ELs), including EL0, EL1, and EL3. This approach effectively restricts component privileges and controls resource access as needed. Our solution comprises Flasa at EL0, which extends SELinux policies for inter-component interactions and SQLite content control; Compac, spanning EL0 and EL1, which enforces component-level permission controls through Android runtime and kernel modifications; and TzNfc, which leverages TrustZone technologies to secure third-party services and limit system privileges via a Trusted Execution Environment (TEE). Our evaluations demonstrate the effectiveness of the proposed solution in containing component privileges, controlling inter-component interactions, and protecting component-level resource access. This solution, complementing Android’s existing security architecture, provides a more comprehensive approach to Android security, benefiting users, developers, and the broader mobile ecosystem.
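The component-level (rather than application-level) permission idea behind Compac can be sketched conceptually: each component within an app carries its own grant set, so bundled third-party code cannot ride on the host application's permissions. All names below are illustrative; the real enforcement lives in the Android runtime, kernel, and TrustZone, not in application code.

```python
# Conceptual sketch of per-component permission grants within one app.
class PermissionDenied(Exception):
    pass

# Per-component grants inside one app, instead of one app-wide set.
COMPONENT_GRANTS = {
    ("com.example.app", "MainActivity"): {"INTERNET", "CAMERA"},
    ("com.example.app", "AdLibService"): {"INTERNET"},  # bundled 3rd-party code
}

def check_component_permission(app: str, component: str, perm: str) -> None:
    if perm not in COMPONENT_GRANTS.get((app, component), set()):
        raise PermissionDenied(f"{app}/{component} lacks {perm}")

def open_camera(app: str, component: str) -> str:
    check_component_permission(app, component, "CAMERA")
    return "camera handle"

open_camera("com.example.app", "MainActivity")      # host component: allowed
try:
    open_camera("com.example.app", "AdLibService")  # third-party component
except PermissionDenied as e:
    print(e)  # denied, even though the app as a whole holds CAMERA
```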
A Simple Single Slot Finality Protocol For Ethereum
Currently, Gasper, the consensus protocol implemented in Ethereum, takes between 64 and 95 slots to finalize blocks. Because of this, a significant portion of the chain is susceptible to reorgs. The possibility of capturing MEV (Maximum Extractable Value) through such reorgs can then disincentivize honestly following the protocol, breaking the desired correspondence between honest and rational behavior. Moreover, the relatively long time to finality forces users to choose between economic security and faster transaction confirmation. This motivates the study of so-called single slot finality protocols: consensus protocols that finalize a block in each slot and, more importantly, finalize the block proposed at a given slot within that slot.
In this work, we propose a simple, non-black-box protocol that combines a synchronous dynamically available protocol with a partially synchronous finality gadget, resulting in a consensus protocol that can finalize one block per slot, paving the way to single slot finality within Ethereum. Importantly, the protocol we present can finalize the block proposed in a slot within that same slot.
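To make the finality-gadget half of such a design concrete, the toy tally below finalizes a block once validators controlling at least two thirds of the stake have voted for it within the slot. It ignores the dynamically available component, equivocation, and the actual vote structure; thresholds and names are illustrative assumptions.

```python
# Toy per-slot finality tally: a block is finalized by a 2/3-stake
# supermajority of votes cast within the slot.
from collections import defaultdict

def finalize_slot(votes: dict[str, str], stake: dict[str, float]) -> str | None:
    """votes: validator -> block hash; returns the finalized block, if any."""
    total = sum(stake.values())
    weight = defaultdict(float)
    for validator, block in votes.items():
        weight[block] += stake[validator]
    for block, w in weight.items():
        if w >= 2 * total / 3:  # supermajority within the slot
            return block
    return None  # no block finalized this slot

stake = {"v1": 30.0, "v2": 30.0, "v3": 30.0, "v4": 10.0}
assert finalize_slot({"v1": "B", "v2": "B", "v3": "B"}, stake) == "B"  # 90/100
assert finalize_slot({"v1": "B", "v2": "B"}, stake) is None  # 60/100: below 2/3
```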