7,645 research outputs found

    Secure and efficient application monitoring and replication

    Memory corruption vulnerabilities remain a grave threat to systems software written in C/C++. Current best practices dictate compiling programs with exploit mitigations such as stack canaries, address space layout randomization, and control-flow integrity. However, adversaries quickly find ways to circumvent such mitigations, sometimes even before these mitigations are widely deployed. In this paper, we focus on an "orthogonal" defense that amplifies the effectiveness of traditional exploit mitigations. The key idea is to create multiple diversified replicas of a vulnerable program and then execute these replicas in lockstep on identical inputs while simultaneously monitoring their behavior. A malicious input that causes the diversified replicas to diverge in their behavior will be detected by the monitor; this allows discovery of previously unknown attacks such as zero-day exploits. So far, such multi-variant execution environments (MVEEs) have been held back by substantial runtime overheads. This paper presents a new design, ReMon, that is non-intrusive, secure, and highly efficient. Whereas previous schemes either monitor every system call or none at all, our system enforces cross-checking only for security-critical system calls while supporting more relaxed monitoring policies for system calls that are not security-critical. We achieve this by splitting the monitoring and replication logic into an in-process component and a cross-process component. Our evaluation shows that ReMon offers the same level of security as conservative MVEEs and runs realistic server benchmarks at near-native speeds.
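
    The selective cross-checking idea described in this abstract can be illustrated with a minimal sketch. It is not ReMon's implementation: the `SECURITY_CRITICAL` set, the `cross_check` function, and the representation of a system call as a `(name, arguments)` tuple are all assumptions made for illustration. A real monitor would also have to normalize arguments, such as pointers, that legitimately differ between diversified replicas.

```python
from typing import List, Tuple

Syscall = Tuple[str, tuple]  # (name, arguments); a simplified, hypothetical model

# Hypothetical policy: which calls are treated as security critical.
SECURITY_CRITICAL = {"execve", "mprotect", "open", "write"}


class DivergenceError(Exception):
    """Raised when replicas disagree on a security-critical call."""


def cross_check(replica_calls: List[Syscall]) -> str:
    """Decide how one lockstep round of system calls is handled.

    replica_calls holds the call each replica wants to issue in this round.
    Returns "relaxed" when no replica requests a security-critical call
    (no comparison is made), or "monitored" when the calls were compared
    and found identical.
    """
    names = {name for name, _ in replica_calls}
    if names.isdisjoint(SECURITY_CRITICAL):
        return "relaxed"
    first = replica_calls[0]
    if any(call != first for call in replica_calls[1:]):
        raise DivergenceError(f"replicas diverged: {replica_calls}")
    return "monitored"
```

    In this sketch a round such as `[("getpid", ()), ("getpid", ())]` takes the relaxed path, while two replicas that request different security-critical calls raise `DivergenceError`.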

    Computation of Buffer Capacities for Throughput Constrained and Data Dependent Inter-Task Communication

    Streaming applications are often implemented as task graphs. Currently, techniques exist to derive buffer capacities that guarantee satisfaction of a throughput constraint for task graphs in which the inter-task communication is data-independent, i.e., the amount of data produced and consumed is independent of the data values in the processed stream. This paper presents a technique to compute buffer capacities that satisfy a throughput constraint for task graphs with data-dependent inter-task communication, given that the task graph is a chain. We demonstrate the applicability of the approach by computing buffer capacities for an MP3 playback application, of which the MP3 decoder has a variable consumption rate. We are not aware of alternative approaches to compute buffer capacities that guarantee satisfaction of the throughput constraint for this application.
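
    To make the buffer-sizing problem concrete, the following is a brute-force simulation of a two-task chain with a variable consumption rate. It is not the technique of the paper, which gives guarantees for all streams: this sketch only checks one recorded consumption trace. The time-stepped model, the function names, and the `delay` and `limit` parameters are assumptions made for illustration.

```python
from typing import Optional, Sequence


def trace_is_sustained(capacity: int, produce: int,
                       consume_trace: Sequence[int], delay: int) -> bool:
    """Simulate one buffer between a producer and a consumer.

    Each time step the producer writes `produce` tokens if they fit in the
    buffer (otherwise it stalls). After `delay` start-up steps the consumer
    must read consume_trace[i] tokens in its i-th step; if they are not
    available, the throughput constraint is violated.
    """
    fill = 0
    for step in range(delay + len(consume_trace)):
        if fill + produce <= capacity:
            fill += produce
        if step >= delay:
            need = consume_trace[step - delay]
            if need > fill:
                return False
            fill -= need
    return True


def smallest_capacity(produce: int, consume_trace: Sequence[int],
                      delay: int, limit: int) -> Optional[int]:
    """Return the smallest capacity up to `limit` that sustains the trace."""
    for capacity in range(1, limit + 1):
        if trace_is_sustained(capacity, produce, consume_trace, delay):
            return capacity
    return None
```

    For example, `smallest_capacity(2, [1, 3, 2, 2, 3, 1], delay=1, limit=16)` returns `5` in this model, and a trace whose demand can never be met returns `None`.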

    Scalability of broadcast performance in wireless network-on-chip

    Networks-on-Chip (NoCs) are currently the paradigm of choice to interconnect the cores of a chip multiprocessor. However, conventional NoCs may not suffice to fulfill the on-chip communication requirements of processors with hundreds or thousands of cores. The main reason is that the performance of such networks drops as the number of cores grows, especially in the presence of multicast and broadcast traffic. This not only limits the scalability of current multiprocessor architectures, but also sets a performance wall that prevents the development of architectures that generate moderate-to-high levels of multicast. In this paper, a Wireless Network-on-Chip (WNoC) where all cores share a single broadband channel is presented. Such a design is conceived to provide low latency and ordered delivery for multicast/broadcast traffic, in an attempt to complement a wireline NoC that will transport the rest of the communication flows. To assess the feasibility of this approach, the network performance of WNoC is analyzed as a function of the system size and the channel capacity, and then compared to that of wireline NoCs with embedded multicast support. Based on this evaluation, preliminary results on the potential performance of the proposed hybrid scheme are provided, together with guidelines for the design of MAC protocols for WNoC. Peer Reviewed. Postprint (published version).

    Token Tenure and PATCH: A Predictive/Adaptive Token-Counting Hybrid

    Traditional coherence protocols present a set of difficult trade-offs: the reliance of snoopy protocols on broadcast and ordered interconnects limits their scalability, while directory protocols incur a performance penalty on sharing misses due to indirection. This work introduces PATCH (Predictive/Adaptive Token-Counting Hybrid), a coherence protocol that provides the scalability of directory protocols while opportunistically sending direct requests to reduce sharing latency. PATCH extends a standard directory protocol to track tokens and use token-counting rules for enforcing coherence permissions. Token counting allows PATCH to support direct requests on an unordered interconnect, while a mechanism called token tenure provides broadcast-free forward progress using the directory protocol’s per-block point of ordering at the home along with either timeouts at requesters or explicit race notification messages. PATCH makes three main contributions. First, PATCH introduces token tenure, which provides broadcast-free forward progress for token-counting protocols. Second, PATCH deprioritizes best-effort direct requests to match or exceed the performance of directory protocols without restricting scalability. Finally, PATCH provides greater scalability than directory protocols when using inexact encodings of sharers because only processors holding tokens need to acknowledge requests. Overall, PATCH is a “one-size-fits-all” coherence protocol that dynamically adapts to work well for small systems, large systems, and anywhere in between.
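
    The abstract does not spell out the token-counting rules. The sketch below assumes the rule commonly used in token-based coherence: each block has a fixed number of tokens, a processor may read while holding at least one token, and it may write only while holding all of them. The `Block` class and its method names are hypothetical, and token tenure, timeouts, and the directory itself are not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Block:
    """One cache block with a fixed token count (illustrative model only)."""
    total_tokens: int
    held: Dict[str, int] = field(default_factory=dict)  # holder -> tokens

    def can_read(self, proc: str) -> bool:
        # Assumed rule: at least one token grants read permission.
        return self.held.get(proc, 0) >= 1

    def can_write(self, proc: str) -> bool:
        # Assumed rule: all tokens are required for write permission.
        return self.held.get(proc, 0) == self.total_tokens

    def transfer(self, src: str, dst: str, count: int) -> None:
        """Move tokens between holders; tokens are never created or lost."""
        if count < 0 or self.held.get(src, 0) < count:
            raise ValueError("source does not hold enough tokens")
        self.held[src] = self.held.get(src, 0) - count
        self.held[dst] = self.held.get(dst, 0) + count
```

    Because a writer must gather every token, only the current token holders have to respond to a write request in this model, which matches the abstract's remark that only processors holding tokens need to acknowledge requests.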

    Black Hole Search with Finite Automata Scattered in a Synchronous Torus

    We consider the problem of locating a black hole in synchronous anonymous networks using finite-state agents. A black hole is a harmful node in the network that destroys any agent visiting that node without leaving any trace. The objective is to locate the black hole without destroying too many agents. This is difficult to achieve when the agents are initially scattered in the network and are unaware of the location of each other. Previous studies for black hole search used more powerful models where the agents had non-constant memory, were labelled with distinct identifiers and could either write messages on the nodes of the network or mark the edges of the network. In contrast, we solve the problem using a small team of finite-state agents each carrying a constant number of identical tokens that could be placed on the nodes of the network. Thus, all resources used in our algorithms are independent of the network size. We restrict our attention to oriented torus networks and first show that no finite team of finite-state agents can solve the problem in such networks when the tokens are not movable. In case the agents are equipped with movable tokens, we determine lower bounds on the number of agents and tokens required for solving the problem in torus networks of arbitrary size. Further, we present a deterministic solution to the black hole search problem for oriented torus networks, using the minimum number of agents and tokens.

    Buffer Capacity Computation for Throughput Constrained Streaming Applications with Data-Dependent Inter-Task Communication

    Streaming applications are often implemented as task graphs, in which data is communicated from task to task over buffers. Currently, techniques exist to compute buffer capacities that guarantee satisfaction of the throughput constraint if the amount of data produced and consumed by the tasks is known at design time. However, applications such as audio and video decoders have tasks that produce and consume an amount of data that depends on the decoded stream. This paper introduces a dataflow model that allows for data-dependent communication, together with an algorithm that computes buffer capacities that guarantee satisfaction of a throughput constraint. The applicability of this algorithm is demonstrated by computing buffer capacities for an H.263 video decoder.

    Artificial Neural Network Based Prediction Mechanism for Wireless Network on Chips Medium Access Control

    As per Moore’s law, continuous improvement in silicon process technologies has made it possible to integrate hundreds of cores onto a single chip. This has resulted in a paradigm shift towards multicore and many-core chips, where hundreds of cores are integrated on the same die and interconnected using an on-chip packet-switched network called a Network-on-Chip (NoC). Various tasks running on different cores generate different rates of communication between pairs of cores. This leads to an increase in spatial and temporal variation in the workloads, which impacts long-distance data communication over multi-hop wireline paths in conventional NoCs. Among the different alternatives, low-latency wireless interconnects operating in the millimeter-wave (mm-wave) band are, owing to their CMOS compatibility and energy efficiency, a nearer-term solution to this multi-hop communication problem in traditional NoCs. This has led to the recent exploration of mm-wave wireless technologies in wireless NoC (WiNoC) architectures. In a WiNoC, the mm-wave wireless interconnect is realized by equipping some NoC switches with a wireless interface (WI) that contains an antenna and a transceiver circuit tuned to operate at mm-wave frequencies. To enable collision-free and energy-efficient communication among the WIs, each WI is also equipped with a medium access control (MAC) unit. Owing to its simplicity and low-overhead implementation, a token-passing MAC mechanism that enables Time Division Multiple Access (TDMA) has been adopted in many WiNoC architectures. However, such a simple MAC mechanism is agnostic of the demand of the WIs. Depending on the tasks mapped onto a multicore system, the demand through the WIs can vary both spatially and temporally. Hence, if the MAC is agnostic of such demand variation, energy is wasted when no flit is transferred through the wireless channel.
To utilize the wireless channel efficiently, MAC mechanisms that dynamically allocate the token possession period of the WIs have recently been explored for WiNoCs. In such a dynamic MAC mechanism, a history-based prediction of the bandwidth demand of the WIs is used to adjust the token possession period in response to traffic variation. However, such simple history-based predictors are not accurate and limit the performance gain of dynamic MACs in a WiNoC. In this work, we investigate the design of an artificial neural network (ANN) based prediction methodology to accurately predict the bandwidth demand of each WI. Through system-level simulation, we show that dynamic MAC mechanisms enabled with the ANN-based prediction mechanism can significantly improve the performance of a WiNoC in terms of peak bandwidth, packet energy and latency compared to state-of-the-art dynamic MAC mechanisms.
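
The history-based baseline mentioned in this abstract can be sketched as follows. The abstract does not say which predictor the earlier dynamic MACs use, so the exponential moving average here is an assumption, as are the function names, the `alpha` smoothing factor, and the proportional slot allocation with a guaranteed minimum per WI. The ANN predictor proposed in the work is not modeled.

```python
from typing import Dict


def update_prediction(previous: float, observed: float,
                      alpha: float = 0.5) -> float:
    """Exponential moving average of a WI's demand (assumed predictor)."""
    return alpha * observed + (1 - alpha) * previous


def allocate_token_periods(predicted: Dict[str, float], frame_slots: int,
                           min_slots: int = 1) -> Dict[str, int]:
    """Split a TDMA frame into token possession periods.

    Every WI gets `min_slots`; the spare slots are shared in proportion
    to predicted demand, and slots lost to rounding go to the WIs with
    the highest predicted demand.
    """
    if not predicted or frame_slots < min_slots * len(predicted):
        raise ValueError("frame too short for the given WIs")
    spare = frame_slots - min_slots * len(predicted)
    total = sum(predicted.values())
    periods = {wi: min_slots for wi in predicted}
    if total > 0:
        for wi, demand in predicted.items():
            periods[wi] += int(spare * demand / total)
    ranked = sorted(predicted, key=predicted.get, reverse=True)
    leftover = frame_slots - sum(periods.values())
    for i in range(leftover):
        periods[ranked[i % len(ranked)]] += 1
    return periods
```

With predicted demands of 6, 2 and 0 flits for three WIs and an 11-slot frame, this sketch assigns 7, 3 and 1 slots respectively, so an idle WI holds the token only briefly.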