11 research outputs found
Study of Virtual Memory
This research report gives a general description of virtual memory systems. The mechanisms and policies, and their effect on the operation and efficiency of virtual memory, are explained. A virtual memory that uses a real-time virtual address decoder to decode a 32-bit virtual address for the secondary memory and obtain the primary address location is discussed. The decoder is developed with the use of associative, or content-addressable, memories. Replacement algorithms, used for selecting the pages of the main memory to be replaced, are described. The hardware implementation of the least recently used and least often used replacement policies using associative memories is presented.
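As a rough software illustration of the least recently used (LRU) replacement policy named in this abstract, the sketch below counts page faults for a sequence of page references. It is not the report's hardware design, which uses associative memories; the function name `simulate_lru` and its parameters are assumptions made for this example.

```python
from collections import OrderedDict


def simulate_lru(reference_string, num_frames):
    """Count page faults for an LRU policy over a sequence of page numbers.

    Hypothetical helper for illustration only; it models the policy in
    software and says nothing about the associative-memory hardware.
    """
    frames = OrderedDict()  # resident pages, least recently used first
    faults = 0
    for page in reference_string:
        if page in frames:
            # Hit: mark the page as most recently used.
            frames.move_to_end(page)
        else:
            # Miss: count a fault and evict the LRU page if memory is full.
            faults += 1
            if len(frames) >= num_frames:
                frames.popitem(last=False)
            frames[page] = True
    return faults
```

With three frames, the reference sequence `[1, 2, 3, 1, 4, 2]` causes five faults: the re-reference to page 1 protects it, so page 2 is evicted when page 4 arrives and must be fetched again.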
Scalable hardware memory disambiguation
This dissertation deals with one of the long-standing problems in Computer Architecture
– the problem of memory disambiguation. Microprocessors typically reorder
memory instructions during execution to improve concurrency. Such microprocessors
use hardware memory structures for memory disambiguation, known as Load-Store
Queues (LSQs), to ensure that memory instruction dependences are satisfied
even when the memory instructions execute out-of-order. A typical LSQ implementation
(circa 2006) holds all in-flight memory instructions in a physically centralized
LSQ and performs a fully associative search on all buffered instructions to ensure
that memory dependences are satisfied. These LSQ implementations do not scale
because they use large, fully associative structures, which are known to be slow and
power hungry. The increasing trend towards distributed microarchitectures further
exacerbates these problems. As on-chip wire delays increase and high-performance
processors become necessarily distributed, centralized structures such as the LSQ
can limit scalability.
This dissertation describes techniques to create scalable LSQs in both centralized
and distributed microarchitectures. The problems and solutions described
in this thesis are motivated and validated by real system designs. The dissertation
starts with a description of the partitioned primary memory system of the TRIPS
processor, of which the LSQ is an important component, and then through a series
of optimizations describes how the power, area, and centralization problems
of the LSQ can be solved with minor performance losses (if any) even for a large
number of in-flight memory instructions. The four solutions described in this dissertation
— partitioning, filtering, late binding and efficient overflow management —
enable power-, area-efficient, distributed and scalable LSQs, which in turn enable
aggressive large-window processors capable of simultaneously executing thousands
of instructions.
To mitigate the power problem, we replaced the power-hungry, fully associative
search with a power-efficient hash table lookup using a simple address-based
Bloom filter. Bloom filters are probabilistic data structures used for testing set
membership and can be used to quickly check if an instruction with the same data
address is likely to be found in the LSQ without performing the associative search.
Bloom filters typically eliminate more than 80% of the associative searches and they
are highly effective because in most programs, it is uncommon for loads and stores
to have the same data address and be in execution simultaneously.
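The filtering idea in the paragraph above can be sketched in software as follows. This is a minimal model, not the dissertation's hardware: the class names, the single hash function, the bucket count, and the use of counters (so that entries can be removed when instructions leave the queue) are all assumptions made for illustration.

```python
class AddressBloomFilter:
    """Counting Bloom filter over data addresses (illustrative assumption)."""

    def __init__(self, num_buckets=64):
        self.counters = [0] * num_buckets

    def _index(self, address):
        # Hypothetical hash: drop the two low-order bits, then take a modulus.
        return (address >> 2) % len(self.counters)

    def insert(self, address):
        self.counters[self._index(address)] += 1

    def remove(self, address):
        self.counters[self._index(address)] -= 1

    def may_contain(self, address):
        # False means "definitely absent"; True means "possibly present".
        return self.counters[self._index(address)] > 0


class LoadStoreQueue:
    """Toy LSQ that consults the filter before the expensive full search."""

    def __init__(self):
        self.entries = []  # in-flight (kind, address) pairs
        self.filter = AddressBloomFilter()
        self.searches_performed = 0
        self.searches_skipped = 0

    def add(self, kind, address):
        self.entries.append((kind, address))
        self.filter.insert(address)

    def retire(self, kind, address):
        self.entries.remove((kind, address))
        self.filter.remove(address)

    def find_matches(self, address):
        if not self.filter.may_contain(address):
            self.searches_skipped += 1  # associative search avoided
            return []
        self.searches_performed += 1  # stands in for the associative search
        return [entry for entry in self.entries if entry[1] == address]
```

A filter miss guarantees that no buffered instruction uses the address, so the search is skipped safely. A filter hit may be a false positive, for example when two addresses share a bucket, in which case the full search runs and simply finds nothing.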
To rectify the area problem, we observe the fact that only a small fraction
of all memory instructions are dependent, that only such dependent instructions
need to be buffered in the LSQ, and that these instructions need to be in the LSQ
only for certain parts of the pipelined execution. We propose two mechanisms to
exploit these observations. The first mechanism, area filtering, is a hardware mechanism
that couples Bloom filters and dependence predictors to dynamically identify
and buffer only those instructions which are likely to be dependent. The second
mechanism, late binding, reduces the occupancy and hence size of the LSQ. Both of
these optimizations allow the number of LSQ slots to be reduced by up to one-half
compared to a traditional organization without any performance degradation.
Finally, we describe a new decentralized LSQ design for handling LSQ structural
hazards in distributed microarchitectures. Decentralization of LSQs, and to
a large extent distributed microarchitectures with memory speculation, has proved
to be impractical because of the high performance penalties associated with the
mechanisms for dealing with hazards. To solve this problem, we applied classic
flow-control techniques from interconnection networks for handling resource
conflicts. The first method, memory-side buffering, buffers the overflowing instructions
in a separate buffer near the LSQs. The second scheme, execution-side NACKing,
sends the overflowing instruction back to the issue window from which it is later
re-issued. The third scheme, network buffering, uses the buffers in the interconnection
network between the execution units and memory to hold instructions when the
LSQ is full, and uses virtual channel flow control to avoid deadlocks. The network
buffering scheme is the most robust of all the overflow schemes and shows less than
1% performance degradation due to overflows for a subset of SPEC CPU 2000 and
EEMBC benchmarks on a cycle-accurate simulator that closely models the TRIPS
processor.
The techniques proposed in this dissertation are independent and architecture-neutral,
and their cumulative benefits result in LSQs that can be partitioned at a
fine granularity and have low design complexity. Each of these partitions selectively
buffers only memory instructions with true dependences and can be closely coupled
with the execution units thus minimizing power, area, and latency. Such LSQ
designs with near-ideal characteristics are well suited for microarchitectures with
thousands of instructions in-flight and may enable even more aggressive microarchitectures
in the future.
Driving the Network-on-Chip Revolution to Remove the Interconnect Bottleneck in Nanoscale Multi-Processor Systems-on-Chip
The sustained demand for faster, more powerful chips has been met by the
availability of chip manufacturing processes allowing for the integration of increasing
numbers of computation units onto a single die. The resulting outcome,
especially in the embedded domain, has often been called SYSTEM-ON-CHIP
(SoC) or MULTI-PROCESSOR SYSTEM-ON-CHIP (MP-SoC).
MP-SoC design brings to the foreground a large number of challenges, one of
the most prominent of which is the design of the chip interconnection. With a
number of on-chip blocks presently ranging in the tens, and quickly approaching
the hundreds, the novel issue of how to best provide on-chip communication
resources is clearly felt.
NETWORKS-ON-CHIPS (NoCs) are the most comprehensive and scalable
answer to this design concern. By bringing large-scale networking concepts to
the on-chip domain, they guarantee a structured answer to present and future
communication requirements. The point-to-point connection and packet switching
paradigms they involve are also of great help in minimizing wiring overhead
and physical routing issues. However, as with any technology of recent inception,
NoC design is still an evolving discipline. Several main areas of interest
require deep investigation for NoCs to become viable solutions:
• The design of the NoC architecture needs to strike the best tradeoff among
performance, features and the tight area and power constraints of the on-chip
domain.
• Simulation and verification infrastructure must be put in place to explore,
validate and optimize the NoC performance.
• NoCs offer a huge design space, thanks to their extreme customizability in
terms of topology and architectural parameters. Design tools are needed
to prune this space and pick the best solutions.
• Even more so given their global, distributed nature, it is essential to evaluate
the physical implementation of NoCs to assess their suitability for
next-generation designs and their area and power costs.
This dissertation performs a design space exploration of network-on-chip architectures,
in order to point out the trade-offs associated with the design of
each individual network building block and with the design of the network topology
overall. The design space exploration is preceded by a comparative analysis
of state-of-the-art interconnect fabrics with one another and with early
network-on-chip prototypes. The ultimate objective is to point out the key advantages
that NoC realizations provide with respect to state-of-the-art communication
infrastructures and to point out the challenges that lie ahead in order to make
this new interconnect technology a reality. Among the latter, technology-related
challenges are emerging that call for dedicated design techniques at all
levels of the design hierarchy, in particular leakage power dissipation and the containment
of process variations and of their effects. The achievement of the above
objectives was enabled by means of a NoC simulation environment for cycle-accurate
modelling and simulation and by means of a back-end facility for the
study of NoC physical implementation effects. Overall, all the results provided
by this work have been validated on actual silicon layout.
Steep, Spatially Graded Recruitment of Feedback Inhibition by Sparse Dentate Granule Cell Activity
The dentate gyrus of the hippocampus is thought to subserve important physiological functions, such as 'pattern separation'. In chronic temporal lobe epilepsy, the dentate gyrus constitutes a strong inhibitory gate for the propagation of seizure activity into the hippocampus proper. Both examples are thought to depend critically on a steep recruitment of feedback inhibition by active dentate granule cells. Here, I used two complementary experimental approaches to quantitatively investigate the recruitment of feedback inhibition in the dentate gyrus. I show that the activity of approximately 4% of granule cells suffices to recruit maximal feedback inhibition within the local circuit. Furthermore, the inhibition elicited by a local population of granule cells is distributed non-uniformly over the extent of the granule cell layer. Locally and remotely activated inhibition differ in several key aspects, namely their amplitude, recruitment, latency and kinetic properties. Finally, I show that net feedback inhibition facilitates during repetitive stimulation. Taken together, these data provide the first quantitative functional description of a canonical feedback inhibitory microcircuit motif. They establish that sparse granule cell activity, within the range observed in vivo, steeply recruits spatially and temporally graded feedback inhibition.
Illumination matters. Revisiting the Roman house in a new light
Interpreting the social complexity of the Roman house requires a careful evaluation of existing evidence. With this in mind, recent work in the field has proposed a variety of different approaches, focusing each time on a specific type of source (architecture and décor, ancient texts, material evidence from excavated houses), each in turn recursively deemed more adequate for the purpose or more fruitful and less biased. This opposition of approaches and critiques between scholars has yielded an extraordinarily rich picture that, however, leaves some of the social dynamics of domestic space out of our reach. This dissertation, focusing on the case study of the House of the Greek Epigrams in the northern part of Insula V 1 in Pompeii, suggests a further level of understanding that combines the aforementioned types of sources with simulations and digital analyses to support archaeological interpretation. Everything visible in the house, including its architecture and its decorations, actively participated in the construction of the social identity of the owner of the house and the Romanitas of his family. However, everything visible is so by virtue of light, which is not a mere medium, but actively partakes in social dynamics and can be manipulated to meet certain demands. In this dissertation, light is considered in its dual aspect as a physical and as a visual and sensory phenomenon. Starting from the assumption that light is a powerful social agent, the study investigates, through historically grounded and physically accurate lighting simulations and analyses, the intertwined spatial and social circulation patterns in order to derive new insights into the social dynamics of the Roman house. In particular, this study argues that the social space of the Roman house was characterized by a greater complexity than that conveyed by ancient sources. 
It suggests a more nuanced picture, one of light and shadow but also of activity at different times of the day and year, and richer in people both in the foreground and in the background.
A hardware-software codesign framework for cellular computing
Until recently, the ever-increasing demand for computing power has been met on one hand by increasing the operating frequency of processors and on the other hand by designing architectures capable of exploiting parallelism at the instruction level through hardware mechanisms such as super-scalar execution. However, both these approaches seem to have reached a plateau, mainly due to issues related to design complexity and cost-effectiveness. To face the stabilization of performance of single-threaded processors, the current trend in processor design seems to favor a switch to coarser-grain parallelization, typically at the thread level. In other words, high computational power is achieved not only by a single, very fast and very complex processor, but through the parallel operation of several processors, each executing a different thread. Extrapolating this trend to take into account the vast amount of on-chip hardware resources that will be available in the next few decades (either through further shrinkage of silicon fabrication processes or by the introduction of molecular-scale devices), together with the predicted features of such devices (e.g., the impossibility of global synchronization or higher failure rates), it seems reasonable to foretell that current design techniques will not be able to cope with the requirements of next-generation electronic devices and that novel design tools and programming methods will have to be devised. A tempting source of inspiration to solve the problems implied by a massively parallel organization and inherently error-prone substrates is biology. In fact, living beings possess characteristics, such as robustness to damage and self-organization, which previous research has shown to be worth implementing in hardware. For instance, it was possible to realize relatively simple systems, such as a self-repairing watch.
Overall, these bio-inspired approaches seem very promising, but their appeal to a wider audience is limited because their heavily hardware-oriented designs lack some of the flexibility achievable with a general-purpose processor. In the context of this thesis, we will introduce a processor-grade processing element at the heart of a bio-inspired hardware system. This processor, based on a single instruction, features some key properties that allow it to maintain the versatility required by the implementation of bio-inspired mechanisms and to realize general computation. We will also demonstrate that the flexibility of such a processor enables it to be evolved so it can be tailored to different types of applications. In the second half of this thesis, we will analyze how the implementation of a large number of these processors can be used on a hardware platform to explore various bio-inspired mechanisms. Based on an extensible platform of many FPGAs, configured as a networked structure of processors, the hardware part of this computing framework is backed by an open library of software components that provides primitives for efficient inter-processor communication and distributed computation. We will show that this dual software–hardware approach allows a very quick exploration of different ways to solve computational problems using bio-inspired techniques. In addition, we also show that the flexibility of our approach allows it to exploit replication as a solution to issues that concern standard embedded applications.
Photochemische Eigenschaften und ultraschnelle Dynamik von Azobenzol-funktionalisierten Goldnanopartikeln
The main goal of the present Thesis was the investigation of the photoinduced
isomerization processes of azobenzenes (ABs) attached to
small gold nanoparticles (AuNPs) by static and femtosecond time-resolved
transient absorption spectroscopy. ABs functionalized with linker chains of three different lengths (C3, C7, C11)
were brought onto the surface of ~4 nm diameter AuNPs in a mixed
self-assembled monolayer (mSAM) with decanethiol or pentanethiol
as co-ligand. The AB-functionalized AuNPs show excellent photoisomerization
yields independent of the linker chain length. AuNPs functionalized with ABs bearing long alkyl
linker chains and short co-ligands reversibly
aggregate when switched to their cis state and disaggregate when isomerized
back to their trans state. The time-resolved absorption data
reveal excited-state absorption of AB and strong
signals of the simultaneously excited localized surface plasmon resonance
(LSPR). The excited-state dynamics of AB show surprisingly
little difference compared to the AB dynamics in solution, leading to
the conclusion that the AB photochrome and the AuNP core are efficiently
decoupled by the long (C11) linker chain. Furthermore, the
excited-state dynamics of Disperse
Red 1 in a mSAM with decanethiol on a AuNP surface
were investigated.
A second goal of this Thesis was the spectroscopic investigation
of the ultrafast relaxation dynamics of functionalized AuNPs themselves,
especially the hot electron cooling dynamics, and how they
are influenced by different ligands. AB-functionalized AuNPs show
the same hot electron lifetime of τ ~ 2 ps as purely decanethiol
functionalized AuNPs, revealing that the electronic states of AB do
not participate in the electron cooling process. A study investigating
the influence of (i) the ligand heat capacity and (ii) resonant coupling
between ligand absorption band and LSPR was performed.
Postcolonial studies after Foucault : Discourse, discipline, biopower, and governmentality as travelling concepts
In the wake of Edward Said's Orientalism, a substantial number of scholars have drawn on the work of Michel Foucault in their efforts to conceptualize both colonialism and the resistance against it. Postcolonial Studies After Foucault attempts to map this postcolonial engagement with Foucault, focusing on the use of four Foucauldian concepts in key postcolonial texts: "discourse", "discipline", "biopower", and "governmentality". Through a comparative analysis of the multiple meanings, functions, and effects of these concepts as they travel from one context into another, this study seeks to highlight the complex processes of transformation that underlie the recontextualization of these concepts. Moreover, by analyzing the patterns that appear in these transformations, Postcolonial Studies After Foucault aims to raise and address the question of whether the various postcolonial appropriations of Foucauldian concepts have given rise to a distinctively "Foucauldian" critique of colonial power, and how such a "Foucault Effect" relates to other conceptualizations and critiques of colonial power.
Ultraschnelle photochrome Reaktionen strukturell modifizierter Furylfulgide und eines verbrückten Azobenzols
In this Thesis, the effect of structural modifications on the ultrafast dynamics and photochemical properties of furylfulgides and azobenzenes, two important classes of photochromic molecular switches, was investigated by femtosecond time-resolved transient absorption spectroscopy (fsTAS). The parent compound of the furylfulgides (MeF) was systematically modified sterically, with a bulky isopropyl substituent (iPrF) or by intramolecular bridging (7rF) at the hexatriene/cyclohexadiene chromophore, and electronically, by benzannulation of the furyl ring (MeBF). The second task was the investigation of the photo-induced isomerization of a highly constrained bridged azobenzene derivative (brAB), a diazocine that is strongly sterically constrained by an ethylene bridge and can be regarded as the parent compound of a new, improved class of azobenzene switches.
Water and Brain Function: Effects of Hydration Status on Neurostimulation and Neurorecording
Introduction: TMS and EEG are used to study normal neurophysiology and to diagnose and treat clinical neuropsychiatric conditions, but can produce variable results or fail. Both techniques depend on electrical volume conduction, and thus on brain volumes. Hydration status can affect brain volumes and functions (including cognition), but effects on these techniques are unknown. We aimed to characterize the effects of hydration on TMS, EEG, and cognitive tasks. Methods: EEG and EMG were recorded during single-pulse TMS, paired-pulse TMS, and cognitive tasks from 32 human participants on dehydrated (12-hour fast/thirst) and rehydrated (1 liter of oral water ingestion in 1 hour) testing days. Hydration status was confirmed with urinalysis. MEP, ERP, and network analyses were performed to examine responses at the levels of muscle, brain, and higher-order functioning. Results: Rehydration decreased motor threshold (increased excitability) and shifted the motor hotspot. Significant effects on TMS measures occurred despite stimulation being re-localized and re-dosed to these new parameters. Rehydration increased SICF of the MEP, magnitudes of specific TEP peaks in inhibitory protocols, specific ERP peak magnitudes and reaction time during the cognitive task. Rehydration amplified nodal inhibition around the stimulation site in inhibitory paired-pulse networks and strengthened nodes outside the stimulation site in excitatory and CSP networks. Cognitive performance was not improved by rehydration, although similar performance was achieved with generally weaker network activity. Discussion: Results highlight differences between mild dehydration and rehydration. The rehydrated brain was easier to stimulate with TMS and produced larger responses to external and internal stimuli. This can be explained by the known physiology of body water dynamics, which encompass macroscopic and microscopic volume changes.
Rehydration can shift 3D cortical positioning, decrease scalp-cortex distance (bringing cortex closer to stimulator/recording electrodes), and cause astrocyte swelling-induced glutamate release. Conclusions: Previously unaccounted-for variables such as osmolarity and astrocyte and brain volumes likely affect neurostimulation/neurorecording. Controlling for and carefully manipulating hydration may reduce variability and improve therapeutic outcomes of neurostimulation. Dehydration is common and produces less excitable circuits. Rehydration should offer a mechanism to macroscopically bring target cortical areas closer to an externally applied neurostimulation device to recruit greater volumes of tissue and microscopically favor excitability in the stimulated circuits.