Scientific Progress
This grant enabled the development of a new field in computer design and fabrication based on self-assembled resonance energy transfer (RET) devices. Through this work we also discovered new ways to apply these devices, and we were the first to develop a thermo-mechanical model of DNA that reconciles observed UV absorption changes with the actual structure of complex DNA nanostructures. The major themes of our other accomplishments are: (1) a proof-of-concept high-density optical memory, (2) a physical-key-based information assurance system, (3) a network-on-chip computer architecture based on RET, and (4) a high on/off ratio RET switch for all-optical computation with nanoscale device density.
Final Report
This report summarizes the progress made under ARO YIP grant 54696-EL-PCS during the period 10/1/2009 to 9/30/2014 across the eight sub-project thrusts. The document is organized into sections that describe each sub-project, with an emphasis on important results and findings. Each sub-project is described in complete detail in the papers identified in the bibliography.
Project 1. Design Automation for Resonance Energy Transfer Networks
Resonance energy transfer (RET) circuits are networks of photo-active molecules that can implement arbitrary logic functions. The nanoscale size of these structures can bring high-density computation to new domains, e.g., in vivo sensing and computation. A key challenge in the design of a RET network is to find, among a huge set of configurations (i.e., the design space), the optimum choice and arrangement of molecules on a nanostructure. The prohibitively large size of the design space makes it impractical to evaluate every possible configuration, so design-space pruning must be integrated into the design flow. To this end, this project developed a computer-aided design framework, called RETLab, that enables structured pruning of the design space to extract a sufficiently small subset, which is then fully evaluated and ranked by user-defined metrics to yield the best configuration. More importantly, we developed a new RET-simulation algorithm that is several orders of magnitude faster than conventional Monte-Carlo-based simulation (MCS); for a 4-node network, for example, it is one million times faster. This speedup in configuration evaluation enabled a significantly more extensive design-space exploration, with fewer and less constrained heuristics, than existing RET-network design methods, which were ad hoc and relied on MCS for configuration evaluation.
We found that the actual functionality of a fabricated RET network usually deviated from its desired functionality for two main reasons: (1) unintended changes in inter-chromophore distances imposed by the underlying nanostructure, and (2) undesired RET properties dictated by the molecular structure of the chromophores. We described the desired functionality by an exciton-flow graph (EFG), which served as the ideal model. The automation problem we identified was therefore to find the configuration that yields the best behavioral match to the EFG.
Existing methods for RET network design were all ad hoc and limited to a particular functionality. Besides lacking generality, these methods have low throughput and therefore cannot efficiently explore large design spaces within a limited design time.
We developed a generic RET-network design framework with a higher throughput than existing ad hoc methods. The higher throughput of our design flow comes from avoiding unnecessary simulations and from employing a novel simulation algorithm which, in addition to being highly precise, is several orders of magnitude faster than conventional simulation methods. This enabled a more extensive exploration of larger design spaces than existing methods allow, making RETLab the first efficient framework for optimizing RET circuit components and devices.
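To illustrate why Monte-Carlo-based evaluation is costly, the following is a minimal sketch for the simplest possible case, a single donor-acceptor pair. It is not the RETLab algorithm, which this report does not specify. The rate constants and function names are illustrative assumptions. The sketch samples two competing exponential processes (transfer and donor decay) and compares the estimate with the closed-form transfer efficiency k_t / (k_t + k_d).

```python
import random


def mc_transfer_efficiency(k_transfer, k_decay, n_excitons, seed=0):
    """Monte-Carlo estimate of the fraction of donor excitons that
    reach the acceptor (rates are hypothetical, in arbitrary units)."""
    rng = random.Random(seed)
    transferred = 0
    for _ in range(n_excitons):
        # Sample the waiting time of each competing exponential process.
        t_transfer = rng.expovariate(k_transfer)
        t_decay = rng.expovariate(k_decay)
        if t_transfer < t_decay:
            transferred += 1
    return transferred / n_excitons


def analytic_transfer_efficiency(k_transfer, k_decay):
    """Closed-form efficiency for two competing first-order processes."""
    return k_transfer / (k_transfer + k_decay)


if __name__ == "__main__":
    exact = analytic_transfer_efficiency(3.0, 1.0)
    for n in (100, 10_000, 1_000_000):
        estimate = mc_transfer_efficiency(3.0, 1.0, n)
        print(n, estimate, abs(estimate - exact))
```

The statistical error of the sampled estimate shrinks only as the inverse square root of the number of simulated excitons, whereas the closed-form evaluation is a single division. This gap is the kind of cost that a non-sampling evaluation method can avoid.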
Project 2. Optical Storage Based on Self-assembled DNA Networks
Building on the potential for DNA to achieve vast enhancements in storage density, this project demonstrated a novel retrieval technique which we call polychromatic address multiplexing (PAM) to access multiple bits stored in the same cell. By exploiting nanoscale fluorescence-based storage elements, PAM enables storage of hundreds of bits in a single cell. In this fashion, higher storage densities beyond the diffraction limit are achieved with this technique, which can also be incorporated into conventional optical-storage technologies.
We demonstrated the information retrieval technique and showed that it has the potential to increase the areal storage density of current optical discs by as much as 5500×, beyond the diffraction limit. PAM exploits acceptor saturation in precisely assembled nanoscale FRET-based structures to induce a fluorescence increase that is caused exclusively by the addressed structures, while other structures remain inactive. Using commercially available dyes, we synthesized two kinds of storage elements to experimentally demonstrate, as a proof of concept, storage cell capacities of two, three, and four bits. We also simulated 6-color storage elements based on PAM-optimized synthetic dyes, and the channel capacities were shown to be at least six bits. The exponential growth of the address space enabled by PAM makes it an efficient technique for many data-multiplexing applications including, but not limited to, optical storage media.
Project 3. Self-assembling Circuit Defect Modeling
The self-assembly of nanoelectronic devices provides an opportunity to achieve unprecedented density and manufacturing scale in the post-Moore's Law era. Bottom-up DNA self-assembly has emerged as a promising technique towards achieving this vision, and it has been used to demonstrate precise patterning and functionalisation at resolutions below 20 nm. However, a lack of understanding of fabrication defects and their impact on circuit behaviour is a major obstacle to the eventual application of these substrates to circuit design. This project developed a classification of defects observed in our experimental work on self-assembled nanostructures. Atomic force microscope (AFM) images were used to study these defects and determine their relative frequencies. We then connected these defects to fault models and predicted their likely impact on the behavior of logic gates. Based on SPICE (Simulation Program with Integrated Circuit Emphasis) simulation data for the proposed layouts, we concluded that there is a predictive connection between faulty logic behavior and physical defects for future DNA self-assembled nanoelectronics. This work will be useful in predicting the potential success of defect-tolerance techniques for DNA self-assembled nanoelectronic substrates.
We characterized the structural defects that occur during bottom-up DNA self-assembly. We developed a classification of these defects using experimental data from AFM images of self-assembled DNA nanostructures. These images were used to study defects in detail and to determine their relative frequencies of occurrence. We classified defects as either structural defects that can currently be observed, or anticipated defects that are expected to occur in future nanostructures. We related these defects to circuit-level fault models and predicted their likely impact on the behavior of logic gates. We then developed a set of fault models for CMOS and carbon-nanotube (CNT) layouts of an inverter, a NAND gate, and a NOR gate. These gates were simulated using the SPICE3f5 software with several fault injections to compare the outputs of the two technologies. Missing functionalisation sites and tiles make up approximately 65% of the known physical defects. These defects tend to result in single stuck-line faults, and we assume the same holds for anticipated defects. Nonspecific functionalisation defects result in potential delay faults. However, these faults are not observable at the simulated switching frequencies for the CNT models. Future work will continue to explore the potential connections between physical defects and fault models for more complex circuits and for complex faults such as bridging faults.
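The single stuck-line fault model mentioned above can be illustrated at the logic level. The following is a minimal sketch, assuming ideal Boolean gates rather than the SPICE3f5 transistor-level simulations used in the project. The gate table, the `stuck` tuple convention, and the function names are hypothetical. For each of the three gates studied, it lists the input vectors whose output differs between the fault-free gate and a gate with one line held at a fixed value.

```python
from itertools import product

# (number of inputs, Boolean function) for the three gates studied.
GATES = {
    "INV": (1, lambda a: 1 - a),
    "NAND": (2, lambda a, b: 1 - (a & b)),
    "NOR": (2, lambda a, b: 1 - (a | b)),
}


def evaluate(gate, inputs, stuck=None):
    """Evaluate a gate. `stuck` is (line, value), where line is an
    input index or the string "out"; None means fault-free."""
    _, fn = GATES[gate]
    values = list(inputs)
    if stuck is not None and stuck[0] != "out":
        values[stuck[0]] = stuck[1]  # force the faulty input line
    result = fn(*values)
    if stuck is not None and stuck[0] == "out":
        result = stuck[1]  # force the faulty output line
    return result


def detecting_vectors(gate, stuck):
    """Input vectors on which the faulty gate differs from the good one."""
    n_inputs, _ = GATES[gate]
    return [
        v for v in product((0, 1), repeat=n_inputs)
        if evaluate(gate, v) != evaluate(gate, v, stuck)
    ]


if __name__ == "__main__":
    print(detecting_vectors("NAND", (0, 1)))    # input 0 stuck at 1
    print(detecting_vectors("INV", ("out", 0))) # output stuck at 0
```

A fault with an empty list of detecting vectors would be unobservable at the gate output. That is the logic-level analogue of a defect having no visible effect in simulation.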
Project 4. Integration of Sensing and Computation at the Nanoscale
This project explored the architectural implications of integrating computation and molecular probes to form nanoscale sensor processors (nSP). We showed how nSPs may enable new computing domains and automate tasks that currently require expert scientific training and costly equipment. This new application domain severely constrains nSP size, which significantly impacts the architectural design space. In this context, we explored nSP architectures and developed an nSP design that includes a simple accumulator-based ISA, sensors, limited memory and communication transceivers. To reduce the application memory footprint, we introduced the concept of instruction-fused sensing. We used simulation and analytical models to evaluate nSP designs executing a representative set of target applications. Furthermore, we designed a candidate nSP technology based on optical Resonance Energy Transfer (RET) logic that enables the small size required by the application domain; our smallest design is about the size of the largest known virus. We also showed laboratory results that demonstrate initial steps towards a simple prototype.
Two driving forces on computer architecture are application requirements and technology change. The combination of important problems in the life sciences and advances in material science is exposing a new computational domain: biological-scale integrated sensing and processing. The ability to utilize programmable devices at biological scales may enable life scientists to perform hypothesis testing previously thought impossible. This domain presents new challenges to computer architects due to the extreme size constraints: a device must be capable of diffusing through small volumes while still meeting application requirements. This project developed a first-of-its-kind architecture for nanoscale sensing and processing. We analyzed the application characteristics (e.g., long time scales and common operations) to design a multicycle accumulator-based architecture. A novel aspect of this architecture is instruction-fused sensing, which exploits the unified use of nanoscale devices for both sensing and logic design to allow sensors to directly modify logic values (i.e., instruction opcode bits). We implemented several representative applications that execute on the architecture and demonstrated capabilities (e.g., sensing based on complex logic) beyond those achievable with current simple biological sensors. This work represented our first steps toward developing biological-scale computing systems.
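The idea of a sensor directly modifying opcode bits can be sketched with a toy accumulator machine. Everything in the sketch is an assumption made for illustration: the 2-bit opcode encoding, the instruction tuple layout, and the choice to OR the sensor value into the low opcode bit are not taken from the nSP design itself. The point is only that a fused instruction can replace a separate read-sensor, test, and branch sequence, which reduces the program's memory footprint.

```python
# Hypothetical 2-bit opcode encoding (not the actual nSP ISA).
NOP, ADD, CLR, HALT = 0b00, 0b01, 0b10, 0b11


def run(program, sensor_trace):
    """Run `program` once per sensor sample and return the accumulator.

    program: list of (opcode, operand, fused_sensor) tuples, where
        fused_sensor is a sensor name or None.
    sensor_trace: list of dicts mapping sensor name -> 0 or 1.
    """
    acc = 0
    for sample in sensor_trace:
        for opcode, operand, fused in program:
            if fused is not None:
                # Instruction-fused sensing: the sensor output drives
                # the low opcode bit, so NOP becomes ADD, CLR becomes HALT.
                opcode |= sample[fused]
            if opcode == ADD:
                acc += operand
            elif opcode == CLR:
                acc = 0
            elif opcode == HALT:
                return acc
    return acc


if __name__ == "__main__":
    # One instruction counts the samples in which sensor "s0" is active.
    count_events = [(NOP, 1, "s0")]
    trace = [{"s0": 1}, {"s0": 0}, {"s0": 1}]
    print(run(count_events, trace))
```

In this sketch the event counter occupies a single instruction word. An unfused version would need at least a sensor read, a conditional branch, and an add.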
Project 5. Label-free Sensing with Self-assembling Nanoscale Systems
The self-assembly of molecularly precise nanostructures is widely expected to form the basis of future high-speed integrated circuits, but the technologies suitable for such circuits are not well understood. In this project, DNA self-assembly was used to create molecular logic circuits that can selectively identify specific biomolecules in solution by encoding the optical response of near-field coupled arrangements of chromophores. The resulting circuits can detect label-free, femtomole quantities of multiple proteins, DNA oligomers, and small fragments of RNA in solution via ensemble optical measurements. This method, which is capable of creating multiple logic-gate-sensor pairs on a 2 × 80 × 80 nm DNA grid, was a first step toward more sophisticated nanoscale logic circuits capable of interfacing computers with biological processes.
The methods developed in this project demonstrated the feasibility of sensing biomolecules with RET logic. The method takes advantage of DNA self-assembly to build nanoscale grids with integrated RET circuits designed to operate as digital multiplexers. However, an important aspect of this work was to demonstrate viable pathways towards building larger RET circuits suitable for high-throughput sensing of many analytes simultaneously. To do this, we used Boolean logic to create a multiplexer by designing logic terms that enable a distinct output for each unique sensor. Upon excitation, or addressing, we observed the output from one and only one kind of sensor. In this way, it is possible to combine many such sensors into a single monolith, smaller than the diffraction limit, as long as each address uniquely identifies a single kind of sensor. The logprime-output-encoding technique we developed trades signal-to-noise for address space by requiring a greater degree of significance in any measured response.
Thus, single-photon-counting techniques with an ability to detect attomoles of chromophores, dilute gates where the analyte is in excess, and a library of only six distinct chromophores (i.e., five inputs and one output) will enable sensing of over 24 analytes on a single 4 × 4 DNA grid. After accounting for yield (35%) and a diffraction-limited spot size of approximately 700 nm² (e.g., using 600-nm fluorescent output), the overall sensor density could be as high as 10¹³ m⁻² unique analytes, which currently exceeds the density of next-generation sensor arrays (e.g., gene chips with 500-nm-diameter probe spots) by an order of magnitude.
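The density estimate above can be checked with a short calculation. The sketch below makes two assumptions that the text does not state explicitly: that one DNA grid occupies each diffraction-limited spot, and that the quoted spot size denotes a square roughly 700 nm on a side. A literal area of 700 square nanometres would instead give a density near 10^16 per square metre. The function and parameter names are illustrative.

```python
def sensor_density_per_m2(analytes_per_grid, yield_fraction, spot_side_nm):
    """Unique analytes per square metre, assuming one grid per spot
    and a square spot of the given side length (both assumptions)."""
    spot_area_m2 = (spot_side_nm * 1e-9) ** 2
    return analytes_per_grid * yield_fraction / spot_area_m2


if __name__ == "__main__":
    # 24 analytes per grid, 35% yield, 700 nm spot side (from the text).
    density = sensor_density_per_m2(24, 0.35, 700)
    print(f"{density:.2e} analytes per m^2")
```

Under these assumptions the result is on the order of 10^13 per square metre, consistent with the figure quoted in the text.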
Project 6. Molecular-scale Network-on-chip (NoC)
In this project, we explored the use of emerging molecular-scale devices to construct nanophotonic networks called Molecular-scale Network-on-Chip (mNoC). We leveraged quantum dot LEDs, which provide electrical-to-optical signal modulation, and chromophores, which provide optical signal filtering for receivers. These devices replace the ring resonators and the external laser source used in contemporary nanophotonic NoCs. We developed several crossbar structures, including Single Writer Multiple Reader (SWMR), and studied the implications of the new mNoC crossbar for overall system design. An mNoC SWMR crossbar can scale up to radix 256, and our preliminary evaluation shows that it reduces average packet latency by over 50% and power consumption by 40% compared with a ring-based alternative.
The chromophores replace the ring resonators and the quantum dot LED replaces the external laser source used in current nanophotonic NoCs. We also studied SWMR, Multiple Writer Single Reader (MWSR), and Multiple Writer Multiple Reader (MWMR) bus-based crossbar mNoCs and showed that, without the limitations of current nanophotonic networks, an SWMR mNoC crossbar can easily scale to a radix-256 crossbar, larger than any existing NoC. We evaluated the SWMR mNoC with both synthetic benchmarks and a few PARSEC benchmarks. The synthetic benchmark simulation results showed that, compared with rNoC and eMesh, mNoC reduces average latency in cycles by at least half and has a higher tolerance for network traffic. mNoC also exhibits a performance improvement over rNoC and eMesh with the PARSEC benchmarks. Furthermore, mNoC trades static power for dynamic power and greatly reduces power consumption. An mNoC can achieve a 75% reduction in power for a 64 × 64 crossbar compared to similar ring-resonator-based designs. Additionally, an mNoC can scale to a 256 × 256 crossbar with a 40% power reduction. A large single crossbar allows for high-radix routers and efficient broadcast-based directory protocols, which are immediately useful in modern computer processor designs.
Project 7. Optical Logic Elements
Optical nanoscale computing is one promising alternative to the CMOS process. For this project we explored the application of Resonance Energy Transfer (RET) logic to common digital circuits. We proposed an Optical Logic Element (OLE) as a basic unit from which larger systems can be built. An OLE is a layered structure that works similarly to a lookup table but uses wavelength division multiplexing for its inputs and output. Waveguides provide a convenient mechanism to connect multiple OLEs into large circuits. We built a SPICE model from first principles for each component to estimate the timing and power behavior of the OLE system. We analyzed various logic circuits, and the simulation results show that the components are theoretically correct and that the models faithfully reproduce the fundamental phenomena. The power-delay product of OLE systems is at least 2.5× lower than that of the 14 nm CMOS technology node, with 100× better density.
Unlike traditional CMOS, an OLE uses wavelength division multiplexing for its inputs and output. Conventional optical fibers are used in a platform for studying the OLE structure and in the future will serve as wires to interconnect multiple OLEs into larger systems. To better understand the characteristics of the system and to be useful as a guide for future experimental demonstration, a SPICE model for each component including a mapped power model was built from first principles and verified with experimental results. Various combinational and sequential circuits were designed and simulated to estimate their timing and power behavior. The simulation results suggest that the propagation delay, rise and fall times in this system are much faster than the 22 nm CMOS technology node and that the system is at least 2.5× more energy efficient as well. Moreover, the area density of the OLE-based technology is at least two orders of magnitude better than the state-of-the-art CMOS technology.
Future work will focus on the experimental demonstration of integrated OLEs and the study of various methods to improve the longevity of the chromophores within. Beyond molecular techniques to solve these problems, we will also investigate various fault-tolerant circuit design approaches as they apply to building larger systems from OLEs.
Project 8. Routing Algorithms for Irregular Self-assembled Networks
The integration of novel nanotechnologies onto silicon platforms is likely to increase fabrication defects compared with traditional CMOS technologies. Furthermore, the number of nodes connected by these networks makes acquiring a global defect map impractical. As a result, on-chip networks will provide defect tolerance by self-organizing into irregular topologies. In this scenario, simple static routing algorithms based on regular physical topologies, such as meshes, will be inadequate. Additionally, previous routing approaches for irregular networks assume abundant resources and do not apply to this domain of resource-constrained, self-organizing nanoscale networks. Consequently, routing algorithms that work in irregular networks with limited resources are needed. In this project, we explored routing for self-organizing nanoscale irregular networks in the context of a Self-Organizing SIMD Architecture (SOSA). Our approach traded configuration time and a small amount of storage for reduced communication latency. We augmented an Euler-path-based routing technique for trees to generate static shortest paths between certain pairs of nodes while remaining deadlock free.
Simulations of several applications executing on SOSA showed the routing algorithm can reduce execution time by 8% to 30%.
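To show why a plain Euler-path route on a tree may benefit from shortest-path augmentation, the following is a minimal sketch and not the SOSA routing algorithm. It builds a depth-first Euler tour of a small tree, counts the hops a message takes if it simply follows the tour, and compares that with the breadth-first shortest path. The example tree, the adjacency-list representation, and all function names are hypothetical.

```python
from collections import deque


def euler_tour(tree, root):
    """Node sequence of a depth-first Euler tour (each edge used twice)."""
    tour = []

    def visit(node, parent):
        tour.append(node)
        for child in tree[node]:
            if child != parent:
                visit(child, node)
                tour.append(node)  # come back to node after the subtree

    visit(root, None)
    return tour


def tour_hops(tour, src, dst):
    """Hops taken when following the cyclic tour from the first
    occurrence of src until dst is reached."""
    if src == dst:
        return 0
    start = tour.index(src)
    cycle = len(tour) - 1  # last entry repeats the root
    for hops in range(1, cycle + 1):
        if tour[(start + hops) % cycle] == dst:
            return hops
    raise ValueError("dst is not on the tour")


def shortest_hops(tree, src, dst):
    """Breadth-first shortest-path length between two nodes."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return dist[node]
        for nxt in tree[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    raise ValueError("dst is unreachable")


if __name__ == "__main__":
    tree = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}  # hypothetical topology
    tour = euler_tour(tree, 0)
    print(tour)
    print(tour_hops(tour, 1, 0), shortest_hops(tree, 1, 0))
```

In this example, following the tour from node 1 to its neighbour node 0 takes three hops because the tour first descends to node 3, while the direct link takes one. Storing a static shortest path for such a pair costs a small amount of memory and configuration time in exchange for lower latency, which is the kind of trade-off the text describes.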
