Overview:

Continuing improvements in semiconductor density are enabling new classes of System-on-a-Chip (SOC) architectures that combine extensive processing logic and high-density memory.  The International Technology Roadmap for Semiconductors [1] predicts that by 2009 a single high-end microprocessor die will contain approximately 84 million logic transistors.  Dynamic Random Access Memory (DRAM) density is increasing at an even faster pace, with 2Gbyte DRAM chips expected in the same timeframe.  Large-scale problems may be handled “on-chip” using the increased memory and logic density, but new architectures must be developed that can effectively exploit these tremendous on-chip resources.

Computing in Memory Architectures (CIMA) offer an SOC strategy that can effectively accelerate many high-performance computing and real-time image/signal/data processing problems.  Our research group has studied CIMA architectures [2] and has identified several key applications that can benefit from CIMA acceleration.  We are developing a CIMA hardware prototype that uses an FPGA and DRAM to create a reconfigurable (Smart) module, a SmartDIMM, that is inserted into the high-bandwidth PC-100 memory module (DIMM) slot of a conventional PC.  Because our design is mapped directly onto the “frontside” memory bus of a standard PC, we gain several advantages over existing FPGA-based accelerator cards.  The most important of these are reduced latency and increased bandwidth to and from the CPU.  Our SmartDIMM design provides higher available bandwidth to Main Memory (800MB/sec) than any existing FPGA-based accelerator, can act as traditional memory when not in use as Smart Memory, and offers extremely large memory resources (up to 16MByte per chip).  By adopting the PC-100 bus standard, the SmartDIMM design can target the traditional desktop PC market, as well as the interface-compatible SODIMM standard for portable/miniaturized applications.  Furthermore, we can support the next-generation DDR (Double Data Rate) SDRAM standard.  The extensive logic resources of a Virtex FPGA can accommodate large-scale application solutions; with partial reconfiguration of the FPGA, the effective problem solution space can be further expanded.
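The 800MB/sec figure quoted above matches the peak transfer rate of a 64-bit data bus clocked at 100MHz.  The short sketch below works through that arithmetic.  The 64-bit bus width is an assumption taken from the standard SDRAM DIMM format rather than from the text above, and the function name and parameters are illustrative only.

```python
def peak_bandwidth_mb_per_s(clock_mhz, bus_width_bits=64, transfers_per_clock=1):
    """Peak transfer rate in MB/s, with 1 MB taken as 10**6 bytes.

    bus_width_bits=64 is an assumption (standard SDRAM DIMM data width).
    """
    bytes_per_transfer = bus_width_bits // 8
    return clock_mhz * transfers_per_clock * bytes_per_transfer


# PC-100: one transfer per 100MHz clock.
pc100 = peak_bandwidth_mb_per_s(100)

# DDR at the same clock: two transfers per clock.
ddr = peak_bandwidth_mb_per_s(100, transfers_per_clock=2)

print(pc100, ddr)
```

These are peak rates only; sustained bandwidth depends on burst length, refresh, and row activation overhead, none of which is modeled here.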


Architecture:

The proposed SmartDIMM design integrates a large amount of main memory (64MB) and a large FPGA onto a DIMM form-factor PCB that is designed as a completely backwards-compatible replacement for a standard desktop PC-100 memory module.  To provide simultaneous CPU and FPGA access to memory, we propose a dual-bank memory arrangement in which each device has direct access to one bank at a time (and the other device is locked out of that bank during this time).
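The dual-bank arrangement can be pictured with a small software model.  This is only a sketch of the ownership rule described above, not the SmartDIMM hardware or its control interface: the class name, the method names, and the idea of a single swap operation are all hypothetical.

```python
class DualBankMemory:
    """Toy model: two banks, each owned by exactly one device at a time."""

    def __init__(self, words_per_bank):
        self.banks = [[0] * words_per_bank, [0] * words_per_bank]
        # Hypothetical initial assignment: CPU owns bank 0, FPGA owns bank 1.
        self.owner = {"cpu": 0, "fpga": 1}

    def _check(self, device, bank):
        # The device that does not own the bank is locked out of it.
        if self.owner[device] != bank:
            raise PermissionError(f"{device} is locked out of bank {bank}")

    def read(self, device, bank, addr):
        self._check(device, bank)
        return self.banks[bank][addr]

    def write(self, device, bank, addr, value):
        self._check(device, bank)
        self.banks[bank][addr] = value

    def swap(self):
        """Exchange bank ownership between the CPU and the FPGA."""
        self.owner["cpu"], self.owner["fpga"] = self.owner["fpga"], self.owner["cpu"]


mem = DualBankMemory(words_per_bank=4)
mem.write("cpu", 0, 0, 42)        # CPU fills its bank with input data
mem.swap()                        # hand bank 0 to the FPGA
result = mem.read("fpga", 0, 0)   # FPGA now sees the CPU's data
```

After the swap, a CPU access to bank 0 raises PermissionError in this model, which mirrors the lock-out: ownership changes for a whole bank at once rather than being arbitrated cycle by cycle.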

In spite of tight design timing limitations, this approach achieves the highest-bandwidth access to its local block of main memory (800MB/sec), for the largest potential performance increase.  The largest drawback to this design is the lack of any facility for signaling the host:  all host accesses to the memory card must be serviced in real time with no wait states, and the FPGA can signal the CPU only by setting a semaphore and waiting for the CPU to poll it.
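The semaphore-and-poll handshake can be sketched in software as follows.  This is a simulation under stated assumptions, not the SmartDIMM register interface: the register names (done, result), the FakeFpga stand-in, and the poll limit are all hypothetical.

```python
class ControlRegisters:
    """Hypothetical register block visible to both host and FPGA."""

    def __init__(self):
        self.done = 0     # semaphore: set by the FPGA, cleared by the host
        self.result = 0


class FakeFpga:
    """Stand-in for the FPGA: finishes after a fixed number of host polls."""

    def __init__(self, regs, polls_until_done):
        self.regs = regs
        self.remaining = polls_until_done

    def tick(self):
        if self.remaining > 0:
            self.remaining -= 1
            if self.remaining == 0:
                self.regs.result = 1234   # placeholder result value
                self.regs.done = 1        # raise the semaphore


def host_wait(regs, fpga, max_polls=1000):
    """Poll the semaphore until the FPGA sets it, then clear it."""
    for polls in range(1, max_polls + 1):
        fpga.tick()   # simulation only: real hardware advances on its own
        if regs.done:
            regs.done = 0
            return regs.result, polls
    raise TimeoutError("FPGA did not signal completion")


regs = ControlRegisters()
value, polls = host_wait(regs, FakeFpga(regs, polls_until_done=3))
```

The cost of this scheme is visible in the loop: the host learns of completion only on its next poll, so the polling interval sets the notification latency.  On real hardware each poll would also have to reach the module rather than be satisfied from the CPU cache.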

SmartDIMM Key Features:

+      Highest bandwidth to Main Memory (800MB/sec)

+      Acts as traditional memory when not in use as Smart Memory

+      Largest memory size (16MByte per chip)

-      Access to memory by only one of host CPU or FPGA at any given time: no cycle-by-cycle arbitration

-      No FPGA-driven communication: must wait for CPU to poll

-      Tight timing/signal propagation constraints

 

SmartDIMM Design:

Simplified SmartDIMM block diagram

To be completely PC100 (or PC66) compliant, the SmartDIMM design must observe many timing and operational restrictions.  We have already analyzed the critical timing paths, and have taken PC memory bus timing measurements at 100MHz and 66MHz on two separate BX-Chipset motherboards.  Here is a numerical analysis of the "write burst timing" at 100MHz (PC-100) speeds.


SmartDIMM Programming and Technical Reference Information:

This link will take you to the current working version of our SmartDIMM Control Register Address Definition (updated 4/20/00).

Here is a link to the previous version of this specification (which defined both active & passive modes).  This older version includes CPLD Interface requirements and details of the most recent set of changes and enhancements to the design and control register set.  


The SmartDIMM Team:

FACULTY (web page and CV links):

STUDENTS:


Publications:

Computing in Memory Architectures, Rapid Prototyping, Hardware Acceleration, and other related references by the SmartDIMM team are listed here (with hyperlinks to full text online where available).

  1. D. Landis, L. Roth, P. Hulina, L. Coraor, and S. Deno, “Evaluation of Computing in Memory Architectures for Digital Image Processing Applications”, Proceedings of the 1999 Intl. Conference on Computer Design, October 1999, pp. 146-151.
  2. D. Landis, L. Roth, P. Hulina, L. Coraor, and S. Deno, “Computing in Memory Architectures for Digital Image Processing”, Proceedings of the 1999 IEEE International Workshop on Memory Technology, August 1999, pp. 8-15.
  3. A. Murthy, N. Vijaykrishnan, and A. Sivasubramaniam, “How can hardware support Just-in-Time Compilation?”, Workshop on Hardware Support for Objects and Microarchitectures for Java, October 1999, pp. 15-19.
  4. R. Radhakrishnan, N. Vijaykrishnan, L. K. John, and A. Sivasubramaniam, “Architectural Issues in Java Run-time Systems”, International Conference on High Performance Computer Architecture, January 2000, pp. 387-398.
  5. D. Landis, “Experiences using RASSP Instructional Modules in a Senior-Level Rapid System Prototyping Class”, Proceedings of the 1997 Intl. Conf. on Microelectronic Systems Education, July 1999, pp. 53-54.
  6. D. Landis, P. Guddetti, P. Hulina, and L. Coraor, “Language-Based Rapid Prototyping Methods for Legacy System Re-Engineering and Re-Use”, Proceedings of the 1999 IEEE International Workshop on Rapid System Prototyping, Tampa, FL, June 16-18, 1999, pp. 52-57.
  7. S. Deno, B. Balasubramanian, D. Landis, and P. Hulina, “A Rapid Prototyping Methodology for Reverse Engineering of Legacy Electronic Systems”, Proceedings of the 1999 IEEE International Workshop on Rapid System Prototyping, Tampa, FL, June 16-18, 1999, pp. 222-227.
  8. D. Landis and P. Hulina, “A WWW Facilitated Rapid System Prototyping Class”, Proceedings of the 1997 Intl. Conf. on Microelectronic Systems Education, July 1997, pp. 99-101.
  9. P. Hulina and D. Landis, “An Electronics Manufacturing Minor in Engineering with Emphasis on Rapid Prototyping”, to appear in Proceedings of the 1997 Intl. Conf. on Microelectronic Systems Education, July 1997, pp. 21-23.

PDG Proposal References (are found on this link)


SmartDIMM-related LINKS:


Quote of Interest: "...put all your eggs in one basket, and WATCH THAT BASKET" - Mark Twain

This page was last meddled with: 10/19/04