Go to USC home page USC Logo Computer Science & Engineering
UNIVERSITY OF SOUTH CAROLINA
CSE Home Page | CSE Research | CSE Site Map | CSE Faculty
Reconfigurable Computing
in the CSE Department
Reconfigurable computing lab

Introduction

An exciting area of research in the Department of Computer Science and Engineering is that of reconfigurable computing. Current reconfigurable computing machines (RCMs) make use of field programmable gate arrays (FPGAs), chips that can be configured to implement an arbitrary hardware design. FPGAs can be reconfigured, even in the middle of a computation, by simply changing the bit file that provides the configuration. With such a machine, a user designs a special purpose hardware solution for a computational problem as if designing a software solution. The design is then synthesized into logic and the synthesized logic used to configure the FPGAs.

Not all applications are suitable for an RCM. Applications that involve extensive recursion, for example, are a poor match because the synthesized "hardware" must be of fixed size. But with many applications, including image and signal processing and biological sequence comparison, orders of magnitude speedup can be obtained. This is because hardware execution is inherently parallel, because the overhead of the fetch-decode-execute cycle is eliminated, and because the "CPU" can be custom-designed to accommodate computations not normally supported by instruction set architectures.

There are presently two primary problems in reconfigurable computing. The first is to demonstrate that there is a sufficient range of problems on which one can get a factor of 50 to 100 speedup by using a reconfigurable machine and thus to justify the development of reconfigurable hardware instead of simply following existing technology curves. The second problem is to demonstrate that this speedup can be obtained by a process that can be described as "programming" and not as "hardware design."

Previous funded work at USC included attempts to implement relevant complete algorithms on hardware from Star Bridge Systems. Following the conclusion of that phase of the work, our attention was shifted to the SRC-6 manufactured and sold by SRC Computers, Inc.. These two represent the only extant systems of significant size that are currently available from commercial vendors, although. SGI and Cray are in the process of releasing hardware.

Our current focus is on the DARPA HPCS Discrete Mathematics Benchmarks, most of which are highly amenable to a solution on an RCM. In addition, we are looking at bioinformatics, molecular dynamics, and error-correcting codes as secondary topic areas.

Other applications are also being pursued in wireless and embedded systems and in bioinformatics, including both the tree searching needed for statistical Phylogenetics and the clustering and classification necessary for the analysis of DNA microarray data. Research is also ongoing on the architecture of embedded systems that would include processors, ASICS, and FPGAs and on the automation of problem partitioning for such devices.


News

The contract has been signed! We have funding for this project!

Related courses in VLSI

Course work applicable to reconfigurable computing includes the full suite of hardware design courses, including
  • CSCE 611 -- VLSI Systems Architecture & Design
  • CSCE 612 -- HDL-based Design for VLSI Systems
  • CSCE 613 -- Fundamentals of CMOS VLSI Design
  • CSCE 818 -- Low-Power VLSI Design
  • CSCE 819 -- Hardware-Software Co-design for Embedded Systems
For details on these courses, see the departmental listing or contact Professor Davis ('jimdavis' in domain 'cse.sc.edu' or Professor Quan ('gquan' in domain cse.sc.edu).

DARPA HPCS Discrete Math Benchmarks

Our current focus is on the DARPA HPCS Discrete Mathematics Benchmarks as they would be implemented on the SRC reconfigurable computer.

The research team, Fall 2004.


FCCM 2005

Luis Cordova at FCCM, April 2005.


High-Performance Secure Computing

Professors BuellDavis, and Quan are investigating the application of reconfigurable computing architectures, and the realization of VLSI architecture on reconfigurable computing platforms, as a means to increase computational throughput by exploiting inherent parallelism of VLSI and configurability of custom computing machines.

The HC-36m® RCM platform.

Some of the problems associated with this endeavor include the following: (1) algorithm analysis - to determine the most appropriate architecture to place the computationally-intensive portions of the application on the reconfigurable platform; (2) architecture development - to assess and empirically evaluate different architecture alternatives, which generally involves trading-off the computational throughput of the portion of the architecture executing on the reconfigurable custom computing engine with the communications overhead associated with a given partitioning scheme.

The HC-36m® RCM platform.

There are a number of algorithms and architectures supporting secure computing.  Most of these involve the computation of integers--multiplication, addition, division and subtraction.  There is a great body of work on the computational aspects of different arithmetic processing techniques for the various operations for integer arithmetic.  We are evaluating the partitioning, architecture and implementation of a number of these algorithms on a reconfigurable computing platform.

The Reconfigurable Custom Computing Laboratory has a number of RCM platforms at its disposal on which to assess and evaluate different architectural realizations of computing algorithms for secure computing.  The Lab consists of a network of Sun machines, a Beowulf cluster, a set of PCs with WildCard(TM) RCM boards from Annapolis Microsystems, a HAL-15 Hypercomputer® and a HC-36m® Hypercomputer®, both from Star Bridge Systems.  You can take a look at Starbridge's flagship RCM platform, the HC-36m, in the following photos: view-1 (JPG), view-2 (JPG), view-3 (JPG), view-4 (JPG).

Preliminary Results (Fall 2002)

See this link for a description of the research results to date, and check out the latest publications we're working on relative to the architecture modeling for realizing elliptic curve cryptography on reconfigurable platforms.


Bioinformatics

Professors Buell and  Davis are collaborating with other faculty in the department and in the Biology, Statistics, and Mathematics departments on the application of RCM techniques to problems in Biomedical Engineering and Bioinformatics.

Current projects underway include the re-architecting and re-implementation of Phylogenetics algorithms onto RCM platforms.  One such project involves the development of an accelerated version of the UPGMA algorithm for determining the taxa distance in Phylogenetic trees.  Software algorithms exist for this technique, and these have been highly optimized in software for execution on a conventional computer.  However, as the number of species taxa grows large, meaning that the corresponding data sets grow large as well, existing techniques don't scale well.  

Therefore, we are investigating methods for improving the data throughput for large amounts of data collected through genetic analysis of species, so that the overall computation time can be by one to two orders of magnitude.  This means Biologists and Statisticians involved in the analysis of species' genomic data sets can get processing results much faster than would ordinarily be possible.  In addition to the UPGMA approach, other techniques, such as Least Squares and Parsimony, are being explored and will be exploited for high-performance execution on a parallel processing RCM platform.


Reconfigurable Computing for Wireless Mobile Applications

Professors Davis and Quan are investigating the application of RCM techniques, architectures and platforms to problems in wireless mobile computing.  With the proliferation of mobile devices and wireless communications, there is an opportunity to apply RCM techniques to some of the problems associated with creating new architectures for next-generation wireless VLSI systems.

Whereas most of the problems we are looking at in Scientific computing have a large data path component, the problem of RCM in wireless mobile computing is one that is more control dominated--requiring many parallel state machines that share data paths and communicate, synchronize and coordinate with one another.  These are very useful for design of protocols and control-oriented algorithms.  We seek to manage complexity of these protocols and algorithms by adopting architectures that use a lot of smaller, concurrent state machines.  We see the use of RCM techniques, employing a much higher degree of custom logic content in wireless systems solution, as an alternative to the predominant embedded systems and SoC methods of delivering feature content in appropriate mobile device "footprints".  Issues such as energy management, performance throughput and cost will provide the catalyst for realizing tetherless products on a custom logic substrate.  The use of the RCM paradigm will enable such applications to be created that better exploit the power of VLSI custom logic, while also gaining the advantages of dynamic reconfigurability of the VLSI platform for multiple applications--a benefit of software-based systems that is now available to VLSI hardware.

As wireless networks become large, there are a number of scaling problems which need to be addressed.  Wireless LANs are becoming a means for communities of users to set up ad-hoc networks, to offer commercial services (such as access to the Internet for wireless laptops and PDAs) or provide support for military operations (such as through wireless sensor networks).  Operating using unregulated frequencies in the 2.5 GHZ and 5 GHZ frequency bands, wireless LANs using the IEEE 802.11 protocol family are deceptively easy to set up and manage.  However, there are a number of security, traffic congestion management and resource management issues that need to be addressed if wireless networks based in 802.11 protocol are to become and integral part of the Internet fabric and infrastructure.

We are looking to apply principles of adaptive, self-organizing systems to this problem.  Wireless  LAN nodes constitute a population of individuals that exhibit behaviors defined by a rigid protocol.  What if this protocol were not so rigid, but adaptive, where each wireless station (namely, the MAC layer that manages the resources and resource access of the wireless medium) were able to learn and adapt to changing conditions in its environment?  What if WLAN stations were able to share more than frame information, but meta-level information governing their internal state and what they have learned from their environment?  Would it be possible to create behaviors in individual wireless nodes such that the behavior of the wireless network as a whole were capable of "intelligence"?  These are the questions we want to answer in this research, and we believe that RCM techniques provide an appropriate vehicle for realizing such adaptive systems on a compact and cost effective VLSI hardware platform.  See this ASOWN abstract and Research Quad Chart (PDF) for more information on this project.


Design and Architecture of Reconfigurable Computing Applications

Professors BuellDavis and Quan are investigating the creation of design methods and EDA tools specifically for development of reconfigurable computing applications using design heuristics and automated techniques.  Place and Route tools do a reasonable job of mapping a design onto the resources of a single FPGA, but the techniques to map a large VLSI systems architecture onto a network of FPGA devices, or hybrid devices of FPGA, ASIC, and processors, must still be researched and developed.

MAPLD 2003, Washington, DC.  

High-performance RCM platforms provide a wide-range of architectures and circuit realization possibilities.  The primary problem being investigated is how to formalize the analysis, architecture and implementation of applications that are specially formulated for the reconfigurable computing paradigm that make the most of the resources that RCM platforms have to offer.  In general, we are looking to explore different mappings of applications--in terms of their algorithms and data structures--onto various reconfigurable computing platforms in an efficient manner.  Such RCM platforms have resources at our disposal--in terms of logic "slices" (the primitive unit of resource allocation on a Xilinx® FPGA) and CLBs on a given FPGA device, and interconnect resources (signals paths and bus structures) available for communicating logic functions that span multiple FPGA devices configured in the RCM platform.

The problem of identifying the "best fit" of a given application starts with exploring how best to map the application

Working hard.

architecture to this set of available resources so as to provide the most efficient use of these resources while delivering the most application performance.  However, we are also interested in the relationship between the algorithms and data structures that may be in prevalent use for a given application domain, and how to assess and select the most optimal VLSI architecture among many candidate architectures for subsequent mapping onto the RCM platform.  In general, we are looking at this for problem domains where computation consists of very-wide data paths, resulting in operations that span multiple FPGA devices. 

Team members working with the SBS HAL-15(R).

Therefore, the analysis, architecture selection, and platform mapping tasks are interrelated and iterative.  For instance, we might identify a number of different ways to construct a large, 256-bit multiplier architecture for use in a high-performance application, but we must consider the selection of the architecture not just on the general efficiency and throughput of the architecture, but also on the quality of the resultant design partitioning and mapping to the fixed resources according to resource and  timing "budgets" available to us for the RCM platform being used.  We are currently pursuing this investigation using the Star Bridge Systems® HC-36m® platform.


Publications

2005
  • L. Cordova, D. Buell, and S. Akella, "The DARPA dynamic programming benchmark on a reconfigurable computer," FCCM poster session, 18-20 April 2005. (ppt).
2004
  • Davis, J. P., H. A. Wake, C. Leehaug, J. Branton, and N. Namilae, "Design versus programming in custom computing machine applications: Experiences using the Algorithm State Machine method," in Katz, R. (ed.), Proceedings MAPLD-2004 Military Applications of Programmable Logic Devices, NASA Office of Logic Programming, 2004. reprint (PDF).
  • Quan, G., A. Michalski, D. Buell, and J. Davis, "The DARPA sorting benchmark on the SRC platform," in Katz, R. (ed.), Proceedings MAPLD-2004 Military Applications of Programmable Logic Devices, NASA Office of Logic Programming, 2004.
  • Buell, D. A., S. Akella, J. P. Davis, G. Quan, and D. Caliga, "The DARPA boolean equation benchmark on a reconfigurable computer," in Katz, R. (ed.), Proceedings MAPLD-2004 Military Applications of Programmable Logic Devices, NASA Office of Logic Programming, 2004.
  • Devarkal, S. "Design space exploration of elliptic curve arithmetic on a reconfigurable platform," Master's thesis, University of South Carolina, August 2004.
  • Buell, D. A., S. Devarkal, and H. A. Wake, "Reconfigurable computing machines and their applications in computational number theory," Fields Institute Communications: High Primes and Misdemeanours: A Conference in Honor of Hugh Williams, volume 41, pp. 123-148, Fields Institute, Toronto, 2004.
  • Akella, S., J. Davis, and P. Waddell, "Custom computing for phylogenetics: Exploring the solution space for UPGMA," submitted to FPT 04: International Conference on Field Programmable Technology, Brisbane, Australia, December 2004, preprint (PDF).
  • Davis, J., S. Akella, and P. Waddell, "Accelerating phylogenetics computing on the desktop: Experiments with executing UPGMA in programmable logic," EMBC-04: 26th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, San Francisco, CA, September 2004, to appear. reprint (PDF).
  • Quan, G., J. Davis, and S. Devarkal, "Pruning the design space for wide-bit multipliers: An estimation approach for programmable logic-based embedded systems," submitted CASES-04, preprint (PDF).
  • Buell, D., J. Davis, G. Quan, S. Akella, S. Devarkal, P. Kancharla, E. Michalski, and H. Wake, "Experiences with a reconfigurable computer," Proceedings ERSA-04: International Conference on the Engineering of Reconfigurable Systems and Algorithms, Las Vegas, NV, June 2004, preprint (PDF).
  • Quan, G., L. Niu, and J. P. Davis, "Power aware scheduling for real-time systems with (m,k)-guarantee," Proceedings CNDS-04: Communication Networks and Distributed Systems Modeling and Simulation, The Society for Modeling and Simulation International, 2004.  (PDF).
2003
  • Quan, G., D. A. Buell, J. P. Davis, and S. Devarkal, "Design exploration of 192-bit elliptic curve adder on the Star Bridge HC-36 system," in Katz, R. (ed.), Proceedings MAPLD-2003 Military Applications of Programmable Logic Devices, NASA Office of Logic Programming, 2003. (PDF).
  • Wake, H. A., and D. A. Buell, "Congruential sieves on a reconfigurable computer," Proceedings IEEE Symposium on Field Programmable Custom Computing Machines, pp. 11-18, 2003.
  • Akella, S., H. A. Wake, J. P. Davis, and D. A. Buell, "Porting EDIF netlists to the Viva environment for integrated custom computing applications," in Katz, R. (ed.), Proceedings MAPLD-2003 Military Applications of Programmable Logic Devices, NASA Office of Logic Programming, 2003. (PDF).
  • Davis, J. P., S. Devarkal, and G. Quan, "Architecting wide-bit multipliers on programmable logic devices," in Katz, R. (ed.), Proceedings MAPLD-2003 Military Applications of Programmable Logic Devices, NASA Office of Logic Programming, 2003. Extended Abstract (PDF). Slides (PDF).
  • Davis, J. P., G. Bhadri, S. Bhoomireddy, and S. Sugavanam, "Exploring nano-computing: Building computing systems on nanoscale devices," Poster Session, USC NanoCenter Symposium V: Molecular Electronics, November 21, 2003, University of South Carolina.  Extended Abstract (PDF), Slides (PDF).
2002
  • Davis, J. P., D. A. Buell, and S. Akella. "SoC methods and architecture for realizing cryptographic computing engines," preprint (PDF).
  • Buell, D. A., J. P. Davis, and G. Quan. "Reconfigurable computing applied to problems in communications security," Military and Aerospace Applications of Programmable Logic Devices, 10-12 September 2002, Baltimore, Maryland. Abstract (PDF), Paper (PDF), Slides (MS-PowerPoint).
  • Davis, J. P., S. Akella, and P. Waddell, "Methods and architectures for realizing fast phylogenetic computation engines using VLSI array-based logic." Preprint (PDF).
  • Akella, S., J. P. Davis, and H. Kobayashi, "On the tradeoff between productivity and area optimization in VLSI design," Preprint (PDF).
Pre-2002
  • G. Quan, X. (Sharon) Hu, and G.Greenwood, "Preference-driven hierarchical hardware/software partitioning," IEEE/ACM Proceedings of International Conference On Computer Design, pp 652-657, 1999. (PDF).

Faculty contacts


Some Industry links

SRC Computers
SGI
Cray
Star Bridge Systems
Celoxica
Xilinx
Altera
Synopsys
Synplicity
ModelSim
Mentor Graphics
Exsedia