Reconfigurable Computing
in the CSE Department
|
|
|
|
Reconfigurable computing lab
|
|
|
Introduction
An exciting area of research in the Department of Computer Science
and Engineering is that of reconfigurable computing.
Current reconfigurable computing machines (RCMs) make use of field
programmable gate arrays (FPGAs), chips that can be configured to
implement an arbitrary hardware design. FPGAs can be reconfigured,
even in the middle of a computation, by simply changing the bit file
that provides the configuration. With such a machine, a user designs
a special purpose hardware solution for a computational problem as
if designing a software solution. The design is then synthesized
into logic and the synthesized logic used to configure the FPGAs.
Not all applications are suitable for an RCM. Applications that
involve extensive recursion, for example, are a poor match because
the synthesized "hardware" must be of fixed size. But with
many applications, including image and signal processing and
biological sequence comparison, orders of magnitude speedup can be
obtained. This is because hardware execution is inherently parallel,
because the overhead of the fetch-decode-execute cycle is
eliminated, and because the "CPU" can be custom-designed
to accommodate computations not normally supported by instruction
set architectures.
There are presently two primary problems in reconfigurable computing.
The first is to demonstrate that there is a sufficient range
of problems on which one can get a factor of 50 to 100 speedup by using
a reconfigurable machine and thus to justify the development of reconfigurable
hardware instead of simply following existing technology curves.
The second problem is to demonstrate that this speedup can be obtained
by a process that can be described as "programming" and not
as "hardware design."
Previous funded work at USC included attempts to implement relevant complete
algorithms on hardware from
Star Bridge Systems.
Following the conclusion of that phase of the work, our attention was shifted
to the SRC-6 manufactured and sold by
SRC Computers, Inc..
These two represent the only extant systems of significant size that are
currently available from commercial vendors, although.
SGI
and
Cray
are in the process of releasing hardware.
Our current focus is on the DARPA HPCS Discrete Mathematics
Benchmarks,
most of which are highly amenable to a solution on an RCM.
In addition, we are looking at bioinformatics, molecular dynamics, and
error-correcting codes as secondary topic areas.
Other applications are also being pursued in wireless and
embedded systems and in bioinformatics, including both the tree
searching needed for statistical Phylogenetics and the clustering
and classification necessary for the analysis of DNA microarray
data. Research is also ongoing on the architecture of embedded
systems that would include processors, ASICS, and FPGAs and on the
automation of problem partitioning for such devices.
|
|
News
The contract has been signed!
We have funding for this project!
|
|
Related courses in VLSI
Course work applicable to reconfigurable computing includes the
full suite of hardware design courses, including
- CSCE 611
-- VLSI Systems Architecture & Design
- CSCE 612
-- HDL-based Design for VLSI Systems
- CSCE 613
-- Fundamentals of CMOS VLSI Design
- CSCE 818 -- Low-Power VLSI Design
- CSCE 819 -- Hardware-Software Co-design for Embedded Systems
For details on these courses, see the
departmental listing
or contact Professor Davis ('jimdavis' in domain 'cse.sc.edu'
or Professor Quan ('gquan' in domain cse.sc.edu). |
|
DARPA HPCS Discrete Math Benchmarks
Our current focus is on the DARPA HPCS Discrete Mathematics
Benchmarks
as they would be implemented on the SRC reconfigurable computer.
The research team, Fall 2004.
|
|
FCCM 2005
Luis Cordova at FCCM, April 2005.
|
|
High-Performance Secure Computing
Professors
Buell,
Davis,
and Quan
are investigating the application of reconfigurable computing
architectures, and the realization of VLSI architecture on
reconfigurable computing platforms, as a means to increase
computational throughput by exploiting inherent parallelism of
VLSI and configurability of custom computing machines.
The HC-36m® RCM platform.
Some of the problems associated with this endeavor include the
following: (1) algorithm analysis - to determine the most
appropriate architecture to place the computationally-intensive
portions of the application on the reconfigurable platform; (2)
architecture development - to assess and empirically evaluate
different architecture alternatives, which generally involves
trading-off the computational throughput of the portion of the
architecture executing on the reconfigurable custom computing
engine with the communications overhead associated with a given
partitioning scheme.
The HC-36m® RCM
platform.
|

|

|
There are a number of algorithms and architectures supporting
secure computing. Most of these involve the computation of
integers--multiplication, addition, division and
subtraction. There is a great body of work on the
computational aspects of different arithmetic processing
techniques for the various operations for integer
arithmetic. We are evaluating the partitioning, architecture
and implementation of a number of these algorithms on a
reconfigurable computing platform.
The Reconfigurable Custom Computing Laboratory has a number of
RCM platforms at its disposal on which to assess and evaluate
different architectural realizations of computing algorithms for
secure computing. The Lab consists of a network of Sun
machines, a Beowulf cluster, a set of PCs with WildCard(TM) RCM
boards from Annapolis
Microsystems, a HAL-15 Hypercomputer® and a HC-36m®
Hypercomputer®, both from Star
Bridge Systems. You can take a look at Starbridge's flagship RCM
platform, the HC-36m, in the following photos: view-1
(JPG), view-2
(JPG), view-3 (JPG), view-4
(JPG).
See this link for a description of the research results to
date, and check out the latest publications we're working on
relative to the architecture modeling for realizing elliptic curve
cryptography on reconfigurable platforms.
|
|
Bioinformatics
Professors Buell
and Davis are
collaborating with other faculty in the department and in the
Biology, Statistics, and Mathematics departments on the
application of RCM techniques to problems in Biomedical
Engineering and Bioinformatics.
Current projects underway include the re-architecting and
re-implementation of Phylogenetics algorithms onto RCM
platforms. One such project involves the development of an
accelerated version of the UPGMA algorithm for determining the
taxa distance in Phylogenetic trees. Software algorithms
exist for this technique, and these have been highly optimized in
software for execution on a conventional computer. However,
as the number of species taxa grows large, meaning that the
corresponding data sets grow large as well, existing techniques
don't scale well.
Therefore, we are investigating methods for improving the data
throughput for large amounts of data collected through genetic
analysis of species, so that the overall computation time can be
by one to two orders of magnitude. This means
Biologists and Statisticians involved in the analysis of species'
genomic data sets can get processing results much faster than
would ordinarily be possible. In addition to the UPGMA
approach, other techniques, such as Least Squares and Parsimony,
are being explored and will be exploited for high-performance
execution on a parallel processing RCM platform. |
|
Reconfigurable Computing for Wireless Mobile Applications
Professors Davis
and Quan are
investigating the application of RCM techniques, architectures and
platforms to problems in wireless mobile computing. With the
proliferation of mobile devices and wireless communications, there
is an opportunity to apply RCM techniques to some of the problems
associated with creating new architectures for next-generation
wireless VLSI systems.
Whereas most of the problems we are looking at in Scientific
computing have a large data path component, the problem of RCM in
wireless mobile computing is one that is more control
dominated--requiring many parallel state machines that share data
paths and communicate, synchronize and coordinate with one
another. These are very useful for design of protocols and
control-oriented algorithms. We seek to manage complexity of
these protocols and algorithms by adopting architectures that use
a lot of smaller, concurrent state machines. We see the use
of RCM techniques, employing a much higher degree of custom logic
content in wireless systems solution, as an alternative to the
predominant embedded systems and SoC methods of delivering feature
content in appropriate mobile device "footprints".
Issues such as energy management, performance throughput and cost
will provide the catalyst for realizing tetherless products on a
custom logic substrate. The use of the RCM paradigm will
enable such applications to be created that better exploit the
power of VLSI custom logic, while also gaining the advantages of
dynamic reconfigurability of the VLSI platform for multiple
applications--a benefit of software-based systems that is now
available to VLSI hardware.
As wireless networks become large, there are a number of
scaling problems which need to be addressed. Wireless LANs
are becoming a means for communities of users to set up ad-hoc
networks, to offer commercial services (such as access to the
Internet for wireless laptops and PDAs) or provide support for
military operations (such as through wireless sensor
networks). Operating using unregulated frequencies in the
2.5 GHZ and 5 GHZ frequency bands, wireless LANs using the IEEE
802.11 protocol family are deceptively easy to set up and
manage. However, there are a number of security, traffic
congestion management and resource management issues that need to
be addressed if wireless networks based in 802.11 protocol are to
become and integral part of the Internet fabric and
infrastructure.
We are looking to apply principles of adaptive, self-organizing
systems to this problem. Wireless LAN nodes constitute
a population of individuals that exhibit behaviors defined by a
rigid protocol. What if this protocol were not so rigid, but
adaptive, where each wireless station (namely, the MAC layer that
manages the resources and resource access of the wireless medium)
were able to learn and adapt to changing conditions in its
environment? What if WLAN stations were able to share more
than frame information, but meta-level information governing their
internal state and what they have learned from their
environment? Would it be possible to create behaviors in
individual wireless nodes such that the behavior of the wireless
network as a whole were capable of "intelligence"?
These are the questions we want to answer in this research, and we
believe that RCM techniques provide an appropriate vehicle for
realizing such adaptive systems on a compact and cost effective
VLSI hardware platform.
See this
ASOWN abstract and
Research Quad Chart (PDF) for more information on this project. |
|
Design and Architecture of Reconfigurable Computing Applications
Professors Buell,
Davis
and Quan
are investigating the creation of design methods and EDA tools
specifically for development of reconfigurable computing
applications using design heuristics and automated
techniques. Place and Route tools do a reasonable job of
mapping a design onto the resources of a single FPGA, but the
techniques to map a large VLSI systems architecture onto a network
of FPGA devices, or hybrid devices of FPGA, ASIC, and processors,
must still be researched and developed.
MAPLD 2003, Washington, DC.
High-performance RCM platforms provide a wide-range of
architectures and circuit realization possibilities. The
primary problem being investigated is how to formalize the
analysis, architecture and implementation of applications that are
specially formulated for the reconfigurable computing paradigm
that make the most of the resources that RCM platforms have to
offer. In general, we are looking to explore different
mappings of applications--in terms of their algorithms and data
structures--onto various reconfigurable computing platforms in an
efficient manner. Such RCM platforms have resources at our
disposal--in terms of logic "slices" (the primitive unit
of resource allocation on a Xilinx® FPGA) and CLBs on a given
FPGA device, and interconnect resources (signals paths and bus
structures) available for communicating logic functions that span
multiple FPGA devices configured in the RCM platform.
The problem of identifying the "best fit" of a given
application starts with exploring how best to map the application
Working hard.
architecture to this set of available resources so as to provide
the most efficient use of these resources while delivering the
most application performance. However, we are also
interested in the relationship between the algorithms and data
structures that may be in prevalent use for a given application
domain, and how to assess and select the most optimal VLSI
architecture among many candidate architectures for subsequent
mapping onto the RCM platform. In general, we are looking at
this for problem domains where computation consists of very-wide
data paths, resulting in operations that span multiple FPGA
devices.
Team members working with the SBS HAL-15(R).
Therefore, the analysis, architecture selection, and platform
mapping tasks are interrelated and iterative. For instance,
we might identify a number of different ways to construct a large,
256-bit multiplier architecture for use in a high-performance
application, but we must consider the selection of the
architecture not just on the general efficiency and throughput of
the architecture, but also on the quality of the resultant design
partitioning and mapping to the fixed resources according to
resource and timing "budgets" available to us for
the RCM platform being used. We are currently pursuing this
investigation using the Star Bridge Systems® HC-36m® platform.
|
|
Publications
2005
-
L. Cordova, D. Buell, and S. Akella,
"The DARPA dynamic programming benchmark on a reconfigurable computer,"
FCCM poster session, 18-20 April 2005.
(ppt).
2004
-
Davis, J. P., H. A. Wake, C. Leehaug, J. Branton, and N. Namilae,
"Design versus programming in custom computing machine applications:
Experiences using the Algorithm State Machine method,"
in Katz, R. (ed.),
Proceedings MAPLD-2004 Military Applications of Programmable
Logic Devices,
NASA Office of Logic Programming, 2004.
reprint
(PDF).
-
Quan, G., A. Michalski, D. Buell, and J. Davis,
"The DARPA sorting benchmark on the SRC platform,"
in Katz, R. (ed.),
Proceedings MAPLD-2004 Military Applications of Programmable
Logic Devices,
NASA Office of Logic Programming, 2004.
-
Buell, D. A., S. Akella, J. P. Davis, G. Quan, and D. Caliga,
"The DARPA boolean equation benchmark on a reconfigurable computer,"
in Katz, R. (ed.),
Proceedings MAPLD-2004 Military Applications of Programmable
Logic Devices,
NASA Office of Logic Programming, 2004.
-
Devarkal, S.
"Design space exploration of elliptic curve arithmetic on a reconfigurable
platform,"
Master's thesis, University of South Carolina, August 2004.
-
Buell, D. A., S. Devarkal, and H. A. Wake,
"Reconfigurable computing machines and their applications in computational
number theory,"
Fields Institute Communications: High Primes and Misdemeanours:
A Conference in Honor of Hugh Williams,
volume 41, pp. 123-148,
Fields Institute, Toronto, 2004.
-
Akella, S., J. Davis, and P. Waddell,
"Custom computing for phylogenetics: Exploring the solution space
for UPGMA,"
submitted to FPT 04: International Conference on Field Programmable
Technology,
Brisbane, Australia, December 2004, preprint (PDF).
-
Davis, J., S. Akella, and P. Waddell,
"Accelerating phylogenetics computing on the desktop:
Experiments with executing UPGMA in programmable logic,"
EMBC-04: 26th Annual International Conference of the
IEEE Engineering in Medicine and Biology Society, San Francisco, CA,
September 2004, to appear.
reprint
(PDF).
-
Quan, G., J. Davis, and S. Devarkal,
"Pruning the design space for wide-bit multipliers: An estimation approach
for programmable logic-based embedded systems," submitted CASES-04,
preprint (PDF).
-
Buell, D., J. Davis, G. Quan, S. Akella, S. Devarkal, P. Kancharla,
E. Michalski, and H. Wake,
"Experiences with a reconfigurable computer,"
Proceedings ERSA-04: International Conference on the Engineering of
Reconfigurable Systems and Algorithms,
Las Vegas, NV, June 2004,
preprint (PDF).
-
Quan, G., L. Niu, and J. P. Davis, "Power aware scheduling for
real-time systems with (m,k)-guarantee,"
Proceedings CNDS-04: Communication Networks and Distributed Systems
Modeling and Simulation, The Society for Modeling and Simulation
International, 2004.
(PDF).
2003
-
Quan, G., D. A. Buell, J. P. Davis, and S. Devarkal,
"Design exploration of 192-bit elliptic curve adder on the
Star Bridge HC-36 system,"
in Katz, R. (ed.),
Proceedings MAPLD-2003 Military Applications of Programmable
Logic Devices,
NASA Office of Logic Programming, 2003.
(PDF).
-
Wake, H. A., and D. A. Buell,
"Congruential sieves on a reconfigurable computer,"
Proceedings IEEE Symposium on Field Programmable Custom Computing
Machines, pp. 11-18, 2003.
-
Akella, S., H. A. Wake, J. P. Davis, and D. A. Buell,
"Porting EDIF netlists to the Viva environment for integrated
custom computing applications," in Katz, R. (ed.),
Proceedings MAPLD-2003 Military Applications of Programmable
Logic Devices, NASA Office of Logic Programming, 2003.
(PDF).
-
Davis, J. P., S. Devarkal, and G. Quan,
"Architecting wide-bit multipliers on programmable logic devices,"
in Katz, R. (ed.),
Proceedings MAPLD-2003 Military Applications of Programmable
Logic Devices,
NASA Office of Logic Programming, 2003. Extended Abstract
(PDF).
Slides (PDF).
-
Davis, J. P., G. Bhadri, S. Bhoomireddy, and S. Sugavanam,
"Exploring nano-computing: Building computing systems on nanoscale devices,"
Poster Session,
USC NanoCenter Symposium V: Molecular Electronics,
November 21, 2003, University of South Carolina.
Extended Abstract (PDF),
Slides (PDF).
2002
-
Davis, J. P., D. A. Buell, and S. Akella.
"SoC methods and architecture for realizing cryptographic computing engines,"
preprint (PDF).
-
Buell, D. A., J. P. Davis, and G. Quan.
"Reconfigurable computing applied to problems in communications security,"
Military and Aerospace Applications of Programmable Logic Devices,
10-12 September 2002, Baltimore, Maryland.
Abstract (PDF),
Paper (PDF),
Slides (MS-PowerPoint).
-
Davis, J. P., S. Akella, and P. Waddell,
"Methods and architectures for realizing fast phylogenetic
computation engines using VLSI array-based logic."
Preprint
(PDF).
-
Akella, S., J. P. Davis, and H. Kobayashi,
"On the tradeoff between productivity and area optimization in VLSI design,"
Preprint
(PDF).
Pre-2002
-
G. Quan, X. (Sharon) Hu, and G.Greenwood,
"Preference-driven hierarchical hardware/software partitioning,"
IEEE/ACM Proceedings of International
Conference On Computer Design, pp 652-657, 1999.
(PDF).
|
|
Faculty contacts
|
|
Some Industry links
SRC Computers
SGI
Cray
Star Bridge Systems
Celoxica
Xilinx
Altera
Synopsys
Synplicity
ModelSim
Mentor Graphics
Exsedia
|
|
|
|
|