July 5, 2005
Journal Article

QsNetII: Defining High-Performance Network Design

Abstract

Cluster computers (parallel computers built from commodity processors) are becoming the predominant supercomputer architecture because they combine scalable performance with an attractive price. As of June 2005, 61 percent of the world’s top-500 supercomputers were clusters (http://www.top500.org). This is a significant paradigm shift from a few decades ago, when supercomputers were special purpose, like the Cray vector machines, and designers built them from expensive, custom components. Clusters that use commodity processors still require high-performance, low-latency networks if their applications are fine-grained or if the cluster has many processors. Clusters can use commodity networks, such as Gigabit Ethernet, but these fall short in many scalability and performance aspects [1]. Consequently, the core of several successful cluster-based supercomputers is a high-performance network. On the one hand, this component interfaces with standard I/O buses, such as peripheral component interconnect (PCI), its extended version (PCI-X), and PCI-Express, thus leveraging commodity computing nodes. On the other hand, it provides scalable performance and cluster aggregation through specialized protocols [2]. Thus, in a sense, the high-performance network in a cluster computer is the computer: it largely defines achievable performance, widens the range of applications a cluster can execute efficiently, and defines the cluster's scalability, fault tolerance, system software, and overall usability. Because of their key performance-enhancing role, cluster computer networks must meet high standards in four design aspects: performance, scalability, reliability, and programmability. The “Four Critical Design Criteria” sidebar describes these in more detail. QsNetII, the latest generation Quadrics interconnect, meets these standards, extending previous work on high-performance networks with an aggressive design to achieve ultra-low latency.
At the design’s core are two ASICs: Elan4 and Elite4. Elan4 is a communication processor that forms the interface between a high-performance multistage network and a processing node with one or more CPUs. Elite4 is a switching component that can switch eight bidirectional communications links, each of which carries data in both directions simultaneously at 1.3 Gbyte/s. Two virtual channels share the link bandwidth.
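As a rough illustration of the Elite4 figures above, the following sketch works through the arithmetic. It assumes that the 1.3 Gbyte/s rate applies to each direction of a link and that the two virtual channels split a saturated link evenly; neither assumption is stated in the abstract, and the function names are hypothetical.

```python
# Back-of-the-envelope arithmetic for the Elite4 figures quoted in the abstract.
# Assumptions (not from the source): 1.3 Gbyte/s is the per-direction rate, and
# two saturated virtual channels share a link's bandwidth equally.

LINKS_PER_SWITCH = 8       # eight bidirectional links per Elite4 (from the abstract)
LINK_RATE_GBYTE_S = 1.3    # quoted link rate; assumed to be per direction
VIRTUAL_CHANNELS = 2       # two virtual channels share each link (from the abstract)


def aggregate_switch_bandwidth(links=LINKS_PER_SWITCH,
                               rate=LINK_RATE_GBYTE_S,
                               directions=2):
    """Peak data rate through one switch if every link is busy in both directions."""
    return links * rate * directions


def per_channel_share(rate=LINK_RATE_GBYTE_S, channels=VIRTUAL_CHANNELS):
    """Bandwidth left to each virtual channel under the even-split assumption."""
    return rate / channels


if __name__ == "__main__":
    print(f"Aggregate switch bandwidth: {aggregate_switch_bandwidth():.1f} Gbyte/s")
    print(f"Per-channel share of a link: {per_channel_share():.2f} Gbyte/s")
```

Under these assumptions a single Elite4 would move roughly 20.8 Gbyte/s at peak, and each virtual channel would see about 0.65 Gbyte/s per direction when both are saturated. A single active channel could presumably use the full link rate.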

Revised: October 24, 2007 | Published: July 5, 2005

Citation

Beecroft J., D. Addison, D. Hewson, M. McLaren, D. Roweth, F. Petrini, and J. Nieplocha. 2005. QsNetII: Defining High-Performance Network Design. IEEE Micro 25, no. 4:34-47. PNNL-SA-52258. doi:10.1109/MM.2005.75