May 25, 2009
Conference Paper

Input-independent, Scalable and Fast String Matching on the Cray XMT

Abstract

String searching is at the core of many security and network applications like search engines, intrusion detection systems, virus scanners and spam ?lters. The growing size of on-line content and the increasing wire speeds push the need for fast, and often real- time, string searching solutions. For these conditions, many software implementations (if not all) targeting conventional cache-based microprocessors do not perform well. They either exhibit overall low performance or exhibit highly variable performance depending on the types of inputs. For this reason, real-time state of the art solutions rely on the use of either custom hardware or Field-Programmable Gate Arrays (FPGAs) at the expense of overall system ?exibility and programmability. This paper presents a software based implementation of the Aho-Corasick string searching algorithm on the Cray XMT multithreaded shared memory machine. Our so- lution relies on the particular features of the XMT architecture and on several algorith- mic strategies: it is fast, scalable and its performance is virtually content-independent. On a 128-processor Cray XMT, it reaches a scanning speed of ˜ 28 Gbps with a performance variability below 10 %. In the 10 Gbps performance range, variability is below 2.5%. By comparison, an Intel dual-socket, 8-core system running at 2.66 GHz achieves a peak performance which varies from 500 Mbps to 10 Gbps depending on the type of input and dictionary size.

Revised: March 15, 2010 | Published: May 25, 2009

Citation

Villa O., D. Chavarría-Miranda, and K.J. Maschhoff. 2009. Input-independent, Scalable and Fast String Matching on the Cray XMT. In IEEE International Symposium on Parallel & Distributed Processing. Rome:IEEE International Parallel & Distributed Processing Symposium. PNNL-SA-62783. doi:10.1109/IPDPS.2009.5161043