Implementing parallel graph algorithms in large, shared memory machines, such as the Cray XMT, can be challenging for programmers. Synchronization, deadlock, hot spotting, and others can be barriers to obtaining linear scalability. Alternative programming models, such as the bulk synchronous parallel programming model used in Google’s Pregel, have been proposed for large graph analytics. This model eases programmer complexity by eliminating deadlock and simplifying data sharing. We investigate the algorithmic effects of the BSP model for graph algorithms and compare performance and scalability with hand-tuned, open source software using GraphCT. We analyze the innermost iterations of these algorithms to understand the differences in scalability between BSP and shared memory algorithms. We show that scalable performance can be obtained with graph algorithms in the BSP model on the Cray XMT. These algorithms perform within a factor of 10 of hand-tuned C code.