NAS Parallel Benchmarks


What are Benchmarks?

A benchmark is the act of running a computer program, a set of programs, or other operations in order to assess the relative performance of an object, normally by running a number of standard tests and trials against it. The term ‘benchmark’ is also commonly used for the elaborately designed benchmarking programs themselves.
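
As a minimal sketch of that idea, the snippet below runs a workload for several trials and reports the best and mean wall-clock time. The `run_benchmark` helper and the `workload` function are hypothetical names chosen for illustration; a real suite such as NPB also verifies the results it computes, which this sketch does not.

```python
import time
from statistics import mean

def run_benchmark(workload, trials=5):
    """Time `workload` over several trials; return (best, mean) in seconds."""
    timings = []
    for _ in range(trials):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return min(timings), mean(timings)

def workload():
    # Hypothetical stand-in for a standard test: a sum of squares.
    return sum(i * i for i in range(100_000))

best, average = run_benchmark(workload)
print(f"best {best:.6f} s, mean {average:.6f} s")
```

Reporting the best time alongside the mean is one common choice, since the minimum is less sensitive to interference from other processes on the machine.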

The NAS Parallel Benchmarks (NPB) are a small set of programs designed to help evaluate the performance of parallel supercomputers. The benchmarks, which are derived from computational fluid dynamics (CFD) applications, consist of five kernels and three pseudo-applications. The NPB come in several “flavors.” NAS solicits performance results for each from all sources.


NPB 1

These are the original “pencil and paper” benchmarks. Vendors and others implement the detailed specifications in the NPB 1 report, using algorithms and programming models appropriate to their different machines. Submitted results are verified by the NAS Division. NPB 1 implementations are generally proprietary.

Specification: NAS Parallel Benchmarks, RNR-94-007 (PDF-425KB)

Source Code: Get Download Instructions



NPB 2

These are MPI-based source-code implementations written and distributed by NAS. They are intended to be run with little or no tuning, and approximate the performance a typical user can expect to obtain for a portable parallel program. They supplement, rather than replace, NPB 1. For the convenience of developers of parallelization tools, a serial version derived from the MPI implementations is also made available (in version 2.3, and again in version 3.0). The latest release, NPB 2.4, contains a new problem class (D), as well as a version of the BT (Block Tri-diagonal) benchmark that does significant (parallel) I/O. Each Class D benchmark involves approximately 20 times as much work as the corresponding Class C benchmark, and a data set that is approximately 16 times as large. The Class D implementation of the IS benchmark is not available.
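
The Class C to Class D scaling can be turned into a rough back-of-the-envelope estimate. The sketch below applies the approximate factors quoted above to a Class C measurement. The function name and the sample inputs are hypothetical, and the runtime estimate assumes the machine sustains the same rate on the larger problem, which is often not the case in practice.

```python
WORK_FACTOR = 20  # approximate Class D / Class C work ratio quoted above
DATA_FACTOR = 16  # approximate Class D / Class C data-set size ratio

def estimate_class_d(class_c_seconds, class_c_gib):
    """Scale a Class C runtime and data size to a rough Class D estimate."""
    return class_c_seconds * WORK_FACTOR, class_c_gib * DATA_FACTOR

# Hypothetical Class C measurement: 90 s runtime, 1.5 GiB of data.
seconds, gib = estimate_class_d(90.0, 1.5)
print(f"Class D estimate: about {seconds:.0f} s and {gib:.0f} GiB")
```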

Reports on specifications and reference implementations:

Source Code: Get Download Instructions


NPB 3

These are parallel implementations using OpenMP, High Performance Fortran (HPF), and Java. They were derived from the NPB-serial implementations released with NPB 2.3, after some additional optimization. These implementations, which include the improved serial codes, were previously known collectively as the Programming Baseline for the NAS Parallel Benchmarks (PBN).

A set of multi-zone benchmarks based on the single-zone NPB 3 has been added. The implementations include both serial and parallel versions. They are meant for testing the effectiveness of multi-level and hybrid parallelization paradigms and tools. The parallel implementation uses hybrid parallelism: MPI for the coarse-grain parallelism, and OpenMP for the loop-level parallelism.
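
The two-level structure described above can be sketched in a few lines. This is only an analogy, not NPB code: Python thread pools stand in for both levels (the outer pool plays the role MPI has across zones, the inner pool the role OpenMP has across loop iterations), and the zone data and the `solve_zone` and `relax_rows` names are hypothetical. In CPython, threads will not actually speed up a CPU-bound loop like this one; the point is the decomposition.

```python
from concurrent.futures import ThreadPoolExecutor

def relax_rows(zone, start, stop):
    """Inner (loop-level) work: update a slice of rows in one zone."""
    for i in range(start, stop):
        zone[i] = [0.5 * value for value in zone[i]]

def solve_zone(zone, inner_workers=2):
    """Process one zone, splitting its row loop across inner workers."""
    rows = len(zone)
    chunk = max(1, (rows + inner_workers - 1) // inner_workers)
    with ThreadPoolExecutor(max_workers=inner_workers) as inner:
        futures = [inner.submit(relax_rows, zone, lo, min(lo + chunk, rows))
                   for lo in range(0, rows, chunk)]
        for future in futures:
            future.result()  # re-raise any error from a worker
    return sum(map(sum, zone))

# Four hypothetical zones, each an 8 x 8 grid filled with 1.0.
zones = [[[1.0] * 8 for _ in range(8)] for _ in range(4)]

# Outer (coarse-grain) level: one task per zone.
with ThreadPoolExecutor(max_workers=4) as outer:
    totals = list(outer.map(solve_zone, zones))

print(totals)
```

Each zone gets its own inner pool, so the outer tasks never wait on workers from their own pool.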

Reports on specifications and implementations:

Source Code: Get Download Instructions

| Benchmark | Name derived from[2] | Available since | Description[2] | Remarks |
|---|---|---|---|---|
| MG | MultiGrid | NPB 1[2] | Approximate the solution to a three-dimensional discrete Poisson equation using the V-cycle multigrid method | |
| CG | Conjugate Gradient | NPB 1[2] | Estimate the smallest eigenvalue of a large sparse symmetric positive-definite matrix using inverse iteration, with the conjugate gradient method as a subroutine for solving systems of linear equations | |
| FT | Fast Fourier Transform | NPB 1[2] | Solve a three-dimensional partial differential equation (PDE) using the fast Fourier transform (FFT) | |
| IS | Integer Sort | NPB 1[2] | Sort small integers using the bucket sort[5] | |
| EP | Embarrassingly Parallel | NPB 1[2] | Generate independent Gaussian random variates using the Marsaglia polar method | |
| BT | Block Tridiagonal | NPB 1[2] | Solve a synthetic system of nonlinear PDEs using a block tridiagonal solver kernel | Has I/O-intensive subtypes[4]; multi-zone version available[13] |
| SP | Scalar Pentadiagonal[6] | NPB 1[2] | Solve a synthetic system of nonlinear PDEs using a scalar pentadiagonal solver kernel | Multi-zone version available[13] |
| LU | Lower-Upper symmetric Gauss-Seidel[6] | NPB 1[2] | Solve a synthetic system of nonlinear PDEs using a symmetric successive over-relaxation (SSOR) solver kernel | Multi-zone version available[13] |
| UA | Unstructured Adaptive[11] | NPB 3.1[7] | | |
| DC | Data Cube operator[12] | NPB 3.1[7] | | |
| DT | Data Traffic[7] | NPB 3.2[7] | | |
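
To make the EP kernel more concrete, here is a minimal sketch of the Marsaglia polar method it is built on. It uses Python's `random.Random` for the uniform draws, so it does not reproduce the random-number generator or the verification values that the NPB specification prescribes; the `gaussian_pair` name, the seed, and the sample count are all assumptions made for illustration.

```python
import math
import random

def gaussian_pair(rng):
    """Return two independent standard normal variates (Marsaglia polar method)."""
    while True:
        # Draw a point uniformly from the square [-1, 1) x [-1, 1).
        x = 2.0 * rng.random() - 1.0
        y = 2.0 * rng.random() - 1.0
        s = x * x + y * y
        # Accept only points inside the unit circle, excluding the origin.
        if 0.0 < s < 1.0:
            factor = math.sqrt(-2.0 * math.log(s) / s)
            return x * factor, y * factor

rng = random.Random(12345)  # fixed seed so the run is repeatable
samples = [v for _ in range(50_000) for v in gaussian_pair(rng)]

sample_mean = sum(samples) / len(samples)
sample_var = sum((v - sample_mean) ** 2 for v in samples) / len(samples)
print(f"mean {sample_mean:.3f}, variance {sample_var:.3f}")
```

Each pair depends only on its own uniform draws, which is why the work can be split across processors with almost no communication and the kernel is called embarrassingly parallel.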
