What are Benchmarks?
A benchmark is the act of running a computer program, a set of programs, or other operations in order to assess the relative performance of an object, normally by running a number of standard tests and trials against it. The term ‘benchmark’ is also commonly used for the elaborately designed benchmarking programs themselves.
The NAS Parallel Benchmarks (NPB) are a small set of programs designed to help evaluate the performance of parallel supercomputers. The benchmarks, which are derived from computational fluid dynamics (CFD) applications, consist of five kernels and three pseudo-applications. The NPB come in several “flavors.” NAS solicits performance results for each from all sources.
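As a minimal illustration of "running a number of standard tests and trials", the sketch below times a function over several repetitions and reports the best and mean wall-clock time. The names `benchmark` and `sum_of_squares` are hypothetical and chosen only for this example; this is not how the NPB harness itself measures performance.

```python
import time


def benchmark(func, *args, repeats=5):
    """Run func(*args) `repeats` times; return (best, mean) wall-clock seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)
    return min(timings), sum(timings) / len(timings)


def sum_of_squares(n):
    """Hypothetical workload used only to have something to time."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    # Same test, two problem sizes: the relative timings are what matter.
    for n in (10_000, 100_000):
        best, mean = benchmark(sum_of_squares, n)
        print(f"n={n}: best {best:.6f}s, mean {mean:.6f}s")
```

Reporting the best of several runs is one common way to reduce noise from other activity on the machine; other summaries, such as the median, are equally reasonable.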
NPB 1
These are the original “pencil and paper” benchmarks. Vendors and others implement the detailed specifications in the NPB 1 report, using algorithms and programming models appropriate to their different machines. Submitted results are verified by the NAS Division. NPB 1 implementations are generally proprietary.
Specification: NAS Parallel Benchmarks, RNR-94-007 (PDF, 425KB)
Source Code: Get Download Instructions
Results:
 NPB 1 Results Report, NAS-96-018 (PDF, 119KB)
 NPB 1 Results Report, NAS-95-012 (PDF, 77KB)
NPB 2
These are MPI-based source-code implementations written and distributed by NAS. They are intended to be run with little or no tuning, and approximate the performance a typical user can expect to obtain for a portable parallel program. They supplement, rather than replace, NPB 1. For the convenience of developers of parallelization tools, a serial version derived from the MPI implementations is also made available (in version 2.3, and also in version 3.0). The latest release, NPB 2.4, contains a new problem class (D), as well as a version of the BT (Block Tridiagonal) benchmark that does significant (parallel) I/O. Each Class D benchmark involves approximately 20 times as much work, and a data set approximately 16 times as large, as the corresponding Class C benchmark. The Class D implementation of the IS benchmark is not available.
Reports on specifications and reference implementations:
 NPB-MPI 2.0, NAS-95-020 (PDF, 227KB)
 NPB-MPI 2.2 (PDF, 188KB)
 NPB-MPI 2.4, NAS-02-007 (PDF, 152KB)
 NPB-MPI 2.4 I/O, NAS-03-002 (PDF, 57KB)
Source Code: Get Download Instructions
NPB 3
These are parallel implementations using OpenMP, High Performance Fortran (HPF), and Java. They were derived from the NPB-serial implementations released with NPB 2.3, after some additional optimization. These implementations, which include the improved serial codes, were previously known collectively as the Programming Baseline for the NAS Parallel Benchmarks (PBN).
A set of multi-zone benchmarks based on the single-zone NPB 3 has been added. The implementations include both serial and parallel versions. They are meant for testing the effectiveness of multi-level and hybrid parallelization paradigms and tools. The parallel implementation uses hybrid parallelism: MPI for the coarse-grain parallelism, and OpenMP for the loop-level parallelism.
Reports on specifications and implementations:
 NPB-OpenMP 3.0, NAS-99-011 (PDF, 328KB)
 NPB-Java 3.0, NAS-02-009 (PDF, 191KB)
 NPB-High Performance Fortran 3.0, NAS-98-009 (PDF, 352KB)
 NPB-Multi-Zone 3.0, NAS-03-010 (PDF, 128KB)
Source Code: Get Download Instructions
| Benchmark | Name derived from [2] | Available since | Description [2] | Remarks |
|---|---|---|---|---|
| MG | MultiGrid | NPB 1 [2] | Approximate the solution to a three-dimensional discrete Poisson equation using the V-cycle multigrid method | |
| CG | Conjugate Gradient | NPB 1 [2] | Estimate the smallest eigenvalue of a large sparse symmetric positive-definite matrix using the inverse iteration with the conjugate gradient method as a subroutine for solving systems of linear equations | |
| FT | Fast Fourier Transform | NPB 1 [2] | Solve a three-dimensional partial differential equation (PDE) using the fast Fourier transform (FFT) | |
| IS | Integer Sort | NPB 1 [2] | Sort small integers using the bucket sort [5] | |
| EP | Embarrassingly Parallel | NPB 1 [2] | Generate independent Gaussian random variates using the Marsaglia polar method | |
| BT | Block Tridiagonal | NPB 1 [2] | Solve a synthetic system of nonlinear PDEs using a block tridiagonal solver kernel | |
| SP | Scalar Pentadiagonal [6] | NPB 1 [2] | Solve a synthetic system of nonlinear PDEs using a scalar pentadiagonal solver kernel | |
| LU | Lower-Upper symmetric Gauss-Seidel [6] | NPB 1 [2] | Solve a synthetic system of nonlinear PDEs using a symmetric successive over-relaxation (SSOR) solver kernel | |
| UA | Unstructured Adaptive [11] | NPB 3.1 [7] | | |
| DC | Data Cube operator [12] | NPB 3.1 [7] | | |
| DT | Data Traffic [7] | NPB 3.2 [7] | | |
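The EP entry in the table above names the Marsaglia polar method for generating Gaussian variates. A minimal Python sketch of that method follows. It assumes Python's standard `random` module as the uniform source and does not follow the NPB specification's own random number generator or verification procedure; the function names `gaussian_pair` and `gaussian_samples` are hypothetical.

```python
import math
import random


def gaussian_pair(rng):
    """Return two independent standard normal variates (Marsaglia polar method)."""
    while True:
        # Draw a point uniformly from the square [-1, 1) x [-1, 1).
        x = 2.0 * rng.random() - 1.0
        y = 2.0 * rng.random() - 1.0
        s = x * x + y * y
        # Accept only points strictly inside the unit circle (and not the origin).
        if 0.0 < s < 1.0:
            factor = math.sqrt(-2.0 * math.log(s) / s)
            return x * factor, y * factor


def gaussian_samples(count, seed=0):
    """Generate `count` standard normal variates from a seeded generator."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < count:
        samples.extend(gaussian_pair(rng))
    return samples[:count]


if __name__ == "__main__":
    data = gaussian_samples(100_000, seed=42)
    mean = sum(data) / len(data)
    var = sum((v - mean) ** 2 for v in data) / len(data)
    print(f"sample mean {mean:.3f}, sample variance {var:.3f}")
```

Each pair is computed independently of every other pair, so the work can be split across processes with essentially no communication, which is what makes this kind of workload "embarrassingly parallel".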