An alternative form of multiprocessors to a shared memory multiprocessor can be created by connecting completer computers through an interconnection network. Each computer consists of a processor and local memory but this memory is not accessible by other processors.
The interconnection network provides for processors to send messages to other processors. The message carry data from one processor to another as dictated by the program. Such multiprocessor systems are usually called message-passing multiprocessor, or simply multicomputers, especially if they consist of self-contained computers that could operate separately.
Programming a message-passing multicomputer still involves dividing the problem into parts that are intended to be executed simultaneously to solve the problem. Programming could use a parallel or extended sequential language, but a common approach is to use message-passing library routines that are inserted into a conventional sequential program for message-passing. Often, we talk in terms of processes. A problem is divided into a number of concurrent processes than computers, more than one process would be executed on one computer, in a time-shared fashion.
Processes communicate by sending messages; this will be the only way to distribute data and results between processes.
The purpose of the interconnection network is to provide a physical path for message sent from one computer to another computer.
Key issues in network design are the bandwidth, latency and cost.
The bandwidth is the number of bits that can be transmitted in unit time, given as bits/sec.
The network latency is the time to make a message transfer through the network.
The communication latency is the total time to send the message, including the software overhead and interface delays.
Message latency, or startup time, is the time to send a zero-length message, which is essentially the software and hardware overhead in sending a message(finding the route, packing, unpacking, etc) onto which must be added the actual time to send the data along the interconnection path.
The number of physical links in a path between two nodes is an important consideration because it will be a major factor in determining the delay for a message. The diameter is the minimum number of links between the two farthest nodes(computers) in the network. Only the shortest routes are considered. How efficiently a parallel problem can be solved using a multicomputer with a specific network is extremely important. The diameter of the network gives the maximum distance that a single message must travel and can be used to find the communication lower bound of some parallel algorithms.
A two dimensional mesh can be created by having each node in two dimensional array connect to its four nearest neighbors.
The mesh and torus network are popular because of their ease of layout and expandability. If necessary, the network can be folded; that is, rows are interleaved and columns, are interleaved so that the wraparound connections simply turn back through the network rather than stretch from one edge to the opposite edge.
BlueGene/L uses a three-dimensional (3D) torus network in which the nodes (red balls) are connected to their six nearest-neighbor nodes in a 3D mesh. In the torus configuration, the ends of the mesh loop back, thereby eliminating the problem of programming for a mesh with edges. Without these loops, the end nodes would not have six near neighbors.
In a d-dimensional(binary) hypercube network, each node connects to one node in each of dimesions of the network. For example, in a three-dimensional hypercube, the connections in the x-direction, y-direction and z-direction form a cube.
Each bit is associated with one of the dimensions and can be gave a 3-bit address. Node 000 connects to nodes with address 001, 010 and 100. Note that each node connects to nodes whose addresses differ by one bit. This characteristic can be extended for higher-dimension hypercubes.
A notable advantage of the hypercube is that the diameter of the network is given by log_2 p for a p-node hypercube, which has a reasonable(low) growth with increasing p. The number of links emanating from each node also only grows logarithmically.
Hypercube are a part of a larger family of k-ary d-cubes; however, it is only the binary hypercube(with k = 2) that is really important as a basis for multicomputer construction and for parallel algorithms. The hypercube network became popular for constructing message-passing multicomputers after the pioneering research system called the Cosmic Cube was constructed at Caltech in the early 1980s.
As an alternative to direck links between individual computers, switches can be used in various configurations to route the messages between the computers.
The Crossbar switch provides exhaustive connections using one switch for each connection. It is employed in shared memory systems more so than message-passing systems for connecting processors to memories.
However a crossbar architecture has a small problem. When a crossbar switch serves multiple networks, and two frames enter the switch at the same time destined for different ports, one of the frames is blocked while the first frame is forwarded. This results in all frames being queued as they flow through the switch. If there is sufficient traffic and insufficient buffer space on the switch, packets are dropped.
This problem is called Head of Line Blocking, and is a common problem with crossbar switches. One device that suffers from just such a problem is the old DEC gigaswitch design.
Another switch configuration is to use a binary tree. A key aspect of the tree structure is that the height is logarithmic, there are log_2 p levels of switches with p processors. The tree network need not be complete or based upon the base two.
The tree network topology is ideal when the workstations are located in groups, with each group occupying a relatively small physical region. An example is a university campus in which each building has its own star network, and all the central computers are linked in a campus-wide system. It is easy to add or remove workstations from each star network. Entire star networks can be added to, or removed from, the bus. If the bus has low loss and/or is equipped with repeaters, this topology can be used in a wide area network (WAN) configuration.
Multistage Interconnection Networks
The Multistage interconnection network(MIN) is a classification covering a multitude of configurations with the common characteristics of having a number of levels of switches. Switches in one level are connected to switches in adjacent levels in various symmetrical ways such that a path can made from one side of the network to the other side(and back sometimes).
An example of a multistage interconnection network is the Omega network. This network has a very simple routing algorithm using the destination address. Inputs and outputs are given addresses. Each switching cell requires one control signal to select either the upper output or the lower output(0 specifying the upper output and 1 specifying the lower). The most significant bit of the destination address is used to control the switch in the first stage; if the most significant bit is 0, the upper output is selected, and if it is 1, the lower output is selected. The next most significant bit of the destination address is used to control the output of the switch in the next stage, and soon until the final output has been selected.
Parallel Programming – Techniques and Applications 2th Edition, Barry Wilkinson