Understanding the AccelerateAI architecture

AccelerateAI comprises six BullSequana X410-A5 nodes, connected to one another (and to the Supercomputing Wales SUNBIRD hardware) via NVIDIA Networking InfiniBand EDR. Each node contains:

Node structure

Each CPU contains four “chiplets”, each with eight cores and a dedicated NUMA node with 64GB of RAM. Each GPU, and the InfiniBand adapter, is also associated with one of the eight NUMA nodes. Pairs of GPUs share a NUMA node, and each pair is also linked by NVLink for higher-speed transfers.

This high-level structure is illustrated in the diagram below:

Schematic diagram of the node structure, illustrating the text above.
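The layout described above can be sketched as a small mapping. The concrete numbering below (which NUMA nodes host the GPU pairs, and contiguous core numbering per chiplet) is an assumption for illustration only, not taken from the real topology; on an actual node, `numactl --hardware` and `nvidia-smi topo -m` report the true mapping.

```python
CORES_PER_CHIPLET = 8   # eight cores per chiplet, one NUMA node each
RAM_PER_NUMA_GB = 64    # 64GB of RAM per NUMA node
NUM_NUMA_NODES = 8      # two CPUs x four chiplets

def cores_of_numa_node(numa: int) -> set[int]:
    """Cores of one NUMA node, ASSUMING contiguous core numbering."""
    start = numa * CORES_PER_CHIPLET
    return set(range(start, start + CORES_PER_CHIPLET))

def numa_node_of_gpu(gpu: int) -> int:
    """NUMA node of a GPU, ASSUMING pair k (GPUs 2k and 2k+1) sits on
    even-numbered NUMA node 2k -- the real placement may differ."""
    return (gpu // 2) * 2

def nvlink_peer(gpu: int) -> int:
    """The other GPU in the same NVLink-connected pair."""
    return gpu ^ 1
```

Under these assumptions, GPUs 2 and 3 are NVLink peers on the same NUMA node, so transfers between them avoid both the inter-CPU link and the PCIe root of any other NUMA node.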

Consequences of this structure:
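One typical consequence of a layout like this is that a process driving a given GPU runs fastest on the cores of that GPU's own NUMA node, since that avoids cross-node memory traffic. The sketch below pins the current process to one node's cores; the contiguous core numbering is an assumption (verify with `numactl --hardware`), and `os.sched_setaffinity` is Linux-specific.

```python
import os

def pin_to_numa_node(numa: int, cores_per_node: int = 8) -> set[int]:
    """Pin this process to one NUMA node's cores (Linux only).

    ASSUMES node n owns cores n*cores_per_node .. (n+1)*cores_per_node - 1;
    check the real layout with `numactl --hardware` before relying on this.
    """
    wanted = set(range(numa * cores_per_node, (numa + 1) * cores_per_node))
    allowed = wanted & os.sched_getaffinity(0)  # respect existing limits
    if allowed:
        os.sched_setaffinity(0, allowed)
    return allowed
```

In batch jobs the same effect is more commonly achieved from the shell, e.g. with `numactl --cpunodebind=<n>` or the scheduler's own CPU-binding options.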

Interactive/testing node

One node is reserved for interactive use on test problems. Because test jobs typically need less GPU memory, and to maximise availability, each A100 in this node is split as follows:

These smaller GPU instances are not intended to be used in parallel.
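When an A100 is split this way (presumably via NVIDIA's Multi-Instance GPU feature), each slice appears as a separate MIG device with its own `MIG-…` UUID in `nvidia-smi -L`, and a job selects exactly one slice by placing that UUID in `CUDA_VISIBLE_DEVICES`. A small sketch of picking the UUIDs out of such a listing; the sample text below is illustrative only, not real output from this system:

```python
import re

def mig_uuids(listing: str) -> list[str]:
    """Extract MIG device UUIDs from `nvidia-smi -L`-style output."""
    return re.findall(r"UUID:\s*(MIG-[0-9A-Za-z-]+)", listing)

# Illustrative listing only -- profiles and UUIDs here are made up.
sample = """\
GPU 0: NVIDIA A100 (UUID: GPU-11111111-2222-3333-4444-555555555555)
  MIG 3g.20gb Device 0: (UUID: MIG-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee)
  MIG 1g.5gb  Device 1: (UUID: MIG-ffffffff-0000-1111-2222-333333333333)
"""

uuids = mig_uuids(sample)
# A job then runs against a single slice, e.g.:
#   CUDA_VISIBLE_DEVICES=<one MIG UUID> python train.py
```

Because each MIG instance has fixed, isolated compute and memory, there is no benefit to a single test job holding more than one slice, which matches the guidance above.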