Introduction to HPC

High Performance Computing (HPC) typically involves connecting remotely to a cluster of computers.
HPC systems can be used to do work that would either be impossible or much slower in a desktop environment.
Typical HPC workflows involve submitting “jobs” to a scheduler, which queues and prioritises the jobs of all users.
The standard method of interacting with HPC systems is via a Linux-based command-line interface called “the shell”.
 
Connect to the HPC

We connect to remote servers using the terminal.
SSH is a secure protocol for connecting to remote servers.
To connect to a server, you need its address, an open port (usually 22 for SSH), and your user ID, as in the example below.
 
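For example, a typical connection might look like this. The hostname cluster.example.org and the username yourUsername are placeholders; substitute the details for your own system.

    # Connect to the remote server; -p 22 is the default port
    # for SSH and can usually be omitted
    ssh -p 22 yourUsername@cluster.example.org
 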
Basic UNIX Commands
|
scp (The Secure Copy Program) is a standard way to securely transfer data to remote HPC systems.
File ownership is an important component of a shared computing space and can be controlled with chmod .
Scripts are mostly just lists of commands from the command line in the order they are to be performed.
 
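As a quick illustration, where the filenames, hostname, and username below are placeholders:

    # Copy a local file to your home directory on the remote system
    scp results.csv yourUsername@cluster.example.org:~/

    # Permission mode 755: the owner can read, write, and execute
    # the script; everyone else can read and execute it
    chmod 755 myscript.sh
 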
Using a cluster: Introduction

A cluster is a set of networked machines.
Clusters typically provide a login node and a set of worker nodes.
Files saved to the shared filesystem on one node are available on all nodes (a quick check appears below).
 
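A minimal way to orient yourself after logging in (hello.txt is just an example filename):

    # Print the name of the node you are currently on
    hostname

    # Home directories typically live on the shared filesystem,
    # so a file created here will be visible from every node
    touch ~/hello.txt
 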
Using a cluster: Scheduling jobs

The scheduler handles how compute resources are shared between users.
All computational work should be run through the scheduler.
A job is just a shell script (see the sketch below).
If in doubt, request more resources than you think you will need.
 
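As a sketch, assuming the cluster runs the Slurm scheduler (an assumption; PBS, SGE, and other schedulers use different directives), a minimal job script might look like this:

    #!/bin/bash
    #SBATCH --job-name=example    # name shown in the queue
    #SBATCH --time=00:10:00       # maximum wall-clock time (hh:mm:ss)
    #SBATCH --ntasks=1            # number of tasks (processes)
    #SBATCH --mem=1G              # memory request

    echo "This job ran on $(hostname)"

Under Slurm, such a script would be submitted with sbatch example-job.sh and monitored with squeue -u yourUsername.
 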
Using a cluster: Accessing software

Load software with module load softwareName (examples below).
The module system handles software versioning and package conflicts for you automatically.
You can edit your .bashrc file to automatically load a software package at login.
When using software on any HPC system, check the software’s documentation for details on how to use it effectively.
 
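For example, where softwareName is a placeholder for a real package name on your system:

    module avail                # list the software available on the cluster
    module load softwareName    # make a package available in this session
    module list                 # show which modules are currently loaded
    module unload softwareName  # remove the package again
 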
Using a cluster: Using resources effectively

The smaller your resource request, the faster your job is likely to be scheduled (the sketch below shows how to check what a finished job actually used).
Don’t run computational work on the login node.
Again, don’t run computational work on the login node.
Don’t be the person who slows everyone else down by running computational work on the login node.
 
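To right-size future requests, compare what a completed job actually used with what you asked for. A sketch, again assuming Slurm; 12345 is a placeholder job ID, and seff is installed on many but not all Slurm systems:

    # Summarise elapsed time and peak memory use for a completed job
    sacct -j 12345 --format=JobID,Elapsed,MaxRSS,ReqMem

    # Readable CPU and memory efficiency summary, where available
    seff 12345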