HPE Performance Cluster Management Administration (H8PE9) – Outline

Detailed Course Outline

Module 1: Install Cluster
  • Describe HPCM features
  • Define operating system slots
  • Build cluster from ground up
  • Provision node with GUI
  • Provision node with command line
  • Add nodes to the cluster
  • Explore auto installation tools
Module 2: Discover
  • Discover nodes
  • Interpret cluster configuration files
  • Review cluster services
Module 3: Data Networks
  • Describe technologies
  • Describe InfiniBand configuration
  • Describe Intel Omni-Path configuration
  • Describe software components
  • Use diagnostic commands
Module 4: Manage Images
  • Manage software repositories
  • List software repositories
  • Add software repositories
  • Remove software repositories
  • Create repository groups
  • Customize an image by using RPM lists
  • Create a compute node image
  • Create an ICE-compute node image
  • Manage image version control
  • Check an image into version control
  • Compare differences between two versions of an image
  • List the versions of an image
  • Deploy a specific version of an image
  • Push an ICE-compute image to a rack
  • Use parallel tools and inbuilt functionality to check differences between nodes
  • Install batch scheduler server on a compute node
  • Install batch scheduler client on a compute node and on an ICE compute node
  • Configure HPCM connectors to job schedulers
  • Capture an image from a node (golden)
  • Add RPMs to, remove RPMs from, and version control compute images
  • Add and remove RPMs from running compute nodes
  • Clone an ICE-compute image
  • Add RPMs to ICE compute image
  • Compare when and when not to use tmpfs root
  • Determine which nodes use tmpfs root
  • Configure nodes to use tmpfs root
  • List tmpfs quota difference (rack leader quotas do not apply when ICE-compute nodes are in tmpfs)
  • Set tmpfs mode
  • Set disk mode
  • Show which mode a node has booted with
  • Show which mode a node is scheduled to boot into
Module 5: Automate Post Installation Tasks
  • Review conf.d scripts
  • Exclude a conf.d script
  • Use pre_reconf.sh
  • Use reconfig.sh
  • Develop post-install and per-host customization scripts
Module 6: Configure Shared Filesystem, User Accounts, Applications, and Updates
  • NFS export a filesystem on a compute node
  • Mount an NFS filesystem and create a user on an ICE compute node
  • Manage user accounts
  • Synchronize UIDs and GIDs, LDAP, etc.
  • Run an application on compute and ICE compute nodes
  • Display BIOS settings
  • Upgrade firmware
  • Update kernel
  • Update distribution
  • Update HPCM
Module 7: Troubleshoot Cluster
  • Back up cluster configuration
  • Back up managed network switch configuration
  • Use the central log repository
  • Investigate log files
  • Gather system information
  • Interrogate iLOs and BMCs
  • Confirm resources
  • Create pdsh groups
  • Investigate bond devices
  • Inspect VLAN devices
  • Capture a node crash dump
  • Transfer an image from another slot or another system and confirm that the image can be used
  • Inject faults