OS Upgrade on Longleaf: RHEL7 -> RHEL8

July 25, 2022 update: defaults have changed to RHEL8

The Longleaf cluster is in the process of being upgraded from Red Hat Enterprise Linux version 7 (RHEL7) to RHEL8. The newest Longleaf nodes are unable to run the older (RHEL7) version of the Linux kernel.

Some codes/workloads will run unaffected on either OS. Some will not.

Current configuration:

  • All jobs submitted to Slurm are (by default) directed to RHEL8 resources only
  • The majority of compute nodes are now RHEL8
  • Using submit script options, it is possible to submit jobs targeting: 1) only RHEL7; 2) only RHEL8; 3) either, i.e. the OS doesn't matter and the job takes the first available node

We recommend moving to RHEL8 at your earliest convenience.
We also recommend using a submit/login host whose RHEL version matches the compute nodes you are targeting.

Choosing RHEL8 or RHEL7

To target compute nodes by OS, add one of these SBATCH options to your submit script:

--constraint=rhel7
--constraint=rhel8

If neither is specified, jobs will land on RHEL8 resources.
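
As a minimal sketch of where the option sits in a submit script: the job name, task count, and time limit below are placeholders, and the commented-out "either OS" line is an assumption based on Slurm's general OR syntax for constraints, not something this page confirms for Longleaf.

```shell
#!/bin/bash
#SBATCH --job-name=os_check
#SBATCH --ntasks=1
#SBATCH --time=00:05:00
# Target RHEL8 nodes only; use rhel7 here to target RHEL7 nodes only.
#SBATCH --constraint=rhel8
# Assumed alternative for "either OS", using Slurm's OR operator
# (disabled here by the doubled ##):
##SBATCH --constraint="rhel7|rhel8"

# Placeholder workload: report which node the job landed on.
node=$(uname -n)
echo "Job ran on ${node}"
```

All #SBATCH lines must come before the first command in the script, since Slurm stops reading options once it reaches one.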


Login/Build hosts

  • ssh username@longleaf.its.unc.edu is RHEL8
  • ssh username@longleaf-rhel7.its.unc.edu
  • ssh username@longleaf-rhel8.its.unc.edu

It remains possible to explicitly choose among {7 only, 8 only, either} using sbatch options.


How long will RHEL7 systems remain available?

While the set of resources servicing RHEL7 jobs is decreasing rapidly, it will not go to zero during calendar year 2022: some RHEL7 systems will remain available throughout CY22. If you have concerns about access to RHEL7 systems in 2023 and beyond, please contact us as soon as possible to discuss.


Which RHEL am I on right now?

cat /etc/redhat-release

This command can also be included in submit scripts.
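
One possible way to use this inside a submit script is sketched below. The helper name rhel_major is hypothetical, and the sketch assumes the release file contains the word "release" followed by the version number, as it normally does on RHEL hosts.

```shell
#!/bin/bash
# Print the major RHEL version found in a release string read from stdin.
rhel_major() {
    sed -n -E 's/.*release ([0-9]+).*/\1/p'
}

# Report the OS of the current host if the release file exists.
if [ -r /etc/redhat-release ]; then
    echo "Running on RHEL $(rhel_major < /etc/redhat-release), host $(uname -n)"
fi
```

Writing this line to the job's output makes it easy to tell afterwards which OS a failed or successful job actually ran on.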


Building code

We recommend explicitly choosing a version of gcc using modules. As of July 2022, the default gcc module version (module load gcc) is 11.2.0 for both RHEL7 and RHEL8 systems. Note this is different from the system gcc that will be used if no gcc module is loaded into the environment. With no gcc module loaded, the version will be different between RHEL7 and RHEL8 systems, and different from the default module version of 11.2.0.
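
A short sketch of pinning the compiler and confirming which gcc is in use. The module name gcc/11.2.0 is an assumption inferred from the default version stated above; check the list of available versions for the exact name.

```shell
#!/bin/bash
# Load a pinned gcc module when the module command is available.
# gcc/11.2.0 is an assumed module name; verify it on the cluster.
if command -v module >/dev/null 2>&1; then
    module load gcc/11.2.0
fi

# Show which compiler is first on PATH, and its version, if any.
if command -v gcc >/dev/null 2>&1; then
    gcc_path=$(command -v gcc)
    echo "Using ${gcc_path}: $(gcc --version | head -n 1)"
else
    gcc_path=""
    echo "No gcc found on PATH"
fi
```

Running the same check with no module loaded shows the system gcc instead, which is the version that differs between RHEL7 and RHEL8.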

"module avail gcc" lists all available versions.


Modules

While we have built codes for RHEL8 and have both RHEL7 and RHEL8 versions of key modules available, you may still encounter module issues in a RHEL8 environment that require our assistance. Please reach out to research@unc.edu with any questions or concerns about modules in RHEL8.

For example, the R modules r/4.1.0 and r/4.1.3 are available for both RHEL7 and RHEL8. Simply load the module as usual, e.g.

module add r/4.1.3

However, to guarantee the correct version of R is run in your job's environment on a compute node, one should add the above module command to the job submission script itself, after the SBATCH options but before the R command is invoked.
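
A submit script following this ordering might look like the sketch below. The job name and resource requests are placeholders, the R_MODULE variable is used only for illustration, and the script is meant to be submitted with sbatch on Longleaf, where the module command and R are available.

```shell
#!/bin/bash
#SBATCH --job-name=r_version_check
#SBATCH --ntasks=1
#SBATCH --time=00:10:00
#SBATCH --constraint=rhel8

# Load R inside the job, after the SBATCH options and before R is invoked,
# so the module is loaded in the environment of the compute node itself.
R_MODULE="r/4.1.3"
module add "$R_MODULE"

# Confirm which R the job actually picked up.
Rscript -e 'cat(R.version.string, "\n")'
```

In a real job, replace the final line with your own R command.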


Troubleshooting tip

A clue that the R version you are attempting to use does not match the OS version of the host your job runs on is an error such as:

/nas/longleaf/apps/r/4.1.3/lib64/R/bin/exec/R: error while loading shared libraries: libreadline.so.6: cannot open shared object file: No such file or directory

In this case, the job landed on a RHEL8 host but the RHEL7 version of R was used.

Another problem can arise with third-party packages that you have installed yourself under your home directory. The package may have been installed on a RHEL7 host while the job that calls it runs on a RHEL8 host. Again, in the case of R, the error might look something like:

Error: package or namespace load failed for ‘robustbase’ in dyn.load(file, DLLpath = DLLpath, ...): unable to load shared object '~/R/x86_64-pc-linux-gnu-library/4.1/robustbase/libs/robustbase.so': libgfortran.so.3: cannot open shared object file: No such file or directory

A quick fix is to add the "--constraint=rhel7" SBATCH option to the job submission to target only RHEL7 hosts. However, it is important to address the underlying issue and rebuild code/dependencies, because the set of RHEL7 resources is diminishing as more nodes transition to version 8.
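
To check whether a personally installed package has this kind of mismatch, one option is to ask ldd which shared libraries it cannot resolve on the current host. This is a diagnostic sketch only: the helper name missing_libs is hypothetical, and the path is the one from the example error above.

```shell
#!/bin/bash
# List the shared libraries a compiled object needs but cannot find here.
# Returns success (0) only when at least one library is missing.
missing_libs() {
    ldd "$1" 2>/dev/null | grep 'not found'
}

# Path taken from the example error message; adjust for your own package.
pkg_so="$HOME/R/x86_64-pc-linux-gnu-library/4.1/robustbase/libs/robustbase.so"

if [ -e "$pkg_so" ]; then
    if missing_libs "$pkg_so"; then
        echo "Likely built on a different OS: reinstall the package on this OS."
    else
        echo "All shared libraries resolved on $(uname -n)."
    fi
fi
```

Run it on a host with the same RHEL version as the compute nodes your job targets. Any library reported as missing, such as libgfortran.so.3 in the error above, suggests the package should be reinstalled there.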

 

Last Update 4/25/2024 12:17:32 AM