NVIDIA Tesla M60 - Server Install

Installing NVIDIA Tesla M60s on my servers

In this post, I’ll go through how I upgraded my HP Gen8 servers to use NVIDIA Tesla M60 GPUs, including replacing the PCIe riser cards, verifying detection in BIOS, and preparing for driver installation on openSUSE Leap 15.6.

My servers

🖥️ Node 1 – HP DL385p Gen8

| Component | Spec |
|-----------|------|
| CPU | 2 × AMD Opteron 6380 |
| Threads | 64 |
| RAM | 128GB DDR3 ECC |
| GPU | NVIDIA Tesla M60 (16GB GDDR5 – dual GPU card, 2 × 8 GB) |
| Storage | 11.7TB total |

🖥️ Node 2 – HP DL380p Gen8

| Component | Spec |
|-----------|------|
| CPU | 2 × Intel Xeon E5-2665 |
| Threads | 32 |
| RAM | 256GB DDR3 ECC |
| GPU | NVIDIA Tesla M60 (16GB GDDR5 – dual GPU card, 2 × 8 GB) |
| Storage | 512GB SSD |

🖥️ Node 3 – HP DL380 Gen8

| Component | Spec |
|-----------|------|
| CPU | 2 × Intel Xeon E52260 |
| Threads | 24 |
| RAM | 48GB DDR3 ECC |
| GPU | NVIDIA Tesla M60 (16GB GDDR5 – dual GPU card, 2 × 8 GB) |
| Storage | 2 × 512GB SSDs + 6 × 1TB HDDs = ~6.5TB total |


I wanted to install NVIDIA Tesla M60 cards into each server to build a small Beowulf cluster capable of running AI workloads with tools like Ollama.

The servers already had the rest of the hardware they needed and the graphics card slots straight in; however, the PCIe riser card in each server had to be changed first.

Upgrading the risers and adding the graphics card

The issue I had was that the original PCIe risers had only one x16 slot, and because the graphics card is double height, it would not fit.


I bought the correct part: HP 634582-001 662525-001 DL380p G8 2Slot 2x16 PCI-E Riser Card.

First I removed the screws holding the old card in place, then I slid the card out.

The new card slid into the place of the old one and the screws were tightened up.

I then put the graphics card in place and attached the power cable.
I then put the riser back into the server.

I put the lid back on, connected up the peripherals and powered it up. During boot I went into the BIOS to confirm that the server detected the new hardware.

It did, and the card was shown on slot 5, which is correct.
With the hardware installed and recognized in BIOS, the next step is to get the NVIDIA drivers set up correctly under openSUSE Leap 15.6.
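Once Linux is booted, the BIOS result can be double-checked from the operating system before any driver work. The following is a minimal sketch, assuming the `lspci` tool is available. The helper name `count_nvidia_gpus` is my own and not part of any package.

```shell
#!/bin/sh
# Count NVIDIA display devices in lspci-style output read from stdin.
# Depending on its mode, an M60 GPU may be listed as a "VGA compatible
# controller" or a "3D controller", so both are matched.
count_nvidia_gpus() {
    awk '
        { line = tolower($0) }
        line ~ /nvidia/ && line ~ /(vga compatible|3d) controller/ { n++ }
        END { print n + 0 }
    '
}

if command -v lspci >/dev/null 2>&1; then
    echo "NVIDIA GPUs visible on the PCI bus: $(lspci | count_nvidia_gpus)"
else
    echo "lspci not found (on openSUSE it comes from the pciutils package)"
fi
```

Because the M60 is a dual GPU card, a server with one card fitted should report two devices here.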

Installing NVIDIA Tesla M60 Drivers on openSUSE Leap 15.6 

I wanted a clean, minimal install guide that just works, as I had three of these to do.
The following is what I did to get the drivers working:

First I opened up the terminal.

Step 1 — Update the system

Start by making sure everything is up-to-date.

sudo zypper refresh
sudo zypper update
sudo reboot

Reboot so the latest kernel and packages are active.

Step 2 — Disable the open-source Nouveau driver

The open-source nouveau driver conflicts with NVIDIA’s official one.
Create a small blacklist file:

sudo sh -c "echo 'blacklist nouveau' > /etc/modprobe.d/50-blacklist-nouveau.conf"
sudo sh -c "echo 'options nouveau modeset=0' >> /etc/modprobe.d/50-blacklist-nouveau.conf"
sudo dracut --force
sudo reboot

After reboot, confirm Nouveau isn’t loaded:

lsmod | grep nouveau

No output means it’s disabled, which is exactly what we want.
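With three servers to do, it can help to wrap both checks (the blacklist file and the loaded modules) in one small script. This is only a sketch, assuming the file path used in this step. The function names `has_nouveau_blacklist` and `nouveau_loaded` are made up for the example.

```shell
#!/bin/sh
# Return 0 if the given modprobe config file contains "blacklist nouveau".
has_nouveau_blacklist() {
    [ -r "$1" ] &&
        grep -qE '^[[:space:]]*blacklist[[:space:]]+nouveau[[:space:]]*$' "$1"
}

# Return 0 if nouveau is listed as a module in lsmod-style output on stdin.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}

conf=/etc/modprobe.d/50-blacklist-nouveau.conf
if has_nouveau_blacklist "$conf"; then
    echo "blacklist entry present in $conf"
else
    echo "no blacklist entry found in $conf"
fi

# If lsmod is unavailable the pipe is empty, which is treated as "not listed".
if lsmod 2>/dev/null | nouveau_loaded; then
    echo "nouveau is still loaded; rebuild the initrd and reboot"
else
    echo "nouveau does not appear in the lsmod output"
fi
```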

Step 3 — Add the official NVIDIA repository

openSUSE makes this easy. For Leap 15.6:

sudo zypper addrepo --refresh https://download.nvidia.com/opensuse/leap/15.6 NVIDIA
sudo zypper refresh

This repo contains all current proprietary driver builds. If you're doing this yourself, make sure you use the repository that matches your distribution and version.
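One way to avoid picking the wrong repository is to build the URL from the release recorded in /etc/os-release. The sketch below assumes that the URL pattern used above for 15.6 also applies to whichever Leap release you are running, which is worth confirming in a browser first. `nvidia_repo_url` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the NVIDIA repo URL for the Leap release described by an os-release
# file. Assumption: the URL layout used for 15.6 in this post also applies
# to the release found in the file.
nvidia_repo_url() {
    (
        . "$1" || exit 1
        [ "${ID:-}" = "opensuse-leap" ] || exit 1
        printf 'https://download.nvidia.com/opensuse/leap/%s\n' "$VERSION_ID"
    )
}

if url=$(nvidia_repo_url /etc/os-release); then
    echo "Repository for this system: $url"
else
    echo "This does not look like openSUSE Leap; pick the repository by hand"
fi
```

On a Leap 15.6 install this should print the same URL that was passed to zypper addrepo above.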

Step 4 — Install the Tesla M60 driver

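The Tesla M60 is a Maxwell-generation card, which is covered by the G06 driver packages. G06 is the current driver series for Maxwell and newer GPUs in general rather than a Tesla-only branch.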

sudo zypper install nvidia-compute-G06 nvidia-driver-G06-kmp-default nvidia-gl-G06 nvidia-video-G06

If you see an error about libglvnd, fix it with:

sudo zypper install libglvnd

…and rerun the install command above.
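Before rebooting, it is worth confirming that all four packages actually went on. A minimal sketch using `rpm -q` follows. The `list_missing` helper is my own invention, and it takes the check command as its first argument so the same loop can be reused with a different checker.

```shell
#!/bin/sh
# Print each package for which the check command fails.
# $1 is the check command (for example "rpm -q"); the rest are package names.
list_missing() {
    check=$1
    shift
    for pkg in "$@"; do
        # $check is left unquoted on purpose so "rpm -q" splits into two words.
        $check "$pkg" >/dev/null 2>&1 || printf '%s\n' "$pkg"
    done
}

missing=$(list_missing "rpm -q" \
    nvidia-compute-G06 nvidia-driver-G06-kmp-default \
    nvidia-gl-G06 nvidia-video-G06)

if [ -z "$missing" ]; then
    echo "All driver packages are installed"
else
    echo "Missing packages:"
    printf '%s\n' "$missing"
fi
```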

Step 5 — Reboot and verify

sudo reboot

Once the server is back up, run:

nvidia-smi

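If everything worked, nvidia-smi prints a table showing the driver version and the detected GPUs. The M60 is a dual GPU card, so it should appear as two GPUs.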

That means the driver is active and the GPU is ready for CUDA work.
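For a check that is easier to compare across three servers, nvidia-smi can also print selected fields as CSV. This is a sketch, with `count_gpus` being a made-up helper that counts the non-empty rows.

```shell
#!/bin/sh
# Count non-empty lines on stdin (one line per GPU in the CSV output).
count_gpus() {
    awk 'NF { n++ } END { print n + 0 }'
}

if command -v nvidia-smi >/dev/null 2>&1; then
    report=$(nvidia-smi \
        --query-gpu=index,name,memory.total,driver_version \
        --format=csv,noheader)
    printf '%s\n' "$report"
    echo "GPUs reported: $(printf '%s\n' "$report" | count_gpus)"
else
    echo "nvidia-smi not found; the driver packages may not be installed"
fi
```

With one M60 per server, I would expect each node to report two GPUs.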

Step 6 - Install the NVIDIA configuration tools

Using the YaST2 software package installer, I then installed the NVIDIA settings and xconfig tools so I could see the graphics card settings within the GUI.

I searched for NVIDIA and selected the relevant packages to install.
The packages I chose were nvidia-settings and nvidia-xconfig; the others wanted to downgrade the driver, so I deselected them.
With these packages installed, I can now see the settings for the graphics card.

I repeated this process for my other two servers and all worked great.
Now I can finally look at the next part of setting up my Beowulf cluster which I hope to document and show.
