How to build your first Raspberry Pi cluster?

1. Gather the necessary components: You will need at least two Raspberry Pi boards, a power supply, a network switch, and a microSD card for each board.

2. Install the operating system: Download the latest version of the Raspberry Pi OS and install it on each microSD card.

3. Connect the Raspberry Pis: Connect the Raspberry Pis to the network switch using Ethernet cables.

4. Configure the network: Configure the network settings on each Raspberry Pi so that they can communicate with each other.

5. Install the cluster software: Install the necessary software to enable the Raspberry Pis to work together as a cluster.

6. Test the cluster: Test the cluster by running a few simple tasks to make sure everything is working correctly.

7. Enjoy your cluster: Once everything is working correctly, you can start using your Raspberry Pi cluster for whatever projects you have in mind.

Do you have two or more Raspberry Pi at home?
Do you want to try putting them together to make a cluster?
If so, you’re in the right place. When I bought my second Raspberry Pi, I immediately wanted to build a cluster.

On Raspberry Pi, a cluster can be created by installing the same operating system, apps and libraries on all nodes. To run commands on all nodes, MPICH is the only app required. There is also a Python library, MPI4PY, that lets Python scripts exchange data between nodes.

As this can be a complex topic for beginners, I’ll start with a little introduction on clusters in general.
Then I’ll explain what I have done and how you can do the same on your side.


Cluster presentation

What’s a cluster?

Basically, a cluster is a group of computers that act as a single entity.
The goal is to make them work together to improve overall performance.
All of the computers in a cluster work on the same task, reducing the time needed to finish it.

Don’t confuse computer clusters with load balancing.
In load balancing architecture, each computer is working on a different task to decrease the master node’s load.
In a cluster, we take advantage of the total power of the cluster to run a task in parallel.

Cluster examples

Computer clusters find their origins in the 60s, around the same time as the first work on networking, and they are still used today.

The first commercial computer cluster in history is the Arcnet.
Its goal was to connect groups of Datapoint 2200 computers.
It’s damn old in computer history :).

At the time of writing this, the IBM Summit at Oak Ridge National Laboratory (ORNL) is the biggest supercomputer in the world.
With over 2 million CPU cores and 3,000 TB of RAM, and still growing, it will be tough to compete with.

Image: Summit from IBM (source: ornl.gov)

Raspberry Pi application

Let’s come back to a more realistic scale and apply that definition to the Raspberry Pi.

As you know, the Raspberry Pi is not very powerful, but it’s cheap.
So, it’s the perfect device to build a cluster.

For a reasonable price, we can run a task faster by spreading it over 4 devices instead of only one.

In this tutorial, I’ll show you how to build your first Raspberry Pi cluster.
You can build a cluster with two nodes to start and add others later if needed.

Prepare your Raspberry Pi cluster

Make a plan

It’s always a good idea to think about what you are building.
I’m doing this exercise for you, with two Raspberry Pi:

  • A Raspberry Pi 4B 4 GB: the master node that will control everything
  • A Raspberry Pi 3B+: the second node, to increase global performance

As the preparation phase can be pretty long (especially if you are using many nodes), I’ll prep the 4B only.
Then I’ll copy the SD card to another one, to get a Raspberry Pi 3B+ almost ready without having to do the whole preparation phase on the 3B+.
Finally, there are extra steps for both Raspberry Pi devices to connect them together and run the first script.

If you have more than one node to add, repeat the same process for each node.

Prerequisites

To follow this tutorial, you’ll need:

  • 2 or more Raspberry Pi (any model, but I recommend the Raspberry Pi 4B)
  • 2 or more SD Cards (check my recommended product page if you need some)
  • A cheap 5-port gigabit switch to plug all Pi’s together
  • Power cables, or a power bank with 2 or more ports
  • A Network cable for each Pi (wireless is possible, but not optimal)
  • Optional: if you are serious about this project, this cluster case can be useful to stack Raspberry Pi and avoid a giant mess.
    The case will optimize your cabling, keep everything tidy, and also cool the nodes correctly. I highly recommend that kind of case if you’re keeping your cluster running frequently.

And for the software, I’ll explain everything in the following parts.

Note: it’s okay if you use different sized SD cards, but you need to install the master on the smallest SD card.
Otherwise, you’ll have an issue when flashing a 64 GB image to a 16 GB SD card :).

Prepare the Master

The first step in my scenario is to make the installation on one Raspberry Pi and then duplicate to the others.
Start with your powerful Raspberry Pi.

Basic installation

Like most projects, we will start with the Raspberry Pi OS installation (the system was previously named Raspbian).
Download Raspberry Pi OS Lite from the Raspberry Pi Foundation website.
The Desktop version is okay, but we don’t need a GUI for this project.

Install Raspberry Pi OS and boot for the first time (if you don’t know how to install it on a Raspberry Pi, follow my guide and come back later).

Then you’ll need to follow these additional steps:

  • Change a few settings with Raspi-config:
    • Run Raspi-config:
      sudo raspi-config
    • Change the pi user password in “System options > Password”.
      We’ll enable SSH, and it isn’t a good idea to use the default password with SSH running.
    • Enable SSH in “Interface options > SSH”.
    • Change the host name in “System options > Host name”.
      Choose something clear, like “Master”
  • Update your system:
    • As always, start any project with an up-to-date system to avoid any issue.
      Update the repository sources:
      sudo apt update
    • Upgrade all packages:
      sudo apt upgrade
  • Reboot to apply all changes:
    sudo reboot

The basic installation is now complete, so we can move on to the specific software for this project.

MPICH installation

What’s MPICH?

MPICH is the main tool we need to run a cluster.
MPICH is a free implementation of the MPI standard.
MPI stands for Message Passing Interface, a standard for exchanging messages between processes in parallel computing.

In short, this is what will allow us to run a script on several Raspberry Pi simultaneously.

MPICH installation

We are now ready to start the MPICH installation process.
If you want the latest version, you can download MPICH from the official website and compile it from the sources, but it’s also available in the Raspberry Pi OS repository.

So, this is the easiest way to install it:
sudo apt install mpich

Once done, test to ensure everything is working well.
To do this, run this command for example:
mpiexec -n 1 date

If you get the current date from the master, the MPI installation is completed.

Create a basic Python script

Ok, now we’ll create a basic Python script to test it with MPI.

  • Go to your home folder and create a script:
    cd /home/pi
    nano test.py
    If you are not used to Nano, you can read my guide here for more details.
  • Paste this line inside (or whatever you want):
    print("Hello")
  • Make sure your script is working directly with Python:
    python test.py
    If you kept my script, this should display “Hello”
  • Then test it by running it as 4 processes with MPI:
    mpiexec -n 4 python test.py
    As you can see, this should now display “Hello” four times: MPI started 4 copies of the Python script, which can be spread across the available processor cores.

This is nice, but we are not using the cluster yet; it’s just a way to run a script as several processes on one machine.
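
To check that mpiexec really starts separate copies, a slightly richer test script can print where each copy runs. This is only a minimal sketch using the Python standard library; the function name describe_process and the file name whoami.py are hypothetical.

```python
import os
import socket


def describe_process():
    """Return a short label naming the machine and process running this copy."""
    return "Hello from {} (pid {})".format(socket.gethostname(), os.getpid())


if __name__ == "__main__":
    # Each copy started by mpiexec prints its own line.
    print(describe_process())
```

Saved as whoami.py and started with mpiexec -n 4 python whoami.py, each line should show the same host name but a different process ID.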

MPI4PY installation

What’s MPI4PY

To go further with our cluster, we need a library that we can use in a script. The goal of this library is to let all the nodes communicate, so that our programs run efficiently.

Out of the box, MPI can be used directly from C, C++ and Fortran programs only.
But as Python is widely used on the Raspberry Pi, we’ll add Python capability to our cluster.

To do this, we need to install a Python library: MPI4PY.

MPI4PY prerequisites

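The MPI4PY installation process is easy, as it’s available with pip (the Python package manager).
But you need to install some Raspberry Pi OS packages before anything else:
sudo apt install python-pip python-dev libopenmpi-dev
These package names are for Python 2; on recent Raspberry Pi OS releases, use python3-pip and python3-dev instead.
Note that libopenmpi-dev installs Open MPI, a second MPI implementation, alongside MPICH.

That’s it, move to the installation process.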

MPI4PY installation

We can now install the MPI4PY library with pip:
sudo pip install mpi4py
It can take more or less time depending on your Raspberry Pi model. Be patient.

If this is working correctly, your master installation is ready.
MPI can now run Python scripts, and we can start preparing the nodes.
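
Before cloning the card, it can be worth confirming that the library really imports. The following is a small sketch, not an official test: the function name check_mpi4py is hypothetical, and it only relies on mpi4py.__version__, MPI.Get_processor_name() and the COMM_WORLD communicator.

```python
def check_mpi4py():
    """Return a one-line report on whether mpi4py can be imported and used."""
    try:
        import mpi4py
        from mpi4py import MPI
    except ImportError as exc:
        return "mpi4py is not usable: {}".format(exc)
    return "mpi4py {} on {} (rank {} of {})".format(
        mpi4py.__version__,
        MPI.Get_processor_name(),
        MPI.COMM_WORLD.Get_rank(),
        MPI.COMM_WORLD.Get_size(),
    )


if __name__ == "__main__":
    print(check_mpi4py())
```

Run directly with Python, it reports a single rank; an import error here is much easier to fix now than after the image has been copied to every node.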

Duplicate the master

The next step is to duplicate the master’s SD card into other cards – one for each node.
To do this, we’ll create an image from the SD card and flash it on the other cards.

If you are trying this with only two nodes, it might be faster to repeat the same procedure as on the master. In this case, you can skip this section.

Create the image

On Windows, you need a tool like Win32 Disk Imager.
Download and install it on your computer:

  • Start the program.
  • In the “Image file” field, choose a temporary directory and a filename such as “cluster_master.img”.
  • Then choose the Device letter corresponding to the SD card.
  • Finally, press the “Read” button to start the image creation.
    This process took about 15 minutes on my computer.
  • Once done, eject the master SD card and keep it safe.

On Linux, it should be something like:
sudo dd if=/dev/sdb of=cluster_master.img bs=4M status=progress
You need to make sure /dev/sdb is your SD card (lsblk lists the available drives). You can easily find help for this command if needed (or use man dd to see all options).

Create SD Card for nodes

Once the image is ready, you need to create the SD card for each node of your cluster:

  • Insert the new SD card into your computer.
  • In Win32 Disk Imager, select the image filename and the device letter.
  • Click on “Write” to create the same SD card.

If you prefer, you can use Etcher to do this.
I typically use Etcher, but we are already in Win32 Disk Imager, and the result is the same.
As a reminder, each new SD card must be at least as large as the master’s card.


Once again, for Linux and macOS users, you can use the dd command if you don’t want to install Etcher.

At the end of this step, you have one SD card for each node you want to use.
All the SD cards contain the same image from the master we created before.

Nodes configuration

Start all Raspberry Pi

  • Insert an SD card in each Raspberry Pi you want to use.
  • Start them all.

If you want to use Wi-Fi for one or more nodes, there is an extra step.
For example, in my case I have a Raspberry Pi Zero, and it was easier for me to connect it to my Wi-Fi network.

  • Plug a screen and keyboard into the Raspberry Pi you want to use Wi-Fi on.
  • Use Raspi-config to configure the Wi-Fi:
    • Use the following command:
      sudo raspi-config
    • Go in System options > Wireless LAN.
    • Follow the wizard to select your network (country, SSID and pass phrase).

Find all IP addresses

Once all the Raspberry Pi are started and plugged into the network, we need to get all their IP addresses to use them later:

  • Go back to the master node (directly or with SSH).
  • Install NMAP:
    sudo apt install nmap
    nmap is a free tool for network discovery (check the website here).
    We’ll use it to find all IP addresses.
  • Use this command to find all devices on your network with a host name containing “master”.
    For the moment, all the Raspberry Pi have the same host name:
    nmap -sn 192.168.1.0/24 | grep -i master
    Change the network subnet if you are using another one.
  • Each matching line of the output shows a host name followed by its IP address in parentheses.
  • In my case, the second node’s IP is 192.168.1.18.
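
If you have many nodes, picking the addresses out of the scan by eye gets tedious. Here is a rough sketch that extracts them from the text printed by nmap. The function name find_hosts and the sample text are made up for illustration, and the pattern assumes report lines of the form "Nmap scan report for name (address)".

```python
import re

# Matches lines such as "Nmap scan report for master.lan (192.168.1.18)".
REPORT_LINE = re.compile(r"^Nmap scan report for (\S+) \(([\d.]+)\)$")

# Made-up sample of what a ping scan can print; real output will differ.
SAMPLE_OUTPUT = """\
Nmap scan report for router.lan (192.168.1.1)
Host is up (0.0010s latency).
Nmap scan report for master.lan (192.168.1.18)
Host is up (0.00042s latency).
Nmap scan report for 192.168.1.25
Host is up.
"""


def find_hosts(nmap_output, name_part):
    """Return (host name, IP) pairs whose host name contains name_part."""
    hosts = []
    for line in nmap_output.splitlines():
        match = REPORT_LINE.match(line.strip())
        # Hosts reported without a name have no parentheses and are skipped.
        if match and name_part.lower() in match.group(1).lower():
            hosts.append((match.group(1), match.group(2)))
    return hosts


if __name__ == "__main__":
    for name, address in find_hosts(SAMPLE_OUTPUT, "master"):
        print(name, address)
```

The comparison ignores case, so it finds the nodes whether the host name was set to “Master” or “master”.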

You should now have the IP addresses of all the nodes.
If you don’t know the master’s address, use this command:
sudo ifconfig

In the output, the IP address is on the second line of the interface block, after the “inet” keyword (192.168.1.200 in my case).
The last step is to note these IP addresses in a text file on your Master node:

  • Create a new file in your home folder:
    cd /home/pi
    nano nodes_ips
  • In this file, add a node IP on each line (and only the IP).
    For example:
    192.168.1.15
    192.168.1.16
    192.168.1.17
    192.168.1.18
  • That’s all for this part.
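
A typo in this file only shows up later as a confusing MPI error, so a quick check can save time. This is a minimal sketch using the standard ipaddress module; the function name parse_hostfile is hypothetical, and it assumes the file holds one IPv4 address per line, as described above.

```python
import ipaddress


def parse_hostfile(text):
    """Return the IPv4 addresses listed one per line, skipping blank lines.

    Raises ValueError naming the first line that is not a valid address.
    """
    addresses = []
    for number, line in enumerate(text.splitlines(), start=1):
        entry = line.strip()
        if not entry:
            continue
        try:
            addresses.append(str(ipaddress.IPv4Address(entry)))
        except ipaddress.AddressValueError:
            raise ValueError(
                "line {}: {!r} is not an IPv4 address".format(number, entry)
            )
    return addresses


if __name__ == "__main__":
    print(parse_hostfile("192.168.1.15\n192.168.1.16\n"))
```

To check the real file, pass its content to the function, for example parse_hostfile(open("/home/pi/nodes_ips").read()).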

Change the nodes host names

We’ll now change the host name on the new nodes to have a different one for each:

  • From the master node, connect to the first one with SSH:
    ssh pi@<node_ip>
    Replace <node_ip> with the node’s address (192.168.1.18 in my case), answer “yes” to the question and enter the pi password.
  • Go into Raspi-config:
    • Use this command to access the tool:
      sudo raspi-config
    • Go in System options > Host name.
    • Set a new host name for this node, for example “Node1”.
  • Exit raspi-config and exit this node with:
    exit

Repeat these steps for each node you want to add in the cluster.

Do the SSH keys exchange

The last step is to allow the master to connect to each node via SSH without a password.
To do this, you need to create an SSH key pair on the master and transfer the public key to all nodes.

  • On the master, create the SSH key with:
    ssh-keygen -t rsa
    Accept the default values (default path and no password).
  • This tool generates two keys in the /home/pi/.ssh folder:
    • id_rsa: your private key, keep it here
    • id_rsa.pub: the public key, you need to send it to peers you want to access without password
  • Transfer the public key to all nodes:
    scp /home/pi/.ssh/id_rsa.pub pi@<node_ip>:/home/pi/master.pub
    Do this for each node you want to use, replacing <node_ip> with its address.
  • Then, go to each node and add the key to the authorized_keys file.
    This file lists the public keys allowed to access the system via SSH without a password:
    ssh pi@<node_ip>
    mkdir -p .ssh
    cat master.pub >> .ssh/authorized_keys
    exit
    The mkdir -p command creates the .ssh folder if it doesn’t exist yet.
    Do this for each node.
  • Now, you should be able to connect to each node without a password.
    You can try it with:
    ssh pi@<node_ip>
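
With more than a couple of nodes, testing each login by hand is slow. Below is a small sketch that tries every node in turn. The function names are hypothetical; it relies on the OpenSSH options BatchMode=yes (fail instead of asking for a password) and ConnectTimeout, and assumes the pi user on every node.

```python
import subprocess


def ssh_check_command(host, user="pi"):
    """Build an ssh command that fails instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",
        "-o", "ConnectTimeout=5",
        "{}@{}".format(user, host),
        "hostname",
    ]


def can_login_without_password(host, user="pi"):
    """Return True if the key-based login to host succeeds."""
    result = subprocess.run(
        ssh_check_command(host, user),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def check_nodes(hosts):
    """Print OK or FAILED for each host in the list."""
    for host in hosts:
        status = "OK" if can_login_without_password(host) else "FAILED"
        print(host, status)
```

Calling check_nodes with the addresses stored in nodes_ips shows at a glance which node still needs its key installed.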

That’s it, your cluster is ready. We’ll now test it.

Cluster usage

The cluster is now available, and we’ll use MPI to run commands simultaneously on each node.
As we already saw, MPI allows you to run basic commands and scripts through the cluster.

Basic command

The first thing we can try is to run the same command on each node.
Preferably something that doesn’t return the same thing :).

For example:
mpiexec -hostfile nodes_ips -n 8 hostname

nodes_ips is the file we created before with all IP addresses inside.
And “hostname” is the command we want to run on each node (more about the hostname command here).
8 is the number of processes to start; change it to the number of cores available in your cluster (Raspberry Pi 4B and 3B+ have 4 cores each, so I test with 8).

As a result, you’ll get one line per process, each showing the host name of the node it ran on.
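
To see how the processes were spread over the nodes, you can count how often each host name appears in that output. This is a quick sketch; the function name processes_per_host is hypothetical, and the exact distribution depends on your MPI implementation and hostfile.

```python
from collections import Counter


def processes_per_host(mpiexec_output):
    """Count how many output lines (one per process) came from each host."""
    return Counter(
        line.strip() for line in mpiexec_output.splitlines() if line.strip()
    )


if __name__ == "__main__":
    # Made-up output using the host names chosen earlier in this tutorial.
    example = "Master\nNode1\nMaster\nNode1\n"
    for host, count in sorted(processes_per_host(example).items()):
        print(host, count)
```

A node that is missing from the count usually points to a wrong address in nodes_ips or a failed SSH login.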

Python script

Test script

If you followed this tutorial entirely, you should already have a test.py script in the home folder.
You can run it on each node with the same command:

mpiexec -hostfile nodes_ips -n 8 python test.py

This will display “Hello” eight times, once for each process started across the nodes.
We are still not using MPI4PY, but we’ll get to this now.

A new script

Remember that the SD cards were cloned before any new script existed, so each new script must be copied to all nodes.
MPI starts the script on each node, but it doesn’t copy the code automatically.

To do this, follow this short procedure:

  • Create the script on the master node.
  • Make sure it’s working as expected.
  • Then transfer this script to all nodes with scp:
    scp /home/pi/myscript.py pi@<node_ip>:/home/pi/
    It’s important to have the same script on each node, with the same path.
  • Then you can run your script with MPI as explained before.
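
Repeating the scp command for every node quickly becomes boring, so the copy step can be looped. The following is only a sketch under my assumptions: the function names are hypothetical, every node uses the pi user, and passwordless SSH is already set up.

```python
import subprocess


def scp_commands(script_path, hosts, user="pi"):
    """Build one scp command per host, keeping the same path on the node."""
    return [
        ["scp", script_path, "{}@{}:{}".format(user, host, script_path)]
        for host in hosts
    ]


def copy_to_nodes(script_path, hosts, user="pi"):
    """Copy the script to every host, stopping at the first failure."""
    for command in scp_commands(script_path, hosts, user):
        subprocess.run(command, check=True)
```

For example, copy_to_nodes("/home/pi/myscript.py", ["192.168.1.18"]) sends the script to my second node. With check=True, a failed copy raises an error instead of going unnoticed.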

Go further with Python

As I told you, we didn’t add MPI4PY just to run basic Python scripts several times instead of once.
MPI4PY is a Python library you can import in your scripts to communicate between processes across your cluster.

Here is a quick example:

#!/usr/bin/env python

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.rank

if rank == 0:
    data = {'a':1,'b':2,'c':3}
else:
    data = None

data = comm.bcast(data, root=0)
print('rank', rank, data)

The goal of this script is to send data from one process to all the others.
In this script, data is defined only in the first process (rank 0).
Then we share this data with all running instances with the broadcast function (comm.bcast).
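
Broadcasting sends the same data to everyone; to actually share the work, each process needs its own piece. Here is a minimal sketch of that idea using comm.scatter and comm.gather from mpi4py. The helper split_evenly and the sum-of-squares task are my own illustration, not part of the library.

```python
def split_evenly(items, parts):
    """Split items into `parts` lists whose lengths differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for index in range(parts):
        end = start + base + (1 if index < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def main():
    try:
        from mpi4py import MPI
    except ImportError:
        print("mpi4py is not installed on this machine")
        return
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    # Only rank 0 prepares the work: one chunk per process.
    chunks = split_evenly(list(range(1, 101)), size) if rank == 0 else None

    # Each process receives its own chunk and computes a partial result.
    my_numbers = comm.scatter(chunks, root=0)
    partial = sum(n * n for n in my_numbers)

    # Rank 0 collects the partial results and combines them.
    partials = comm.gather(partial, root=0)
    if rank == 0:
        print("sum of squares:", sum(partials))


if __name__ == "__main__":
    main()
```

The total is the same whatever the number of processes; only the size of each chunk changes.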

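Save the script under a name other than mpi4py.py (a file with that name would shadow the library on import), for example bcast_test.py, and copy it to all nodes.
Here is the command to run to try this, where <processes> is the number of processes to start:
mpirun.openmpi -np <processes> -machinefile nodes_ips python bcast_test.py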

When you run this script, every rank on every node prints its rank number followed by the same data.

It’s just an example to show you that you can add more functions in your Python script to take advantage of your cluster.
I’m not an expert on this.
You can find more information here.

Related questions

Can I add more nodes in my cluster now? You can add more nodes to your existing cluster at any time (that’s what they do with supercomputers). You just need to create a new SD card, follow the node configuration steps for the new node and add the new IP address in the nodes_ips file.

The IP addresses are changing every day, what can I do? Yes, it’s a problem. For the test I didn’t do this step, but if you want to keep your cluster you need to do this. Depending on your network, you can either set a reservation in your DHCP server (so each Pi will always get the same IP on boot), or manually set a static IP address in the network configuration of each Pi (I explain how to do this at the end of this article).

What would I really use a cluster for? In this tutorial, it was mainly the technology and the installation process that interested me, not the possibilities that are now available with this cluster. This is another topic, and I can’t fit everything in one article. If you want to go further, you can find more projects about clusters on Hackaday.


Conclusion

That’s it, you know how to build your Raspberry Pi cluster, from two nodes to as many as you want :).

I really liked writing this tutorial for you.
It’s interesting to have an overview of how supercomputers work.
And the technology seems to be stable as I had no issues while creating my cluster (and it’s rare in computing ^^). I hope you’ll like that too.

By the way, you can check my related article here about what can be the real usage for a Raspberry Pi cluster.

If you have any questions or experiences to share, leave a comment in the community.
I would like to know what you do after these first steps in the supercomputer world.


How to Build Your First Raspberry Pi Cluster

Building a Raspberry Pi cluster is a great way to learn about computer networking and distributed computing. With a cluster, you can run multiple Raspberry Pi computers together to create a powerful computing system. Here’s how to build your first Raspberry Pi cluster.

Step 1: Gather Your Materials

Before you can build your Raspberry Pi cluster, you’ll need to gather the necessary materials. You’ll need at least two Raspberry Pi computers, a network switch, and a power supply. You’ll also need an Ethernet cable for each Raspberry Pi, as well as a microSD card for each Raspberry Pi.

Step 2: Set Up the Network

Once you have all the necessary materials, you’ll need to set up the network. Connect the network switch to your router using an Ethernet cable. Then, connect each Raspberry Pi to the network switch using an Ethernet cable. Finally, connect the power supply to the network switch.

Step 3: Install the Operating System

Next, you’ll need to install the operating system on each Raspberry Pi. You can use the Raspberry Pi OS, which is a Linux-based operating system. Download the Raspberry Pi OS image file and write it to the microSD card using a program like Etcher. Then, insert the microSD card into each Raspberry Pi and power it on.

Step 4: Configure the Network

Once the operating system is installed, you’ll need to configure the network. Each Raspberry Pi should be assigned a unique IP address. You can use a program like Advanced IP Scanner to scan the network and find the IP addresses of each Raspberry Pi. Once you have the IP addresses, you can use SSH to connect to each Raspberry Pi and configure the network.

Step 5: Install Software

Finally, you’ll need to install the necessary software on each Raspberry Pi. You can use a program like Ansible to automate the installation process. Once the software is installed, you can start using your Raspberry Pi cluster.

Conclusion

Building a Raspberry Pi cluster is a great way to learn about computer networking and distributed computing. With a cluster, you can run multiple Raspberry Pi computers together to create a powerful computing system. Follow the steps outlined above to build your first Raspberry Pi cluster.

Jaspreet Singh Ghuman

Jassweb.com/
