
Fix merge conflict.

main
Jason J. Gullickson 4 months ago
parent
commit
1fbf93732a
  1. 14
      journal.md
  2. 26
      software/README.md

14
journal.md

@ -1033,6 +1033,20 @@ One really important step that I forgot was to copy the HPL.dat to each node aft
Another problem was name resolution. As much as the zeroconf stuff works for most tools, for some reason the `.local` domain gets dropped when HPL (or MPI?) does certain things, which causes communication breakdowns. After switching to IP addresses for running HPL, this went away. This is another thing I'd like to simplify in the future, but it is good enough for now.
Adding the last node today:
```
slot  name        DHCP ip      static ip
0     rain-psp-0  10.1.10.133  10.1.10.50
1     rain-psp-1  10.1.10.184  10.1.10.51
2     rain-psp-2  10.1.10.169  10.1.10.52
3     rain-psp-3  10.1.10.188  10.1.10.53
4     rain-psp-4  10.1.10.243  10.1.10.54
5     rain-psp-5  10.1.10.241  10.1.10.55
6     rain-psp-6  10.1.10.145  10.1.10.56
```
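Since HPL is now run against IP addresses rather than `.local` names, the machinefile can be generated from the static IP column instead of typed by hand. This is only a rough sketch: the `make_machinefile` helper is hypothetical, and `slots=4` assumes one MPI slot per core on a quad-core module, which may not match the actual machinefile.

```shell
#!/bin/sh
# Print one Open MPI hostfile line per node, using static IPs instead of
# .local names. Arguments: network prefix, first host octet, node count,
# slots per node (4 is an assumption, not taken from the real machinefile).
make_machinefile() {
    prefix=$1; first=$2; count=$3; slots=$4
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "$prefix.$((first + i)) slots=$slots"
        i=$((i + 1))
    done
}

# Slots 0-6 in the table map to 10.1.10.50 through 10.1.10.56.
make_machinefile 10.1.10 50 7 4
```

Redirecting the output to `machinefile` and passing it to `mpirun --hostfile machinefile` should then keep name resolution out of the picture entirely.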
## 01072022
Hardware work is paused: as I pushed HPL beyond 10 Gflops, the power connector on the Clusterboard began to malfunction (I guess it can't handle the sustained current at full power?). I'm going to take this as an opportunity to wire up the new power switches, etc., so I'll need to put this off until I can spend some time in "hardware mode".

26
software/README.md

@ -10,6 +10,32 @@ Firmware for custom hardware may live here as well, but I'm not 100% sure about
The operating system currently consists of Debian-based distros (Raspbian for the Rpi0W front-end, Armbian for the rest of the cluster). Right now it looks like the user environment will be based on [IPython](http://ipython.org/documentation.html) as this provides both a rich terminal-based user interface as well as tooling for parallel computing.
### New node setup
1. Burn Armbian image to SD
2. Boot and locate pine64so's IP address using DHCP server
3. `ssh root@new.node.ip.addr` (password: 1234)
4. Complete Armbian setup (root password, new user, etc.)
5. Use `armbian-config` to configure
    + Personal -> Hostname -> set hostname (e.g. rain-psp-0.local)
    + System -> Avahi -> enable
    + System -> CPU -> set min to min, max to max and mode to `ondemand`
    + Set static IP (do this **last** because it will break the ssh connection)
6. `ssh-copy-id -i ~/.ssh/id_rsa.pub user@new-host-name.local`
7. `ssh new-host-name.local`
8. Edit `/etc/default/armbian-ramlog` and increase `SIZE=50` to `SIZE=100`
9. Install packages
    + `sudo apt update`
    + `sudo apt install -y libnss-mdns libnss-mymachines openmpi-bin openmpi-common libopenmpi-dev libatlas-base-dev gfortran`
10. Install HPL
    + From new node:
        + `mkdir -p ~/hpl/bin/aarch64-linux`
    + From head node:
        + `cd ~/hpl/bin/aarch64-linux`
        + `scp * new-host-name.local:/home/jason/hpl/bin/aarch64-linux/`
11. Add new node to `machinefile`
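Two of the steps above, the ramlog size change and the machinefile update, are easy to get wrong by hand when repeating this for every node, so they could be scripted. The following is a minimal sketch, not the project's actual tooling: both function names are hypothetical, and the config path is passed in as an argument so the edit can be tried on a copy first.

```shell
#!/bin/sh
# Raise the ramlog size. On a node, the argument would be
# /etc/default/armbian-ramlog (run with sudo). A .bak copy is kept.
bump_ramlog_size() {
    conf=$1
    # Only rewrites a line that starts with SIZE=50; any suffix is preserved.
    sed -i.bak 's/^SIZE=50/SIZE=100/' "$conf"
}

# Append a node entry to the machinefile unless that exact line already exists,
# so re-running the setup does not create duplicates.
add_to_machinefile() {
    node=$1; file=$2
    touch "$file"
    grep -q -x -F "$node" "$file" || echo "$node" >> "$file"
}
```

For example, `add_to_machinefile 10.1.10.56 ~/machinefile` on the head node would register the last node by its static IP; the location of `machinefile` here is an assumption.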
### References
