It’s been a long time since I wrote here. The earlier content you see here actually came from my previous blogs at veejoe-dot-com-dot-au and later viccross-dot-com, both of which for various reasons are… no longer available. So why relaunch? Well, mostly because I still have things to say! I have not decided exactly what yet, but let’s see where it goes.
To get things started, and to create a bridge to the earlier content, I’d like to revisit the last post. I wrote about my “Large-Scale Cloning Grid experiment” and some programming choices I made back then. The first thing that might surprise you is that, four years after that post and nearly 10 years after my first tilts at the experiment, it still lives! It still runs on z/VM, which is still the only hypervisor that can support it, and it is still driven by a Perl IRC bot. I never got around to fixing some of the scaling issues that were exposed in the transition to PoCo::JobQueue, but I have reused the HTML status-graph functions in another project for work.
The experiment started out on a z/VM system on a z9 BC in Brisbane. At first it used SCSI disk rather than ECKD, because that’s where I had the greatest amount of disk available. I was pushing very high page I/O rates through the IBM SVC we had in our lab, taking advantage of an ability that was unique to SCSI on z/VM at the time: multipath I/O for paging. Even after I had ECKD available in Brisbane I left the Grid on SCSI; my ECKD came from a modest little IBM DS6800, which I didn’t think would withstand the I/O rate of 4000 Linux guests fighting over 16GB of z/VM memory. Only with the move of the installation to Melbourne, where we eventually had DS8870 storage, did I rebuild the system as a two-member SSI cluster on z/VM V6.3.
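For those who haven’t run paging on SCSI: z/VM presents FCP-attached SCSI LUNs as emulated FBA devices through EDEVICE definitions, and a single EDEVICE can be given several FCP paths, which is where the multipath paging came from. Something like the following SYSTEM CONFIG statement is the general idea; the device numbers, WWPNs and LUNs here are invented for illustration, so check the EDEVICE syntax in CP Planning and Administration before copying it:

   EDEVICE 0500 TYPE FBA ATTR SCSI ,
      FCP_DEV B100 WWPN 5005076801401234 LUN 0001000000000000 ,
      FCP_DEV B200 WWPN 5005076801311234 LUN 0001000000000000

Each emulated volume then gets formatted and allocated as PAGE space like any other FBA paging device, and CP drives the paging I/O down whichever paths are available.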
After the release of z/VM V7.1, however, I thought it was time to update. I chose to rebuild and migrate rather than use Upgrade-In-Place [1]. I copied the directory entries for the management guests and reinstalled the products. The critical components of the system were the two Discontiguous Saved Segment (DCSS) spool files that held the Linux filesystems for the grid “drones”. These I saved out from the V6.3 systems using the DCSSBKUP tool, ready for DCSSRSAV on z/VM V7.1 to bring them back.
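If you haven’t met them, DCSSBKUP unloads a DCSS into a CMS file, and DCSSRSAV defines and saves the segment again from that file on the target system. The save step looked roughly like the following; the segment and file names are placeholders (my real names were different), and the exact operand order is worth checking in the CMS Commands and Utilities Reference:

   DCSSBKUP LNXROOT LNXROOT DCSSBKUP A
   DCSSBKUP LNXVAR  LNXVAR  DCSSBKUP A

The resulting files travel to the new system like any other CMS files, which is what makes this pair so handy for migrations.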
The dump of the DCSSes went fine. The restore of the smaller read-write DCSS (used by Linux for /var) also went perfectly. The restore of the root filesystem DCSS, however, failed with a DMS109S error, “Insufficient free storage available”. I focussed on the word “insufficient”, and set about trying to allocate more memory to my z/VM guest for DCSSRSAV to run, but this was not successful.
After much time and research, I discovered that the word I should have been focussing on was “free”! The problem seemed to be that something in my CMS virtual machine was occupying storage just under the 2GB bar, exactly where the top of my DCSS was defined to reside. I tried ZCMS, the 64-bit-enabled version of CMS, but it too put something in the same location in memory.
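The clue came from comparing the segment definition with what CMS was doing. CP QUERY NSS with the MAP option shows where each saved segment is defined to live (the segment name here is a placeholder):

   CP QUERY NSS NAME LNXROOT MAP

The BEGPAG and ENDPAG columns in the output give the page range, and in my case ENDPAG sat right up against the 2GB line, in the same territory CMS was using.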
So I decided to shuffle the segments and move the root filesystem DCSS further below the bar, far enough down to clear whatever was sitting up high in CMS storage. DCSSRSAV supports this with its NEWADDR option, for when a saved segment is being restored into a segment definition with a different range from the one recorded in the save file. This worked, and DCSSRSAV restored the DCSS successfully. However, when I tried to IPL a guest using the DCSS filesystem, it didn’t work; perhaps the way the Linux filesystem is kept in the DCSS bakes in actual memory locations, so it can’t be relocated the way other DCSSes can. I haven’t been back to fix that problem; since then we’ve had a few other issues around the environment the grid is hosted on.
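The relocation attempt amounted to defining the segment at the new range first and then restoring into it. Roughly (page ranges are in hexadecimal 4KB pages, and all names and numbers here are illustrative rather than from my notes):

   CP DEFSEG LNXROOT 60000-77FFF SR
   DCSSRSAV LNXROOT LNXROOT DCSSBKUP A (NEWADDR

Pages 60000-77FFF put the segment between 1.5GB and roughly 1.9GB, comfortably clear of whatever CMS had parked just under the bar.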
I originally created the Experiment as a “proof-of-capability”, to demonstrate that z/VM is indeed capable of some extremely large-scale virtualisation. When it was running on SCSI, I successfully had over 4000 Linux guests running on a z/VM system with 16GB of central storage and 2GB of expanded storage. I had around two dozen 10GB page devices, and the paging I/O rate was pretty eye-watering, but it worked. Today, however, running large numbers of virtual guests has given way to running even larger numbers of containers with microservices.
So when I hit the issues with running the Experiment on a z/VM V7.1 base, I gave serious thought to retiring it. Arguably the sun has set on large virtual machine grids; the world has moved on, and as my colleague Rob van der Heij wrapped up in his Penguins on a Pin Head, “it does not provide practical value”. But, being of mainframe background and one who never deletes anything, I brought up the previous z/VM V6.3 cluster just to see if the Experiment would still work on our z13s.
It did!
So now I have a path to (possibly) fixing it under z/VM V7.1: on the z/VM V6.3 system I will define a new DCSS in the location that was successful on 7.1 and copy the contents of the original into it. Because the segment ranges overlap, I’ll need two different guests to do it (one attached to each DCSS), or maybe I’ll dump-disconnect-connect-restore; a rough sketch of the two-guest idea follows. More on that in a future post.
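From the Linux side, the dcssblk driver makes each DCSS appear as a block device, so the two-guest copy could go something like this. Everything here is a sketch: the segment names are placeholders, /mnt/xfer stands in for any disk both guests can reach, and the new DCSS is assumed to have been defined with a writable segment type:

   # Guest A, whose address space can map the old range:
   echo LNXROOT > /sys/devices/dcssblk/add
   dd if=/dev/dcssblk0 of=/mnt/xfer/rootfs.img bs=1M

   # Guest B, whose address space can map the new range:
   echo LNXROOT2 > /sys/devices/dcssblk/add
   echo 0 > /sys/devices/dcssblk/LNXROOT2/shared   # switch to an exclusive-writable mapping
   dd if=/mnt/xfer/rootfs.img of=/dev/dcssblk0 bs=1M
   echo 1 > /sys/devices/dcssblk/LNXROOT2/save     # re-save the DCSS contents when the device closes

The hope is that saving the contents fresh at the new address avoids whatever location dependency broke the NEWADDR restore.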
As mentioned, I’m not sure where this next blog adventure will lead, but I’m looking forward to the journey!
----
[1] Rather, I was forced to, having chosen “Yes to SMAPI” when I installed the z/VM V6.3 system. I read that this option is gone from z/VM V7.2…