CentOS/CloudLinux 7 on Xen hanging at startup

Previously I posted on how to recover from failing to boot a new kernel. However, the question of why a given kernel decides to hang still stands. The platform I am working on is OnApp 5 LTS, running Xen hypervisors on cloud boot with integrated storage. Something has to be causing this. CloudLinux 7 and CentOS 7 both seem affected.

Mostly the server will start up and fall over with a Segmentation Fault before I manage to get a console open and screenshot it – occasionally, thankfully, it will just hang. It hangs with the following (courtesy of the nice HTML5 console window you can only open when the VM is running – so BE QUICK!):

Having raised this with OnApp for review, and with CloudLinux for a heads up – I await feedback.

The previous kernel (vulnerable to Meltdown) was fine. The new kernel (not vulnerable), 3.10.0-714.10.2.lve1.4.79.el7, bombs.

I am seeing the same behaviour with the vanilla CentOS kernel 3.10.0-693.11.6.el7.
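As a guard before rebooting anything else, the two releases named above can be checked for in a few lines of shell. This is only a sketch: the function name is made up for illustration, the list contains nothing beyond the two versions mentioned in this post, and a prefix match is assumed so that an architecture suffix such as .x86_64 on the release string still matches.

```shell
#!/bin/sh
# Return 0 if the given kernel release starts with one of the two versions
# reported as failing in this post, 1 otherwise.
# Hypothetical helper: the list is just what is named above, not a
# complete or authoritative list of affected kernels.
is_reported_bad_kernel() {
    case "$1" in
        3.10.0-714.10.2.lve1.4.79.el7*) return 0 ;;  # CloudLinux kernel from this post
        3.10.0-693.11.6.el7*)           return 0 ;;  # vanilla CentOS kernel from this post
        *)                              return 1 ;;
    esac
}

# Usage (commented out so the block has no side effects):
# if is_reported_bad_kernel "$(uname -r)"; then
#     echo "running one of the kernels reported as failing on Xen PV"
# fi
```

The same function can be fed any release string, so it works just as well against whatever kernel is about to be booted as against the running one.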

If you are seeing similar I would be interested to hear from you.

[Update Tuesday 9th January 2018]

OnApp have come back suggesting that this is a known issue with that kernel. Regretfully that does not render for me here – but hey – I found plenty of what I imagine are similar reports out there:

https://lists.centos.org/pipermail/centos-virt/2018-January/005721.html

Their response was one of ‘hopefully a fix will be out soon’.

I had sent it over to CloudLinux in a “so you are aware” as opposed to ‘please fix’ – only to find they picked it up and ran with it with their script to gather information:

curl -s https://www.cloudlinux.com/clinfo/cldoctor.sh |bash

to be run on the hypervisor. It was then passed on to the dev team. Sure, I may not hear again, but it’s there to be drawn on and used if needs be, rather than just ‘yup’ and ‘someone should release a fix soon’. I am grumpy like that.

I WILL UPDATE THIS AS I FIND OUT MORE

 

[Update Thursday 8th February 2018]

While the answers and questions went back and forth a number of times it boils down to the following:

“Having clarified this information from our developers I can say that yes it will not work on XEN-PV, but will work on HVM.  As for the kernel currently there are some works there and we can’t provide an ETA , but I can say that it will be available soon.” — Cloud Linux

CentOS on OnApp’s implementation of Xen runs in PV mode, so as of right now those are dead in the water. This is suboptimal. However, we have a standpoint on that now: Xen yes, but only in HVM mode, not PV. OnApp, CentOS, PV mode. Fail. Upside – this gives an element of implied protection from Meltdown – despite what the test script suggests.

PV is paravirtualization: the guest kernel knows it is virtualized and relies on services provided by the host, which makes for a lighter-weight virtual machine. Drives and partitions will present as /dev/xvda1, for example, as opposed to /dev/sda1.

HVM is full virtualization: it relies on hardware support (CPU virtualization features and so on), and drives are presented as /dev/sda, for example.
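The device naming difference can be turned into a rough check from inside a guest. What follows is a minimal sketch, not a definitive test: the function name and the labels it prints are invented for illustration, and it goes on the device name alone. An HVM guest with paravirtual disk drivers loaded may also show xvd* names, so treat the result as a hint only.

```shell
#!/bin/sh
# Guess how a disk is presented to the guest from its device name alone.
# Heuristic only: xvd* is the name used by the Xen paravirtual block
# driver, sd*/hd* are the SCSI/IDE-style names mentioned above.
# Hypothetical helper; the output labels are made up for this example.
classify_disk() {
    name=${1#/dev/}          # accept "/dev/xvda1" or plain "xvda1"
    case "$name" in
        xvd*)     echo "xen-pv-disk" ;;
        sd*|hd*)  echo "emulated-disk" ;;
        *)        echo "unknown" ;;
    esac
}

# Usage (commented out so the block has no side effects):
# classify the device backing the root filesystem.
# classify_disk "$(findmnt -n -o SOURCE /)"
```

It also shows why switching modes hurts: anything in the guest that refers to a disk by one of these names will be looking for a device that no longer exists under the other mode.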

There is no chopping and changing between the two modes: the drives will just plain not be there under the names the guest expects, so it will not work. Pain, failure, stuff. However, understanding is half the problem / solution / matter.

So… about that… bugger.

I may be looking at a migration away then. Much sadness. Wow.
