Hi,
As you may or may not be aware, it is becoming increasingly popular
for large email providers to use DMARC settings with failure
policies of "quarantine" or even "reject".
Domain-based Message Authentication, Reporting and Conformance
(DMARC) is a mechanism which builds upon SPF and DKIM so that
receiving sites can check that email showing a given domain name in
the From: address really did come from mail relays authorised to
send for that domain.
As Mailman takes your posts and modifies them (the subject line is
altered and list footers may be added) yet retains your specified
From address, this causes a DMARC failure and for receiving sites is
indistinguishable from a forged email.
There are people posting to these lists right now from domains which
have strict DMARC failure policies. As a result whenever they post
to the BitFolk lists, many recipient sites do as instructed and
reject or quarantine their email. Furthermore a rejection causes the
list software to consider the *recipient* as having bounced the
email, and if this happens several days in a row the recipient
address will be considered undeliverable and will be automatically
unsubscribed.
In short, posters whose domains have strict DMARC failure policies
can cause other subscribers to be unsubscribed from the list.
There is no good way to avoid this for typical mailing lists. I've
decided that one of the least bad ways is to enable the setting that
rewrites the From: address, but only for posts from domains which
have strict DMARC failure policies.
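For the curious, you can see whether a domain publishes a strict
policy by looking up its DMARC record in DNS. A sketch (yahoo.com is
just a well-known example of a domain with p=reject):

```shell
# Look up a domain's DMARC record; the p= tag is the failure policy
# (none, quarantine or reject).
dig +short TXT _dmarc.yahoo.com

# Extracting the policy tag from a record looks like this:
echo '"v=DMARC1; p=reject; pct=100;"' | grep -o 'p=[a-z]*' | head -n 1
# prints: p=reject
```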
So, if you see something like:
From: Joanne Bloggs via users <users@lists.bitfolk.com>
instead of the more usual:
From: Joanne Bloggs <jbloggs@example.com>
then this is the reason why. The purpose is to have the messages
that we know will break DMARC come instead from the
lists.bitfolk.com domain.
Cheers,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting
Hi,
TL;DR version: We're not patching and rebooting for this because
it's best fixed in your guests. If you had bare metal you'd have no
choice and would be doing that anyway.
Long version:
There is a Xen Security Advisory today which is yet more fallout
from the same class of CPU security issues as "Spectre" and
"Meltdown":
<https://xenbits.xen.org/xsa/advisory-263.html>
Usually there is a 2 week embargo on these things but as I
understand it there is no embargo this time because the discoverers
did not agree to one.
This issue is a hardware / design flaw which affects almost every
CPU in the world (all Intel, many AMD, some ARM). The potential
impact is unprivileged processes being able to read arbitrary
memory.
The Xen developers do not believe that it is possible for this to go
between guests nor between guest and hypervisor, so this restricts
the issue to processes within your guest.
As this also affects bare metal and almost every other configuration
of Linux, it will be addressed in software by your operating system
vendors by means of package update.
Some of the software fixes require updated firmware, and these
firmware updates have already been applied so that is ready for you
when you need it.
The patches supplied by Xen for this XSA do allow us to fix the
issue at a higher level in the hypervisor, thus not requiring any
changes in your VPSes, but at a cost of having to schedule another
round of reboots.
At this stage I am not inclined to enforce a reboot for this; I
think it's best fixed in the guests.
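If you want to see what your guest's kernel thinks its mitigation
status is, recent kernels expose it in sysfs (the path below exists
on Linux 4.15 and later; older kernels simply won't have it):

```shell
# Print every CPU vulnerability/mitigation the kernel knows about in
# this guest. Requires Linux 4.15+; the directory is absent on older
# kernels, in which case the fallback message is printed instead.
grep . /sys/devices/system/cpu/vulnerabilities/* 2>/dev/null \
    || echo "kernel too old to report mitigation status"
```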
In the near future we will deploy one new host that has this bug
addressed at the hypervisor level and anyone who for whatever reason
cannot update their VPS can have it moved to that host.
This could be subject to change if there are further discoveries
about this particular bug, and I also doubt we have heard the last
of security bugs in this class. There could well be another XSA
along soon that requires reboot, in which case we may end up turning
this mitigation on as well at that time.
Cheers,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting
Hello,
Sadly today we've been made aware of another security issue that's
going to have to be patched, which means once again we have to
reboot everything.
It is a pity they couldn't have held the last one back for a week in
which case we'd have been able to roll the two patches together, but
they have a schedule for disclosure that they have to stick to.
So, this set of reboots is most likely to take place in the early
hours of the morning on 5/6/7 May. In the next couple of days we
will send out direct mails to all customers to let you know exactly
when your one hour maintenance window will be.
We should be able to use suspend/restore on this one so those who
have indicated that they want that should see that happen. You can
set that from:
<https://panel.bitfolk.com/account/config/>
For the benefit of new customers joining us what all this means is
that some time during the one hour maintenance window that you'll be
informed of individually by direct email, we will:
- Suspend to disk all the VPSes that have opted for that
- Shut down the rest cleanly
- Reboot the server into the patched hypervisor
- Restore the suspended VPSes
- Boot the rest
Although the maintenance window for each server is one hour long,
the work generally takes about 15 minutes.
Cheers,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting
Hi,
Unfortunately a flaw has been discovered in the Xen hypervisor that
has security implications, so we're going to have to apply a fix and
reboot everything. This is likely to take place between 21 and 23
April. In the next couple of days we will send out direct mails to
all customers to let you know exactly when your one hour maintenance
window will be.
We should be able to use suspend/restore on this one so those who
have indicated that they want that should see that happen. You can
set that from:
<https://panel.bitfolk.com/account/config/>
Cheers,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting
Hi,
I've added an Ubuntu 18.04 LTS installer to our Xen Shell, so it's
now available for self-install. More info about self-install:
<https://tools.bitfolk.com/wiki/Using_the_self-serve_net_installer>
So, the command is "install ubuntu_bionic". If you don't see it,
make sure you are running version v1.48bitfolk46 of the Xen Shell; a
Xen Shell session started before the update keeps running the old
version, so you may need to log out and reconnect.
Please note:
- Obviously 18.04 itself is still pre-release. I have only tested
the installer as far as installing it, booting it and connecting to
it with SSH. I would be interested to know of your progress if you
use it.
- If you are already running Ubuntu you could just
do-release-upgrade into this as normal.
- As ever, if you'd like to perform a self-install but need to keep
your existing VPS running for a while, we can offer a new account
free for 2 weeks for you to perform your migration:
<https://tools.bitfolk.com/wiki/Migrating_to_a_new_VPS>
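If you go the do-release-upgrade route while 18.04 is still
pre-release, note that as far as I'm aware the upgrader only offers
a development release when given the -d flag:

```shell
# Upgrade an existing Ubuntu install to the development release
# (18.04 at the time of writing). Without -d, only released
# versions are offered.
sudo do-release-upgrade -d
```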
Cheers,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting
Hi,
The level of SSH scanning is getting ridiculous.
Here's some stats on the number of Fail2Ban bans across all Xen
Shell hosts in the last 7 days:
# each ∎ represents a count of 46. total 4653
59.63.166.104 [ 2037] ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ (43.78%)
58.218.198.142 [ 998] ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ (21.45%)
59.63.166.105 [ 641] ∎∎∎∎∎∎∎∎∎∎∎∎∎ (13.78%)
58.218.198.146 [ 352] ∎∎∎∎∎∎∎ (7.57%)
58.218.198.161 [ 272] ∎∎∎∎∎ (5.85%)
59.63.188.36 [ 145] ∎∎∎ (3.12%)
192.99.138.37 [ 61] ∎ (1.31%)
103.99.0.188 [ 40] (0.86%)
218.65.30.40 [ 15] (0.32%)
202.104.147.26 [ 13] (0.28%)
42.7.26.15 [ 8] (0.17%)
163.172.229.252 [ 8] (0.17%)
42.7.26.91 [ 8] (0.17%)
198.98.57.188 [ 8] (0.17%)
58.242.83.26 [ 8] (0.17%)
58.242.83.27 [ 8] (0.17%)
182.100.67.82 [ 6] (0.13%)
217.99.228.158 [ 5] (0.11%)
218.65.30.25 [ 4] (0.09%)
117.50.14.83 [ 4] (0.09%)
46.148.21.32 [ 4] (0.09%)
178.62.213.66 [ 3] (0.06%)
116.99.255.111 [ 3] (0.06%)
165.124.176.146 [ 1] (0.02%)
101.226.196.136 [ 1] (0.02%)
First three octets only:
# each ∎ represents a count of 61. total 4653
59.63.166.0/24 [ 2678] ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ (57.55%)
58.218.198.0/24 [ 1622] ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ (34.86%)
59.63.188.0/24 [ 145] ∎∎ (3.12%)
192.99.138.0/24 [ 61] ∎ (1.31%)
103.99.0.0/24 [ 40] (0.86%)
218.65.30.0/24 [ 19] (0.41%)
42.7.26.0/24 [ 16] (0.34%)
58.242.83.0/24 [ 16] (0.34%)
202.104.147.0/24 [ 13] (0.28%)
163.172.229.0/24 [ 8] (0.17%)
198.98.57.0/24 [ 8] (0.17%)
182.100.67.0/24 [ 6] (0.13%)
217.99.228.0/24 [ 5] (0.11%)
46.148.21.0/24 [ 4] (0.09%)
117.50.14.0/24 [ 4] (0.09%)
116.99.255.0/24 [ 3] (0.06%)
178.62.213.0/24 [ 3] (0.06%)
165.124.176.0/24 [ 1] (0.02%)
101.226.196.0/24 [ 1] (0.02%)
That is with Fail2Ban adding a 10 minute ban after 10 login
failures. If there were no bans this would be hundreds of thousands
of login attempts instead of 4,653 bans.
Yes I can send an abuse report to Chinanet's "Jiangxi telecom
network operation support department". Yes I can just firewall it
off. But that relies on periodic log file auditing.
There is already an SSH daemon listening on port 922 that is not
subject to Fail2Ban. I would rather not have SSH on port 22 at all,
but in the past I have been told this would not be acceptable
because some people are sometimes on networks from which they can't
connect to port 922. If dropping port 22 would be fine with you then
there's no need to comment, but it would be interesting to hear from
anyone who would still find this a problem.
What are the feelings about setting port 22 Xen Shell access to
require SSH public key auth (while leaving 922 to allow password
authentication as well)?
Do those of you who've added SSH keys want an option to *require*
SSH keys even on port 922?
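As a sketch of what the keys-on-22, passwords-still-on-922
arrangement could look like in sshd_config (untested here; Match
LocalPort needs OpenSSH 6.5 or later, and assumes sshd itself
listens on both ports):

```
# /etc/ssh/sshd_config fragment -- a sketch, not a tested config.
Port 22
Port 922

# Keys only on the port the scanners hammer:
Match LocalPort 22
    PasswordAuthentication no
    AuthenticationMethods publickey
```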
At the very least the Fail2Ban ban time is going to have to go up
from 10 minutes to let's say 6 hours.
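In jail.local terms the change would be something like this (option
names as in Fail2Ban 0.9, where bantime is in seconds; adjust the
jail name to whatever your hosts actually use):

```
# /etc/fail2ban/jail.local fragment (sketch)
[sshd]
enabled  = yes
# ban after 10 failures, as now
maxretry = 10
# 6 hours, up from the 600 second default discussed above
bantime  = 21600
```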
Cheers,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting
Hi,
Around 04:00Z I received alerts that host "snaps" had unexpectedly
rebooted. Upon investigating it had indeed reset itself for reasons
unknown starting at about 03:51Z. It wasn't a full power cycle nor a
graceful shutdown, it just reset itself with no useful log output.
Whilst all VPSes did seem to boot up okay, unfortunately it soon
became clear that "snaps" had booted into an earlier version of the
hypervisor - one without the recent Spectre/Meltdown (and
other) security fixes that were deployed last week.
At this point customer VPSes on "snaps" were operating normally
again but things could not be left in that insecure state, so after
some time spent investigating things, between 06:17Z and 06:37Z I
did a clean shut down and booted into the correct version of the
hypervisor again.
I have since established why the incorrect boot entry was
automatically chosen¹ and have fixed that problem. I have not
worked out what caused "snaps" to reset itself. We have been having
some stability issues with "snaps" over the last 6 months and I
think we are going to have to decommission it.
I will come up with a plan and contact customers on "snaps" directly
later today, but in the meantime if your VPS is on "snaps" and you
would like it moved to another server as a priority, please contact
support@bitfolk.com and we'll get that done. It will involve
shutting your VPS down and booting it a few seconds later on the
target server. None of the details of your VPS will change. Please
indicate what sort of time of day would be best for that to happen.
Apologies for the disruption this will have caused you.
Cheers,
Andy
¹ The newer hypervisor package ships an override to make sure that
the server boots into the hypervisor by default at the next boot.
This is meant to make it easier for people, but all it did was
override my actual intentionally-set default boot option with one
that wasn't suitable. This was not noticed in testing because the
testing machines had no other versions of the hypervisor present.
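For anyone who wants to check their own Xen boxes for the same trap,
the default entry can be inspected and pinned with the standard
GRUB 2 tools (the menu title below is only a placeholder):

```shell
# See what GRUB will boot by default; 'saved_entry' in the
# grub-editenv output is only meaningful when /etc/default/grub
# has GRUB_DEFAULT=saved.
grep '^GRUB_DEFAULT' /etc/default/grub
grub-editenv list

# Pin a specific entry by its menu title (placeholder title):
# grub-set-default 'Debian GNU/Linux, with Xen hypervisor'
```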
--
https://bitfolk.com/ -- No-nonsense VPS hosting
Hi,
If you are running a memcached server please make sure that it
either doesn't listen on UDP or else that it is properly firewalled.
Publicly available memcached servers can provide a 50,000x traffic
amplification:
<https://blog.cloudflare.com/memcrashed-major-amplification-attacks-from-por…>
As there is no authentication in the memcached protocol, having it
publicly available is generally a misconfiguration anyway.
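On a Debian-style install the relevant settings look something like
this (the file path is the Debian/Ubuntu one; other distributions
differ):

```
# /etc/memcached.conf fragment (sketch)
# Listen on localhost only, not on all interfaces:
-l 127.0.0.1
# Disable the UDP listener entirely:
-U 0
```

You can then verify with something like "ss -ulnp | grep 11211",
which should print nothing.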
We will start scanning for and nagging about this soon.
Cheers,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting
Hi,
On Sat, Jan 06, 2018 at 08:34:48AM +0000, Mike Zanker wrote:
> On 4 Jan 2018, at 17:57, Andy Smith <andy@bitfolk.com> wrote:
> > In the mean time you can use the kernel package from the CentOSPlus
> > repository which does have this fix and the KPTI one.
> >
> > https://wiki.centos.org/AdditionalResources/Repositories/CentOSPlus
> >
> > All of this was researched by a customer having the problem today
> > and it resolved it for them.
>
> This was fine until CentOS updated the CentOSPlus kernel
> overnight. Now the updated one fails to boot in exactly the same
> way as the standard CentOS kernel.
Just a quick note that someone has built a CentOS 7 kernel with a
fix for this problem, as described here:
<https://kevandrews.uk/centos-7-xen-pv-guests-failing-boot-kernel-3100-69317…>
Following the link to the bug report at
<https://bugs.centos.org/view.php?id=14347>, it says that the next
CentOS Plus kernel update will include this fix.
So if you're using CentOS 7 and are affected by this then it sounds
like the way to go would be to install that kernel and get future
kernels from Plus also.
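In yum terms that would be something along these lines (package name
kernel-plus as per the CentOS wiki; not tested here, so treat it as
a sketch and run as root):

```shell
# Install the fixed kernel from the CentOS Plus repository.
yum --enablerepo=centosplus install kernel-plus

# To keep getting future kernels from Plus, enable the repo
# permanently (yum-config-manager is in the yum-utils package):
yum-config-manager --enable centosplus
```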
Hopefully the next point release of CentOS 7 will include a fixed
kernel in the main repository.
As far as I am aware the kernel package in CentOS 6 has already had
this problem fixed.
Cheers,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting
Hi,
Notifications have just been sent out letting you know the hour long
maintenance window during which the host that your VPS is on will be
rebooted for security patches.
If you have not received the notification please check your spam
folders etc., and if you still have no luck please contact
support@bitfolk.com.
Apologies for the short notice of this, but now that I feel I have a
reasonable plan covering a large part of the problem space, it is
best to get this done as soon as possible.
I've deliberately left most of the technical detail out of the
reboot notification. The technical details are overwhelming. If you
aren't a particularly technical person then my advice would be:
- Make sure you are running a security-supported release of your
chosen Linux distribution and that you keep it up to date. Between
them and us you should eventually get to safety, just bear in mind
that this is an evolving situation and not many Linux vendors are
willing to push out the latest fixes without extensive testing.
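On a Debian or Ubuntu guest, for example, that boils down to the
usual commands (other distributions have their own equivalents; run
as root):

```shell
# Apply all pending package updates.
apt-get update && apt-get upgrade

# Optionally, have security updates installed automatically:
apt-get install unattended-upgrades
```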
For those who really want them, here are some more technical details.
The newly-deployed hypervisor will:
- support Page Table Isolation, similar to the Linux kernel's KPTI,
to protect against Meltdown.
This feature will protect BitFolk's hypervisor from Meltdown
attacks from all customers.
At the moment all BitFolk VPSes are paravirtual (PV) guests. For
64-bit VPSes, this Xen-level PTI also protects against Meltdown
attacks from within their own kernel or user space. Thus, although
your kernel will report that KPTI is disabled, you will be
protected by Xen's PTI.
It is thought that 32-bit Xen PV guests could still use Meltdown
on themselves; protecting against this requires use of the KPTI
feature inside the Linux kernel. As far as I am aware 32-bit KPTI
is lagging behind the 64-bit version so those with 32-bit VPSes
may wish to consider switching to a 64-bit kernel or upgrading to
a new VPS, if they can't wait.
- be compiled with gcc's new retpoline feature, which replaces
indirect branches with sequences that can't be steered by the
branch predictor, therefore protecting you against variant 2 of
Spectre.
This is a complete protection for BitFolk's hypervisor against
Spectre variant 2 attacks from guest kernels. It will not protect
guests from attacks from inside their own VPS. For this you will
need to make sure that your own kernel is compiled with retpoline
support, with a compiler that understands the feature.
I am not aware of the situation in other distributions but as I
type this, Debian has already pushed out a version of gcc that has
the retpoline feature to its stable ("stretch") and oldstable
("jessie") releases. A binary kernel package built with it is not
yet available except in unstable, though.
- have working PVH mode support.
This is another way to run Xen virtual machines. It's
faster/simpler than the PV mode that we currently use, it's also
more secure, and it doesn't require use of qemu like HVM does.
IIRC qemu is about 1.2 million lines of code, many times larger
than Xen itself, and I've always been uncomfortable about it.
Converting you all to PVH mode would provide the best protection
against Meltdown and it would actually be more performant than PV
mode, but sadly it requires some fixes in the guest kernel and in the
bootloader that have only just gone in (like, late 4.14 kernel,
early 4.15). We can't convert people to that mode until Linux
distributions are shipping with new enough kernels, but it will be
useful to have it available for early testing.
- A couple of other unrelated security patches which will come out
of embargo later.
What's yet to come:
- Any sort of mitigation for variant 1 of Spectre.
People are still working on it, both in the Linux kernel, in Xen,
and in other software. It's possible that fixes may only come in
the Linux kernel rather than in Xen.
- Updated Intel microcode.
Intel released some updated microcode which features new CPU
instructions to help avoid these problems, and/or reduce the
performance impact of the techniques used. Shortly after release,
amid many reports of system instability, they withdrew the update
again, and are not currently recommending its use except for
development purposes.
So, we're still waiting for a stable release of that, and at the
moment it's looking like decent fixes can be done in software so
the urgency of a reboot just for this microcode update is low and
I am inclined to roll it in with the next maintenance. That could
change though.
After the maintenance, what you need to think about:
- If you're 32-bit you need to make a decision about Meltdown,
whether you will wait for a 32-bit kernel fix or look at going to
64-bit by some means.
- The retpoline-compiled hypervisor only protects BitFolk from you. To
protect your own VPS against Spectre variant 2 attacks coming from
within itself (like if it was tricked into running something
malicious) you need a kernel that is compiled with retpoline
support.
These are pretty new, but they are out there. Debian pushed out a
version of gcc with retpoline support to its stable ("stretch")
and oldstable ("jessie") releases recently, but as I write this
the only way to get a binary kernel package that was compiled with
it is to use linux-image-amd64 from the unstable repository.
Presumably that is going to filter through to stable etc in due
course. Until then, you could use the package from unstable, or
use the new gcc package to rebuild a kernel package…
- If you are compiling C/C++ software, do you need to be doing it
with a retpoline-aware compiler?
- Look out for Spectre variant 1 fixes - they may be in your
applications and/or kernel too. Although we can expect more
hypervisor changes, after this (and the microcode) I expect the
bulk of it to be in the kernels and applications.
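If you want to check whether your installed gcc understands the
retpoline options, a quick probe like this works (the flag names are
the x86 ones used by the Linux kernel build, which appeared around
gcc 7.3):

```shell
# Succeeds only if gcc accepts the retpoline code-generation flags.
echo 'int main(void){return 0;}' \
    | gcc -mindirect-branch=thunk -mfunction-return=thunk -x c - -o /dev/null \
    && echo "retpoline-capable" \
    || echo "retpoline flags not usable with this gcc"
```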
Solutions that are not being pursued:
- Xen have some other mitigation options. They involve running PV
guests inside either a PVH or HVM container. I've investigated
these and they're just too complicated, they remove some useful
functionality, and they still have performance implications about
the same as XPTI.
Longer term I'd like to be moving guests to PVH mode, and perhaps
optionally HVM. That can't happen in a production capacity until a
person can install a stable release of their favourite Linux and
not have to know what PVH mode is, though.
Regarding <https://github.com/speed47/spectre-meltdown-checker>:
- It will always report that you are running under Xen PV and are
vulnerable to Meltdown. It doesn't do any actual proof of concept
exploit, it just detects PV mode and gives up. Once the new
hypervisor has been deployed 64-bit guests will be protected by
its PTI feature. As mentioned, 32-bit guests will still need to
get PTI from their kernel.
- Its reporting of Spectre variant 2 is accurate, so once you're
running a retpoline-compiled kernel it will detect that.
- Its reporting of microcode and new CPU instructions is, as far as I am
aware, accurate. It is my understanding that once there is new
microcode, guests will see it and be able to use these
instructions. This could change though.
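Running it is just a matter of fetching the script from that
repository and executing it as root (the exact file path within the
repository is assumed from its README):

```shell
# Fetch and run the checker; root is needed for it to read all the
# information it wants.
wget https://raw.githubusercontent.com/speed47/spectre-meltdown-checker/master/spectre-meltdown-checker.sh
sudo sh spectre-meltdown-checker.sh
```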
That's all I can think of right now. I appreciate this is a lot to
take in. If you have any questions please ask on or off the list.
Once I get a sense of what is unclear I can perhaps make a wiki
page that helps make things clearer.
Cheers,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting