Re: [bitfolk] BIND9 not authorised - Master zone

Author: Andy Smith
Date:  
To: users
Subject: Re: [bitfolk] BIND9 not authorised - Master zone

gpg: Signature made Wed Jul 24 12:43:58 2019 UTC
gpg: using DSA key 2099B64CBF15490B
gpg: Good signature from "Andy Smith <andy@strugglers.net>" [unknown]
gpg: aka "Andrew James Smith <andy@strugglers.net>" [unknown]
gpg: aka "Andy Smith (UKUUG) <andy.smith@ukuug.org>" [unknown]
gpg: aka "Andy Smith (BitFolk Ltd.) <andy@bitfolk.com>" [unknown]
gpg: aka "Andy Smith (Linux User Groups UK) <andy@lug.org.uk>" [unknown]
gpg: aka "Andy Smith (Cernio Technology Cooperative) <andy.smith@cernio.com>" [unknown]
On Wed, Jul 24, 2019 at 12:05:56PM +0000, Andy Smith wrote:
> All of those processes (16870, 11064, 27705, 20717) thought they were listening
> on the main IP port 53, even while systemd thinks there is no bind9 service running.


Here's where 27705 came into being:

Jul 23 07:23:25 westnorfolk named[26462]: shutting down
Jul 23 07:23:25 westnorfolk named[26462]: stopping command channel on 127.0.0.1#953
Jul 23 07:23:25 westnorfolk named[26462]: stopping command channel on ::1#953
Jul 23 07:23:25 westnorfolk named[26462]: no longer listening on ::#53
Jul 23 07:23:25 westnorfolk named[26462]: no longer listening on 127.0.0.1#53
Jul 23 07:23:25 westnorfolk named[26462]: no longer listening on 85.119.82.237#53
Jul 23 07:23:25 westnorfolk named[26462]: exiting
Jul 23 07:23:25 westnorfolk rndc[27676]: rndc: connect failed: 127.0.0.1#953: connection refused
Jul 23 07:23:25 westnorfolk systemd[1]: bind9.service: Control process exited, code=exited status=1
Jul 23 07:23:25 westnorfolk systemd[1]: bind9.service: Unit entered failed state.
Jul 23 07:23:25 westnorfolk systemd[1]: bind9.service: Failed with result 'exit-code'.
Jul 23 07:23:29 westnorfolk named[27705]: starting BIND 9.10.3-P4-Debian <id:ebd72b3> -c /etc/bind/named.conf

So since 07:23 yesterday this has been running and snarfing up all
the transfer requests, which is why none of the later configuration
changes seemed to make any difference.

The normal named command line when run under systemd looks like
this:

andy@westnorfolk:~$ ps awux | grep named
bind     10022  0.0  0.7 288940 23924 ?        Ssl  12:47   0:00 /usr/sbin/named -f -u bind


so I suspect that these other processes were started in some other
way - command line maybe? So that would be why systemd doesn't know
about them. I still think that bind should have complained about not
being able to bind to, e.g. 85.119.82.237#53 but perhaps it didn't
know it was unable to (silent failure)?
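
One way to check whether a given named process really belongs to the
bind9 unit is to look at its cgroup membership in /proc/PID/cgroup.
The following is only a minimal sketch, assuming a Linux host running
systemd; the helper names unit_from_cgroup and named_processes are
made up for illustration. A process started from an interactive shell
would typically show a session-N.scope rather than bind9.service.

```python
import os


def unit_from_cgroup(cgroup_text):
    """Return the leaf of the first cgroup path that names a systemd
    .service or .scope, or None if no line does."""
    for line in cgroup_text.splitlines():
        # Each line has the form "hierarchy-id:controllers:path".
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        leaf = parts[2].rstrip("/").rsplit("/", 1)[-1]
        if leaf.endswith((".service", ".scope")):
            return leaf
    return None


def named_processes(proc="/proc"):
    """Yield (pid, unit) for every process whose command name is 'named'."""
    if not os.path.isdir(proc):
        return
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "comm")) as f:
                if f.read().strip() != "named":
                    continue
            with open(os.path.join(proc, entry, "cgroup")) as f:
                yield int(entry), unit_from_cgroup(f.read())
        except OSError:
            continue  # process exited meanwhile, or permission denied


if __name__ == "__main__":
    for pid, unit in named_processes():
        print(pid, unit or "(no service or scope found)")
```

"systemctl status PID" reports the owning unit for a single process as
well; the sketch just does the same lookup for every named at once.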

Turn off bind9 and run something that will hold the port (a socat
server copying everything it receives to the terminal):

andy@westnorfolk:~$ sudo systemctl stop bind9
andy@westnorfolk:~$ sudo socat -v tcp-l:53,fork -

It's definitely got the port:

andy@westnorfolk:~$ sudo lsof -p 10688
COMMAND   PID USER   FD   TYPE             DEVICE SIZE/OFF    NODE NAME
socat   10688 root    5u  IPv4             642785      0t0     TCP *:domain (LISTEN)


And it sees traffic:

[another machine]
$ nc 85.119.82.237 53
hello
^C

[back on Keith's VM]
> 2019/07/24 13:35:41.546776 length=6 from=0 to=5

hello
hello

Start bind9 again while socat is holding the port:

andy@westnorfolk:~$ sudo systemctl start bind9
andy@westnorfolk:~$ sudo systemctl status bind9
● bind9.service - BIND Domain Name Server
   Loaded: loaded (/lib/systemd/system/bind9.service; enabled; vendor preset: enabled
   Active: active (running) since Wed 2019-07-24 13:39:02 BST; 5s ago
     Docs: man:named(8)
  Process: 10592 ExecStop=/usr/sbin/rndc stop (code=exited, status=0/SUCCESS)
 Main PID: 10780 (named)
    Tasks: 5 (limit: 4915)
   CGroup: /system.slice/bind9.service
           └─10780 /usr/sbin/named -f -u bind


andy@westnorfolk:~$ sudo grep 85.119.82.237#53 /var/log/syslog
Jul 24 13:39:02 westnorfolk named[10780]: listening on IPv4 interface eth0, 85.119.82.237#53

Why didn't it cry about not being able to bind 85.119.82.237:53?

So right now we have bind9 thinking it's running fine but it will
never see a zone transfer request because this socat process is
hogging port 53.

Is this normal? I am used to daemons giving up when they can't
exclusively bind.
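
For comparison, this is the socket-level behaviour that expectation
rests on: while one TCP socket is listening on an address and port, a
second bind() to the same address and port fails with EADDRINUSE. The
sketch below is an illustration only. It uses loopback and a
kernel-chosen port instead of port 53, sets neither SO_REUSEADDR nor
SO_REUSEPORT, and says nothing about what named does internally. The
function name second_bind_errno is hypothetical.

```python
import errno
import socket


def second_bind_errno():
    """Listen on a loopback TCP port, then try to bind a second socket
    to the same address and port.

    Returns the errno from the second bind(), or 0 if it succeeded.
    """
    holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    holder.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    holder.listen(1)
    port = holder.getsockname()[1]

    rival = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        rival.bind(("127.0.0.1", port))
        return 0
    except OSError as exc:
        return exc.errno
    finally:
        rival.close()
        holder.close()


if __name__ == "__main__":
    err = second_bind_errno()
    print(errno.errorcode.get(err, "bind succeeded"))
```

A daemon that treats that error as fatal exits at startup; one that
only logs it and carries on would look like what is seen here.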

Interestingly if I kill the socat, other servers now see
"connection refused", i.e. named hasn't tried to bind port 53 again
(or never did).
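
The "connection refused" observation can be reproduced from another
host with a small TCP probe. This is a hedged sketch, not part of the
original test: probe_tcp is a hypothetical helper, and the address in
the usage comment is simply the one used elsewhere in this message.

```python
import socket


def probe_tcp(host, port, timeout=2.0):
    """Try a TCP connect and classify the result as a short string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"  # nothing is listening on that address and port
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return "error: %s" % exc


# Example (run from another machine):
#   probe_tcp("85.119.82.237", 53)
```

Note that "open" only means some process accepted the connection. As
the socat test shows, that process is not necessarily named.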

Cheers,
Andy

--
https://bitfolk.com/ -- No-nonsense VPS hosting