Keeping SSH Sessions Alive NATs & Firewalls


Working often requires opening terminal sessions or VPN connections. It can be very annoying when these connections fail after being left idle for a few minutes or, with VPNs, in the middle of a work day in spite of traffic. With SSH connections many people (even many experts) incorrectly assume that the SSH server (sshd) has some restrictive session auto-timeout setting. If you look carefully at the manual for sshd configuration (man sshd_config) you will see that sshd does not close a session for being idle by default. The timeout it does enforce out of the box is how long it waits for a login to be completed (LoginGraceTime). More interesting is that the keep alive related settings exist to help keep sessions open: TCPKeepAlive, ClientAliveInterval, and ClientAliveCountMax for sshd, and ServerAliveInterval for the ssh client. If the ratio of servers to clients is small (a few servers for many clients), controlling these settings is easier from the servers; inversely, if the ratio is large (many servers for a few clients), control is simpler on the client side. Of course it can be configured on both (which can cause a net zero change if the screws are applied too tight).

So what is causing all the disconnections? The control points are NAT devices, routers, firewalls, and host operating system settings. All of these control points have settings for idle session auto-timeout. The timeouts exist to conserve scarce resources and to mitigate risk. Essentially NAT devices, routers, firewalls, and hosts have limited memory and bandwidth which, if overused (even by idle connections), will noticeably reduce performance. Unfortunately the people who set the policies on session limits are often not the same ones who have to work with the results. The result is that some networks are easy to work with, allowing idle times of up to a day or more, while others are extremely hard lined police states with session idle auto-terminate limits of only 5 minutes.

As with all things this knowledge can be abused. Please be reasonable: it only takes a few seconds to log in at the start of a work session. However, any interruption of flow can cost a half-hour to an hour or more of time to reestablish the train of thought. Therefore, anyone using the settings below must also be responsible enough to log out when finished for the day. It can be a serious security risk to leave sessions open, especially with accounts that have administrative rights.

== User Level / Client Side ==

From the user account level, ssh session idle timeouts can generally be prevented by using the in-band, application layer keep alive mechanism built into the ssh protocol. Further, the settings can be configured on a per-destination basis with wildcard matching. This is done by creating a ${HOME}/.ssh/config file with the following. Note that an asterisk ('*') by itself will match any host; a specific domain could be matched like *

$ cat ~/.ssh/config
Host *
ServerAliveInterval 120

The ServerAliveInterval setting causes ssh to send an application level message through the encrypted channel (so network devices cannot distinguish it from normal user traffic) whenever nothing has been received from the server for the specified number of seconds (120 = 2 minutes). TCPKeepAlive is a separate, TCP level keep alive; in OpenSSH it is on by default (yes), so it does not need to be added here.
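
One reason the TCP level keep alive is often not enough on its own: on Linux the kernel commonly waits 7200 seconds before sending the first TCP keepalive probe, which is far longer than many NAT or firewall idle limits. The sketch below assumes a Linux host that exposes /proc/sys/net/ipv4/tcp_keepalive_time; the function name is made up for illustration. It prints that value so it can be compared with the chosen ServerAliveInterval.

```shell
# Linux only (assumption): seconds a TCP connection must be idle before the
# kernel sends its first keepalive probe. Prints "unknown" on other systems.
tcp_keepalive_idle_seconds() {
    f=/proc/sys/net/ipv4/tcp_keepalive_time
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "unknown"
    fi
}

tcp_keepalive_idle_seconds
```

If the number printed is larger than the idle limit of the network in between, the ssh level ServerAliveInterval is the setting that actually keeps the session open.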

ServerAliveInterval can also be applied to every ssh client on a given host by adding it to the /etc/ssh/ssh_config file. Doing this affects all client connections made from that specific host.
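
With settings possibly coming from both ~/.ssh/config and /etc/ssh/ssh_config, it can help to check what ssh will really use for a destination. A minimal sketch, assuming an OpenSSH client recent enough to support "ssh -G" (which prints the resolved configuration without connecting); keepalive_lines is a hypothetical helper name and server.example.com is a placeholder host.

```shell
# Keep only the keep alive related lines from "ssh -G" output read on stdin.
keepalive_lines() {
    grep -i -E '^(serveraliveinterval|serveralivecountmax|tcpkeepalive) '
}

# Usage (placeholder host name):
#   ssh -G server.example.com | keepalive_lines
# Try a value for a single run without editing any file:
#   ssh -G -o ServerAliveInterval=120 server.example.com | keepalive_lines
```

The -o form also works on a normal connection, which is handy for testing an interval before writing it into a config file.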

=== PuTTY users ===

: PuTTY can also be configured to send keep alive packets for an ssh session. To do this change the following settings for the connection configuration:

:: -> 'Connection'
::* set "Seconds between keepalives (0 to turn off)" to a non-zero value such as 120
::* check "Disable Nagle's algorithm (TCP_NODELAY option)"
::* check "Enable TCP keepalives (SO_KEEPALIVE option)"
:: -> 'Connection' -> 'SSH' -> 'X11'
::* check "Enable X11 forwarding"
::* Select "Enable MIT-Magic-Cookie-1"
:: Save the session

== Server Daemon Level ==

The second solution is on the server side and requires root privileges to edit the sshd configuration file (/etc/ssh/sshd_config). Setting it on the server covers every client that connects, whatever the client side settings are. Configuring the server is done by adding the setting ClientAliveInterval to the server's /etc/ssh/sshd_config. Similar to the client side setting, ClientAliveInterval is given in seconds: when nothing has been received from the client for that long, sshd sends a message through the encrypted channel and asks the client to respond. The related ClientAliveCountMax counter only counts server probes that the client has not answered. TCPKeepAlive is the separate TCP level keep alive and is already on by default.

$ sudo grep Alive /etc/ssh/sshd_config
ClientAliveInterval 120
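
A plain "grep Alive" also matches commented-out lines, and a stock sshd_config often ships with entries such as "#ClientAliveInterval 0". The sketch below prints only the active lines; the function name is hypothetical and the default path is the one used in this article.

```shell
# Print only the active (uncommented) keep alive lines from an sshd config
# file. Defaults to /etc/ssh/sshd_config, which may need sudo to read.
active_keepalive_lines() {
    grep -i -E '^[[:space:]]*(ClientAliveInterval|ClientAliveCountMax|TCPKeepAlive)[[:space:]=]' \
        "${1:-/etc/ssh/sshd_config}"
}

# Usage:
#   active_keepalive_lines                        # the system file
#   active_keepalive_lines /path/to/other_config  # any other file
```

This only reads one file. Running "sudo sshd -T" prints the full effective server configuration, including defaults, and is the more reliable check.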

Don't forget to SIGHUP the sshd process so it will re-read the configuration if you had to change it.

== Tightening Screws ==

So now the clients are connecting to the server and staying logged in for months, which is an obvious abuse. It is tempting to reach for ClientAliveCountMax to limit this, but that setting is not a cap on the total number of keep alive packets. Per man sshd_config it is the number of client alive messages sshd may send without receiving any reply from the client; when that threshold is reached, sshd disconnects the client and terminates the session. It removes dead or unreachable clients. It does not tell a healthy client to stop sending keep alives, and it does not let a connection go idle so that the network devices can time it out.

$ sudo grep Alive /etc/ssh/sshd_config
ClientAliveInterval 120
ClientAliveCountMax 600

Now for some math. ClientAliveCountMax counts consecutive probes from sshd that the client did not answer; any reply resets the count to zero. With ClientAliveInterval 120 the server probes a quiet client every two minutes, so with ClientAliveCountMax 600 a client that has stopped responding is dropped after 600 missed probes: 600 x 120 seconds = 72,000 seconds = 20 hours. With the default ClientAliveCountMax of 3 the same interval drops an unresponsive client after about 6 minutes. The client's ServerAliveInterval does not change this arithmetic, because the count only advances while the client is silent. A large value therefore makes the server very tolerant of long network outages, at the cost of keeping dead sessions around for most of a day; a client that is alive and answering is never disconnected by these settings, however long the session has been open.
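
Going by the man sshd_config description, the time before sshd gives up on a client that has stopped answering is simply ClientAliveInterval multiplied by ClientAliveCountMax. A small sketch of that arithmetic, with a hypothetical function name, for trying other combinations:

```shell
# Seconds until sshd drops a client that no longer answers its probes:
# ClientAliveInterval (seconds) multiplied by ClientAliveCountMax.
unresponsive_timeout() {
    echo $(( $1 * $2 ))
}

unresponsive_timeout 120 600   # the values shown in sshd_config above
unresponsive_timeout 120 3     # same interval with the OpenSSH default count
```

The two calls print 72000 and 360, that is 20 hours and 6 minutes.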

Leaving ClientAliveCountMax unset does not disable it: OpenSSH has a built in default of 3. To make the limit effectively unreachable, ClientAliveCountMax can be set to 999999999 (it is generally a 32 bit int which might be signed, and let's face it, even with ClientAliveInterval set to one second, 999999999 unanswered probes take 999999999 seconds = about 16666666.65 minutes = about 277777.78 hours = roughly 11574 days before the client is dropped). Current OpenSSH man pages also state that a ClientAliveCountMax of zero disables connection termination.

== User Login Shell Level ==

If session keep alives are working (the server doesn't have TCPKeepAlive=no set) and the sessions are still getting terminated, the likely cause is the host operating system, or more precisely the login shell. If the environment variable TMOUT is set for bash then bash will terminate when no activity is detected for TMOUT seconds. To see if it is set use the command

$ echo $TMOUT

This is easily removed with the command 'unset TMOUT' added to the ${HOME}/.bashrc.user file:

$ grep TMOUT ~/.bashrc.user
unset TMOUT

From man bash:

TMOUT  If  set  to  a  value greater than zero, TMOUT is treated as the
       default timeout for the read builtin.  The select command termi‐
       nates if input does not arrive after TMOUT seconds when input is
       coming from a terminal.  In an interactive shell, the  value  is
       interpreted  as  the  number  of seconds to wait for input after
       issuing the primary prompt.  Bash terminates after  waiting  for
       that number of seconds if input does not arrive.
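
One caveat: if an administrator has marked TMOUT read-only (for example with "readonly TMOUT" in a system-wide profile, which is an assumption about the host, not something every system does), the unset will fail. The sketch below, using a hypothetical function name, reports which of the three situations applies by attempting the unset in a subshell so the current shell is not changed.

```shell
# Report whether TMOUT is active and whether this shell may remove it.
tmout_status() {
    if [ -z "${TMOUT:-}" ] || [ "$TMOUT" = "0" ]; then
        echo "TMOUT is not set; the shell will not time out on its own"
    elif (unset TMOUT) 2>/dev/null; then
        # The subshell could unset it, so it is not read-only.
        echo "TMOUT=${TMOUT} seconds; it can be removed with: unset TMOUT"
    else
        echo "TMOUT=${TMOUT} seconds and read-only; unset will fail in this shell"
    fi
}

tmout_status
```

In the read-only case the timeout is a deliberate policy of the host, and the right move is to talk to whoever administers it.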
