Elixir is great at handling lots of concurrent connections. When you actually try to do this, however, you will bump up against the default OS configuration which limits the number of open filehandles/sockets. You may also run out of TCP ephemeral ports.
The result is poor application performance, e.g. timeouts. If you are running behind Nginx, you may see it as 503 errors, with your application taking five seconds to respond. When you look at the logs, however, the application response time is fine.
What is happening is that the client talks to Nginx, then Nginx talks to your app, but there are not enough filehandles available, so Nginx queues the request. You may start with 1024 by default, which is pitifully small. You will need to raise that at each step in the config, e.g. systemd unit file, Nginx, and Erlang VM.
First, make sure that open file limits are increased at each step in the chain.
OS account limits
The OS account running the app has default limits:
```
$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 3841
max locked memory       (kbytes, -l) 16384
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 3841
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
```
Here we can see the open file limit of 1024 by default.
You can see the same for a running process by looking up limits for its process id:

```
# cat /proc/800/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             3841                 3841                 processes
Max open files            65535                65535                files
Max locked memory         16777216             16777216             bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       3841                 3841                 signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
```
Create an /etc/security/limits.d/foo-limits file for the account running the app:

```
foo soft nofile 1000000
foo hard nofile 1000000
```
Systemd doesn't apply those limits to services, though, so you also need to set the limit with directives such as LimitNOFILE in the unit file for your app:
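For example (the service name, account, and paths here are placeholders):

```ini
# /etc/systemd/system/foo.service
[Service]
User=foo
LimitNOFILE=1000000
```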
Erlang VM options
In rel/vm.args for your release, increase the maximum number of ports the VM can open. In newer Erlang versions, the setting is `+Q 65536`; in older ones, use `-env ERL_MAX_PORTS 65536`.
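For example (65536 is an arbitrary ceiling; size it to your expected connection count):

```
# rel/vm.args
+Q 65536
```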
Nginx open files
If you are running Nginx in front of your app, make sure it also has enough open file handles and worker connections. See "Serving your Phoenix app with Nginx".
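The relevant Nginx directives are worker_rlimit_nofile and worker_connections (the values here are illustrative, not recommendations):

```nginx
# nginx.conf
worker_rlimit_nofile 1000000;

events {
    # per worker; total capacity is worker_processes x worker_connections
    worker_connections 65536;
}
```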
Ephemeral TCP ports
After that, you may run out of ephemeral TCP ports. This hits hardest when running behind Nginx as a proxy, but it can also bite on the outbound side when you are talking to a small number of back end servers.
In TCP/IP, a connection is defined by the combination of source IP + source port + destination IP + destination port. In this situation, all but the source port is fixed: 127.0.0.1 + random + 127.0.0.1 + 4000. There are only 64K ports. The TCP/IP stack won't reuse a port for 2 x maximum segment lifetime, which by default is 2 minutes.
Doing the math:
- 1024 ports / 120 sec = 8.53 requests per second with default file handle limit
- 60,000 / 120 = 500 requests per sec
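A quick sanity check of that arithmetic (the 120 seconds comes from the two-minute reuse delay above):

```shell
# New-connection rate to one destination ≈ usable source ports / TIME_WAIT seconds
awk 'BEGIN { printf "%.2f req/sec\n", 1024 / 120 }'     # with the default 1024-handle limit
awk 'BEGIN { printf "%.2f req/sec\n", 60000 / 120 }'    # with ~60k usable ephemeral ports
```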
If you are getting limited talking to back end servers, then it's useful to give your server multiple IP addresses. Tell your HTTP client library to use an IP from a pool as its source when making the request. Then the equation turns into "source IP from pool" + random port + target IP + 80.
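A sketch of the idea using curl (127.0.0.1 stands in for an address from your pool, and the throwaway Python server stands in for a back end; real client libraries expose an equivalent source-address option):

```shell
# Start a throwaway local server to talk to (placeholder for a back end)
python3 -m http.server 8080 --bind 127.0.0.1 >/dev/null 2>&1 &
SERVER_PID=$!
sleep 1

# --interface picks the source IP for the outbound connection;
# with a pool of addresses you would rotate the value per request.
curl -s --interface 127.0.0.1 -o /dev/null -w '%{http_code}\n' http://127.0.0.1:8080/

kill $SERVER_PID
```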
You may be able to reuse outbound connections with HTTP keep-alive, if the back ends support it. At a certain point, the back end servers may be the limit. They may benefit from having more IPs as well.
DNS lookups can become an issue on outbound connections. We have had hosting providers block us because they thought we were doing a DoS attack on their DNS. Google DNS rate limits to 100 requests per second by default. Run a local caching DNS resolver for your app.
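dnsmasq is one lightweight option; a minimal /etc/dnsmasq.conf might look like this (a sketch; the upstream resolvers are examples, point it at whatever your host normally uses):

```
# Cache up to 10,000 entries locally
cache-size=10000
# Forward cache misses to upstream resolvers
server=8.8.8.8
server=1.1.1.1
```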
You can tune OS kernel settings to reduce how long closed sockets linger. Add the following to /etc/sysctl.conf (or a file in /etc/sysctl.d/):

```
# Close FIN-WAIT-2 sockets sooner than the 60-second default
net.ipv4.tcp_fin_timeout = 15
# Reuse TIME_WAIT sockets for new outgoing connections
net.ipv4.tcp_tw_reuse = 1
```
Load the new settings:

```
sudo sysctl -p
```
There are other kernel TCP settings you can tune as well, e.g.:

```
sysctl -w fs.file-max=12000500
sysctl -w fs.nr_open=20000500
ulimit -n 20000000
sysctl -w net.ipv4.tcp_mem='10000000 10000000 10000000'
sysctl -w net.ipv4.tcp_rmem='1024 4096 16384'
sysctl -w net.ipv4.tcp_wmem='1024 4096 16384'
sysctl -w net.core.rmem_max=16384
sysctl -w net.core.wmem_max=16384
```
See:

- https://phoenixframework.org/blog/the-road-to-2-million-websocket-connections
- https://www.rabbitmq.com/networking.html#dealing-with-high-connection-churn