Tuning TCP ports for your Elixir app
By DevOps on Fri 24 May 2019
Elixir is great at handling lots of concurrent connections. When you actually try to do this, however, you will bump up against the default OS configuration which limits the number of open filehandles/sockets. You may also run out of TCP ephemeral ports.
The result is poor application performance, e.g. timeouts. If you are running behind Nginx, you may see it as 503 errors, with your application taking five seconds to respond. When you look at the logs, however, the application response time is fine.
What is happening is that the client talks to Nginx, then Nginx talks to your app, but there are not enough filehandles available, so Nginx queues the request. You may start with 1024 by default, which is pitifully small. You will need to raise that at each step in the config, e.g. systemd unit file, Nginx, and Erlang VM.
First, make sure that open file limits are increased at each step in the chain.
OS account limits
The OS account running the app has default limits:
$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 3841
max locked memory       (kbytes, -l) 16384
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 3841
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
Here we can see the open file limit of 1024 by default.
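The same limit can be read from inside a running program. The following is a minimal sketch in Python (not Elixir), using the standard resource module on a Unix-like system; the function name nofile_limits is made up for this example.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = nofile_limits()
# RLIM_INFINITY means "unlimited" for that value.
print("soft open-file limit:", soft)
print("hard open-file limit:", hard)
```

The soft limit is the one the process actually runs into; the hard limit is the ceiling the soft limit can be raised to without extra privileges.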
You can check the limits that apply to a running process by looking them up by process id. In this example, process 800 already has its open file limit raised to 65535:
# cat /proc/800/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             3841                 3841                 processes
Max open files            65535                65535                files
Max locked memory         16777216             16777216             bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       3841                 3841                 signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
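If you want to check this from a script, for example in a deployment health check, the output is easy to parse. This is a rough sketch in Python, assuming the Linux /proc layout shown above; parse_limits and read_process_limits are hypothetical helper names, not part of any library.

```python
import re


def parse_limits(text):
    """Parse /proc/<pid>/limits text into {name: (soft, hard)}.

    Values stay strings because they may be "unlimited".
    """
    limits = {}
    for line in text.splitlines()[1:]:  # first line is the column header
        # Columns are separated by runs of two or more spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) >= 3:
            limits[fields[0]] = (fields[1], fields[2])
    return limits


def read_process_limits(pid):
    """Read and parse the limits of a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as f:
        return parse_limits(f.read())
```

For the process above, read_process_limits(800)["Max open files"] would give the soft and hard open file limits as a pair of strings.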
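Raise the limit for the account running the app (foo in this example) with a file in /etc/security/limits.d, e.g. /etc/security/limits.d/foo-limits.conf. Note that pam_limits only reads files in that directory whose names end in .conf: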
foo soft nofile 1000000
foo hard nofile 1000000
Systemd doesn't use the limits file, though, so you need to set the limit with directives like LimitNOFILE in the [Service] section of the unit file for your app, e.g. LimitNOFILE=65536.
Erlang VM options
In rel/vm.args for your release, increase the number of ports the Erlang VM may open. The entries in this file are passed to the VM as command line flags. In newer Erlang versions, the setting is +Q 65536. In older ones, use -env ERL_MAX_PORTS 65536.
Nginx open files
If you are running Nginx in front of your app, make sure it is allowed enough open files as well, e.g. with the worker_rlimit_nofile and worker_connections directives. See "Serving your Phoenix app with Nginx".
Ephemeral TCP ports
After that, you may run into a lack of ephemeral TCP ports. This hits you very hard when running behind Nginx as a proxy, but can also hit you on the outbound side when you are talking to a small number of back end servers.
In TCP/IP, a connection is identified by the combination of source IP + source port + destination IP + destination port. When Nginx proxies to your app on the same machine, everything but the source port is fixed: 127.0.0.1 + random + 127.0.0.1 + 4000. There are only 64K ports, and not all of them are available as ephemeral ports. After a connection closes, the TCP/IP stack won't reuse its port until 2 x maximum segment lifetime has passed, which the figures below take to be 2 minutes.
Doing the math:
- 1024 ports / 120 sec = 8.53 requests per second with the default file handle limit
- 60,000 ports / 120 sec = 500 requests per second with most of the port range available
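The arithmetic above can be written as a small Python sketch. It assumes every request opens a new connection and that each port stays unavailable for the full 120 seconds; the function name is made up for illustration.

```python
def max_requests_per_second(available_ports, hold_seconds=120):
    """Upper bound on new connections per second to one destination.

    Each connection ties up one source port for hold_seconds, so at most
    available_ports connections can be opened in any hold_seconds window.
    """
    return available_ports / hold_seconds


print(round(max_requests_per_second(1024), 2))  # 8.53
print(max_requests_per_second(60_000))          # 500.0
```

Shortening the hold time or increasing the number of usable ports raises the ceiling proportionally.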
If you are getting limited talking to back end servers, then it's useful to give your server multiple IP addresses. Tell your HTTP client library to use an IP from a pool as its source when making the request. Then the equation turns into "source IP from pool" + random port + target IP + 80.
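How you select the source IP depends on your HTTP client library. As a language-neutral illustration, here is a minimal Python sketch at the socket level. The addresses in SOURCE_IPS are placeholders from the documentation range and must be replaced by addresses actually assigned to your server; make_source_picker and connect_from_pool are hypothetical names.

```python
import itertools
import socket

# Placeholder pool: replace with IPs that are configured on this server.
SOURCE_IPS = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]


def make_source_picker(pool):
    """Return a function that hands out source IPs round-robin."""
    cycle = itertools.cycle(pool)
    return lambda: next(cycle)


def connect_from_pool(dest_host, dest_port, pick_source, timeout=5.0):
    """Open a TCP connection whose source IP comes from the pool."""
    source_ip = pick_source()
    # Port 0 lets the OS choose an ephemeral port on the chosen source IP.
    return socket.create_connection(
        (dest_host, dest_port),
        timeout=timeout,
        source_address=(source_ip, 0),
    )
```

Round-robin is only one possible policy; the point is that each source IP brings its own range of ephemeral ports for a given destination.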
You may be able to reuse outbound connections with HTTP keep-alive (persistent connections) or pipelining, if the back ends support it. At a certain point, the back end servers may become the limit. They may benefit from having more IPs as well.
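To illustrate why connection reuse helps, the sketch below uses Python's standard library rather than an Elixir HTTP client. The local test server, the handler class, and the function name are all assumptions made up for the example. It sends two requests over one persistent connection and records the client-side port used for each.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # allows persistent connections

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def ports_used_for_two_requests():
    """Send two GETs over one connection; return the local port of each."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), KeepAliveHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    ports = []
    try:
        for _ in range(2):
            conn.request("GET", "/")
            conn.getresponse().read()  # drain the body before reusing
            ports.append(conn.sock.getsockname()[1])
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return ports


if __name__ == "__main__":
    first, second = ports_used_for_two_requests()
    print("same source port reused:", first == second)
```

Both requests travel over the same socket, so only one ephemeral port is consumed no matter how many requests are sent on that connection.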
DNS lookups can become an issue on outbound connections. We have had hosting providers block us because they thought we were doing a DoS attack on their DNS. Google DNS rate limits to 100 requests per second by default. Run a local caching DNS resolver for your app.
You can tune the OS kernel settings so that closed connections release their ports sooner.
/etc/sysctl.conf (or a file in
# Decrease the default value of tcp_fin_timeout
net.ipv4.tcp_fin_timeout = 15
# Reuse TIME_WAIT sockets for new outbound connections
net.ipv4.tcp_tw_reuse = 1
Load the new settings:
sudo sysctl -p
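To confirm that the settings took effect, you can read them back. On Linux, each sysctl is exposed as a file under /proc/sys. This is a minimal Python sketch that assumes this layout and uses made-up helper names; it does not handle sysctl names that contain dots inside a single component, such as some interface names.

```python
from pathlib import Path


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl name to its file under /proc/sys."""
    return Path(root) / name.replace(".", "/")


def read_sysctl(name, root="/proc/sys"):
    """Return the current value as a string, or None if it is not readable."""
    try:
        return sysctl_path(name, root).read_text().strip()
    except OSError:
        return None


for setting in ("net.ipv4.tcp_fin_timeout", "net.ipv4.tcp_tw_reuse"):
    print(setting, "=", read_sysctl(setting))
```

A value of None means the setting does not exist on this kernel or is not readable by the current user.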
There are other kernel TCP settings you can tune as well, e.g.:
sysctl -w fs.file-max=12000500
sysctl -w fs.nr_open=20000500
ulimit -n 20000000
sysctl -w net.ipv4.tcp_mem='10000000 10000000 10000000'
sysctl -w net.ipv4.tcp_rmem='1024 4096 16384'
sysctl -w net.ipv4.tcp_wmem='1024 4096 16384'
sysctl -w net.core.rmem_max=16384
sysctl -w net.core.wmem_max=16384
See https://phoenixframework.org/blog/the-road-to-2-million-websocket-connections and https://www.rabbitmq.com/networking.html#dealing-with-high-connection-churn