SWITCH Cloud Blog


1 Comment

Buffering issues when publishing Openstack dashboard and API services behind a HTTP reverse proxy

At SWITCH we operate SWITCHengines, a public OpenStack cloud for Swiss universities. To expose our services to the public Internet, we use the popular open source nginx reverse proxy. For the sake of simplicity we show in the following figure a simplified schema of our infrastructure, with only the components relevant to this article. Every endpoint API service and the Horizon Dashboard are available behind a reverse proxy.

SWITCHEngines reverse proxy

SWITCHEngines reverse proxy

The Problem:

Our users reported not being able to upload images using the Horizon web interface when images were large files over 10GB.

We did some tests ourselves and we noticed that the image upload process was too slow. Looking at log files, we noticed that the upload process for an image of 10GB was slow enough to make the Keystone auth token expire before the end of the process.
Why was uploading so slow ?

The analysis:

The usual scenario for the reverse proxy is load balancing to a pool of web servers. There is a slow network, like the Internet, between the users and the proxy, and there is a fast network between the proxy and the web servers.

Typical Reverse Proxy Architecture

Typical Reverse Proxy Architecture

The goal is to keep the Web Servers busy the smallest possible amount of time serving the client requests. To achieve this goal, the reverse proxy buffers the requests, and interacts with the web server only when the request is completely cached. Then the web server interacts only with the proxy on the fast network and gets rid of the latency of the slow network.
If we look at the default settings of Nginx we note that proxy_buffering is enabled.

When buffering is enabled, nginx receives a response from the proxied server as soon as possible, saving it into the buffers set by the proxy_buffer_size and proxy_buffers directives. If the whole response does not fit into memory, a part of it can be saved to a temporary file on the disk. Writing to temporary files is controlled by the proxy_max_temp_file_size and proxy_temp_file_write_size directives.

However the proxy_buffering configuration directive refers to the traffic from the web server to the user, the HTTP response that is the largest traffic when a user wants to download a web page.

In our case the user is uploading an image to the web server and the largest traffic is in the user request, not in the server response. Luckily in nginx 1.7 a new configuration option has been introduced: proxy_request_buffering

This is also enabled by default:

When buffering is enabled, the entire request body is read from the client before sending the request to a proxied server.
When buffering is disabled, the request body is sent to the proxied server immediately as it is received. In this case, the request cannot be passed to the next server if nginx already started sending the request body.

But what happens if the user’s network is also a very fast network such as SWITCHlan? And does it make sense to have such large buffers for big files over 10GB ?

Let’s see what happens when a users tries to upload an image from his computer to Glance using the Horizon web interface. You will be surprised to know that the image is buffered 3 times.

Components involved in the image upload process

Components involved in the image upload process

The user has to wait for the image to be fully uploaded to the first nginx server in front of the Horizon server, then the Horizon application stores completely the image again. At this point the public API Glance is again published behind a nginx reverse proxy and we have to wait again the time to buffer the image and then finally the last transfer to Glance.

This 3 times buffering leads to 4 upload operations from one component to another. A 10GB images then requires 10GB on the Internet and 30GB of machine to machine traffic in the OpenStack LAN.

The solution:

Buffering does not make sense in our scenario and introduces long waiting times for the buffers to get filled up.

To improve this situation we upgraded nginx to 1.8 and we configured both proxy_buffering and proxy_request_buffering to off. With this new configuration the uploaded images are buffered only once, at the Horizon server. The process of image upload with web interface is now reasonably fast and we don’t have Keystone auth tokens expiring anymore.