the development server

This document sets up a development server for willshake.

The approach used here is also suitable for production, and has been used in production without any problems. For the reasons why this is no longer used on the live site, see issues with production use.

1 motivation

Deploying the web site describes a process for setting up a web server to run willshake. I use it to ship willshake.net, and it’ll work on your computer, too. So what’s this all about?

This is about an alternative: an ad-hoc server that you can create on the fly without affecting the rest of your system, which may be preferable in some situations.

An ad-hoc server is particularly useful for development. You can run it from any directory, i.e. wherever you have a copy of the project. And it doesn’t require any messing with the permissions of that directory (to give Apache access). You also don’t have to mess with your Apache configuration, which is nice if you happen to be using it for anything else. And it’s easy to run more than one copy of the site at a time (on different port numbers), which can be useful for comparing versions.

It’s also useful for pre-rendering the site, since you can create and destroy it automatically.
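
As a rough sketch of that create-and-destroy pattern (not the actual pre-rendering script), a small shell function can start a server in the background, run a task against it, and then stop the server again. The function name with_server and both commands in the usage note are assumptions made up for illustration.

```sh
# Run a task while a temporary background server is up, then tear it down.
# Usage: with_server 'SERVER COMMAND' 'TASK COMMAND'
with_server() {
    server_cmd=$1
    task_cmd=$2

    # Start the server in the background and remember its process ID.
    # "exec" makes the server replace the helper shell, so $! is the server.
    sh -c "exec $server_cmd" &
    server_pid=$!

    # Run the task and keep its exit status.
    status=0
    sh -c "$task_cmd" || status=$?

    # Stop the server and wait until it is gone.
    kill "$server_pid" 2>/dev/null || true
    wait "$server_pid" 2>/dev/null || true

    return "$status"
}
```

For example, with_server './start-site --port 8123' './prerender' would run a hypothetical prerender script against a temporary copy of the site. A real version would probably also wait until the server accepts connections before starting the task.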

1.1 caveat

Development is a form of rehearsal for production. Ideally, the development environment would be exactly the same as the production environment. Every difference that you introduce is just another opportunity for things to go wrong. And this technique introduces a lot of differences (in the form of generated Apache configuration).

For that reason, it’s important to have a staging area for testing closer to production.

2 `mod_wsgi-express` : an ad-hoc server

If you have mod_wsgi-express installed, then you can run the web site with a little script. The variables that the configuration files assume need to be set as actual environment variables. Setting NO_WSGI is important here, to prevent “our” WSGI configuration from conflicting with the one that express creates.

start-site
sh

Run the web site

export SITE_ROOT="$(pwd)"
export DOCUMENT_ROOT="${SITE_ROOT}/site"
export DOMAIN=florizel
export NO_WSGI=true

mod_wsgi-express start-server 'server/getflow/getflow.wsgi' \
				 <<mod wsgi express options>>
				 --working-directory 'site' \
				 --document-root 'site' \
				 --log-to-terminal \
				 "$@"

This will start a new Apache server on port 8000 (the default), which should be fine for testing. --log-to-terminal is a useful development option, but not necessary. The final "$@" passes along any additional options that you give when you call the script, keeping each argument intact.
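
To illustrate how extra options are passed through, here is a sketch that only prints the command line instead of running it, leaving out the included options for brevity. The function name start_site_command is made up for this example; --port is a mod_wsgi-express option that the production script in this document also uses. The sketch forwards its arguments with "$@", which keeps each one intact even if it contains spaces.

```sh
# Print the command line that a start-site style wrapper would run,
# one argument per line, with the caller's extra options appended last.
start_site_command() {
    printf '%s\n' mod_wsgi-express start-server 'server/getflow/getflow.wsgi' \
        --working-directory 'site' \
        --document-root 'site' \
        --log-to-terminal \
        "$@"
}
```

Calling start_site_command --port 8001 prints the fixed options followed by --port and 8001, which is how a second copy of the site could be run next to one on the default port.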

If you really want to use port 80, you need to specify that with --port, and also set www-data as the user and group with --user and --group. You’ll also need to run the command as root, and first turn off the “real” Apache server. See deployment.

Most of the beauty of “express” is the fact that it generates a working Apache configuration for you. But you can also include your own customizations.

mod wsgi express options
sh

--include-file 'server/httpd.conf' \

To make the ad-hoc environment as close as possible to other environments, the base configuration is included.

Note that these rules are themselves added to the base Apache configuration that mod_wsgi-express creates. When you start the server, you can see its location, which will be something like /tmp/mod_wsgi-localhost:8000:1000/.

A convenient option, especially for development, specifies that the site should be restarted automatically whenever the Python code changes.

mod wsgi express options
sh

--reload-on-changes \

This does not apply to httpd.conf. When that file changes, the site needs to be restarted by hand.

mod_wsgi-express has an option for compressing responses.

mod wsgi express options
sh

--compress-responses \

The web performance configuration already specifies which types should be compressed. But this option is needed with express because otherwise the deflate module is not loaded at all.
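
One way to check that compression is actually active is to look for a Content-Encoding header in a response. This is a minimal sketch; the function name is_compressed is an assumption, and the URL in the usage note assumes the development server on port 8000.

```sh
# Succeeds if the response headers on standard input declare a compressed body.
is_compressed() {
    grep -Eqi '^content-encoding:[[:space:]]*(gzip|deflate)'
}
```

It could be used as curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip' 'http://localhost:8000/' | is_compressed && echo compressed, which asks for a compressed response and prints only the headers for the function to inspect.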

3 production utilities

This was for using express in production, which I’m not doing anymore. It’s basically the same as the development version, with a few different options.

3.1 starting the site

With that, I can use mod_wsgi-express to start the site.

server/start-site
sh

mod_wsgi-express start-server 'server/getflow/getflow.wsgi' \
				 <<https options>>
				 <<mod wsgi express options>>
				 --working-directory 'public' \
				 --document-root 'public' \
				 --port '80' \
				 --user 'www-data' \
				 --group 'www-data' \
				 "$@"

It’s mostly the same as the one above used for development. The main differences are the directory names, the explicit port number, and the www-data user and group.

3.2 stopping the site

I have to restart the site every time I make a change to the extended Apache configuration.

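server/stop-site
sh

#!/bin/bash
# Find the first mod_wsgi-express process. The [m] keeps grep from matching itself.
pid=$(ps aux | grep '[m]od_wsgi-express' | head -n1 | awk '{print $2}')
# Without a match, an empty pattern would list everything and kill would fail.
[ -n "$pid" ] || { echo "No mod_wsgi-express process found."; exit 1; }
# Show what is about to be killed, then wait for Enter.
ps aux | grep "$pid"
echo "Kill $pid?  Kill this script to cancel."
read -r
sudo kill "$pid"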

4 nice things about express

You can ship everything you need. Once the server has express installed, you can deliver everything that’s needed just by sending files. You don’t need to do any external configuration.

A corollary to that is, you can be agnostic of the location of things, since express generates the Apache configuration for you. I don’t think there’s any way to do that when using a plain Apache configuration. This makes it easy to use the same scripts in both development and production.

5 issues with production use

Graham Dumpleton, the creator of mod_wsgi and mod_wsgi-express, says that the latter is suitable for production use, and I used it in production for several months. A few factors keep me from continuing to use it in production.

5.1 port number

The main problem is that you can’t run other public sites from the same server. This is because each site creates a new Apache instance, and only one application is allowed to listen on port 80 at a time. Indeed, to use express for a public site, you have to turn off the “main” Apache service.
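
Before starting express on port 80, it can help to check whether something is already listening there. The sketch below assumes a Linux system where ss -ltn lists listening TCP sockets with the local address and port in the fourth column; the function name port_is_listening is made up for this example.

```sh
# Succeeds if the socket listing on standard input shows a listener on PORT.
# Usage: ss -ltn | port_is_listening PORT
port_is_listening() {
    awk -v port="$1" '
        # Assumed layout: State Recv-Q Send-Q Local-Address:Port Peer-Address:Port
        $1 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}
```

For example, ss -ltn | port_is_listening 80 && echo "port 80 is taken" would report that the “main” Apache service (or anything else) still holds the port.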

Fair enough, but this means that even the staging site couldn’t be hosted on the same server, whereas with regular old Apache virtual hosts, it’s easy to host multiple sites on the same server.

5.1.1 reverse proxying (failed)

But what if you reverse proxy the public sites to internal sites that are running on other ports?

The short answer is, don’t bother. But by all means, read on.

So, for example, if willshake is running on port 8000, the main Apache server would simply pass requests for the “real” address through to localhost:8000. This is easy to configure in a virtual host using mod_proxy.

conf

Proxying a site from Apache

<VirtualHost *:80>
  ServerName willshake.net
  ProxyPass        / http://localhost:8000/
  ProxyPassReverse / http://localhost:8000/
</VirtualHost>

This requires the “proxy” modules to be enabled. On Debian-style systems, enabling proxy_http also enables the proxy module that it depends on.

sh

a2enmod proxy_http

You can proxy any number of sites this way.
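
Since each proxied site needs the same few lines, the virtual hosts could be generated. The following is a sketch under assumptions: the function name proxy_vhost is made up, and the staging name and port in the usage note are hypothetical.

```sh
# Print an Apache virtual host that proxies one public name to a local port.
# Usage: proxy_vhost SERVER_NAME LOCAL_PORT
proxy_vhost() {
    cat <<EOF
<VirtualHost *:80>
  ServerName $1
  ProxyPass        / http://localhost:$2/
  ProxyPassReverse / http://localhost:$2/
</VirtualHost>
EOF
}
```

For example, proxy_vhost willshake.net 8000 reproduces the block above, and proxy_vhost staging.example.net 8001 would produce one for a second express instance.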

This setup is (potentially) simpler in that you don’t need to run any of the mod_wsgi-express instances on port 80, so they don’t need root access, and in turn all of that permissions business would be moot.

But there are at least two problems with the proxy, and they’re both showstoppers.

First, it does not play well with persistent connections. If nothing else, the proxy caused the Connection: keep-alive header to be removed from responses, no matter what I tried. I assume that’s because it knew that persistent connections were not in fact available for some reason.

Worse, when I used that setup locally, I would get intermittent 502 errors, saying

The proxy server received an invalid response from an upstream server.
The proxy server could not handle the request.

Reason: Error reading from remote server.

This would happen seemingly randomly across all requests, including static files. Even ones that hadn’t changed recently.
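
To get a feel for how often such errors occur, one could request the same URL many times and tally the status codes. This is only a sketch of that idea, not something from the original setup; the function name tally_status and the URL in the usage note are assumptions.

```sh
# Count how often each HTTP status code appears on standard input (one per line).
tally_status() {
    awk 'NF { count[$1]++ } END { for (code in count) print code, count[code] }' | sort
}
```

It could be fed by a loop such as for i in $(seq 100); do curl -s -o /dev/null -w '%{http_code}\n' 'http://localhost/'; done | tally_status, which prints one line per status code with its count.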

I think that both of these issues have something to do with how the proxy handles concurrent requests. I don’t know anything about how mod_proxy actually works, but it clearly adds multiple points of failure in this scenario.

So I abandoned this approach.

5.2 letsencrypt

It’s also not straightforward to use express with letsencrypt, which does know about “normal” Apache usage. (See transport level security.) To make letsencrypt work with an express site, I had to temporarily turn it into a normal Apache site. I think this would cause a problem for the auto-renewal (though I haven’t tried it yet).