the development server
This document sets up a development server for willshake.
The approach used here is also suitable for production, and has been used in production without any problems. For the reasons why this is no longer used on the live site, see issues with production use.
1 motivation
Deploying the web site describes a process for setting up a web server to run willshake. I use it to ship willshake.net, and it’ll work on your computer, too. So what’s this all about?
This is about an alternative: an ad-hoc server that you can create on the fly without affecting the rest of your system, which may be preferable in some situations.
An ad-hoc server is particularly useful for development. You can run it from any directory, i.e. wherever you have a copy of the project. And it doesn’t require any messing with the permissions of that directory (to give Apache access). You also don’t have to mess with your Apache configuration, which is nice if you happen to be using it for anything else. And it’s easy to run more than one copy of the site at a time (on different port numbers), which can be useful for comparing versions.
It’s also useful for pre-rendering the site, since you can create and destroy it automatically.
1.1 caveat
Development is a form of rehearsal for production. Ideally, the development environment would be exactly the same as the production environment. Every difference that you introduce is just another opportunity for things to go wrong. And this technique introduces a lot of differences (in the form of generated Apache configuration).
For that reason, it’s important to have a staging area for testing closer to production.
2 mod_wsgi-express: an ad-hoc server
If you have mod_wsgi-express installed, then you can run the web site with a little script. The environment variables that are assumed by the configuration files need to be set as actual environment variables. Setting NO_WSGI is important here, since it prevents “our” WSGI configuration from conflicting with the one that express creates.
export SITE_ROOT=`pwd`
export DOCUMENT_ROOT="${SITE_ROOT}/site"
export DOMAIN=florizel
export NO_WSGI=true
mod_wsgi-express start-server 'server/getflow/getflow.wsgi' \
<<mod wsgi express options>>
--working-directory 'site' \
--document-root 'site' \
--log-to-terminal \
$*
This will start a new Apache server on port 8000, which should be fine for testing. log-to-terminal is a useful development option, but not necessary.
The final $* allows you to pass additional options when you call the script.
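Since the configuration files silently assume those environment variables, it can help to fail early when one is missing. Here is a minimal sketch of such a check. The helper name require_env is made up for illustration and is not part of willshake or mod_wsgi-express.

```shell
# Hypothetical helper: succeed only if every named variable is set and non-empty.
# The body runs in a subshell, so its working variables do not leak to the caller.
require_env() (
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"   # indirect lookup that also works in plain sh
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  exit "$missing"
)

# Usage, placed before the mod_wsgi-express call:
# require_env SITE_ROOT DOCUMENT_ROOT DOMAIN NO_WSGI || exit 1
```

The check only looks at whether each variable has a value, not whether the value makes sense.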
If you really want to use port 80, you need to specify that, and also set www-data as the user and group. You’ll also need to run the command as root, and first turn off the “real” Apache server. See deployment.
Most of the beauty of “express” is the fact that it generates a working Apache configuration for you. But you can also include your own customizations.
--include-file 'server/httpd.conf' \
To make the ad-hoc environment as close as possible to other environments, the base configuration is included.
Note that these rules are themselves added to the base Apache configuration that mod_wsgi-express creates. When you start the server, you can see its location, which will be something like /tmp/mod_wsgi-localhost:8000:1000/.
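If you want to look at the generated configuration from another script, you can guess the directory name. This sketch assumes the name is built from the host, the port, and the numeric user id, which is what the example path suggests. The function name express_server_root is made up for illustration.

```shell
# Assumption: the directory is named /tmp/mod_wsgi-<host>:<port>:<uid>,
# inferred from the example path above.
# Arguments (all optional): host, port, numeric user id.
express_server_root() {
  printf '/tmp/mod_wsgi-%s:%s:%s\n' "${1:-localhost}" "${2:-8000}" "${3:-$(id -u)}"
}

# List the generated files for the default development server, if it is running:
# ls "$(express_server_root)"
```

The location that the server prints at startup is the authoritative one. This is only a convenience.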
A convenient option, especially for development, specifies that the site should be restarted automatically whenever the Python code changes.
--reload-on-changes \
This does not apply to the httpd.conf. When that changes, the site needs to be restarted.
mod_wsgi-express has an option for compressing responses.
--compress-responses \
The web performance configuration already tells what types should be compressed. But this option is needed with express because otherwise the deflate module is not loaded at all.
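One way to confirm that compression is actually on is to look at the response headers. The sketch below reads headers on standard input and succeeds if they name a compressed encoding. The helper name is_compressed is made up for illustration, and the usage line assumes the development server is listening on port 8000.

```shell
# Hypothetical helper: read response headers on stdin and succeed
# if they include a Content-Encoding of gzip or deflate.
is_compressed() {
  grep -Eiq '^content-encoding:[[:space:]]*(gzip|deflate)'
}

# Usage against the development server (assumes port 8000):
# curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip' http://localhost:8000/ \
#   | is_compressed && echo compressed
```

Requesting a type that the web performance configuration does not compress would report nothing, so try it on an HTML page or a stylesheet.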
3 production utilities
This was for using express in production, which I’m not doing anymore. It’s basically the same as the development version, with a few different options.
3.1 starting the site
With that, I can use mod_wsgi-express to start the site.
mod_wsgi-express start-server 'server/getflow/getflow.wsgi' \
<<https options>>
<<mod wsgi express options>>
--working-directory 'public' \
--document-root 'public' \
--port '80' \
--user 'www-data' \
--group 'www-data' \
$*
It’s mostly the same as the one above used for development. The main difference is in the directory names. Also, it uses an explicit port number.
3.2 stopping the site
I have to restart the site every time I make a change to the extended Apache configuration.
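# Find the first mod_wsgi-express process. The [m] keeps grep from matching itself.
pid=$(ps -aux | grep '[m]od_wsgi-express' | head -n1 | awk '{print $2}')
# Stop here if nothing matched, so that kill never runs without a target.
[ -n "$pid" ] || { echo "No mod_wsgi-express process found." >&2; exit 1; }
ps -aux | grep "$pid"
echo "Kill $pid? Kill this script to cancel."
read -r
sudo kill "$pid"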
4 nice things about express
You can ship everything you need. Once the server has express installed, you can deliver everything that’s needed just by sending files. You don’t need to do any external configuration.
A corollary to that is, you can be agnostic of the location of things, since express generates the Apache configuration for you. I don’t think there’s any way to do that when using a plain Apache configuration. This makes it easy to use the same scripts in both development and production.
5 issues with production use
Graham Dumpleton, the creator of mod_wsgi and mod_wsgi-express, says that the latter is suitable for production use, and I used it in production for several months. A few factors keep me from continuing to use it in production.
5.1 port number
The main problem is that you can’t run other public sites from the same server. This is because each site creates a new Apache instance, and only one application is allowed to listen on port 80 at a time. Indeed, to use express for a public site, you have to turn off the “main” Apache service.
Fair enough, but this means that even the staging site couldn’t be hosted on the same server. Whereas, with regular old Apache virtual hosts, it’s easy to host multiple sites on the same server.
5.1.1 reverse proxying (failed)
But what if you reverse proxy the public sites to internal sites that are running on other ports?
The short answer is, don’t bother. But by all means, read on.
So, for example, if willshake is running on port 8000, it would simply pass requests for the “real” address through to localhost:8000. This is easy to configure in a virtual host using mod_proxy.
<VirtualHost *:80>
ServerName willshake.net
ProxyPass / http://localhost:8000/
ProxyPassReverse / http://localhost:8000/
</VirtualHost>
This requires the “proxy” module to be enabled.
a2enmod proxy_http
You can proxy any number of sites this way.
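For what it’s worth, here is a sketch of how those virtual hosts could be generated, one per site. The function name proxy_vhost and the second host name and port are made up for illustration.

```shell
# Hypothetical helper: print a reverse-proxy virtual host for one site.
# $1 is the public server name, $2 the local port of its express instance.
proxy_vhost() {
  cat <<EOF
<VirtualHost *:80>
ServerName $1
ProxyPass / http://localhost:$2/
ProxyPassReverse / http://localhost:$2/
</VirtualHost>
EOF
}

# The second name and port are placeholders.
proxy_vhost willshake.net 8000
proxy_vhost staging.example.test 8001
```

Each call prints the same block as above with the name and port filled in, so the output could be redirected into a file that Apache includes.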
This setup is (potentially) simpler in that you don’t need to run any of the mod_wsgi-express instances on port 80, so they don’t need root access, and in turn all of that permissions business would be moot.
But there are at least two problems with the proxy, and they’re both showstoppers.
First, it does not play well with persistent connections. If nothing else, the proxy caused the Connection: keep-alive header to be removed from responses, no matter what I tried. I assume that’s because it knew that, for some reason, persistent connections were not in fact available.
Worse, when I used that setup locally, I would get intermittent 502 errors, saying
The proxy server received an invalid response from an upstream server. The proxy server could not handle the request. Reason: Error reading from remote server.
This would happen seemingly randomly across all requests, including static files. Even ones that hadn’t changed recently.
I think that both of these issues have something to do with how the proxy handles concurrent requests. I don’t know anything about how mod_proxy actually works, but it clearly adds multiple points of failure in this scenario.
So I abandoned this approach.
5.2 letsencrypt
It’s also not straightforward to use express with letsencrypt, which does know about “normal” Apache usage. (See transport level security.) To make letsencrypt work with an express site, I had to make it temporarily a normal Apache site. I think this would cause a problem for the auto-renewal (though I haven’t tried it yet).