images

He falls to such perusal of my face
As he would draw it.
Hamlet

The image of it gives me content already; and I
trust it will grow to a most prosperous perfection.
Measure for Measure 3.1

Images are a special type of media. In “the collection,” we discuss general considerations for media resources. Here we cover usage and processing specific to images.

Open questions:

where do we set compression settings?
how do we set up special and batch conversions?

Be it so, that willshake shall include images of the plays, integrated with the text.

While the central subject of this project is a text, its mission cannot be carried out with text alone. In particular, there are centuries of beautiful artwork inspired by the works (and especially the plays) of Shakespeare, which were made to be enjoyed. If not here, then where?

Moreover, there are many illustrations of the plays, which were created not only to engage readers and enliven the print editions, but also to aid in comprehension of the story. It is our view that understanding fuels pleasure; therefore we shall incorporate images into this edition wherever and however they may best serve.

Put another way, images engage a human faculty that text alone does not reach.

Indeed, the value of images is not limited to how they can help in Shakespeare-related matters. For all the same reasons (and many more), images are a crucial part of any communication that would engage someone fully. And that is just the aim of these very documents. It should be just as easy for a programmer to add a picture to the program as to add a variable. That’s part of our aim here.

3 image records

Image records are stored in the catalog. Each image gets its own file.

3.1 naming images

Elsewhere, we talk about the naming of resources in the collection. We assign to each image a unique name, called its key (or sometimes identifier).

Why? Because we’ll need to reference the image, of course. While most of the images come from external systems that themselves have unique identifiers for the images, these vary wildly in their format and level of detail.

We observe the following guidelines when assigning an image key. It’s not an exact science.

The main images in our “catalog” are those related to Shakespeare’s plays. The general form for these is {source}-{target},

Where {source} is, in order of preference (as applicable):

Artist last name (use spelling in English Wikipedia address)
Actor last name
Production company short name

And so on. Also,

use both last names when two contributors are involved, e.g.,
- {engraver}-{painter}
- {painter}-{actor}
- but not for a photographer and actor; then just use the actor

and {target} is, likewise,

{play}, if the image is of the play generally
{play}-poster, for posters
{play}-{section}, where a particular section is depicted
{role}, where a particular role is depicted in isolation (that is, where you can’t tell which scene it is)
{role}-{role}, where two roles are likewise depicted
{play}-{role}, for the obscurer case (e.g. the “Juliet” in MM) where above would be ambiguous
{target}-{version} where version specifies medium or otherwise disambiguates from a similar entry

Note that some keys begin with digits. This means that you can’t use them at the beginning of a DOM identifier without causing problems.
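As a rough sketch of the scheme above, here is how a play-image key might be assembled, along with one way to sidestep the leading-digit problem. The function names and the image- prefix are hypothetical; nothing in the build defines them.

```python
def image_key(source, target, version=None):
    """Join the parts of a play-image key: {source}-{target}[-{version}]."""
    parts = [source, target]
    if version:
        parts.append(version)
    return "-".join(parts)


def dom_id(key, prefix="image-"):
    """Return an identifier that never starts with a digit.

    The prefix is an assumed convention, not one defined in this document.
    """
    return prefix + key


print(image_key("Abbey", "Ham-3.2"))  # Abbey-Ham-3.2
print(dom_id("1629-Puck"))            # image-1629-Puck
```

Prefixing every key, rather than only those that begin with a digit, keeps the mapping from key to identifier uniform.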

if it’s a production, shouldn’t I include the year? I’ve started doing this, but where does it stop?

3.1.1 for people

Many of the images in the catalog are of people. We define some special guidelines for assigning keys to these images.

The general form is {person}{more},

Where {person} is the person key and {more} is, in order

{-year} when available (prefix ‘c’ for circa)
- prefer the year when the image was made over the year it was published
  - but use publication year to disambiguate if nothing else is available
{-artist} when available (as with play images), but
- include photographers
- natural order (i.e., photographer before painter before engraver)
{-version} such other disambiguation as is needed (e.g., NPG id)
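The ordering above can be sketched as a small function. This is only an illustration of the guidelines: the function name, its parameters, and the example values are made up, and the caller is assumed to pass artists already in natural order.

```python
def person_image_key(person, year=None, circa=False, artists=(), version=None):
    """Build {person}{-year}{-artist}{-version}, skipping whatever is missing."""
    parts = [person]
    if year is not None:
        # Prefix 'c' for circa, as in the guideline.
        parts.append(("c" if circa else "") + str(year))
    # Artists are assumed to be given in natural order
    # (photographer before painter before engraver).
    parts.extend(artists)
    if version:
        parts.append(version)
    return "-".join(parts)


# Hypothetical examples; these keys are not taken from the catalog.
print(person_image_key("Ellen_Terry", year=1889, artists=["Sargent"]))
print(person_image_key("Ellen_Terry", year=1880, circa=True))
```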

3.2 image metadata

Image records contain some standard and some special metadata.

Here, for example, is catalog-Abbey-Ham-3.2.jpg.xml:

<by>
	<artist person="Edwin_Austin_Abbey" />
</by>
<of play="Ham" section="3.2" anchor="Do_you_think_I" />
<from commons-file="The_Play_Scene_in_Hamlet.jpg" />
<note>
	interesting but lower-res version at The_Play_Szene_in_“Hamlet”.jpg
</note>

I have RelaxNG schemas for all this stuff, which are yet to be imported. Here would be the place to use those, and to apply validation rules.
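Short of schema validation, a record can at least be read with the standard library. The sketch below wraps the fragment in a throwaway root element (the record has no single root of its own) and pulls out a few fields. The function name and the shape of the returned dictionary are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

RECORD = """\
<by>
	<artist person="Edwin_Austin_Abbey" />
</by>
<of play="Ham" section="3.2" anchor="Do_you_think_I" />
<from commons-file="The_Play_Scene_in_Hamlet.jpg" />
"""


def read_record(text):
    """Return a few fields from an image record fragment."""
    # The record is a sequence of elements, so give it a temporary root.
    root = ET.fromstring("<r>" + text + "</r>")
    artist = root.find("by/artist")
    of = root.find("of")
    source = root.find("from")
    return {
        "artist": artist.get("person") if artist is not None else None,
        "play": of.get("play") if of is not None else None,
        "section": of.get("section") if of is not None else None,
        "commons_file": source.get("commons-file") if source is not None else None,
    }


print(read_record(RECORD))
```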

3.3 filenames

The whole purpose of the image records is to store information about the image. But since we use a file-based build system, there are certain things that we want to know about the image a priori—without even opening the record. We encode those bits in the filename, like so:

{source type}-{name}.{format}.xml

So, an image record like

catalog-Abbey-Ham-3.2.jpg.xml

would break down as follows: the source type is catalog, the name is Abbey-Ham-3.2, and the format is jpg.

We will match on these patterns to route the files through the build process. At each point we make sure that the files have an extension representing their actual content.
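The build rules do this matching with globs, but the same breakdown can be sketched with a regular expression. The pattern below is an assumption about the naming rule: the source type is taken to contain no hyphen and the format no dot.

```python
import re

# {source type}-{name}.{format}.xml
RECORD_NAME = re.compile(
    r"^(?P<source_type>[^-]+)-(?P<name>.+)\.(?P<format>[^.]+)\.xml$"
)


def parse_record_filename(filename):
    """Split an image record filename into (source type, name, format)."""
    match = RECORD_NAME.match(filename)
    if match is None:
        raise ValueError("not an image record filename: " + filename)
    return match.group("source_type", "name", "format")


print(parse_record_filename("catalog-Abbey-Ham-3.2.jpg.xml"))
# ('catalog', 'Abbey-Ham-3.2', 'jpg')
```

Note that the name itself may contain both hyphens and dots, which is why only the first hyphen and the last two dots are treated as separators.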

4 getting images

The workflow for getting resources is generally discussed in the collection. Here, we implement that workflow for images.

This section deals with “local” images, which we already have on hand, and “catalog” images, which come from a remote system with an API. The other type (internet) will be added later.

4.1 ship all local images

Local images are the easiest to deal with. Not only do we already have the file handy, but we’re also going to make the assumption that they’re all going to be used as-is (i.e. without projections).

tup

: foreach $(ROOT)/assets/images/* \
|> !link_from \
|> $(SITE)/static/images/%b

4.2 get metadata from the source catalog

For “catalog” type media, we don’t store the download location directly. Instead, we store the name that uniquely identifies the image’s record in the remote catalog system. That system will have some kind of API that allows us to look up their metadata for that image, which, among other things, will include its ultimate download location.

So we start by getting the other system’s metadata for each catalog image.

tup

Get the remote system’s metadata for all catalog images

: foreach $(IMAGE_RECORDS)/catalog-*.xml \
  | $(PROGRAM)/get-resource \
    $(PROGRAM)/images/* \
|> ^o get image metadata %B^ \
   $(PROGRAM)/images/get_image_metadata \
   $(PROGRAM)/get-resource \
   $(PROGRAM)/images/commons-query-imageinfo "%f" "%o" "%g" \
|> $(IMAGE_METADATA)/%g.xml

This build rule breaks down the image filename structure that we described earlier. First, it only deals with catalog- images. The image’s name and format are captured by the globbing star (*), which is made available in the %g flag. The %g that we pass to the command is actually not used by the system, but supports aliasing in the internet cache; see get-resource.

In principle, the rule could be agnostic of the records’ file extension by saying catalog-*.* and making the output file %g.%e. But as of its latest version, this causes Tup to say

tup error: %e is only valid with a foreach rule for files that have extensions.
 -- Path: './database/images/catalog-1629-Puck.jpg.xml'
tup error: Unable to parse :-rule from run script: ': foreach $(IMAGE_RECORDS)/catalog-*.*   '

It seems to me that the file does have an extension, but no matter.

Getting the metadata means

1. figuring out the exact API call to make (that is, what URL)
2. calling it and storing the result

Step 2 will be handled by get-resource, so that we can take advantage of its caching. This is especially important for remote systems that may limit the number of calls you can make (which Wikimedia Commons does).

program/images/get image metadata
python

import sys
from io import BytesIO
from subprocess import call
from lxml import etree
from urllib import parse

(__, get_resource, query_file,
 image_record, out_file, name) = sys.argv

API_URL = "http://commons.wikimedia.org/w/api.php"

# Would be less roundabout if only you could parse a fragment!
with open(image_record, 'r', encoding='utf-8') as f:
    record = ('<r>' + f.read() + '</r>')
    context = etree.iterparse(BytesIO(record.encode('utf-8')), tag='from', events=['start'])
    __, element = next(context)
    commons_file = element.get('commons-file')

# We store the query in a one-parameter-per-line format.  The values may need
# escaping, but we need to leave the equal signs alone.
with open(query_file, 'r') as f:
    query = '&'.join(parse.quote(line.strip(), safe='=') for line in f)

url = API_URL + '?' + query + parse.quote(commons_file)

call([get_resource, url.encode('utf-8'), out_file, name + '__metadata'])

Right now, the only remote catalog system we deal with is Wikimedia Commons. The build rule passes the following file, containing the API parameters for a basic metadata request:

program/images/commons-query-imageinfo
text

Wikimedia query for getting some basic image metadata

action=query
prop=imageinfo
format=xml
iiprop=comment|parsedcomment|url|size|dimensions|mime|thumbmime|mediatype|metadata|archivename|bitdepth
iilimit=10
titles=File:

It expects the filename you’re inquiring about to be appended at the end. (It’s also formatted for readability; see the script.)
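To make the joining step concrete, here is the URL assembly from the script in isolation, fed with a shortened version of the query. One difference is an assumption of this sketch: blank lines are skipped, which the script above does not do.

```python
from urllib import parse

API_URL = "http://commons.wikimedia.org/w/api.php"


def build_query_url(query_lines, commons_file):
    """Join one-parameter-per-line query text into a URL for one file."""
    query = "&".join(
        # Escape the values but leave the equal signs alone.
        parse.quote(line.strip(), safe="=")
        for line in query_lines
        if line.strip()  # skipping blank lines is an addition in this sketch
    )
    # The query ends with "titles=File:", so the filename goes right after it.
    return API_URL + "?" + query + parse.quote(commons_file)


lines = ["action=query\n", "iiprop=url|size\n", "titles=File:\n"]
print(build_query_url(lines, "The_Play_Scene_in_Hamlet.jpg"))
```

The pipe and colon come out percent-encoded (%7C and %3A), which is why the query file can stay readable.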

If all goes well (that is, if the file actually exists), the query returns an XML document containing certain information about the image. Specifically, it will include the fields we requested with the iiprop parameter. This is a kind of “meta-metadata,” in the sense that it’s mostly about the remote system’s record of the image and not so much about the image itself. The only thing relevant to our present purpose is url, the URL of the actual image file, which can be extracted with a simple XPath expression.

program/images/extract commons image url.xsl
xsl

Get the download URL of a Wikimedia Commons image

<xsl:output method="text" />
<xsl:template match="/">
	<xsl:value-of select="(//ii)[1]/@url"/>
</xsl:template>
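For comparison, the same extraction can be sketched in Python. The build uses the XSL transform, not this. The sample response is abridged and made up; its shape follows what the imageinfo query is expected to return, and the URL in it is a placeholder.

```python
import xml.etree.ElementTree as ET

# An abridged, invented response; real ones carry many more attributes.
SAMPLE = """<api><query><pages><page title="File:Example.jpg"><imageinfo>
<ii url="https://upload.example.org/Example.jpg" mime="image/jpeg" />
</imageinfo></page></pages></query></api>"""


def first_image_url(xml_text):
    """Return the url attribute of the first ii element, or '' if none."""
    root = ET.fromstring(xml_text)
    first = next(root.iter("ii"), None)
    return first.get("url", "") if first is not None else ""


print(first_image_url(SAMPLE))  # https://upload.example.org/Example.jpg
```

Like the transform, this yields an empty string when the response has no ii element, for instance when the file does not exist.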

So we can download now, right? Sure. But first we write the download location to its own file.

tup

Write download URL to the build graph

: foreach $(IMAGE_METADATA)/*.xml \
  | $(PROGRAM)/images/* \
|> ^o get catalog image url %B^ \
   xsltproc $(PROGRAM)/images/extract_commons_image_url.xsl "%f" > "%o" \
|> $(IMAGE_LOCATION)/%B.txt

This apparently overzealous step is very effective in preventing needless calls to get-resource, since downstream build actions will not be triggered unless the download location actually changes, regardless of what refactoring we may do upstream.

4.3 get the pictures

Now we know where the images live. Let’s go get them one by one.

tup

: foreach $(IMAGE_LOCATION)/* \
  | $(PROGRAM)/get-resource \
|> ^o get image resource %B ^ \
   $(PROGRAM)/get-resource "`cat '%f'`" "%o" "%B" \
|> $(IMAGES)/%B

4.4 extract metadata

This section is not used right now. All of the image records were written manually by using information from web sites, mainly Wikimedia Commons. The objective of the code in this section is to attempt to automate the collection of such metadata using the MediaWiki API. Mapping records from their catalog to ours is a somewhat messy process, requiring judgement calls and lateral research, so the intention is not to fully automate the process. However, this could be useful as a tool for helping to reduce or audit the manual work.

This is where we get the things people usually think of as “metadata”: the title, artist, year, medium, and so on. The code is here for reference and is not formally used at this time. The metadata that we have in our image records was entered manually from the web site.

A Wikimedia query for getting image content?

action=query
prop=revisions
format=txt
rvprop=content
rvlimit=1
rvexpandtemplates=1
rvgeneratexml=1
titles=File:

4.4.2 script

program/images/extract-metadata.sh
sh

This is a script to run a given MediaWiki query.

out_dir="out/commons-query-content-xml"

# Work from the script's own directory so that relative paths resolve.
script_dir="$(cd "$(dirname "$0")" && pwd)"
pushd "$script_dir" > /dev/null

# The transform is addressed by its Windows path (this assumes Cygwin).
extract_transform_win="$(cygpath -wa extract-metadata.xsl)"

find "$out_dir" -iname '*.xml' | while IFS= read -r fullpath; do
	file="${fullpath##*/}"
	key="${file%.xml}"
	echo "$key" 1>&2
	echo
	echo "<image-ex key='$key'>"
	# Keep only the <root>...</root> part of the API response.
	sed -n "/<root\>/,/<\/root\>/p" "$fullpath" \
		| xslt "$extract_transform_win"
	echo
	echo "</image-ex>"
done

popd > /dev/null

4.4.3 query for templates

program/images/commons-query-templates
text

Another Wikimedia query for getting templates?

action=query
prop=templates
format=xml
tlnamespace=0
tllimit=max
tltemplates=Template:Information
export=
exportnowrap=
titles=File:

This script runs the extract-metadata transform against the output from the content-xml query.

This script updates a local mirror of Wikimedia Commons API results for the imageinfo (or a provided) query on all images in the image index.

4.4.4 transform

This transform takes as its input the output from the content-xml query against the MediaWiki API.

program/images/extract-metadata.xsl
xsl

Get the metadata from the MediaWiki template content.

<xsl:output omit-xml-declaration="yes" indent="yes" />

<xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
<xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'"/>

<xsl:template match="node()|@*" />

<xsl:template match="/">
	<xsl:choose>
		<xsl:when test="root">
			<xsl:apply-templates select="root/*" />
		</xsl:when>
		<xsl:otherwise>
			<error>No root!</error>
		</xsl:otherwise>
	</xsl:choose>
</xsl:template>

<xsl:template match="h" />


<xsl:template match="value//node()">
	<xsl:copy>
		<xsl:apply-templates select="node()|@*" />
	</xsl:copy>
</xsl:template>

<xsl:template match="value//text()">
	<xsl:value-of select="normalize-space()"/>
</xsl:template>
<xsl:template match="value//text()[normalize-space() = '']" />

<xsl:template match="value//template">
	<xsl:element name="{translate(normalize-space(title), ' ', '-')}">
		<xsl:apply-templates select="part" />
	</xsl:element>
</xsl:template>

<xsl:template match="root/template" />
<xsl:template match="root/template[starts-with(title, 'Information') or starts-with(title, 'Artwork')]">
	<xsl:apply-templates />
</xsl:template>

<xsl:template match="template/part">
	<xsl:variable name="name">
		<xsl:apply-templates select="name" mode="get-name" />
	</xsl:variable>

	<xsl:element name="{translate(normalize-space($name), concat(' ', $upper), concat('-', $lower))}">
		<xsl:apply-templates select="value" />
	</xsl:element>
</xsl:template>

<xsl:template match="template/part[contains('12', name)]">
	<xsl:value-of select="value"/>
</xsl:template>

<xsl:template match="part[contains(translate(name, 'S', 's'), 'source')]" />
<xsl:template match="part[contains(translate(name, 'P', 'p'), 'permission')]" />
<xsl:template match="part[contains(translate(name, 'O', 'o'), 'other_versions')]" />

<xsl:template mode="get-name" match="name">
	<xsl:value-of select="."/>
</xsl:template>
<xsl:template mode="get-name" match="name[.='']">noname</xsl:template>

<xsl:template match="part/value">
	<xsl:apply-templates select="node()|@*" />
</xsl:template>

<xsl:template match="value[template/title[.='en']][count(template/part) = 1]">
	<xsl:apply-templates select="template/part/value" />
</xsl:template>

<xsl:template match="template[contains(title, ':')]" priority="1">
	<xsl:element name="{substring-before(title, ':')}">
		<xsl:value-of select="substring-after(title, ':')"/>
	</xsl:element>
</xsl:template>

<xsl:template match="template[string-length(title) = 2]">
	<xsl:value-of select="value"/>
</xsl:template>

<xsl:template match="template[string-length(title) = 2 and contains('ru ja', title)]">
	<skipping-template named="{title}" />
</xsl:template>

4.4.5 run extract metadata

program/images/run-extract-metadata.sh
sh

out_file="out/image-metadata.xml"

echo "<image-metadata>" > "$out_file"
echo >> "$out_file"
./extract-metadata.sh >> "$out_file"
echo >> "$out_file"
echo "</image-metadata>" >> "$out_file"

5 projections

All of the images on willshake come from physical forms: paintings, etchings, prints, drawings, statues. Some remain physically unique (such as most of the paintings) and some were made for reproduction (such as most of the illustrations). In all cases the dimensions were fixed ahead of time; that is, where adjustment was necessary, it was done once per edition, and within a setting (usually a book) whose properties were known a priori.

We enjoy no such assumptions. It is expected that the same software should work equally well regardless of the shape and size of the display. As such, the traditional rigid layout must give way to a “programmatic grid” that is defined less by metrics and more by logical rules.

To this end, we cannot simply place an image in the edition. The image must adapt as necessary to constraints on the available space, as well as other considerations like pixel density and bandwidth.

Although, as we’ve said, the collection is motivated by usage, adding an image to the collection still does not by itself do anything, other than cause it to be downloaded.

In order to actually use an image, at least two more steps are necessary.

The first thing to consider is that you might want to use the same image in different ways. For example, you might want a large image to use in a full-screen setting. But you might also want to use that same image in a smaller setting, like in the margin of a document. It would be very inefficient to use the same image in both cases. Or you might want a detail version of an image, which crops out only a portion of it. Of course, you might still want the unaltered version for use elsewhere. Bottom line, these are all distinct images. You’re always dealing with some projection of an original image.

You can see that even though an image has only one name in the catalog, when it comes time to use it, it takes on another property, which becomes part of its identity. So just as an image has an identifier, an image projection is identified by the name of the image plus the name of a projection. The projection also specifies a format, since some formats will be more suitable than others for certain images.

This is what we’ll use when referring to it.

So how does it get created? And what does the “name” of the projection mean?

Adding projections should be trivial.

Each image projection requires a build step, and since we have a file-based build system, it’s possible to set things up so that adding a projection is just a matter of adding a file with that name (once the projection types are set up). That’s what we’ll do here.

Any further details about the projection are contained in the file itself.

But we might also want to generate image projections. The only reason for this I can think of is that you want a set of projections to be part of a feature, and thus to travel with the document where they’re defined. I’m not 100% sure that this is warranted, but it’s accomplished easily enough. Now, since we don’t want managed files to mingle with generated files, we’d have to define two locations for projections.

tup

: foreach $(ROOT)/database/image-projections/* \
|> !link_from \
|> $(ROOT)/image-projections/%b

The image-projections folder is our clearing house for all image projections. If you really must generate (or tangle) a projection, it should go there. And any projections that we’re content to leave out of documents can be written to the database/image-projections folder. They will be pooled with the others, and from that point be indistinguishable.

The projection file can be empty, or it can contain some details about how the projection should be done. Such instructions might include scaling, cropping, and re-encoding. Basically, each type of projection will have a program to implement it.

Very simply, we take every file in the image-projections directory and run it through a program whose job is to carry it out.

tup

: foreach $(ROOT)/image-projections/*.txt \
  |  $(ROOT)/program/images/<all> \
     $(ROOT)/program/images/* \
     $(IMAGES)/* \
|> ^ project image %B^ \
   $(ROOT)/program/images/project-image \
       %f %B %o \
       $(IMAGES) \
       $(ROOT)/program/images \
|> $(SITE)/static/images/%B

That program has access to the image folder, of course. And of course, it can read the file itself, which can give processing instructions. It also has access to the program/images directory, which contains the program itself. But giving it access to the whole directory will allow us to define common projections which are used by many images.

So, what does that program do?

program/images/project-image
sh

#!/bin/bash

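projection_file="$1"
request="$2"
out_file="$3"
image_dir="$4"
projections_dir="$5"

# A request looks like {image key}@{projection}.{format}
pattern="(.*)@(.*)\.(.*)"
if ! [[ $request =~ $pattern ]]; then
	echo "error: bad projection request: $request" 1>&2
	exit 1
fi

image_key=${BASH_REMATCH[1]}
projection=${BASH_REMATCH[2]}
format=${BASH_REMATCH[3]}

image_root="$image_dir/$image_key"
projection_program="$projections_dir/$projection"

# The first line of the projection file supplies any extra arguments.
# It is deliberately left unquoted so that it splits into words.
"$projection_program" "$image_root".* "$out_file" \
					  $(head -n1 "$projection_file")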

This is pretty nice. This means that all we have to do to set up a type of projection is to put a program in the program/images folder. The program should expect the input and output files as its first two arguments, and the remaining arguments (if any) will be supplied by the first line of the projection file.
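The only subtle part of the script is how a request name is taken apart. Here is the same pattern sketched in Python, so it can be tried on a few names without running a build. The function name is made up for this illustration.

```python
import re

# {image key}@{projection}.{format}, the same pattern as in the script
REQUEST = re.compile(r"(.*)@(.*)\.(.*)")


def parse_projection_request(request):
    """Split a projection request into (image key, projection, format)."""
    match = REQUEST.fullmatch(request)
    if match is None:
        raise ValueError("bad projection request: " + request)
    return match.groups()


print(parse_projection_request("Sargent-Terry-Lady_Macbeth@margin.jpg"))
# ('Sargent-Terry-Lady_Macbeth', 'margin', 'jpg')
print(parse_projection_request("Abbey-Ham-3.2@margin.jpg"))
# ('Abbey-Ham-3.2', 'margin', 'jpg')
```

Because the first group is greedy, a key that itself contains dots (such as Abbey-Ham-3.2) still ends up whole, and only the last dot separates the format.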

5.1 medium-sized images

Not sure what to call this projection. For historical reasons, it’s called margin, but I’m inclined to change it to something more general.

For example, suppose you wanted to use an image on the web site. Sure, you can set the image’s display size using CSS, but the original image could be several megabytes. It’s much more efficient to pre-scale images. A projection that limits images to 700 pixels in width (shrinking larger images and leaving smaller ones alone) can be written in a few lines.

program/images/margin
sh

in="$1"
out="$2"
convert "$in" -resize '700>' -format jpg "$out"

So touching the file database/image-projections/Sargent-Terry-Lady_Macbeth@margin.jpg.txt will create a scaled-down version of that image.
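To see what the 700> geometry does to a given image, here is the arithmetic as a sketch. It rests on how ImageMagick geometry is documented to work: a lone number names a width, and the > flag means shrink only. The rounding is an assumption and may differ from ImageMagick’s by a pixel.

```python
def shrink_to_width(width, height, max_width=700):
    """Return the size after a shrink-only resize to max_width.

    Images already narrower than max_width are left alone.
    """
    if width <= max_width:
        return width, height
    scale = max_width / width
    # Keep the aspect ratio; never let the height collapse to zero.
    return max_width, max(1, round(height * scale))


print(shrink_to_width(1400, 1000))  # (700, 500)
print(shrink_to_width(500, 900))    # (500, 900)
```

The second example shows that a tall, narrow image passes through untouched, since only the width is constrained.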

5.2 references

Second, you have to reference it somewhere. The first step will cause the image to be distributed with willshake, but it still won’t appear anywhere unless you point to it!

It might be nice if simply referencing an image from a document would cause it to be shipped. That’s not my concern at the moment, but certainly it’s essential to making the use of images as frictionless as possible.

5.3 image formats

Like any digital media, images have to be encoded in some format.¹

The two main classes of image formats are vector and raster. Here we are only dealing with raster images, although the process should be the same for all types.

For our purposes, we consider all of the various image formats to be interchangeable.

As noted above, we indicate the source format of an image in the filename of its record, and we preserve this throughout the process of retrieving it. But once we have the image, we reserve the right to convert it to any other format. In practice, we never use images directly in their original format, but convert them to JPEG. We use JPEG everywhere, both for the convenience of uniformity and for the efficiency of superior compression.

6 cropping / detailing

Great artworks have great composition. In such works, an off-center figure is a feature, not a bug. Whitespace, likewise, is a welcome aid. Balance—of color, rhythm, line—is a function of the whole. Cropping and scaling, in effect, create a new composition, and must be done with care.

That said, there are good reasons for detailing. In some cases, it can be a concession to space constraints. While it is possible to scale images down to any size, they become unreadable at some point (even on high-density displays). In such cases a close-up of the subject can be more effective.

Use of a detail may also be motivated by the image’s context. If an image of several people is being used to show depictions of only one character, it can be more useful, when feasible, to show that person only.

6.1 specification

With all this in mind, we set out to define the “programmatic grid” whereby such detailing can be effected.

6.1.1 related: avatars

As a starting point, consider “avatars,” which we use to show the faces of the dead people often quoted on the site. We have a simple system for selecting avatars from an image based on three parameters (horizontal offset, scale, and vertical offset). This system is specially designed so that

the output image is square
the default parameters provide usable output for typical portraits
the parameters map to the most typical adjustments
the parameters comprise an irreducible language for expressing the selection of any square from a rectangle, agnostically of the latter’s absolute size
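The avatar parameters are not spelled out here, so the following is only one plausible reading, not the actual system. It assumes that scale is a fraction of the shorter side and that each offset is a fraction of the slack left over in that direction, which keeps the square inside the image for any valid input.

```python
def avatar_square(width, height, x_offset=0.5, scale=1.0, y_offset=0.5):
    """Return (left, top, side) in pixels for a square selection.

    Assumed semantics (not taken from the avatar system itself):
    scale is a fraction of the shorter side, and each offset is a
    fraction of the room remaining along that axis.
    """
    if not (0 < scale <= 1 and 0 <= x_offset <= 1 and 0 <= y_offset <= 1):
        raise ValueError("parameters must be ratios between 0 and 1")
    side = min(width, height) * scale
    left = (width - side) * x_offset
    top = (height - side) * y_offset
    return round(left), round(top), round(side)


print(avatar_square(600, 800))                 # (0, 100, 600)
print(avatar_square(600, 800, 0.5, 0.5, 0.0))  # (150, 0, 300)
```

Under this reading the parameters never mention pixels, so the same three numbers select the same region at any resolution.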

We want something similar for general detailing, except that

the output need not be square.
the output will not be pre-rendered. In other words, you may get different results in different contexts.

6.1.2 STUB requirements

The following is, I think, overtaken by events (OBE). I’ve added an “aspect ratio” option to the image shrinker, and started detailing images using the “golden ratio” (for lack of a better idea).

Given an image and a set of constraints, produce markup that will achieve the desired constraints in the production environment. In other words, this is not a matter of creating a static image, but of rendering the HTML and CSS—necessarily with some knowledge of the environment.

Notwithstanding the above, output may include a pre-adjusted version of the image file. If we know, for example, that the image will always be cropped, this will prevent wasteful transfer. Likewise, if we know that the image need never exceed a certain absolute size (in width or height), it may be pre-scaled accordingly.

The output must meet the following constraints:

The result must fill the container. We never want to have a “gap”, that is, offset an image beyond the signal, so that you end up with an undefined area.
We must never scale images up beyond their original size. This would seem to conflict with the above constraint, unless we can ensure that the container itself will adjust as needed.
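The tension between these two constraints can be made concrete. The sketch below computes the smallest scale at which an image covers a container and reports whether that would mean scaling up. The function name and its return shape are made up for illustration.

```python
def cover_scale(image_w, image_h, box_w, box_h):
    """Return (scale, needs_upscaling) for filling a box without gaps.

    scale is the smallest factor at which the image covers the box
    in both directions; anything above 1 would break the second rule.
    """
    scale = max(box_w / image_w, box_h / image_h)
    return scale, scale > 1


print(cover_scale(1000, 500, 400, 400))  # (0.8, False)
print(cover_scale(300, 300, 600, 400))   # (2.0, True)
```

In the second case the only ways out are to shrink the container or to accept a gap, which is why the container itself has to be able to adjust.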

Following are constraints on the implementation. They do not affect the input and output per se.

Layout must not depend on retrieval of image. In other words, the dimensions necessary for sizing the container must be pre-rendered (although the final dimensions will vary based on the context).
It has to be fast, for tooling. It can’t take more than a second to see the result of a parameter change.

6.1.3 parameters

x offset: the proportion along the source image’s width where the center of the output image will fall
size ratio: the proportion of the available space to be used by the output image’s width

6.1.4 open

I’m pretty sure that in no case do we want the container’s height to be governed by the image itself, even if we knew that we had room for the whole thing. As such, this may be a separate consideration, but it certainly bears on the output.

WHAT are the constraints/targets/rules, whatever.

In other words, you’re going to say, given this image, I want:

first note that these things apply to a particular image
does this need to be split into different contexts? or can you necessarily write a set of “I want”s that is the same for all cases?

Take an example. What are the constraints you’d put on the “Beatrice and Benedick” image?

for narrowest screen
for narrow screen
for wide screen. No detailing. Well, I’ve set a height constraint on the box, separately. But… that’s the whole point here. That constraint knows nothing about the image, and in some cases, it makes bad results.
How will this integrate with annotation of images?
- consider that annotations must be positioned against the non-detailed image
- yet their placement must respect any cropping and scaling
- although this does not necessarily mean that the annotation system must “know” about the adjustments
Does the output have an “aspect ratio” as such? Can it? does it need to be the same as the input image?
Are we talking about constraints on width, height, either, or both?
Should we use an img tag or a background-image property? The problem with img (at least, presently) is that you can’t serve different sizes based on the device. That is a more important concern presently than how “semantic” is the markup.
Can we govern the size of the container? In that case, how do we control its aspect ratio? Put another way, should/can the container be of a variable size… based on the detail? For if it can’t be based on the image itself (i.e. from flow positioning), then it either has to be based on a prior knowledge of the image’s size or some combination of this with the detail area.

6.2 the Spirit images

The “Howard” (or “Spirit”) images deserve special consideration. These are the 500 or so plates from The Spirit of the Plays of Shakespeare by Frank Howard, which cover the whole canon. The full plates are a perfect size for a landscape-oriented “handheld” display (of the smallest size we consider), but they become illegible at portrait size due to their nature as line drawings: the pixel size becomes smaller than many of the lines. (Note that I haven’t tried this on a high-resolution device.)

6.3 do detailing

This ad-hoc thing is now superseded (in principle) by projections.

Without further ado, let’s make some details.

We’re going to define a usage type called detail. Each image will get to decide certain things about how it gets detailed, but the output size will always be the same.

This will work by writing the image’s specific preferences to a file like

$(IMAGE_USAGE)/detail/{image}

Doing it this way means that the image-usage directory will have a mixture of authored and generated files. We don’t want that, right? Right now I’m just separating it by type.

For the moment, let’s just scale them down without reading the details of the details.

tup

: foreach $(IMAGE_USAGE)/detail/* \
  | $(IMAGES)/* \
|> convert $(IMAGES)/%b.* \
	   -resize '512' \
	   -format jpg %o \
|> $(SITE_IMAGES)/detail/%b.jpg

Does ImageMagick have a Python API?

This program will crop an image given a rectangle that is defined by reference to the original image’s size. In other words, the inputs are ratios, not pixels.

python

import sys
from subprocess import call, check_output, CalledProcessError

if len(sys.argv) < 7:
    sys.exit('usage: in-image out-image dx dy dw dh')

in_image = sys.argv[1]
out_image = sys.argv[2]
# The rectangle is given as ratios of the original image's size.
dx, dy, dw, dh = map(float, sys.argv[3:7])

try:
    # Ask ImageMagick for the pixel dimensions of the input.
    size = check_output(
        ['convert', in_image,
         '-format', "%w %h", 'info:']
    ).decode('utf-8').strip()

except CalledProcessError as e:
    sys.exit("error: couldn't get image dimensions: "
             + str(e.output))

width, height = map(int, size.split(' '))

# Convert the ratios to whole pixels.
px, py = round(width * dx), round(height * dy)
pw, ph = round(width * dw), round(height * dh)

# print("width = {}".format(width))
# print("height = {}".format(height))
# print("px = {}".format(px))
# print("py = {}".format(py))
# print("pw = {}".format(pw))
# print("ph = {}".format(ph))

call(['convert', in_image,
      '-crop', '{}x{}+{}+{}'.format(pw, ph, px, py),
      '+repage', out_image])

# That is equivalent to the less-explicit
# 
# call(['convert',
#       '{}[{}x{}+{}+{}]'.format(in_image, pw, ph, px, py),
#       out_image])

This is a general cropping program that doesn’t use the bizarre specification that I define above.
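The ratio-to-pixel step is easy to check on its own, without ImageMagick. Here it is isolated as a function; the name is made up, and rounding to whole pixels is an assumption of this sketch.

```python
def crop_geometry(width, height, dx, dy, dw, dh):
    """Turn a rectangle given in ratios into a WxH+X+Y geometry string."""
    px, py = round(width * dx), round(height * dy)
    pw, ph = round(width * dw), round(height * dh)
    return "{}x{}+{}+{}".format(pw, ph, px, py)


# The lower-middle band of a 1000x800 image:
print(crop_geometry(1000, 800, 0.25, 0.5, 0.5, 0.25))  # 500x200+250+400
```

Because the inputs are ratios, the same four numbers describe the same region of the picture no matter which resolution of the file is on hand.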

7 annotation of images

Most of the images are associated with particular passages from the works. As such, an important part of the image metadata is an identification of the relevant point in the text.

We want a way to annotate images so that it is possible to

add text to the image at a specific place, with constraints on that container (such as that it won’t go outside of that area)
integrate with cropping/scaling described above
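One way the second point could work, sketched under the assumption that annotations are positioned by ratio against the full image and that a detail is a rectangle given in the same ratios (as in the cropping program): an annotation point is simply re-expressed relative to the detail. The function name is hypothetical.

```python
def to_detail_coords(x, y, dx, dy, dw, dh):
    """Map a point on the full image into a detail's coordinate space.

    All values are ratios of the full image's size. Returns None when
    the point falls outside the detail, so the annotation can be hidden.
    """
    u, v = (x - dx) / dw, (y - dy) / dh
    if 0 <= u <= 1 and 0 <= v <= 1:
        return u, v
    return None


# The centre of the image, seen through a centred half-size detail:
print(to_detail_coords(0.5, 0.5, 0.25, 0.25, 0.5, 0.5))  # (0.5, 0.5)
# A point near the left edge is cropped away:
print(to_detail_coords(0.1, 0.5, 0.25, 0.25, 0.5, 0.5))  # None
```

With this arrangement the annotation records never need to change when a detail is adjusted; only the mapping is recomputed.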

Footnotes:

¹ The answer to “PNG vs. GIF vs. JPEG vs. SVG - When best to use?” provides a good summary of the most common formats with examples. http://stackoverflow.com/a/7752936