An isometric projection is defined by the property that all axes have the same metric (isometric, from Greek: "equal measure"). Let's draw a cube to visualize what this means (fig. 1).

All the cube's edges have the same length.
Furthermore, because the angles between sides all are

It is easy to see, that the angles at the other side of the edges measure

But the angle

For the relationship between

However, in computer graphics this true isometry is not well liked. The reason becomes apparent when we magnify one of the surface edges in the x/z-plane (fig. 2): the lines just don't look good.

30° lines appear blocky and jagged. Mainly for this reason, it is usual in computer graphics not to use real isometric projections and to apply lines which truly look straight instead (fig. 3), even if that means departing from the nice 30°, 60° and 120° angles.

This line grows twice as fast horizontally as it does vertically.
Therefore its angle to the horizontal is arctan(1/2) ≈ 26.565° rather than 30°.
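The angle of such a 2:1 line follows directly from its slope. The following is a minimal Python sketch of that arithmetic; the variable names are illustrative and not taken from the text.

```python
import math

# Slope of the "2:1" pixel line: 2 pixels across for every 1 pixel up.
angle_2_to_1 = math.degrees(math.atan2(1, 2))

# A true isometric edge rises at 30 degrees, i.e. about 0.577 pixels up
# per pixel across, which does not map onto a regular pixel staircase.
true_iso_slope = math.tan(math.radians(30))

print(round(angle_2_to_1, 3))    # 26.565
print(round(true_iso_slope, 3))  # 0.577
```

The 2:1 line, by contrast, repeats the same two-pixel-wide step on every row, which is why it looks straight on a pixel grid.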

Because of this, our

Technically, the departure from a representation with three equal 120° angles
makes our rendering no longer an isometric projection:
the accurate term for this kind of drawing is dimetric projection.

Because the angles changed,

There is another, more important reason to apply the 2:1 ratio besides optical appearance, though.

It makes the design of structures parallel to the ground plane a lot easier and often saves much time in calculating the proper lengths, without affecting the optical impression too much.

Imagine a small landscape subdivided into 4

After applying the principles of isometric rendering as outlined above, our landscape will look like this (fig. 6):

This type of rendering has some issues, though. They can be revealed if we look at a magnified tile. Fig. 7 shows an individual tile from our landscape:

The problem becomes apparent when we attempt to connect such tiles. We cannot simply place them next to each other: it is impossible to tile them this way, no matter how hard we try (fig. 8).

Notice, however, what happens when we shift the tiles two pixels to the left and one pixel upwards: the two adjacent edges of any two tiles are shared by each other (fig. 9).

Alas, it is not really possible to share an edge between different tiles: even if we drew the tiles this way, it would still be true that a pixel can only be displayed once (although it could be drawn multiple times at the same screen location). For this reason we declare that a tile's lower edges (i.e. the SW edge and the SE edge) already end with the pixel just above the edge that was shared so far (fig. 10). Notice that the green edges then no longer form part of our 3 tiles: they indicate the top (NW and NE) edges of the next adjacent tiles.

That said, it is easy to see what our isometric tiles eventually must look like (fig. 11): whatever edge length you ultimately need to represent your grid, just reduce it by a two pixel wide element and duplicate the Western and Eastern corner heights.

There are a few more things to keep in mind when working with tiles.

First of all, remember that the lower edges of the tiles are not treated as a part of that tile any longer: although they still do exist, they are rendered by the adjacent SW and SE tiles, which share those edges with our tile in question. Therefore, when it is desired to draw outlines as we did in the initial example, then only the two upper edges should be drawn (the black ones in fig. 12):

Secondly, since the perceived lower edges are in fact not yet the real lower edges, care must be taken when the center point of a tile is of importance: the middle of the edge's length is not just the middle element of the two-pixel-wide elements which we see (in our example there are 6 such elements, see fig. 13), but the middle element of those elements plus 1, because the shared lower edge must be included (i.e. 7 elements in our example, making the 4th the middle element). Note that the lower end points of the middle lines do not fall on a 2-pixel element of the (apparent) lower edges, simply because they are not the lower edges yet.
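The counting rule for the middle element can be written down as a small helper. This is only a sketch of the rule described above; the function name `middle_element` is hypothetical.

```python
def middle_element(visible_elements: int) -> int:
    """Return the 1-based index of the middle two-pixel element of a tile edge.

    `visible_elements` counts the two-pixel elements drawn as part of the
    tile itself; the shared lower edge adds one more element.
    """
    total = visible_elements + 1  # include the shared lower edge
    if total % 2 == 0:
        raise ValueError("an even number of elements has no single middle element")
    return (total + 1) // 2


print(middle_element(6))  # 4
```

For the example tile with 6 visible elements this gives 7 elements in total, with the 4th as the middle one.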

Before we go one step further and seriously deal with the third dimension (elevation), let's have a look at our cube again and properly define the edges' coordinates for the representation of its surface area:

All the edges labeled

Now, we can let *not* equal to

Notice, however, that

Now imagine several such cubes attached to each other. Let's define that the x axis runs along the NE edges and the z axis along the NW edges. Labeling the "rows" with capital letters and the "columns" with Roman numbers, we are able to name the individual cubes. Fig. 15 shows the identifiable coordinates:

Still assuming an edge length of

Let's have a look at the cubes'

The

There's a third axis, though, which we silently ignored until now.
In fact, so far it is not at all an interesting axis, as it stays
the same all the time, namely

Let's try to fill in the two-dimensional screen coordinates into a table.
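Let's refer to the screen coordinates as x_{Sn} and y_{Sn}, where the subscript S stands for *Screen*, and to the world coordinates as x_{Wn}, y_{Wn} and z_{Wn}, where the subscript W stands for *World* and
distinguishes this set of coordinates from the 2D coordinates.
Be careful not to confuse the two coordinate sets.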

We recall the following observations:

Assuming a coordinate origin of _{S0}=0_{S0}=0

It can be seen, that the _{Wn}_{Wn}

The above exercise immediately leads to a generalization on how to obtain screen formulas from world coordinates:

_{S} = _{W}_{W}_{W}_{W}

_{S} = _{W}_{W}_{W}_{W}_{W}_{W}_{W}_{W}_{W}

As a first example let's calculate the screen coordinates of the topmost corner
of the cube labeled "C0" in fig. 15. The topmost corner of C0 is
at world coordinates _{W}_{W}_{W}

_{S} = (_{W}_{W}

_{S} = ((_{W}_{W}_{W}

Whereas the value _{S}_{S}_{W0}, z_{W0}

With the origin pointed out
(labeled "Origin", at _{W0}, z_{W0}

As a second example let's look at the coordinate _{W}=1_{W}=1_{W}=1

_{S} = (_{W}_{W}

_{S} = ((_{W}_{W}_{W}

5 pixels above the origin seems about right.
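The second example can be checked numerically. The sketch below assumes that the transformation has the form x_{S} = (x_{W} - z_{W}) × cos(α) × edge and y_{S} = ((x_{W} + z_{W}) × sin(α) - y_{W}) × edge, with α = arctan(1/2) for the 2:1 line and world coordinates measured in edge lengths. The edge length of 47 pixels is an assumption inferred from the result -4.9632 that the text quotes later for this point; the function name `world_to_screen` is hypothetical.

```python
import math
from typing import Tuple

ALPHA = math.atan2(1, 2)  # angle of a 2:1 line, about 26.565 degrees
EDGE = 47                 # assumed edge length in pixels (inferred, not given)


def world_to_screen(xw: float, yw: float, zw: float,
                    edge: float = EDGE) -> Tuple[float, float]:
    """Project world coordinates (x, elevation y, z) to screen coordinates.

    Assumed conventions: x runs along the NE edges, z along the NW edges,
    and screen y grows downwards, so the elevation is subtracted.
    """
    xs = (xw - zw) * math.cos(ALPHA) * edge
    ys = ((xw + zw) * math.sin(ALPHA) - yw) * edge
    return xs, ys


xs, ys = world_to_screen(1, 1, 1)
print(round(xs, 2), round(ys, 2))  # 0.0 -4.96
```

Under these assumptions the point lands roughly 5 pixels above the origin, in line with the observation above.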

Our formulas work well, but in our previous examples we had to deal with negative numbers.
Of course, there is nothing wrong with negative numbers per se,
but because we were claiming to calculate screen coordinates,
we nevertheless would face some difficulties in applying our results.
After all, we can not render a pixel at the coordinate

The problem is due to the choice of the origin. In our previous examples, this was the base elevation
of the topmost point of the cube A0. Being the origin inherently has the notion
of being located at *screen* coordinates

This is fine, but it has the drawback that only a quarter of all world coordinates (more precisely, a quarter of the points at base level) can be transformed into screen coordinates.

What we need to do is to define a center point in the world system. That center point is to become the center of the screen, and relative to that point all the other transformations from the world coordinate system to the screen coordinate system are going to be calculated.

Whereas it is no problem to pick such a point in the world coordinate system
(it can be virtually any arbitrarily chosen point which we declare to be "the center point"),
the definition of the center of the screen (to which the center point needs to be transformed)
requires some elaboration. The complication arises from the fact
that the screen has two dimensions with *precisely defined lengths*
(which is a good thing - otherwise we would not be able to calculate a center).

So let's focus on the "screen" for a moment. What exactly is the screen?
The most appropriate answer might be: it depends on what you want it to be.
In essence it is just a two-dimensional rectangle in which you want
to represent three-dimensional points by applying the formulae we found.
This may be the whole display area which is surrounded by the physical ends of your device,
or it may just be a rectangle of arbitrary size which you dedicate to your product
(in most operating systems called a "Window"). All in all, however, it does not really matter:
the only thing which matters is the *length of the two dimensions*:
we need to know those in order to calculate a center point within your screen.

So let's, somewhat arbitrarily, define an example screen.
Let's say it has a width of 250 pixels and a height of 100 pixels. Its center then lies at x = 125, y = 50.

We will need these coordinates later on, so let's give them a dedicated name
(whereby the subscript C stands for *Center*):

x_{C} = ScreenWidth / 2

y_{C} = ScreenHeight / 2

To this Center we want to render an arbitrary point from our world coordinate system. Now, "arbitrary" does not mean that it does not matter which point we choose: it does matter, because this is the point on which an observer of the screen instinctively focuses, perceiving it as the "center of the world" or, maybe better, the "relative center of their world around which the whole world revolves". Or, in yet other terms, the point where "the action happens".

Therefore it is typical for isometric projections to associate this center with the observer (most prominently in computer games, where this is the point at which the player's avatar is located). This is an important property, because it implies that although the screen's center is fixed (at least as long as the dimensions of the screen themselves don't change), the world's center of interest is dynamic and may change (for example, our aforementioned avatar will not stand still all the time, but move in one way or another, thus relocating the world's center of interest while still being displayed at the screen's center).

The essence of this is that it is the world which seems to move.
This directly implies, that we need to manipulate our calculated _{S}_{S}

Let's say that we want the center of the top surface of cube B1 to be the world's center
(for example because a player's avatar is standing on this particular spot).
Let's call this point *Focus* from now on to clearly distinguish it from
the term *Center*, which we will exclusively use when we mean the screen's center.

Matching the (world's) Focus with the (screen's) Center is a 2-step process. Firstly, we need to calculate the Focus' coordinates within the world coordinate system (this is because the center of the top surface area is not a given point already: only the four corners of the surface are known a priori). Secondly, we need to "move" this point in some way, so that its coordinates eventually match the Center (this second operation, known as a Translation, will turn out to consist of two substeps, so that we might as well speak of a 3-step process).

First things first. The calculation of the Focus might appear to be a trivial operation at first glance, but I can assure you that this impression is false. It looks so trivial just because the surface of our cube is "flat", i.e., all corners have the same elevation and thus the plane is parallel to "ground zero". Things will turn out to be slightly more complicated when the corners do not all have this convenient property. In particular, we will no longer be able to treat the surface as a rhombus, as we are doing for now, but will need to look at it as a compound of triangles. More on that later, though.

When the surface is flat, the Focus' coordinates can easily be calculated:
just take the topmost corner's coordinates and go downwards half the distance
toward its opposite corner, or take the leftmost corner and go halfway
toward that one's opposite corner.
Putting these two options together, using one property of each, we could easily achieve our goal:
just use the _{W}_{W}

Well, of course, we know

So, for the moment being, it suffices to add _{S}*Origin*.
Remember, however, that *Focus* to indicate,
that we mean a special point within the world coordinate system. Then the formulae look like this:

_{O} = _{F}_{F}

_{O} = ((_{F}_{F}_{F}_{F}_{F}_{F}_{F}_{F}_{F}

It turns out, that after factoring in the sine the formula is not really
much more time-consuming than before: we just need to add

For the Focus of our cube's surface this results in:

_{O} = (_{F}_{F}

_{O} = ((_{F}_{F}_{F}

Now compare this with the coordinates we got in an earlier example for the topmost corner of our cube:
they were x_{S} = 0 and y_{S} = -4.9632.

Recall, that so far our formulas for_{S}_{S}_{O}_{O}_{O}_{O}

And by now also the final step is fairly obvious:
we don't want the Focus to be displayed at the screen's origin, but at its center,
and so we need to add the screen coordinates _{C}_{C}

Therefore our final formulas will look like this
(of course, here we did not factor in the sine as was done for the Focus,
because we want to refer to the really mentioned points, and not to the center of an area surface;
hence, the term _{S}

_{S} = _{W}_{W}_{O}+x_{C}

_{S} = _{W}_{W}_{W}_{O}+y_{C}

Let's check how this all works out for the topmost corner of our singled out cube B1.
The corner's coordinates are
_{W}_{W}_{W}*Center*
(the screen still being assumed to have a width of 250 pixels and a height of 100 pixels)
and the *Origin* as per our cube's *Focus* point:

x_{C} = ScreenWidth / 2 = 250 / 2 = 125

y_{C} = ScreenHeight / 2 = 100 / 2 = 50

_{O} = (_{F}_{F}

_{O} = ((_{F}_{F}_{F}

Then we can calculate any desired point within the world coordinate system,
for example the top corner of cube B1, by applying the formulas for _{S}_{S}

_{S} = (_{W}_{W}_{O}+x_{C} = (

_{S} = ((_{W}_{W}_{W}_{O}+y_{C} = ((

Because the topmost corner of the surface of cube B1 also is
the bottommost corner of the surface of cube A0 (which latter happens to be our "Focus cube"),
we expect this point to be *Center*.
Since

Now let's apply the final formulas _{S}_{S}_{C}_{C}_{O}_{O}

Finally, let's plot the calculated coordinates on our screen of dimensions 250 × 100 pixels, with the *Center* (representing the *Focus*) marked:

By now we are able to move the "action center" from one spot to another simply
by redefining the Focus, i.e. by recalculating the coordinates
_{O}_{O}_{W}=2_{W}=0_{W}=1

_{O} = (_{F}_{F}

_{O} = (_{F}_{F}_{F}

It can be observed, that _{O}*Focus* optically is on the same height as A1's *Focus*.
But, x_{O} did change by *Focus*
is 84 pixels further to the right now (which means that the "world moves" to the left by that amount).
Note, that

The screen's *Center* does not change, of course
(unless we also changed the screen's dimensions).
Therefore the only remaining task is to recalculate the new coordinates
_{S}_{S}

Note that in a real application we probably would not want to just "jump" to a new *Focus*,
but scroll smoothly from one location to another.
Also, we wouldn't recalculate points which are visible both before and after the transition:
we would merely move them to their new position,
only calculating new points as they are "moved in" at the corresponding edges.

From time to time the need may arise to redefine the screen's *Center* as well.
This is needed, when the screen's dimensions are changed
(for example, when the user resizes the window in which your application renders the world).

Let's assume that your user wishes to make the window taller and resizes its height
from 100 to 150 pixels. Although they will most likely do so by dragging either
the window's top or bottom edge (but not both at the same time), the effect is
as if the height difference were applied to both the top and bottom edges simultaneously,
half the height difference on each side: this behaviour keeps the *Center* centered.

The changed dimensions lead to a recalculation of the
coordinates x_{C} and y_{C}:

x_{C} = ScreenWidth / 2 = 250 / 2 = 125

y_{C} = ScreenHeight / 2 = 150 / 2 = 75

Recalculating all coordinates _{S}_{S}

Also note that it is often not necessary to recalculate all the world coordinates: the points which are already present would usually just be moved to the new center by shifting them horizontally or vertically as required (in our case 25 pixels downward). Then we only need to calculate the points which fall into the newly exposed parts of the window (and even this step can be omitted if the dimension change results in a shrinking window).
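The resize example can be sketched in a few lines of Python. The helper names `screen_center`, `center_shift` and `shift_points` are hypothetical; the numbers are the ones used in the text (a 250 × 100 window growing to 250 × 150).

```python
def screen_center(width, height):
    """Center of a screen rectangle: half the width, half the height."""
    return width / 2, height / 2


def center_shift(old_size, new_size):
    """Offset by which already rendered points move when the screen is resized."""
    old_cx, old_cy = screen_center(*old_size)
    new_cx, new_cy = screen_center(*new_size)
    return new_cx - old_cx, new_cy - old_cy


def shift_points(points, dx, dy):
    """Move existing screen points instead of recalculating them from world data."""
    return [(x + dx, y + dy) for x, y in points]


dx, dy = center_shift((250, 100), (250, 150))
print(dx, dy)                              # 0.0 25.0
print(shift_points([(125, 50)], dx, dy))   # [(125.0, 75.0)]
```

The old center (125, 50) ends up at the new center (125, 75), i.e. 25 pixels further down, without touching the world coordinates at all.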

So far we were working with flat surfaces, i.e., elevations of all points having
the same height (namely

First of all, there is the point of view that each point on the grid has a dedicated elevation. Let's call such a point a grid point. Grid points are usually shared by 4 surrounding tiles (only 2 if the grid point lies along the landscape's edge, 1 if it is one of the landscape's corners). Alternatively, we can define an individual elevation for each of a tile's 4 corners. Let's call those tile corners. The following pictures visualize the two points of view:

Both points of view have their advantages and disadvantages. If we work with grid points, the most obvious advantage is that we only need to store a minimum amount of elevation data: in particular, storing 1 elevation per tile suffices, as the other 3 corners can be derived from the elevations of the 3 neighboring tiles (since they are shared points). The downside is that with grid points we cannot handle true vertical structures (e.g. cliffs): to define a true vertical structure at a given point in the x/z plane, we need, in one way or another, to provide 2 different heights for the y dimension. The tile corners approach is one such way.
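The grid point approach can be illustrated with a small Python sketch. It assumes the elevations are stored as a list of rows indexed as `grid[z][x]`, with one extra row and column so that the tiles along the far edges also find their neighbors; the function name and the corner order (top, east, bottom, west) are illustrative choices, not taken from the text.

```python
from typing import List, Tuple


def tile_corner_elevations(grid: List[List[float]], x: int, z: int
                           ) -> Tuple[float, float, float, float]:
    """Return (top, east, bottom, west) corner elevations of tile (x, z).

    Only grid[z][x] "belongs" to the tile; the other three values are the
    grid points owned by the neighboring tiles.
    """
    top = grid[z][x]
    east = grid[z][x + 1]        # one step along the x axis (NE edge)
    bottom = grid[z + 1][x + 1]  # diagonal neighbor
    west = grid[z + 1][x]        # one step along the z axis (NW edge)
    return top, east, bottom, west


# 2 x 2 tiles need 3 x 3 grid points.
grid = [
    [0, 0, 0],
    [0, 1, 1],
    [0, 1, 2],
]
print(tile_corner_elevations(grid, 1, 1))  # (1, 1, 2, 1)
```

Because every corner is read from a single shared value, two adjacent tiles can never disagree about the height of a common corner, which is exactly why this layout cannot express a cliff.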

Picking up the previous statement that we "need to provide 2 different heights for the y dimension", one might be tempted to think that we do not really need to store an elevation for each corner of every tile, because this would only duplicate information that is available elsewhere anyway. In fact, at first glance it seems to suffice to take the grid point approach but to store 2 elevations for a single corner of each tile: a lower and an upper elevation. When there is no vertical structure at that point, the two values are identical; otherwise their difference gives the height of the vertical structure. Let's call those two elevations upper elevation and lower elevation. The following picture shows the two elevations for the topmost corner of cube B1 (and with that implicitly also the elevations of the corresponding corners of the surrounding cubes):

There's a major flaw with this approach, though: the interpretation is ambiguous. Let's define some elevations to make this clear:

It is clear that B0 has an elevation of 1 and B2 one of 2. But how shall B1 be interpreted? There are two possibilities. Let's connect the lines along the x axis:

The same ambiguity occurs along the

Therefore, if one needs to represent true vertical structures (such as cliffs in a landscape), there is no way around the tile corners approach (fig. 26).

However, note that there is not always a need to represent true vertical structures.
This is particularly the case when our DEM delivers just a single elevation for any point.
The DEMs with this property include all those produced by techniques
that obtain their data by measuring "as far as they can see"
(i.e. the closest obstacle defines the elevation), but not beyond that point.
A prominent example is the NASA's

So far, when working with elevations in a flat landscape, we always implied that the base of the elevation was the bottom of our cubes. More precisely: we assumed that there was a base elevation of 0, upon which we erected structures (cubes). This assumption won't change in the further discussion, but it is worth pointing out some properties of the elevation base to have it defined properly:

Probably the most interesting single aspect of these definitions is that an elevation can be negative. If, however, this property is not desired (for instance because our data structure only allows for non-negative values), then it is trivial to redefine the base such that its lowest point translates to 0. Of course, this translation comes at a cost, as we have to find the lowest point first, which usually requires scanning the whole DEM in a preparatory step.
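Redefining the base so that the lowest point becomes 0 amounts to one scan for the minimum followed by a subtraction. A minimal Python sketch, assuming the DEM is held as a list of rows of elevations (the function name `rebase_elevations` is hypothetical):

```python
def rebase_elevations(dem):
    """Shift all elevations so that the lowest point of the DEM becomes 0.

    Returns the shifted DEM together with the original lowest elevation,
    so that the original values can be restored if needed.
    """
    lowest = min(min(row) for row in dem)  # preparatory scan of the whole DEM
    rebased = [[h - lowest for h in row] for row in dem]
    return rebased, lowest


dem = [
    [2, -3, 0],
    [1, 4, -1],
]
rebased, lowest = rebase_elevations(dem)
print(rebased)  # [[5, 0, 3], [4, 7, 2]]
print(lowest)   # -3
```

Keeping the original minimum around is a design choice of this sketch: it allows converting back to the original elevations by adding it again.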

Now that we have acquired a clearer understanding of the term elevation, let's have another look at the formulae to represent any given DEM coordinate.

_{S} = (_{W}_{W}_{O}+x_{C}

_{S} = ((_{W}_{W}_{W}_{O}+y_{C}

The blue term _{W}*Focus*?
That elevation is hidden in the _{O}*Origin*):

_{O} = (_{F}_{F}

_{O} = ((_{F}_{F}_{F}

In _{S}_{O}

_{S} = ((_{W}_{W}_{W}_{F}_{F}_{F}_{C}

We want to concentrate on the two blue terms. We simplify the formula by highlighting the parts which are irrelevant for the moment and then substituting them with capital letters:

_{S} = (_{W}_{W}_{W}_{F}_{F}_{F}_{C}

_{S} = (A-_{W}_{F}

Expanding the formula

_{S} = (A-_{W}_{F}_{W}_{F}_{W}_{F}

shows, that because of
_{W}_{F}_{W}_{F}_{W}_{F}

This is what we wanted to verify. So, are we done?
Well, recall the procedure to find the focus (fig. 19):
there we assumed that the surface area of our focus tile was flat
(i.e. parallel to the elevation base plane).
Because of this, the vertical center was just

_{O} = ((_{F}_{F}_{F}

(Note, that _{O}

Unfortunately, the surface of the focus tile does not necessarily need to be flat. So, how do we find out the true value in order to redefine our simplified formula? The next section deals with this last aspect in our quest of deriving the final formula.

Let's go back to our cube and examine its surface appearance when given different elevations to its 4 corners. We examine 3 cases with the following elevations:

Although we are currently just looking for a generalization of the focus point, it can safely be assumed that any non-flat landscape features a multitude of tiles which cannot be interpreted unambiguously (i.e. are "type C" tiles). Hence it is imperative to solve the problem not only for the focus tile, but for all tiles which need to be rendered.

We already mentioned that "Case C" can be interpreted in at least two ways.
We say "at least" because it is possible to take better approaches than just guessing
which one of the two diagonals to consider.
For example, we could define that the elevation at the center of the surface
is the arithmetic average of the elevations at the 4 corners.
In our case, this would result in an elevation of

What exactly does this mean now for our formula calculating the focus' center? Let's look at its vertical coordinate component again, the part in question still highlighted:

_{O} = ((_{F}_{F}_{F}

One could argue now that because the tile was flat and parallel to the base elevation plane,
each corner being at an elevation of

Well, no. First of all, the term _{O}_{F}_{F}_{F}

_{O} = (_{F}_{F}

_{O} = ((_{F}_{F}_{F}

Of particular interest is not only the expression _{F}

Because we want to use the average height of all 4 corners now,
we will replace the elevation _{F}

The quintessence is, that nothing changes but the interpretation
of what _{F}_{F}

What about _{F}_{F}

So, our final formulas for the *Focus* read:

_{O} = (_{F}_{F}

_{O} = ((_{F}_{F}_{F}

Note, that _{F}_{F}

Also note that the formulae for x_{S} and y_{S} do not change:
they always were referring to a tile's corner, which is assumed to have a dedicated elevation anyway.

The only new element regarding tiles is the fact that a center point now exists for them as well, acting as the common corner of the 4 triangles constituting that tile's surface. The elevation of that point is the average of the elevations of all 4 corners.
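The center rule can be sketched in Python as follows. The corner order (top, east, bottom, west) and the function names are illustrative assumptions; each triangle is given as the elevations of two neighboring corners plus the shared center.

```python
def tile_center_elevation(top, east, bottom, west):
    """Elevation of a tile's center: the arithmetic average of its 4 corners."""
    return (top + east + bottom + west) / 4


def tile_triangles(top, east, bottom, west):
    """Split a tile into 4 triangles that all share the center point."""
    center = tile_center_elevation(top, east, bottom, west)
    return [
        (top, east, center),
        (east, bottom, center),
        (bottom, west, center),
        (west, top, center),
    ]


print(tile_center_elevation(1, 1, 2, 1))  # 1.25
print(tile_center_elevation(1, 1, 1, 1))  # 1.0
```

For a flat tile the average equals the common corner elevation, so the flat case treated earlier is simply a special case of this rule.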

Thus, for the 4 corners of any tile (even the focus tile!), the following remains valid:

_{S} = (_{W}_{W}_{O}+x_{C}

_{S} = ((_{W}_{W}_{W}_{O}+y_{C}

And since this section is titled "The Final Formulae", let's repeat the (unchanged) formulas
for x_{C} and y_{C}:

x_{C} = ScreenWidth / 2

y_{C} = ScreenHeight / 2