Focus Stacking

Whenever you take a photograph (this part also applies to movies and video), some things are in focus and some aren't. The range, from foreground to background, of what's in focus is referred to as "depth of field". It's a concept as old as photography. Pinhole cameras were actually fairly good at it. The bigger the aperture (the hole where the light comes in), the smaller the depth of field. If you strive to have everything in focus, like I do, the usual approach is to use a smaller aperture. But first of all you get less light, which means the shutter has to stay open longer to receive the same amount of light. If something is moving or might move, that's a problem. Also, every lens has a "sweet spot" in aperture, usually about in the middle of its range, where everything is sharper. At small apertures, diffraction (bending) of light softens the image.

So in this age of digital photography, what becomes practical is to take multiple photographs and combine them. There are 2 ways of doing this: either take them at different focus settings, or move the camera or subject closer together and farther apart. If you move the subject, the lighting on it is going to change, so that's the least practical approach. If you change the focus, you're also changing the focal length of the lens (how zoomed in it is) to some degree. So the best approach for small subjects is to have the camera mounted on something that lets you move it forwards and backwards. But this is expensive, since usually a screw or motor has to move the camera, and it's only practical for small subjects, since the range of movement has to be at least equal to the depth of the subject. For really small subjects (think of a dead bug) with a lot of magnification, you get a lot less depth of field, so you may need 100 or more exposures.

Sometimes a picture has a natural boundary like the edge of a cliff, and you can painstakingly select everything up to that boundary from one image (with feathered edges) and combine it with everything beyond the boundary from the other image. Usually it's not perfect, and you can spend hours getting that selection made, working at high magnification and picking a few pixels at a time. Think of an image that's on the order of 6000 by 4000 pixels and you'll see what I mean.

We have (as of 2016) several ways of turning colors into numbers so computers can deal with them. The most common for monitors is RGB, or red green blue. In 24-bit color, like most JPEGs, 8 bits are used for each of those primary colors, so each can range from 0 to 255. But computers are mostly arranged in multiples of 32 bits, so 24 is awkward. Somebody hit on the idea of 32-bit color, with the fourth position taken by an alpha channel, to use the space efficiently. Alpha in this context refers to transparency: each pixel has a color plus a transparency value from 0 to 255.
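To make the 32-bit idea concrete, here is a small sketch in shell arithmetic. The function name pack_rgba and the byte order (alpha in the top byte) are my own assumptions for illustration; real image formats differ on where the alpha byte goes. It also assumes the usual convention that 255 means fully opaque and 0 means fully transparent.

```shell
#!/bin/sh
# pack_rgba R G B A: combine four 0-255 channel values into one 32-bit number.
# Byte order (alpha on top) is an assumption for illustration only.
pack_rgba() {
    r=$1 g=$2 b=$3 a=$4
    printf '0x%08X\n' $(( (a << 24) | (r << 16) | (g << 8) | b ))
}

pack_rgba 255 128 0 255   # opaque orange, prints 0xFFFF8000
pack_rgba 255 128 0 0     # same color, fully transparent, prints 0x00FF8000
```

Four 8-bit values fill the 32 bits exactly, which is the whole appeal.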

Part of the challenge in manual focus stacking is deciding what's in focus. In statistics there's a number called standard deviation, which boils down to being the typical deviation from the average. In something out of focus, neighboring pixels will be closer to the same value than in, for example, a well-focused picture of a tree. So we come up with an algorithm that converts standard deviation to alpha. The more blurred together pixels are, the more transparent we make them. The well-focused areas (mostly) will have more contrast, or difference from one pixel to another, so we make them less transparent. Put many layers together in a stack and the least transparent areas will be the ones that show up most. When you consider that some parts of each layer are more in focus than others, what happens is almost like a voting process. When it works well, the whole resulting picture appears to be in focus.
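Here is a toy version of that idea, using awk from the shell. It takes a handful of 8-bit pixel values (think of one small window of an image), computes their standard deviation, and scales it to an alpha from 0 to 255. The function name window_alpha and the scaling (twice the standard deviation, capped at 255) are assumptions I made up for the sketch; this is not a claim about what enfuse does internally.

```shell
#!/bin/sh
# window_alpha V1 V2 ...: print an alpha (0-255) for a window of 8-bit pixel values.
# Higher standard deviation (more contrast) gives higher alpha (less transparent).
# The 2x scaling is a hypothetical mapping: the largest possible standard
# deviation for 8-bit values is 127.5, which maps to 255.
window_alpha() {
    printf '%s\n' "$@" | awk '
        { sum += $1; sumsq += $1 * $1; n++ }
        END {
            mean = sum / n
            var = sumsq / n - mean * mean   # population variance
            if (var < 0) var = 0            # guard against rounding error
            alpha = int(2 * sqrt(var) + 0.5)
            if (alpha > 255) alpha = 255
            print alpha
        }'
}

window_alpha 100 101 99 100   # nearly uniform (blurry), prints 1
window_alpha 0 255 0 255      # maximum contrast (sharp), prints 255
```

Note that a perfectly smooth, well-focused surface also scores near 0 here, since contrast alone can't tell smooth from blurred.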

It doesn't always work perfectly. Smooth objects that are all solid colors won't have a very high standard deviation anywhere, so it's hard for the algorithm to decide how transparent to make them. But having the whole process automated not only saves a lot of time, it also means the spacing between the camera or focus positions isn't very important. Because the standard deviation is judging the resulting images, you can focus on a few areas at approximately equal spacings and the result has a good chance of being correct. And as for the zoom factor caused by changing the focus or distance, that's easy enough for software to correct; it just resizes what it needs to.

In the unix world, as of 2016, we don't have handy Photoshop plugins because we don't have Photoshop [1]. So some of us make do with what free software is available. dcraw is the standard program for converting RAW images. Given an argument of -T it will make TIFFs, which are excellent. align_image_stack, which is part of Hugin, is probably the best method for aligning the images prior to joining; it can also handle the minor resizing needed for magnification differences. enfuse is the program that actually combines the aligned images. align_image_stack and enfuse are also used in creating panoramas with Hugin, but as of 2016 the Hugin GUI doesn't handle focus stacking (yet).

So does this mean you have to spend hours learning command line syntax? Get serious, you just script it. What I've come up with in the last few days is:

#!/bin/sh
# Call like ./process stack4
# Convert every RAW file in this directory to TIFF
dcraw -T *.nef
# Align the TIFFs (-m also corrects magnification), writing algn_*.tif
align_image_stack -p "$1.pto" -v -m -a algn_ dsc*.tiff
# Merge the aligned images, weighting by local contrast only
enfuse --output="$1.tiff" -v --compression=DEFLATE --exposure-weight=0 --saturation-weight=0 --contrast-weight=1 algn*.tif

Everyone will probably want to make changes to this, so go ahead. I just got tired of looking stuff up so I put these commands together. Stick copies of the RAW files that are parts of the stack into a directory by themselves. Add a copy of this script (I called it process) and make it executable with chmod +x process. Pick an equally brilliant name like stack1. Run it like "./process stack1". The name you use replaces the $1 in the script, so you'll end up with that name plus .tiff. Just don't put files from a different stack in the same directory or you'll have a mess. If you shot multiple stacks you could write something to call this script in different directories and do your processing overnight, with no time spent clicking and waiting. All of the commands have their own man pages. I put full-size 6000 x 4000 RAW files through it and it's not fast, but my computer's 14 years old.
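Since the script leans on three separate programs, a quick check that they're all installed can save a confusing failure halfway through. This is a sketch, not part of my original script; the function name missing_tools is made up, and it only relies on the standard "command -v" shell builtin.

```shell
#!/bin/sh
# missing_tools NAME...: print each given command that is not found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# The three programs the process script calls
missing=$(missing_tools dcraw align_image_stack enfuse)
if [ -n "$missing" ]; then
    echo "Not installed or not on PATH:" $missing >&2
fi
```

If nothing is printed, all three were found.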

These 8 little images aren't hyperlinked; the bigger ones on the page are.

I did actually create this simple script next; I was focus stacking 10 scenes for a panorama. It ran for an hour and a half, successfully, except there had been too much wind that day so I had trouble with tree branches moving. The machine is a 3.2 GHz Pentium 4, single core but hyperthreaded, with 3.5 GB of RAM, and the images were 6000x4000, 3-4 per rotation angle. Watch your hard drive free space if you do this. The TIFFs coming out of dcraw were about 70 megabytes each and there were 30+ of them, plus more big TIFFs made by align_image_stack, and a final TIFF for each rotation angle. I was watching and deleted the files it was done with every few minutes. 10 angles in the panorama * 3 pictures per angle * 70 MB each = 2.1 GB. In most cases align_image_stack made modified versions of the TIFFs, so double that. Add another 70 MB for each rotation angle and it's close to 5 GB.
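The same estimate as shell arithmetic, so you can plug in your own numbers before starting a long run. The function name stack_disk_mb is made up, and it assumes, as the figures above do, that the aligned copies and the fused result are each about the same size as one dcraw TIFF.

```shell
#!/bin/sh
# stack_disk_mb ANGLES SHOTS MB_EACH
# Rough peak disk use in MB if nothing is deleted along the way:
# dcraw TIFFs, plus aligned copies (the "double that"), plus one fused TIFF per angle.
stack_disk_mb() {
    angles=$1 shots=$2 mb=$3
    echo $(( angles * shots * mb * 2 + angles * mb ))
}

stack_disk_mb 10 3 70    # prints 4900, i.e. close to 5 GB
```

Compare the result against what "df -h ." reports for the directory you're working in.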

cd angle1
./process angle01
cd ../angle2
./process angle02
cd ../angle3
./process angle03
cd ../angle4
./process angle04
cd ../angle5
./process angle05
cd ../angle6
./process angle06
cd ../angle7
./process angle07
cd ../angle8
./process angle08
cd ../angle9
./process angle09
cd ../angle10
./process angle10
cd ..
echo Done

I copied my process script into each directory; putting one copy somewhere in my path might have worked instead. But notice how small and simple these scripts are compared to megabytes of slow GUI programs. And you can start it and then do something else while it runs. You aren't stuck sitting there waiting for it because you need to click all the time.
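For what it's worth, the ten repeated pairs of lines could also be a loop that uses a single copy of the process script. This is an untested-on-my-panorama sketch: the function name run_angles and the PROCESS variable are made up, and it assumes the same layout as above (directories angle1 to angle10, output names angle01 to angle10).

```shell
#!/bin/sh
# Hypothetical loop version of the batch script.
# PROCESS is an assumed location for one shared copy of the process script.
PROCESS=${PROCESS:-"$PWD/process"}

# run_angles N: run the process script in angle1 ... angleN
run_angles() {
    i=1
    while [ "$i" -le "$1" ]; do
        name=$(printf 'angle%02d' "$i")
        # Subshell, so the cd never leaks into the next iteration
        ( cd "angle$i" && "$PROCESS" "$name" ) || echo "angle$i failed" >&2
        i=$((i + 1))
    done
    echo Done
}
```

Call it as "run_angles 10" from the directory that holds the angle directories. A failed cd or a failed stack gets reported and the loop carries on with the next angle.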

I do see a problem with walking away, though. In some cases align_image_stack doesn't make a set of modified TIFFs, so when the script hands algn*.tif to enfuse and those files don't exist, there won't be any output file. But that doesn't happen often and you can just re-run that set.
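One way to at least find out which set failed is to check for the aligned files before calling enfuse. This is a sketch of a possible guard, not something in my script; the function names have_aligned and fuse_if_aligned are made up, and the enfuse options are just the ones from the process script.

```shell
#!/bin/sh
# have_aligned: succeed if at least one algn*.tif exists in the current directory.
have_aligned() {
    # If the glob matches nothing, "$1" is the literal pattern and -e fails.
    set -- algn*.tif
    [ -e "$1" ]
}

# fuse_if_aligned NAME: run the enfuse step only when its input files exist.
fuse_if_aligned() {
    if have_aligned; then
        enfuse --output="$1.tiff" -v --compression=DEFLATE \
            --exposure-weight=0 --saturation-weight=0 --contrast-weight=1 algn*.tif
    else
        echo "$1: no algn*.tif files, skipping enfuse; re-run this set" >&2
        return 1
    fi
}
```

Replacing the enfuse line with "fuse_if_aligned "$1"" would leave a message in the terminal for each set that needs another try.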

The original way of getting more depth of field was to use a smaller aperture (bigger f-stop number) like this. But there's a second disadvantage, which is that it lets in less light, so you need a slower shutter speed. And at least in the old way of doing things, one f-stop click one way was equivalent to one shutter speed click the other way. If you were using 1/125 second at f/8 you could get the same amount of light by using 1/60 second at f/11, or 1/30 second at f/16. Each click doubled or halved the amount of light, but digital cameras usually click in fractions of a stop, which is where oddball speeds like 1/13 second come from, so one click is no longer a full doubling or halving.
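The stop arithmetic is easy to check with awk. The light reaching the sensor goes as 1 over the f-number squared, so the number of stops between two f-numbers is 2 * log2(N2/N1). The function names here are made up for the sketch.

```shell
#!/bin/sh
# stops_between N1 N2: full stops of light lost going from f/N1 to f/N2.
stops_between() {
    awk -v n1="$1" -v n2="$2" 'BEGIN { printf "%.1f\n", 2 * log(n2 / n1) / log(2) }'
}

# shutter_factor N1 N2: how many times longer the shutter must stay open
# to collect the same amount of light.
shutter_factor() {
    awk -v n1="$1" -v n2="$2" 'BEGIN { printf "%g\n", (n2 / n1) ^ 2 }'
}

stops_between 8 16     # prints 2.0
shutter_factor 8 16    # prints 4
```

So f/8 to f/16 is two stops, and the shutter needs to stay open four times as long, which takes 1/125 second to roughly 1/30 second.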

Shooting a ruler like this demonstrates depth of field nicely, and oddly it just worked out: the frame was tall enough to show the lack of focus even at f/36. I just set up camera, tripod and ruler and shot 6 times with different f-stops, then selected, copied and pasted to get them into one picture and added the numbers at the bottom.

F-stop is defined as the ratio of the focal length of the lens to the diameter of the aperture. It's the fact that it's a ratio that explains the / in numbers like f/8: the aperture diameter is the focal length (f) divided by 8. With most camera lenses the "sweet spot" will be in the center of the f-stop range. Counting clicks and then using the center one has a good chance of working. You can vary it of course, but often you won't have a strong preference. I normally set the camera to aperture priority and use a tripod, so I'm not worried about long exposure times. I used my Nikon D5200 with the kit 18-55mm for a year mostly on f/8, then discovered that f/10 is sharper.
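As a worked example of that ratio, here is the aperture diameter at the long end of an 18-55mm lens. The function name aperture_diameter_mm is made up; the only thing it relies on is diameter = focal length / f-number.

```shell
#!/bin/sh
# aperture_diameter_mm FOCAL_LENGTH_MM F_NUMBER: diameter of the aperture in mm.
aperture_diameter_mm() {
    awk -v f="$1" -v n="$2" 'BEGIN { printf "%g\n", f / n }'
}

aperture_diameter_mm 55 8     # prints 6.875
aperture_diameter_mm 55 10    # prints 5.5
```

The bigger the f-number, the smaller the hole, which is why f/16 lets in less light than f/8.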

Focus stacking is best done indoors where there's no wind. The image on the left was a stack of 8 images but there was movement between them. The right image has just the back and front images.

[1] There once was a Photoshop for Unix in the 1990s, but it evidently didn't earn much money so it was discontinued. Somewhere I have a copy of Photoshop 3, which fits on one floppy disk too. We have the GIMP, which tries to serve the needs of photographers and artists in one program. Some features it doesn't have, but I've been using it exclusively for 7 years and barely remember how to run Photoshop. ImageJ is also an option.

AB1JX / calcs