The article doesn't put enough emphasis on the need for gamma correction. If you dither a pure gray area of level 128, it will become approximately 50% white and 50% black. But when you display that back on the screen, because of gamma effects it will look closer to a level of 186!
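A quick way to see the numbers (a minimal sketch assuming the piecewise sRGB transfer curve; with a pure 2.2 gamma the result comes out closer to 186):

    import numpy as np

    def srgb_to_linear(v8):
        # v8: sRGB code value in 0..255
        v = v8 / 255.0
        return np.where(v <= 0.04045, v / 12.92, ((v + 0.055) / 1.055) ** 2.4)

    def linear_to_srgb(lin):
        lin = np.clip(lin, 0.0, 1.0)
        s = np.where(lin <= 0.0031308, lin * 12.92, 1.055 * lin ** (1 / 2.4) - 0.055)
        return s * 255.0

    print(srgb_to_linear(128.0))   # ~0.216: gray 128 emits only ~22% of max light
    print(linear_to_srgb(0.5))     # ~188: the level a 50/50 black/white mix resembles

So a gamma-aware 1-bit dither of gray 128 should produce roughly 22% white pixels, not 50%.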
A few years ago there was a need for 15-bpp or 16-bpp color images on phones rather than the 24-bpp images we usually work with, and dithering was a great way of producing them. No idea how much need there is today though.
Lots, if you're into printing. Banding that's invisible on a monitor can become a monster when printed. An expensive monster of a mistake if you're printing a poster-sized giclée.
This is a big reason why Photoshop dithers by default when going from a higher range down to 8 bits per channel.
Another reason the need for dither might even be increasing today is that digital cameras have only very recently gotten better than 8 bits of signal-to-noise ratio. We've had 16-bit RAW formats for a long time, but most cameras still had noise that was greater than 1/256. I tested one of my old Canon EOS cameras and found that even in very well lit conditions, its signal was a little over 5 bits, even though I could get a 16-bit image file out.
With the newest cameras, the signal-to-noise ratio sometimes exceeds 12 stops, so photographers might be running into banding issues in RAW images more often. Except a lot of them never notice, because Photoshop quietly handles it.
It would be neat if more people used “hexagonal” (i.e. in an equilateral triangle grid) pixels for capture, processing, display, printing, etc.; with higher and higher resolutions and fast GPUs (so that high-quality resampling can be done on the fly) it should be entirely possible to swap out individual parts of the pipeline without losing quality or performance (at the expense of a bunch of work to implement something nonstandard).
A lot of artifacts are easier to deal with on a hexagon-pixel grid. Halftoning and error-diffusion dithering work really well.
Ooh, interesting thought! I think I might have used a hex image to make a game at some point, but other than that I have little experience with them. Do you think the overall benefits of hexels would make it worth trying to standardize the pipeline?
I'd guess that the need for error diffusion and half-toning might wane in the future. (Not sure about that, but that's what I'd guess.) Are there other benefits you get with hex images?
The big advantages are (a) they’re much more isotropic, (b) there is less effect from grid artifacts (reduced moiré patterns, etc.), (c) they are noticeably more efficient in terms of sampling rate for a given quality of signal. If everything is done on such a grid end-to-end, you get quite noticeably better quality results for just about everything (with the exception of slightly worse results for drawing sharp grid-aligned rectangles). Even swapping in such a grid in one or another part of the pipeline can usually net some advantage.
The downside is that there are a zillion implementations of everything you can imagine w/r/t a square grid, whereas implementations on a triangular grid of hexagonal pixels are fewer and further between, often some crappy matlab code written by a grad student, so you’d maybe need to reimplement, optimize, debug, etc. a whole pile of tools, or even invent some new algorithms for doing stuff that hasn’t been tried on such a grid before.
It’s similar to the advantages you get from tri-axial fabric: it’s better but requires custom machinery, so in practice is only used for bullet-proof stuff, space suits, and art projects. Hexagonal pixels end up getting used in medical imaging, astronomical imaging, and the like, where pixels are individually expensive.
I think Fuji makes some cameras with a hexagonal grid sensor. All our algorithms are based on a square grid though, so it makes it very tough to work with unless you do a conversion.
Dithering is still needed for most of sRGB's range; Frontier needed to do it in Elite Dangerous, where banding was noticeable.
This paper: https://www.cl.cam.ac.uk/~rkm38/pdfs/boitard14hdr_colorspace...
compares the bit depth necessary for contouring due to quantisation to become imperceptible across a few encodings; most of them land in the 10-11 bits per channel range.
Thanks for the link. Some years ago I was shown a simple gradient that had noticeable banding on an 8-bit display; until that point I had believed that 8 bits was just beyond our ability to discriminate.
24bpp is not enough for shallow gradients which don't come from a photographic source (especially in dark images). I frequently dither 48bpp material for 24bpp displays.
I typically use Sierra Lite.
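For reference, a minimal sketch of Sierra Lite (also called Sierra-2-4A) for a 1-bit target, assuming a grayscale float image in [0, 1]:

    import numpy as np

    def sierra_lite(gray):
        # Push 2/4 of the quantization error right, 1/4 down-left, 1/4 down.
        img = gray.astype(np.float64).copy()
        h, w = img.shape
        out = np.zeros_like(img)
        for y in range(h):
            for x in range(w):
                old = img[y, x]
                new = 1.0 if old >= 0.5 else 0.0
                out[y, x] = new
                err = old - new
                if x + 1 < w:
                    img[y, x + 1] += err * 2 / 4
                if y + 1 < h:
                    if x - 1 >= 0:
                        img[y + 1, x - 1] += err * 1 / 4
                    img[y + 1, x] += err * 1 / 4
        return out

The appeal over Floyd-Steinberg is that it only touches three neighbours instead of four, so it's a bit cheaper while looking very similar.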
I've recently been experimenting with error diffusion along continuous space-filling curves (I forget where I saw this technique first), and using color difference functions (CIEDE2000) to gauge quantization error. These are overkill, but it sure beats doing it the same boring way forever.
Frankly, I would like it if the CRTCs attached to modern GPUs had some ability to apply dither (even just ordered dither) to buffers as they're scanned out, while exposing more precision to applications.
Unless you are integrating many pixels (how many and with what weights depends on the future processing, intended output, and viewing conditions for the image) I would not expect using a fancy color difference formula for comparing individual pixels to give a particularly useful result. Indeed, the whole point of dithering (especially with high resolution output) is that we can adjust the average light coming from a several-pixel-wide blob by brightening or darkening individual pixels, and thereby make finer distinctions in light level, since the individual pixels are not really noticed. I’m also pretty skeptical about the space-filling curve thing. Those both sound like placebo features to me.
The Butteraugli score between the standard Lena image and its Riemersma dither is about 0.5 better than the score for the standard Floyd-Steinberg dither.
Approaches like this come closer to back-propagation of errors, the lack of which is where some of the artifacts of unidirectional dither come from. In practice, any decent (full-bleed) error diffusion dither works perfectly fine going between 16- and 8-bit planes, but why shouldn't I try some things out which (as you are quick to point out) may have very little value?
Heck, even gamma correction isn't strictly necessary when you're going from 16 to 8 bit sRGB.
I don’t know enough about Butteraugli (in particular in the range of very very tiny differences in the image) to know whether it’s a reasonable metric for comparing dithering algorithms.
Why do you think this Riemersma method does better at that metric? Maybe it just keeps the diffused errors spatially closer to the original pixel?
I could see it making a difference in average color of very noisy regions, especially brightly colored ones. I suspect you could deliberately construct an image where the difference could be seen by human observers.
You don't need error diffusion when your target color space is greater than 8 bits per channel. You can just use a random number generator (or sequence).
Why exactly? Is there a threshold in bits per channel / resolution, and why at 8 bits?
I can understand that it would become less useful, because error diffusion does less if you have more levels. But (I expect) it's still useful at 6 bits (64 values); is 8 not just an arbitrary threshold?
It's because with 8 bits you can't really see the dither pattern itself once a 1-bit difference is at the human threshold of perception.
8 bits is not just an arbitrary threshold, it's the number of bits you need so that a 1 bit difference is just about the "just noticeable difference", assuming you've handled gamma and color reasonably well. 9 bits is more lightness resolution than you can normally see, and 7 bits is too few. If you want to geek out super hard, check out figure 238 here: http://www.telescope-optics.net/eye_intensity_response.htm
The JND of Weber's law is 0.4 - 0.5% for print and monitors most of the time, which is between 1/200 and 1/250. This is why 8 bits is so super popular, it has a 1-bit step of 1/256, which is normally just barely below what we can see with our eyes.
You may need to dither at 6 bits; banding could start to become pretty obvious. Whether you need error diffusion, I don't know. I doubt it would always be useful at 6 bits, but I'm sure it sometimes would. I'd say try it on some images and compare to random dithering. Random dithering only sucks on really small palettes, like 2 or 16 total colors. It's not that bad on a 256-color GIF, and it becomes hard to tell the difference when you're above 4 bits per channel, or 12 bits per pixel. 6 bits per channel is decent color resolution, and random will sometimes be good enough. If it's a photograph from a digital camera that's older than a year or two, chances are pretty good you can quantize the image without even dithering, because it's already noisy enough.
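Here's the random-dithering half of that comparison as a minimal sketch (the gradient test image and the one-LSB uniform noise are my own choices); an error-diffusion pass would be the other half:

    import numpy as np

    def quantize(channel, bits, rng=None):
        # Reduce an 8-bit channel to `bits` bits, optionally with random dither.
        levels = (1 << bits) - 1
        x = channel.astype(np.float64) / 255.0 * levels
        if rng is not None:
            x += rng.uniform(-0.5, 0.5, size=x.shape)  # ~1 LSB of uniform noise
        q = np.clip(np.round(x), 0, levels)
        return np.round(q / levels * 255.0).astype(np.uint8)

    rng = np.random.default_rng(0)
    ramp = np.tile(np.linspace(0, 255, 512), (64, 1))  # shallow gradient, worst case
    banded = quantize(ramp, 6)             # plain rounding: banding likely visible
    dithered = quantize(ramp, 6, rng=rng)  # noise breaks the bands up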
There's a really good technique for getting rid of the wormy or snake-like textures in Floyd-Steinberg: add another term to the threshold proportional to (some monotonic function of) the distance to the nearest preceding dot. This tends to make the dots very nicely spaced. These ideas were, among other places, used in Gutenprint.
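A rough sketch of what that modulation might look like; the linear falloff and the `strength` / `radius` knobs are my guesses for illustration, not what Gutenprint actually does, and the nearest-dot search is deliberately naive:

    import numpy as np

    def fs_distance_modulated(gray, strength=0.2, radius=6):
        # Floyd-Steinberg, but the threshold rises when a dot was placed nearby.
        img = gray.astype(np.float64).copy()
        h, w = img.shape
        out = np.zeros_like(img)
        dots = []  # (y, x) of recently placed "on" pixels
        for y in range(h):
            dots = [(dy, dx) for dy, dx in dots if y - dy <= radius]  # prune old rows
            for x in range(w):
                old = img[y, x]
                d = min((np.hypot(y - dy, x - dx) for dy, dx in dots), default=radius)
                d = min(d, radius)
                # Monotonic in distance: close dot -> higher threshold -> fewer clumps.
                threshold = 0.5 + strength * (1.0 - d / radius)
                new = 1.0 if old >= threshold else 0.0
                if new == 1.0:
                    dots.append((y, x))
                out[y, x] = new
                err = old - new
                if x + 1 < w:
                    img[y, x + 1] += err * 7 / 16
                if y + 1 < h:
                    if x - 1 >= 0:
                        img[y + 1, x - 1] += err * 3 / 16
                    img[y + 1, x] += err * 5 / 16
                    if x + 1 < w:
                        img[y + 1, x + 1] += err * 1 / 16
        return out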
Dithering trades sample resolution for sample frequency to convey the same information. For images, the underlying sample frequency usually doesn't change but the apparent spatial sample frequency becomes lower in order to achieve more than one effective bit per (coarser) sample.
For one-dimensional signals like audio, the underlying sample frequency is usually increased while decreasing the sample resolution (often to 1 bit per sample). This keeps the effective Nyquist frequency where it needs to be while pushing noise much higher in frequency where it's very easy to remove with an analog filter. Delta-sigma modulation is perhaps the most common method of audio dithering (although the delta-sigma literature rarely uses the D word).
The reason images and audio usually use dithering in opposite ways is that images are usually post-processed for lower true sample resolution (and higher effective sample resolution) after sampling, while audio is often sampled initially at much higher than the Nyquist frequency because dithering is a planned part of the audio processing chain. But not always! Those are merely common use cases.
You can trade sample accuracy for sample rate. This is how I directly interpret ordered dithering. The following line in the Wikipedia article communicates (maybe) this:
"The size of the map selected should be equal to or larger than the ratio of source colors to target colors. For example, when quantizing a 24bpp image to 15bpp (256 colors per channel to 32 colors per channel), the smallest map one would choose would be 4x2, for the ratio of 8 (256:32). This allows expressing each distinct tone of the input with different dithering patterns."
The same idea applies to 1D signals (from https://www.wikiwand.com/en/Delta-sigma_modulation: "used to convert high bit-count, low-frequency digital signals into lower bit-count, higher-frequency digital signals").
Dithering noise, i.e. the difference between the original signal and the dithered signal, needs to be considered in "frequency space" like any other kind of noise.
Dithering noise has the special feature of depending very strongly on the sampling rate and (statistically) very weakly on the input signal being dithered.
Dithering allows replacing high-accuracy sampling (typically constrained by computer word lengths) with high-sample-rate sampling (typically cheap): without using more accurate arithmetic, analog and often completely free lowpass filtering (such as looking at a dithered image from far enough away) makes N samples, with M bits of accuracy each, equivalent to one sample with M + log N bits of accuracy.
You can indeed trade sample accuracy for sample rate as long as information is not lost -- and in dithering, information is preserved by diffusing the quantization error spatially.
Look at the image you posted; on the right, the effective "pixels" have become larger while each has acquired more gray levels. That's a lower sample frequency traded for higher quantization resolution.
You're thinking of error diffusion, not the dithering itself; they are two separate things. Error diffusion is what spreads the quantization error out spatially.
Audio and image dithering are the same thing, they only become different when you add error diffusion.
The modern image dithering that you use when going from an HDR image down to an 8-bit-per-channel PNG or JPEG uses exactly the same (very simple) algorithm you'd use to dither an audio file during mastering.
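One common version of such an algorithm is TPDF (triangular) dither followed by rounding; a minimal sketch, where the array shapes and level counts are just examples and nothing in the function is image- or audio-specific:

    import numpy as np

    rng = np.random.default_rng(0)

    def dither_and_quantize(x, levels):
        # x: float data scaled to [0, 1]; add ~1 LSB of triangular noise, then round.
        noise = rng.triangular(-1.0, 0.0, 1.0, size=x.shape)
        q = np.round(x * levels + noise)
        return np.clip(q, 0, levels) / levels

    # Image: a float (HDR-ish) channel down to 8 bits per channel.
    hdr_channel = rng.random((256, 256)) ** 2.2
    img8 = np.round(dither_and_quantize(hdr_channel, 255) * 255).astype(np.uint8)

    # Audio: float samples in [-1, 1] down to 16-bit PCM.
    audio = np.sin(2 * np.pi * 440 * np.arange(48000) / 48000)
    pcm16 = (np.round(dither_and_quantize((audio + 1) / 2, 65535) * 65535) - 32768).astype(np.int16)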
Fair enough. I forgot about noise dithering, which only preserves error information statistically rather than exactly.
I don't understand your point about audio and image dithering being different when error diffusion is present. You can use error diffusion on both; in delta-sigma audio modulation, the error is diffused in only one dimension but it's still error diffusion.
Yes, absolutely right. Just making clear the distinction between dither and diffuse. Dither is always a local operation on a sample; diffusion is the spatial spreading that makes the neighborhood more pleasant for humans.
I was reacting to the statement "images and audio usually use dithering in opposite ways". The article here is about error diffusion, but it's pretty hard to say whether adding error diffusion to the dither process is the 'usual' choice, for either images or audio. It's very very common to not use error diffusion in both cases.
Diffusion is much more important when the destination's quantization step is well above the human just-noticeable difference. For example, going from 8 bits per channel to 1 bit per pixel, or, for audio, going from 16-bit samples down to 8-bit samples.
I have never used a laser printer that implemented anything like this; the usual halftoning sucks. Once I handcrafted a file with a 600 dpi Floyd-Steinberg image (the native resolution of the printer I had) and it gave much better results, though I didn't bother calibrating the gray levels.
I came to the same conclusion just this week. I got the best print simply by matching the density of my black and white laser printer with the black dots I wanted it to print.
Specifically, I used the ImageMagick operations "-dither FloydSteinberg -monochrome" for more contrast or "-dither FloydSteinberg -remap pattern:gray50" for more fidelity to the original image.
I was surprised that the prints were better without adjusting for gamma. If I converted the image to a linear color space before the dither operation, the print came out too dark. I'm guessing that the gamma in the non-linear color space compensated for the dot gain on the printer to cancel out the effect.
> I was surprised that the prints were better without adjusting for gamma.
Ha, it was the same for me. I used imageworsener where I had to explicitly specify -nogamma for this effect.
Another place where wrong gamma handling accidentally works is text anti-aliasing. It turns out that for small font sizes, the incorrect handling of colorspace results in more readable text than the gamma-correct method, but only for dark text on a light background [1]. No wonder people don't like to use light fonts on a dark background (like in terminals).
What a coincidence, I just stumbled upon this article a few hours ago while trying to find out which error diffusion algorithm is used by Photoshop out of curiosity.
I've been playing with dithering recently to create braille art[0] and this series of articles[1] by the libcaca developers has been a huge help. It also goes over model based dithering algorithms which tend to give the best results.
Yep, Photoshop does dithering by default when converting from 16 bits per channel (or higher) down to 8 bits per channel. Many, many image processing apps & libraries don't do this, and I wish they would!
This is a little different than the article though -- all the article's dithers are aimed at halftoning, i.e. 1-bpp images. If your destination is 16-bpp images, or 16-bit audio, you don't need error diffusion, just a decent sequence of random numbers.
Just to add to what @grenoire said -- if you only shift/truncate, then there are times when you can see (or hear) a 1-bit difference. You can truncate, numerically speaking, but depending on what you're doing, the result won't be very good.
When taking an image from a higher range down to 8-bits per channel, there are two specific cases where you tend to see bad banding problems. When doing high quality poster prints like a giclee or photo print, the printer's color gamut is drastically different from a monitor, so banding that you don't notice on a monitor can suddenly become a plainly visible eye sore in a print. This is a big problem if you're paying $100 for your large format print.
The other problem is when resizing an image down. The resampling process smoothes out noise in an image. Normally, a DSLR photo has a signal to noise ratio that is less than 8 bits, meaning even 16 bit RAW photos have noise that is stronger than the lower 8 bits. You can usually truncate the lower 8 bits without noticing. But if you resample the image and downscale it, the noise is smoothed and banding can appear. This can happen with photos of any large flat field of color, like a wall or the sky.
The article mentions ordered dithering but fails to list void-and-cluster and similar variants thereof. Those parallelize really well (unlike error diffusion), don't result in obvious patterns (unlike normal ordered dithering), and can be run on GPUs. That's quite useful for dithering high-bit-depth video down to 8 bits in realtime. Dithering HDR content has the benefit of not introducing banding on SDR displays.
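For illustration, a minimal sketch of mask-based dithering with a precomputed threshold texture (the void-and-cluster mask itself is assumed and not generated here; the 10-bit-to-8-bit conversion and the 64x64 mask size are just examples). Every pixel is independent of its neighbours, which is why this maps so well onto a GPU:

    import numpy as np

    def dither_10bit_to_8bit(frame10, mask):
        # mask: threshold texture with values in [0, 1), e.g. void-and-cluster.
        h, w = frame10.shape
        mh, mw = mask.shape
        tiled = np.tile(mask, (h // mh + 1, w // mw + 1))[:h, :w]
        x = frame10.astype(np.float64) / 1023.0 * 255.0
        return np.clip(np.floor(x + tiled), 0, 255).astype(np.uint8)

    # White noise is only a stand-in for a real void-and-cluster mask here:
    # out = dither_10bit_to_8bit(frame10, np.random.default_rng(0).random((64, 64)))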
Ordered dithering also has a very important feature: it's stable with respect to animation. If you move a single pixel in an animation dithered with Floyd-Steinberg, the error spread can be huge and have visible effects on the whole image. Ordered dithering, however, has a limited scope and is much more resistant to animation.
Seeing a dithered image really brings me back to the mid '90s. Seeing a dithered photograph reminds me of the early web or AOL. You used to remove colors from GIFs to save space on web pages, keeping as few as you could as long as the image was still tolerable.
With a 1MB SVGA card, you could pick between 16-bit color at 800x600, or 8-bit (256 colors) @ 1024x768. Did you value higher resolution, or not having to palette shift every time you switch apps?
Indeed, the dithered image looks way too bright. It should be converted from sRGB gamma to linear light before dithering to black/white. Also, you'll need more than 8 bits per channel if you hold the linear image as an intermediate result, otherwise you get posterization in the dark areas before the dithering even starts.