Chris
02-11-2006, 12:20 PM
Image Resizing
--------------
ImgSource v4.0 offers 8 different resizing functions. These functions cover all of ImgSource's supported pixel formats (and then some), and a total of 18 different resizing algorithms. ImgSource can resize 1-bit, 8-bit (with palette), 8-bit grayscale, RGB-24, RGBA-32, 1-bit grayscale RGB-48, RGBA-64, 1-bit to 8-bit grayscale, 8-bit to RGB-24, and more.
The 18 resizing algorithms fall into five basic categories: Nearest Neighbor, Bi-linear Resampling, Bi-Cubic Resampling, Area Averaging and Filter-based. I'll describe the general idea, and the pros and cons to each algorithm. But first…
Resampling
----------
In image processing, the act of resizing a stream of data is known as resampling. To resample a signal, you take samples of the data source at specific intervals then apply algorithms to that data to produce a new signal that approximates the original, but which uses a different amount of data. In image processing, we're generally trying to resize a two dimensional image in such a way that the resized image resembles the original as much as possible, given a limited amount of time and memory. The goal when reducing images is to preserve the character of the source data, but with fewer points. The goal when enlarging images is to create an image which has more data than the source image by creating new data.
To illustrate these algorithms, I'll use some one-dimensional examples. When you read them, note that "Ns" stands for the number of source samples (ie. pixels), and that "Nd" stands for the number of destination (ie. output) samples: ex. if Nd = 10, there are 10 samples in the output.
Nearest-neighbor:
----------------------------------------
The easiest way to resample anything is to pick every Floor(Ns/Nd)th sample out of the source: ex. if Ns = 10 and Nd = 5 (that is, you want your output data to be 1/2 the size of the source), you will pick every 2nd data point from the source. The math is simple, and so this is a very fast operation. As with most processes that is both quick and simple, the results are not the best. This method ignores too much of the source data to accurately reproduce it. But, if you a resized image very quickly, and aren't particular about accuracy, this is a good choice.
When reducing, this method creates jagged, grainy images. And when enlarging, because it picks the same source point for multiple output points - this leads to the dreaded "fat pixel" effect.
Examples:
1. Assume Ns = 9 and Nd = 3 (reducing by 3x), the source data is:
Src0 Src1 Src2 Src3 Src4 Src5 Src6 Src7 Src8
And output data will be:
Src0 Src3 Src6
2. Assume Ns = 5 and Nd = 20 (enlarging 4x), the input data is:
Src0 Src1 Src2 Src3 Src4
And output data will be:
Src0 S0 S0 S0 S1 S1 S1 S1 S2 S2 S2 S2 S3 S3 S3 S3 S4 S4 S4 S4
Note how the source pixels repeat. Ideally, you'd want the second, third and fourth data points to be some blend of Src0 and Src1, and not just a strict copy.
Strictly speaking, when enlarging with this method, you are in fact preserving the data in the original image exactly: every source point is used in the destination image. And, you are not introducing any new data into the new image. But, the results are not visually pleasing. More sophisticated techniques can get rid of the fat pixel effect by calculating what the data between source pixel SrcN and SrcN+1 could be.
Bi-linear resampling
------------------------------------
This is a simple technique. And here it is:
For every destination pixel, find the location of the ideal source pixel by using the Ns/Nd ratio, as above. But this time, don't use the Floor function; instead, preserve the fractional information.
Example: if Ns = 5 and Nd = 15 and we're trying to find the 2nd point in the destination, the ideal source point is at 0.666 (2 * (5 / 15) = 0.666). Since we can't address data at fractional locations, we'll do a simple weighted average of the two data points closest to our ideal location: 0 and 1. Our destination pixel is then 0.666 of the data at point 0 and (1.0 - 0.666) of the data at point 1.
This technique is called linear interpolation. If you were to graph the two data points used in the calculation above, with a line between them, the ideal data value will be somewhere on that line. Bi-linear means that you do it twice - once horizontally and once vertically. The math is the same in either direction.
Using this technique on image data gives results that are far better than the nearest neighbor technique. It greatly reduces the fat pixel effect when enlarging, and gives nice smooth images when reducing. It's also very fast.
Bi-cubic resampling
------------------------------------------
Cubic interpolation is similar to linear interpolation in that you use your existing data to come up with an equation to model that data so that you can make an educated guess at what other points in that data set will be. Linear interpolation uses two data points to generate a simple line, and you pick your destination data from that line; in cubic interpolation, you use four data points to generate a 3rd degree equation (of the form x3 + x2 + x + c) and pick your destination data from the curve. This makes the math much more complicated, and I won't try to explain it.
In linear interpolation, you use two source data points to find one the destination point. In cubic interpolation, you use four source points to find one destination point. So, each destination point is created from twice as many source points, and a cubic equation can model more sophisticated relationships than a simple line can.
The image results from bi-cubic interpolation (cubic interpolation performed in both X and Y directions) are often sharper than bi-linear interpolation because of the higher accuracy possible with 3rd degree equations and the higher number of source pixels used. The downside is that the calculations required to do this are more time-consuming than previous methods.
Area-averaging
------------------------------
This technique is much different from the resampling techniques described above. In those techniques, the number of source points per destination point was fixed by the requirements of the math: 2 for linear resampling, 4 for cubic resampling. The intent of this algorithm is also different. The resampling techniques are designed to reproduce or mimic the source data as closely as possible; area averaging is designed to find the average data value in a given range of data.
The way it works is simple and intuitive: if you have Nd output pixels, just divide the source data into Nd regions and calculate each destination point as the average data value from the corresponding source region. Every data point in the source contributes equally to the output: and a destination pixel is the average of all source pixels that it represents.
Unfortunately, this technique can only be used when reducing images. If Ns < Nd, the source regions end up being less than one pixel each and the technique becomes equivalent to the nearest-neighbor algorithm. But, for reduction, this can give excellent results, and it's very fast.
Filter resampling
-----------------------------------------
ImgSource provides 13 different filter-based resizing methods. They all work like this: for each pixel, a number of neighboring pixels is combined in a weighted average to form the output pixel. In each method, the weightings for the pixels are determined by a different function, and each method uses a different number of neighbor pixels.
The results from these different methods vary dramatically. Some of the methods are probably useless for any purpose, but other methods give the best results of all the ImgSource resizing methods. But, as with many things, you pay for the better results with long calculation times. The better algorithms (ex. Catrom, Lanczos, Mitchell) sample a large number of source pixels for every output pixel, and that takes time.
The actual number of pixels sampled depends on the method and the ratio of Ns/Nd, and in general reduction is more costly than enlarging.
Here's an example: if you are using the Lanczos3 method, and reducing the image by 1/3, the algorithm will use eighteen source points for every output point - in each dimension. That means it does a weighted average of 324 source pixels per output pixel. The Sinc method is even more costly: the same 1/3 reduction will use 576 source pixels per output pixel. Compare that to four source pixels per output pixel for bi-linear resizing and one source per output pixel for nearest-neighbor, and you can see why the filtering methods are much slower. On the bright side, enlarging with these methods is less costly: Lanczos3 uses 9 source pixels, regardless of output size, Sinc uses 16, etc..
But if you need excellent results and time is not critical, though, these methods are worth the time.
Bottom Line
-----------
For most purposes, the simple methods (bi-linear, bi-cubic and area-averaging) will give respectable results. If you can afford to wait, some of the filtered methods can give outstanding results. The method you choose just depends on how long your users are willing to wait vs. the quality the need.
--------------
ImgSource v4.0 offers 8 different resizing functions. These functions cover all of ImgSource's supported pixel formats (and then some), and a total of 18 different resizing algorithms. ImgSource can resize 1-bit, 8-bit (with palette), 8-bit grayscale, RGB-24, RGBA-32, 1-bit grayscale RGB-48, RGBA-64, 1-bit to 8-bit grayscale, 8-bit to RGB-24, and more.
The 18 resizing algorithms fall into five basic categories: Nearest Neighbor, Bi-linear Resampling, Bi-Cubic Resampling, Area Averaging and Filter-based. I'll describe the general idea, and the pros and cons to each algorithm. But first…
Resampling
----------
In image processing, the act of resizing a stream of data is known as resampling. To resample a signal, you take samples of the data source at specific intervals then apply algorithms to that data to produce a new signal that approximates the original, but which uses a different amount of data. In image processing, we're generally trying to resize a two dimensional image in such a way that the resized image resembles the original as much as possible, given a limited amount of time and memory. The goal when reducing images is to preserve the character of the source data, but with fewer points. The goal when enlarging images is to create an image which has more data than the source image by creating new data.
To illustrate these algorithms, I'll use some one-dimensional examples. When you read them, note that "Ns" stands for the number of source samples (ie. pixels), and that "Nd" stands for the number of destination (ie. output) samples: ex. if Nd = 10, there are 10 samples in the output.
Nearest-neighbor:
----------------------------------------
The easiest way to resample anything is to pick every Floor(Ns/Nd)th sample out of the source: ex. if Ns = 10 and Nd = 5 (that is, you want your output data to be 1/2 the size of the source), you will pick every 2nd data point from the source. The math is simple, and so this is a very fast operation. As with most processes that is both quick and simple, the results are not the best. This method ignores too much of the source data to accurately reproduce it. But, if you a resized image very quickly, and aren't particular about accuracy, this is a good choice.
When reducing, this method creates jagged, grainy images. And when enlarging, because it picks the same source point for multiple output points - this leads to the dreaded "fat pixel" effect.
Examples:
1. Assume Ns = 9 and Nd = 3 (reducing by 3x), the source data is:
Src0 Src1 Src2 Src3 Src4 Src5 Src6 Src7 Src8
And output data will be:
Src0 Src3 Src6
2. Assume Ns = 5 and Nd = 20 (enlarging 4x), the input data is:
Src0 Src1 Src2 Src3 Src4
And output data will be:
Src0 S0 S0 S0 S1 S1 S1 S1 S2 S2 S2 S2 S3 S3 S3 S3 S4 S4 S4 S4
Note how the source pixels repeat. Ideally, you'd want the second, third and fourth data points to be some blend of Src0 and Src1, and not just a strict copy.
Strictly speaking, when enlarging with this method, you are in fact preserving the data in the original image exactly: every source point is used in the destination image. And, you are not introducing any new data into the new image. But, the results are not visually pleasing. More sophisticated techniques can get rid of the fat pixel effect by calculating what the data between source pixel SrcN and SrcN+1 could be.
Bi-linear resampling
------------------------------------
This is a simple technique. And here it is:
For every destination pixel, find the location of the ideal source pixel by using the Ns/Nd ratio, as above. But this time, don't use the Floor function; instead, preserve the fractional information.
Example: if Ns = 5 and Nd = 15 and we're trying to find the 2nd point in the destination, the ideal source point is at 0.666 (2 * (5 / 15) = 0.666). Since we can't address data at fractional locations, we'll do a simple weighted average of the two data points closest to our ideal location: 0 and 1. Our destination pixel is then 0.666 of the data at point 0 and (1.0 - 0.666) of the data at point 1.
This technique is called linear interpolation. If you were to graph the two data points used in the calculation above, with a line between them, the ideal data value will be somewhere on that line. Bi-linear means that you do it twice - once horizontally and once vertically. The math is the same in either direction.
Using this technique on image data gives results that are far better than the nearest neighbor technique. It greatly reduces the fat pixel effect when enlarging, and gives nice smooth images when reducing. It's also very fast.
Bi-cubic resampling
------------------------------------------
Cubic interpolation is similar to linear interpolation in that you use your existing data to come up with an equation to model that data so that you can make an educated guess at what other points in that data set will be. Linear interpolation uses two data points to generate a simple line, and you pick your destination data from that line; in cubic interpolation, you use four data points to generate a 3rd degree equation (of the form x3 + x2 + x + c) and pick your destination data from the curve. This makes the math much more complicated, and I won't try to explain it.
In linear interpolation, you use two source data points to find one the destination point. In cubic interpolation, you use four source points to find one destination point. So, each destination point is created from twice as many source points, and a cubic equation can model more sophisticated relationships than a simple line can.
The image results from bi-cubic interpolation (cubic interpolation performed in both X and Y directions) are often sharper than bi-linear interpolation because of the higher accuracy possible with 3rd degree equations and the higher number of source pixels used. The downside is that the calculations required to do this are more time-consuming than previous methods.
Area-averaging
------------------------------
This technique is much different from the resampling techniques described above. In those techniques, the number of source points per destination point was fixed by the requirements of the math: 2 for linear resampling, 4 for cubic resampling. The intent of this algorithm is also different. The resampling techniques are designed to reproduce or mimic the source data as closely as possible; area averaging is designed to find the average data value in a given range of data.
The way it works is simple and intuitive: if you have Nd output pixels, just divide the source data into Nd regions and calculate each destination point as the average data value from the corresponding source region. Every data point in the source contributes equally to the output: and a destination pixel is the average of all source pixels that it represents.
Unfortunately, this technique can only be used when reducing images. If Ns < Nd, the source regions end up being less than one pixel each and the technique becomes equivalent to the nearest-neighbor algorithm. But, for reduction, this can give excellent results, and it's very fast.
Filter resampling
-----------------------------------------
ImgSource provides 13 different filter-based resizing methods. They all work like this: for each pixel, a number of neighboring pixels is combined in a weighted average to form the output pixel. In each method, the weightings for the pixels are determined by a different function, and each method uses a different number of neighbor pixels.
The results from these different methods vary dramatically. Some of the methods are probably useless for any purpose, but other methods give the best results of all the ImgSource resizing methods. But, as with many things, you pay for the better results with long calculation times. The better algorithms (ex. Catrom, Lanczos, Mitchell) sample a large number of source pixels for every output pixel, and that takes time.
The actual number of pixels sampled depends on the method and the ratio of Ns/Nd, and in general reduction is more costly than enlarging.
Here's an example: if you are using the Lanczos3 method, and reducing the image by 1/3, the algorithm will use eighteen source points for every output point - in each dimension. That means it does a weighted average of 324 source pixels per output pixel. The Sinc method is even more costly: the same 1/3 reduction will use 576 source pixels per output pixel. Compare that to four source pixels per output pixel for bi-linear resizing and one source per output pixel for nearest-neighbor, and you can see why the filtering methods are much slower. On the bright side, enlarging with these methods is less costly: Lanczos3 uses 9 source pixels, regardless of output size, Sinc uses 16, etc..
But if you need excellent results and time is not critical, though, these methods are worth the time.
Bottom Line
-----------
For most purposes, the simple methods (bi-linear, bi-cubic and area-averaging) will give respectable results. If you can afford to wait, some of the filtered methods can give outstanding results. The method you choose just depends on how long your users are willing to wait vs. the quality the need.