Proposal for a new image format (container)
In addition to several MIP-levels, the image file contains a matrix of fixed-size sub-images (tiles, e.g. 1024x1024 px each) for each MIP-level.
Each sub-image has a set of coordinates/indexes attached to it so that the original image can easily be reconstructed by stitching the sub-images together.
The MIP-based file format will have to be altered slightly: Instead of a simple JPG-compressed MIP-level, a tiled image would have to be stored.
Since every MIP-level has only half the width and half the height of the previous level, the number of tiles is reduced by a factor of 4 in each step.
The file header, which currently consists of a simple offset map, would have to be extended to store the intra-MIP offsets for each tile in addition to the MIP-level offset.
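The shrinking tile grid can be sketched roughly as follows. This is only an illustration under assumptions: the names LevelLayout and computeLevelLayouts are made up, partial edge tiles are kept by rounding up, and the chain stops as soon as one side would be smaller than a single tile.

```cpp
#include <vector>

// Hypothetical description of the tile grid of one MIP level.
struct LevelLayout {
    int width;   // level width in pixels
    int height;  // level height in pixels
    int tilesX;  // number of tile columns
    int tilesY;  // number of tile rows
};

// Halve the image until one side is smaller than a tile and record
// the tile grid of every level on the way (level 0 = full resolution).
std::vector<LevelLayout> computeLevelLayouts(int width, int height, int tileSize)
{
    std::vector<LevelLayout> levels;
    if (tileSize <= 0)
        return levels;
    while (width >= tileSize && height >= tileSize) {
        LevelLayout layout;
        layout.width = width;
        layout.height = height;
        layout.tilesX = (width + tileSize - 1) / tileSize;   // round up
        layout.tilesY = (height + tileSize - 1) / tileSize;  // round up
        levels.push_back(layout);
        width /= 2;
        height /= 2;
    }
    return levels;
}
```

For a 4096x2048 image and 128 px tiles this gives grids of 32x16, 16x8, 8x4, 4x2 and 2x1 tiles, so the size of each level's tile offset table in the header follows directly from the image size and the tile size.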
Why all the effort?
If the user zooms in on a large image (several megapixels), decoding the next higher-resolution MIP-level takes a lot of time.
Current screens only have resolutions of 2-3 megapixels, so a large image is never fully visible at full resolution, and decoding the whole image is a waste of memory and time. A simple grid-based visibility test would quickly yield which tiles are actually visible and which are hidden from view. Only the (partially) visible ones would have to send a load request to the image loader thread, greatly reducing decoding time and memory consumption.
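A grid-based visibility test of this kind might look like the sketch below. It assumes the view rect is already given in pixel coordinates of the MIP level in question, with exclusive right and bottom edges; TileRange and visibleTiles are hypothetical names.

```cpp
#include <algorithm>

// Inclusive range of visible tile indexes; empty is true if nothing is visible.
struct TileRange {
    int firstX, firstY;
    int lastX, lastY;
    bool empty;
};

// Map a view rect (level pixel coordinates, right/bottom exclusive)
// onto the tile grid of that level.
TileRange visibleTiles(int viewLeft, int viewTop, int viewRight, int viewBottom,
                       int tilesX, int tilesY, int tileSize)
{
    TileRange range = { 0, 0, -1, -1, true };
    // clip the view rect against the area covered by tiles
    const int left = std::max(viewLeft, 0);
    const int top = std::max(viewTop, 0);
    const int right = std::min(viewRight, tilesX * tileSize);
    const int bottom = std::min(viewBottom, tilesY * tileSize);
    if (tileSize <= 0 || left >= right || top >= bottom)
        return range;  // view and image do not overlap
    range.firstX = left / tileSize;
    range.firstY = top / tileSize;
    range.lastX = (right - 1) / tileSize;
    range.lastY = (bottom - 1) / tileSize;
    range.empty = false;
    return range;
}
```

Only the tiles inside the returned range would need a load request; the test itself is a handful of integer operations, no matter how large the image is.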
The overhead
Instead of reading just one offset and jumping directly to the data, the image loader thread will have to read two offsets.
The first offset value is necessary to jump to the MIP-level internal offset table describing the offset for each image tile.
The second offset value is necessary to jump from the MIP-level offset table directly to the tile data.
Each data request now consists of two offset lookups instead of just one, but considering the significant reduction in the amount of data actually read from the file, the overhead is negligible.
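To make the two lookups concrete, here is a minimal sketch that works on a file already loaded into memory. The layout is an assumption for illustration only: offsets are 32-bit values in host byte order, and readOffset and findTileData are hypothetical helpers, not part of the actual loader.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Read one 32-bit offset stored at byte position pos (host byte order assumed).
std::uint32_t readOffset(const std::vector<unsigned char>& file, std::size_t pos)
{
    if (pos > file.size() || file.size() - pos < sizeof(std::uint32_t))
        throw std::out_of_range("offset lies outside the file");
    std::uint32_t value = 0;
    std::memcpy(&value, file.data() + pos, sizeof(value));
    return value;
}

// Resolve the position of a tile's data with two table lookups.
std::uint32_t findTileData(const std::vector<unsigned char>& file,
                           std::size_t mipTablePos,
                           std::size_t level,
                           std::size_t tileIndex)
{
    // 1st lookup: MIP offset table -> position of this level's tile offset table
    const std::uint32_t tileTablePos =
        readOffset(file, mipTablePos + level * sizeof(std::uint32_t));
    // 2nd lookup: tile offset table -> position of the tile data
    return readOffset(file, tileTablePos + tileIndex * sizeof(std::uint32_t));
}
```

Both lookups read only a few bytes each, which is tiny compared to the tile data that no longer has to be read and decoded.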
Example Code
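#include <QDataStream>
#include <QFile>
#include <QImage>
#include <QString>

// LFI: Lightbox fragmented image
static const int kTileSize = 128;                 // tile edge length in pixels
static const qint64 kOffsetSize = sizeof(qint64); // offsets are stored as qint64, the type of QIODevice::pos()

// sOutputPath is an assumption: the sketch needs a target file to write to.
// getSubImage() is assumed to return the encoded tile in a streamable type (e.g. QByteArray).
void convertToLFI(const QString& sImagePath, const QString& sOutputPath)
{
    // open the file and pass it on to the decoding chain to get a QImage
    QImage img;
    if (imageLoader::eOk != CApi::loadImage(sImagePath, img))
        return;
    QFile file(sOutputPath);
    if (!file.open(QIODevice::WriteOnly))
        return;
    QDataStream stream(&file);

    // write static header information
    stream << QString("LightBox fragmented image");
    stream << qint32(0x6c696768); // "ligh"
    stream << qint32(0x74626f78); // "tbox"

    // count the MIP levels: halve the size until one side is smaller than a tile
    int iNofMipLevels = 0;
    for (int w = img.width(), h = img.height(); w >= kTileSize && h >= kTileSize; w /= 2, h /= 2)
        ++iNofMipLevels;

    // reserve space for the MIP offset table (filled in level by level below)
    const qint64 iMipTablePos = file.pos();
    for (int i = 0; i < iNofMipLevels; ++i)
        stream << qint64(0);

    for (int iLevel = 0; iLevel < iNofMipLevels; ++iLevel)
    {
        // store the offset of this MIP level in the MIP offset table
        const qint64 iLevelPos = file.pos();
        file.seek(iMipTablePos + iLevel * kOffsetSize);
        stream << iLevelPos;
        file.seek(iLevelPos);

        // reserve space for the tile offset table; round up to keep partial edge tiles
        const int iNofTilesX = (img.width() + kTileSize - 1) / kTileSize;
        const int iNofTilesY = (img.height() + kTileSize - 1) / kTileSize;
        const qint64 iTileTablePos = file.pos();
        for (int i = 0; i < iNofTilesX * iNofTilesY; ++i)
            stream << qint64(0);

        for (int iY = 0; iY < iNofTilesY; ++iY)
        {
            for (int iX = 0; iX < iNofTilesX; ++iX)
            {
                // store the tile offset, then write the sub-image to the stream
                const qint64 iTilePos = file.pos();
                file.seek(iTileTablePos + (iY * iNofTilesX + iX) * kOffsetSize);
                stream << iTilePos;
                file.seek(iTilePos);
                stream << getSubImage(img, iX, iY, kTileSize, kTileSize, "jpg");
            }
        }

        // halve the image for the next MIP level
        img = img.scaled(img.width() / 2, img.height() / 2, Qt::IgnoreAspectRatio, Qt::SmoothTransformation);
    }
}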
Maybe the tiles could also be written and read sequentially, since the item and the MIP level have to share the identical tile setup anyway.
The item could then just request tile number X of MIP level Y.
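With sequential storage, a tile number would simply be the row-major index into the tile grid. A small sketch, assuming rows are stored one after another from top to bottom (tileNumber, tileCoord and TileCoord are illustrative names):

```cpp
// Column and row of a tile inside the tile grid of one MIP level.
struct TileCoord {
    int x;
    int y;
};

// Row-major tile number of the tile in column iX and row iY.
int tileNumber(int iX, int iY, int tilesX)
{
    return iY * tilesX + iX;
}

// Inverse mapping: grid position of a given tile number (tilesX must be > 0).
TileCoord tileCoord(int number, int tilesX)
{
    TileCoord coord = { number % tilesX, number / tilesX };
    return coord;
}
```

As long as writer and reader agree on the number of tile columns per level, the tile number alone is enough to address a tile.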
The header absolutely must include the tile size used in the file. Otherwise a later change of the tile size would render old files useless.
In order to provide a pointer for each tile of each MIP level, the item will have to prepare a structure in its loadImage() method:
// compute the number of mip levels
for each mip level: create a new list of pointers
For this to work, the image item will have to query the image loader class for that information.
Each list consists of exactly as many NULL pointers as that particular MIP level has tiles.
This way, the image loader always has pointers at its disposal and never has to create lists and pointers at render time.
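The pointer structure could be prepared along these lines. This is a sketch, not the item's real code: Tile is an empty stand-in for decoded tile data, std::unique_ptr is swapped in for the plain NULL pointers mentioned above, and buildTilePointerLists is a hypothetical free function.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct Tile { };  // stand-in for decoded tile data (hypothetical)

typedef std::vector<std::unique_ptr<Tile> > TileList;

// Create one list per MIP level, each holding as many null pointers
// as that level has tiles. Nothing is allocated per tile yet.
std::vector<TileList> buildTilePointerLists(const std::vector<int>& tilesPerLevel)
{
    std::vector<TileList> levels;
    levels.reserve(tilesPerLevel.size());
    for (std::size_t i = 0; i < tilesPerLevel.size(); ++i) {
        const int count = tilesPerLevel[i] > 0 ? tilesPerLevel[i] : 0;
        levels.push_back(TileList(static_cast<std::size_t>(count)));
    }
    return levels;
}
```

The image loader can later fill individual slots without resizing any list, so no list has to be created at render time.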
image loader class:
getMipLevelConfiguration(sImagePath, iMipLevel):
Returns the number of tiles in a particular MIP level, so that an item can prepare its structure before actually requesting image data.
If the image hasn't been cached yet at the time of the call, it has to be cached before the function returns.
In order to keep the GUI thread from blocking, the item could just call a method of the image loader (pseudocode):
imageLoader->buildPointerStructure(sImagePath);
The image loader then caches the item's image if necessary (threaded) and, as soon as caching is done, reads the required information from the file and calls the item’s “buildPointerStructure()” function to build the pointer-list structure.
As soon as the item has the pointer-list structure set, it can begin rendering (if it is visible).
When rendering, the item knows which level it is at, so it will (in a first implementation attempt) try to render tiles of that particular level by fetching the pointer list from the pointer structure.
Iterating through the list, each pointer that is not NULL is rendered, while the x and y coordinates advance by one tile per pointer.
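The iteration could look roughly like the following sketch. Instead of painting, it collects the positions at which tiles would be drawn, which keeps it independent of any graphics API; Tile, DrawCall and collectDrawCalls are hypothetical names, and the list is assumed to be in row-major order.

```cpp
#include <cstddef>
#include <vector>

struct Tile { };  // stand-in for decoded tile data (hypothetical)

// One tile to be drawn with its top-left corner at (x, y).
struct DrawCall {
    const Tile* tile;
    int x;
    int y;
};

// Walk the pointer list of one MIP level and emit a draw call for every
// loaded tile, advancing the position by one tile per pointer.
std::vector<DrawCall> collectDrawCalls(const std::vector<const Tile*>& tiles,
                                       int tilesX, int tileSize)
{
    std::vector<DrawCall> calls;
    if (tilesX <= 0)
        return calls;
    int column = 0, x = 0, y = 0;
    for (std::size_t i = 0; i < tiles.size(); ++i) {
        if (tiles[i] != nullptr) {
            DrawCall call = { tiles[i], x, y };
            calls.push_back(call);
        }
        // advance to the next tile, wrapping to the next row at the end of a row
        ++column;
        x += tileSize;
        if (column == tilesX) {
            column = 0;
            x = 0;
            y += tileSize;
        }
    }
    return calls;
}
```

Tiles that have not been loaded yet are simply skipped, so a partially loaded level can already be shown.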
zooming the view/the item:
If the item switches from a higher to a lower resolution (because the view is zoomed out), it actively discards the data stored in the higher-resolution MIP level (perhaps skipping one MIP level, in order to prevent immediate recaching when zooming back in slightly).
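One possible reading of this rule, as a sketch: level 0 is assumed to be the full-resolution level, and keepLevels is the number of finer levels to spare next to the current one (1 for the "skip one level" idea). Tile, TileList and discardFinerLevels are hypothetical names.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct Tile { };  // stand-in for decoded tile data (hypothetical)

typedef std::vector<std::unique_ptr<Tile> > TileList;

// Free the tiles of all levels that are more than keepLevels steps finer
// than the currently displayed level. Returns the number of tiles freed.
std::size_t discardFinerLevels(std::vector<TileList>& levels,
                               std::size_t currentLevel,
                               std::size_t keepLevels)
{
    std::size_t freed = 0;
    if (currentLevel <= keepLevels)
        return freed;  // nothing is far enough away to be discarded
    std::size_t firstKept = currentLevel - keepLevels;
    if (firstKept > levels.size())
        firstKept = levels.size();
    for (std::size_t level = 0; level < firstKept; ++level) {
        for (std::size_t i = 0; i < levels[level].size(); ++i) {
            if (levels[level][i]) {
                levels[level][i].reset();  // the slot stays, only the data goes
                ++freed;
            }
        }
    }
    return freed;
}
```

The pointer lists themselves are kept, so zooming back in only has to refill the empty slots.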
panning the view/moving the item:
A visibility test is performed on the currently displayed MIP level, discarding any tile that lies outside the (extended) view rect of the graphics view.
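A sketch of the discard step while panning, under a few assumptions: the view rect is given in pixel coordinates of the displayed level with exclusive right and bottom edges, the "extended" rect is the view rect grown by a fixed margin on every side, and the tile list is in row-major order. Tile, Rect and discardHiddenTiles are hypothetical names.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct Tile { };  // stand-in for decoded tile data (hypothetical)

struct Rect {
    int left, top, right, bottom;  // right and bottom are exclusive
};

// Free every loaded tile that does not intersect the view rect extended
// by margin pixels on each side. Returns the number of tiles freed.
std::size_t discardHiddenTiles(std::vector<std::unique_ptr<Tile> >& tiles,
                               int tilesX, int tileSize,
                               const Rect& view, int margin)
{
    std::size_t discarded = 0;
    if (tilesX <= 0)
        return discarded;
    const Rect ext = { view.left - margin, view.top - margin,
                       view.right + margin, view.bottom + margin };
    for (std::size_t i = 0; i < tiles.size(); ++i) {
        const int left = static_cast<int>(i % tilesX) * tileSize;
        const int top = static_cast<int>(i / tilesX) * tileSize;
        const bool visible = left < ext.right && left + tileSize > ext.left &&
                             top < ext.bottom && top + tileSize > ext.top;
        if (!visible && tiles[i]) {
            tiles[i].reset();
            ++discarded;
        }
    }
    return discarded;
}
```

The margin keeps tiles just outside the view alive, so small pans do not trigger an immediate reload.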
All this will have to be specified in greater detail in a follow-up post.