gifcap - A library for easy, fast, creation of GIF files


This is a small C language animated GIF writer library.

It is available at this bitbucket repository, which includes a demo.

It is based on Jon Olick's excellent GIF writer code (archive).
I have improved upon it by including the following features:
The "gif" in gifcap is pronounced as ɡ ɪ f , GHIF.
As a caveat, I have also left the code in a very poor, hacky state, and couldn't be arsed to clean it.
The account below I believe is an interesting exploration of some computer science problems.

History

In december 2018 while working on collision code for my 3D FPS engine I decided to add GIF output as a debugging aid.

I added both demo recording and gif output, and managed to fix it pic.twitter.com/cfgu9Ud6Ai

— REGULAR KDRNIC now with FRUSTRATION (@kdrnic) December 17, 2018

The GIFs it would generate were of enormous file size, and it would leave the application running very slowly even for relatively small, low FPS gifs.

Improving filesize

Improving filesize was simply a matter of adding support for what one can call delta encoding, or which some GIF software calls "greenscreen", that is, to only store the pixels changed from one frame to the next. The image below, from GifCam's GIF editor (archive) shows in green unchanged areas from one frame to the next, which need not be stored, saving disk space.


JO's lib lacked support for such frames, likely due to simplicity concerns (he boasts the 398 lines of code). The code changes to achieve this were quite simple:
//if using all colours, on the first frame, or new palette, can't delta-frame #define NO_GREENSCREEN (gif->numColors == 256 || gif->frame == 0 || genPalette) unsigned char *indexedPixels = gif->buf[0], *oldPixels = gif->buf[1]; if(NO_GREENSCREEN && oldPixels) memset(oldPixels, 0, size); // Write new GIF frame to indexedPixels // [...] //calc delta-frame for(i = 0; i < size * !NO_GREENSCREEN; i++){ if(indexedPixels[i] == oldPixels[i]) indexedPixels[i] = 0; else oldPixels[i] = indexedPixels[i]; } // Graphic Control Extension //last byte here tells whether to use a transparent colour and delta frame #define BLOCK_START NO_GREENSCREEN ? "\x21\xf9\x04\x00" : "\x21\xf9\x04\x05" fwrite(BLOCK_START, 4, 1, gif->fp);
//if using all colours, on the first frame, or new palette, can't delta-frame #define NO_GREENSCREEN (gif->numColors == 256 || gif->frame == 0 || genPalette) unsigned char *indexedPixels = gif->buf[0], *oldPixels = gif->buf[1]; if(NO_GREENSCREEN && oldPixels) memset(oldPixels, 0, size); // Write new GIF frame to indexedPixels // [...] //calc delta-frame for(i = 0; i < size * !NO_GREENSCREEN; i++){ if(indexedPixels[i] == oldPixels[i]) indexedPixels[i] = 0; else oldPixels[i] = indexedPixels[i]; } // Graphic Control Extension //last byte here tells whether to use a transparent colour and delta frame #define BLOCK_START NO_GREENSCREEN ? "\x21\xf9\x04\x00" : "\x21\xf9\x04\x05" fwrite(BLOCK_START, 4, 1, gif->fp);

And the benefits in file size were drastic. Note it is not necessary to perform run length encoding or other such methods on the pixels that are marked as unchanged: the LZW compression algorithm takes care of that.

Optimising encoding

GIF encoding, specially at "higher resolutions" such as a mind-breaking 320x240, was going quite slow, single digit frame-per-second values. Some very basic profiling showed that the big hotspot in GIF encoding was the "quantization" step, that is, the code to turn a 24-bit R8G8B8 colour value into an 8-bit palette index. In JO's original library this hotspot is a straightforward linear search:
//naive search int rgb[3] = { ditheredPixels[k+0], ditheredPixels[k+1], ditheredPixels[k+2] }; for(i = 0; i < gif->numColors && bestd; ++i){ int bb = palette[i*3+0]-rgb[0]; int gg = palette[i*3+1]-rgb[1]; int rr = palette[i*3+2]-rgb[2]; int d = bb*bb + gg*gg + rr*rr; if(d < bestd) { bestd = d; best = i; } }
//naive search int rgb[3] = { ditheredPixels[k+0], ditheredPixels[k+1], ditheredPixels[k+2] }; for(i = 0; i < gif->numColors && bestd; ++i){ int bb = palette[i*3+0]-rgb[0]; int gg = palette[i*3+1]-rgb[1]; int rr = palette[i*3+2]-rgb[2]; int d = bb*bb + gg*gg + rr*rr; if(d < bestd) { bestd = d; best = i; } }

The computer-science savvy friend will reframe this problem, every colour is a point in 3D space, the quantization step is a nearest-neighbour search, and the palette needs to be turned into some sort of tree data structure. The answer is a kd-tree. To quote Wikipedia:
In computer science, a k-d tree (short for k-dimensional tree) is a space-partitioning data structure for organizing points in a k-dimensional space. k-d trees are a useful data structure for several applications, such as searches involving a multidimensional search key (e.g. range searches and nearest neighbor searches). k-d trees are a special case of binary space partitioning trees.
The code for a kd-tree implementation I based on Rosetta Code's example implementation (archive), shamelessly. The quantization is now done in two steps:
//Build kdtree from palette struct kd_node_t *palkdn = gif->palkdn; struct kd_node_t *palkdt = gif->palkdt, kdcol; const struct kd_node_t *kdbest; if(gif->frame == 0 || genPalette){ for(i = (gif->numColors != 256); i < gif->numColors + (gif->numColors != 256); i++){ palkdn[i].x[0] = palette[i*3+0]; palkdn[i].x[1] = palette[i*3+1]; palkdn[i].x[2] = palette[i*3+2]; palkdn[i].idx = i; } palkdt = gif->palkdt = kdtree_make_tree(palkdn + (gif->numColors != 256), gif->numColors, 0); } // [...] //Find best fit colour with kd-tree kdcol.x[0] = ditheredPixels[k+0]; kdcol.x[1] = ditheredPixels[k+1]; kdcol.x[2] = ditheredPixels[k+2]; kdtree_nearest(palkdt, &kdcol, 0, &kdbest, &bestd); best = kdbest->idx;
//Build kdtree from palette struct kd_node_t *palkdn = gif->palkdn; struct kd_node_t *palkdt = gif->palkdt, kdcol; const struct kd_node_t *kdbest; if(gif->frame == 0 || genPalette){ for(i = (gif->numColors != 256); i < gif->numColors + (gif->numColors != 256); i++){ palkdn[i].x[0] = palette[i*3+0]; palkdn[i].x[1] = palette[i*3+1]; palkdn[i].x[2] = palette[i*3+2]; palkdn[i].idx = i; } palkdt = gif->palkdt = kdtree_make_tree(palkdn + (gif->numColors != 256), gif->numColors, 0); } // [...] //Find best fit colour with kd-tree kdcol.x[0] = ditheredPixels[k+0]; kdcol.x[1] = ditheredPixels[k+1]; kdcol.x[2] = ditheredPixels[k+2]; kdtree_nearest(palkdt, &kdcol, 0, &kdbest, &bestd); best = kdbest->idx;

The performance benefits were drastic.

Multithreading

Still, each frame necessarily takes a decent amount of time to encode. And disk I/O, which can block, was another factor
The library could already do over 12 FPS for decent resolutions, but I intended this for a realtime application, such as the game engine. Those have to run at 60 FPS, but they also have to run smoothly - a frame can't take much longer than the other. And in the naive way to call the encoder, blocking the renderer for GIF output at every 5th frame, the performance was thus degraded.

I thus applied the PTHREADS library to straightforwardly allow encoding a frame while the main application continued to run:
static void *gif_func(void *arg) { // [...] do{ pthread_mutex_lock(&gc->thr_lock); //Wait for a frame to encode while(!gc->thr_has_frame){ pthread_cond_wait(&gc->thr_cond, &gc->thr_lock); } // [...] // Encode frame jo_gif_frame(&gc->jo, c, (gc->delta + gc->delta_error) / 100, gc->gen_pal); // [...] gc->thr_has_frame = 0; pthread_mutex_unlock(&gc->thr_lock); } while(1); return 0; } int gifcap_start(gifcap_t *cap, const char *filename, gifcap_param_t params) { // [...] cap->thr_has_frame = 0; cap->thr_cond = PTHREAD_COND_INITIALIZER; cap->thr_lock = PTHREAD_MUTEX_INITIALIZER; cap->thr_arg.gc = cap; if(pthread_create(&cap->thr, 0, gif_func, &cap->thr_arg)){ // [...] return 1; } // [...] return 0; } void gifcap_frame(gifcap_t *cap, unsigned char *buf) { pthread_mutex_lock(&cap->thr_lock); //Copy frame data to another buffer memcpy(cap->thr_arg.frame_data, buf, cap->jo.width * cap->jo.height * 4); //Signal that we have a frame to encode cap->thr_has_frame = 1; pthread_cond_signal(&cap->thr_cond); pthread_mutex_unlock(&cap->thr_lock); } void gifcap_end(gifcap_t *cap) { pthread_mutex_lock(&cap->thr_lock); pthread_cancel(cap->thr); // [...] }
static void *gif_func(void *arg) { // [...] do{ pthread_mutex_lock(&gc->thr_lock); //Wait for a frame to encode while(!gc->thr_has_frame){ pthread_cond_wait(&gc->thr_cond, &gc->thr_lock); } // [...] // Encode frame jo_gif_frame(&gc->jo, c, (gc->delta + gc->delta_error) / 100, gc->gen_pal); // [...] gc->thr_has_frame = 0; pthread_mutex_unlock(&gc->thr_lock); } while(1); return 0; } int gifcap_start(gifcap_t *cap, const char *filename, gifcap_param_t params) { // [...] cap->thr_has_frame = 0; cap->thr_cond = PTHREAD_COND_INITIALIZER; cap->thr_lock = PTHREAD_MUTEX_INITIALIZER; cap->thr_arg.gc = cap; if(pthread_create(&cap->thr, 0, gif_func, &cap->thr_arg)){ // [...] return 1; } // [...] return 0; } void gifcap_frame(gifcap_t *cap, unsigned char *buf) { pthread_mutex_lock(&cap->thr_lock); //Copy frame data to another buffer memcpy(cap->thr_arg.frame_data, buf, cap->jo.width * cap->jo.height * 4); //Signal that we have a frame to encode cap->thr_has_frame = 1; pthread_cond_signal(&cap->thr_cond); pthread_mutex_unlock(&cap->thr_lock); } void gifcap_end(gifcap_t *cap) { pthread_mutex_lock(&cap->thr_lock); pthread_cancel(cap->thr); // [...] }

Of course the main application will still block if the GIF encoder cannot keep up - but the speed of the main application is thus decoupled from the speed of the GIF encoder.

Mandelbrot demo

I was thinking of what to use as a GIF encoding demo. I originaly thought of using some implementation of famous PSX Doom fire effect as explained by Fabien Sanglard (archive), but as Jon Olick's original page on his library has an animation based on a Julia set I opted to do one based on the similar Mandelbrot set instead.

The code is again "borrowed" from Rosetta Code (archive).

It uses Allegro 4, as most of my graphical applications.
The nice colouring algorithm is from this StackExchange post.

Post scriptum

Many thanks for Jon Olick for the original library, without which this wouldn't exist.
Thanks for Rosetta Code, as a convenient code repository.
If you use this, specially if you improve upon it, please give me a shout at Twitter or the IRC.