[cogl/wip/public-atlas-apis: 3/13] stb_image: upgrade to version 1.33



commit 6a86d11fa5bd63815cb1288bf986b809b6be9df6
Author: Robert Bragg <robert linux intel com>
Date:   Sun Feb 9 03:26:04 2014 +0000

    stb_image: upgrade to version 1.33
    
    This upgrades stb_image to 1.33, which notably fixes the inconsistent
    use of #if and #ifdef to check the STBI_SIMD define.
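
The difference between the two preprocessor checks named in the commit message can be shown with a minimal sketch. DEMO_SIMD is a hypothetical stand-in for STBI_SIMD, and both helper functions are invented for illustration; neither is part of stb_image. The sketch assumes the flag is defined with the value 0, which is the case where the two checks disagree.

```c
/* Hypothetical flag standing in for STBI_SIMD: defined, but with value 0. */
#define DEMO_SIMD 0

/* #ifdef only asks whether the name is defined at all. */
static int enabled_by_ifdef(void)
{
#ifdef DEMO_SIMD
   return 1;
#else
   return 0;
#endif
}

/* #if asks whether the name expands to a nonzero constant expression. */
static int enabled_by_if(void)
{
#if DEMO_SIMD
   return 1;
#else
   return 0;
#endif
}
```

With this definition the #ifdef branch is taken and the #if branch is not, so code guarded by one form can reference declarations that the other form compiled out. A flag defined with no value at all (a bare `#define DEMO_SIMD`) would additionally make `#if DEMO_SIMD` a compile error.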

 cogl/stb_image.c | 3367 +++++++++++++++++++++++++++++++++---------------------
 1 files changed, 2076 insertions(+), 1291 deletions(-)
---
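The I/O callback interface introduced in this version (read, skip, eof) can be sketched as follows. This is only an illustration under stated assumptions: the stbi_io_callbacks struct is copied from the declaration in the patch so the sketch compiles without stb_image.c, while mem_source, mem_read, mem_skip, mem_eof and mem_callbacks are hypothetical names that serve bytes from an in-memory blob.

```c
#include <stddef.h>
#include <string.h>

/* Copy of the stbi_io_callbacks declaration added by this patch, repeated
   here only so the sketch is self-contained. */
typedef struct
{
   int  (*read)(void *user, char *data, int size);  /* returns bytes actually read */
   void (*skip)(void *user, unsigned n);            /* skip the next 'n' bytes */
   int  (*eof) (void *user);                        /* nonzero at end of data */
} stbi_io_callbacks;

/* Hypothetical user-data type: a read cursor over an in-memory blob. */
typedef struct
{
   const unsigned char *data;
   size_t size;
   size_t pos;
} mem_source;

/* Copy up to 'size' bytes into 'data' and advance the cursor. */
static int mem_read(void *user, char *data, int size)
{
   mem_source *src = (mem_source *) user;
   size_t left = src->size - src->pos;
   size_t n = (size < 0) ? 0 : (size_t) size;
   if (n > left) n = left;
   memcpy(data, src->data + src->pos, n);
   src->pos += n;
   return (int) n;
}

/* Advance the cursor by 'n' bytes, clamped to the end of the blob. */
static void mem_skip(void *user, unsigned n)
{
   mem_source *src = (mem_source *) user;
   size_t left = src->size - src->pos;
   src->pos += (n > left) ? left : n;
}

/* Report nonzero once every byte has been consumed. */
static int mem_eof(void *user)
{
   mem_source *src = (mem_source *) user;
   return src->pos >= src->size;
}

static const stbi_io_callbacks mem_callbacks = { mem_read, mem_skip, mem_eof };
```

With stb_image.c compiled in, a table like this would be passed together with a pointer to the user data, following the prototype in the patch: `stbi_load_from_callbacks(&mem_callbacks, &src, &x, &y, &n, 0)`.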
diff --git a/cogl/stb_image.c b/cogl/stb_image.c
index 12315d5..04138da 100644
--- a/cogl/stb_image.c
+++ b/cogl/stb_image.c
@@ -1,90 +1,93 @@
-/* stbi-1.18 - public domain JPEG/PNG reader - http://nothings.org/stb_image.c
-                      when you control the images you're loading
+/* stbi-1.33 - public domain JPEG/PNG reader - http://nothings.org/stb_image.c
+   when you control the images you're loading
+                                     no warranty implied; use at your own risk
 
    QUICK NOTES:
       Primarily of interest to game developers and other people who can
           avoid problematic images and only need the trivial interface
 
-      JPEG baseline (no JPEG progressive, no oddball channel decimations)
-      PNG 8-bit only
-      BMP non-1bpp, non-RLE
+      JPEG baseline (no JPEG progressive)
+      PNG 8-bit-per-channel only
+
       TGA (not sure what subset, if a subset)
+      BMP non-1bpp, non-RLE
       PSD (composited view only, no extra channels)
+
+      GIF (*comp always reports as 4-channel)
       HDR (radiance rgbE format)
-      writes BMP,TGA (define STBI_NO_WRITE to remove code)
-      decoded from memory or through stdio FILE (define STBI_NO_STDIO to remove code)
-      supports installable dequantizing-IDCT, YCbCr-to-RGB conversion (define STBI_SIMD)
+      PIC (Softimage PIC)
+
+      - decode from memory or through FILE (define STBI_NO_STDIO to remove code)
+      - decode from arbitrary I/O callbacks
+      - overridable dequantizing-IDCT, YCbCr-to-RGB conversion (define STBI_SIMD)
+
+   Latest revisions:
+      1.33 (2011-07-14) minor fixes suggested by Dave Moore
+      1.32 (2011-07-13) info support for all filetypes (SpartanJ)
+      1.31 (2011-06-19) a few more leak fixes, bug in PNG handling (SpartanJ)
+      1.30 (2011-06-11) added ability to load files via io callbacks (Ben Wenger)
+      1.29 (2010-08-16) various warning fixes from Aurelien Pocheville 
+      1.28 (2010-08-01) fix bug in GIF palette transparency (SpartanJ)
+      1.27 (2010-08-01) cast-to-uint8 to fix warnings (Laurent Gomila)
+                        allow trailing 0s at end of image data (Laurent Gomila)
+      1.26 (2010-07-24) fix bug in file buffering for PNG reported by SpartanJ
+
+   See end of file for full revision history.
 
    TODO:
-      stbi_info_*
+      stbi_info support for BMP,PSD,HDR,PIC
+
+
+ ============================    Contributors    =========================
+              
+ Image formats                                Optimizations & bugfixes
+    Sean Barrett (jpeg, png, bmp)                Fabian "ryg" Giesen
+    Nicolas Schulz (hdr, psd)                                                 
+    Jonathan Dummer (tga)                     Bug fixes & warning fixes           
+    Jean-Marc Lienher (gif)                      Marc LeBlanc               
+    Tom Seddon (pic)                             Christpher Lloyd           
+    Thatcher Ulrich (psd)                        Dave Moore                 
+                                                 Won Chun                   
+                                                 the Horde3D community      
+ Extensions, features                            Janez Zemva                
+    Jetro Lauha (stbi_info)                      Jonathan Blow              
+    James "moose2000" Brown (iPhone PNG)         Laurent Gomila                             
+    Ben "Disch" Wenger (io callbacks)            Aruelien Pocheville
+    Martin "SpartanJ" Golini                     Ryamond Barbiero
+                                                 David Woo
+                                                 
+
+ If your name should be here but isn't, let Sean know.
 
-   history:
-      1.18   fix a threading bug (local mutable static)
-      1.17   support interlaced PNG
-      1.16   major bugfix - convert_format converted one too many pixels
-      1.15   initialize some fields for thread safety
-      1.14   fix threadsafe conversion bug; header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
-      1.13   threadsafe
-      1.12   const qualifiers in the API
-      1.11   Support installable IDCT, colorspace conversion routines
-      1.10   Fixes for 64-bit (don't use "unsigned long")
-             optimized upsampling by Fabian "ryg" Giesen
-      1.09   Fix format-conversion for PSD code (bad global variables!)
-      1.08   Thatcher Ulrich's PSD code integrated by Nicolas Schulz
-      1.07   attempt to fix C++ warning/errors again
-      1.06   attempt to fix C++ warning/errors again
-      1.05   fix TGA loading to return correct *comp and use good luminance calc
-      1.04   default float alpha is 1, not 255; use 'void *' for stbi_image_free
-      1.03   bugfixes to STBI_NO_STDIO, STBI_NO_HDR
-      1.02   support for (subset of) HDR files, float interface for preferred access to them
-      1.01   fix bug: possible bug in handling right-side up bmps... not sure
-             fix bug: the stbi_bmp_load() and stbi_tga_load() functions didn't work at all
-      1.00   interface to zlib that skips zlib header
-      0.99   correct handling of alpha in palette
-      0.98   TGA loader by lonesock; dynamically add loaders (untested)
-      0.97   jpeg errors on too large a file; also catch another malloc failure
-      0.96   fix detection of invalid v value - particleman mollyrocket forum
-      0.95   during header scan, seek to markers in case of padding
-      0.94   STBI_NO_STDIO to disable stdio usage; rename all #defines the same
-      0.93   handle jpegtran output; verbose errors
-      0.92   read 4,8,16,24,32-bit BMP files of several formats
-      0.91   output 24-bit Windows 3.0 BMP files
-      0.90   fix a few more warnings; bump version number to approach 1.0
-      0.61   bugfixes due to Marc LeBlanc, Christopher Lloyd
-      0.60   fix compiling as c++
-      0.59   fix warnings: merge Dave Moore's -Wall fixes
-      0.58   fix bug: zlib uncompressed mode len/nlen was wrong endian
-      0.57   fix bug: jpg last huffman symbol before marker was >9 bits but less
-                      than 16 available
-      0.56   fix bug: zlib uncompressed mode len vs. nlen
-      0.55   fix bug: restart_interval not initialized to 0
-      0.54   allow NULL for 'int *comp'
-      0.53   fix bug in png 3->4; speedup png decoding
-      0.52   png handles req_comp=3,4 directly; minor cleanup; jpeg comments
-      0.51   obey req_comp requests, 1-component jpegs return as 1-component,
-             on 'test' only check type, not whether we support this variant
 */
 
-#include <glib.h>
+#include <config.h>
+
+#include <stdint.h>
 
 #ifndef STBI_INCLUDE_STB_IMAGE_H
 #define STBI_INCLUDE_STB_IMAGE_H
 
+// To get a header file for this, either cut and paste the header,
+// or create stb_image.h, #define STBI_HEADER_FILE_ONLY, and
+// then include stb_image.c from it.
+
 ////   begin header file  ////////////////////////////////////////////////////
 //
 // Limitations:
-//    - no progressive/interlaced support (jpeg, png)
-//    - 8-bit samples only (jpeg, png)
-//    - not threadsafe
-//    - channel subsampling of at most 2 in each dimension (jpeg)
+//    - no jpeg progressive support
+//    - non-HDR formats support 8-bit samples only (jpeg, png)
 //    - no delayed line count (jpeg) -- IJG doesn't support either
+//    - no 1-bit BMP
+//    - GIF always returns *comp=4
 //
 // Basic usage (see HDR discussion below):
 //    int x,y,n;
 //    unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
-//    // ... process data if not NULL ...
+//    // ... process data if not NULL ... 
 //    // ... x = width, y = height, n = # 8-bit components per pixel ...
 //    // ... replace '0' with '1'..'4' to force that many components per pixel
+//    // ... but 'n' will always be the number that it would have been if you said 0
 //    stbi_image_free(data)
 //
 // Standard parameters:
@@ -119,8 +122,22 @@
 // compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly
 // more user-friendly ones.
 //
-// Paletted PNG and BMP images are automatically depalettized.
+// Paletted PNG, BMP, GIF, and PIC images are automatically depalettized.
+//
+// ===========================================================================
+//
+// iPhone PNG support:
+//
+// By default we convert iphone-formatted PNGs back to RGB; nominally they
+// would silently load as BGR, except the existing code should have just
+// failed on such iPhone PNGs. But you can disable this conversion by
+// by calling stbi_convert_iphone_png_to_rgb(0), in which case
+// you will always just get the native iphone "format" through.
 //
+// Call stbi_set_unpremultiply_on_load(1) as well to force a divide per
+// pixel to remove any premultiplied alpha *only* if the image file explicitly
+// says there's premultiplied data (currently only happens in iPhone images,
+// and only if iPhone convert-to-rgb processing is on).
 //
 // ===========================================================================
 //
@@ -143,7 +160,7 @@
 // (linear) floats to preserve the full dynamic range:
 //
 //    float *data = stbi_loadf(filename, &x, &y, &n, 0);
-//
+// 
 // If you load LDR images through this interface, those images will
 // be promoted to floating point values, run through the inverse of
 // constants corresponding to the above:
@@ -157,8 +174,26 @@
 // not), using:
 //
 //     stbi_is_hdr(char *filename);
+//
+// ===========================================================================
+//
+// I/O callbacks
+//
+// I/O callbacks allow you to read from arbitrary sources, like packaged
+// files or some other source. Data read from callbacks are processed
+// through a small internal buffer (currently 128 bytes) to try to reduce
+// overhead. 
+//
+// The three functions you must define are "read" (reads some bytes of data),
+// "skip" (skips some bytes of data), "eof" (reports if the stream is at the end).
+
 
 #ifndef STBI_NO_STDIO
+
+#if defined(_MSC_VER) && _MSC_VER >= 0x1400
+#define _CRT_SECURE_NO_WARNINGS // suppress bogus warnings about fopen()
+#endif
+
 #include <stdio.h>
 #endif
 
@@ -171,7 +206,7 @@ enum
    STBI_grey       = 1,
    STBI_grey_alpha = 2,
    STBI_rgb        = 3,
-   STBI_rgb_alpha  = 4,
+   STBI_rgb_alpha  = 4
 };
 
 typedef unsigned char stbi_uc;
@@ -180,160 +215,106 @@ typedef unsigned char stbi_uc;
 extern "C" {
 #endif
 
-// WRITING API
-
-#if !defined(STBI_NO_WRITE) && !defined(STBI_NO_STDIO)
-// write a BMP/TGA file given tightly packed 'comp' channels (no padding, nor bmp-stride-padding)
-// (you must include the appropriate extension in the filename).
-// returns TRUE on success, FALSE if couldn't open file, error writing file
-extern int      stbi_write_bmp       (char const *filename,     int x, int y, int comp, void *data);
-extern int      stbi_write_tga       (char const *filename,     int x, int y, int comp, void *data);
-#endif
-
+//////////////////////////////////////////////////////////////////////////////
+//
 // PRIMARY API - works on images of any type
+//
 
+//
 // load image by filename, open file, or memory buffer
+//
+
+extern stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
+
 #ifndef STBI_NO_STDIO
 extern stbi_uc *stbi_load            (char const *filename,     int *x, int *y, int *comp, int req_comp);
 extern stbi_uc *stbi_load_from_file  (FILE *f,                  int *x, int *y, int *comp, int req_comp);
-extern int      stbi_info_from_file  (FILE *f,                  int *x, int *y, int *comp);
-#endif
-extern stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
 // for stbi_load_from_file, file pointer is left pointing immediately after image
+#endif
+
+typedef struct
+{
+   int      (*read)  (void *user,char *data,int size);   // fill 'data' with 'size' bytes.  return number of bytes actually read
+   void     (*skip)  (void *user,unsigned n);            // skip the next 'n' bytes
+   int      (*eof)   (void *user);                       // returns nonzero if we are at end of file/data
+} stbi_io_callbacks;
+
+extern stbi_uc *stbi_load_from_callbacks  (stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp);
 
 #ifndef STBI_NO_HDR
-#ifndef STBI_NO_STDIO
-extern float *stbi_loadf            (char const *filename,     int *x, int *y, int *comp, int req_comp);
-extern float *stbi_loadf_from_file  (FILE *f,                  int *x, int *y, int *comp, int req_comp);
-#endif
-extern float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
+   extern float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
 
-extern void   stbi_hdr_to_ldr_gamma(float gamma);
-extern void   stbi_hdr_to_ldr_scale(float scale);
+   #ifndef STBI_NO_STDIO
+   extern float *stbi_loadf            (char const *filename,   int *x, int *y, int *comp, int req_comp);
+   extern float *stbi_loadf_from_file  (FILE *f,                int *x, int *y, int *comp, int req_comp);
+   #endif
+   
+   extern float *stbi_loadf_from_callbacks  (stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp);
 
-extern void   stbi_ldr_to_hdr_gamma(float gamma);
-extern void   stbi_ldr_to_hdr_scale(float scale);
+   extern void   stbi_hdr_to_ldr_gamma(float gamma);
+   extern void   stbi_hdr_to_ldr_scale(float scale);
 
+   extern void   stbi_ldr_to_hdr_gamma(float gamma);
+   extern void   stbi_ldr_to_hdr_scale(float scale);
 #endif // STBI_NO_HDR
 
+// stbi_is_hdr is always defined
+extern int    stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user);
+extern int    stbi_is_hdr_from_memory(stbi_uc const *buffer, int len);
+#ifndef STBI_NO_STDIO
+extern int      stbi_is_hdr          (char const *filename);
+extern int      stbi_is_hdr_from_file(FILE *f);
+#endif // STBI_NO_STDIO
+
+
 // get a VERY brief reason for failure
 // NOT THREADSAFE
-extern char    *stbi_failure_reason  (void);
+extern const char *stbi_failure_reason  (void); 
 
 // free the loaded image -- this is just free()
 extern void     stbi_image_free      (void *retval_from_stbi_load);
 
 // get image dimensions & components without fully decoding
 extern int      stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
-extern int      stbi_is_hdr_from_memory(stbi_uc const *buffer, int len);
-#ifndef STBI_NO_STDIO
-extern int      stbi_info            (char const *filename,     int *x, int *y, int *comp);
-extern int      stbi_is_hdr          (char const *filename);
-extern int      stbi_is_hdr_from_file(FILE *f);
-#endif
-
-// ZLIB client - used by PNG, available for other purposes
-
-extern char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen);
-extern char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen);
-extern int   stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
-
-extern char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen);
-extern int   stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
-
-// TYPE-SPECIFIC ACCESS
-
-// is it a jpeg?
-extern int      stbi_jpeg_test_memory     (stbi_uc const *buffer, int len);
-extern stbi_uc *stbi_jpeg_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
-extern int      stbi_jpeg_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
+extern int      stbi_info_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp);
 
 #ifndef STBI_NO_STDIO
-extern stbi_uc *stbi_jpeg_load            (char const *filename,     int *x, int *y, int *comp, int req_comp);
-extern int      stbi_jpeg_test_file       (FILE *f);
-extern stbi_uc *stbi_jpeg_load_from_file  (FILE *f,                  int *x, int *y, int *comp, int req_comp);
+extern int      stbi_info            (char const *filename,     int *x, int *y, int *comp);
+extern int      stbi_info_from_file  (FILE *f,                  int *x, int *y, int *comp);
 
-extern int      stbi_jpeg_info            (char const *filename,     int *x, int *y, int *comp);
-extern int      stbi_jpeg_info_from_file  (FILE *f,                  int *x, int *y, int *comp);
 #endif
 
-// is it a png?
-extern int      stbi_png_test_memory      (stbi_uc const *buffer, int len);
-extern stbi_uc *stbi_png_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
-extern int      stbi_png_info_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp);
 
-#ifndef STBI_NO_STDIO
-extern stbi_uc *stbi_png_load             (char const *filename,     int *x, int *y, int *comp, int req_comp);
-extern int      stbi_png_info             (char const *filename,     int *x, int *y, int *comp);
-extern int      stbi_png_test_file        (FILE *f);
-extern stbi_uc *stbi_png_load_from_file   (FILE *f,                  int *x, int *y, int *comp, int req_comp);
-extern int      stbi_png_info_from_file   (FILE *f,                  int *x, int *y, int *comp);
-#endif
 
-// is it a bmp?
-extern int      stbi_bmp_test_memory      (stbi_uc const *buffer, int len);
+// for image formats that explicitly notate that they have premultiplied alpha,
+// we just return the colors as stored in the file. set this flag to force
+// unpremultiplication. results are undefined if the unpremultiply overflow.
+extern void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply);
 
-extern stbi_uc *stbi_bmp_load             (char const *filename,     int *x, int *y, int *comp, int req_comp);
-extern stbi_uc *stbi_bmp_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
-#ifndef STBI_NO_STDIO
-extern int      stbi_bmp_test_file        (FILE *f);
-extern stbi_uc *stbi_bmp_load_from_file   (FILE *f,                  int *x, int *y, int *comp, int req_comp);
-#endif
+// indicate whether we should process iphone images back to canonical format,
+// or just pass them through "as-is"
+extern void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert);
 
-// is it a tga?
-extern int      stbi_tga_test_memory      (stbi_uc const *buffer, int len);
-
-extern stbi_uc *stbi_tga_load             (char const *filename,     int *x, int *y, int *comp, int req_comp);
-extern stbi_uc *stbi_tga_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
-#ifndef STBI_NO_STDIO
-extern int      stbi_tga_test_file        (FILE *f);
-extern stbi_uc *stbi_tga_load_from_file   (FILE *f,                  int *x, int *y, int *comp, int req_comp);
-#endif
 
-// is it a psd?
-extern int      stbi_psd_test_memory      (stbi_uc const *buffer, int len);
-
-extern stbi_uc *stbi_psd_load             (char const *filename,     int *x, int *y, int *comp, int req_comp);
-extern stbi_uc *stbi_psd_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
-#ifndef STBI_NO_STDIO
-extern int      stbi_psd_test_file        (FILE *f);
-extern stbi_uc *stbi_psd_load_from_file   (FILE *f,                  int *x, int *y, int *comp, int req_comp);
-#endif
-
-// is it an hdr?
-extern int      stbi_hdr_test_memory      (stbi_uc const *buffer, int len);
+// ZLIB client - used by PNG, available for other purposes
 
-extern float *  stbi_hdr_load             (char const *filename,     int *x, int *y, int *comp, int req_comp);
-extern float *  stbi_hdr_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
-#ifndef STBI_NO_STDIO
-extern int      stbi_hdr_test_file        (FILE *f);
-extern float *  stbi_hdr_load_from_file   (FILE *f,                  int *x, int *y, int *comp, int req_comp);
-#endif
+extern char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen);
+extern char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header);
+extern char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen);
+extern int   stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
 
-// define new loaders
-typedef struct
-{
-   int       (*test_memory)(stbi_uc const *buffer, int len);
-   stbi_uc * (*load_from_memory)(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
-   #ifndef STBI_NO_STDIO
-   int       (*test_file)(FILE *f);
-   stbi_uc * (*load_from_file)(FILE *f, int *x, int *y, int *comp, int req_comp);
-   #endif
-} stbi_loader;
+extern char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen);
+extern int   stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
 
-// register a loader by filling out the above structure (you must defined ALL functions)
-// returns 1 if added or already added, 0 if not added (too many loaders)
-// NOT THREADSAFE
-extern int stbi_register_loader(stbi_loader *loader);
 
 // define faster low-level operations (typically SIMD support)
-#if STBI_SIMD
-typedef void (*stbi_idct_8x8)(uint8_t *out, int out_stride, short data[64], unsigned short *dequantize);
+#ifdef STBI_SIMD
+typedef void (*stbi_idct_8x8)(stbi_uc *out, int out_stride, short data[64], unsigned short *dequantize);
 // compute an integer IDCT on "input"
 //     input[x] = data[x] * dequantize[x]
 //     write results to 'out': 64 samples, each run of 8 spaced by 'out_stride'
 //                             CLAMP results to 0..255
-typedef void (*stbi_YCbCr_to_RGB_run)(uint8_t *output, uint8_t const *y, uint8_t const *cb, uint8_t const *cr, int count, int step);
+typedef void (*stbi_YCbCr_to_RGB_run)(stbi_uc *output, stbi_uc const  *y, stbi_uc const *cb, stbi_uc const *cr, int count, int step);
 // compute a conversion from YCbCr to RGB
 //     'count' pixels
 //     write pixels to 'output'; each pixel is 'step' bytes (either 3 or 4; if 4, write '255' as 4th), order R,G,B
@@ -345,6 +326,7 @@ extern void stbi_install_idct(stbi_idct_8x8 func);
 extern void stbi_install_YCbCr_to_RGB(stbi_YCbCr_to_RGB_run func);
 #endif // STBI_SIMD
 
+
 #ifdef __cplusplus
 }
 #endif
@@ -358,7 +340,7 @@ extern void stbi_install_YCbCr_to_RGB(stbi_YCbCr_to_RGB_run func);
 
 #ifndef STBI_NO_HDR
 #include <math.h>  // ldexp
-#include <string.h> // strcmp
+#include <string.h> // strcmp, strtok
 #endif
 
 #ifndef STBI_NO_STDIO
@@ -370,40 +352,161 @@ extern void stbi_install_YCbCr_to_RGB(stbi_YCbCr_to_RGB_run func);
 #include <stdarg.h>
 
 #ifndef _MSC_VER
-  #ifdef __cplusplus
-  #define __forceinline inline
-  #else
-  #define __forceinline
-  #endif
+   #ifdef __cplusplus
+   #define stbi_inline inline
+   #else
+   #define stbi_inline
+   #endif
+#else
+   #define stbi_inline __forceinline
 #endif
 
 
-// should produce compiler error if size is wrong
-typedef unsigned char validate_uint32_t[sizeof(uint32_t)==4];
+// implementation:
+typedef unsigned int   uint;
 
 #if defined(STBI_NO_STDIO) && !defined(STBI_NO_WRITE)
 #define STBI_NO_WRITE
 #endif
 
-//////////////////////////////////////////////////////////////////////////////
-//
-// Generic API that works on all image types
+#define STBI_NOTUSED(v)  (void)sizeof(v)
+
+#ifdef _MSC_VER
+#define STBI_HAS_LROTL
+#endif
+
+#ifdef STBI_HAS_LROTL
+   #define stbi_lrot(x,y)  _lrotl(x,y)
+#else
+   #define stbi_lrot(x,y)  (((x) << (y)) | ((x) >> (32 - (y))))
+#endif
+
+///////////////////////////////////////////////
 //
+//  stbi struct and start_xxx functions
+
+// stbi structure is our basic context used by all images, so it
+// contains all the IO context, plus some basic image information
+typedef struct
+{
+   uint32_t img_x, img_y;
+   int img_n, img_out_n;
+   
+   stbi_io_callbacks io;
+   void *io_user_data;
+
+   int read_from_callbacks;
+   int buflen;
+   uint8_t buffer_start[128];
+
+   uint8_t *img_buffer, *img_buffer_end;
+   uint8_t *img_buffer_original;
+} stbi;
+
+
+static void refill_buffer(stbi *s);
+
+// initialize a memory-decode context
+static void start_mem(stbi *s, uint8_t const *buffer, int len)
+{
+   s->io.read = NULL;
+   s->read_from_callbacks = 0;
+   s->img_buffer = s->img_buffer_original = (uint8_t *) buffer;
+   s->img_buffer_end = (uint8_t *) buffer+len;
+}
+
+// initialize a callback-based context
+static void start_callbacks(stbi *s, stbi_io_callbacks *c, void *user)
+{
+   s->io = *c;
+   s->io_user_data = user;
+   s->buflen = sizeof(s->buffer_start);
+   s->read_from_callbacks = 1;
+   s->img_buffer_original = s->buffer_start;
+   refill_buffer(s);
+}
+
+#ifndef STBI_NO_STDIO
+
+static int stdio_read(void *user, char *data, int size)
+{
+   return (int) fread(data,1,size,(FILE*) user);
+}
+
+static void stdio_skip(void *user, unsigned n)
+{
+   fseek((FILE*) user, n, SEEK_CUR);
+}
+
+static int stdio_eof(void *user)
+{
+   return feof((FILE*) user);
+}
+
+static stbi_io_callbacks stbi_stdio_callbacks =
+{
+   stdio_read,
+   stdio_skip,
+   stdio_eof,
+};
+
+static void start_file(stbi *s, FILE *f)
+{
+   start_callbacks(s, &stbi_stdio_callbacks, (void *) f);
+}
+
+//static void stop_file(stbi *s) { }
+
+#endif // !STBI_NO_STDIO
+
+static void stbi_rewind(stbi *s)
+{
+   // conceptually rewind SHOULD rewind to the beginning of the stream,
+   // but we just rewind to the beginning of the initial buffer, because
+   // we only use it after doing 'test', which only ever looks at at most 92 bytes
+   s->img_buffer = s->img_buffer_original;
+}
+
+static int      stbi_jpeg_test(stbi *s);
+static stbi_uc *stbi_jpeg_load(stbi *s, int *x, int *y, int *comp, int req_comp);
+static int      stbi_jpeg_info(stbi *s, int *x, int *y, int *comp);
+static int      stbi_png_test(stbi *s);
+static stbi_uc *stbi_png_load(stbi *s, int *x, int *y, int *comp, int req_comp);
+static int      stbi_png_info(stbi *s, int *x, int *y, int *comp);
+static int      stbi_bmp_test(stbi *s);
+static stbi_uc *stbi_bmp_load(stbi *s, int *x, int *y, int *comp, int req_comp);
+static int      stbi_tga_test(stbi *s);
+static stbi_uc *stbi_tga_load(stbi *s, int *x, int *y, int *comp, int req_comp);
+static int      stbi_tga_info(stbi *s, int *x, int *y, int *comp);
+static int      stbi_psd_test(stbi *s);
+static stbi_uc *stbi_psd_load(stbi *s, int *x, int *y, int *comp, int req_comp);
+static int      stbi_hdr_test(stbi *s);
+static float   *stbi_hdr_load(stbi *s, int *x, int *y, int *comp, int req_comp);
+static int      stbi_pic_test(stbi *s);
+static stbi_uc *stbi_pic_load(stbi *s, int *x, int *y, int *comp, int req_comp);
+static int      stbi_gif_test(stbi *s);
+static stbi_uc *stbi_gif_load(stbi *s, int *x, int *y, int *comp, int req_comp);
+static int      stbi_gif_info(stbi *s, int *x, int *y, int *comp);
+
 
 // this is not threadsafe
-static char *failure_reason;
+static const char *failure_reason;
 
-char *stbi_failure_reason(void)
+const char *stbi_failure_reason(void)
 {
    return failure_reason;
 }
 
-static int e(char *str)
+static int e(const char *str)
 {
    failure_reason = str;
    return 0;
 }
 
+// e - error
+// epf - error returning pointer to float
+// epuc - error returning pointer to unsigned char
+
 #ifdef STBI_NO_FAILURE_STRINGS
    #define e(x,y)  0
 #elif defined(STBI_FAILURE_USERMSG)
@@ -420,33 +523,33 @@ void stbi_image_free(void *retval_from_stbi_load)
    free(retval_from_stbi_load);
 }
 
-#define MAX_LOADERS  32
-stbi_loader *loaders[MAX_LOADERS];
-static int max_loaders = 0;
-
-int stbi_register_loader(stbi_loader *loader)
-{
-   int i;
-   for (i=0; i < MAX_LOADERS; ++i) {
-      // already present?
-      if (loaders[i] == loader)
-         return 1;
-      // end of the list?
-      if (loaders[i] == NULL) {
-         loaders[i] = loader;
-         max_loaders = i+1;
-         return 1;
-      }
-   }
-   // no room for it
-   return 0;
-}
-
 #ifndef STBI_NO_HDR
 static float   *ldr_to_hdr(stbi_uc *data, int x, int y, int comp);
 static stbi_uc *hdr_to_ldr(float   *data, int x, int y, int comp);
 #endif
 
+static unsigned char *stbi_load_main(stbi *s, int *x, int *y, int *comp, int req_comp)
+{
+   if (stbi_jpeg_test(s)) return stbi_jpeg_load(s,x,y,comp,req_comp);
+   if (stbi_png_test(s))  return stbi_png_load(s,x,y,comp,req_comp);
+   if (stbi_bmp_test(s))  return stbi_bmp_load(s,x,y,comp,req_comp);
+   if (stbi_gif_test(s))  return stbi_gif_load(s,x,y,comp,req_comp);
+   if (stbi_psd_test(s))  return stbi_psd_load(s,x,y,comp,req_comp);
+   if (stbi_pic_test(s))  return stbi_pic_load(s,x,y,comp,req_comp);
+
+   #ifndef STBI_NO_HDR
+   if (stbi_hdr_test(s)) {
+      float *hdr = stbi_hdr_load(s, x,y,comp,req_comp);
+      return hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
+   }
+   #endif
+
+   // test tga last because it's a crappy test!
+   if (stbi_tga_test(s))
+      return stbi_tga_load(s,x,y,comp,req_comp);
+   return epuc("unknown image type", "Image not of any known type, or corrupt");
+}
+
 #ifndef STBI_NO_STDIO
 unsigned char *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp)
 {
@@ -460,58 +563,54 @@ unsigned char *stbi_load(char const *filename, int *x, int *y, int *comp, int re
 
 unsigned char *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
 {
-   int i;
-   if (stbi_jpeg_test_file(f))
-      return stbi_jpeg_load_from_file(f,x,y,comp,req_comp);
-   if (stbi_png_test_file(f))
-      return stbi_png_load_from_file(f,x,y,comp,req_comp);
-   if (stbi_bmp_test_file(f))
-      return stbi_bmp_load_from_file(f,x,y,comp,req_comp);
-   if (stbi_psd_test_file(f))
-      return stbi_psd_load_from_file(f,x,y,comp,req_comp);
-   #ifndef STBI_NO_HDR
-   if (stbi_hdr_test_file(f)) {
-      float *hdr = stbi_hdr_load_from_file(f, x,y,comp,req_comp);
-      return hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
-   }
-   #endif
-   for (i=0; i < max_loaders; ++i)
-      if (loaders[i]->test_file(f))
-         return loaders[i]->load_from_file(f,x,y,comp,req_comp);
-   // test tga last because it's a crappy test!
-   if (stbi_tga_test_file(f))
-      return stbi_tga_load_from_file(f,x,y,comp,req_comp);
-   return epuc("unknown image type", "Image not of any known type, or corrupt");
+   stbi s;
+   start_file(&s,f);
+   return stbi_load_main(&s,x,y,comp,req_comp);
 }
-#endif
+#endif //!STBI_NO_STDIO
 
 unsigned char *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
 {
-   int i;
-   if (stbi_jpeg_test_memory(buffer,len))
-      return stbi_jpeg_load_from_memory(buffer,len,x,y,comp,req_comp);
-   if (stbi_png_test_memory(buffer,len))
-      return stbi_png_load_from_memory(buffer,len,x,y,comp,req_comp);
-   if (stbi_bmp_test_memory(buffer,len))
-      return stbi_bmp_load_from_memory(buffer,len,x,y,comp,req_comp);
-   if (stbi_psd_test_memory(buffer,len))
-      return stbi_psd_load_from_memory(buffer,len,x,y,comp,req_comp);
+   stbi s;
+   start_mem(&s,buffer,len);
+   return stbi_load_main(&s,x,y,comp,req_comp);
+}
+
+unsigned char *stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
+{
+   stbi s;
+   start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
+   return stbi_load_main(&s,x,y,comp,req_comp);
+}
+
+#ifndef STBI_NO_HDR
+
+static float *stbi_loadf_main(stbi *s, int *x, int *y, int *comp, int req_comp)
+{
+   unsigned char *data;
    #ifndef STBI_NO_HDR
-   if (stbi_hdr_test_memory(buffer, len)) {
-      float *hdr = stbi_hdr_load_from_memory(buffer, len,x,y,comp,req_comp);
-      return hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
-   }
+   if (stbi_hdr_test(s))
+      return stbi_hdr_load(s,x,y,comp,req_comp);
    #endif
-   for (i=0; i < max_loaders; ++i)
-      if (loaders[i]->test_memory(buffer,len))
-         return loaders[i]->load_from_memory(buffer,len,x,y,comp,req_comp);
-   // test tga last because it's a crappy test!
-   if (stbi_tga_test_memory(buffer,len))
-      return stbi_tga_load_from_memory(buffer,len,x,y,comp,req_comp);
-   return epuc("unknown image type", "Image not of any known type, or corrupt");
+   data = stbi_load_main(s, x, y, comp, req_comp);
+   if (data)
+      return ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
+   return epf("unknown image type", "Image not of any known type, or corrupt");
 }
 
-#ifndef STBI_NO_HDR
+float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
+{
+   stbi s;
+   start_mem(&s,buffer,len);
+   return stbi_loadf_main(&s,x,y,comp,req_comp);
+}
+
+float *stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
+{
+   stbi s;
+   start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
+   return stbi_loadf_main(&s,x,y,comp,req_comp);
+}
 
 #ifndef STBI_NO_STDIO
 float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
@@ -526,31 +625,13 @@ float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
 
 float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
 {
-   unsigned char *data;
-   #ifndef STBI_NO_HDR
-   if (stbi_hdr_test_file(f))
-      return stbi_hdr_load_from_file(f,x,y,comp,req_comp);
-   #endif
-   data = stbi_load_from_file(f, x, y, comp, req_comp);
-   if (data)
-      return ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
-   return epf("unknown image type", "Image not of any known type, or corrupt");
+   stbi s;
+   start_file(&s,f);
+   return stbi_loadf_main(&s,x,y,comp,req_comp);
 }
-#endif
+#endif // !STBI_NO_STDIO
 
-float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
-{
-   stbi_uc *data;
-   #ifndef STBI_NO_HDR
-   if (stbi_hdr_test_memory(buffer, len))
-      return stbi_hdr_load_from_memory(buffer, len,x,y,comp,req_comp);
-   #endif
-   data = stbi_load_from_memory(buffer, len, x, y, comp, req_comp);
-   if (data)
-      return ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
-   return epf("unknown image type", "Image not of any known type, or corrupt");
-}
-#endif
+#endif // !STBI_NO_HDR
 
 // these is-hdr-or-not is defined independent of whether STBI_NO_HDR is
 // defined, for API simplicity; if STBI_NO_HDR is defined, it always
@@ -559,8 +640,12 @@ float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, in
 int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len)
 {
    #ifndef STBI_NO_HDR
-   return stbi_hdr_test_memory(buffer, len);
+   stbi s;
+   start_mem(&s,buffer,len);
+   return stbi_hdr_test(&s);
    #else
+   STBI_NOTUSED(buffer);
+   STBI_NOTUSED(len);
    return 0;
    #endif
 }
@@ -580,20 +665,25 @@ extern int      stbi_is_hdr          (char const *filename)
 extern int      stbi_is_hdr_from_file(FILE *f)
 {
    #ifndef STBI_NO_HDR
-   return stbi_hdr_test_file(f);
+   stbi s;
+   start_file(&s,f);
+   return stbi_hdr_test(&s);
    #else
    return 0;
    #endif
 }
+#endif // !STBI_NO_STDIO
 
-#endif
-
-// @TODO: get image dimensions & components without fully decoding
-#ifndef STBI_NO_STDIO
-extern int      stbi_info            (char const *filename,           int *x, int *y, int *comp);
-extern int      stbi_info_from_file  (FILE *f,                  int *x, int *y, int *comp);
-#endif
-extern int      stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
+extern int      stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user)
+{
+   #ifndef STBI_NO_HDR
+   stbi s;
+   start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
+   return stbi_hdr_test(&s);
+   #else
+   return 0;
+   #endif
+}
 
 #ifndef STBI_NO_HDR
 static float h2l_gamma_i=1.0f/2.2f, h2l_scale_i=1.0f;
@@ -616,71 +706,86 @@ enum
 {
    SCAN_load=0,
    SCAN_type,
-   SCAN_header,
+   SCAN_header
 };
 
-typedef struct
-{
-   uint32_t img_x, img_y;
-   int img_n, img_out_n;
-
-   #ifndef STBI_NO_STDIO
-   FILE  *img_file;
-   #endif
-   uint8_t *img_buffer, *img_buffer_end;
-} stbi;
-
-#ifndef STBI_NO_STDIO
-static void start_file(stbi *s, FILE *f)
+static void refill_buffer(stbi *s)
 {
-   s->img_file = f;
-}
-#endif
-
-static void start_mem(stbi *s, uint8_t const *buffer, int len)
-{
-#ifndef STBI_NO_STDIO
-   s->img_file = NULL;
-#endif
-   s->img_buffer = (uint8_t *) buffer;
-   s->img_buffer_end = (uint8_t *) buffer+len;
+   int n = (s->io.read)(s->io_user_data,(char*)s->buffer_start,s->buflen);
+   if (n == 0) {
+      // at end of file, treat same as if from memory
+      s->read_from_callbacks = 0;
+      s->img_buffer = s->img_buffer_end-1;
+      *s->img_buffer = 0;
+   } else {
+      s->img_buffer = s->buffer_start;
+      s->img_buffer_end = s->buffer_start + n;
+   }
 }
 
-__forceinline static int get8(stbi *s)
+stbi_inline static int get8(stbi *s)
 {
-#ifndef STBI_NO_STDIO
-   if (s->img_file) {
-      int c = fgetc(s->img_file);
-      return c == EOF ? 0 : c;
-   }
-#endif
    if (s->img_buffer < s->img_buffer_end)
       return *s->img_buffer++;
+   if (s->read_from_callbacks) {
+      refill_buffer(s);
+      return *s->img_buffer++;
+   }
    return 0;
 }
 
-__forceinline static int at_eof(stbi *s)
+stbi_inline static int at_eof(stbi *s)
 {
-#ifndef STBI_NO_STDIO
-   if (s->img_file)
-      return feof(s->img_file);
-#endif
-   return s->img_buffer >= s->img_buffer_end;
+   if (s->io.read) {
+      if (!(s->io.eof)(s->io_user_data)) return 0;
+      // if feof() is true, check if buffer = end
+      // special case: we've only got the special 0 character at the end
+      if (s->read_from_callbacks == 0) return 1;
+   }
+
+   return s->img_buffer >= s->img_buffer_end;   
 }
 
-__forceinline static uint8_t get8u(stbi *s)
+stbi_inline static uint8_t get8u(stbi *s)
 {
    return (uint8_t) get8(s);
 }
 
 static void skip(stbi *s, int n)
 {
-#ifndef STBI_NO_STDIO
-   if (s->img_file)
-      fseek(s->img_file, n, SEEK_CUR);
-   else
-#endif
+   if (s->io.read) {
+      int blen = s->img_buffer_end - s->img_buffer;
+      if (blen < n) {
+         s->img_buffer = s->img_buffer_end;
+         (s->io.skip)(s->io_user_data, n - blen);
+         return;
+      }
+   }
+   s->img_buffer += n;
+}
+
+static int getn(stbi *s, stbi_uc *buffer, int n)
+{
+   if (s->io.read) {
+      int blen = s->img_buffer_end - s->img_buffer;
+      if (blen < n) {
+         int res, count;
+
+         memcpy(buffer, s->img_buffer, blen);
+         
+         count = (s->io.read)(s->io_user_data, (char*) buffer + blen, n - blen);
+         res = (count == (n-blen));
+         s->img_buffer = s->img_buffer_end;
+         return res;
+      }
+   }
+
+   if (s->img_buffer+n <= s->img_buffer_end) {
+      memcpy(buffer, s->img_buffer, n);
       s->img_buffer += n;
+      return 1;
+   } else
+      return 0;
 }
 
 static int get16(stbi *s)
@@ -707,18 +812,6 @@ static uint32_t get32le(stbi *s)
    return z + (get16le(s) << 16);
 }
 
-static void getn(stbi *s, stbi_uc *buffer, int n)
-{
-#ifndef STBI_NO_STDIO
-   if (s->img_file) {
-      fread(buffer, 1, n, s->img_file);
-      return;
-   }
-#endif
-   memcpy(buffer, s->img_buffer, n);
-   s->img_buffer += n;
-}
-
 //////////////////////////////////////////////////////////////////////////////
 //
 //  generic converter from built-in img_n to req_comp
@@ -735,7 +828,7 @@ static uint8_t compute_y(int r, int g, int b)
    return (uint8_t) (((r*77) + (g*150) +  (29*b)) >> 8);
 }
 
-static unsigned char *convert_format(unsigned char *data, int img_n, int req_comp, guint x, guint y)
+static unsigned char *convert_format(unsigned char *data, int img_n, int req_comp, uint x, uint y)
 {
    int i,j;
    unsigned char *good;
@@ -757,7 +850,7 @@ static unsigned char *convert_format(unsigned char *data, int img_n, int req_com
       #define CASE(a,b)   case COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
       // convert source image with img_n components to one with req_comp components;
       // avoid switch per pixel, so use switch per scanline and massive macros
-      switch(COMBO(img_n, req_comp)) {
+      switch (COMBO(img_n, req_comp)) {
          CASE(1,2) dest[0]=src[0], dest[1]=255; break;
          CASE(1,3) dest[0]=dest[1]=dest[2]=src[0]; break;
          CASE(1,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=255; break;
@@ -810,13 +903,13 @@ static stbi_uc *hdr_to_ldr(float   *data, int x, int y, int comp)
          float z = (float) pow(data[i*comp+k]*h2l_scale_i, h2l_gamma_i) * 255 + 0.5f;
          if (z < 0) z = 0;
          if (z > 255) z = 255;
-         output[i*comp + k] = float2int(z);
+         output[i*comp + k] = (uint8_t) float2int(z);
       }
       if (k < comp) {
          float z = data[i*comp+k] * 255 + 0.5f;
          if (z < 0) z = 0;
          if (z > 255) z = 255;
-         output[i*comp + k] = float2int(z);
+         output[i*comp + k] = (uint8_t) float2int(z);
       }
    }
    free(data);
@@ -856,21 +949,21 @@ static stbi_uc *hdr_to_ldr(float   *data, int x, int y, int comp)
 
 typedef struct
 {
-   uint8_t fast[1 << FAST_BITS];
+   uint8_t  fast[1 << FAST_BITS];
    // weirdly, repacking this into AoS is a 10% speed loss, instead of a win
    uint16_t code[256];
-   uint8_t values[256];
-   uint8_t size[257];
+   uint8_t  values[256];
+   uint8_t  size[257];
    unsigned int maxcode[18];
-   int delta[17];   // old 'firstsymbol' - old 'firstcode'
+   int    delta[17];   // old 'firstsymbol' - old 'firstcode'
 } huffman;
 
 typedef struct
 {
-   #if STBI_SIMD
+   #ifdef STBI_SIMD
    unsigned short dequant2[4][64];
    #endif
-   stbi s;
+   stbi *s;
    huffman huff_dc[4];
    huffman huff_ac[4];
    uint8_t dequant[4][64];
@@ -948,16 +1041,16 @@ static int build_huffman(huffman *h, int *count)
 static void grow_buffer_unsafe(jpeg *j)
 {
    do {
-      int b = j->nomore ? 0 : get8(&j->s);
+      int b = j->nomore ? 0 : get8(j->s);
       if (b == 0xff) {
-         int c = get8(&j->s);
+         int c = get8(j->s);
          if (c != 0) {
             j->marker = (unsigned char) c;
             j->nomore = 1;
             return;
          }
       }
-      j->code_buffer = (j->code_buffer << 8) | b;
+      j->code_buffer |= b << (24 - j->code_bits);
       j->code_bits += 8;
    } while (j->code_bits <= 24);
 }
@@ -966,7 +1059,7 @@ static void grow_buffer_unsafe(jpeg *j)
 static uint32_t bmask[17]={0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535};
 
 // decode a jpeg huffman value from the bitstream
-__forceinline static int decode(jpeg *j, huffman *h)
+stbi_inline static int decode(jpeg *j, huffman *h)
 {
    unsigned int temp;
    int c,k;
@@ -975,12 +1068,14 @@ __forceinline static int decode(jpeg *j, huffman *h)
 
    // look at the top FAST_BITS and determine what symbol ID it is,
    // if the code is <= FAST_BITS
-   c = (j->code_buffer >> (j->code_bits - FAST_BITS)) & ((1 << FAST_BITS)-1);
+   c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
    k = h->fast[c];
    if (k < 255) {
-      if (h->size[k] > j->code_bits)
+      int s = h->size[k];
+      if (s > j->code_bits)
          return -1;
-      j->code_bits -= h->size[k];
+      j->code_buffer <<= s;
+      j->code_bits -= s;
       return h->values[k];
    }
 
@@ -990,10 +1085,7 @@ __forceinline static int decode(jpeg *j, huffman *h)
    // end; in other words, regardless of the number of bits, it
    // wants to be compared against something shifted to have 16;
    // that way we don't need to shift inside the loop.
-   if (j->code_bits < 16)
-      temp = (j->code_buffer << (16 - j->code_bits)) & 0xffff;
-   else
-      temp = (j->code_buffer >> (j->code_bits - 16)) & 0xffff;
+   temp = j->code_buffer >> 16;
    for (k=FAST_BITS+1 ; ; ++k)
       if (temp < h->maxcode[k])
          break;
@@ -1007,23 +1099,33 @@ __forceinline static int decode(jpeg *j, huffman *h)
       return -1;
 
    // convert the huffman code to the symbol id
-   c = ((j->code_buffer >> (j->code_bits - k)) & bmask[k]) + h->delta[k];
-   assert((((j->code_buffer) >> (j->code_bits - h->size[c])) & bmask[h->size[c]]) == h->code[c]);
+   c = ((j->code_buffer >> (32 - k)) & bmask[k]) + h->delta[k];
+   assert((((j->code_buffer) >> (32 - h->size[c])) & bmask[h->size[c]]) == h->code[c]);
 
    // convert the id to a symbol
    j->code_bits -= k;
+   j->code_buffer <<= k;
    return h->values[c];
 }
 
 // combined JPEG 'receive' and JPEG 'extend', since baseline
 // always extends everything it receives.
-__forceinline static int extend_receive(jpeg *j, int n)
+stbi_inline static int extend_receive(jpeg *j, int n)
 {
    unsigned int m = 1 << (n-1);
    unsigned int k;
    if (j->code_bits < n) grow_buffer_unsafe(j);
-   k = (j->code_buffer >> (j->code_bits - n)) & bmask[n];
+
+   #if 1
+   k = stbi_lrot(j->code_buffer, n);
+   j->code_buffer = k & ~bmask[n];
+   k &= bmask[n];
+   j->code_bits -= n;
+   #else
+   k = (j->code_buffer >> (32 - n)) & bmask[n];
    j->code_bits -= n;
+   j->code_buffer <<= n;
+   #endif
    // the following test is probably a random branch that won't
    // predict well. I tried to table accelerate it but failed.
    // maybe it's compiling as a conditional move?
@@ -1086,9 +1188,8 @@ static int decode_block(jpeg *j, short data[64], huffman *hdc, huffman *hac, int
 }
 
 // take a -128..127 value and clamp it and convert to 0..255
-__forceinline static uint8_t clamp(int x)
+stbi_inline static uint8_t clamp(int x)
 {
-   x += 128;
    // trick to use a single test to catch both cases
    if ((unsigned int) x > 255) {
       if (x < 0) return 0;
@@ -1138,65 +1239,18 @@ __forceinline static uint8_t clamp(int x)
    t1 += p2+p4;                                \
    t0 += p1+p3;
 
-#if !STBI_SIMD
-// .344 seconds on 3*anemones.jpg
-static void idct_block(uint8_t *out, int out_stride, short data[64], uint8_t *dequantize)
-{
-   int i,val[64],*v=val;
-   uint8_t *o,*dq = dequantize;
-   short *d = data;
-
-   // columns
-   for (i=0; i < 8; ++i,++d,++dq, ++v) {
-      // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing
-      if (d[ 8]==0 && d[16]==0 && d[24]==0 && d[32]==0
-           && d[40]==0 && d[48]==0 && d[56]==0) {
-         //    no shortcut                 0     seconds
-         //    (1|2|3|4|5|6|7)==0          0     seconds
-         //    all separate               -0.047 seconds
-         //    1 && 2|3 && 4|5 && 6|7:    -0.047 seconds
-         int dcterm = d[0] * dq[0] << 2;
-         v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm;
-      } else {
-         IDCT_1D(d[ 0]*dq[ 0],d[ 8]*dq[ 8],d[16]*dq[16],d[24]*dq[24],
-                 d[32]*dq[32],d[40]*dq[40],d[48]*dq[48],d[56]*dq[56])
-         // constants scaled things up by 1<<12; let's bring them back
-         // down, but keep 2 extra bits of precision
-         x0 += 512; x1 += 512; x2 += 512; x3 += 512;
-         v[ 0] = (x0+t3) >> 10;
-         v[56] = (x0-t3) >> 10;
-         v[ 8] = (x1+t2) >> 10;
-         v[48] = (x1-t2) >> 10;
-         v[16] = (x2+t1) >> 10;
-         v[40] = (x2-t1) >> 10;
-         v[24] = (x3+t0) >> 10;
-         v[32] = (x3-t0) >> 10;
-      }
-   }
-
-   for (i=0, v=val, o=out; i < 8; ++i,v+=8,o+=out_stride) {
-      // no fast case since the first 1D IDCT spread components out
-      IDCT_1D(v[0],v[1],v[2],v[3],v[4],v[5],v[6],v[7])
-      // constants scaled things up by 1<<12, plus we had 1<<2 from first
-      // loop, plus horizontal and vertical each scale by sqrt(8) so together
-      // we've got an extra 1<<3, so 1<<17 total we need to remove.
-      x0 += 65536; x1 += 65536; x2 += 65536; x3 += 65536;
-      o[0] = clamp((x0+t3) >> 17);
-      o[7] = clamp((x0-t3) >> 17);
-      o[1] = clamp((x1+t2) >> 17);
-      o[6] = clamp((x1-t2) >> 17);
-      o[2] = clamp((x2+t1) >> 17);
-      o[5] = clamp((x2-t1) >> 17);
-      o[3] = clamp((x3+t0) >> 17);
-      o[4] = clamp((x3-t0) >> 17);
-   }
-}
+#ifdef STBI_SIMD
+typedef unsigned short stbi_dequantize_t;
 #else
-static void idct_block(uint8_t *out, int out_stride, short data[64], unsigned short *dequantize)
+typedef uint8_t stbi_dequantize_t;
+#endif
+
+// .344 seconds on 3*anemones.jpg
+static void idct_block(uint8_t *out, int out_stride, short data[64], stbi_dequantize_t *dequantize)
 {
    int i,val[64],*v=val;
+   stbi_dequantize_t *dq = dequantize;
    uint8_t *o;
-   unsigned short *dq = dequantize;
    short *d = data;
 
    // columns
@@ -1233,7 +1287,15 @@ static void idct_block(uint8_t *out, int out_stride, short data[64], unsigned sh
       // constants scaled things up by 1<<12, plus we had 1<<2 from first
       // loop, plus horizontal and vertical each scale by sqrt(8) so together
       // we've got an extra 1<<3, so 1<<17 total we need to remove.
-      x0 += 65536; x1 += 65536; x2 += 65536; x3 += 65536;
+      // so we want to round that, which means adding 0.5 * 1<<17,
+      // aka 65536. Also, we'll end up with -128 to 127 that we want
+      // to encode as 0..255 by adding 128, so we'll add that before the shift
+      x0 += 65536 + (128<<17);
+      x1 += 65536 + (128<<17);
+      x2 += 65536 + (128<<17);
+      x3 += 65536 + (128<<17);
+      // tried computing the shifts into temps, or'ing the temps to see
+      // if any were out of range, but that was slower
       o[0] = clamp((x0+t3) >> 17);
       o[7] = clamp((x0-t3) >> 17);
       o[1] = clamp((x1+t2) >> 17);
@@ -1244,9 +1306,11 @@ static void idct_block(uint8_t *out, int out_stride, short data[64], unsigned sh
       o[4] = clamp((x3-t0) >> 17);
    }
 }
+
+#ifdef STBI_SIMD
 static stbi_idct_8x8 stbi_idct_installed = idct_block;
 
-extern void stbi_install_idct(stbi_idct_8x8 func)
+void stbi_install_idct(stbi_idct_8x8 func)
 {
    stbi_idct_installed = func;
 }
@@ -1260,10 +1324,10 @@ static uint8_t get_marker(jpeg *j)
 {
    uint8_t x;
    if (j->marker != MARKER_none) { x = j->marker; j->marker = MARKER_none; return x; }
-   x = get8u(&j->s);
+   x = get8u(j->s);
    if (x != 0xff) return MARKER_none;
    while (x == 0xff)
-      x = get8u(&j->s);
+      x = get8u(j->s);
    return x;
 }
 
@@ -1290,7 +1354,7 @@ static int parse_entropy_coded_data(jpeg *z)
    reset(z);
    if (z->scan_n == 1) {
       int i,j;
-      #if STBI_SIMD
+      #ifdef STBI_SIMD
       __declspec(align(16))
       #endif
       short data[64];
@@ -1304,7 +1368,7 @@ static int parse_entropy_coded_data(jpeg *z)
       for (j=0; j < h; ++j) {
          for (i=0; i < w; ++i) {
             if (!decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+z->img_comp[n].ha, n)) return 0;
-            #if STBI_SIMD
+            #ifdef STBI_SIMD
             stbi_idct_installed(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data, z->dequant2[z->img_comp[n].tq]);
             #else
             idct_block(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data, z->dequant[z->img_comp[n].tq]);
@@ -1334,7 +1398,7 @@ static int parse_entropy_coded_data(jpeg *z)
                      int x2 = (i*z->img_comp[n].h + x)*8;
                      int y2 = (j*z->img_comp[n].v + y)*8;
                      if (!decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+z->img_comp[n].ha, n)) return 0;
-                     #if STBI_SIMD
+                     #ifdef STBI_SIMD
                      stbi_idct_installed(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data, z->dequant2[z->img_comp[n].tq]);
                      #else
                      idct_block(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data, z->dequant[z->img_comp[n].tq]);
@@ -1368,21 +1432,21 @@ static int process_marker(jpeg *z, int m)
          return e("progressive jpeg","JPEG format not supported (progressive)");
 
       case 0xDD: // DRI - specify restart interval
-         if (get16(&z->s) != 4) return e("bad DRI len","Corrupt JPEG");
-         z->restart_interval = get16(&z->s);
+         if (get16(z->s) != 4) return e("bad DRI len","Corrupt JPEG");
+         z->restart_interval = get16(z->s);
          return 1;
 
       case 0xDB: // DQT - define quantization table
-         L = get16(&z->s)-2;
+         L = get16(z->s)-2;
          while (L > 0) {
-            int q = get8(&z->s);
+            int q = get8(z->s);
             int p = q >> 4;
             int t = q & 15,i;
             if (p != 0) return e("bad DQT type","Corrupt JPEG");
             if (t > 3) return e("bad DQT table","Corrupt JPEG");
             for (i=0; i < 64; ++i)
-               z->dequant[t][dezigzag[i]] = get8u(&z->s);
-            #if STBI_SIMD
+               z->dequant[t][dezigzag[i]] = get8u(z->s);
+            #ifdef STBI_SIMD
             for (i=0; i < 64; ++i)
                z->dequant2[t][i] = z->dequant[t][i];
             #endif
@@ -1391,16 +1455,16 @@ static int process_marker(jpeg *z, int m)
          return L==0;
 
       case 0xC4: // DHT - define huffman table
-         L = get16(&z->s)-2;
+         L = get16(z->s)-2;
          while (L > 0) {
             uint8_t *v;
             int sizes[16],i,m=0;
-            int q = get8(&z->s);
+            int q = get8(z->s);
             int tc = q >> 4;
             int th = q & 15;
             if (tc > 1 || th > 3) return e("bad DHT header","Corrupt JPEG");
             for (i=0; i < 16; ++i) {
-               sizes[i] = get8(&z->s);
+               sizes[i] = get8(z->s);
                m += sizes[i];
             }
             L -= 17;
@@ -1412,14 +1476,14 @@ static int process_marker(jpeg *z, int m)
                v = z->huff_ac[th].values;
             }
             for (i=0; i < m; ++i)
-               v[i] = get8u(&z->s);
+               v[i] = get8u(z->s);
             L -= m;
          }
          return L==0;
    }
    // check for comment block or APP blocks
    if ((m >= 0xE0 && m <= 0xEF) || m == 0xFE) {
-      skip(&z->s, get16(&z->s)-2);
+      skip(z->s, get16(z->s)-2);
       return 1;
    }
    return 0;
@@ -1429,31 +1493,31 @@ static int process_marker(jpeg *z, int m)
 static int process_scan_header(jpeg *z)
 {
    int i;
-   int Ls = get16(&z->s);
-   z->scan_n = get8(&z->s);
-   if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s.img_n) return e("bad SOS component count","Corrupt JPEG");
+   int Ls = get16(z->s);
+   z->scan_n = get8(z->s);
+   if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s->img_n) return e("bad SOS component count","Corrupt JPEG");
    if (Ls != 6+2*z->scan_n) return e("bad SOS len","Corrupt JPEG");
    for (i=0; i < z->scan_n; ++i) {
-      int id = get8(&z->s), which;
-      int q = get8(&z->s);
-      for (which = 0; which < z->s.img_n; ++which)
+      int id = get8(z->s), which;
+      int q = get8(z->s);
+      for (which = 0; which < z->s->img_n; ++which)
          if (z->img_comp[which].id == id)
             break;
-      if (which == z->s.img_n) return 0;
+      if (which == z->s->img_n) return 0;
       z->img_comp[which].hd = q >> 4;   if (z->img_comp[which].hd > 3) return e("bad DC huff","Corrupt JPEG");
       z->img_comp[which].ha = q & 15;   if (z->img_comp[which].ha > 3) return e("bad AC huff","Corrupt JPEG");
       z->order[i] = which;
    }
-   if (get8(&z->s) != 0) return e("bad SOS","Corrupt JPEG");
-   get8(&z->s); // should be 63, but might be 0
-   if (get8(&z->s) != 0) return e("bad SOS","Corrupt JPEG");
+   if (get8(z->s) != 0) return e("bad SOS","Corrupt JPEG");
+   get8(z->s); // should be 63, but might be 0
+   if (get8(z->s) != 0) return e("bad SOS","Corrupt JPEG");
 
    return 1;
 }
 
 static int process_frame_header(jpeg *z, int scan)
 {
-   stbi *s = &z->s;
+   stbi *s = z->s;
    int Lf,p,i,q, h_max=1,v_max=1,c;
    Lf = get16(s);         if (Lf < 11) return e("bad SOF len","Corrupt JPEG"); // JPEG
    p  = get8(s);          if (p != 8) return e("only 8-bit","JPEG format not supported: 8-bit only"); // JPEG baseline
@@ -1543,7 +1607,7 @@ static int decode_jpeg_header(jpeg *z, int scan)
       m = get_marker(z);
       while (m == MARKER_none) {
          // some files have extra padding after their blocks, so ok, we'll scan
-         if (at_eof(&z->s)) return e("no SOF", "Corrupt JPEG");
+         if (at_eof(z->s)) return e("no SOF", "Corrupt JPEG");
          m = get_marker(z);
       }
    }
@@ -1561,6 +1625,19 @@ static int decode_jpeg_image(jpeg *j)
       if (SOS(m)) {
          if (!process_scan_header(j)) return 0;
          if (!parse_entropy_coded_data(j)) return 0;
+         if (j->marker == MARKER_none ) {
+            // handle 0s at the end of image data from IP Kamera 9060
+            while (!at_eof(j->s)) {
+               int x = get8(j->s);
+               if (x == 255) {
+                  j->marker = get8u(j->s);
+                  break;
+               } else if (x != 0) {
+                  return 0;
+               }
+            }
+            // if we reach eof without hitting a marker, get_marker() below will fail and we'll eventually return 0
+         }
       } else {
          if (!process_marker(j, m)) return 0;
       }
@@ -1572,12 +1649,16 @@ static int decode_jpeg_image(jpeg *j)
 // static jfif-centered resampling (across block boundaries)
 
 typedef uint8_t *(*resample_row_func)(uint8_t *out, uint8_t *in0, uint8_t *in1,
-                                      int w, int hs);
+                                    int w, int hs);
 
 #define div4(x) ((uint8_t) ((x) >> 2))
 
 static uint8_t *resample_row_1(uint8_t *out, uint8_t *in_near, uint8_t *in_far, int w, int hs)
 {
+   STBI_NOTUSED(out);
+   STBI_NOTUSED(in_far);
+   STBI_NOTUSED(w);
+   STBI_NOTUSED(hs);
    return in_near;
 }
 
@@ -1585,6 +1666,7 @@ static uint8_t* resample_row_v_2(uint8_t *out, uint8_t *in_near, uint8_t *in_far
 {
    // need to generate two samples vertically for every one in input
    int i;
+   STBI_NOTUSED(hs);
    for (i=0; i < w; ++i)
       out[i] = div4(3*in_near[i] + in_far[i] + 2);
    return out;
@@ -1595,6 +1677,7 @@ static uint8_t*  resample_row_h_2(uint8_t *out, uint8_t *in_near, uint8_t *in_fa
    // need to generate two samples horizontally for every one in input
    int i;
    uint8_t *input = in_near;
+
    if (w == 1) {
       // if only one sample, can't do any interpolation
       out[0] = out[1] = input[0];
@@ -1610,6 +1693,10 @@ static uint8_t*  resample_row_h_2(uint8_t *out, uint8_t *in_near, uint8_t *in_fa
    }
    out[i*2+0] = div4(input[w-2]*3 + input[w-1] + 2);
    out[i*2+1] = input[w-1];
+
+   STBI_NOTUSED(in_far);
+   STBI_NOTUSED(hs);
+
    return out;
 }
 
@@ -1633,6 +1720,9 @@ static uint8_t *resample_row_hv_2(uint8_t *out, uint8_t *in_near, uint8_t *in_fa
       out[i*2  ] = div16(3*t1 + t0 + 8);
    }
    out[w*2-1] = div4(t1+2);
+
+   STBI_NOTUSED(hs);
+
    return out;
 }
 
@@ -1640,6 +1730,7 @@ static uint8_t *resample_row_generic(uint8_t *out, uint8_t *in_near, uint8_t *in
 {
    // resample with nearest-neighbor
    int i,j;
+   in_far = in_far;
    for (i=0; i < w; ++i)
       for (j=0; j < hs; ++j)
          out[i*hs+j] = in_near[i];
@@ -1675,7 +1766,7 @@ static void YCbCr_to_RGB_row(uint8_t *out, const uint8_t *y, const uint8_t *pcb,
    }
 }
 
-#if STBI_SIMD
+#ifdef STBI_SIMD
 static stbi_YCbCr_to_RGB_run stbi_YCbCr_installed = YCbCr_to_RGB_row;
 
 void stbi_install_YCbCr_to_RGB(stbi_YCbCr_to_RGB_run func)
@@ -1689,7 +1780,7 @@ void stbi_install_YCbCr_to_RGB(stbi_YCbCr_to_RGB_run func)
 static void cleanup_jpeg(jpeg *j)
 {
    int i;
-   for (i=0; i < j->s.img_n; ++i) {
+   for (i=0; i < j->s->img_n; ++i) {
       if (j->img_comp[i].data) {
          free(j->img_comp[i].raw_data);
          j->img_comp[i].data = NULL;
@@ -1706,7 +1797,7 @@ typedef struct
    resample_row_func resample;
    uint8_t *line0,*line1;
    int hs,vs;   // expansion factor in each axis
-   int w_lores; // horizontal pixels pre-expansion
+   int w_lores; // horizontal pixels pre-expansion 
    int ystep;   // how far through vertical expansion we are
    int ypos;    // which pre-expansion row we're on
 } stbi_resample;
@@ -1716,23 +1807,23 @@ static uint8_t *load_jpeg_image(jpeg *z, int *out_x, int *out_y, int *comp, int
    int n, decode_n;
    // validate req_comp
    if (req_comp < 0 || req_comp > 4) return epuc("bad req_comp", "Internal error");
-   z->s.img_n = 0;
+   z->s->img_n = 0;
 
    // load a jpeg image from whichever source
    if (!decode_jpeg_image(z)) { cleanup_jpeg(z); return NULL; }
 
    // determine actual number of components to generate
-   n = req_comp ? req_comp : z->s.img_n;
+   n = req_comp ? req_comp : z->s->img_n;
 
-   if (z->s.img_n == 3 && n < 3)
+   if (z->s->img_n == 3 && n < 3)
       decode_n = 1;
    else
-      decode_n = z->s.img_n;
+      decode_n = z->s->img_n;
 
    // resample and color-convert
    {
       int k;
-      guint i,j;
+      uint i,j;
       uint8_t *output;
       uint8_t *coutput[4];
 
@@ -1743,13 +1834,13 @@ static uint8_t *load_jpeg_image(jpeg *z, int *out_x, int *out_y, int *comp, int
 
          // allocate line buffer big enough for upsampling off the edges
          // with upsample factor of 4
-         z->img_comp[k].linebuf = (uint8_t *) malloc(z->s.img_x + 3);
+         z->img_comp[k].linebuf = (uint8_t *) malloc(z->s->img_x + 3);
          if (!z->img_comp[k].linebuf) { cleanup_jpeg(z); return epuc("outofmem", "Out of memory"); }
 
          r->hs      = z->img_h_max / z->img_comp[k].h;
          r->vs      = z->img_v_max / z->img_comp[k].v;
          r->ystep   = r->vs >> 1;
-         r->w_lores = (z->s.img_x + r->hs-1) / r->hs;
+         r->w_lores = (z->s->img_x + r->hs-1) / r->hs;
          r->ypos    = 0;
          r->line0   = r->line1 = z->img_comp[k].data;
 
@@ -1761,12 +1852,12 @@ static uint8_t *load_jpeg_image(jpeg *z, int *out_x, int *out_y, int *comp, int
       }
 
       // can't error after this so, this is safe
-      output = (uint8_t *) malloc(n * z->s.img_x * z->s.img_y + 1);
+      output = (uint8_t *) malloc(n * z->s->img_x * z->s->img_y + 1);
       if (!output) { cleanup_jpeg(z); return epuc("outofmem", "Out of memory"); }
 
       // now go ahead and resample
-      for (j=0; j < z->s.img_y; ++j) {
-         uint8_t *out = output + n * z->s.img_x * j;
+      for (j=0; j < z->s->img_y; ++j) {
+         uint8_t *out = output + n * z->s->img_x * j;
          for (k=0; k < decode_n; ++k) {
             stbi_resample *r = &res_comp[k];
             int y_bot = r->ystep >= (r->vs >> 1);
@@ -1783,14 +1874,14 @@ static uint8_t *load_jpeg_image(jpeg *z, int *out_x, int *out_y, int *comp, int
          }
          if (n >= 3) {
             uint8_t *y = coutput[0];
-            if (z->s.img_n == 3) {
-               #if STBI_SIMD
+            if (z->s->img_n == 3) {
+               #ifdef STBI_SIMD
                stbi_YCbCr_installed(out, y, coutput[1], coutput[2], z->s.img_x, n);
                #else
-               YCbCr_to_RGB_row(out, y, coutput[1], coutput[2], z->s.img_x, n);
+               YCbCr_to_RGB_row(out, y, coutput[1], coutput[2], z->s->img_x, n);
                #endif
             } else
-               for (i=0; i < z->s.img_x; ++i) {
+               for (i=0; i < z->s->img_x; ++i) {
                   out[0] = out[1] = out[2] = y[i];
                   out[3] = 255; // not used if n==3
                   out += n;
@@ -1798,72 +1889,55 @@ static uint8_t *load_jpeg_image(jpeg *z, int *out_x, int *out_y, int *comp, int
          } else {
             uint8_t *y = coutput[0];
             if (n == 1)
-               for (i=0; i < z->s.img_x; ++i) out[i] = y[i];
+               for (i=0; i < z->s->img_x; ++i) out[i] = y[i];
             else
-               for (i=0; i < z->s.img_x; ++i) *out++ = y[i], *out++ = 255;
+               for (i=0; i < z->s->img_x; ++i) *out++ = y[i], *out++ = 255;
          }
       }
       cleanup_jpeg(z);
-      *out_x = z->s.img_x;
-      *out_y = z->s.img_y;
-      if (comp) *comp  = z->s.img_n; // report original components, not output
+      *out_x = z->s->img_x;
+      *out_y = z->s->img_y;
+      if (comp) *comp  = z->s->img_n; // report original components, not output
       return output;
    }
 }
 
-#ifndef STBI_NO_STDIO
-unsigned char *stbi_jpeg_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
+static unsigned char *stbi_jpeg_load(stbi *s, int *x, int *y, int *comp, int req_comp)
 {
    jpeg j;
-   start_file(&j.s, f);
+   j.s = s;
    return load_jpeg_image(&j, x,y,comp,req_comp);
 }
 
-unsigned char *stbi_jpeg_load(char const *filename, int *x, int *y, int *comp, int req_comp)
-{
-   unsigned char *data;
-   FILE *f = fopen(filename, "rb");
-   if (!f) return NULL;
-   data = stbi_jpeg_load_from_file(f,x,y,comp,req_comp);
-   fclose(f);
-   return data;
-}
-#endif
-
-unsigned char *stbi_jpeg_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
+static int stbi_jpeg_test(stbi *s)
 {
+   int r;
    jpeg j;
-   start_mem(&j.s, buffer,len);
-   return load_jpeg_image(&j, x,y,comp,req_comp);
+   j.s = s;
+   r = decode_jpeg_header(&j, SCAN_type);
+   stbi_rewind(s);
+   return r;
 }
 
-#ifndef STBI_NO_STDIO
-int stbi_jpeg_test_file(FILE *f)
+static int stbi_jpeg_info_raw(jpeg *j, int *x, int *y, int *comp)
 {
-   int n,r;
-   jpeg j;
-   n = ftell(f);
-   start_file(&j.s, f);
-   r = decode_jpeg_header(&j, SCAN_type);
-   fseek(f,n,SEEK_SET);
-   return r;
+   if (!decode_jpeg_header(j, SCAN_header)) {
+      stbi_rewind( j->s );
+      return 0;
+   }
+   if (x) *x = j->s->img_x;
+   if (y) *y = j->s->img_y;
+   if (comp) *comp = j->s->img_n;
+   return 1;
 }
-#endif
 
-int stbi_jpeg_test_memory(stbi_uc const *buffer, int len)
+static int stbi_jpeg_info(stbi *s, int *x, int *y, int *comp)
 {
    jpeg j;
-   start_mem(&j.s, buffer,len);
-   return decode_jpeg_header(&j, SCAN_type);
+   j.s = s;
+   return stbi_jpeg_info_raw(&j, x, y, comp);
 }
 
-// @TODO:
-#ifndef STBI_NO_STDIO
-extern int      stbi_jpeg_info            (char const *filename,           int *x, int *y, int *comp);
-extern int      stbi_jpeg_info_from_file  (FILE *f,                  int *x, int *y, int *comp);
-#endif
-extern int      stbi_jpeg_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
-
 // public domain zlib decode    v0.2  Sean Barrett 2006-11-18
 //    simple implementation
 //      - all input must be provided in an upfront buffer
@@ -1883,11 +1957,11 @@ typedef struct
    uint16_t firstcode[16];
    int maxcode[17];
    uint16_t firstsymbol[16];
-   uint8_t size[288];
-   uint16_t value[288];
+   uint8_t  size[288];
+   uint16_t value[288]; 
 } zhuffman;
 
-__forceinline static int bitreverse16(int n)
+stbi_inline static int bitreverse16(int n)
 {
   n = ((n & 0xAAAA) >>  1) | ((n & 0x5555) << 1);
   n = ((n & 0xCCCC) >>  2) | ((n & 0x3333) << 2);
@@ -1896,7 +1970,7 @@ __forceinline static int bitreverse16(int n)
   return n;
 }
 
-__forceinline static int bit_reverse(int v, int bits)
+stbi_inline static int bit_reverse(int v, int bits)
 {
    assert(bits <= 16);
    // to bit reverse n bits, reverse 16 and shift
@@ -1912,7 +1986,7 @@ static int zbuild_huffman(zhuffman *z, uint8_t *sizelist, int num)
    // DEFLATE spec for generating codes
    memset(sizes, 0, sizeof(sizes));
    memset(z->fast, 255, sizeof(z->fast));
-   for (i=0; i < num; ++i)
+   for (i=0; i < num; ++i) 
       ++sizes[sizelist[i]];
    sizes[0] = 0;
    for (i=1; i < 16; ++i)
@@ -1969,7 +2043,7 @@ typedef struct
    zhuffman z_length, z_distance;
 } zbuf;
 
-__forceinline static int zget8(zbuf *z)
+stbi_inline static int zget8(zbuf *z)
 {
    if (z->zbuffer >= z->zbuffer_end) return 0;
    return *z->zbuffer++;
@@ -1984,17 +2058,17 @@ static void fill_bits(zbuf *z)
    } while (z->num_bits <= 24);
 }
 
-__forceinline static unsigned int zreceive(zbuf *z, int n)
+stbi_inline static unsigned int zreceive(zbuf *z, int n)
 {
    unsigned int k;
    if (z->num_bits < n) fill_bits(z);
    k = z->code_buffer & ((1 << n) - 1);
    z->code_buffer >>= n;
    z->num_bits -= n;
-   return k;
+   return k;   
 }
 
-__forceinline static int zhuffman_decode(zbuf *a, zhuffman *z)
+stbi_inline static int zhuffman_decode(zbuf *a, zhuffman *z)
 {
    int b,s,k;
    if (a->num_bits < 16) fill_bits(a);
@@ -2043,7 +2117,7 @@ static int length_base[31] = {
    15,17,19,23,27,31,35,43,51,59,
    67,83,99,115,131,163,195,227,258,0,0 };
 
-static int length_extra[31]=
+static int length_extra[31]= 
 { 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };
 
 static int dist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,
@@ -2244,6 +2318,22 @@ char *stbi_zlib_decode_malloc(char const *buffer, int len, int *outlen)
    return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen);
 }
 
+char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header)
+{
+   zbuf a;
+   char *p = (char *) malloc(initial_size);
+   if (p == NULL) return NULL;
+   a.zbuffer = (uint8_t *) buffer;
+   a.zbuffer_end = (uint8_t *) buffer + len;
+   if (do_zlib(&a, p, initial_size, 1, parse_header)) {
+      if (outlen) *outlen = (int) (a.zout - a.zout_start);
+      return a.zout_start;
+   } else {
+      free(a.zout_start);
+      return NULL;
+   }
+}
+
 int stbi_zlib_decode_buffer(char *obuffer, int olen, char const *ibuffer, int ilen)
 {
    zbuf a;
@@ -2314,20 +2404,20 @@ static int check_png_header(stbi *s)
    static uint8_t png_sig[8] = { 137,80,78,71,13,10,26,10 };
    int i;
    for (i=0; i < 8; ++i)
-      if (get8(s) != png_sig[i]) return e("bad png sig","Not a PNG");
+      if (get8u(s) != png_sig[i]) return e("bad png sig","Not a PNG");
    return 1;
 }
 
 typedef struct
 {
-   stbi s;
+   stbi *s;
    uint8_t *idata, *expanded, *out;
 } png;
 
 
 enum {
    F_none=0, F_sub=1, F_up=2, F_avg=3, F_paeth=4,
-   F_avg_first, F_paeth_first,
+   F_avg_first, F_paeth_first
 };
 
 static uint8_t first_row_filter[5] =
@@ -2349,7 +2439,7 @@ static int paeth(int a, int b, int c)
 // create the png data from post-deflated data
 static int create_png_image_raw(png *a, uint8_t *raw, uint32_t raw_len, int out_n, uint32_t x, uint32_t y)
 {
-   stbi *s = &a->s;
+   stbi *s = a->s;
    uint32_t i,j,stride = x*out_n;
    int k;
    int img_n = s->img_n; // copy it into a local for later
@@ -2373,7 +2463,7 @@ static int create_png_image_raw(png *a, uint8_t *raw, uint32_t raw_len, int out_
       if (j == 0) filter = first_row_filter[filter];
       // handle first pixel explicitly
       for (k=0; k < img_n; ++k) {
-         switch(filter) {
+         switch (filter) {
             case F_none       : cur[k] = raw[k]; break;
             case F_sub        : cur[k] = raw[k]; break;
             case F_up         : cur[k] = raw[k] + prior[k]; break;
@@ -2393,7 +2483,7 @@ static int create_png_image_raw(png *a, uint8_t *raw, uint32_t raw_len, int out_
              case f:     \
                 for (i=x-1; i >= 1; --i, raw+=img_n,cur+=img_n,prior+=img_n) \
                    for (k=0; k < img_n; ++k)
-         switch(filter) {
+         switch (filter) {
             CASE(F_none)  cur[k] = raw[k]; break;
             CASE(F_sub)   cur[k] = raw[k] + cur[k-img_n]; break;
             CASE(F_up)    cur[k] = raw[k] + prior[k]; break;
@@ -2409,7 +2499,7 @@ static int create_png_image_raw(png *a, uint8_t *raw, uint32_t raw_len, int out_
              case f:     \
                 for (i=x-1; i >= 1; --i, cur[img_n]=255,raw+=img_n,cur+=out_n,prior+=out_n) \
                    for (k=0; k < img_n; ++k)
-         switch(filter) {
+         switch (filter) {
             CASE(F_none)  cur[k] = raw[k]; break;
             CASE(F_sub)   cur[k] = raw[k] + cur[k-out_n]; break;
             CASE(F_up)    cur[k] = raw[k] + prior[k]; break;
@@ -2430,12 +2520,12 @@ static int create_png_image(png *a, uint8_t *raw, uint32_t raw_len, int out_n, i
    int p;
    int save;
    if (!interlaced)
-      return create_png_image_raw(a, raw, raw_len, out_n, a->s.img_x, a->s.img_y);
+      return create_png_image_raw(a, raw, raw_len, out_n, a->s->img_x, a->s->img_y);
    save = stbi_png_partial;
    stbi_png_partial = 0;
 
    // de-interlacing
-   final = (uint8_t *) malloc(a->s.img_x * a->s.img_y * out_n);
+   final = (uint8_t *) malloc(a->s->img_x * a->s->img_y * out_n);
    for (p=0; p < 7; ++p) {
       int xorig[] = { 0,4,0,2,0,1,0 };
       int yorig[] = { 0,0,4,0,2,0,1 };
@@ -2443,8 +2533,8 @@ static int create_png_image(png *a, uint8_t *raw, uint32_t raw_len, int out_n, i
       int yspc[]  = { 8,8,8,4,4,2,2 };
       int i,j,x,y;
       // pass1_x[4] = 0, pass1_x[5] = 1, pass1_x[12] = 1
-      x = (a->s.img_x - xorig[p] + xspc[p]-1) / xspc[p];
-      y = (a->s.img_y - yorig[p] + yspc[p]-1) / yspc[p];
+      x = (a->s->img_x - xorig[p] + xspc[p]-1) / xspc[p];
+      y = (a->s->img_y - yorig[p] + yspc[p]-1) / yspc[p];
       if (x && y) {
          if (!create_png_image_raw(a, raw, raw_len, out_n, x, y)) {
             free(final);
@@ -2452,7 +2542,7 @@ static int create_png_image(png *a, uint8_t *raw, uint32_t raw_len, int out_n, i
          }
          for (j=0; j < y; ++j)
             for (i=0; i < x; ++i)
-               memcpy(final + (j*yspc[p]+yorig[p])*a->s.img_x*out_n + (i*xspc[p]+xorig[p])*out_n,
+               memcpy(final + (j*yspc[p]+yorig[p])*a->s->img_x*out_n + (i*xspc[p]+xorig[p])*out_n,
                       a->out + (j*x+i)*out_n, out_n);
          free(a->out);
          raw += (x*out_n+1)*y;
@@ -2467,7 +2557,7 @@ static int create_png_image(png *a, uint8_t *raw, uint32_t raw_len, int out_n, i
 
 static int compute_transparency(png *z, uint8_t tc[3], int out_n)
 {
-   stbi *s = &z->s;
+   stbi *s = z->s;
    uint32_t i, pixel_count = s->img_x * s->img_y;
    uint8_t *p = z->out;
 
@@ -2492,7 +2582,7 @@ static int compute_transparency(png *z, uint8_t tc[3], int out_n)
 
 static int expand_palette(png *a, uint8_t *palette, int len, int pal_img_n)
 {
-   uint32_t i, pixel_count = a->s.img_x * a->s.img_y;
+   uint32_t i, pixel_count = a->s->img_x * a->s->img_y;
    uint8_t *p, *temp_out, *orig = a->out;
 
    p = (uint8_t *) malloc(pixel_count * pal_img_n);
@@ -2521,29 +2611,93 @@ static int expand_palette(png *a, uint8_t *palette, int len, int pal_img_n)
    }
    free(a->out);
    a->out = temp_out;
+
+   STBI_NOTUSED(len);
+
    return 1;
 }
 
+static int stbi_unpremultiply_on_load = 0;
+static int stbi_de_iphone_flag = 0;
+
+void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
+{
+   stbi_unpremultiply_on_load = flag_true_if_should_unpremultiply;
+}
+void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
+{
+   stbi_de_iphone_flag = flag_true_if_should_convert;
+}
+
+static void stbi_de_iphone(png *z)
+{
+   stbi *s = z->s;
+   uint32_t i, pixel_count = s->img_x * s->img_y;
+   uint8_t *p = z->out;
+
+   if (s->img_out_n == 3) {  // convert bgr to rgb
+      for (i=0; i < pixel_count; ++i) {
+         uint8_t t = p[0];
+         p[0] = p[2];
+         p[2] = t;
+         p += 3;
+      }
+   } else {
+      assert(s->img_out_n == 4);
+      if (stbi_unpremultiply_on_load) {
+         // convert bgr to rgb and unpremultiply
+         for (i=0; i < pixel_count; ++i) {
+            uint8_t a = p[3];
+            uint8_t t = p[0];
+            if (a) {
+               p[0] = p[2] * 255 / a;
+               p[1] = p[1] * 255 / a;
+               p[2] =  t   * 255 / a;
+            } else {
+               p[0] = p[2];
+               p[2] = t;
+            } 
+            p += 4;
+         }
+      } else {
+         // convert bgr to rgb
+         for (i=0; i < pixel_count; ++i) {
+            uint8_t t = p[0];
+            p[0] = p[2];
+            p[2] = t;
+            p += 4;
+         }
+      }
+   }
+}
+
 static int parse_png_file(png *z, int scan, int req_comp)
 {
    uint8_t palette[1024], pal_img_n=0;
    uint8_t has_trans=0, tc[3];
    uint32_t ioff=0, idata_limit=0, i, pal_len=0;
-   int first=1,k,interlace=0;
-   stbi *s = &z->s;
+   int first=1,k,interlace=0, iphone=0;
+   stbi *s = z->s;
+
+   z->expanded = NULL;
+   z->idata = NULL;
+   z->out = NULL;
 
    if (!check_png_header(s)) return 0;
 
    if (scan == SCAN_type) return 1;
 
-   for(;;first=0) {
+   for (;;) {
       chunk c = get_chunk_header(s);
-      if (first && c.type != PNG_TYPE('I','H','D','R'))
-         return e("first not IHDR","Corrupt PNG");
       switch (c.type) {
+         case PNG_TYPE('C','g','B','I'):
+            iphone = stbi_de_iphone_flag;
+            skip(s, c.length);
+            break;
          case PNG_TYPE('I','H','D','R'): {
             int depth,color,comp,filter;
             if (!first) return e("multiple IHDR","Corrupt PNG");
+            first = 0;
             if (c.length != 13) return e("bad IHDR len","Corrupt PNG");
             s->img_x = get32(s); if (s->img_x > (1 << 24)) return e("too large","Very large image (corrupt?)");
             s->img_y = get32(s); if (s->img_y > (1 << 24)) return e("too large","Very large image (corrupt?)");
@@ -2569,6 +2723,7 @@ static int parse_png_file(png *z, int scan, int req_comp)
          }
 
          case PNG_TYPE('P','L','T','E'):  {
+            if (first) return e("first not IHDR", "Corrupt PNG");
             if (c.length > 256*3) return e("invalid PLTE","Corrupt PNG");
             pal_len = c.length / 3;
             if (pal_len * 3 != c.length) return e("invalid PLTE","Corrupt PNG");
@@ -2582,6 +2737,7 @@ static int parse_png_file(png *z, int scan, int req_comp)
          }
 
          case PNG_TYPE('t','R','N','S'): {
+            if (first) return e("first not IHDR", "Corrupt PNG");
             if (z->idata) return e("tRNS after IDAT","Corrupt PNG");
             if (pal_img_n) {
                if (scan == SCAN_header) { s->img_n = 4; return 1; }
@@ -2601,6 +2757,7 @@ static int parse_png_file(png *z, int scan, int req_comp)
          }
 
          case PNG_TYPE('I','D','A','T'): {
+            if (first) return e("first not IHDR", "Corrupt PNG");
             if (pal_img_n && !pal_len) return e("no PLTE","Corrupt PNG");
             if (scan == SCAN_header) { s->img_n = pal_img_n; return 1; }
             if (ioff + c.length > idata_limit) {
@@ -2611,26 +2768,17 @@ static int parse_png_file(png *z, int scan, int req_comp)
                p = (uint8_t *) realloc(z->idata, idata_limit); if (p == NULL) return e("outofmem", "Out of memory");
                z->idata = p;
             }
-            #ifndef STBI_NO_STDIO
-            if (s->img_file)
-            {
-               if (fread(z->idata+ioff,1,c.length,s->img_file) != c.length) return e("outofdata","Corrupt PNG");
-            }
-            else
-            #endif
-            {
-               memcpy(z->idata+ioff, s->img_buffer, c.length);
-               s->img_buffer += c.length;
-            }
+            if (!getn(s, z->idata+ioff,c.length)) return e("outofdata","Corrupt PNG");
             ioff += c.length;
             break;
          }
 
          case PNG_TYPE('I','E','N','D'): {
             uint32_t raw_len;
+            if (first) return e("first not IHDR", "Corrupt PNG");
             if (scan != SCAN_load) return 1;
             if (z->idata == NULL) return e("no IDAT","Corrupt PNG");
-            z->expanded = (uint8_t *) stbi_zlib_decode_malloc((char *) z->idata, ioff, (int *) &raw_len);
+            z->expanded = (uint8_t *) stbi_zlib_decode_malloc_guesssize_headerflag((char *) z->idata, ioff, 16384, (int *) &raw_len, !iphone);
             if (z->expanded == NULL) return 0; // zlib should set error
             free(z->idata); z->idata = NULL;
             if ((req_comp == s->img_n+1 && req_comp != 3 && !pal_img_n) || has_trans)
@@ -2640,6 +2788,8 @@ static int parse_png_file(png *z, int scan, int req_comp)
             if (!create_png_image(z, z->expanded, raw_len, s->img_out_n, interlace)) return 0;
             if (has_trans)
                if (!compute_transparency(z, tc, s->img_out_n)) return 0;
+            if (iphone && s->img_out_n > 2)
+               stbi_de_iphone(z);
             if (pal_img_n) {
                // pal_img_n == 3 or 4
                s->img_n = pal_img_n; // record the actual colors we had
@@ -2654,6 +2804,7 @@ static int parse_png_file(png *z, int scan, int req_comp)
 
          default:
             // if critical, fail
+            if (first) return e("first not IHDR", "Corrupt PNG");
             if ((c.type & (1 << 29)) == 0) {
                #ifndef STBI_NO_FAILURE_STRINGS
                // not threadsafe
@@ -2676,21 +2827,18 @@ static int parse_png_file(png *z, int scan, int req_comp)
 static unsigned char *do_png(png *p, int *x, int *y, int *n, int req_comp)
 {
    unsigned char *result=NULL;
-   p->expanded = NULL;
-   p->idata = NULL;
-   p->out = NULL;
    if (req_comp < 0 || req_comp > 4) return epuc("bad req_comp", "Internal error");
    if (parse_png_file(p, SCAN_load, req_comp)) {
       result = p->out;
       p->out = NULL;
-      if (req_comp && req_comp != p->s.img_out_n) {
-         result = convert_format(result, p->s.img_out_n, req_comp, p->s.img_x, p->s.img_y);
-         p->s.img_out_n = req_comp;
+      if (req_comp && req_comp != p->s->img_out_n) {
+         result = convert_format(result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
+         p->s->img_out_n = req_comp;
          if (result == NULL) return result;
       }
-      *x = p->s.img_x;
-      *y = p->s.img_y;
-      if (n) *n = p->s.img_n;
+      *x = p->s->img_x;
+      *y = p->s->img_y;
+      if (n) *n = p->s->img_n;
    }
    free(p->out);      p->out      = NULL;
    free(p->expanded); p->expanded = NULL;
@@ -2699,75 +2847,40 @@ static unsigned char *do_png(png *p, int *x, int *y, int *n, int req_comp)
    return result;
 }
 
-#ifndef STBI_NO_STDIO
-unsigned char *stbi_png_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
+static unsigned char *stbi_png_load(stbi *s, int *x, int *y, int *comp, int req_comp)
 {
    png p;
-   start_file(&p.s, f);
+   p.s = s;
    return do_png(&p, x,y,comp,req_comp);
 }
 
-unsigned char *stbi_png_load(char const *filename, int *x, int *y, int *comp, int req_comp)
+static int stbi_png_test(stbi *s)
 {
-   unsigned char *data;
-   FILE *f = fopen(filename, "rb");
-   if (!f) return NULL;
-   data = stbi_png_load_from_file(f,x,y,comp,req_comp);
-   fclose(f);
-   return data;
-}
-#endif
-
-unsigned char *stbi_png_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
-{
-   png p;
-   start_mem(&p.s, buffer,len);
-   return do_png(&p, x,y,comp,req_comp);
-}
-
-#ifndef STBI_NO_STDIO
-int stbi_png_test_file(FILE *f)
-{
-   png p;
-   int n,r;
-   n = ftell(f);
-   start_file(&p.s, f);
-   r = parse_png_file(&p, SCAN_type,STBI_default);
-   fseek(f,n,SEEK_SET);
+   int r;
+   r = check_png_header(s);
+   stbi_rewind(s);
    return r;
 }
-#endif
 
-int stbi_png_test_memory(stbi_uc const *buffer, int len)
+static int stbi_png_info_raw(png *p, int *x, int *y, int *comp)
 {
-   png p;
-   start_mem(&p.s, buffer, len);
-   return parse_png_file(&p, SCAN_type,STBI_default);
+   if (!parse_png_file(p, SCAN_header, 0)) {
+      stbi_rewind( p->s );
+      return 0;
+   }
+   if (x) *x = p->s->img_x;
+   if (y) *y = p->s->img_y;
+   if (comp) *comp = p->s->img_n;
+   return 1;
 }
 
-// TODO: load header from png
-#ifndef STBI_NO_STDIO
-int      stbi_png_info             (char const *filename,           int *x, int *y, int *comp)
+static int      stbi_png_info(stbi *s, int *x, int *y, int *comp)
 {
    png p;
-   FILE *f = fopen(filename, "rb");
-   if (!f) return 0;
-   start_file(&p.s, f);
-   if (parse_png_file(&p, SCAN_header, 0)) {
-      if(x) *x = p.s.img_x;
-      if(y) *y = p.s.img_y;
-      if (comp) *comp = p.s.img_n;
-      fclose(f);
-      return 1;
-   }
-   fclose(f);
-   return 0;
+   p.s = s;
+   return stbi_png_info_raw(&p, x, y, comp);
 }
 
-extern int      stbi_png_info_from_file   (FILE *f,                  int *x, int *y, int *comp);
-#endif
-extern int      stbi_png_info_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp);
-
 // Microsoft/Windows BMP image
 
 static int bmp_test(stbi *s)
@@ -2784,24 +2897,13 @@ static int bmp_test(stbi *s)
    return 0;
 }
 
-#ifndef STBI_NO_STDIO
-int      stbi_bmp_test_file        (FILE *f)
+static int stbi_bmp_test(stbi *s)
 {
-   stbi s;
-   int r,n = ftell(f);
-   start_file(&s,f);
-   r = bmp_test(&s);
-   fseek(f,n,SEEK_SET);
+   int r = bmp_test(s);
+   stbi_rewind(s);
    return r;
 }
-#endif
 
-int      stbi_bmp_test_memory      (stbi_uc const *buffer, int len)
-{
-   stbi s;
-   start_mem(&s, buffer, len);
-   return bmp_test(&s);
-}
 
 // returns 0..31 for the highest set bit
 static int high_bit(unsigned int z)
@@ -2857,7 +2959,6 @@ static stbi_uc *bmp_load(stbi *s, int *x, int *y, int *comp, int req_comp)
    offset = get32le(s);
    hsz = get32le(s);
    if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108) return epuc("unknown BMP", "BMP type not supported: unknown");
-   failure_reason = "bad BMP";
    if (hsz == 12) {
       s->img_x = get16le(s);
       s->img_y = get16le(s);
@@ -2865,7 +2966,7 @@ static stbi_uc *bmp_load(stbi *s, int *x, int *y, int *comp, int req_comp)
       s->img_x = get32le(s);
       s->img_y = get32le(s);
    }
-   if (get16le(s) != 1) return 0;
+   if (get16le(s) != 1) return epuc("bad BMP", "bad BMP");
    bpp = get16le(s);
    if (bpp == 1) return epuc("monochrome", "BMP type not supported: 1-bit");
    flip_vertically = ((int) s->img_y) > 0;
@@ -2892,14 +2993,14 @@ static stbi_uc *bmp_load(stbi *s, int *x, int *y, int *comp, int req_comp)
             mr = mg = mb = 0;
             if (compress == 0) {
                if (bpp == 32) {
-                  mr = 0xff << 16;
-                  mg = 0xff <<  8;
-                  mb = 0xff <<  0;
-                  ma = 0xff << 24;
+                  mr = 0xffu << 16;
+                  mg = 0xffu <<  8;
+                  mb = 0xffu <<  0;
+                  ma = 0xffu << 24;
                } else {
-                  mr = 31 << 10;
-                  mg = 31 <<  5;
-                  mb = 31 <<  0;
+                  mr = 31u << 10;
+                  mg = 31u <<  5;
+                  mb = 31u <<  0;
                }
             } else if (compress == 3) {
                mr = get32le(s);
@@ -2908,10 +3009,10 @@ static stbi_uc *bmp_load(stbi *s, int *x, int *y, int *comp, int req_comp)
                // not documented, but generated by photoshop and handled by mspaint
                if (mr == mg && mg == mb) {
                   // ?!?!?
-                  return NULL;
+                  return epuc("bad BMP", "bad BMP");
                }
             } else
-               return NULL;
+               return epuc("bad BMP", "bad BMP");
          }
       } else {
          assert(hsz == 108);
@@ -2937,9 +3038,9 @@ static stbi_uc *bmp_load(stbi *s, int *x, int *y, int *comp, int req_comp)
       int z=0;
       if (psize == 0 || psize > 256) { free(out); return epuc("invalid", "Corrupt BMP"); }
       for (i=0; i < psize; ++i) {
-         pal[i][2] = get8(s);
-         pal[i][1] = get8(s);
-         pal[i][0] = get8(s);
+         pal[i][2] = get8u(s);
+         pal[i][1] = get8u(s);
+         pal[i][0] = get8u(s);
          if (hsz != 12) get8(s);
          pal[i][3] = 255;
       }
@@ -2980,11 +3081,11 @@ static stbi_uc *bmp_load(stbi *s, int *x, int *y, int *comp, int req_comp)
       if (bpp == 24) {
          easy = 1;
       } else if (bpp == 32) {
-         if (mb == 0xff && mg == 0xff00 && mr == 0xff000000 && ma == 0xff000000)
+         if (mb == 0xff && mg == 0xff00 && mr == 0x00ff0000 && ma == 0xff000000)
             easy = 2;
       }
       if (!easy) {
-         if (!mr || !mg || !mb) return epuc("bad masks", "Corrupt BMP");
+         if (!mr || !mg || !mb) { free(out); return epuc("bad masks", "Corrupt BMP"); }
          // right shift amt to put high bit in position #7
          rshift = high_bit(mr)-7; rcount = bitcount(mr);
          gshift = high_bit(mg)-7; gcount = bitcount(mr);
@@ -2995,22 +3096,22 @@ static stbi_uc *bmp_load(stbi *s, int *x, int *y, int *comp, int req_comp)
          if (easy) {
             for (i=0; i < (int) s->img_x; ++i) {
                int a;
-               out[z+2] = get8(s);
-               out[z+1] = get8(s);
-               out[z+0] = get8(s);
+               out[z+2] = get8u(s);
+               out[z+1] = get8u(s);
+               out[z+0] = get8u(s);
                z += 3;
                a = (easy == 2 ? get8(s) : 255);
-               if (target == 4) out[z++] = a;
+               if (target == 4) out[z++] = (uint8_t) a;
             }
          } else {
             for (i=0; i < (int) s->img_x; ++i) {
                uint32_t v = (bpp == 16 ? get16le(s) : get32le(s));
                int a;
-               out[z++] = shiftsigned(v & mr, rshift, rcount);
-               out[z++] = shiftsigned(v & mg, gshift, gcount);
-               out[z++] = shiftsigned(v & mb, bshift, bcount);
+               out[z++] = (uint8_t) shiftsigned(v & mr, rshift, rcount);
+               out[z++] = (uint8_t) shiftsigned(v & mg, gshift, gcount);
+               out[z++] = (uint8_t) shiftsigned(v & mb, bshift, bcount);
                a = (ma ? shiftsigned(v & ma, ashift, acount) : 255);
-               if (target == 4) out[z++] = a;
+               if (target == 4) out[z++] = (uint8_t) a; 
             }
          }
          skip(s, pad);
@@ -3034,538 +3135,1059 @@ static stbi_uc *bmp_load(stbi *s, int *x, int *y, int *comp, int req_comp)
 
    *x = s->img_x;
    *y = s->img_y;
-   if (comp) *comp = target;
+   if (comp) *comp = s->img_n;
    return out;
 }
 
-#ifndef STBI_NO_STDIO
-stbi_uc *stbi_bmp_load             (char const *filename,           int *x, int *y, int *comp, int req_comp)
+static stbi_uc *stbi_bmp_load(stbi *s,int *x, int *y, int *comp, int req_comp)
 {
-   stbi_uc *data;
-   FILE *f = fopen(filename, "rb");
-   if (!f) return NULL;
-   data = stbi_bmp_load_from_file(f, x,y,comp,req_comp);
-   fclose(f);
-   return data;
+   return bmp_load(s, x,y,comp,req_comp);
+}
+
+
+// Targa Truevision - TGA
+// by Jonathan Dummer
+
+static int tga_info(stbi *s, int *x, int *y, int *comp)
+{
+    int tga_w, tga_h, tga_comp;
+    int sz;
+    get8u(s);                   // discard Offset
+    sz = get8u(s);              // color type
+    if( sz > 1 ) {
+        stbi_rewind(s);
+        return 0;      // only RGB or indexed allowed
+    }
+    sz = get8u(s);              // image type
+    // only RGB or grey allowed, +/- RLE
+    if ((sz != 1) && (sz != 2) && (sz != 3) && (sz != 9) && (sz != 10) && (sz != 11)) return 0;
+    skip(s,9);
+    tga_w = get16le(s);
+    if( tga_w < 1 ) {
+        stbi_rewind(s);
+        return 0;   // test width
+    }
+    tga_h = get16le(s);
+    if( tga_h < 1 ) {
+        stbi_rewind(s);
+        return 0;   // test height
+    }
+    sz = get8(s);               // bits per pixel
+    // only RGB or RGBA or grey allowed
+    if ((sz != 8) && (sz != 16) && (sz != 24) && (sz != 32)) {
+        stbi_rewind(s);
+        return 0;
+    }
+    tga_comp = sz;
+    if (x) *x = tga_w;
+    if (y) *y = tga_h;
+    if (comp) *comp = tga_comp / 8;
+    return 1;                   // seems to have passed everything
+}
+
+int stbi_tga_info(stbi *s, int *x, int *y, int *comp)
+{
+    return tga_info(s, x, y, comp);
 }
 
-stbi_uc *stbi_bmp_load_from_file   (FILE *f,                  int *x, int *y, int *comp, int req_comp)
+static int tga_test(stbi *s)
 {
-   stbi s;
-   start_file(&s, f);
-   return bmp_load(&s, x,y,comp,req_comp);
+   int sz;
+   get8u(s);      //   discard Offset
+   sz = get8u(s);   //   color type
+   if ( sz > 1 ) return 0;   //   only RGB or indexed allowed
+   sz = get8u(s);   //   image type
+   if ( (sz != 1) && (sz != 2) && (sz != 3) && (sz != 9) && (sz != 10) && (sz != 11) ) return 0;   //   only RGB or grey allowed, +/- RLE
+   get16(s);      //   discard palette start
+   get16(s);      //   discard palette length
+   get8(s);         //   discard bits per palette color entry
+   get16(s);      //   discard x origin
+   get16(s);      //   discard y origin
+   if ( get16(s) < 1 ) return 0;      //   test width
+   if ( get16(s) < 1 ) return 0;      //   test height
+   sz = get8(s);   //   bits per pixel
+   if ( (sz != 8) && (sz != 16) && (sz != 24) && (sz != 32) ) return 0;   //   only RGB or RGBA or grey allowed
+   return 1;      //   seems to have passed everything
+}
+
+static int stbi_tga_test(stbi *s)
+{
+   int res = tga_test(s);
+   stbi_rewind(s);
+   return res;
 }
-#endif
 
-stbi_uc *stbi_bmp_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
+static stbi_uc *tga_load(stbi *s, int *x, int *y, int *comp, int req_comp)
 {
-   stbi s;
-   start_mem(&s, buffer, len);
-   return bmp_load(&s, x,y,comp,req_comp);
+   //   read in the TGA header stuff
+   int tga_offset = get8u(s);
+   int tga_indexed = get8u(s);
+   int tga_image_type = get8u(s);
+   int tga_is_RLE = 0;
+   int tga_palette_start = get16le(s);
+   int tga_palette_len = get16le(s);
+   int tga_palette_bits = get8u(s);
+   int tga_x_origin = get16le(s);
+   int tga_y_origin = get16le(s);
+   int tga_width = get16le(s);
+   int tga_height = get16le(s);
+   int tga_bits_per_pixel = get8u(s);
+   int tga_inverted = get8u(s);
+   //   image data
+   unsigned char *tga_data;
+   unsigned char *tga_palette = NULL;
+   int i, j;
+   unsigned char raw_data[4];
+   unsigned char trans_data[4];
+   int RLE_count = 0;
+   int RLE_repeating = 0;
+   int read_next_pixel = 1;
+
+   //   do a tiny bit of precessing
+   if ( tga_image_type >= 8 )
+   {
+      tga_image_type -= 8;
+      tga_is_RLE = 1;
+   }
+   /* int tga_alpha_bits = tga_inverted & 15; */
+   tga_inverted = 1 - ((tga_inverted >> 5) & 1);
+
+   //   error check
+   if ( //(tga_indexed) ||
+      (tga_width < 1) || (tga_height < 1) ||
+      (tga_image_type < 1) || (tga_image_type > 3) ||
+      ((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16) &&
+      (tga_bits_per_pixel != 24) && (tga_bits_per_pixel != 32))
+      )
+   {
+      return NULL; // we don't report this as a bad TGA because we don't even know if it's TGA
+   }
+
+   //   If I'm paletted, then I'll use the number of bits from the palette
+   if ( tga_indexed )
+   {
+      tga_bits_per_pixel = tga_palette_bits;
+   }
+
+   //   tga info
+   *x = tga_width;
+   *y = tga_height;
+   if ( (req_comp < 1) || (req_comp > 4) )
+   {
+      //   just use whatever the file was
+      req_comp = tga_bits_per_pixel / 8;
+      *comp = req_comp;
+   } else
+   {
+      //   force a new number of components
+      *comp = tga_bits_per_pixel/8;
+   }
+   tga_data = (unsigned char*)malloc( tga_width * tga_height * req_comp );
+   if (!tga_data) return epuc("outofmem", "Out of memory");
+
+   //   skip to the data's starting position (offset usually = 0)
+   skip(s, tga_offset );
+   //   do I need to load a palette?
+   if ( tga_indexed )
+   {
+      //   any data to skip? (offset usually = 0)
+      skip(s, tga_palette_start );
+      //   load the palette
+      tga_palette = (unsigned char*)malloc( tga_palette_len * tga_palette_bits / 8 );
+      if (!tga_palette) return epuc("outofmem", "Out of memory");
+      if (!getn(s, tga_palette, tga_palette_len * tga_palette_bits / 8 )) {
+         free(tga_data);
+         free(tga_palette);
+         return epuc("bad palette", "Corrupt TGA");
+      }
+   }
+   //   load the data
+   trans_data[0] = trans_data[1] = trans_data[2] = trans_data[3] = 0;
+   for (i=0; i < tga_width * tga_height; ++i)
+   {
+      //   if I'm in RLE mode, do I need to get a RLE chunk?
+      if ( tga_is_RLE )
+      {
+         if ( RLE_count == 0 )
+         {
+            //   yep, get the next byte as a RLE command
+            int RLE_cmd = get8u(s);
+            RLE_count = 1 + (RLE_cmd & 127);
+            RLE_repeating = RLE_cmd >> 7;
+            read_next_pixel = 1;
+         } else if ( !RLE_repeating )
+         {
+            read_next_pixel = 1;
+         }
+      } else
+      {
+         read_next_pixel = 1;
+      }
+      //   OK, if I need to read a pixel, do it now
+      if ( read_next_pixel )
+      {
+         //   load however much data we did have
+         if ( tga_indexed )
+         {
+            //   read in 1 byte, then perform the lookup
+            int pal_idx = get8u(s);
+            if ( pal_idx >= tga_palette_len )
+            {
+               //   invalid index
+               pal_idx = 0;
+            }
+            pal_idx *= tga_bits_per_pixel / 8;
+            for (j = 0; j*8 < tga_bits_per_pixel; ++j)
+            {
+               raw_data[j] = tga_palette[pal_idx+j];
+            }
+         } else
+         {
+            //   read in the data raw
+            for (j = 0; j*8 < tga_bits_per_pixel; ++j)
+            {
+               raw_data[j] = get8u(s);
+            }
+         }
+         //   convert raw to the intermediate format
+         switch (tga_bits_per_pixel)
+         {
+         case 8:
+            //   Luminance => RGBA
+            trans_data[0] = raw_data[0];
+            trans_data[1] = raw_data[0];
+            trans_data[2] = raw_data[0];
+            trans_data[3] = 255;
+            break;
+         case 16:
+            //   Luminance,Alpha => RGBA
+            trans_data[0] = raw_data[0];
+            trans_data[1] = raw_data[0];
+            trans_data[2] = raw_data[0];
+            trans_data[3] = raw_data[1];
+            break;
+         case 24:
+            //   BGR => RGBA
+            trans_data[0] = raw_data[2];
+            trans_data[1] = raw_data[1];
+            trans_data[2] = raw_data[0];
+            trans_data[3] = 255;
+            break;
+         case 32:
+            //   BGRA => RGBA
+            trans_data[0] = raw_data[2];
+            trans_data[1] = raw_data[1];
+            trans_data[2] = raw_data[0];
+            trans_data[3] = raw_data[3];
+            break;
+         default:
+            trans_data[0] = trans_data[1] = 0;
+            trans_data[2] = trans_data[3] = 0;
+            break;
+         }
+         //   clear the reading flag for the next pixel
+         read_next_pixel = 0;
+      } // end of reading a pixel
+      //   convert to final format
+      switch (req_comp)
+      {
+      case 1:
+         //   RGBA => Luminance
+         tga_data[i*req_comp+0] = compute_y(trans_data[0],trans_data[1],trans_data[2]);
+         break;
+      case 2:
+         //   RGBA => Luminance,Alpha
+         tga_data[i*req_comp+0] = compute_y(trans_data[0],trans_data[1],trans_data[2]);
+         tga_data[i*req_comp+1] = trans_data[3];
+         break;
+      case 3:
+         //   RGBA => RGB
+         tga_data[i*req_comp+0] = trans_data[0];
+         tga_data[i*req_comp+1] = trans_data[1];
+         tga_data[i*req_comp+2] = trans_data[2];
+         break;
+      case 4:
+         //   RGBA => RGBA
+         tga_data[i*req_comp+0] = trans_data[0];
+         tga_data[i*req_comp+1] = trans_data[1];
+         tga_data[i*req_comp+2] = trans_data[2];
+         tga_data[i*req_comp+3] = trans_data[3];
+         break;
+      }
+      //   in case we're in RLE mode, keep counting down
+      --RLE_count;
+   }
+   //   do I need to invert the image?
+   if ( tga_inverted )
+   {
+      for (j = 0; j*2 < tga_height; ++j)
+      {
+         int index1 = j * tga_width * req_comp;
+         int index2 = (tga_height - 1 - j) * tga_width * req_comp;
+         for (i = tga_width * req_comp; i > 0; --i)
+         {
+            unsigned char temp = tga_data[index1];
+            tga_data[index1] = tga_data[index2];
+            tga_data[index2] = temp;
+            ++index1;
+            ++index2;
+         }
+      }
+   }
+   //   clear my palette, if I had one
+   if ( tga_palette != NULL )
+   {
+      free( tga_palette );
+   }
+   //   the things I do to get rid of an error message, and yet keep
+   //   Microsoft's C compilers happy... [8^(
+   tga_palette_start = tga_palette_len = tga_palette_bits =
+         tga_x_origin = tga_y_origin = 0;
+   //   OK, done
+   return tga_data;
 }
 
-// Targa Truevision - TGA
-// by Jonathan Dummer
+static stbi_uc *stbi_tga_load(stbi *s, int *x, int *y, int *comp, int req_comp)
+{
+   return tga_load(s,x,y,comp,req_comp);
+}
 
-static int tga_test(stbi *s)
+
+// *************************************************************************************************
+// Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB
+
+static int psd_test(stbi *s)
 {
-       int sz;
-       get8u(s);               //      discard Offset
-       sz = get8u(s);  //      color type
-       if( sz > 1 ) return 0;  //      only RGB or indexed allowed
-       sz = get8u(s);  //      image type
-       if( (sz != 1) && (sz != 2) && (sz != 3) && (sz != 9) && (sz != 10) && (sz != 11) ) return 0;    //      only RGB or grey allowed, +/- RLE
-       get16(s);               //      discard palette start
-       get16(s);               //      discard palette length
-       get8(s);                        //      discard bits per palette color entry
-       get16(s);               //      discard x origin
-       get16(s);               //      discard y origin
-       if( get16(s) < 1 ) return 0;            //      test width
-       if( get16(s) < 1 ) return 0;            //      test height
-       sz = get8(s);   //      bits per pixel
-       if( (sz != 8) && (sz != 16) && (sz != 24) && (sz != 32) ) return 0;     //      only RGB or RGBA or grey allowed
-       return 1;               //      seems to have passed everything
+   if (get32(s) != 0x38425053) return 0;   // "8BPS"
+   else return 1;
 }
 
-#ifndef STBI_NO_STDIO
-int      stbi_tga_test_file        (FILE *f)
+static int stbi_psd_test(stbi *s)
 {
-   stbi s;
-   int r,n = ftell(f);
-   start_file(&s, f);
-   r = tga_test(&s);
-   fseek(f,n,SEEK_SET);
+   int r = psd_test(s);
+   stbi_rewind(s);
    return r;
 }
-#endif
 
-int      stbi_tga_test_memory      (stbi_uc const *buffer, int len)
+static stbi_uc *psd_load(stbi *s, int *x, int *y, int *comp, int req_comp)
 {
-   stbi s;
-   start_mem(&s, buffer, len);
-   return tga_test(&s);
+   int   pixelCount;
+   int channelCount, compression;
+   int channel, i, count, len;
+   int w,h;
+   uint8_t *out;
+
+   // Check identifier
+   if (get32(s) != 0x38425053)   // "8BPS"
+      return epuc("not PSD", "Corrupt PSD image");
+
+   // Check file type version.
+   if (get16(s) != 1)
+      return epuc("wrong version", "Unsupported version of PSD image");
+
+   // Skip 6 reserved bytes.
+   skip(s, 6 );
+
+   // Read the number of channels (R, G, B, A, etc).
+   channelCount = get16(s);
+   if (channelCount < 0 || channelCount > 16)
+      return epuc("wrong channel count", "Unsupported number of channels in PSD image");
+
+   // Read the rows and columns of the image.
+   h = get32(s);
+   w = get32(s);
+   
+   // Make sure the depth is 8 bits.
+   if (get16(s) != 8)
+      return epuc("unsupported bit depth", "PSD bit depth is not 8 bit");
+
+   // Make sure the color mode is RGB.
+   // Valid options are:
+   //   0: Bitmap
+   //   1: Grayscale
+   //   2: Indexed color
+   //   3: RGB color
+   //   4: CMYK color
+   //   7: Multichannel
+   //   8: Duotone
+   //   9: Lab color
+   if (get16(s) != 3)
+      return epuc("wrong color format", "PSD is not in RGB color format");
+
+   // Skip the Mode Data.  (It's the palette for indexed color; other info for other modes.)
+   skip(s,get32(s) );
+
+   // Skip the image resources.  (resolution, pen tool paths, etc)
+   skip(s, get32(s) );
+
+   // Skip the reserved data.
+   skip(s, get32(s) );
+
+   // Find out if the data is compressed.
+   // Known values:
+   //   0: no compression
+   //   1: RLE compressed
+   compression = get16(s);
+   if (compression > 1)
+      return epuc("bad compression", "PSD has an unknown compression format");
+
+   // Create the destination image.
+   out = (stbi_uc *) malloc(4 * w*h);
+   if (!out) return epuc("outofmem", "Out of memory");
+   pixelCount = w*h;
+
+   // Initialize the data to zero.
+   //memset( out, 0, pixelCount * 4 );
+   
+   // Finally, the image data.
+   if (compression) {
+      // RLE as used by .PSD and .TIFF
+      // Loop until you get the number of unpacked bytes you are expecting:
+      //     Read the next source byte into n.
+      //     If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
+      //     Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
+      //     Else if n is -128 (byte value 128), noop.
+      // Endloop
+
+      // The RLE-compressed data is preceded by a 2-byte data count for each row in the data,
+      // which we're going to just skip.
+      skip(s, h * channelCount * 2 );
+
+      // Read the RLE data by channel.
+      for (channel = 0; channel < 4; channel++) {
+         uint8_t *p;
+         
+         p = out+channel;
+         if (channel >= channelCount) {
+            // Fill this channel with default data.
+            for (i = 0; i < pixelCount; i++) *p = (channel == 3 ? 255 : 0), p += 4;
+         } else {
+            // Read the RLE data.
+            count = 0;
+            while (count < pixelCount) {
+               len = get8(s);
+               if (len == 128) {
+                  // No-op.
+               } else if (len < 128) {
+                  // Copy next len+1 bytes literally.
+                  len++;
+                  count += len;
+                  while (len) {
+                     *p = get8u(s);
+                     p += 4;
+                     len--;
+                  }
+               } else if (len > 128) {
+                  uint8_t   val;
+                  // Next -len+1 bytes in the dest are replicated from next source byte.
+                  // (Interpret len as a negative 8-bit int.)
+                  len ^= 0x0FF;
+                  len += 2;
+                  val = get8u(s);
+                  count += len;
+                  while (len) {
+                     *p = val;
+                     p += 4;
+                     len--;
+                  }
+               }
+            }
+         }
+      }
+      
+   } else {
+      // We're at the raw image data.  It's each channel in order (Red, Green, Blue, Alpha, ...)
+      // where each channel consists of an 8-bit value for each pixel in the image.
+      
+      // Read the data by channel.
+      for (channel = 0; channel < 4; channel++) {
+         uint8_t *p;
+         
+         p = out + channel;
+         if (channel >= channelCount) {
+            // Fill this channel with default data.
+            for (i = 0; i < pixelCount; i++) *p = channel == 3 ? 255 : 0, p += 4;
+         } else {
+            // Read the data.
+            for (i = 0; i < pixelCount; i++)
+               *p = get8u(s), p += 4;
+         }
+      }
+   }
+
+   if (req_comp && req_comp != 4) {
+      out = convert_format(out, 4, req_comp, w, h);
+      if (out == NULL) return out; // convert_format frees input on failure
+   }
+
+   if (comp) *comp = channelCount;
+   *y = h;
+   *x = w;
+   
+   return out;
 }
 
-static stbi_uc *tga_load(stbi *s, int *x, int *y, int *comp, int req_comp)
+static stbi_uc *stbi_psd_load(stbi *s, int *x, int *y, int *comp, int req_comp)
 {
-       //      read in the TGA header stuff
-       int tga_offset = get8u(s);
-       int tga_indexed = get8u(s);
-       int tga_image_type = get8u(s);
-       int tga_is_RLE = 0;
-       int tga_palette_start = get16le(s);
-       int tga_palette_len = get16le(s);
-       int tga_palette_bits = get8u(s);
-       int tga_x_origin = get16le(s);
-       int tga_y_origin = get16le(s);
-       int tga_width = get16le(s);
-       int tga_height = get16le(s);
-       int tga_bits_per_pixel = get8u(s);
-       int tga_inverted = get8u(s);
-       //      image data
-       unsigned char *tga_data;
-       unsigned char *tga_palette = NULL;
-       int i, j;
-       unsigned char raw_data[4];
-       unsigned char trans_data[4];
-       int RLE_count = 0;
-       int RLE_repeating = 0;
-       int read_next_pixel = 1;
-       //      do a tiny bit of precessing
-       if( tga_image_type >= 8 )
-       {
-               tga_image_type -= 8;
-               tga_is_RLE = 1;
-       }
-       /* int tga_alpha_bits = tga_inverted & 15; */
-       tga_inverted = 1 - ((tga_inverted >> 5) & 1);
-
-       //      error check
-       if( //(tga_indexed) ||
-               (tga_width < 1) || (tga_height < 1) ||
-               (tga_image_type < 1) || (tga_image_type > 3) ||
-               ((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16) &&
-               (tga_bits_per_pixel != 24) && (tga_bits_per_pixel != 32))
-               )
-       {
-               return NULL;
-       }
-
-       //      If I'm paletted, then I'll use the number of bits from the palette
-       if( tga_indexed )
-       {
-               tga_bits_per_pixel = tga_palette_bits;
-       }
-
-       //      tga info
-       *x = tga_width;
-       *y = tga_height;
-       if( (req_comp < 1) || (req_comp > 4) )
-       {
-               //      just use whatever the file was
-               req_comp = tga_bits_per_pixel / 8;
-               *comp = req_comp;
-       } else
-       {
-               //      force a new number of components
-               *comp = tga_bits_per_pixel/8;
-       }
-       tga_data = (unsigned char*)malloc( tga_width * tga_height * req_comp );
-
-       //      skip to the data's starting position (offset usually = 0)
-       skip(s, tga_offset );
-       //      do I need to load a palette?
-       if( tga_indexed )
-       {
-               //      any data to skip? (offset usually = 0)
-               skip(s, tga_palette_start );
-               //      load the palette
-               tga_palette = (unsigned char*)malloc( tga_palette_len * tga_palette_bits / 8 );
-               getn(s, tga_palette, tga_palette_len * tga_palette_bits / 8 );
-       }
-       //      load the data
-       for( i = 0; i < tga_width * tga_height; ++i )
-       {
-               //      if I'm in RLE mode, do I need to get a RLE chunk?
-               if( tga_is_RLE )
-               {
-                       if( RLE_count == 0 )
-                       {
-                               //      yep, get the next byte as a RLE command
-                               int RLE_cmd = get8u(s);
-                               RLE_count = 1 + (RLE_cmd & 127);
-                               RLE_repeating = RLE_cmd >> 7;
-                               read_next_pixel = 1;
-                       } else if( !RLE_repeating )
-                       {
-                               read_next_pixel = 1;
-                       }
-               } else
-               {
-                       read_next_pixel = 1;
-               }
-               //      OK, if I need to read a pixel, do it now
-               if( read_next_pixel )
-               {
-                       //      load however much data we did have
-                       if( tga_indexed )
-                       {
-                               //      read in 1 byte, then perform the lookup
-                               int pal_idx = get8u(s);
-                               if( pal_idx >= tga_palette_len )
-                               {
-                                       //      invalid index
-                                       pal_idx = 0;
-                               }
-                               pal_idx *= tga_bits_per_pixel / 8;
-                               for( j = 0; j*8 < tga_bits_per_pixel; ++j )
-                               {
-                                       raw_data[j] = tga_palette[pal_idx+j];
-                               }
-                       } else
-                       {
-                               //      read in the data raw
-                               for( j = 0; j*8 < tga_bits_per_pixel; ++j )
-                               {
-                                       raw_data[j] = get8u(s);
-                               }
-                       }
-                       //      convert raw to the intermediate format
-                       switch( tga_bits_per_pixel )
-                       {
-                       case 8:
-                               //      Luminous => RGBA
-                               trans_data[0] = raw_data[0];
-                               trans_data[1] = raw_data[0];
-                               trans_data[2] = raw_data[0];
-                               trans_data[3] = 255;
-                               break;
-                       case 16:
-                               //      Luminous,Alpha => RGBA
-                               trans_data[0] = raw_data[0];
-                               trans_data[1] = raw_data[0];
-                               trans_data[2] = raw_data[0];
-                               trans_data[3] = raw_data[1];
-                               break;
-                       case 24:
-                               //      BGR => RGBA
-                               trans_data[0] = raw_data[2];
-                               trans_data[1] = raw_data[1];
-                               trans_data[2] = raw_data[0];
-                               trans_data[3] = 255;
-                               break;
-                       case 32:
-                               //      BGRA => RGBA
-                               trans_data[0] = raw_data[2];
-                               trans_data[1] = raw_data[1];
-                               trans_data[2] = raw_data[0];
-                               trans_data[3] = raw_data[3];
-                               break;
-                        default:
-                               trans_data[0] = trans_data[1] = 0;
-                               trans_data[2] = trans_data[3] = 0;
-                               break;
-                       }
-                       //      clear the reading flag for the next pixel
-                       read_next_pixel = 0;
-               } // end of reading a pixel
-               //      convert to final format
-               switch( req_comp )
-               {
-               case 1:
-                       //      RGBA => Luminance
-                       tga_data[i*req_comp+0] = compute_y(trans_data[0],trans_data[1],trans_data[2]);
-                       break;
-               case 2:
-                       //      RGBA => Luminance,Alpha
-                       tga_data[i*req_comp+0] = compute_y(trans_data[0],trans_data[1],trans_data[2]);
-                       tga_data[i*req_comp+1] = trans_data[3];
-                       break;
-               case 3:
-                       //      RGBA => RGB
-                       tga_data[i*req_comp+0] = trans_data[0];
-                       tga_data[i*req_comp+1] = trans_data[1];
-                       tga_data[i*req_comp+2] = trans_data[2];
-                       break;
-               case 4:
-                       //      RGBA => RGBA
-                       tga_data[i*req_comp+0] = trans_data[0];
-                       tga_data[i*req_comp+1] = trans_data[1];
-                       tga_data[i*req_comp+2] = trans_data[2];
-                       tga_data[i*req_comp+3] = trans_data[3];
-                       break;
-               }
-               //      in case we're in RLE mode, keep counting down
-               --RLE_count;
-       }
-       //      do I need to invert the image?
-       if( tga_inverted )
-       {
-               for( j = 0; j*2 < tga_height; ++j )
-               {
-                       int index1 = j * tga_width * req_comp;
-                       int index2 = (tga_height - 1 - j) * tga_width * req_comp;
-                       for( i = tga_width * req_comp; i > 0; --i )
-                       {
-                               unsigned char temp = tga_data[index1];
-                               tga_data[index1] = tga_data[index2];
-                               tga_data[index2] = temp;
-                               ++index1;
-                               ++index2;
-                       }
-               }
-       }
-       //      clear my palette, if I had one
-       if( tga_palette != NULL )
-       {
-               free( tga_palette );
-       }
-       //      the things I do to get rid of an error message, and yet keep
-       //      Microsoft's C compilers happy... [8^(
-       tga_palette_start = tga_palette_len = tga_palette_bits =
-                       tga_x_origin = tga_y_origin = 0;
-       //      OK, done
-       return tga_data;
+   return psd_load(s,x,y,comp,req_comp);
 }
 
-#ifndef STBI_NO_STDIO
-stbi_uc *stbi_tga_load             (char const *filename,           int *x, int *y, int *comp, int req_comp)
+// *************************************************************************************************
+// Softimage PIC loader
+// by Tom Seddon
+//
+// See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format
+// See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/
+
+static int pic_is4(stbi *s,const char *str)
 {
-   stbi_uc *data;
-   FILE *f = fopen(filename, "rb");
-   if (!f) return NULL;
-   data = stbi_tga_load_from_file(f, x,y,comp,req_comp);
-   fclose(f);
-   return data;
+   int i;
+   for (i=0; i<4; ++i)
+      if (get8(s) != (stbi_uc)str[i])
+         return 0;
+
+   return 1;
 }
 
-stbi_uc *stbi_tga_load_from_file   (FILE *f,                  int *x, int *y, int *comp, int req_comp)
+static int pic_test(stbi *s)
 {
-   stbi s;
-   start_file(&s, f);
-   return tga_load(&s, x,y,comp,req_comp);
+   int i;
+
+   if (!pic_is4(s,"\x53\x80\xF6\x34"))
+      return 0;
+
+   for(i=0;i<84;++i)
+      get8(s);
+
+   if (!pic_is4(s,"PICT"))
+      return 0;
+
+   return 1;
 }
-#endif
 
-stbi_uc *stbi_tga_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
+typedef struct
 {
-   stbi s;
-   start_mem(&s, buffer, len);
-   return tga_load(&s, x,y,comp,req_comp);
+   stbi_uc size,type,channel;
+} pic_packet_t;
+
+static stbi_uc *pic_readval(stbi *s, int channel, stbi_uc *dest)
+{
+   int mask=0x80, i;
+
+   for (i=0; i<4; ++i, mask>>=1) {
+      if (channel & mask) {
+         if (at_eof(s)) return epuc("bad file","PIC file too short");
+         dest[i]=get8u(s);
+      }
+   }
+
+   return dest;
 }
 
+static void pic_copyval(int channel,stbi_uc *dest,const stbi_uc *src)
+{
+   int mask=0x80,i;
 
-// *************************************************************************************************
-// Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicholas Schulz, tweaked by STB
+   for (i=0;i<4; ++i, mask>>=1)
+      if (channel&mask)
+         dest[i]=src[i];
+}
 
-static int psd_test(stbi *s)
+static stbi_uc *pic_load2(stbi *s,int width,int height,int *comp, stbi_uc *result)
 {
-       if (get32(s) != 0x38425053) return 0;   // "8BPS"
-       else return 1;
+   int act_comp=0,num_packets=0,y,chained;
+   pic_packet_t packets[10];
+
+   // this will (should...) cater for even some bizarre stuff like having data
+   // for the same channel in multiple packets.
+   do {
+      pic_packet_t *packet;
+
+      if (num_packets==sizeof(packets)/sizeof(packets[0]))
+         return epuc("bad format","too many packets");
+
+      packet = &packets[num_packets++];
+
+      chained = get8(s);
+      packet->size    = get8u(s);
+      packet->type    = get8u(s);
+      packet->channel = get8u(s);
+
+      act_comp |= packet->channel;
+
+      if (at_eof(s))          return epuc("bad file","file too short (reading packets)");
+      if (packet->size != 8)  return epuc("bad format","packet isn't 8bpp");
+   } while (chained);
+
+   *comp = (act_comp & 0x10 ? 4 : 3); // has alpha channel?
+
+   for(y=0; y<height; ++y) {
+      int packet_idx;
+
+      for(packet_idx=0; packet_idx < num_packets; ++packet_idx) {
+         pic_packet_t *packet = &packets[packet_idx];
+         stbi_uc *dest = result+y*width*4;
+
+         switch (packet->type) {
+            default:
+               return epuc("bad format","packet has bad compression type");
+
+            case 0: {//uncompressed
+               int x;
+
+               for(x=0;x<width;++x, dest+=4)
+                  if (!pic_readval(s,packet->channel,dest))
+                     return 0;
+               break;
+            }
+
+            case 1://Pure RLE
+               {
+                  int left=width, i;
+
+                  while (left>0) {
+                     stbi_uc count,value[4];
+
+                     count=get8u(s);
+                     if (at_eof(s))   return epuc("bad file","file too short (pure read count)");
+
+                     if (count > left)
+                        count = (uint8_t) left;
+
+                     if (!pic_readval(s,packet->channel,value))  return 0;
+
+                     for(i=0; i<count; ++i,dest+=4)
+                        pic_copyval(packet->channel,dest,value);
+                     left -= count;
+                  }
+               }
+               break;
+
+            case 2: {//Mixed RLE
+               int left=width;
+               while (left>0) {
+                  int count = get8(s), i;
+                  if (at_eof(s))  return epuc("bad file","file too short (mixed read count)");
+
+                  if (count >= 128) { // Repeated
+                     stbi_uc value[4];
+                     int i;
+
+                     if (count==128)
+                        count = get16(s);
+                     else
+                        count -= 127;
+                     if (count > left)
+                        return epuc("bad file","scanline overrun");
+
+                     if (!pic_readval(s,packet->channel,value))
+                        return 0;
+
+                     for(i=0;i<count;++i, dest += 4)
+                        pic_copyval(packet->channel,dest,value);
+                  } else { // Raw
+                     ++count;
+                     if (count>left) return epuc("bad file","scanline overrun");
+
+                     for(i=0;i<count;++i, dest+=4)
+                        if (!pic_readval(s,packet->channel,dest))
+                           return 0;
+                  }
+                  left-=count;
+               }
+               break;
+            }
+         }
+      }
+   }
+
+   return result;
 }
 
-#ifndef STBI_NO_STDIO
-int stbi_psd_test_file(FILE *f)
+static stbi_uc *pic_load(stbi *s,int *px,int *py,int *comp,int req_comp)
 {
-   stbi s;
-   int r,n = ftell(f);
-   start_file(&s, f);
-   r = psd_test(&s);
-   fseek(f,n,SEEK_SET);
+   stbi_uc *result;
+   int i, x,y;
+
+   for (i=0; i<92; ++i)
+      get8(s);
+
+   x = get16(s);
+   y = get16(s);
+   if (at_eof(s))  return epuc("bad file","file too short (pic header)");
+   if ((1 << 28) / x < y) return epuc("too large", "Image too large to decode");
+
+   get32(s); //skip `ratio'
+   get16(s); //skip `fields'
+   get16(s); //skip `pad'
+
+   // intermediate buffer is RGBA
+   result = (stbi_uc *) malloc(x*y*4);
+   memset(result, 0xff, x*y*4);
+
+   if (!pic_load2(s,x,y,comp, result)) {
+      free(result);
+      result=0;
+   }
+   *px = x;
+   *py = y;
+   if (req_comp == 0) req_comp = *comp;
+   result=convert_format(result,4,req_comp,x,y);
+
+   return result;
+}
+
+static int stbi_pic_test(stbi *s)
+{
+   int r = pic_test(s);
+   stbi_rewind(s);
    return r;
 }
-#endif
 
-int stbi_psd_test_memory(stbi_uc const *buffer, int len)
+static stbi_uc *stbi_pic_load(stbi *s, int *x, int *y, int *comp, int req_comp)
 {
-   stbi s;
-   start_mem(&s, buffer, len);
-   return psd_test(&s);
+   return pic_load(s,x,y,comp,req_comp);
 }
 
-static stbi_uc *psd_load(stbi *s, int *x, int *y, int *comp, int req_comp)
+// *************************************************************************************************
+// GIF loader -- public domain by Jean-Marc Lienher -- simplified/shrunk by stb
+typedef struct stbi_gif_lzw_struct {
+   int16_t prefix;
+   uint8_t first;
+   uint8_t suffix;
+} stbi_gif_lzw;
+
+typedef struct stbi_gif_struct
 {
-       int     pixelCount;
-       int channelCount, compression;
-       int channel, i, count, len;
    int w,h;
-   uint8_t *out;
+   stbi_uc *out;                 // output buffer (always 4 components)
+   int flags, bgindex, ratio, transparent, eflags;
+   uint8_t  pal[256][4];
+   uint8_t lpal[256][4];
+   stbi_gif_lzw codes[4096];
+   uint8_t *color_table;
+   int parse, step;
+   int lflags;
+   int start_x, start_y;
+   int max_x, max_y;
+   int cur_x, cur_y;
+   int line_size;
+} stbi_gif;
+
+static int gif_test(stbi *s)
+{
+   int sz;
+   if (get8(s) != 'G' || get8(s) != 'I' || get8(s) != 'F' || get8(s) != '8') return 0;
+   sz = get8(s);
+   if (sz != '9' && sz != '7') return 0;
+   if (get8(s) != 'a') return 0;
+   return 1;
+}
 
-       // Check identifier
-       if (get32(s) != 0x38425053)     // "8BPS"
-               return epuc("not PSD", "Corrupt PSD image");
+static int stbi_gif_test(stbi *s)
+{
+   int r = gif_test(s);
+   stbi_rewind(s);
+   return r;
+}
+
+static void stbi_gif_parse_colortable(stbi *s, uint8_t pal[256][4], int num_entries, int transp)
+{
+   int i;
+   for (i=0; i < num_entries; ++i) {
+      pal[i][2] = get8u(s);
+      pal[i][1] = get8u(s);
+      pal[i][0] = get8u(s);
+      pal[i][3] = transp ? 0 : 255;
+   }   
+}
 
-       // Check file type version.
-       if (get16(s) != 1)
-               return epuc("wrong version", "Unsupported version of PSD image");
+static int stbi_gif_header(stbi *s, stbi_gif *g, int *comp, int is_info)
+{
+   uint8_t version;
+   if (get8(s) != 'G' || get8(s) != 'I' || get8(s) != 'F' || get8(s) != '8')
+      return e("not GIF", "Corrupt GIF");
 
-       // Skip 6 reserved bytes.
-       skip(s, 6 );
+   version = get8u(s);
+   if (version != '7' && version != '9')    return e("not GIF", "Corrupt GIF");
+   if (get8(s) != 'a')                      return e("not GIF", "Corrupt GIF");
+ 
+   failure_reason = "";
+   g->w = get16le(s);
+   g->h = get16le(s);
+   g->flags = get8(s);
+   g->bgindex = get8(s);
+   g->ratio = get8(s);
+   g->transparent = -1;
 
-       // Read the number of channels (R, G, B, A, etc).
-       channelCount = get16(s);
-       if (channelCount < 0 || channelCount > 16)
-               return epuc("wrong channel count", "Unsupported number of channels in PSD image");
+   if (comp != 0) *comp = 4;  // can't actually tell whether it's 3 or 4 until we parse the comments
 
-       // Read the rows and columns of the image.
-   h = get32(s);
-   w = get32(s);
+   if (is_info) return 1;
 
-       // Make sure the depth is 8 bits.
-       if (get16(s) != 8)
-               return epuc("unsupported bit depth", "PSD bit depth is not 8 bit");
-
-       // Make sure the color mode is RGB.
-       // Valid options are:
-       //   0: Bitmap
-       //   1: Grayscale
-       //   2: Indexed color
-       //   3: RGB color
-       //   4: CMYK color
-       //   7: Multichannel
-       //   8: Duotone
-       //   9: Lab color
-       if (get16(s) != 3)
-               return epuc("wrong color format", "PSD is not in RGB color format");
-
-       // Skip the Mode Data.  (It's the palette for indexed color; other info for other modes.)
-       skip(s,get32(s) );
-
-       // Skip the image resources.  (resolution, pen tool paths, etc)
-       skip(s, get32(s) );
-
-       // Skip the reserved data.
-       skip(s, get32(s) );
-
-       // Find out if the data is compressed.
-       // Known values:
-       //   0: no compression
-       //   1: RLE compressed
-       compression = get16(s);
-       if (compression > 1)
-               return epuc("bad compression", "PSD has an unknown compression format");
-
-       // Create the destination image.
-       out = (stbi_uc *) malloc(4 * w*h);
-       if (!out) return epuc("outofmem", "Out of memory");
-   pixelCount = w*h;
+   if (g->flags & 0x80)
+      stbi_gif_parse_colortable(s,g->pal, 2 << (g->flags & 7), -1);
+
+   return 1;
+}
 
-       // Initialize the data to zero.
-       //memset( out, 0, pixelCount * 4 );
+static int stbi_gif_info_raw(stbi *s, int *x, int *y, int *comp)
+{
+   stbi_gif g;   
+   if (!stbi_gif_header(s, &g, comp, 1)) {
+      stbi_rewind( s );
+      return 0;
+   }
+   if (x) *x = g.w;
+   if (y) *y = g.h;
+   return 1;
+}
 
-       // Finally, the image data.
-       if (compression) {
-               // RLE as used by .PSD and .TIFF
-               // Loop until you get the number of unpacked bytes you are expecting:
-               //     Read the next source byte into n.
-               //     If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
-               //     Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
-               //     Else if n is 128, noop.
-               // Endloop
+static void stbi_out_gif_code(stbi_gif *g, uint16_t code)
+{
+   uint8_t *p, *c;
 
-               // The RLE-compressed data is preceeded by a 2-byte data count for each row in the data,
-               // which we're going to just skip.
-               skip(s, h * channelCount * 2 );
+   // recurse to decode the prefixes, since the linked-list is backwards,
+   // and working backwards through an interleaved image would be nasty
+   if (g->codes[code].prefix >= 0)
+      stbi_out_gif_code(g, g->codes[code].prefix);
 
-               // Read the RLE data by channel.
-               for (channel = 0; channel < 4; channel++) {
-                       uint8_t *p;
+   if (g->cur_y >= g->max_y) return;
+  
+   p = &g->out[g->cur_x + g->cur_y];
+   c = &g->color_table[g->codes[code].suffix * 4];
 
-         p = out+channel;
-                       if (channel >= channelCount) {
-                               // Fill this channel with default data.
-                               for (i = 0; i < pixelCount; i++) *p = (channel == 3 ? 255 : 0), p += 4;
-                       } else {
-                               // Read the RLE data.
-                               count = 0;
-                               while (count < pixelCount) {
-                                       len = get8(s);
-                                       if (len == 128) {
-                                               // No-op.
-                                       } else if (len < 128) {
-                                               // Copy next len+1 bytes literally.
-                                               len++;
-                                               count += len;
-                                               while (len) {
-                                                       *p = get8(s);
-                     p += 4;
-                                                       len--;
-                                               }
-                                       } else if (len > 128) {
-                                               uint32_t        val;
-                                               // Next -len+1 bytes in the dest are replicated from next source byte.
-                                               // (Interpret len as a negative 8-bit int.)
-                                               len ^= 0x0FF;
-                                               len += 2;
-                  val = get8(s);
-                                               count += len;
-                                               while (len) {
-                                                       *p = val;
-                     p += 4;
-                                                       len--;
-                                               }
-                                       }
-                               }
-                       }
-               }
+   if (c[3] >= 128) {
+      p[0] = c[2];
+      p[1] = c[1];
+      p[2] = c[0];
+      p[3] = c[3];
+   }
+   g->cur_x += 4;
 
-       } else {
-               // We're at the raw image data.  It's each channel in order (Red, Green, Blue, Alpha, ...)
-               // where each channel consists of an 8-bit value for each pixel in the image.
+   if (g->cur_x >= g->max_x) {
+      g->cur_x = g->start_x;
+      g->cur_y += g->step;
 
-               // Read the data by channel.
-               for (channel = 0; channel < 4; channel++) {
-                       uint8_t *p;
+      while (g->cur_y >= g->max_y && g->parse > 0) {
+         g->step = (1 << g->parse) * g->line_size;
+         g->cur_y = g->start_y + (g->step >> 1);
+         --g->parse;
+      }
+   }
+}
 
-         p = out + channel;
-                       if (channel > channelCount) {
-                               // Fill this channel with default data.
-                               for (i = 0; i < pixelCount; i++) *p = channel == 3 ? 255 : 0, p += 4;
-                       } else {
-                               // Read the data.
-                               count = 0;
-                               for (i = 0; i < pixelCount; i++)
-                                       *p = get8(s), p += 4;
-                       }
-               }
-       }
-
-       if (req_comp && req_comp != 4) {
-               out = convert_format(out, 4, req_comp, w, h);
-               if (out == NULL) return out; // convert_format frees input on failure
-       }
-
-       if (comp) *comp = channelCount;
-       *y = h;
-       *x = w;
-
-       return out;
+static uint8_t *stbi_process_gif_raster(stbi *s, stbi_gif *g)
+{
+   uint8_t lzw_cs;
+   int32_t len, code;
+   uint32_t first;
+   int32_t codesize, codemask, avail, oldcode, bits, valid_bits, clear;
+   stbi_gif_lzw *p;
+
+   lzw_cs = get8u(s);
+   clear = 1 << lzw_cs;
+   first = 1;
+   codesize = lzw_cs + 1;
+   codemask = (1 << codesize) - 1;
+   bits = 0;
+   valid_bits = 0;
+   for (code = 0; code < clear; code++) {
+      g->codes[code].prefix = -1;
+      g->codes[code].first = (uint8_t) code;
+      g->codes[code].suffix = (uint8_t) code;
+   }
+
+   // support no starting clear code
+   avail = clear+2;
+   oldcode = -1;
+
+   len = 0;
+   for(;;) {
+      if (valid_bits < codesize) {
+         if (len == 0) {
+            len = get8(s); // start new block
+            if (len == 0) 
+               return g->out;
+         }
+         --len;
+         bits |= (int32_t) get8(s) << valid_bits;
+         valid_bits += 8;
+      } else {
+         int32_t code = bits & codemask;
+         bits >>= codesize;
+         valid_bits -= codesize;
+         // @OPTIMIZE: is there some way we can accelerate the non-clear path?
+         if (code == clear) {  // clear code
+            codesize = lzw_cs + 1;
+            codemask = (1 << codesize) - 1;
+            avail = clear + 2;
+            oldcode = -1;
+            first = 0;
+         } else if (code == clear + 1) { // end of stream code
+            skip(s, len);
+            while ((len = get8(s)) > 0)
+               skip(s,len);
+            return g->out;
+         } else if (code <= avail) {
+            if (first) return epuc("no clear code", "Corrupt GIF");
+
+            if (oldcode >= 0) {
+               p = &g->codes[avail++];
+               if (avail > 4096)        return epuc("too many codes", "Corrupt GIF");
+               p->prefix = (int16_t) oldcode;
+               p->first = g->codes[oldcode].first;
+               p->suffix = (code == avail) ? p->first : g->codes[code].first;
+            } else if (code == avail)
+               return epuc("illegal code in raster", "Corrupt GIF");
+
+            stbi_out_gif_code(g, (uint16_t) code);
+
+            if ((avail & codemask) == 0 && avail <= 0x0FFF) {
+               codesize++;
+               codemask = (1 << codesize) - 1;
+            }
+
+            oldcode = code;
+         } else {
+            return epuc("illegal code in raster", "Corrupt GIF");
+         }
+      } 
+   }
 }
 
-#ifndef STBI_NO_STDIO
-stbi_uc *stbi_psd_load(char const *filename, int *x, int *y, int *comp, int req_comp)
+static void stbi_fill_gif_background(stbi_gif *g)
 {
-   stbi_uc *data;
-   FILE *f = fopen(filename, "rb");
-   if (!f) return NULL;
-   data = stbi_psd_load_from_file(f, x,y,comp,req_comp);
-   fclose(f);
-   return data;
+   int i;
+   uint8_t *c = g->pal[g->bgindex];
+   // @OPTIMIZE: write a dword at a time
+   for (i = 0; i < g->w * g->h * 4; i += 4) {
+      uint8_t *p  = &g->out[i];
+      p[0] = c[2];
+      p[1] = c[1];
+      p[2] = c[0];
+      p[3] = c[3];
+   }
 }
 
-stbi_uc *stbi_psd_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
+// this function is designed to support animated gifs, although stb_image doesn't support it
+static uint8_t *stbi_gif_load_next(stbi *s, stbi_gif *g, int *comp, int req_comp)
 {
-   stbi s;
-   start_file(&s, f);
-   return psd_load(&s, x,y,comp,req_comp);
+   int i;
+   uint8_t *old_out = 0;
+
+   if (g->out == 0) {
+      if (!stbi_gif_header(s, g, comp,0))     return 0; // failure_reason set by stbi_gif_header
+      g->out = (uint8_t *) malloc(4 * g->w * g->h);
+      if (g->out == 0)                      return epuc("outofmem", "Out of memory");
+      stbi_fill_gif_background(g);
+   } else {
+      // animated-gif-only path
+      if (((g->eflags & 0x1C) >> 2) == 3) {
+         old_out = g->out;
+         g->out = (uint8_t *) malloc(4 * g->w * g->h);
+         if (g->out == 0)                   return epuc("outofmem", "Out of memory");
+         memcpy(g->out, old_out, g->w*g->h*4);
+      }
+   }
+    
+   for (;;) {
+      switch (get8(s)) {
+         case 0x2C: /* Image Descriptor */
+         {
+            int32_t x, y, w, h;
+            uint8_t *o;
+
+            x = get16le(s);
+            y = get16le(s);
+            w = get16le(s);
+            h = get16le(s);
+            if (((x + w) > (g->w)) || ((y + h) > (g->h)))
+               return epuc("bad Image Descriptor", "Corrupt GIF");
+
+            g->line_size = g->w * 4;
+            g->start_x = x * 4;
+            g->start_y = y * g->line_size;
+            g->max_x   = g->start_x + w * 4;
+            g->max_y   = g->start_y + h * g->line_size;
+            g->cur_x   = g->start_x;
+            g->cur_y   = g->start_y;
+
+            g->lflags = get8(s);
+
+            if (g->lflags & 0x40) {
+               g->step = 8 * g->line_size; // first interlaced spacing
+               g->parse = 3;
+            } else {
+               g->step = g->line_size;
+               g->parse = 0;
+            }
+
+            if (g->lflags & 0x80) {
+               stbi_gif_parse_colortable(s,g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1);
+               g->color_table = (uint8_t *) g->lpal;       
+            } else if (g->flags & 0x80) {
+               for (i=0; i < 256; ++i)  // @OPTIMIZE: reset only the previous transparent
+                  g->pal[i][3] = 255; 
+               if (g->transparent >= 0 && (g->eflags & 0x01))
+                  g->pal[g->transparent][3] = 0;
+               g->color_table = (uint8_t *) g->pal;
+            } else
+               return epuc("missing color table", "Corrupt GIF");
+   
+            o = stbi_process_gif_raster(s, g);
+            if (o == NULL) return NULL;
+
+            if (req_comp && req_comp != 4)
+               o = convert_format(o, 4, req_comp, g->w, g->h);
+            return o;
+         }
+
+         case 0x21: // Comment Extension.
+         {
+            int len;
+            if (get8(s) == 0xF9) { // Graphic Control Extension.
+               len = get8(s);
+               if (len == 4) {
+                  g->eflags = get8(s);
+                  get16le(s); // delay
+                  g->transparent = get8(s);
+               } else {
+                  skip(s, len);
+                  break;
+               }
+            }
+            while ((len = get8(s)) != 0)
+               skip(s, len);
+            break;
+         }
+
+         case 0x3B: // gif stream termination code
+            return (uint8_t *) 1;
+
+         default:
+            return epuc("unknown code", "Corrupt GIF");
+      }
+   }
 }
-#endif
 
-stbi_uc *stbi_psd_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
+static stbi_uc *stbi_gif_load(stbi *s, int *x, int *y, int *comp, int req_comp)
 {
-   stbi s;
-   start_mem(&s, buffer, len);
-   return psd_load(&s, x,y,comp,req_comp);
+   uint8_t *u = 0;
+   stbi_gif g={0};
+
+   u = stbi_gif_load_next(s, &g, comp, req_comp);
+   if (u == (void *) 1) u = 0;  // end of animated gif marker
+   if (u) {
+      *x = g.w;
+      *y = g.h;
+   }
+
+   return u;
+}
+
+static int stbi_gif_info(stbi *s, int *x, int *y, int *comp)
+{
+   return stbi_gif_info_raw(s,x,y,comp);
 }
 
 
@@ -3575,62 +4197,50 @@ stbi_uc *stbi_psd_load_from_memory (stbi_uc const *buffer, int len, int *x, int
 #ifndef STBI_NO_HDR
 static int hdr_test(stbi *s)
 {
-   char *signature = "#?RADIANCE\n";
+   const char *signature = "#?RADIANCE\n";
    int i;
    for (i=0; signature[i]; ++i)
       if (get8(s) != signature[i])
          return 0;
-       return 1;
-}
-
-int stbi_hdr_test_memory(stbi_uc const *buffer, int len)
-{
-   stbi s;
-       start_mem(&s, buffer, len);
-       return hdr_test(&s);
+   return 1;
 }
 
-#ifndef STBI_NO_STDIO
-int stbi_hdr_test_file(FILE *f)
+static int stbi_hdr_test(stbi* s)
 {
-   stbi s;
-   int r,n = ftell(f);
-   start_file(&s, f);
-   r = hdr_test(&s);
-   fseek(f,n,SEEK_SET);
+   int r = hdr_test(s);
+   stbi_rewind(s);
    return r;
 }
-#endif
 
 #define HDR_BUFLEN  1024
 static char *hdr_gettoken(stbi *z, char *buffer)
 {
    int len=0;
-       char c = '\2';
+   char c = '\0';
 
-   c = get8(z);
+   c = (char) get8(z);
 
-       while (!at_eof(z) && c != '\n') {
-               buffer[len++] = c;
+   while (!at_eof(z) && c != '\n') {
+      buffer[len++] = c;
       if (len == HDR_BUFLEN-1) {
          // flush to end of line
          while (!at_eof(z) && get8(z) != '\n')
             ;
          break;
       }
-      c = get8(z);
-       }
+      c = (char) get8(z);
+   }
 
    buffer[len] = 0;
-       return buffer;
+   return buffer;
 }
 
 static void hdr_convert(float *output, stbi_uc *input, int req_comp)
 {
-       if( input[3] != 0 ) {
+   if ( input[3] != 0 ) {
       float f1;
-               // Exponent
-               f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8));
+      // Exponent
+      f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8));
       if (req_comp <= 2)
          output[0] = (input[0] + input[1] + input[2]) * f1 / 3;
       else {
@@ -3640,7 +4250,7 @@ static void hdr_convert(float *output, stbi_uc *input, int req_comp)
       }
       if (req_comp == 2) output[1] = 1;
       if (req_comp == 4) output[3] = 1;
-       } else {
+   } else {
       switch (req_comp) {
          case 4: output[3] = 1; /* fallthrough */
          case 3: output[0] = output[1] = output[2] = 0;
@@ -3649,35 +4259,34 @@ static void hdr_convert(float *output, stbi_uc *input, int req_comp)
          case 1: output[0] = 0;
                  break;
       }
-       }
+   }
 }
 
-
 static float *hdr_load(stbi *s, int *x, int *y, int *comp, int req_comp)
 {
    char buffer[HDR_BUFLEN];
-       char *token;
-       int valid = 0;
-       int width, height;
+   char *token;
+   int valid = 0;
+   int width, height;
    stbi_uc *scanline;
-       float *hdr_data;
-       int len;
-       unsigned char count, value;
-       int i, j, k, c1,c2, z;
+   float *hdr_data;
+   int len;
+   unsigned char count, value;
+   int i, j, k, c1,c2, z;
 
 
-       // Check identifier
-       if (strcmp(hdr_gettoken(s,buffer), "#?RADIANCE") != 0)
-               return epf("not HDR", "Corrupt HDR image");
-
-       // Parse header
-       while(1) {
-               token = hdr_gettoken(s,buffer);
+   // Check identifier
+   if (strcmp(hdr_gettoken(s,buffer), "#?RADIANCE") != 0)
+      return epf("not HDR", "Corrupt HDR image");
+   
+   // Parse header
+   for(;;) {
+      token = hdr_gettoken(s,buffer);
       if (token[0] == 0) break;
-               if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
+      if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
    }
 
-       if (!valid)    return epf("unsupported format", "Unsupported HDR format");
+   if (!valid)    return epf("unsupported format", "Unsupported HDR format");
 
    // Parse width and height
    // can't use sscanf() if we're not using stdio!
@@ -3690,19 +4299,19 @@ static float *hdr_load(stbi *s, int *x, int *y, int *comp, int req_comp)
    token += 3;
    width = strtol(token, NULL, 10);
 
-       *x = width;
-       *y = height;
+   *x = width;
+   *y = height;
 
    *comp = 3;
-       if (req_comp == 0) req_comp = 3;
+   if (req_comp == 0) req_comp = 3;
 
-       // Read data
-       hdr_data = (float *) malloc(height * width * req_comp * sizeof(float));
+   // Read data
+   hdr_data = (float *) malloc(height * width * req_comp * sizeof(float));
 
-       // Load image data
+   // Load image data
    // image data is stored as some number of sca
-       if( width < 8 || width >= 32768) {
-               // Read flat data
+   if ( width < 8 || width >= 32768) {
+      // Read flat data
       for (j=0; j < height; ++j) {
          for (i=0; i < width; ++i) {
             stbi_uc rgbe[4];
@@ -3711,178 +4320,354 @@ static float *hdr_load(stbi *s, int *x, int *y, int *comp, int req_comp)
             hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp);
          }
       }
-       } else {
-               // Read RLE-encoded data
-               scanline = NULL;
+   } else {
+      // Read RLE-encoded data
+      scanline = NULL;
 
-               for (j = 0; j < height; ++j) {
+      for (j = 0; j < height; ++j) {
          c1 = get8(s);
          c2 = get8(s);
          len = get8(s);
          if (c1 != 2 || c2 != 2 || (len & 0x80)) {
             // not run-length encoded, so we have to actually use THIS data as a decoded
             // pixel (note this can't be a valid pixel--one of RGB must be >= 128)
-            stbi_uc rgbe[4] = { c1,c2,len, get8(s) };
+            uint8_t rgbe[4];
+            rgbe[0] = (uint8_t) c1;
+            rgbe[1] = (uint8_t) c2;
+            rgbe[2] = (uint8_t) len;
+            rgbe[3] = (uint8_t) get8u(s);
             hdr_convert(hdr_data, rgbe, req_comp);
             i = 1;
             j = 0;
             free(scanline);
-            goto main_decode_loop; // yes, this is fucking insane; blame the fucking insane format
+            goto main_decode_loop; // yes, this makes no sense
          }
          len <<= 8;
          len |= get8(s);
          if (len != width) { free(hdr_data); free(scanline); return epf("invalid decoded scanline length", "corrupt HDR"); }
          if (scanline == NULL) scanline = (stbi_uc *) malloc(width * 4);
-
-                       for (k = 0; k < 4; ++k) {
-                               i = 0;
-                               while (i < width) {
-                                       count = get8(s);
-                                       if (count > 128) {
-                                               // Run
-                                               value = get8(s);
+            
+         for (k = 0; k < 4; ++k) {
+            i = 0;
+            while (i < width) {
+               count = get8u(s);
+               if (count > 128) {
+                  // Run
+                  value = get8u(s);
                   count -= 128;
-                                               for (z = 0; z < count; ++z)
-                                                       scanline[i++ * 4 + k] = value;
-                                       } else {
-                                               // Dump
-                                               for (z = 0; z < count; ++z)
-                                                       scanline[i++ * 4 + k] = get8(s);
-                                       }
-                               }
-                       }
+                  for (z = 0; z < count; ++z)
+                     scanline[i++ * 4 + k] = value;
+               } else {
+                  // Dump
+                  for (z = 0; z < count; ++z)
+                     scanline[i++ * 4 + k] = get8u(s);
+               }
+            }
+         }
          for (i=0; i < width; ++i)
             hdr_convert(hdr_data+(j*width + i)*req_comp, scanline + i*4, req_comp);
-               }
+      }
       free(scanline);
-       }
+   }
 
    return hdr_data;
 }
 
-#ifndef STBI_NO_STDIO
-float *stbi_hdr_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
+static float *stbi_hdr_load(stbi *s, int *x, int *y, int *comp, int req_comp)
 {
-   stbi s;
-   start_file(&s,f);
-   return hdr_load(&s,x,y,comp,req_comp);
+   return hdr_load(s,x,y,comp,req_comp);
 }
-#endif
 
-float *stbi_hdr_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
+static int stbi_hdr_info(stbi *s, int *x, int *y, int *comp)
 {
-   stbi s;
-   start_mem(&s,buffer, len);
-   return hdr_load(&s,x,y,comp,req_comp);
-}
-
-#endif // STBI_NO_HDR
+   char buffer[HDR_BUFLEN];
+   char *token;
+   int valid = 0;
 
-/////////////////////// write image ///////////////////////
+   if (strcmp(hdr_gettoken(s,buffer), "#?RADIANCE") != 0) {
+       stbi_rewind( s );
+       return 0;
+   }
 
-#ifndef STBI_NO_WRITE
+   for(;;) {
+      token = hdr_gettoken(s,buffer);
+      if (token[0] == 0) break;
+      if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
+   }
 
-static void write8(FILE *f, int x) { uint8_t z = (uint8_t) x; fwrite(&z,1,1,f); }
+   if (!valid) {
+       stbi_rewind( s );
+       return 0;
+   }
+   token = hdr_gettoken(s,buffer);
+   if (strncmp(token, "-Y ", 3)) {
+       stbi_rewind( s );
+       return 0;
+   }
+   token += 3;
+   *y = strtol(token, &token, 10);
+   while (*token == ' ') ++token;
+   if (strncmp(token, "+X ", 3)) {
+       stbi_rewind( s );
+       return 0;
+   }
+   token += 3;
+   *x = strtol(token, NULL, 10);
+   *comp = 3;
+   return 1;
+}
+#endif // STBI_NO_HDR
 
-static void writefv(FILE *f, char *fmt, va_list v)
+static int stbi_bmp_info(stbi *s, int *x, int *y, int *comp)
 {
-   while (*fmt) {
-      switch (*fmt++) {
-         case ' ': break;
-         case '1': { uint8_t x = va_arg(v, int); write8(f,x); break; }
-         case '2': { int16_t x = va_arg(v, int); write8(f,x); write8(f,x>>8); break; }
-         case '4': { int32_t x = va_arg(v, int); write8(f,x); write8(f,x>>8); write8(f,x>>16); write8(f,x>>24); break; }
-         default:
-            assert(0);
-            va_end(v);
-            return;
-      }
+   int hsz;
+   if (get8(s) != 'B' || get8(s) != 'M') {
+       stbi_rewind( s );
+       return 0;
+   }
+   skip(s,12);
+   hsz = get32le(s);
+   if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108) {
+       stbi_rewind( s );
+       return 0;
+   }
+   if (hsz == 12) {
+      *x = get16le(s);
+      *y = get16le(s);
+   } else {
+      *x = get32le(s);
+      *y = get32le(s);
+   }
+   if (get16le(s) != 1) {
+       stbi_rewind( s );
+       return 0;
    }
+   *comp = get16le(s) / 8;
+   return 1;
 }
 
-static void writef(FILE *f, char *fmt, ...)
+static int stbi_psd_info(stbi *s, int *x, int *y, int *comp)
 {
-   va_list v;
-   va_start(v, fmt);
-   writefv(f,fmt,v);
-   va_end(v);
+   int channelCount;
+   if (get32(s) != 0x38425053) {
+       stbi_rewind( s );
+       return 0;
+   }
+   if (get16(s) != 1) {
+       stbi_rewind( s );
+       return 0;
+   }
+   skip(s, 6);
+   channelCount = get16(s);
+   if (channelCount < 0 || channelCount > 16) {
+       stbi_rewind( s );
+       return 0;
+   }
+   *y = get32(s);
+   *x = get32(s);
+   if (get16(s) != 8) {
+       stbi_rewind( s );
+       return 0;
+   }
+   if (get16(s) != 3) {
+       stbi_rewind( s );
+       return 0;
+   }
+   *comp = 4;
+   return 1;
 }
 
-static void write_pixels(FILE *f, int rgb_dir, int vdir, int x, int y, int comp, void *data, int write_alpha, int scanline_pad)
+static int stbi_pic_info(stbi *s, int *x, int *y, int *comp)
 {
-   uint8_t bg[3] = { 255, 0, 255}, px[3];
-   uint32_t zero = 0;
-   int i,j,k, j_end;
+   int act_comp=0,num_packets=0,chained;
+   pic_packet_t packets[10];
 
-   if (vdir < 0)
-      j_end = -1, j = y-1;
-   else
-      j_end =  y, j = 0;
-
-   for (; j != j_end; j += vdir) {
-      for (i=0; i < x; ++i) {
-         uint8_t *d = (uint8_t *) data + (j*x+i)*comp;
-         if (write_alpha < 0)
-            fwrite(&d[comp-1], 1, 1, f);
-         switch (comp) {
-            case 1:
-            case 2: writef(f, "111", d[0],d[0],d[0]);
-                    break;
-            case 4:
-               if (!write_alpha) {
-                  for (k=0; k < 3; ++k)
-                     px[k] = bg[k] + ((d[k] - bg[k]) * d[3])/255;
-                  writef(f, "111", px[1-rgb_dir],px[1],px[1+rgb_dir]);
-                  break;
-               }
-               /* FALLTHROUGH */
-            case 3:
-               writef(f, "111", d[1-rgb_dir],d[1],d[1+rgb_dir]);
-               break;
-         }
-         if (write_alpha > 0)
-            fwrite(&d[comp-1], 1, 1, f);
-      }
-      fwrite(&zero,scanline_pad,1,f);
+   skip(s, 92);
+
+   *x = get16(s);
+   *y = get16(s);
+   if (at_eof(s))  return 0;
+   if ( (*x) != 0 && (1 << 28) / (*x) < (*y)) {
+       stbi_rewind( s );
+       return 0;
    }
+
+   skip(s, 8);
+
+   do {
+      pic_packet_t *packet;
+
+      if (num_packets==sizeof(packets)/sizeof(packets[0]))
+         return 0;
+
+      packet = &packets[num_packets++];
+      chained = get8(s);
+      packet->size    = get8u(s);
+      packet->type    = get8u(s);
+      packet->channel = get8u(s);
+      act_comp |= packet->channel;
+
+      if (at_eof(s)) {
+          stbi_rewind( s );
+          return 0;
+      }
+      if (packet->size != 8) {
+          stbi_rewind( s );
+          return 0;
+      }
+   } while (chained);
+
+   *comp = (act_comp & 0x10 ? 4 : 3);
+
+   return 1;
 }
 
-static int outfile(char const *filename, int rgb_dir, int vdir, int x, int y, int comp, void *data, int alpha, int pad, char *fmt, ...)
-{
-   FILE *f = fopen(filename, "wb");
-   if (f) {
-      va_list v;
-      va_start(v, fmt);
-      writefv(f, fmt, v);
-      va_end(v);
-      write_pixels(f,rgb_dir,vdir,x,y,comp,data,alpha,pad);
-      fclose(f);
-   }
-   return f != NULL;
+static int stbi_info_main(stbi *s, int *x, int *y, int *comp)
+{
+   if (stbi_jpeg_info(s, x, y, comp))
+       return 1;
+   if (stbi_png_info(s, x, y, comp))
+       return 1;
+   if (stbi_gif_info(s, x, y, comp))
+       return 1;
+   if (stbi_bmp_info(s, x, y, comp))
+       return 1;
+   if (stbi_psd_info(s, x, y, comp))
+       return 1;
+   if (stbi_pic_info(s, x, y, comp))
+       return 1;
+   #ifndef STBI_NO_HDR
+   if (stbi_hdr_info(s, x, y, comp))
+       return 1;
+   #endif
+   // test tga last because it's a crappy test!
+   if (stbi_tga_info(s, x, y, comp))
+       return 1;
+   return e("unknown image type", "Image not of any known type, or corrupt");
 }
 
-int stbi_write_bmp(char const *filename, int x, int y, int comp, void *data)
+#ifndef STBI_NO_STDIO
+int stbi_info(char const *filename, int *x, int *y, int *comp)
 {
-   int pad = (-x*3) & 3;
-   return outfile(filename,-1,-1,x,y,comp,data,0,pad,
-           "11 4 22 4" "4 44 22 444444",
-           'B', 'M', 14+40+(x*3+pad)*y, 0,0, 14+40,  // file header
-            40, x,y, 1,24, 0,0,0,0,0,0);             // bitmap header
+    FILE *f = fopen(filename, "rb");
+    int result;
+    if (!f) return e("can't fopen", "Unable to open file");
+    result = stbi_info_from_file(f, x, y, comp);
+    fclose(f);
+    return result;
 }
 
-int stbi_write_tga(char const *filename, int x, int y, int comp, void *data)
+int stbi_info_from_file(FILE *f, int *x, int *y, int *comp)
 {
-   int has_alpha = !(comp & 1);
-   return outfile(filename, -1,-1, x, y, comp, data, has_alpha, 0,
-                  "111 221 2222 11", 0,0,2, 0,0,0, 0,0,x,y, 24+8*has_alpha, 8*has_alpha);
+   int r;
+   stbi s;
+   long pos = ftell(f);
+   start_file(&s, f);
+   r = stbi_info_main(&s,x,y,comp);
+   fseek(f,pos,SEEK_SET);
+   return r;
 }
+#endif // !STBI_NO_STDIO
 
-// any other image formats that do interleaved rgb data?
-//    PNG: requires adler32,crc32 -- significant amount of code
-//    PSD: no, channels output separately
-//    TIFF: no, stripwise-interleaved... i think
+int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
+{
+   stbi s;
+   start_mem(&s,buffer,len);
+   return stbi_info_main(&s,x,y,comp);
+}
 
-#endif // STBI_NO_WRITE
+int stbi_info_from_callbacks(stbi_io_callbacks const *c, void *user, int *x, int *y, int *comp)
+{
+   stbi s;
+   start_callbacks(&s, (stbi_io_callbacks *) c, user);
+   return stbi_info_main(&s,x,y,comp);
+}
 
 #endif // STBI_HEADER_FILE_ONLY
 
+/*
+   revision history:
+      1.33 (2011-07-14)
+             make stbi_is_hdr work in STBI_NO_HDR (as specified), minor compiler-friendly improvements
+      1.32 (2011-07-13)
+             support for "info" function for all supported filetypes (SpartanJ)
+      1.31 (2011-06-20)
+             a few more leak fixes, bug in PNG handling (SpartanJ)
+      1.30 (2011-06-11)
+             added ability to load files via callbacks to accomidate custom input streams (Ben Wenger)
+             removed deprecated format-specific test/load functions
+             removed support for installable file formats (stbi_loader) -- would have been broken for IO callbacks anyway
+             error cases in bmp and tga give messages and don't leak (Raymond Barbiero, grisha)
+             fix inefficiency in decoding 32-bit BMP (David Woo)
+      1.29 (2010-08-16)
+             various warning fixes from Aurelien Pocheville 
+      1.28 (2010-08-01)
+             fix bug in GIF palette transparency (SpartanJ)
+      1.27 (2010-08-01)
+             cast-to-uint8_t to fix warnings
+      1.26 (2010-07-24)
+             fix bug in file buffering for PNG reported by SpartanJ
+      1.25 (2010-07-17)
+             refix trans_data warning (Won Chun)
+      1.24 (2010-07-12)
+             perf improvements reading from files on platforms with lock-heavy fgetc()
+             minor perf improvements for jpeg
+             deprecated type-specific functions so we'll get feedback if they're needed
+             attempt to fix trans_data warning (Won Chun)
+      1.23   fixed bug in iPhone support
+      1.22 (2010-07-10)
+             removed image *writing* support
+             stbi_info support from Jetro Lauha
+             GIF support from Jean-Marc Lienher
+             iPhone PNG-extensions from James Brown
+             warning-fixes from Nicolas Schulz and Janez Zemva (i.e. Janez (U+017D)emva)
+      1.21   fix use of 'uint8' in header (reported by jon blow)
+      1.20   added support for Softimage PIC, by Tom Seddon
+      1.19   bug in interlaced PNG corruption check (found by ryg)
+      1.18 2008-08-02
+             fix a threading bug (local mutable static)
+      1.17   support interlaced PNG
+      1.16   major bugfix - convert_format converted one too many pixels
+      1.15   initialize some fields for thread safety
+      1.14   fix threadsafe conversion bug
+             header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
+      1.13   threadsafe
+      1.12   const qualifiers in the API
+      1.11   Support installable IDCT, colorspace conversion routines
+      1.10   Fixes for 64-bit (don't use "unsigned long")
+             optimized upsampling by Fabian "ryg" Giesen
+      1.09   Fix format-conversion for PSD code (bad global variables!)
+      1.08   Thatcher Ulrich's PSD code integrated by Nicolas Schulz
+      1.07   attempt to fix C++ warning/errors again
+      1.06   attempt to fix C++ warning/errors again
+      1.05   fix TGA loading to return correct *comp and use good luminance calc
+      1.04   default float alpha is 1, not 255; use 'void *' for stbi_image_free
+      1.03   bugfixes to STBI_NO_STDIO, STBI_NO_HDR
+      1.02   support for (subset of) HDR files, float interface for preferred access to them
+      1.01   fix bug: possible bug in handling right-side up bmps... not sure
+             fix bug: the stbi_bmp_load() and stbi_tga_load() functions didn't work at all
+      1.00   interface to zlib that skips zlib header
+      0.99   correct handling of alpha in palette
+      0.98   TGA loader by lonesock; dynamically add loaders (untested)
+      0.97   jpeg errors on too large a file; also catch another malloc failure
+      0.96   fix detection of invalid v value - particleman mollyrocket forum
+      0.95   during header scan, seek to markers in case of padding
+      0.94   STBI_NO_STDIO to disable stdio usage; rename all #defines the same
+      0.93   handle jpegtran output; verbose errors
+      0.92   read 4,8,16,24,32-bit BMP files of several formats
+      0.91   output 24-bit Windows 3.0 BMP files
+      0.90   fix a few more warnings; bump version number to approach 1.0
+      0.61   bugfixes due to Marc LeBlanc, Christopher Lloyd
+      0.60   fix compiling as c++
+      0.59   fix warnings: merge Dave Moore's -Wall fixes
+      0.58   fix bug: zlib uncompressed mode len/nlen was wrong endian
+      0.57   fix bug: jpg last huffman symbol before marker was >9 bits but less than 16 available
+      0.56   fix bug: zlib uncompressed mode len vs. nlen
+      0.55   fix bug: restart_interval not initialized to 0
+      0.54   allow NULL for 'int *comp'
+      0.53   fix bug in png 3->4; speedup png decoding
+      0.52   png handles req_comp=3,4 directly; minor cleanup; jpeg comments
+      0.51   obey req_comp requests, 1-component jpegs return as 1-component,
+             on 'test' only check type, not whether we support this variant
+      0.50   first released version
+*/

