pietp

Piet code painter
git clone https://git.sinitax.com/sinitax/pietp
Log | Files | Refs | LICENSE | sfeed.txt

stb_image.h (279339B)


      1/* stb_image - v2.27 - public domain image loader - http://nothings.org/stb
      2                                  no warranty implied; use at your own risk
      3
      4   Do this:
      5      #define STB_IMAGE_IMPLEMENTATION
      6   before you include this file in *one* C or C++ file to create the implementation.
      7
      8   // i.e. it should look like this:
      9   #include ...
     10   #include ...
     11   #include ...
     12   #define STB_IMAGE_IMPLEMENTATION
     13   #include "stb_image.h"
     14
     15   You can #define STBI_ASSERT(x) before the #include to avoid using assert.h.
     16   And #define STBI_MALLOC, STBI_REALLOC, and STBI_FREE to avoid using malloc,realloc,free
     17
     18
     19   QUICK NOTES:
     20      Primarily of interest to game developers and other people who can
     21          avoid problematic images and only need the trivial interface
     22
     23      JPEG baseline & progressive (12 bpc/arithmetic not supported, same as stock IJG lib)
     24      PNG 1/2/4/8/16-bit-per-channel
     25
     26      TGA (not sure what subset, if a subset)
     27      BMP non-1bpp, non-RLE
     28      PSD (composited view only, no extra channels, 8/16 bit-per-channel)
     29
     30      GIF (*comp always reports as 4-channel)
     31      HDR (radiance rgbE format)
     32      PIC (Softimage PIC)
     33      PNM (PPM and PGM binary only)
     34
     35      Animated GIF still needs a proper API, but here's one way to do it:
     36          http://gist.github.com/urraka/685d9a6340b26b830d49
     37
     38      - decode from memory or through FILE (define STBI_NO_STDIO to remove code)
     39      - decode from arbitrary I/O callbacks
     40      - SIMD acceleration on x86/x64 (SSE2) and ARM (NEON)
     41
     42   Full documentation under "DOCUMENTATION" below.
     43
     44
     45LICENSE
     46
     47  See end of file for license information.
     48
     49RECENT REVISION HISTORY:
     50
     51      2.27  (2021-07-11) document stbi_info better, 16-bit PNM support, bug fixes
     52      2.26  (2020-07-13) many minor fixes
     53      2.25  (2020-02-02) fix warnings
     54      2.24  (2020-02-02) fix warnings; thread-local failure_reason and flip_vertically
     55      2.23  (2019-08-11) fix clang static analysis warning
     56      2.22  (2019-03-04) gif fixes, fix warnings
     57      2.21  (2019-02-25) fix typo in comment
     58      2.20  (2019-02-07) support utf8 filenames in Windows; fix warnings and platform ifdefs
     59      2.19  (2018-02-11) fix warning
     60      2.18  (2018-01-30) fix warnings
     61      2.17  (2018-01-29) bugfix, 1-bit BMP, 16-bitness query, fix warnings
     62      2.16  (2017-07-23) all functions have 16-bit variants; optimizations; bugfixes
     63      2.15  (2017-03-18) fix png-1,2,4; all Imagenet JPGs; no runtime SSE detection on GCC
     64      2.14  (2017-03-03) remove deprecated STBI_JPEG_OLD; fixes for Imagenet JPGs
     65      2.13  (2016-12-04) experimental 16-bit API, only for PNG so far; fixes
     66      2.12  (2016-04-02) fix typo in 2.11 PSD fix that caused crashes
     67      2.11  (2016-04-02) 16-bit PNGS; enable SSE2 in non-gcc x64
     68                         RGB-format JPEG; remove white matting in PSD;
     69                         allocate large structures on the stack;
     70                         correct channel count for PNG & BMP
     71      2.10  (2016-01-22) avoid warning introduced in 2.09
     72      2.09  (2016-01-16) 16-bit TGA; comments in PNM files; STBI_REALLOC_SIZED
     73
     74   See end of file for full revision history.
     75
     76
     77 ============================    Contributors    =========================
     78
     79 Image formats                          Extensions, features
     80    Sean Barrett (jpeg, png, bmp)          Jetro Lauha (stbi_info)
     81    Nicolas Schulz (hdr, psd)              Martin "SpartanJ" Golini (stbi_info)
     82    Jonathan Dummer (tga)                  James "moose2000" Brown (iPhone PNG)
     83    Jean-Marc Lienher (gif)                Ben "Disch" Wenger (io callbacks)
     84    Tom Seddon (pic)                       Omar Cornut (1/2/4-bit PNG)
     85    Thatcher Ulrich (psd)                  Nicolas Guillemot (vertical flip)
     86    Ken Miller (pgm, ppm)                  Richard Mitton (16-bit PSD)
     87    github:urraka (animated gif)           Junggon Kim (PNM comments)
     88    Christopher Forseth (animated gif)     Daniel Gibson (16-bit TGA)
     89                                           socks-the-fox (16-bit PNG)
     90                                           Jeremy Sawicki (handle all ImageNet JPGs)
     91 Optimizations & bugfixes                  Mikhail Morozov (1-bit BMP)
     92    Fabian "ryg" Giesen                    Anael Seghezzi (is-16-bit query)
     93    Arseny Kapoulkine                      Simon Breuss (16-bit PNM)
     94    John-Mark Allen
     95    Carmelo J Fdez-Aguera
     96
     97 Bug & warning fixes
     98    Marc LeBlanc            David Woo          Guillaume George     Martins Mozeiko
     99    Christpher Lloyd        Jerry Jansson      Joseph Thomson       Blazej Dariusz Roszkowski
    100    Phil Jordan                                Dave Moore           Roy Eltham
    101    Hayaki Saito            Nathan Reed        Won Chun
    102    Luke Graham             Johan Duparc       Nick Verigakis       the Horde3D community
    103    Thomas Ruf              Ronny Chevalier                         github:rlyeh
    104    Janez Zemva             John Bartholomew   Michal Cichon        github:romigrou
    105    Jonathan Blow           Ken Hamada         Tero Hanninen        github:svdijk
    106    Eugene Golushkov        Laurent Gomila     Cort Stratton        github:snagar
    107    Aruelien Pocheville     Sergio Gonzalez    Thibault Reuille     github:Zelex
    108    Cass Everitt            Ryamond Barbiero                        github:grim210
    109    Paul Du Bois            Engin Manap        Aldo Culquicondor    github:sammyhw
    110    Philipp Wiesemann       Dale Weiler        Oriol Ferrer Mesia   github:phprus
    111    Josh Tobin                                 Matthew Gregan       github:poppolopoppo
    112    Julian Raschke          Gregory Mullen     Christian Floisand   github:darealshinji
    113    Baldur Karlsson         Kevin Schmidt      JR Smith             github:Michaelangel007
    114                            Brad Weinberger    Matvey Cherevko      github:mosra
    115    Luca Sas                Alexander Veselov  Zack Middleton       [reserved]
    116    Ryan C. Gordon          [reserved]                              [reserved]
    117                     DO NOT ADD YOUR NAME HERE
    118
    119                     Jacko Dirks
    120
    121  To add your name to the credits, pick a random blank space in the middle and fill it.
    122  80% of merge conflicts on stb PRs are due to people adding their name at the end
    123  of the credits.
    124*/
    125
    126#ifndef STBI_INCLUDE_STB_IMAGE_H
    127#define STBI_INCLUDE_STB_IMAGE_H
    128
    129// DOCUMENTATION
    130//
    131// Limitations:
    132//    - no 12-bit-per-channel JPEG
    133//    - no JPEGs with arithmetic coding
    134//    - GIF always returns *comp=4
    135//
    136// Basic usage (see HDR discussion below for HDR usage):
    137//    int x,y,n;
    138//    unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
    139//    // ... process data if not NULL ...
    140//    // ... x = width, y = height, n = # 8-bit components per pixel ...
    141//    // ... replace '0' with '1'..'4' to force that many components per pixel
    142//    // ... but 'n' will always be the number that it would have been if you said 0
    143//    stbi_image_free(data)
    144//
    145// Standard parameters:
    146//    int *x                 -- outputs image width in pixels
    147//    int *y                 -- outputs image height in pixels
    148//    int *channels_in_file  -- outputs # of image components in image file
    149//    int desired_channels   -- if non-zero, # of image components requested in result
    150//
    151// The return value from an image loader is an 'unsigned char *' which points
    152// to the pixel data, or NULL on an allocation failure or if the image is
    153// corrupt or invalid. The pixel data consists of *y scanlines of *x pixels,
    154// with each pixel consisting of N interleaved 8-bit components; the first
    155// pixel pointed to is top-left-most in the image. There is no padding between
    156// image scanlines or between pixels, regardless of format. The number of
    157// components N is 'desired_channels' if desired_channels is non-zero, or
    158// *channels_in_file otherwise. If desired_channels is non-zero,
    159// *channels_in_file has the number of components that _would_ have been
    160// output otherwise. E.g. if you set desired_channels to 4, you will always
    161// get RGBA output, but you can check *channels_in_file to see if it's trivially
    162// opaque because e.g. there were only 3 channels in the source image.
    163//
    164// An output image with N components has the following components interleaved
    165// in this order in each pixel:
    166//
    167//     N=#comp     components
    168//       1           grey
    169//       2           grey, alpha
    170//       3           red, green, blue
    171//       4           red, green, blue, alpha
    172//
    173// If image loading fails for any reason, the return value will be NULL,
    174// and *x, *y, *channels_in_file will be unchanged. The function
    175// stbi_failure_reason() can be queried for an extremely brief, end-user
    176// unfriendly explanation of why the load failed. Define STBI_NO_FAILURE_STRINGS
    177// to avoid compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly
    178// more user-friendly ones.
    179//
    180// Paletted PNG, BMP, GIF, and PIC images are automatically depalettized.
    181//
    182// To query the width, height and component count of an image without having to
    183// decode the full file, you can use the stbi_info family of functions:
    184//
    185//   int x,y,n,ok;
    186//   ok = stbi_info(filename, &x, &y, &n);
    187//   // returns ok=1 and sets x, y, n if image is a supported format,
    188//   // 0 otherwise.
    189//
    190// Note that stb_image pervasively uses ints in its public API for sizes,
    191// including sizes of memory buffers. This is now part of the API and thus
    192// hard to change without causing breakage. As a result, the various image
    193// loaders all have certain limits on image size; these differ somewhat
    194// by format but generally boil down to either just under 2GB or just under
    195// 1GB. When the decoded image would be larger than this, stb_image decoding
    196// will fail.
    197//
    198// Additionally, stb_image will reject image files that have any of their
    199// dimensions set to a larger value than the configurable STBI_MAX_DIMENSIONS,
    200// which defaults to 2**24 = 16777216 pixels. Due to the above memory limit,
    201// the only way to have an image with such dimensions load correctly
    202// is for it to have a rather extreme aspect ratio. Either way, the
    203// assumption here is that such larger images are likely to be malformed
    204// or malicious. If you do need to load an image with individual dimensions
    205// larger than that, and it still fits in the overall size limit, you can
    206// #define STBI_MAX_DIMENSIONS on your own to be something larger.
    207//
    208// ===========================================================================
    209//
    210// UNICODE:
    211//
    212//   If compiling for Windows and you wish to use Unicode filenames, compile
    213//   with
    214//       #define STBI_WINDOWS_UTF8
    215//   and pass utf8-encoded filenames. Call stbi_convert_wchar_to_utf8 to convert
    216//   Windows wchar_t filenames to utf8.
    217//
    218// ===========================================================================
    219//
    220// Philosophy
    221//
    222// stb libraries are designed with the following priorities:
    223//
    224//    1. easy to use
    225//    2. easy to maintain
    226//    3. good performance
    227//
    228// Sometimes I let "good performance" creep up in priority over "easy to maintain",
    229// and for best performance I may provide less-easy-to-use APIs that give higher
    230// performance, in addition to the easy-to-use ones. Nevertheless, it's important
    231// to keep in mind that from the standpoint of you, a client of this library,
    232// all you care about is #1 and #3, and stb libraries DO NOT emphasize #3 above all.
    233//
    234// Some secondary priorities arise directly from the first two, some of which
    235// provide more explicit reasons why performance can't be emphasized.
    236//
    237//    - Portable ("ease of use")
    238//    - Small source code footprint ("easy to maintain")
    239//    - No dependencies ("ease of use")
    240//
    241// ===========================================================================
    242//
    243// I/O callbacks
    244//
    245// I/O callbacks allow you to read from arbitrary sources, like packaged
    246// files or some other source. Data read from callbacks are processed
    247// through a small internal buffer (currently 128 bytes) to try to reduce
    248// overhead.
    249//
    250// The three functions you must define are "read" (reads some bytes of data),
    251// "skip" (skips some bytes of data), "eof" (reports if the stream is at the end).
    252//
    253// ===========================================================================
    254//
    255// SIMD support
    256//
    257// The JPEG decoder will try to automatically use SIMD kernels on x86 when
    258// supported by the compiler. For ARM Neon support, you must explicitly
    259// request it.
    260//
    261// (The old do-it-yourself SIMD API is no longer supported in the current
    262// code.)
    263//
    264// On x86, SSE2 will automatically be used when available based on a run-time
    265// test; if not, the generic C versions are used as a fall-back. On ARM targets,
    266// the typical path is to have separate builds for NEON and non-NEON devices
    267// (at least this is true for iOS and Android). Therefore, the NEON support is
    268// toggled by a build flag: define STBI_NEON to get NEON loops.
    269//
    270// If for some reason you do not want to use any of SIMD code, or if
    271// you have issues compiling it, you can disable it entirely by
    272// defining STBI_NO_SIMD.
    273//
    274// ===========================================================================
    275//
    276// HDR image support   (disable by defining STBI_NO_HDR)
    277//
    278// stb_image supports loading HDR images in general, and currently the Radiance
    279// .HDR file format specifically. You can still load any file through the existing
    280// interface; if you attempt to load an HDR file, it will be automatically remapped
    281// to LDR, assuming gamma 2.2 and an arbitrary scale factor defaulting to 1;
    282// both of these constants can be reconfigured through this interface:
    283//
    284//     stbi_hdr_to_ldr_gamma(2.2f);
    285//     stbi_hdr_to_ldr_scale(1.0f);
    286//
    287// (note, do not use _inverse_ constants; stbi_image will invert them
    288// appropriately).
    289//
    290// Additionally, there is a new, parallel interface for loading files as
    291// (linear) floats to preserve the full dynamic range:
    292//
    293//    float *data = stbi_loadf(filename, &x, &y, &n, 0);
    294//
    295// If you load LDR images through this interface, those images will
    296// be promoted to floating point values, run through the inverse of
    297// constants corresponding to the above:
    298//
    299//     stbi_ldr_to_hdr_scale(1.0f);
    300//     stbi_ldr_to_hdr_gamma(2.2f);
    301//
    302// Finally, given a filename (or an open file or memory block--see header
    303// file for details) containing image data, you can query for the "most
    304// appropriate" interface to use (that is, whether the image is HDR or
    305// not), using:
    306//
    307//     stbi_is_hdr(char *filename);
    308//
    309// ===========================================================================
    310//
    311// iPhone PNG support:
    312//
    313// We optionally support converting iPhone-formatted PNGs (which store
    314// premultiplied BGRA) back to RGB, even though they're internally encoded
    315// differently. To enable this conversion, call
    316// stbi_convert_iphone_png_to_rgb(1).
    317//
    318// Call stbi_set_unpremultiply_on_load(1) as well to force a divide per
    319// pixel to remove any premultiplied alpha *only* if the image file explicitly
    320// says there's premultiplied data (currently only happens in iPhone images,
    321// and only if iPhone convert-to-rgb processing is on).
    322//
    323// ===========================================================================
    324//
    325// ADDITIONAL CONFIGURATION
    326//
    327//  - You can suppress implementation of any of the decoders to reduce
    328//    your code footprint by #defining one or more of the following
    329//    symbols before creating the implementation.
    330//
    331//        STBI_NO_JPEG
    332//        STBI_NO_PNG
    333//        STBI_NO_BMP
    334//        STBI_NO_PSD
    335//        STBI_NO_TGA
    336//        STBI_NO_GIF
    337//        STBI_NO_HDR
    338//        STBI_NO_PIC
    339//        STBI_NO_PNM   (.ppm and .pgm)
    340//
    341//  - You can request *only* certain decoders and suppress all other ones
    342//    (this will be more forward-compatible, as addition of new decoders
    343//    doesn't require you to disable them explicitly):
    344//
    345//        STBI_ONLY_JPEG
    346//        STBI_ONLY_PNG
    347//        STBI_ONLY_BMP
    348//        STBI_ONLY_PSD
    349//        STBI_ONLY_TGA
    350//        STBI_ONLY_GIF
    351//        STBI_ONLY_HDR
    352//        STBI_ONLY_PIC
    353//        STBI_ONLY_PNM   (.ppm and .pgm)
    354//
    355//   - If you use STBI_NO_PNG (or _ONLY_ without PNG), and you still
    356//     want the zlib decoder to be available, #define STBI_SUPPORT_ZLIB
    357//
    358//  - If you define STBI_MAX_DIMENSIONS, stb_image will reject images greater
    359//    than that size (in either width or height) without further processing.
    360//    This is to let programs in the wild set an upper bound to prevent
    361//    denial-of-service attacks on untrusted data, as one could generate a
    362//    valid image of gigantic dimensions and force stb_image to allocate a
    363//    huge block of memory and spend disproportionate time decoding it. By
    364//    default this is set to (1 << 24), which is 16777216, but that's still
    365//    very big.
    366
    367#ifndef STBI_NO_STDIO
    368#include <stdio.h>
    369#endif // STBI_NO_STDIO
    370
    371#define STBI_VERSION 1
    372
    373enum
    374{
    375   STBI_default = 0, // only used for desired_channels
    376
    377   STBI_grey       = 1,
    378   STBI_grey_alpha = 2,
    379   STBI_rgb        = 3,
    380   STBI_rgb_alpha  = 4
    381};
    382
    383#include <stdlib.h>
    384typedef unsigned char stbi_uc;
    385typedef unsigned short stbi_us;
    386
    387#ifdef __cplusplus
    388extern "C" {
    389#endif
    390
    391#ifndef STBIDEF
    392#ifdef STB_IMAGE_STATIC
    393#define STBIDEF static
    394#else
    395#define STBIDEF extern
    396#endif
    397#endif
    398
    399//////////////////////////////////////////////////////////////////////////////
    400//
    401// PRIMARY API - works on images of any type
    402//
    403
    404//
    405// load image by filename, open file, or memory buffer
    406//
    407
    408typedef struct
    409{
    410   int      (*read)  (void *user,char *data,int size);   // fill 'data' with 'size' bytes.  return number of bytes actually read
    411   void     (*skip)  (void *user,int n);                 // skip the next 'n' bytes, or 'unget' the last -n bytes if negative
    412   int      (*eof)   (void *user);                       // returns nonzero if we are at end of file/data
    413} stbi_io_callbacks;
    414
    415////////////////////////////////////
    416//
    417// 8-bits-per-channel interface
    418//
    419
    420STBIDEF stbi_uc *stbi_load_from_memory   (stbi_uc           const *buffer, int len   , int *x, int *y, int *channels_in_file, int desired_channels);
    421STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk  , void *user, int *x, int *y, int *channels_in_file, int desired_channels);
    422
    423#ifndef STBI_NO_STDIO
    424STBIDEF stbi_uc *stbi_load            (char const *filename, int *x, int *y, int *channels_in_file, int desired_channels);
    425STBIDEF stbi_uc *stbi_load_from_file  (FILE *f, int *x, int *y, int *channels_in_file, int desired_channels);
    426// for stbi_load_from_file, file pointer is left pointing immediately after image
    427#endif
    428
    429#ifndef STBI_NO_GIF
    430STBIDEF stbi_uc *stbi_load_gif_from_memory(stbi_uc const *buffer, int len, int **delays, int *x, int *y, int *z, int *comp, int req_comp);
    431#endif
    432
    433#ifdef STBI_WINDOWS_UTF8
    434STBIDEF int stbi_convert_wchar_to_utf8(char *buffer, size_t bufferlen, const wchar_t* input);
    435#endif
    436
    437////////////////////////////////////
    438//
    439// 16-bits-per-channel interface
    440//
    441
    442STBIDEF stbi_us *stbi_load_16_from_memory   (stbi_uc const *buffer, int len, int *x, int *y, int *channels_in_file, int desired_channels);
    443STBIDEF stbi_us *stbi_load_16_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *channels_in_file, int desired_channels);
    444
    445#ifndef STBI_NO_STDIO
    446STBIDEF stbi_us *stbi_load_16          (char const *filename, int *x, int *y, int *channels_in_file, int desired_channels);
    447STBIDEF stbi_us *stbi_load_from_file_16(FILE *f, int *x, int *y, int *channels_in_file, int desired_channels);
    448#endif
    449
    450////////////////////////////////////
    451//
    452// float-per-channel interface
    453//
    454#ifndef STBI_NO_LINEAR
    455   STBIDEF float *stbi_loadf_from_memory     (stbi_uc const *buffer, int len, int *x, int *y, int *channels_in_file, int desired_channels);
    456   STBIDEF float *stbi_loadf_from_callbacks  (stbi_io_callbacks const *clbk, void *user, int *x, int *y,  int *channels_in_file, int desired_channels);
    457
    458   #ifndef STBI_NO_STDIO
    459   STBIDEF float *stbi_loadf            (char const *filename, int *x, int *y, int *channels_in_file, int desired_channels);
    460   STBIDEF float *stbi_loadf_from_file  (FILE *f, int *x, int *y, int *channels_in_file, int desired_channels);
    461   #endif
    462#endif
    463
    464#ifndef STBI_NO_HDR
    465   STBIDEF void   stbi_hdr_to_ldr_gamma(float gamma);
    466   STBIDEF void   stbi_hdr_to_ldr_scale(float scale);
    467#endif // STBI_NO_HDR
    468
    469#ifndef STBI_NO_LINEAR
    470   STBIDEF void   stbi_ldr_to_hdr_gamma(float gamma);
    471   STBIDEF void   stbi_ldr_to_hdr_scale(float scale);
    472#endif // STBI_NO_LINEAR
    473
    474// stbi_is_hdr is always defined, but always returns false if STBI_NO_HDR
    475STBIDEF int    stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user);
    476STBIDEF int    stbi_is_hdr_from_memory(stbi_uc const *buffer, int len);
    477#ifndef STBI_NO_STDIO
    478STBIDEF int      stbi_is_hdr          (char const *filename);
    479STBIDEF int      stbi_is_hdr_from_file(FILE *f);
    480#endif // STBI_NO_STDIO
    481
    482
    483// get a VERY brief reason for failure
    484// on most compilers (and ALL modern mainstream compilers) this is threadsafe
    485STBIDEF const char *stbi_failure_reason  (void);
    486
    487// free the loaded image -- this is just free()
    488STBIDEF void     stbi_image_free      (void *retval_from_stbi_load);
    489
    490// get image dimensions & components without fully decoding
    491STBIDEF int      stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
    492STBIDEF int      stbi_info_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp);
    493STBIDEF int      stbi_is_16_bit_from_memory(stbi_uc const *buffer, int len);
    494STBIDEF int      stbi_is_16_bit_from_callbacks(stbi_io_callbacks const *clbk, void *user);
    495
    496#ifndef STBI_NO_STDIO
    497STBIDEF int      stbi_info               (char const *filename,     int *x, int *y, int *comp);
    498STBIDEF int      stbi_info_from_file     (FILE *f,                  int *x, int *y, int *comp);
    499STBIDEF int      stbi_is_16_bit          (char const *filename);
    500STBIDEF int      stbi_is_16_bit_from_file(FILE *f);
    501#endif
    502
    503
    504
    505// for image formats that explicitly notate that they have premultiplied alpha,
    506// we just return the colors as stored in the file. set this flag to force
    507// unpremultiplication. results are undefined if the unpremultiply overflow.
    508STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply);
    509
    510// indicate whether we should process iphone images back to canonical format,
    511// or just pass them through "as-is"
    512STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert);
    513
    514// flip the image vertically, so the first pixel in the output array is the bottom left
    515STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip);
    516
    517// as above, but only applies to images loaded on the thread that calls the function
    518// this function is only available if your compiler supports thread-local variables;
    519// calling it will fail to link if your compiler doesn't
    520STBIDEF void stbi_set_unpremultiply_on_load_thread(int flag_true_if_should_unpremultiply);
    521STBIDEF void stbi_convert_iphone_png_to_rgb_thread(int flag_true_if_should_convert);
    522STBIDEF void stbi_set_flip_vertically_on_load_thread(int flag_true_if_should_flip);
    523
    524// ZLIB client - used by PNG, available for other purposes
    525
    526STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen);
    527STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header);
    528STBIDEF char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen);
    529STBIDEF int   stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
    530
    531STBIDEF char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen);
    532STBIDEF int   stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
    533
    534
    535#ifdef __cplusplus
    536}
    537#endif
    538
    539//
    540//
    541////   end header file   /////////////////////////////////////////////////////
    542#endif // STBI_INCLUDE_STB_IMAGE_H
    543
    544#ifdef STB_IMAGE_IMPLEMENTATION
    545
    546#if defined(STBI_ONLY_JPEG) || defined(STBI_ONLY_PNG) || defined(STBI_ONLY_BMP) \
    547  || defined(STBI_ONLY_TGA) || defined(STBI_ONLY_GIF) || defined(STBI_ONLY_PSD) \
    548  || defined(STBI_ONLY_HDR) || defined(STBI_ONLY_PIC) || defined(STBI_ONLY_PNM) \
    549  || defined(STBI_ONLY_ZLIB)
    550   #ifndef STBI_ONLY_JPEG
    551   #define STBI_NO_JPEG
    552   #endif
    553   #ifndef STBI_ONLY_PNG
    554   #define STBI_NO_PNG
    555   #endif
    556   #ifndef STBI_ONLY_BMP
    557   #define STBI_NO_BMP
    558   #endif
    559   #ifndef STBI_ONLY_PSD
    560   #define STBI_NO_PSD
    561   #endif
    562   #ifndef STBI_ONLY_TGA
    563   #define STBI_NO_TGA
    564   #endif
    565   #ifndef STBI_ONLY_GIF
    566   #define STBI_NO_GIF
    567   #endif
    568   #ifndef STBI_ONLY_HDR
    569   #define STBI_NO_HDR
    570   #endif
    571   #ifndef STBI_ONLY_PIC
    572   #define STBI_NO_PIC
    573   #endif
    574   #ifndef STBI_ONLY_PNM
    575   #define STBI_NO_PNM
    576   #endif
    577#endif
    578
    579#if defined(STBI_NO_PNG) && !defined(STBI_SUPPORT_ZLIB) && !defined(STBI_NO_ZLIB)
    580#define STBI_NO_ZLIB
    581#endif
    582
    583
    584#include <stdarg.h>
    585#include <stddef.h> // ptrdiff_t on osx
    586#include <stdlib.h>
    587#include <string.h>
    588#include <limits.h>
    589
    590#if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
    591#include <math.h>  // ldexp, pow
    592#endif
    593
    594#ifndef STBI_NO_STDIO
    595#include <stdio.h>
    596#endif
    597
    598#ifndef STBI_ASSERT
    599#include <assert.h>
    600#define STBI_ASSERT(x) assert(x)
    601#endif
    602
    603#ifdef __cplusplus
    604#define STBI_EXTERN extern "C"
    605#else
    606#define STBI_EXTERN extern
    607#endif
    608
    609
    610#ifndef _MSC_VER
    611   #ifdef __cplusplus
    612   #define stbi_inline inline
    613   #else
    614   #define stbi_inline
    615   #endif
    616#else
    617   #define stbi_inline __forceinline
    618#endif
    619
    620#ifndef STBI_NO_THREAD_LOCALS
    621   #if defined(__cplusplus) &&  __cplusplus >= 201103L
    622      #define STBI_THREAD_LOCAL       thread_local
    623   #elif defined(__GNUC__) && __GNUC__ < 5
    624      #define STBI_THREAD_LOCAL       __thread
    625   #elif defined(_MSC_VER)
    626      #define STBI_THREAD_LOCAL       __declspec(thread)
    627   #elif defined (__STDC_VERSION__) && __STDC_VERSION__ >= 201112L && !defined(__STDC_NO_THREADS__)
    628      #define STBI_THREAD_LOCAL       _Thread_local
    629   #endif
    630
    631   #ifndef STBI_THREAD_LOCAL
    632      #if defined(__GNUC__)
    633        #define STBI_THREAD_LOCAL       __thread
    634      #endif
    635   #endif
    636#endif
    637
    638#ifdef _MSC_VER
    639typedef unsigned short stbi__uint16;
    640typedef   signed short stbi__int16;
    641typedef unsigned int   stbi__uint32;
    642typedef   signed int   stbi__int32;
    643#else
    644#include <stdint.h>
    645typedef uint16_t stbi__uint16;
    646typedef int16_t  stbi__int16;
    647typedef uint32_t stbi__uint32;
    648typedef int32_t  stbi__int32;
    649#endif
    650
    651// should produce compiler error if size is wrong
    652typedef unsigned char validate_uint32[sizeof(stbi__uint32)==4 ? 1 : -1];
    653
    654#ifdef _MSC_VER
    655#define STBI_NOTUSED(v)  (void)(v)
    656#else
    657#define STBI_NOTUSED(v)  (void)sizeof(v)
    658#endif
    659
    660#ifdef _MSC_VER
    661#define STBI_HAS_LROTL
    662#endif
    663
    664#ifdef STBI_HAS_LROTL
    665   #define stbi_lrot(x,y)  _lrotl(x,y)
    666#else
    667   #define stbi_lrot(x,y)  (((x) << (y)) | ((x) >> (-(y) & 31)))
    668#endif
    669
    670#if defined(STBI_MALLOC) && defined(STBI_FREE) && (defined(STBI_REALLOC) || defined(STBI_REALLOC_SIZED))
    671// ok
    672#elif !defined(STBI_MALLOC) && !defined(STBI_FREE) && !defined(STBI_REALLOC) && !defined(STBI_REALLOC_SIZED)
    673// ok
    674#else
    675#error "Must define all or none of STBI_MALLOC, STBI_FREE, and STBI_REALLOC (or STBI_REALLOC_SIZED)."
    676#endif
    677
    678#ifndef STBI_MALLOC
    679#define STBI_MALLOC(sz)           malloc(sz)
    680#define STBI_REALLOC(p,newsz)     realloc(p,newsz)
    681#define STBI_FREE(p)              free(p)
    682#endif
    683
    684#ifndef STBI_REALLOC_SIZED
    685#define STBI_REALLOC_SIZED(p,oldsz,newsz) STBI_REALLOC(p,newsz)
    686#endif
    687
    688// x86/x64 detection
    689#if defined(__x86_64__) || defined(_M_X64)
    690#define STBI__X64_TARGET
    691#elif defined(__i386) || defined(_M_IX86)
    692#define STBI__X86_TARGET
    693#endif
    694
    695#if defined(__GNUC__) && defined(STBI__X86_TARGET) && !defined(__SSE2__) && !defined(STBI_NO_SIMD)
    696// gcc doesn't support sse2 intrinsics unless you compile with -msse2,
    697// which in turn means it gets to use SSE2 everywhere. This is unfortunate,
    698// but previous attempts to provide the SSE2 functions with runtime
    699// detection caused numerous issues. The way architecture extensions are
    700// exposed in GCC/Clang is, sadly, not really suited for one-file libs.
    701// New behavior: if compiled with -msse2, we use SSE2 without any
    702// detection; if not, we don't use it at all.
    703#define STBI_NO_SIMD
    704#endif
    705
    706#if defined(__MINGW32__) && defined(STBI__X86_TARGET) && !defined(STBI_MINGW_ENABLE_SSE2) && !defined(STBI_NO_SIMD)
    707// Note that __MINGW32__ doesn't actually mean 32-bit, so we have to avoid STBI__X64_TARGET
    708//
    709// 32-bit MinGW wants ESP to be 16-byte aligned, but this is not in the
    710// Windows ABI and VC++ as well as Windows DLLs don't maintain that invariant.
    711// As a result, enabling SSE2 on 32-bit MinGW is dangerous when not
    712// simultaneously enabling "-mstackrealign".
    713//
    714// See https://github.com/nothings/stb/issues/81 for more information.
    715//
    716// So default to no SSE2 on 32-bit MinGW. If you've read this far and added
    717// -mstackrealign to your build settings, feel free to #define STBI_MINGW_ENABLE_SSE2.
    718#define STBI_NO_SIMD
    719#endif
    720
    721#if !defined(STBI_NO_SIMD) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET))
    722#define STBI_SSE2
    723#include <emmintrin.h>
    724
    725#ifdef _MSC_VER
    726
    727#if _MSC_VER >= 1400  // not VC6
    728#include <intrin.h> // __cpuid
    729static int stbi__cpuid3(void)
    730{
    731   int info[4];
    732   __cpuid(info,1);
    733   return info[3];
    734}
    735#else
    736static int stbi__cpuid3(void)
    737{
    738   int res;
    739   __asm {
    740      mov  eax,1
    741      cpuid
    742      mov  res,edx
    743   }
    744   return res;
    745}
    746#endif
    747
    748#define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name
    749
    750#if !defined(STBI_NO_JPEG) && defined(STBI_SSE2)
    751static int stbi__sse2_available(void)
    752{
    753   int info3 = stbi__cpuid3();
    754   return ((info3 >> 26) & 1) != 0;
    755}
    756#endif
    757
    758#else // assume GCC-style if not VC++
    759#define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
    760
    761#if !defined(STBI_NO_JPEG) && defined(STBI_SSE2)
    762static int stbi__sse2_available(void)
    763{
    764   // If we're even attempting to compile this on GCC/Clang, that means
    765   // -msse2 is on, which means the compiler is allowed to use SSE2
    766   // instructions at will, and so are we.
    767   return 1;
    768}
    769#endif
    770
    771#endif
    772#endif
    773
    774// ARM NEON
    775#if defined(STBI_NO_SIMD) && defined(STBI_NEON)
    776#undef STBI_NEON
    777#endif
    778
    779#ifdef STBI_NEON
    780#include <arm_neon.h>
    781#ifdef _MSC_VER
    782#define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name
    783#else
    784#define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
    785#endif
    786#endif
    787
    788#ifndef STBI_SIMD_ALIGN
    789#define STBI_SIMD_ALIGN(type, name) type name
    790#endif
    791
    792#ifndef STBI_MAX_DIMENSIONS
    793#define STBI_MAX_DIMENSIONS (1 << 24)
    794#endif
    795
    796///////////////////////////////////////////////
    797//
    798//  stbi__context struct and start_xxx functions
    799
    800// stbi__context structure is our basic context used by all images, so it
    801// contains all the IO context, plus some basic image information
    802typedef struct
    803{
    804   stbi__uint32 img_x, img_y;
    805   int img_n, img_out_n;
    806
    807   stbi_io_callbacks io;
    808   void *io_user_data;
    809
    810   int read_from_callbacks;
    811   int buflen;
    812   stbi_uc buffer_start[128];
    813   int callback_already_read;
    814
    815   stbi_uc *img_buffer, *img_buffer_end;
    816   stbi_uc *img_buffer_original, *img_buffer_original_end;
    817} stbi__context;
    818
    819
    820static void stbi__refill_buffer(stbi__context *s);
    821
    822// initialize a memory-decode context
    823static void stbi__start_mem(stbi__context *s, stbi_uc const *buffer, int len)
    824{
    825   s->io.read = NULL;
    826   s->read_from_callbacks = 0;
    827   s->callback_already_read = 0;
    828   s->img_buffer = s->img_buffer_original = (stbi_uc *) buffer;
    829   s->img_buffer_end = s->img_buffer_original_end = (stbi_uc *) buffer+len;
    830}
    831
    832// initialize a callback-based context
    833static void stbi__start_callbacks(stbi__context *s, stbi_io_callbacks *c, void *user)
    834{
    835   s->io = *c;
    836   s->io_user_data = user;
    837   s->buflen = sizeof(s->buffer_start);
    838   s->read_from_callbacks = 1;
    839   s->callback_already_read = 0;
    840   s->img_buffer = s->img_buffer_original = s->buffer_start;
    841   stbi__refill_buffer(s);
    842   s->img_buffer_original_end = s->img_buffer_end;
    843}
    844
    845#ifndef STBI_NO_STDIO
    846
    847static int stbi__stdio_read(void *user, char *data, int size)
    848{
    849   return (int) fread(data,1,size,(FILE*) user);
    850}
    851
    852static void stbi__stdio_skip(void *user, int n)
    853{
    854   int ch;
    855   fseek((FILE*) user, n, SEEK_CUR);
    856   ch = fgetc((FILE*) user);  /* have to read a byte to reset feof()'s flag */
    857   if (ch != EOF) {
    858      ungetc(ch, (FILE *) user);  /* push byte back onto stream if valid. */
    859   }
    860}
    861
    862static int stbi__stdio_eof(void *user)
    863{
    864   return feof((FILE*) user) || ferror((FILE *) user);
    865}
    866
    867static stbi_io_callbacks stbi__stdio_callbacks =
    868{
    869   stbi__stdio_read,
    870   stbi__stdio_skip,
    871   stbi__stdio_eof,
    872};
    873
    874static void stbi__start_file(stbi__context *s, FILE *f)
    875{
    876   stbi__start_callbacks(s, &stbi__stdio_callbacks, (void *) f);
    877}
    878
    879//static void stop_file(stbi__context *s) { }
    880
    881#endif // !STBI_NO_STDIO
    882
    883static void stbi__rewind(stbi__context *s)
    884{
    885   // conceptually rewind SHOULD rewind to the beginning of the stream,
    886   // but we just rewind to the beginning of the initial buffer, because
    887   // we only use it after doing 'test', which only ever looks at at most 92 bytes
    888   s->img_buffer = s->img_buffer_original;
    889   s->img_buffer_end = s->img_buffer_original_end;
    890}
    891
    892enum
    893{
    894   STBI_ORDER_RGB,
    895   STBI_ORDER_BGR
    896};
    897
    898typedef struct
    899{
    900   int bits_per_channel;
    901   int num_channels;
    902   int channel_order;
    903} stbi__result_info;
    904
    905#ifndef STBI_NO_JPEG
    906static int      stbi__jpeg_test(stbi__context *s);
    907static void    *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
    908static int      stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp);
    909#endif
    910
    911#ifndef STBI_NO_PNG
    912static int      stbi__png_test(stbi__context *s);
    913static void    *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
    914static int      stbi__png_info(stbi__context *s, int *x, int *y, int *comp);
    915static int      stbi__png_is16(stbi__context *s);
    916#endif
    917
    918#ifndef STBI_NO_BMP
    919static int      stbi__bmp_test(stbi__context *s);
    920static void    *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
    921static int      stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp);
    922#endif
    923
    924#ifndef STBI_NO_TGA
    925static int      stbi__tga_test(stbi__context *s);
    926static void    *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
    927static int      stbi__tga_info(stbi__context *s, int *x, int *y, int *comp);
    928#endif
    929
    930#ifndef STBI_NO_PSD
    931static int      stbi__psd_test(stbi__context *s);
    932static void    *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri, int bpc);
    933static int      stbi__psd_info(stbi__context *s, int *x, int *y, int *comp);
    934static int      stbi__psd_is16(stbi__context *s);
    935#endif
    936
    937#ifndef STBI_NO_HDR
    938static int      stbi__hdr_test(stbi__context *s);
    939static float   *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
    940static int      stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp);
    941#endif
    942
    943#ifndef STBI_NO_PIC
    944static int      stbi__pic_test(stbi__context *s);
    945static void    *stbi__pic_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
    946static int      stbi__pic_info(stbi__context *s, int *x, int *y, int *comp);
    947#endif
    948
    949#ifndef STBI_NO_GIF
    950static int      stbi__gif_test(stbi__context *s);
    951static void    *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
    952static void    *stbi__load_gif_main(stbi__context *s, int **delays, int *x, int *y, int *z, int *comp, int req_comp);
    953static int      stbi__gif_info(stbi__context *s, int *x, int *y, int *comp);
    954#endif
    955
    956#ifndef STBI_NO_PNM
    957static int      stbi__pnm_test(stbi__context *s);
    958static void    *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
    959static int      stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp);
    960static int      stbi__pnm_is16(stbi__context *s);
    961#endif
    962
    963static
    964#ifdef STBI_THREAD_LOCAL
    965STBI_THREAD_LOCAL
    966#endif
    967const char *stbi__g_failure_reason;
    968
    969STBIDEF const char *stbi_failure_reason(void)
    970{
    971   return stbi__g_failure_reason;
    972}
    973
    974#ifndef STBI_NO_FAILURE_STRINGS
    975static int stbi__err(const char *str)
    976{
    977   stbi__g_failure_reason = str;
    978   return 0;
    979}
    980#endif
    981
    982static void *stbi__malloc(size_t size)
    983{
    984    return STBI_MALLOC(size);
    985}
    986
    987// stb_image uses ints pervasively, including for offset calculations.
    988// therefore the largest decoded image size we can support with the
    989// current code, even on 64-bit targets, is INT_MAX. this is not a
    990// significant limitation for the intended use case.
    991//
    992// we do, however, need to make sure our size calculations don't
    993// overflow. hence a few helper functions for size calculations that
    994// multiply integers together, making sure that they're non-negative
    995// and no overflow occurs.
    996
    997// return 1 if the sum is valid, 0 on overflow.
    998// negative terms are considered invalid.
    999static int stbi__addsizes_valid(int a, int b)
   1000{
   1001   if (b < 0) return 0;
   1002   // now 0 <= b <= INT_MAX, hence also
   1003   // 0 <= INT_MAX - b <= INTMAX.
   1004   // And "a + b <= INT_MAX" (which might overflow) is the
   1005   // same as a <= INT_MAX - b (no overflow)
   1006   return a <= INT_MAX - b;
   1007}
   1008
   1009// returns 1 if the product is valid, 0 on overflow.
   1010// negative factors are considered invalid.
   1011static int stbi__mul2sizes_valid(int a, int b)
   1012{
   1013   if (a < 0 || b < 0) return 0;
   1014   if (b == 0) return 1; // mul-by-0 is always safe
   1015   // portable way to check for no overflows in a*b
   1016   return a <= INT_MAX/b;
   1017}
   1018
   1019#if !defined(STBI_NO_JPEG) || !defined(STBI_NO_PNG) || !defined(STBI_NO_TGA) || !defined(STBI_NO_HDR)
   1020// returns 1 if "a*b + add" has no negative terms/factors and doesn't overflow
   1021static int stbi__mad2sizes_valid(int a, int b, int add)
   1022{
   1023   return stbi__mul2sizes_valid(a, b) && stbi__addsizes_valid(a*b, add);
   1024}
   1025#endif
   1026
   1027// returns 1 if "a*b*c + add" has no negative terms/factors and doesn't overflow
   1028static int stbi__mad3sizes_valid(int a, int b, int c, int add)
   1029{
   1030   return stbi__mul2sizes_valid(a, b) && stbi__mul2sizes_valid(a*b, c) &&
   1031      stbi__addsizes_valid(a*b*c, add);
   1032}
   1033
   1034// returns 1 if "a*b*c*d + add" has no negative terms/factors and doesn't overflow
   1035#if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR) || !defined(STBI_NO_PNM)
   1036static int stbi__mad4sizes_valid(int a, int b, int c, int d, int add)
   1037{
   1038   return stbi__mul2sizes_valid(a, b) && stbi__mul2sizes_valid(a*b, c) &&
   1039      stbi__mul2sizes_valid(a*b*c, d) && stbi__addsizes_valid(a*b*c*d, add);
   1040}
   1041#endif
   1042
   1043#if !defined(STBI_NO_JPEG) || !defined(STBI_NO_PNG) || !defined(STBI_NO_TGA) || !defined(STBI_NO_HDR)
   1044// mallocs with size overflow checking
   1045static void *stbi__malloc_mad2(int a, int b, int add)
   1046{
   1047   if (!stbi__mad2sizes_valid(a, b, add)) return NULL;
   1048   return stbi__malloc(a*b + add);
   1049}
   1050#endif
   1051
   1052static void *stbi__malloc_mad3(int a, int b, int c, int add)
   1053{
   1054   if (!stbi__mad3sizes_valid(a, b, c, add)) return NULL;
   1055   return stbi__malloc(a*b*c + add);
   1056}
   1057
   1058#if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR) || !defined(STBI_NO_PNM)
   1059static void *stbi__malloc_mad4(int a, int b, int c, int d, int add)
   1060{
   1061   if (!stbi__mad4sizes_valid(a, b, c, d, add)) return NULL;
   1062   return stbi__malloc(a*b*c*d + add);
   1063}
   1064#endif
   1065
   1066// stbi__err - error
   1067// stbi__errpf - error returning pointer to float
   1068// stbi__errpuc - error returning pointer to unsigned char
   1069
   1070#ifdef STBI_NO_FAILURE_STRINGS
   1071   #define stbi__err(x,y)  0
   1072#elif defined(STBI_FAILURE_USERMSG)
   1073   #define stbi__err(x,y)  stbi__err(y)
   1074#else
   1075   #define stbi__err(x,y)  stbi__err(x)
   1076#endif
   1077
   1078#define stbi__errpf(x,y)   ((float *)(size_t) (stbi__err(x,y)?NULL:NULL))
   1079#define stbi__errpuc(x,y)  ((unsigned char *)(size_t) (stbi__err(x,y)?NULL:NULL))
   1080
   1081STBIDEF void stbi_image_free(void *retval_from_stbi_load)
   1082{
   1083   STBI_FREE(retval_from_stbi_load);
   1084}
   1085
   1086#ifndef STBI_NO_LINEAR
   1087static float   *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp);
   1088#endif
   1089
   1090#ifndef STBI_NO_HDR
   1091static stbi_uc *stbi__hdr_to_ldr(float   *data, int x, int y, int comp);
   1092#endif
   1093
   1094static int stbi__vertically_flip_on_load_global = 0;
   1095
   1096STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)
   1097{
   1098   stbi__vertically_flip_on_load_global = flag_true_if_should_flip;
   1099}
   1100
   1101#ifndef STBI_THREAD_LOCAL
   1102#define stbi__vertically_flip_on_load  stbi__vertically_flip_on_load_global
   1103#else
   1104static STBI_THREAD_LOCAL int stbi__vertically_flip_on_load_local, stbi__vertically_flip_on_load_set;
   1105
   1106STBIDEF void stbi_set_flip_vertically_on_load_thread(int flag_true_if_should_flip)
   1107{
   1108   stbi__vertically_flip_on_load_local = flag_true_if_should_flip;
   1109   stbi__vertically_flip_on_load_set = 1;
   1110}
   1111
   1112#define stbi__vertically_flip_on_load  (stbi__vertically_flip_on_load_set       \
   1113                                         ? stbi__vertically_flip_on_load_local  \
   1114                                         : stbi__vertically_flip_on_load_global)
   1115#endif // STBI_THREAD_LOCAL
   1116
   1117static void *stbi__load_main(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri, int bpc)
   1118{
   1119   memset(ri, 0, sizeof(*ri)); // make sure it's initialized if we add new fields
   1120   ri->bits_per_channel = 8; // default is 8 so most paths don't have to be changed
   1121   ri->channel_order = STBI_ORDER_RGB; // all current input & output are this, but this is here so we can add BGR order
   1122   ri->num_channels = 0;
   1123
   1124   // test the formats with a very explicit header first (at least a FOURCC
   1125   // or distinctive magic number first)
   1126   #ifndef STBI_NO_PNG
   1127   if (stbi__png_test(s))  return stbi__png_load(s,x,y,comp,req_comp, ri);
   1128   #endif
   1129   #ifndef STBI_NO_BMP
   1130   if (stbi__bmp_test(s))  return stbi__bmp_load(s,x,y,comp,req_comp, ri);
   1131   #endif
   1132   #ifndef STBI_NO_GIF
   1133   if (stbi__gif_test(s))  return stbi__gif_load(s,x,y,comp,req_comp, ri);
   1134   #endif
   1135   #ifndef STBI_NO_PSD
   1136   if (stbi__psd_test(s))  return stbi__psd_load(s,x,y,comp,req_comp, ri, bpc);
   1137   #else
   1138   STBI_NOTUSED(bpc);
   1139   #endif
   1140   #ifndef STBI_NO_PIC
   1141   if (stbi__pic_test(s))  return stbi__pic_load(s,x,y,comp,req_comp, ri);
   1142   #endif
   1143
   1144   // then the formats that can end up attempting to load with just 1 or 2
   1145   // bytes matching expectations; these are prone to false positives, so
   1146   // try them later
   1147   #ifndef STBI_NO_JPEG
   1148   if (stbi__jpeg_test(s)) return stbi__jpeg_load(s,x,y,comp,req_comp, ri);
   1149   #endif
   1150   #ifndef STBI_NO_PNM
   1151   if (stbi__pnm_test(s))  return stbi__pnm_load(s,x,y,comp,req_comp, ri);
   1152   #endif
   1153
   1154   #ifndef STBI_NO_HDR
   1155   if (stbi__hdr_test(s)) {
   1156      float *hdr = stbi__hdr_load(s, x,y,comp,req_comp, ri);
   1157      return stbi__hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
   1158   }
   1159   #endif
   1160
   1161   #ifndef STBI_NO_TGA
   1162   // test tga last because it's a crappy test!
   1163   if (stbi__tga_test(s))
   1164      return stbi__tga_load(s,x,y,comp,req_comp, ri);
   1165   #endif
   1166
   1167   return stbi__errpuc("unknown image type", "Image not of any known type, or corrupt");
   1168}
   1169
   1170static stbi_uc *stbi__convert_16_to_8(stbi__uint16 *orig, int w, int h, int channels)
   1171{
   1172   int i;
   1173   int img_len = w * h * channels;
   1174   stbi_uc *reduced;
   1175
   1176   reduced = (stbi_uc *) stbi__malloc(img_len);
   1177   if (reduced == NULL) return stbi__errpuc("outofmem", "Out of memory");
   1178
   1179   for (i = 0; i < img_len; ++i)
   1180      reduced[i] = (stbi_uc)((orig[i] >> 8) & 0xFF); // top half of each byte is sufficient approx of 16->8 bit scaling
   1181
   1182   STBI_FREE(orig);
   1183   return reduced;
   1184}
   1185
   1186static stbi__uint16 *stbi__convert_8_to_16(stbi_uc *orig, int w, int h, int channels)
   1187{
   1188   int i;
   1189   int img_len = w * h * channels;
   1190   stbi__uint16 *enlarged;
   1191
   1192   enlarged = (stbi__uint16 *) stbi__malloc(img_len*2);
   1193   if (enlarged == NULL) return (stbi__uint16 *) stbi__errpuc("outofmem", "Out of memory");
   1194
   1195   for (i = 0; i < img_len; ++i)
   1196      enlarged[i] = (stbi__uint16)((orig[i] << 8) + orig[i]); // replicate to high and low byte, maps 0->0, 255->0xffff
   1197
   1198   STBI_FREE(orig);
   1199   return enlarged;
   1200}
   1201
   1202static void stbi__vertical_flip(void *image, int w, int h, int bytes_per_pixel)
   1203{
   1204   int row;
   1205   size_t bytes_per_row = (size_t)w * bytes_per_pixel;
   1206   stbi_uc temp[2048];
   1207   stbi_uc *bytes = (stbi_uc *)image;
   1208
   1209   for (row = 0; row < (h>>1); row++) {
   1210      stbi_uc *row0 = bytes + row*bytes_per_row;
   1211      stbi_uc *row1 = bytes + (h - row - 1)*bytes_per_row;
   1212      // swap row0 with row1
   1213      size_t bytes_left = bytes_per_row;
   1214      while (bytes_left) {
   1215         size_t bytes_copy = (bytes_left < sizeof(temp)) ? bytes_left : sizeof(temp);
   1216         memcpy(temp, row0, bytes_copy);
   1217         memcpy(row0, row1, bytes_copy);
   1218         memcpy(row1, temp, bytes_copy);
   1219         row0 += bytes_copy;
   1220         row1 += bytes_copy;
   1221         bytes_left -= bytes_copy;
   1222      }
   1223   }
   1224}
   1225
   1226#ifndef STBI_NO_GIF
   1227static void stbi__vertical_flip_slices(void *image, int w, int h, int z, int bytes_per_pixel)
   1228{
   1229   int slice;
   1230   int slice_size = w * h * bytes_per_pixel;
   1231
   1232   stbi_uc *bytes = (stbi_uc *)image;
   1233   for (slice = 0; slice < z; ++slice) {
   1234      stbi__vertical_flip(bytes, w, h, bytes_per_pixel);
   1235      bytes += slice_size;
   1236   }
   1237}
   1238#endif
   1239
   1240static unsigned char *stbi__load_and_postprocess_8bit(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   1241{
   1242   stbi__result_info ri;
   1243   void *result = stbi__load_main(s, x, y, comp, req_comp, &ri, 8);
   1244
   1245   if (result == NULL)
   1246      return NULL;
   1247
   1248   // it is the responsibility of the loaders to make sure we get either 8 or 16 bit.
   1249   STBI_ASSERT(ri.bits_per_channel == 8 || ri.bits_per_channel == 16);
   1250
   1251   if (ri.bits_per_channel != 8) {
   1252      result = stbi__convert_16_to_8((stbi__uint16 *) result, *x, *y, req_comp == 0 ? *comp : req_comp);
   1253      ri.bits_per_channel = 8;
   1254   }
   1255
   1256   // @TODO: move stbi__convert_format to here
   1257
   1258   if (stbi__vertically_flip_on_load) {
   1259      int channels = req_comp ? req_comp : *comp;
   1260      stbi__vertical_flip(result, *x, *y, channels * sizeof(stbi_uc));
   1261   }
   1262
   1263   return (unsigned char *) result;
   1264}
   1265
   1266static stbi__uint16 *stbi__load_and_postprocess_16bit(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   1267{
   1268   stbi__result_info ri;
   1269   void *result = stbi__load_main(s, x, y, comp, req_comp, &ri, 16);
   1270
   1271   if (result == NULL)
   1272      return NULL;
   1273
   1274   // it is the responsibility of the loaders to make sure we get either 8 or 16 bit.
   1275   STBI_ASSERT(ri.bits_per_channel == 8 || ri.bits_per_channel == 16);
   1276
   1277   if (ri.bits_per_channel != 16) {
   1278      result = stbi__convert_8_to_16((stbi_uc *) result, *x, *y, req_comp == 0 ? *comp : req_comp);
   1279      ri.bits_per_channel = 16;
   1280   }
   1281
   1282   // @TODO: move stbi__convert_format16 to here
   1283   // @TODO: special case RGB-to-Y (and RGBA-to-YA) for 8-bit-to-16-bit case to keep more precision
   1284
   1285   if (stbi__vertically_flip_on_load) {
   1286      int channels = req_comp ? req_comp : *comp;
   1287      stbi__vertical_flip(result, *x, *y, channels * sizeof(stbi__uint16));
   1288   }
   1289
   1290   return (stbi__uint16 *) result;
   1291}
   1292
   1293#if !defined(STBI_NO_HDR) && !defined(STBI_NO_LINEAR)
   1294static void stbi__float_postprocess(float *result, int *x, int *y, int *comp, int req_comp)
   1295{
   1296   if (stbi__vertically_flip_on_load && result != NULL) {
   1297      int channels = req_comp ? req_comp : *comp;
   1298      stbi__vertical_flip(result, *x, *y, channels * sizeof(float));
   1299   }
   1300}
   1301#endif
   1302
   1303#ifndef STBI_NO_STDIO
   1304
   1305#if defined(_WIN32) && defined(STBI_WINDOWS_UTF8)
   1306STBI_EXTERN __declspec(dllimport) int __stdcall MultiByteToWideChar(unsigned int cp, unsigned long flags, const char *str, int cbmb, wchar_t *widestr, int cchwide);
   1307STBI_EXTERN __declspec(dllimport) int __stdcall WideCharToMultiByte(unsigned int cp, unsigned long flags, const wchar_t *widestr, int cchwide, char *str, int cbmb, const char *defchar, int *used_default);
   1308#endif
   1309
   1310#if defined(_WIN32) && defined(STBI_WINDOWS_UTF8)
   1311STBIDEF int stbi_convert_wchar_to_utf8(char *buffer, size_t bufferlen, const wchar_t* input)
   1312{
   1313	return WideCharToMultiByte(65001 /* UTF8 */, 0, input, -1, buffer, (int) bufferlen, NULL, NULL);
   1314}
   1315#endif
   1316
   1317static FILE *stbi__fopen(char const *filename, char const *mode)
   1318{
   1319   FILE *f;
   1320#if defined(_WIN32) && defined(STBI_WINDOWS_UTF8)
   1321   wchar_t wMode[64];
   1322   wchar_t wFilename[1024];
   1323	if (0 == MultiByteToWideChar(65001 /* UTF8 */, 0, filename, -1, wFilename, sizeof(wFilename)/sizeof(*wFilename)))
   1324      return 0;
   1325
   1326	if (0 == MultiByteToWideChar(65001 /* UTF8 */, 0, mode, -1, wMode, sizeof(wMode)/sizeof(*wMode)))
   1327      return 0;
   1328
   1329#if defined(_MSC_VER) && _MSC_VER >= 1400
   1330	if (0 != _wfopen_s(&f, wFilename, wMode))
   1331		f = 0;
   1332#else
   1333   f = _wfopen(wFilename, wMode);
   1334#endif
   1335
   1336#elif defined(_MSC_VER) && _MSC_VER >= 1400
   1337   if (0 != fopen_s(&f, filename, mode))
   1338      f=0;
   1339#else
   1340   f = fopen(filename, mode);
   1341#endif
   1342   return f;
   1343}
   1344
   1345
   1346STBIDEF stbi_uc *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp)
   1347{
   1348   FILE *f = stbi__fopen(filename, "rb");
   1349   unsigned char *result;
   1350   if (!f) return stbi__errpuc("can't fopen", "Unable to open file");
   1351   result = stbi_load_from_file(f,x,y,comp,req_comp);
   1352   fclose(f);
   1353   return result;
   1354}
   1355
   1356STBIDEF stbi_uc *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
   1357{
   1358   unsigned char *result;
   1359   stbi__context s;
   1360   stbi__start_file(&s,f);
   1361   result = stbi__load_and_postprocess_8bit(&s,x,y,comp,req_comp);
   1362   if (result) {
   1363      // need to 'unget' all the characters in the IO buffer
   1364      fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
   1365   }
   1366   return result;
   1367}
   1368
   1369STBIDEF stbi__uint16 *stbi_load_from_file_16(FILE *f, int *x, int *y, int *comp, int req_comp)
   1370{
   1371   stbi__uint16 *result;
   1372   stbi__context s;
   1373   stbi__start_file(&s,f);
   1374   result = stbi__load_and_postprocess_16bit(&s,x,y,comp,req_comp);
   1375   if (result) {
   1376      // need to 'unget' all the characters in the IO buffer
   1377      fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
   1378   }
   1379   return result;
   1380}
   1381
   1382STBIDEF stbi_us *stbi_load_16(char const *filename, int *x, int *y, int *comp, int req_comp)
   1383{
   1384   FILE *f = stbi__fopen(filename, "rb");
   1385   stbi__uint16 *result;
   1386   if (!f) return (stbi_us *) stbi__errpuc("can't fopen", "Unable to open file");
   1387   result = stbi_load_from_file_16(f,x,y,comp,req_comp);
   1388   fclose(f);
   1389   return result;
   1390}
   1391
   1392
   1393#endif //!STBI_NO_STDIO
   1394
   1395STBIDEF stbi_us *stbi_load_16_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *channels_in_file, int desired_channels)
   1396{
   1397   stbi__context s;
   1398   stbi__start_mem(&s,buffer,len);
   1399   return stbi__load_and_postprocess_16bit(&s,x,y,channels_in_file,desired_channels);
   1400}
   1401
   1402STBIDEF stbi_us *stbi_load_16_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *channels_in_file, int desired_channels)
   1403{
   1404   stbi__context s;
   1405   stbi__start_callbacks(&s, (stbi_io_callbacks *)clbk, user);
   1406   return stbi__load_and_postprocess_16bit(&s,x,y,channels_in_file,desired_channels);
   1407}
   1408
   1409STBIDEF stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
   1410{
   1411   stbi__context s;
   1412   stbi__start_mem(&s,buffer,len);
   1413   return stbi__load_and_postprocess_8bit(&s,x,y,comp,req_comp);
   1414}
   1415
   1416STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
   1417{
   1418   stbi__context s;
   1419   stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   1420   return stbi__load_and_postprocess_8bit(&s,x,y,comp,req_comp);
   1421}
   1422
   1423#ifndef STBI_NO_GIF
   1424STBIDEF stbi_uc *stbi_load_gif_from_memory(stbi_uc const *buffer, int len, int **delays, int *x, int *y, int *z, int *comp, int req_comp)
   1425{
   1426   unsigned char *result;
   1427   stbi__context s;
   1428   stbi__start_mem(&s,buffer,len);
   1429
   1430   result = (unsigned char*) stbi__load_gif_main(&s, delays, x, y, z, comp, req_comp);
   1431   if (stbi__vertically_flip_on_load) {
   1432      stbi__vertical_flip_slices( result, *x, *y, *z, *comp );
   1433   }
   1434
   1435   return result;
   1436}
   1437#endif
   1438
   1439#ifndef STBI_NO_LINEAR
   1440static float *stbi__loadf_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   1441{
   1442   unsigned char *data;
   1443   #ifndef STBI_NO_HDR
   1444   if (stbi__hdr_test(s)) {
   1445      stbi__result_info ri;
   1446      float *hdr_data = stbi__hdr_load(s,x,y,comp,req_comp, &ri);
   1447      if (hdr_data)
   1448         stbi__float_postprocess(hdr_data,x,y,comp,req_comp);
   1449      return hdr_data;
   1450   }
   1451   #endif
   1452   data = stbi__load_and_postprocess_8bit(s, x, y, comp, req_comp);
   1453   if (data)
   1454      return stbi__ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
   1455   return stbi__errpf("unknown image type", "Image not of any known type, or corrupt");
   1456}
   1457
   1458STBIDEF float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
   1459{
   1460   stbi__context s;
   1461   stbi__start_mem(&s,buffer,len);
   1462   return stbi__loadf_main(&s,x,y,comp,req_comp);
   1463}
   1464
   1465STBIDEF float *stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
   1466{
   1467   stbi__context s;
   1468   stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   1469   return stbi__loadf_main(&s,x,y,comp,req_comp);
   1470}
   1471
   1472#ifndef STBI_NO_STDIO
   1473STBIDEF float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
   1474{
   1475   float *result;
   1476   FILE *f = stbi__fopen(filename, "rb");
   1477   if (!f) return stbi__errpf("can't fopen", "Unable to open file");
   1478   result = stbi_loadf_from_file(f,x,y,comp,req_comp);
   1479   fclose(f);
   1480   return result;
   1481}
   1482
   1483STBIDEF float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
   1484{
   1485   stbi__context s;
   1486   stbi__start_file(&s,f);
   1487   return stbi__loadf_main(&s,x,y,comp,req_comp);
   1488}
   1489#endif // !STBI_NO_STDIO
   1490
   1491#endif // !STBI_NO_LINEAR
   1492
   1493// these is-hdr-or-not is defined independent of whether STBI_NO_LINEAR is
   1494// defined, for API simplicity; if STBI_NO_LINEAR is defined, it always
   1495// reports false!
   1496
   1497STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len)
   1498{
   1499   #ifndef STBI_NO_HDR
   1500   stbi__context s;
   1501   stbi__start_mem(&s,buffer,len);
   1502   return stbi__hdr_test(&s);
   1503   #else
   1504   STBI_NOTUSED(buffer);
   1505   STBI_NOTUSED(len);
   1506   return 0;
   1507   #endif
   1508}
   1509
   1510#ifndef STBI_NO_STDIO
   1511STBIDEF int      stbi_is_hdr          (char const *filename)
   1512{
   1513   FILE *f = stbi__fopen(filename, "rb");
   1514   int result=0;
   1515   if (f) {
   1516      result = stbi_is_hdr_from_file(f);
   1517      fclose(f);
   1518   }
   1519   return result;
   1520}
   1521
   1522STBIDEF int stbi_is_hdr_from_file(FILE *f)
   1523{
   1524   #ifndef STBI_NO_HDR
   1525   long pos = ftell(f);
   1526   int res;
   1527   stbi__context s;
   1528   stbi__start_file(&s,f);
   1529   res = stbi__hdr_test(&s);
   1530   fseek(f, pos, SEEK_SET);
   1531   return res;
   1532   #else
   1533   STBI_NOTUSED(f);
   1534   return 0;
   1535   #endif
   1536}
   1537#endif // !STBI_NO_STDIO
   1538
   1539STBIDEF int      stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user)
   1540{
   1541   #ifndef STBI_NO_HDR
   1542   stbi__context s;
   1543   stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   1544   return stbi__hdr_test(&s);
   1545   #else
   1546   STBI_NOTUSED(clbk);
   1547   STBI_NOTUSED(user);
   1548   return 0;
   1549   #endif
   1550}
   1551
   1552#ifndef STBI_NO_LINEAR
   1553static float stbi__l2h_gamma=2.2f, stbi__l2h_scale=1.0f;
   1554
   1555STBIDEF void   stbi_ldr_to_hdr_gamma(float gamma) { stbi__l2h_gamma = gamma; }
   1556STBIDEF void   stbi_ldr_to_hdr_scale(float scale) { stbi__l2h_scale = scale; }
   1557#endif
   1558
   1559static float stbi__h2l_gamma_i=1.0f/2.2f, stbi__h2l_scale_i=1.0f;
   1560
   1561STBIDEF void   stbi_hdr_to_ldr_gamma(float gamma) { stbi__h2l_gamma_i = 1/gamma; }
   1562STBIDEF void   stbi_hdr_to_ldr_scale(float scale) { stbi__h2l_scale_i = 1/scale; }
   1563
   1564
   1565//////////////////////////////////////////////////////////////////////////////
   1566//
   1567// Common code used by all image loaders
   1568//
   1569
   1570enum
   1571{
   1572   STBI__SCAN_load=0,
   1573   STBI__SCAN_type,
   1574   STBI__SCAN_header
   1575};
   1576
   1577static void stbi__refill_buffer(stbi__context *s)
   1578{
   1579   int n = (s->io.read)(s->io_user_data,(char*)s->buffer_start,s->buflen);
   1580   s->callback_already_read += (int) (s->img_buffer - s->img_buffer_original);
   1581   if (n == 0) {
   1582      // at end of file, treat same as if from memory, but need to handle case
   1583      // where s->img_buffer isn't pointing to safe memory, e.g. 0-byte file
   1584      s->read_from_callbacks = 0;
   1585      s->img_buffer = s->buffer_start;
   1586      s->img_buffer_end = s->buffer_start+1;
   1587      *s->img_buffer = 0;
   1588   } else {
   1589      s->img_buffer = s->buffer_start;
   1590      s->img_buffer_end = s->buffer_start + n;
   1591   }
   1592}
   1593
   1594stbi_inline static stbi_uc stbi__get8(stbi__context *s)
   1595{
   1596   if (s->img_buffer < s->img_buffer_end)
   1597      return *s->img_buffer++;
   1598   if (s->read_from_callbacks) {
   1599      stbi__refill_buffer(s);
   1600      return *s->img_buffer++;
   1601   }
   1602   return 0;
   1603}
   1604
   1605#if defined(STBI_NO_JPEG) && defined(STBI_NO_HDR) && defined(STBI_NO_PIC) && defined(STBI_NO_PNM)
   1606// nothing
   1607#else
   1608stbi_inline static int stbi__at_eof(stbi__context *s)
   1609{
   1610   if (s->io.read) {
   1611      if (!(s->io.eof)(s->io_user_data)) return 0;
   1612      // if feof() is true, check if buffer = end
   1613      // special case: we've only got the special 0 character at the end
   1614      if (s->read_from_callbacks == 0) return 1;
   1615   }
   1616
   1617   return s->img_buffer >= s->img_buffer_end;
   1618}
   1619#endif
   1620
   1621#if defined(STBI_NO_JPEG) && defined(STBI_NO_PNG) && defined(STBI_NO_BMP) && defined(STBI_NO_PSD) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF) && defined(STBI_NO_PIC)
   1622// nothing
   1623#else
   1624static void stbi__skip(stbi__context *s, int n)
   1625{
   1626   if (n == 0) return;  // already there!
   1627   if (n < 0) {
   1628      s->img_buffer = s->img_buffer_end;
   1629      return;
   1630   }
   1631   if (s->io.read) {
   1632      int blen = (int) (s->img_buffer_end - s->img_buffer);
   1633      if (blen < n) {
   1634         s->img_buffer = s->img_buffer_end;
   1635         (s->io.skip)(s->io_user_data, n - blen);
   1636         return;
   1637      }
   1638   }
   1639   s->img_buffer += n;
   1640}
   1641#endif
   1642
   1643#if defined(STBI_NO_PNG) && defined(STBI_NO_TGA) && defined(STBI_NO_HDR) && defined(STBI_NO_PNM)
   1644// nothing
   1645#else
   1646static int stbi__getn(stbi__context *s, stbi_uc *buffer, int n)
   1647{
   1648   if (s->io.read) {
   1649      int blen = (int) (s->img_buffer_end - s->img_buffer);
   1650      if (blen < n) {
   1651         int res, count;
   1652
   1653         memcpy(buffer, s->img_buffer, blen);
   1654
   1655         count = (s->io.read)(s->io_user_data, (char*) buffer + blen, n - blen);
   1656         res = (count == (n-blen));
   1657         s->img_buffer = s->img_buffer_end;
   1658         return res;
   1659      }
   1660   }
   1661
   1662   if (s->img_buffer+n <= s->img_buffer_end) {
   1663      memcpy(buffer, s->img_buffer, n);
   1664      s->img_buffer += n;
   1665      return 1;
   1666   } else
   1667      return 0;
   1668}
   1669#endif
   1670
   1671#if defined(STBI_NO_JPEG) && defined(STBI_NO_PNG) && defined(STBI_NO_PSD) && defined(STBI_NO_PIC)
   1672// nothing
   1673#else
   1674static int stbi__get16be(stbi__context *s)
   1675{
   1676   int z = stbi__get8(s);
   1677   return (z << 8) + stbi__get8(s);
   1678}
   1679#endif
   1680
   1681#if defined(STBI_NO_PNG) && defined(STBI_NO_PSD) && defined(STBI_NO_PIC)
   1682// nothing
   1683#else
   1684static stbi__uint32 stbi__get32be(stbi__context *s)
   1685{
   1686   stbi__uint32 z = stbi__get16be(s);
   1687   return (z << 16) + stbi__get16be(s);
   1688}
   1689#endif
   1690
   1691#if defined(STBI_NO_BMP) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF)
   1692// nothing
   1693#else
   1694static int stbi__get16le(stbi__context *s)
   1695{
   1696   int z = stbi__get8(s);
   1697   return z + (stbi__get8(s) << 8);
   1698}
   1699#endif
   1700
   1701#ifndef STBI_NO_BMP
   1702static stbi__uint32 stbi__get32le(stbi__context *s)
   1703{
   1704   stbi__uint32 z = stbi__get16le(s);
   1705   z += (stbi__uint32)stbi__get16le(s) << 16;
   1706   return z;
   1707}
   1708#endif
   1709
   1710#define STBI__BYTECAST(x)  ((stbi_uc) ((x) & 255))  // truncate int to byte without warnings
   1711
   1712#if defined(STBI_NO_JPEG) && defined(STBI_NO_PNG) && defined(STBI_NO_BMP) && defined(STBI_NO_PSD) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF) && defined(STBI_NO_PIC) && defined(STBI_NO_PNM)
   1713// nothing
   1714#else
   1715//////////////////////////////////////////////////////////////////////////////
   1716//
   1717//  generic converter from built-in img_n to req_comp
   1718//    individual types do this automatically as much as possible (e.g. jpeg
   1719//    does all cases internally since it needs to colorspace convert anyway,
   1720//    and it never has alpha, so very few cases ). png can automatically
   1721//    interleave an alpha=255 channel, but falls back to this for other cases
   1722//
   1723//  assume data buffer is malloced, so malloc a new one and free that one
   1724//  only failure mode is malloc failing
   1725
   1726static stbi_uc stbi__compute_y(int r, int g, int b)
   1727{
   1728   return (stbi_uc) (((r*77) + (g*150) +  (29*b)) >> 8);
   1729}
   1730#endif
   1731
   1732#if defined(STBI_NO_PNG) && defined(STBI_NO_BMP) && defined(STBI_NO_PSD) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF) && defined(STBI_NO_PIC) && defined(STBI_NO_PNM)
   1733// nothing
   1734#else
   1735static unsigned char *stbi__convert_format(unsigned char *data, int img_n, int req_comp, unsigned int x, unsigned int y)
   1736{
   1737   int i,j;
   1738   unsigned char *good;
   1739
   1740   if (req_comp == img_n) return data;
   1741   STBI_ASSERT(req_comp >= 1 && req_comp <= 4);
   1742
   1743   good = (unsigned char *) stbi__malloc_mad3(req_comp, x, y, 0);
   1744   if (good == NULL) {
   1745      STBI_FREE(data);
   1746      return stbi__errpuc("outofmem", "Out of memory");
   1747   }
   1748
   1749   for (j=0; j < (int) y; ++j) {
   1750      unsigned char *src  = data + j * x * img_n   ;
   1751      unsigned char *dest = good + j * x * req_comp;
   1752
   1753      #define STBI__COMBO(a,b)  ((a)*8+(b))
   1754      #define STBI__CASE(a,b)   case STBI__COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
   1755      // convert source image with img_n components to one with req_comp components;
   1756      // avoid switch per pixel, so use switch per scanline and massive macros
   1757      switch (STBI__COMBO(img_n, req_comp)) {
   1758         STBI__CASE(1,2) { dest[0]=src[0]; dest[1]=255;                                     } break;
   1759         STBI__CASE(1,3) { dest[0]=dest[1]=dest[2]=src[0];                                  } break;
   1760         STBI__CASE(1,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=255;                     } break;
   1761         STBI__CASE(2,1) { dest[0]=src[0];                                                  } break;
   1762         STBI__CASE(2,3) { dest[0]=dest[1]=dest[2]=src[0];                                  } break;
   1763         STBI__CASE(2,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=src[1];                  } break;
   1764         STBI__CASE(3,4) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2];dest[3]=255;        } break;
   1765         STBI__CASE(3,1) { dest[0]=stbi__compute_y(src[0],src[1],src[2]);                   } break;
   1766         STBI__CASE(3,2) { dest[0]=stbi__compute_y(src[0],src[1],src[2]); dest[1] = 255;    } break;
   1767         STBI__CASE(4,1) { dest[0]=stbi__compute_y(src[0],src[1],src[2]);                   } break;
   1768         STBI__CASE(4,2) { dest[0]=stbi__compute_y(src[0],src[1],src[2]); dest[1] = src[3]; } break;
   1769         STBI__CASE(4,3) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2];                    } break;
   1770         default: STBI_ASSERT(0); STBI_FREE(data); STBI_FREE(good); return stbi__errpuc("unsupported", "Unsupported format conversion");
   1771      }
   1772      #undef STBI__CASE
   1773   }
   1774
   1775   STBI_FREE(data);
   1776   return good;
   1777}
   1778#endif
   1779
   1780#if defined(STBI_NO_PNG) && defined(STBI_NO_PSD)
   1781// nothing
   1782#else
   1783static stbi__uint16 stbi__compute_y_16(int r, int g, int b)
   1784{
   1785   return (stbi__uint16) (((r*77) + (g*150) +  (29*b)) >> 8);
   1786}
   1787#endif
   1788
   1789#if defined(STBI_NO_PNG) && defined(STBI_NO_PSD)
   1790// nothing
   1791#else
   1792static stbi__uint16 *stbi__convert_format16(stbi__uint16 *data, int img_n, int req_comp, unsigned int x, unsigned int y)
   1793{
   1794   int i,j;
   1795   stbi__uint16 *good;
   1796
   1797   if (req_comp == img_n) return data;
   1798   STBI_ASSERT(req_comp >= 1 && req_comp <= 4);
   1799
   1800   good = (stbi__uint16 *) stbi__malloc(req_comp * x * y * 2);
   1801   if (good == NULL) {
   1802      STBI_FREE(data);
   1803      return (stbi__uint16 *) stbi__errpuc("outofmem", "Out of memory");
   1804   }
   1805
   1806   for (j=0; j < (int) y; ++j) {
   1807      stbi__uint16 *src  = data + j * x * img_n   ;
   1808      stbi__uint16 *dest = good + j * x * req_comp;
   1809
   1810      #define STBI__COMBO(a,b)  ((a)*8+(b))
   1811      #define STBI__CASE(a,b)   case STBI__COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
   1812      // convert source image with img_n components to one with req_comp components;
   1813      // avoid switch per pixel, so use switch per scanline and massive macros
   1814      switch (STBI__COMBO(img_n, req_comp)) {
   1815         STBI__CASE(1,2) { dest[0]=src[0]; dest[1]=0xffff;                                     } break;
   1816         STBI__CASE(1,3) { dest[0]=dest[1]=dest[2]=src[0];                                     } break;
   1817         STBI__CASE(1,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=0xffff;                     } break;
   1818         STBI__CASE(2,1) { dest[0]=src[0];                                                     } break;
   1819         STBI__CASE(2,3) { dest[0]=dest[1]=dest[2]=src[0];                                     } break;
   1820         STBI__CASE(2,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=src[1];                     } break;
   1821         STBI__CASE(3,4) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2];dest[3]=0xffff;        } break;
   1822         STBI__CASE(3,1) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]);                   } break;
   1823         STBI__CASE(3,2) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]); dest[1] = 0xffff; } break;
   1824         STBI__CASE(4,1) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]);                   } break;
   1825         STBI__CASE(4,2) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]); dest[1] = src[3]; } break;
   1826         STBI__CASE(4,3) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2];                       } break;
   1827         default: STBI_ASSERT(0); STBI_FREE(data); STBI_FREE(good); return (stbi__uint16*) stbi__errpuc("unsupported", "Unsupported format conversion");
   1828      }
   1829      #undef STBI__CASE
   1830   }
   1831
   1832   STBI_FREE(data);
   1833   return good;
   1834}
   1835#endif
   1836
   1837#ifndef STBI_NO_LINEAR
   1838static float   *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp)
   1839{
   1840   int i,k,n;
   1841   float *output;
   1842   if (!data) return NULL;
   1843   output = (float *) stbi__malloc_mad4(x, y, comp, sizeof(float), 0);
   1844   if (output == NULL) { STBI_FREE(data); return stbi__errpf("outofmem", "Out of memory"); }
   1845   // compute number of non-alpha components
   1846   if (comp & 1) n = comp; else n = comp-1;
   1847   for (i=0; i < x*y; ++i) {
   1848      for (k=0; k < n; ++k) {
   1849         output[i*comp + k] = (float) (pow(data[i*comp+k]/255.0f, stbi__l2h_gamma) * stbi__l2h_scale);
   1850      }
   1851   }
   1852   if (n < comp) {
   1853      for (i=0; i < x*y; ++i) {
   1854         output[i*comp + n] = data[i*comp + n]/255.0f;
   1855      }
   1856   }
   1857   STBI_FREE(data);
   1858   return output;
   1859}
   1860#endif
   1861
   1862#ifndef STBI_NO_HDR
   1863#define stbi__float2int(x)   ((int) (x))
   1864static stbi_uc *stbi__hdr_to_ldr(float   *data, int x, int y, int comp)
   1865{
   1866   int i,k,n;
   1867   stbi_uc *output;
   1868   if (!data) return NULL;
   1869   output = (stbi_uc *) stbi__malloc_mad3(x, y, comp, 0);
   1870   if (output == NULL) { STBI_FREE(data); return stbi__errpuc("outofmem", "Out of memory"); }
   1871   // compute number of non-alpha components
   1872   if (comp & 1) n = comp; else n = comp-1;
   1873   for (i=0; i < x*y; ++i) {
   1874      for (k=0; k < n; ++k) {
   1875         float z = (float) pow(data[i*comp+k]*stbi__h2l_scale_i, stbi__h2l_gamma_i) * 255 + 0.5f;
   1876         if (z < 0) z = 0;
   1877         if (z > 255) z = 255;
   1878         output[i*comp + k] = (stbi_uc) stbi__float2int(z);
   1879      }
   1880      if (k < comp) {
   1881         float z = data[i*comp+k] * 255 + 0.5f;
   1882         if (z < 0) z = 0;
   1883         if (z > 255) z = 255;
   1884         output[i*comp + k] = (stbi_uc) stbi__float2int(z);
   1885      }
   1886   }
   1887   STBI_FREE(data);
   1888   return output;
   1889}
   1890#endif
   1891
   1892//////////////////////////////////////////////////////////////////////////////
   1893//
   1894//  "baseline" JPEG/JFIF decoder
   1895//
   1896//    simple implementation
   1897//      - doesn't support delayed output of y-dimension
   1898//      - simple interface (only one output format: 8-bit interleaved RGB)
   1899//      - doesn't try to recover corrupt jpegs
   1900//      - doesn't allow partial loading, loading multiple at once
   1901//      - still fast on x86 (copying globals into locals doesn't help x86)
   1902//      - allocates lots of intermediate memory (full size of all components)
   1903//        - non-interleaved case requires this anyway
   1904//        - allows good upsampling (see next)
   1905//    high-quality
   1906//      - upsampled channels are bilinearly interpolated, even across blocks
   1907//      - quality integer IDCT derived from IJG's 'slow'
   1908//    performance
   1909//      - fast huffman; reasonable integer IDCT
   1910//      - some SIMD kernels for common paths on targets with SSE2/NEON
   1911//      - uses a lot of intermediate memory, could cache poorly
   1912
   1913#ifndef STBI_NO_JPEG
   1914
   1915// huffman decoding acceleration
   1916#define FAST_BITS   9  // larger handles more cases; smaller stomps less cache
   1917
   1918typedef struct
   1919{
   1920   stbi_uc  fast[1 << FAST_BITS];
   1921   // weirdly, repacking this into AoS is a 10% speed loss, instead of a win
   1922   stbi__uint16 code[256];
   1923   stbi_uc  values[256];
   1924   stbi_uc  size[257];
   1925   unsigned int maxcode[18];
   1926   int    delta[17];   // old 'firstsymbol' - old 'firstcode'
   1927} stbi__huffman;
   1928
   1929typedef struct
   1930{
   1931   stbi__context *s;
   1932   stbi__huffman huff_dc[4];
   1933   stbi__huffman huff_ac[4];
   1934   stbi__uint16 dequant[4][64];
   1935   stbi__int16 fast_ac[4][1 << FAST_BITS];
   1936
   1937// sizes for components, interleaved MCUs
   1938   int img_h_max, img_v_max;
   1939   int img_mcu_x, img_mcu_y;
   1940   int img_mcu_w, img_mcu_h;
   1941
   1942// definition of jpeg image component
   1943   struct
   1944   {
   1945      int id;
   1946      int h,v;
   1947      int tq;
   1948      int hd,ha;
   1949      int dc_pred;
   1950
   1951      int x,y,w2,h2;
   1952      stbi_uc *data;
   1953      void *raw_data, *raw_coeff;
   1954      stbi_uc *linebuf;
   1955      short   *coeff;   // progressive only
   1956      int      coeff_w, coeff_h; // number of 8x8 coefficient blocks
   1957   } img_comp[4];
   1958
   1959   stbi__uint32   code_buffer; // jpeg entropy-coded buffer
   1960   int            code_bits;   // number of valid bits
   1961   unsigned char  marker;      // marker seen while filling entropy buffer
   1962   int            nomore;      // flag if we saw a marker so must stop
   1963
   1964   int            progressive;
   1965   int            spec_start;
   1966   int            spec_end;
   1967   int            succ_high;
   1968   int            succ_low;
   1969   int            eob_run;
   1970   int            jfif;
   1971   int            app14_color_transform; // Adobe APP14 tag
   1972   int            rgb;
   1973
   1974   int scan_n, order[4];
   1975   int restart_interval, todo;
   1976
   1977// kernels
   1978   void (*idct_block_kernel)(stbi_uc *out, int out_stride, short data[64]);
   1979   void (*YCbCr_to_RGB_kernel)(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step);
   1980   stbi_uc *(*resample_row_hv_2_kernel)(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs);
   1981} stbi__jpeg;
   1982
   1983static int stbi__build_huffman(stbi__huffman *h, int *count)
   1984{
   1985   int i,j,k=0;
   1986   unsigned int code;
   1987   // build size list for each symbol (from JPEG spec)
   1988   for (i=0; i < 16; ++i)
   1989      for (j=0; j < count[i]; ++j)
   1990         h->size[k++] = (stbi_uc) (i+1);
   1991   h->size[k] = 0;
   1992
   1993   // compute actual symbols (from jpeg spec)
   1994   code = 0;
   1995   k = 0;
   1996   for(j=1; j <= 16; ++j) {
   1997      // compute delta to add to code to compute symbol id
   1998      h->delta[j] = k - code;
   1999      if (h->size[k] == j) {
   2000         while (h->size[k] == j)
   2001            h->code[k++] = (stbi__uint16) (code++);
   2002         if (code-1 >= (1u << j)) return stbi__err("bad code lengths","Corrupt JPEG");
   2003      }
   2004      // compute largest code + 1 for this size, preshifted as needed later
   2005      h->maxcode[j] = code << (16-j);
   2006      code <<= 1;
   2007   }
   2008   h->maxcode[j] = 0xffffffff;
   2009
   2010   // build non-spec acceleration table; 255 is flag for not-accelerated
   2011   memset(h->fast, 255, 1 << FAST_BITS);
   2012   for (i=0; i < k; ++i) {
   2013      int s = h->size[i];
   2014      if (s <= FAST_BITS) {
   2015         int c = h->code[i] << (FAST_BITS-s);
   2016         int m = 1 << (FAST_BITS-s);
   2017         for (j=0; j < m; ++j) {
   2018            h->fast[c+j] = (stbi_uc) i;
   2019         }
   2020      }
   2021   }
   2022   return 1;
   2023}
   2024
   2025// build a table that decodes both magnitude and value of small ACs in
   2026// one go.
   2027static void stbi__build_fast_ac(stbi__int16 *fast_ac, stbi__huffman *h)
   2028{
   2029   int i;
   2030   for (i=0; i < (1 << FAST_BITS); ++i) {
   2031      stbi_uc fast = h->fast[i];
   2032      fast_ac[i] = 0;
   2033      if (fast < 255) {
   2034         int rs = h->values[fast];
   2035         int run = (rs >> 4) & 15;
   2036         int magbits = rs & 15;
   2037         int len = h->size[fast];
   2038
   2039         if (magbits && len + magbits <= FAST_BITS) {
   2040            // magnitude code followed by receive_extend code
   2041            int k = ((i << len) & ((1 << FAST_BITS) - 1)) >> (FAST_BITS - magbits);
   2042            int m = 1 << (magbits - 1);
   2043            if (k < m) k += (~0U << magbits) + 1;
   2044            // if the result is small enough, we can fit it in fast_ac table
   2045            if (k >= -128 && k <= 127)
   2046               fast_ac[i] = (stbi__int16) ((k * 256) + (run * 16) + (len + magbits));
   2047         }
   2048      }
   2049   }
   2050}
   2051
   2052static void stbi__grow_buffer_unsafe(stbi__jpeg *j)
   2053{
   2054   do {
   2055      unsigned int b = j->nomore ? 0 : stbi__get8(j->s);
   2056      if (b == 0xff) {
   2057         int c = stbi__get8(j->s);
   2058         while (c == 0xff) c = stbi__get8(j->s); // consume fill bytes
   2059         if (c != 0) {
   2060            j->marker = (unsigned char) c;
   2061            j->nomore = 1;
   2062            return;
   2063         }
   2064      }
   2065      j->code_buffer |= b << (24 - j->code_bits);
   2066      j->code_bits += 8;
   2067   } while (j->code_bits <= 24);
   2068}
   2069
   2070// (1 << n) - 1
   2071static const stbi__uint32 stbi__bmask[17]={0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535};
   2072
   2073// decode a jpeg huffman value from the bitstream
   2074stbi_inline static int stbi__jpeg_huff_decode(stbi__jpeg *j, stbi__huffman *h)
   2075{
   2076   unsigned int temp;
   2077   int c,k;
   2078
   2079   if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
   2080
   2081   // look at the top FAST_BITS and determine what symbol ID it is,
   2082   // if the code is <= FAST_BITS
   2083   c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
   2084   k = h->fast[c];
   2085   if (k < 255) {
   2086      int s = h->size[k];
   2087      if (s > j->code_bits)
   2088         return -1;
   2089      j->code_buffer <<= s;
   2090      j->code_bits -= s;
   2091      return h->values[k];
   2092   }
   2093
   2094   // naive test is to shift the code_buffer down so k bits are
   2095   // valid, then test against maxcode. To speed this up, we've
   2096   // preshifted maxcode left so that it has (16-k) 0s at the
   2097   // end; in other words, regardless of the number of bits, it
   2098   // wants to be compared against something shifted to have 16;
   2099   // that way we don't need to shift inside the loop.
   2100   temp = j->code_buffer >> 16;
   2101   for (k=FAST_BITS+1 ; ; ++k)
   2102      if (temp < h->maxcode[k])
   2103         break;
   2104   if (k == 17) {
   2105      // error! code not found
   2106      j->code_bits -= 16;
   2107      return -1;
   2108   }
   2109
   2110   if (k > j->code_bits)
   2111      return -1;
   2112
   2113   // convert the huffman code to the symbol id
   2114   c = ((j->code_buffer >> (32 - k)) & stbi__bmask[k]) + h->delta[k];
   2115   STBI_ASSERT((((j->code_buffer) >> (32 - h->size[c])) & stbi__bmask[h->size[c]]) == h->code[c]);
   2116
   2117   // convert the id to a symbol
   2118   j->code_bits -= k;
   2119   j->code_buffer <<= k;
   2120   return h->values[c];
   2121}
   2122
   2123// bias[n] = (-1<<n) + 1
   2124static const int stbi__jbias[16] = {0,-1,-3,-7,-15,-31,-63,-127,-255,-511,-1023,-2047,-4095,-8191,-16383,-32767};
   2125
   2126// combined JPEG 'receive' and JPEG 'extend', since baseline
   2127// always extends everything it receives.
   2128stbi_inline static int stbi__extend_receive(stbi__jpeg *j, int n)
   2129{
   2130   unsigned int k;
   2131   int sgn;
   2132   if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
   2133
   2134   sgn = j->code_buffer >> 31; // sign bit always in MSB; 0 if MSB clear (positive), 1 if MSB set (negative)
   2135   k = stbi_lrot(j->code_buffer, n);
   2136   j->code_buffer = k & ~stbi__bmask[n];
   2137   k &= stbi__bmask[n];
   2138   j->code_bits -= n;
   2139   return k + (stbi__jbias[n] & (sgn - 1));
   2140}
   2141
   2142// get some unsigned bits
   2143stbi_inline static int stbi__jpeg_get_bits(stbi__jpeg *j, int n)
   2144{
   2145   unsigned int k;
   2146   if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
   2147   k = stbi_lrot(j->code_buffer, n);
   2148   j->code_buffer = k & ~stbi__bmask[n];
   2149   k &= stbi__bmask[n];
   2150   j->code_bits -= n;
   2151   return k;
   2152}
   2153
   2154stbi_inline static int stbi__jpeg_get_bit(stbi__jpeg *j)
   2155{
   2156   unsigned int k;
   2157   if (j->code_bits < 1) stbi__grow_buffer_unsafe(j);
   2158   k = j->code_buffer;
   2159   j->code_buffer <<= 1;
   2160   --j->code_bits;
   2161   return k & 0x80000000;
   2162}
   2163
   2164// given a value that's at position X in the zigzag stream,
   2165// where does it appear in the 8x8 matrix coded as row-major?
   2166static const stbi_uc stbi__jpeg_dezigzag[64+15] =
   2167{
   2168    0,  1,  8, 16,  9,  2,  3, 10,
   2169   17, 24, 32, 25, 18, 11,  4,  5,
   2170   12, 19, 26, 33, 40, 48, 41, 34,
   2171   27, 20, 13,  6,  7, 14, 21, 28,
   2172   35, 42, 49, 56, 57, 50, 43, 36,
   2173   29, 22, 15, 23, 30, 37, 44, 51,
   2174   58, 59, 52, 45, 38, 31, 39, 46,
   2175   53, 60, 61, 54, 47, 55, 62, 63,
   2176   // let corrupt input sample past end
   2177   63, 63, 63, 63, 63, 63, 63, 63,
   2178   63, 63, 63, 63, 63, 63, 63
   2179};
   2180
   2181// decode one 64-entry block--
   2182static int stbi__jpeg_decode_block(stbi__jpeg *j, short data[64], stbi__huffman *hdc, stbi__huffman *hac, stbi__int16 *fac, int b, stbi__uint16 *dequant)
   2183{
   2184   int diff,dc,k;
   2185   int t;
   2186
   2187   if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
   2188   t = stbi__jpeg_huff_decode(j, hdc);
   2189   if (t < 0 || t > 15) return stbi__err("bad huffman code","Corrupt JPEG");
   2190
   2191   // 0 all the ac values now so we can do it 32-bits at a time
   2192   memset(data,0,64*sizeof(data[0]));
   2193
   2194   diff = t ? stbi__extend_receive(j, t) : 0;
   2195   dc = j->img_comp[b].dc_pred + diff;
   2196   j->img_comp[b].dc_pred = dc;
   2197   data[0] = (short) (dc * dequant[0]);
   2198
   2199   // decode AC components, see JPEG spec
   2200   k = 1;
   2201   do {
   2202      unsigned int zig;
   2203      int c,r,s;
   2204      if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
   2205      c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
   2206      r = fac[c];
   2207      if (r) { // fast-AC path
   2208         k += (r >> 4) & 15; // run
   2209         s = r & 15; // combined length
   2210         j->code_buffer <<= s;
   2211         j->code_bits -= s;
   2212         // decode into unzigzag'd location
   2213         zig = stbi__jpeg_dezigzag[k++];
   2214         data[zig] = (short) ((r >> 8) * dequant[zig]);
   2215      } else {
   2216         int rs = stbi__jpeg_huff_decode(j, hac);
   2217         if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
   2218         s = rs & 15;
   2219         r = rs >> 4;
   2220         if (s == 0) {
   2221            if (rs != 0xf0) break; // end block
   2222            k += 16;
   2223         } else {
   2224            k += r;
   2225            // decode into unzigzag'd location
   2226            zig = stbi__jpeg_dezigzag[k++];
   2227            data[zig] = (short) (stbi__extend_receive(j,s) * dequant[zig]);
   2228         }
   2229      }
   2230   } while (k < 64);
   2231   return 1;
   2232}
   2233
   2234static int stbi__jpeg_decode_block_prog_dc(stbi__jpeg *j, short data[64], stbi__huffman *hdc, int b)
   2235{
   2236   int diff,dc;
   2237   int t;
   2238   if (j->spec_end != 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
   2239
   2240   if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
   2241
   2242   if (j->succ_high == 0) {
   2243      // first scan for DC coefficient, must be first
   2244      memset(data,0,64*sizeof(data[0])); // 0 all the ac values now
   2245      t = stbi__jpeg_huff_decode(j, hdc);
   2246      if (t < 0 || t > 15) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
   2247      diff = t ? stbi__extend_receive(j, t) : 0;
   2248
   2249      dc = j->img_comp[b].dc_pred + diff;
   2250      j->img_comp[b].dc_pred = dc;
   2251      data[0] = (short) (dc * (1 << j->succ_low));
   2252   } else {
   2253      // refinement scan for DC coefficient
   2254      if (stbi__jpeg_get_bit(j))
   2255         data[0] += (short) (1 << j->succ_low);
   2256   }
   2257   return 1;
   2258}
   2259
   2260// @OPTIMIZE: store non-zigzagged during the decode passes,
   2261// and only de-zigzag when dequantizing
   2262static int stbi__jpeg_decode_block_prog_ac(stbi__jpeg *j, short data[64], stbi__huffman *hac, stbi__int16 *fac)
   2263{
   2264   int k;
   2265   if (j->spec_start == 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
   2266
   2267   if (j->succ_high == 0) {
   2268      int shift = j->succ_low;
   2269
   2270      if (j->eob_run) {
   2271         --j->eob_run;
   2272         return 1;
   2273      }
   2274
   2275      k = j->spec_start;
   2276      do {
   2277         unsigned int zig;
   2278         int c,r,s;
   2279         if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
   2280         c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
   2281         r = fac[c];
   2282         if (r) { // fast-AC path
   2283            k += (r >> 4) & 15; // run
   2284            s = r & 15; // combined length
   2285            j->code_buffer <<= s;
   2286            j->code_bits -= s;
   2287            zig = stbi__jpeg_dezigzag[k++];
   2288            data[zig] = (short) ((r >> 8) * (1 << shift));
   2289         } else {
   2290            int rs = stbi__jpeg_huff_decode(j, hac);
   2291            if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
   2292            s = rs & 15;
   2293            r = rs >> 4;
   2294            if (s == 0) {
   2295               if (r < 15) {
   2296                  j->eob_run = (1 << r);
   2297                  if (r)
   2298                     j->eob_run += stbi__jpeg_get_bits(j, r);
   2299                  --j->eob_run;
   2300                  break;
   2301               }
   2302               k += 16;
   2303            } else {
   2304               k += r;
   2305               zig = stbi__jpeg_dezigzag[k++];
   2306               data[zig] = (short) (stbi__extend_receive(j,s) * (1 << shift));
   2307            }
   2308         }
   2309      } while (k <= j->spec_end);
   2310   } else {
   2311      // refinement scan for these AC coefficients
   2312
   2313      short bit = (short) (1 << j->succ_low);
   2314
   2315      if (j->eob_run) {
   2316         --j->eob_run;
   2317         for (k = j->spec_start; k <= j->spec_end; ++k) {
   2318            short *p = &data[stbi__jpeg_dezigzag[k]];
   2319            if (*p != 0)
   2320               if (stbi__jpeg_get_bit(j))
   2321                  if ((*p & bit)==0) {
   2322                     if (*p > 0)
   2323                        *p += bit;
   2324                     else
   2325                        *p -= bit;
   2326                  }
   2327         }
   2328      } else {
   2329         k = j->spec_start;
   2330         do {
   2331            int r,s;
   2332            int rs = stbi__jpeg_huff_decode(j, hac); // @OPTIMIZE see if we can use the fast path here, advance-by-r is so slow, eh
   2333            if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
   2334            s = rs & 15;
   2335            r = rs >> 4;
   2336            if (s == 0) {
   2337               if (r < 15) {
   2338                  j->eob_run = (1 << r) - 1;
   2339                  if (r)
   2340                     j->eob_run += stbi__jpeg_get_bits(j, r);
   2341                  r = 64; // force end of block
   2342               } else {
   2343                  // r=15 s=0 should write 16 0s, so we just do
   2344                  // a run of 15 0s and then write s (which is 0),
   2345                  // so we don't have to do anything special here
   2346               }
   2347            } else {
   2348               if (s != 1) return stbi__err("bad huffman code", "Corrupt JPEG");
   2349               // sign bit
   2350               if (stbi__jpeg_get_bit(j))
   2351                  s = bit;
   2352               else
   2353                  s = -bit;
   2354            }
   2355
   2356            // advance by r
   2357            while (k <= j->spec_end) {
   2358               short *p = &data[stbi__jpeg_dezigzag[k++]];
   2359               if (*p != 0) {
   2360                  if (stbi__jpeg_get_bit(j))
   2361                     if ((*p & bit)==0) {
   2362                        if (*p > 0)
   2363                           *p += bit;
   2364                        else
   2365                           *p -= bit;
   2366                     }
   2367               } else {
   2368                  if (r == 0) {
   2369                     *p = (short) s;
   2370                     break;
   2371                  }
   2372                  --r;
   2373               }
   2374            }
   2375         } while (k <= j->spec_end);
   2376      }
   2377   }
   2378   return 1;
   2379}
   2380
   2381// take a -128..127 value and stbi__clamp it and convert to 0..255
   2382stbi_inline static stbi_uc stbi__clamp(int x)
   2383{
   2384   // trick to use a single test to catch both cases
   2385   if ((unsigned int) x > 255) {
   2386      if (x < 0) return 0;
   2387      if (x > 255) return 255;
   2388   }
   2389   return (stbi_uc) x;
   2390}
   2391
   2392#define stbi__f2f(x)  ((int) (((x) * 4096 + 0.5)))
   2393#define stbi__fsh(x)  ((x) * 4096)
   2394
   2395// derived from jidctint -- DCT_ISLOW
   2396#define STBI__IDCT_1D(s0,s1,s2,s3,s4,s5,s6,s7) \
   2397   int t0,t1,t2,t3,p1,p2,p3,p4,p5,x0,x1,x2,x3; \
   2398   p2 = s2;                                    \
   2399   p3 = s6;                                    \
   2400   p1 = (p2+p3) * stbi__f2f(0.5411961f);       \
   2401   t2 = p1 + p3*stbi__f2f(-1.847759065f);      \
   2402   t3 = p1 + p2*stbi__f2f( 0.765366865f);      \
   2403   p2 = s0;                                    \
   2404   p3 = s4;                                    \
   2405   t0 = stbi__fsh(p2+p3);                      \
   2406   t1 = stbi__fsh(p2-p3);                      \
   2407   x0 = t0+t3;                                 \
   2408   x3 = t0-t3;                                 \
   2409   x1 = t1+t2;                                 \
   2410   x2 = t1-t2;                                 \
   2411   t0 = s7;                                    \
   2412   t1 = s5;                                    \
   2413   t2 = s3;                                    \
   2414   t3 = s1;                                    \
   2415   p3 = t0+t2;                                 \
   2416   p4 = t1+t3;                                 \
   2417   p1 = t0+t3;                                 \
   2418   p2 = t1+t2;                                 \
   2419   p5 = (p3+p4)*stbi__f2f( 1.175875602f);      \
   2420   t0 = t0*stbi__f2f( 0.298631336f);           \
   2421   t1 = t1*stbi__f2f( 2.053119869f);           \
   2422   t2 = t2*stbi__f2f( 3.072711026f);           \
   2423   t3 = t3*stbi__f2f( 1.501321110f);           \
   2424   p1 = p5 + p1*stbi__f2f(-0.899976223f);      \
   2425   p2 = p5 + p2*stbi__f2f(-2.562915447f);      \
   2426   p3 = p3*stbi__f2f(-1.961570560f);           \
   2427   p4 = p4*stbi__f2f(-0.390180644f);           \
   2428   t3 += p1+p4;                                \
   2429   t2 += p2+p3;                                \
   2430   t1 += p2+p4;                                \
   2431   t0 += p1+p3;
   2432
   2433static void stbi__idct_block(stbi_uc *out, int out_stride, short data[64])
   2434{
   2435   int i,val[64],*v=val;
   2436   stbi_uc *o;
   2437   short *d = data;
   2438
   2439   // columns
   2440   for (i=0; i < 8; ++i,++d, ++v) {
   2441      // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing
   2442      if (d[ 8]==0 && d[16]==0 && d[24]==0 && d[32]==0
   2443           && d[40]==0 && d[48]==0 && d[56]==0) {
   2444         //    no shortcut                 0     seconds
   2445         //    (1|2|3|4|5|6|7)==0          0     seconds
   2446         //    all separate               -0.047 seconds
   2447         //    1 && 2|3 && 4|5 && 6|7:    -0.047 seconds
   2448         int dcterm = d[0]*4;
   2449         v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm;
   2450      } else {
   2451         STBI__IDCT_1D(d[ 0],d[ 8],d[16],d[24],d[32],d[40],d[48],d[56])
   2452         // constants scaled things up by 1<<12; let's bring them back
   2453         // down, but keep 2 extra bits of precision
   2454         x0 += 512; x1 += 512; x2 += 512; x3 += 512;
   2455         v[ 0] = (x0+t3) >> 10;
   2456         v[56] = (x0-t3) >> 10;
   2457         v[ 8] = (x1+t2) >> 10;
   2458         v[48] = (x1-t2) >> 10;
   2459         v[16] = (x2+t1) >> 10;
   2460         v[40] = (x2-t1) >> 10;
   2461         v[24] = (x3+t0) >> 10;
   2462         v[32] = (x3-t0) >> 10;
   2463      }
   2464   }
   2465
   2466   for (i=0, v=val, o=out; i < 8; ++i,v+=8,o+=out_stride) {
   2467      // no fast case since the first 1D IDCT spread components out
   2468      STBI__IDCT_1D(v[0],v[1],v[2],v[3],v[4],v[5],v[6],v[7])
   2469      // constants scaled things up by 1<<12, plus we had 1<<2 from first
   2470      // loop, plus horizontal and vertical each scale by sqrt(8) so together
   2471      // we've got an extra 1<<3, so 1<<17 total we need to remove.
   2472      // so we want to round that, which means adding 0.5 * 1<<17,
   2473      // aka 65536. Also, we'll end up with -128 to 127 that we want
   2474      // to encode as 0..255 by adding 128, so we'll add that before the shift
   2475      x0 += 65536 + (128<<17);
   2476      x1 += 65536 + (128<<17);
   2477      x2 += 65536 + (128<<17);
   2478      x3 += 65536 + (128<<17);
   2479      // tried computing the shifts into temps, or'ing the temps to see
   2480      // if any were out of range, but that was slower
   2481      o[0] = stbi__clamp((x0+t3) >> 17);
   2482      o[7] = stbi__clamp((x0-t3) >> 17);
   2483      o[1] = stbi__clamp((x1+t2) >> 17);
   2484      o[6] = stbi__clamp((x1-t2) >> 17);
   2485      o[2] = stbi__clamp((x2+t1) >> 17);
   2486      o[5] = stbi__clamp((x2-t1) >> 17);
   2487      o[3] = stbi__clamp((x3+t0) >> 17);
   2488      o[4] = stbi__clamp((x3-t0) >> 17);
   2489   }
   2490}
   2491
   2492#ifdef STBI_SSE2
   2493// sse2 integer IDCT. not the fastest possible implementation but it
   2494// produces bit-identical results to the generic C version so it's
   2495// fully "transparent".
   2496static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
   2497{
   2498   // This is constructed to match our regular (generic) integer IDCT exactly.
   2499   __m128i row0, row1, row2, row3, row4, row5, row6, row7;
   2500   __m128i tmp;
   2501
   2502   // dot product constant: even elems=x, odd elems=y
   2503   #define dct_const(x,y)  _mm_setr_epi16((x),(y),(x),(y),(x),(y),(x),(y))
   2504
   2505   // out(0) = c0[even]*x + c0[odd]*y   (c0, x, y 16-bit, out 32-bit)
   2506   // out(1) = c1[even]*x + c1[odd]*y
   2507   #define dct_rot(out0,out1, x,y,c0,c1) \
   2508      __m128i c0##lo = _mm_unpacklo_epi16((x),(y)); \
   2509      __m128i c0##hi = _mm_unpackhi_epi16((x),(y)); \
   2510      __m128i out0##_l = _mm_madd_epi16(c0##lo, c0); \
   2511      __m128i out0##_h = _mm_madd_epi16(c0##hi, c0); \
   2512      __m128i out1##_l = _mm_madd_epi16(c0##lo, c1); \
   2513      __m128i out1##_h = _mm_madd_epi16(c0##hi, c1)
   2514
   2515   // out = in << 12  (in 16-bit, out 32-bit)
   2516   #define dct_widen(out, in) \
   2517      __m128i out##_l = _mm_srai_epi32(_mm_unpacklo_epi16(_mm_setzero_si128(), (in)), 4); \
   2518      __m128i out##_h = _mm_srai_epi32(_mm_unpackhi_epi16(_mm_setzero_si128(), (in)), 4)
   2519
   2520   // wide add
   2521   #define dct_wadd(out, a, b) \
   2522      __m128i out##_l = _mm_add_epi32(a##_l, b##_l); \
   2523      __m128i out##_h = _mm_add_epi32(a##_h, b##_h)
   2524
   2525   // wide sub
   2526   #define dct_wsub(out, a, b) \
   2527      __m128i out##_l = _mm_sub_epi32(a##_l, b##_l); \
   2528      __m128i out##_h = _mm_sub_epi32(a##_h, b##_h)
   2529
   2530   // butterfly a/b, add bias, then shift by "s" and pack
   2531   #define dct_bfly32o(out0, out1, a,b,bias,s) \
   2532      { \
   2533         __m128i abiased_l = _mm_add_epi32(a##_l, bias); \
   2534         __m128i abiased_h = _mm_add_epi32(a##_h, bias); \
   2535         dct_wadd(sum, abiased, b); \
   2536         dct_wsub(dif, abiased, b); \
   2537         out0 = _mm_packs_epi32(_mm_srai_epi32(sum_l, s), _mm_srai_epi32(sum_h, s)); \
   2538         out1 = _mm_packs_epi32(_mm_srai_epi32(dif_l, s), _mm_srai_epi32(dif_h, s)); \
   2539      }
   2540
   2541   // 8-bit interleave step (for transposes)
   2542   #define dct_interleave8(a, b) \
   2543      tmp = a; \
   2544      a = _mm_unpacklo_epi8(a, b); \
   2545      b = _mm_unpackhi_epi8(tmp, b)
   2546
   2547   // 16-bit interleave step (for transposes)
   2548   #define dct_interleave16(a, b) \
   2549      tmp = a; \
   2550      a = _mm_unpacklo_epi16(a, b); \
   2551      b = _mm_unpackhi_epi16(tmp, b)
   2552
   2553   #define dct_pass(bias,shift) \
   2554      { \
   2555         /* even part */ \
   2556         dct_rot(t2e,t3e, row2,row6, rot0_0,rot0_1); \
   2557         __m128i sum04 = _mm_add_epi16(row0, row4); \
   2558         __m128i dif04 = _mm_sub_epi16(row0, row4); \
   2559         dct_widen(t0e, sum04); \
   2560         dct_widen(t1e, dif04); \
   2561         dct_wadd(x0, t0e, t3e); \
   2562         dct_wsub(x3, t0e, t3e); \
   2563         dct_wadd(x1, t1e, t2e); \
   2564         dct_wsub(x2, t1e, t2e); \
   2565         /* odd part */ \
   2566         dct_rot(y0o,y2o, row7,row3, rot2_0,rot2_1); \
   2567         dct_rot(y1o,y3o, row5,row1, rot3_0,rot3_1); \
   2568         __m128i sum17 = _mm_add_epi16(row1, row7); \
   2569         __m128i sum35 = _mm_add_epi16(row3, row5); \
   2570         dct_rot(y4o,y5o, sum17,sum35, rot1_0,rot1_1); \
   2571         dct_wadd(x4, y0o, y4o); \
   2572         dct_wadd(x5, y1o, y5o); \
   2573         dct_wadd(x6, y2o, y5o); \
   2574         dct_wadd(x7, y3o, y4o); \
   2575         dct_bfly32o(row0,row7, x0,x7,bias,shift); \
   2576         dct_bfly32o(row1,row6, x1,x6,bias,shift); \
   2577         dct_bfly32o(row2,row5, x2,x5,bias,shift); \
   2578         dct_bfly32o(row3,row4, x3,x4,bias,shift); \
   2579      }
   2580
   2581   __m128i rot0_0 = dct_const(stbi__f2f(0.5411961f), stbi__f2f(0.5411961f) + stbi__f2f(-1.847759065f));
   2582   __m128i rot0_1 = dct_const(stbi__f2f(0.5411961f) + stbi__f2f( 0.765366865f), stbi__f2f(0.5411961f));
   2583   __m128i rot1_0 = dct_const(stbi__f2f(1.175875602f) + stbi__f2f(-0.899976223f), stbi__f2f(1.175875602f));
   2584   __m128i rot1_1 = dct_const(stbi__f2f(1.175875602f), stbi__f2f(1.175875602f) + stbi__f2f(-2.562915447f));
   2585   __m128i rot2_0 = dct_const(stbi__f2f(-1.961570560f) + stbi__f2f( 0.298631336f), stbi__f2f(-1.961570560f));
   2586   __m128i rot2_1 = dct_const(stbi__f2f(-1.961570560f), stbi__f2f(-1.961570560f) + stbi__f2f( 3.072711026f));
   2587   __m128i rot3_0 = dct_const(stbi__f2f(-0.390180644f) + stbi__f2f( 2.053119869f), stbi__f2f(-0.390180644f));
   2588   __m128i rot3_1 = dct_const(stbi__f2f(-0.390180644f), stbi__f2f(-0.390180644f) + stbi__f2f( 1.501321110f));
   2589
   2590   // rounding biases in column/row passes, see stbi__idct_block for explanation.
   2591   __m128i bias_0 = _mm_set1_epi32(512);
   2592   __m128i bias_1 = _mm_set1_epi32(65536 + (128<<17));
   2593
   2594   // load
   2595   row0 = _mm_load_si128((const __m128i *) (data + 0*8));
   2596   row1 = _mm_load_si128((const __m128i *) (data + 1*8));
   2597   row2 = _mm_load_si128((const __m128i *) (data + 2*8));
   2598   row3 = _mm_load_si128((const __m128i *) (data + 3*8));
   2599   row4 = _mm_load_si128((const __m128i *) (data + 4*8));
   2600   row5 = _mm_load_si128((const __m128i *) (data + 5*8));
   2601   row6 = _mm_load_si128((const __m128i *) (data + 6*8));
   2602   row7 = _mm_load_si128((const __m128i *) (data + 7*8));
   2603
   2604   // column pass
   2605   dct_pass(bias_0, 10);
   2606
   2607   {
   2608      // 16bit 8x8 transpose pass 1
   2609      dct_interleave16(row0, row4);
   2610      dct_interleave16(row1, row5);
   2611      dct_interleave16(row2, row6);
   2612      dct_interleave16(row3, row7);
   2613
   2614      // transpose pass 2
   2615      dct_interleave16(row0, row2);
   2616      dct_interleave16(row1, row3);
   2617      dct_interleave16(row4, row6);
   2618      dct_interleave16(row5, row7);
   2619
   2620      // transpose pass 3
   2621      dct_interleave16(row0, row1);
   2622      dct_interleave16(row2, row3);
   2623      dct_interleave16(row4, row5);
   2624      dct_interleave16(row6, row7);
   2625   }
   2626
   2627   // row pass
   2628   dct_pass(bias_1, 17);
   2629
   2630   {
   2631      // pack
   2632      __m128i p0 = _mm_packus_epi16(row0, row1); // a0a1a2a3...a7b0b1b2b3...b7
   2633      __m128i p1 = _mm_packus_epi16(row2, row3);
   2634      __m128i p2 = _mm_packus_epi16(row4, row5);
   2635      __m128i p3 = _mm_packus_epi16(row6, row7);
   2636
   2637      // 8bit 8x8 transpose pass 1
   2638      dct_interleave8(p0, p2); // a0e0a1e1...
   2639      dct_interleave8(p1, p3); // c0g0c1g1...
   2640
   2641      // transpose pass 2
   2642      dct_interleave8(p0, p1); // a0c0e0g0...
   2643      dct_interleave8(p2, p3); // b0d0f0h0...
   2644
   2645      // transpose pass 3
   2646      dct_interleave8(p0, p2); // a0b0c0d0...
   2647      dct_interleave8(p1, p3); // a4b4c4d4...
   2648
   2649      // store
   2650      _mm_storel_epi64((__m128i *) out, p0); out += out_stride;
   2651      _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p0, 0x4e)); out += out_stride;
   2652      _mm_storel_epi64((__m128i *) out, p2); out += out_stride;
   2653      _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p2, 0x4e)); out += out_stride;
   2654      _mm_storel_epi64((__m128i *) out, p1); out += out_stride;
   2655      _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p1, 0x4e)); out += out_stride;
   2656      _mm_storel_epi64((__m128i *) out, p3); out += out_stride;
   2657      _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p3, 0x4e));
   2658   }
   2659
   2660#undef dct_const
   2661#undef dct_rot
   2662#undef dct_widen
   2663#undef dct_wadd
   2664#undef dct_wsub
   2665#undef dct_bfly32o
   2666#undef dct_interleave8
   2667#undef dct_interleave16
   2668#undef dct_pass
   2669}
   2670
   2671#endif // STBI_SSE2
   2672
   2673#ifdef STBI_NEON
   2674
   2675// NEON integer IDCT. should produce bit-identical
   2676// results to the generic C version.
   2677static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
   2678{
   2679   int16x8_t row0, row1, row2, row3, row4, row5, row6, row7;
   2680
   2681   int16x4_t rot0_0 = vdup_n_s16(stbi__f2f(0.5411961f));
   2682   int16x4_t rot0_1 = vdup_n_s16(stbi__f2f(-1.847759065f));
   2683   int16x4_t rot0_2 = vdup_n_s16(stbi__f2f( 0.765366865f));
   2684   int16x4_t rot1_0 = vdup_n_s16(stbi__f2f( 1.175875602f));
   2685   int16x4_t rot1_1 = vdup_n_s16(stbi__f2f(-0.899976223f));
   2686   int16x4_t rot1_2 = vdup_n_s16(stbi__f2f(-2.562915447f));
   2687   int16x4_t rot2_0 = vdup_n_s16(stbi__f2f(-1.961570560f));
   2688   int16x4_t rot2_1 = vdup_n_s16(stbi__f2f(-0.390180644f));
   2689   int16x4_t rot3_0 = vdup_n_s16(stbi__f2f( 0.298631336f));
   2690   int16x4_t rot3_1 = vdup_n_s16(stbi__f2f( 2.053119869f));
   2691   int16x4_t rot3_2 = vdup_n_s16(stbi__f2f( 3.072711026f));
   2692   int16x4_t rot3_3 = vdup_n_s16(stbi__f2f( 1.501321110f));
   2693
   2694#define dct_long_mul(out, inq, coeff) \
   2695   int32x4_t out##_l = vmull_s16(vget_low_s16(inq), coeff); \
   2696   int32x4_t out##_h = vmull_s16(vget_high_s16(inq), coeff)
   2697
   2698#define dct_long_mac(out, acc, inq, coeff) \
   2699   int32x4_t out##_l = vmlal_s16(acc##_l, vget_low_s16(inq), coeff); \
   2700   int32x4_t out##_h = vmlal_s16(acc##_h, vget_high_s16(inq), coeff)
   2701
   2702#define dct_widen(out, inq) \
   2703   int32x4_t out##_l = vshll_n_s16(vget_low_s16(inq), 12); \
   2704   int32x4_t out##_h = vshll_n_s16(vget_high_s16(inq), 12)
   2705
   2706// wide add
   2707#define dct_wadd(out, a, b) \
   2708   int32x4_t out##_l = vaddq_s32(a##_l, b##_l); \
   2709   int32x4_t out##_h = vaddq_s32(a##_h, b##_h)
   2710
   2711// wide sub
   2712#define dct_wsub(out, a, b) \
   2713   int32x4_t out##_l = vsubq_s32(a##_l, b##_l); \
   2714   int32x4_t out##_h = vsubq_s32(a##_h, b##_h)
   2715
   2716// butterfly a/b, then shift using "shiftop" by "s" and pack
   2717#define dct_bfly32o(out0,out1, a,b,shiftop,s) \
   2718   { \
   2719      dct_wadd(sum, a, b); \
   2720      dct_wsub(dif, a, b); \
   2721      out0 = vcombine_s16(shiftop(sum_l, s), shiftop(sum_h, s)); \
   2722      out1 = vcombine_s16(shiftop(dif_l, s), shiftop(dif_h, s)); \
   2723   }
   2724
   2725#define dct_pass(shiftop, shift) \
   2726   { \
   2727      /* even part */ \
   2728      int16x8_t sum26 = vaddq_s16(row2, row6); \
   2729      dct_long_mul(p1e, sum26, rot0_0); \
   2730      dct_long_mac(t2e, p1e, row6, rot0_1); \
   2731      dct_long_mac(t3e, p1e, row2, rot0_2); \
   2732      int16x8_t sum04 = vaddq_s16(row0, row4); \
   2733      int16x8_t dif04 = vsubq_s16(row0, row4); \
   2734      dct_widen(t0e, sum04); \
   2735      dct_widen(t1e, dif04); \
   2736      dct_wadd(x0, t0e, t3e); \
   2737      dct_wsub(x3, t0e, t3e); \
   2738      dct_wadd(x1, t1e, t2e); \
   2739      dct_wsub(x2, t1e, t2e); \
   2740      /* odd part */ \
   2741      int16x8_t sum15 = vaddq_s16(row1, row5); \
   2742      int16x8_t sum17 = vaddq_s16(row1, row7); \
   2743      int16x8_t sum35 = vaddq_s16(row3, row5); \
   2744      int16x8_t sum37 = vaddq_s16(row3, row7); \
   2745      int16x8_t sumodd = vaddq_s16(sum17, sum35); \
   2746      dct_long_mul(p5o, sumodd, rot1_0); \
   2747      dct_long_mac(p1o, p5o, sum17, rot1_1); \
   2748      dct_long_mac(p2o, p5o, sum35, rot1_2); \
   2749      dct_long_mul(p3o, sum37, rot2_0); \
   2750      dct_long_mul(p4o, sum15, rot2_1); \
   2751      dct_wadd(sump13o, p1o, p3o); \
   2752      dct_wadd(sump24o, p2o, p4o); \
   2753      dct_wadd(sump23o, p2o, p3o); \
   2754      dct_wadd(sump14o, p1o, p4o); \
   2755      dct_long_mac(x4, sump13o, row7, rot3_0); \
   2756      dct_long_mac(x5, sump24o, row5, rot3_1); \
   2757      dct_long_mac(x6, sump23o, row3, rot3_2); \
   2758      dct_long_mac(x7, sump14o, row1, rot3_3); \
   2759      dct_bfly32o(row0,row7, x0,x7,shiftop,shift); \
   2760      dct_bfly32o(row1,row6, x1,x6,shiftop,shift); \
   2761      dct_bfly32o(row2,row5, x2,x5,shiftop,shift); \
   2762      dct_bfly32o(row3,row4, x3,x4,shiftop,shift); \
   2763   }
   2764
   2765   // load
   2766   row0 = vld1q_s16(data + 0*8);
   2767   row1 = vld1q_s16(data + 1*8);
   2768   row2 = vld1q_s16(data + 2*8);
   2769   row3 = vld1q_s16(data + 3*8);
   2770   row4 = vld1q_s16(data + 4*8);
   2771   row5 = vld1q_s16(data + 5*8);
   2772   row6 = vld1q_s16(data + 6*8);
   2773   row7 = vld1q_s16(data + 7*8);
   2774
   2775   // add DC bias
   2776   row0 = vaddq_s16(row0, vsetq_lane_s16(1024, vdupq_n_s16(0), 0));
   2777
   2778   // column pass
   2779   dct_pass(vrshrn_n_s32, 10);
   2780
   2781   // 16bit 8x8 transpose
   2782   {
   2783// these three map to a single VTRN.16, VTRN.32, and VSWP, respectively.
   2784// whether compilers actually get this is another story, sadly.
   2785#define dct_trn16(x, y) { int16x8x2_t t = vtrnq_s16(x, y); x = t.val[0]; y = t.val[1]; }
   2786#define dct_trn32(x, y) { int32x4x2_t t = vtrnq_s32(vreinterpretq_s32_s16(x), vreinterpretq_s32_s16(y)); x = vreinterpretq_s16_s32(t.val[0]); y = vreinterpretq_s16_s32(t.val[1]); }
   2787#define dct_trn64(x, y) { int16x8_t x0 = x; int16x8_t y0 = y; x = vcombine_s16(vget_low_s16(x0), vget_low_s16(y0)); y = vcombine_s16(vget_high_s16(x0), vget_high_s16(y0)); }
   2788
   2789      // pass 1
   2790      dct_trn16(row0, row1); // a0b0a2b2a4b4a6b6
   2791      dct_trn16(row2, row3);
   2792      dct_trn16(row4, row5);
   2793      dct_trn16(row6, row7);
   2794
   2795      // pass 2
   2796      dct_trn32(row0, row2); // a0b0c0d0a4b4c4d4
   2797      dct_trn32(row1, row3);
   2798      dct_trn32(row4, row6);
   2799      dct_trn32(row5, row7);
   2800
   2801      // pass 3
   2802      dct_trn64(row0, row4); // a0b0c0d0e0f0g0h0
   2803      dct_trn64(row1, row5);
   2804      dct_trn64(row2, row6);
   2805      dct_trn64(row3, row7);
   2806
   2807#undef dct_trn16
   2808#undef dct_trn32
   2809#undef dct_trn64
   2810   }
   2811
   2812   // row pass
   2813   // vrshrn_n_s32 only supports shifts up to 16, we need
   2814   // 17. so do a non-rounding shift of 16 first then follow
   2815   // up with a rounding shift by 1.
   2816   dct_pass(vshrn_n_s32, 16);
   2817
   2818   {
   2819      // pack and round
   2820      uint8x8_t p0 = vqrshrun_n_s16(row0, 1);
   2821      uint8x8_t p1 = vqrshrun_n_s16(row1, 1);
   2822      uint8x8_t p2 = vqrshrun_n_s16(row2, 1);
   2823      uint8x8_t p3 = vqrshrun_n_s16(row3, 1);
   2824      uint8x8_t p4 = vqrshrun_n_s16(row4, 1);
   2825      uint8x8_t p5 = vqrshrun_n_s16(row5, 1);
   2826      uint8x8_t p6 = vqrshrun_n_s16(row6, 1);
   2827      uint8x8_t p7 = vqrshrun_n_s16(row7, 1);
   2828
   2829      // again, these can translate into one instruction, but often don't.
   2830#define dct_trn8_8(x, y) { uint8x8x2_t t = vtrn_u8(x, y); x = t.val[0]; y = t.val[1]; }
   2831#define dct_trn8_16(x, y) { uint16x4x2_t t = vtrn_u16(vreinterpret_u16_u8(x), vreinterpret_u16_u8(y)); x = vreinterpret_u8_u16(t.val[0]); y = vreinterpret_u8_u16(t.val[1]); }
   2832#define dct_trn8_32(x, y) { uint32x2x2_t t = vtrn_u32(vreinterpret_u32_u8(x), vreinterpret_u32_u8(y)); x = vreinterpret_u8_u32(t.val[0]); y = vreinterpret_u8_u32(t.val[1]); }
   2833
   2834      // sadly can't use interleaved stores here since we only write
   2835      // 8 bytes to each scan line!
   2836
   2837      // 8x8 8-bit transpose pass 1
   2838      dct_trn8_8(p0, p1);
   2839      dct_trn8_8(p2, p3);
   2840      dct_trn8_8(p4, p5);
   2841      dct_trn8_8(p6, p7);
   2842
   2843      // pass 2
   2844      dct_trn8_16(p0, p2);
   2845      dct_trn8_16(p1, p3);
   2846      dct_trn8_16(p4, p6);
   2847      dct_trn8_16(p5, p7);
   2848
   2849      // pass 3
   2850      dct_trn8_32(p0, p4);
   2851      dct_trn8_32(p1, p5);
   2852      dct_trn8_32(p2, p6);
   2853      dct_trn8_32(p3, p7);
   2854
   2855      // store
   2856      vst1_u8(out, p0); out += out_stride;
   2857      vst1_u8(out, p1); out += out_stride;
   2858      vst1_u8(out, p2); out += out_stride;
   2859      vst1_u8(out, p3); out += out_stride;
   2860      vst1_u8(out, p4); out += out_stride;
   2861      vst1_u8(out, p5); out += out_stride;
   2862      vst1_u8(out, p6); out += out_stride;
   2863      vst1_u8(out, p7);
   2864
   2865#undef dct_trn8_8
   2866#undef dct_trn8_16
   2867#undef dct_trn8_32
   2868   }
   2869
   2870#undef dct_long_mul
   2871#undef dct_long_mac
   2872#undef dct_widen
   2873#undef dct_wadd
   2874#undef dct_wsub
   2875#undef dct_bfly32o
   2876#undef dct_pass
   2877}
   2878
   2879#endif // STBI_NEON
   2880
   2881#define STBI__MARKER_none  0xff
   2882// if there's a pending marker from the entropy stream, return that
   2883// otherwise, fetch from the stream and get a marker. if there's no
   2884// marker, return 0xff, which is never a valid marker value
   2885static stbi_uc stbi__get_marker(stbi__jpeg *j)
   2886{
   2887   stbi_uc x;
   2888   if (j->marker != STBI__MARKER_none) { x = j->marker; j->marker = STBI__MARKER_none; return x; }
   2889   x = stbi__get8(j->s);
   2890   if (x != 0xff) return STBI__MARKER_none;
   2891   while (x == 0xff)
   2892      x = stbi__get8(j->s); // consume repeated 0xff fill bytes
   2893   return x;
   2894}
   2895
   2896// in each scan, we'll have scan_n components, and the order
   2897// of the components is specified by order[]
   2898#define STBI__RESTART(x)     ((x) >= 0xd0 && (x) <= 0xd7)
   2899
   2900// after a restart interval, stbi__jpeg_reset the entropy decoder and
   2901// the dc prediction
   2902static void stbi__jpeg_reset(stbi__jpeg *j)
   2903{
   2904   j->code_bits = 0;
   2905   j->code_buffer = 0;
   2906   j->nomore = 0;
   2907   j->img_comp[0].dc_pred = j->img_comp[1].dc_pred = j->img_comp[2].dc_pred = j->img_comp[3].dc_pred = 0;
   2908   j->marker = STBI__MARKER_none;
   2909   j->todo = j->restart_interval ? j->restart_interval : 0x7fffffff;
   2910   j->eob_run = 0;
   2911   // no more than 1<<31 MCUs if no restart_interal? that's plenty safe,
   2912   // since we don't even allow 1<<30 pixels
   2913}
   2914
   2915static int stbi__parse_entropy_coded_data(stbi__jpeg *z)
   2916{
   2917   stbi__jpeg_reset(z);
   2918   if (!z->progressive) {
   2919      if (z->scan_n == 1) {
   2920         int i,j;
   2921         STBI_SIMD_ALIGN(short, data[64]);
   2922         int n = z->order[0];
   2923         // non-interleaved data, we just need to process one block at a time,
   2924         // in trivial scanline order
   2925         // number of blocks to do just depends on how many actual "pixels" this
   2926         // component has, independent of interleaved MCU blocking and such
   2927         int w = (z->img_comp[n].x+7) >> 3;
   2928         int h = (z->img_comp[n].y+7) >> 3;
   2929         for (j=0; j < h; ++j) {
   2930            for (i=0; i < w; ++i) {
   2931               int ha = z->img_comp[n].ha;
   2932               if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
   2933               z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
   2934               // every data block is an MCU, so countdown the restart interval
   2935               if (--z->todo <= 0) {
   2936                  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
   2937                  // if it's NOT a restart, then just bail, so we get corrupt data
   2938                  // rather than no data
   2939                  if (!STBI__RESTART(z->marker)) return 1;
   2940                  stbi__jpeg_reset(z);
   2941               }
   2942            }
   2943         }
   2944         return 1;
   2945      } else { // interleaved
   2946         int i,j,k,x,y;
   2947         STBI_SIMD_ALIGN(short, data[64]);
   2948         for (j=0; j < z->img_mcu_y; ++j) {
   2949            for (i=0; i < z->img_mcu_x; ++i) {
   2950               // scan an interleaved mcu... process scan_n components in order
   2951               for (k=0; k < z->scan_n; ++k) {
   2952                  int n = z->order[k];
   2953                  // scan out an mcu's worth of this component; that's just determined
   2954                  // by the basic H and V specified for the component
   2955                  for (y=0; y < z->img_comp[n].v; ++y) {
   2956                     for (x=0; x < z->img_comp[n].h; ++x) {
   2957                        int x2 = (i*z->img_comp[n].h + x)*8;
   2958                        int y2 = (j*z->img_comp[n].v + y)*8;
   2959                        int ha = z->img_comp[n].ha;
   2960                        if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
   2961                        z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data);
   2962                     }
   2963                  }
   2964               }
   2965               // after all interleaved components, that's an interleaved MCU,
   2966               // so now count down the restart interval
   2967               if (--z->todo <= 0) {
   2968                  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
   2969                  if (!STBI__RESTART(z->marker)) return 1;
   2970                  stbi__jpeg_reset(z);
   2971               }
   2972            }
   2973         }
   2974         return 1;
   2975      }
   2976   } else {
   2977      if (z->scan_n == 1) {
   2978         int i,j;
   2979         int n = z->order[0];
   2980         // non-interleaved data, we just need to process one block at a time,
   2981         // in trivial scanline order
   2982         // number of blocks to do just depends on how many actual "pixels" this
   2983         // component has, independent of interleaved MCU blocking and such
   2984         int w = (z->img_comp[n].x+7) >> 3;
   2985         int h = (z->img_comp[n].y+7) >> 3;
   2986         for (j=0; j < h; ++j) {
   2987            for (i=0; i < w; ++i) {
   2988               short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
   2989               if (z->spec_start == 0) {
   2990                  if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
   2991                     return 0;
   2992               } else {
   2993                  int ha = z->img_comp[n].ha;
   2994                  if (!stbi__jpeg_decode_block_prog_ac(z, data, &z->huff_ac[ha], z->fast_ac[ha]))
   2995                     return 0;
   2996               }
   2997               // every data block is an MCU, so countdown the restart interval
   2998               if (--z->todo <= 0) {
   2999                  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
   3000                  if (!STBI__RESTART(z->marker)) return 1;
   3001                  stbi__jpeg_reset(z);
   3002               }
   3003            }
   3004         }
   3005         return 1;
   3006      } else { // interleaved
   3007         int i,j,k,x,y;
   3008         for (j=0; j < z->img_mcu_y; ++j) {
   3009            for (i=0; i < z->img_mcu_x; ++i) {
   3010               // scan an interleaved mcu... process scan_n components in order
   3011               for (k=0; k < z->scan_n; ++k) {
   3012                  int n = z->order[k];
   3013                  // scan out an mcu's worth of this component; that's just determined
   3014                  // by the basic H and V specified for the component
   3015                  for (y=0; y < z->img_comp[n].v; ++y) {
   3016                     for (x=0; x < z->img_comp[n].h; ++x) {
   3017                        int x2 = (i*z->img_comp[n].h + x);
   3018                        int y2 = (j*z->img_comp[n].v + y);
   3019                        short *data = z->img_comp[n].coeff + 64 * (x2 + y2 * z->img_comp[n].coeff_w);
   3020                        if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
   3021                           return 0;
   3022                     }
   3023                  }
   3024               }
   3025               // after all interleaved components, that's an interleaved MCU,
   3026               // so now count down the restart interval
   3027               if (--z->todo <= 0) {
   3028                  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
   3029                  if (!STBI__RESTART(z->marker)) return 1;
   3030                  stbi__jpeg_reset(z);
   3031               }
   3032            }
   3033         }
   3034         return 1;
   3035      }
   3036   }
   3037}
   3038
   3039static void stbi__jpeg_dequantize(short *data, stbi__uint16 *dequant)
   3040{
   3041   int i;
   3042   for (i=0; i < 64; ++i)
   3043      data[i] *= dequant[i];
   3044}
   3045
   3046static void stbi__jpeg_finish(stbi__jpeg *z)
   3047{
   3048   if (z->progressive) {
   3049      // dequantize and idct the data
   3050      int i,j,n;
   3051      for (n=0; n < z->s->img_n; ++n) {
   3052         int w = (z->img_comp[n].x+7) >> 3;
   3053         int h = (z->img_comp[n].y+7) >> 3;
   3054         for (j=0; j < h; ++j) {
   3055            for (i=0; i < w; ++i) {
   3056               short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
   3057               stbi__jpeg_dequantize(data, z->dequant[z->img_comp[n].tq]);
   3058               z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
   3059            }
   3060         }
   3061      }
   3062   }
   3063}
   3064
   3065static int stbi__process_marker(stbi__jpeg *z, int m)
   3066{
   3067   int L;
   3068   switch (m) {
   3069      case STBI__MARKER_none: // no marker found
   3070         return stbi__err("expected marker","Corrupt JPEG");
   3071
   3072      case 0xDD: // DRI - specify restart interval
   3073         if (stbi__get16be(z->s) != 4) return stbi__err("bad DRI len","Corrupt JPEG");
   3074         z->restart_interval = stbi__get16be(z->s);
   3075         return 1;
   3076
   3077      case 0xDB: // DQT - define quantization table
   3078         L = stbi__get16be(z->s)-2;
   3079         while (L > 0) {
   3080            int q = stbi__get8(z->s);
   3081            int p = q >> 4, sixteen = (p != 0);
   3082            int t = q & 15,i;
   3083            if (p != 0 && p != 1) return stbi__err("bad DQT type","Corrupt JPEG");
   3084            if (t > 3) return stbi__err("bad DQT table","Corrupt JPEG");
   3085
   3086            for (i=0; i < 64; ++i)
   3087               z->dequant[t][stbi__jpeg_dezigzag[i]] = (stbi__uint16)(sixteen ? stbi__get16be(z->s) : stbi__get8(z->s));
   3088            L -= (sixteen ? 129 : 65);
   3089         }
   3090         return L==0;
   3091
   3092      case 0xC4: // DHT - define huffman table
   3093         L = stbi__get16be(z->s)-2;
   3094         while (L > 0) {
   3095            stbi_uc *v;
   3096            int sizes[16],i,n=0;
   3097            int q = stbi__get8(z->s);
   3098            int tc = q >> 4;
   3099            int th = q & 15;
   3100            if (tc > 1 || th > 3) return stbi__err("bad DHT header","Corrupt JPEG");
   3101            for (i=0; i < 16; ++i) {
   3102               sizes[i] = stbi__get8(z->s);
   3103               n += sizes[i];
   3104            }
   3105            L -= 17;
   3106            if (tc == 0) {
   3107               if (!stbi__build_huffman(z->huff_dc+th, sizes)) return 0;
   3108               v = z->huff_dc[th].values;
   3109            } else {
   3110               if (!stbi__build_huffman(z->huff_ac+th, sizes)) return 0;
   3111               v = z->huff_ac[th].values;
   3112            }
   3113            for (i=0; i < n; ++i)
   3114               v[i] = stbi__get8(z->s);
   3115            if (tc != 0)
   3116               stbi__build_fast_ac(z->fast_ac[th], z->huff_ac + th);
   3117            L -= n;
   3118         }
   3119         return L==0;
   3120   }
   3121
   3122   // check for comment block or APP blocks
   3123   if ((m >= 0xE0 && m <= 0xEF) || m == 0xFE) {
   3124      L = stbi__get16be(z->s);
   3125      if (L < 2) {
   3126         if (m == 0xFE)
   3127            return stbi__err("bad COM len","Corrupt JPEG");
   3128         else
   3129            return stbi__err("bad APP len","Corrupt JPEG");
   3130      }
   3131      L -= 2;
   3132
   3133      if (m == 0xE0 && L >= 5) { // JFIF APP0 segment
   3134         static const unsigned char tag[5] = {'J','F','I','F','\0'};
   3135         int ok = 1;
   3136         int i;
   3137         for (i=0; i < 5; ++i)
   3138            if (stbi__get8(z->s) != tag[i])
   3139               ok = 0;
   3140         L -= 5;
   3141         if (ok)
   3142            z->jfif = 1;
   3143      } else if (m == 0xEE && L >= 12) { // Adobe APP14 segment
   3144         static const unsigned char tag[6] = {'A','d','o','b','e','\0'};
   3145         int ok = 1;
   3146         int i;
   3147         for (i=0; i < 6; ++i)
   3148            if (stbi__get8(z->s) != tag[i])
   3149               ok = 0;
   3150         L -= 6;
   3151         if (ok) {
   3152            stbi__get8(z->s); // version
   3153            stbi__get16be(z->s); // flags0
   3154            stbi__get16be(z->s); // flags1
   3155            z->app14_color_transform = stbi__get8(z->s); // color transform
   3156            L -= 6;
   3157         }
   3158      }
   3159
   3160      stbi__skip(z->s, L);
   3161      return 1;
   3162   }
   3163
   3164   return stbi__err("unknown marker","Corrupt JPEG");
   3165}
   3166
   3167// after we see SOS
   3168static int stbi__process_scan_header(stbi__jpeg *z)
   3169{
   3170   int i;
   3171   int Ls = stbi__get16be(z->s);
   3172   z->scan_n = stbi__get8(z->s);
   3173   if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s->img_n) return stbi__err("bad SOS component count","Corrupt JPEG");
   3174   if (Ls != 6+2*z->scan_n) return stbi__err("bad SOS len","Corrupt JPEG");
   3175   for (i=0; i < z->scan_n; ++i) {
   3176      int id = stbi__get8(z->s), which;
   3177      int q = stbi__get8(z->s);
   3178      for (which = 0; which < z->s->img_n; ++which)
   3179         if (z->img_comp[which].id == id)
   3180            break;
   3181      if (which == z->s->img_n) return 0; // no match
   3182      z->img_comp[which].hd = q >> 4;   if (z->img_comp[which].hd > 3) return stbi__err("bad DC huff","Corrupt JPEG");
   3183      z->img_comp[which].ha = q & 15;   if (z->img_comp[which].ha > 3) return stbi__err("bad AC huff","Corrupt JPEG");
   3184      z->order[i] = which;
   3185   }
   3186
   3187   {
   3188      int aa;
   3189      z->spec_start = stbi__get8(z->s);
   3190      z->spec_end   = stbi__get8(z->s); // should be 63, but might be 0
   3191      aa = stbi__get8(z->s);
   3192      z->succ_high = (aa >> 4);
   3193      z->succ_low  = (aa & 15);
   3194      if (z->progressive) {
   3195         if (z->spec_start > 63 || z->spec_end > 63  || z->spec_start > z->spec_end || z->succ_high > 13 || z->succ_low > 13)
   3196            return stbi__err("bad SOS", "Corrupt JPEG");
   3197      } else {
   3198         if (z->spec_start != 0) return stbi__err("bad SOS","Corrupt JPEG");
   3199         if (z->succ_high != 0 || z->succ_low != 0) return stbi__err("bad SOS","Corrupt JPEG");
   3200         z->spec_end = 63;
   3201      }
   3202   }
   3203
   3204   return 1;
   3205}
   3206
   3207static int stbi__free_jpeg_components(stbi__jpeg *z, int ncomp, int why)
   3208{
   3209   int i;
   3210   for (i=0; i < ncomp; ++i) {
   3211      if (z->img_comp[i].raw_data) {
   3212         STBI_FREE(z->img_comp[i].raw_data);
   3213         z->img_comp[i].raw_data = NULL;
   3214         z->img_comp[i].data = NULL;
   3215      }
   3216      if (z->img_comp[i].raw_coeff) {
   3217         STBI_FREE(z->img_comp[i].raw_coeff);
   3218         z->img_comp[i].raw_coeff = 0;
   3219         z->img_comp[i].coeff = 0;
   3220      }
   3221      if (z->img_comp[i].linebuf) {
   3222         STBI_FREE(z->img_comp[i].linebuf);
   3223         z->img_comp[i].linebuf = NULL;
   3224      }
   3225   }
   3226   return why;
   3227}
   3228
   3229static int stbi__process_frame_header(stbi__jpeg *z, int scan)
   3230{
   3231   stbi__context *s = z->s;
   3232   int Lf,p,i,q, h_max=1,v_max=1,c;
   3233   Lf = stbi__get16be(s);         if (Lf < 11) return stbi__err("bad SOF len","Corrupt JPEG"); // JPEG
   3234   p  = stbi__get8(s);            if (p != 8) return stbi__err("only 8-bit","JPEG format not supported: 8-bit only"); // JPEG baseline
   3235   s->img_y = stbi__get16be(s);   if (s->img_y == 0) return stbi__err("no header height", "JPEG format not supported: delayed height"); // Legal, but we don't handle it--but neither does IJG
   3236   s->img_x = stbi__get16be(s);   if (s->img_x == 0) return stbi__err("0 width","Corrupt JPEG"); // JPEG requires
   3237   if (s->img_y > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
   3238   if (s->img_x > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
   3239   c = stbi__get8(s);
   3240   if (c != 3 && c != 1 && c != 4) return stbi__err("bad component count","Corrupt JPEG");
   3241   s->img_n = c;
   3242   for (i=0; i < c; ++i) {
   3243      z->img_comp[i].data = NULL;
   3244      z->img_comp[i].linebuf = NULL;
   3245   }
   3246
   3247   if (Lf != 8+3*s->img_n) return stbi__err("bad SOF len","Corrupt JPEG");
   3248
   3249   z->rgb = 0;
   3250   for (i=0; i < s->img_n; ++i) {
   3251      static const unsigned char rgb[3] = { 'R', 'G', 'B' };
   3252      z->img_comp[i].id = stbi__get8(s);
   3253      if (s->img_n == 3 && z->img_comp[i].id == rgb[i])
   3254         ++z->rgb;
   3255      q = stbi__get8(s);
   3256      z->img_comp[i].h = (q >> 4);  if (!z->img_comp[i].h || z->img_comp[i].h > 4) return stbi__err("bad H","Corrupt JPEG");
   3257      z->img_comp[i].v = q & 15;    if (!z->img_comp[i].v || z->img_comp[i].v > 4) return stbi__err("bad V","Corrupt JPEG");
   3258      z->img_comp[i].tq = stbi__get8(s);  if (z->img_comp[i].tq > 3) return stbi__err("bad TQ","Corrupt JPEG");
   3259   }
   3260
   3261   if (scan != STBI__SCAN_load) return 1;
   3262
   3263   if (!stbi__mad3sizes_valid(s->img_x, s->img_y, s->img_n, 0)) return stbi__err("too large", "Image too large to decode");
   3264
   3265   for (i=0; i < s->img_n; ++i) {
   3266      if (z->img_comp[i].h > h_max) h_max = z->img_comp[i].h;
   3267      if (z->img_comp[i].v > v_max) v_max = z->img_comp[i].v;
   3268   }
   3269
   3270   // check that plane subsampling factors are integer ratios; our resamplers can't deal with fractional ratios
   3271   // and I've never seen a non-corrupted JPEG file actually use them
   3272   for (i=0; i < s->img_n; ++i) {
   3273      if (h_max % z->img_comp[i].h != 0) return stbi__err("bad H","Corrupt JPEG");
   3274      if (v_max % z->img_comp[i].v != 0) return stbi__err("bad V","Corrupt JPEG");
   3275   }
   3276
   3277   // compute interleaved mcu info
   3278   z->img_h_max = h_max;
   3279   z->img_v_max = v_max;
   3280   z->img_mcu_w = h_max * 8;
   3281   z->img_mcu_h = v_max * 8;
   3282   // these sizes can't be more than 17 bits
   3283   z->img_mcu_x = (s->img_x + z->img_mcu_w-1) / z->img_mcu_w;
   3284   z->img_mcu_y = (s->img_y + z->img_mcu_h-1) / z->img_mcu_h;
   3285
   3286   for (i=0; i < s->img_n; ++i) {
   3287      // number of effective pixels (e.g. for non-interleaved MCU)
   3288      z->img_comp[i].x = (s->img_x * z->img_comp[i].h + h_max-1) / h_max;
   3289      z->img_comp[i].y = (s->img_y * z->img_comp[i].v + v_max-1) / v_max;
   3290      // to simplify generation, we'll allocate enough memory to decode
   3291      // the bogus oversized data from using interleaved MCUs and their
   3292      // big blocks (e.g. a 16x16 iMCU on an image of width 33); we won't
   3293      // discard the extra data until colorspace conversion
   3294      //
   3295      // img_mcu_x, img_mcu_y: <=17 bits; comp[i].h and .v are <=4 (checked earlier)
   3296      // so these muls can't overflow with 32-bit ints (which we require)
   3297      z->img_comp[i].w2 = z->img_mcu_x * z->img_comp[i].h * 8;
   3298      z->img_comp[i].h2 = z->img_mcu_y * z->img_comp[i].v * 8;
   3299      z->img_comp[i].coeff = 0;
   3300      z->img_comp[i].raw_coeff = 0;
   3301      z->img_comp[i].linebuf = NULL;
   3302      z->img_comp[i].raw_data = stbi__malloc_mad2(z->img_comp[i].w2, z->img_comp[i].h2, 15);
   3303      if (z->img_comp[i].raw_data == NULL)
   3304         return stbi__free_jpeg_components(z, i+1, stbi__err("outofmem", "Out of memory"));
   3305      // align blocks for idct using mmx/sse
   3306      z->img_comp[i].data = (stbi_uc*) (((size_t) z->img_comp[i].raw_data + 15) & ~15);
   3307      if (z->progressive) {
   3308         // w2, h2 are multiples of 8 (see above)
   3309         z->img_comp[i].coeff_w = z->img_comp[i].w2 / 8;
   3310         z->img_comp[i].coeff_h = z->img_comp[i].h2 / 8;
   3311         z->img_comp[i].raw_coeff = stbi__malloc_mad3(z->img_comp[i].w2, z->img_comp[i].h2, sizeof(short), 15);
   3312         if (z->img_comp[i].raw_coeff == NULL)
   3313            return stbi__free_jpeg_components(z, i+1, stbi__err("outofmem", "Out of memory"));
   3314         z->img_comp[i].coeff = (short*) (((size_t) z->img_comp[i].raw_coeff + 15) & ~15);
   3315      }
   3316   }
   3317
   3318   return 1;
   3319}
   3320
   3321// use comparisons since in some cases we handle more than one case (e.g. SOF)
   3322#define stbi__DNL(x)         ((x) == 0xdc)
   3323#define stbi__SOI(x)         ((x) == 0xd8)
   3324#define stbi__EOI(x)         ((x) == 0xd9)
   3325#define stbi__SOF(x)         ((x) == 0xc0 || (x) == 0xc1 || (x) == 0xc2)
   3326#define stbi__SOS(x)         ((x) == 0xda)
   3327
   3328#define stbi__SOF_progressive(x)   ((x) == 0xc2)
   3329
   3330static int stbi__decode_jpeg_header(stbi__jpeg *z, int scan)
   3331{
   3332   int m;
   3333   z->jfif = 0;
   3334   z->app14_color_transform = -1; // valid values are 0,1,2
   3335   z->marker = STBI__MARKER_none; // initialize cached marker to empty
   3336   m = stbi__get_marker(z);
   3337   if (!stbi__SOI(m)) return stbi__err("no SOI","Corrupt JPEG");
   3338   if (scan == STBI__SCAN_type) return 1;
   3339   m = stbi__get_marker(z);
   3340   while (!stbi__SOF(m)) {
   3341      if (!stbi__process_marker(z,m)) return 0;
   3342      m = stbi__get_marker(z);
   3343      while (m == STBI__MARKER_none) {
   3344         // some files have extra padding after their blocks, so ok, we'll scan
   3345         if (stbi__at_eof(z->s)) return stbi__err("no SOF", "Corrupt JPEG");
   3346         m = stbi__get_marker(z);
   3347      }
   3348   }
   3349   z->progressive = stbi__SOF_progressive(m);
   3350   if (!stbi__process_frame_header(z, scan)) return 0;
   3351   return 1;
   3352}
   3353
   3354// decode image to YCbCr format
   3355static int stbi__decode_jpeg_image(stbi__jpeg *j)
   3356{
   3357   int m;
   3358   for (m = 0; m < 4; m++) {
   3359      j->img_comp[m].raw_data = NULL;
   3360      j->img_comp[m].raw_coeff = NULL;
   3361   }
   3362   j->restart_interval = 0;
   3363   if (!stbi__decode_jpeg_header(j, STBI__SCAN_load)) return 0;
   3364   m = stbi__get_marker(j);
   3365   while (!stbi__EOI(m)) {
   3366      if (stbi__SOS(m)) {
   3367         if (!stbi__process_scan_header(j)) return 0;
   3368         if (!stbi__parse_entropy_coded_data(j)) return 0;
   3369         if (j->marker == STBI__MARKER_none ) {
   3370            // handle 0s at the end of image data from IP Kamera 9060
   3371            while (!stbi__at_eof(j->s)) {
   3372               int x = stbi__get8(j->s);
   3373               if (x == 255) {
   3374                  j->marker = stbi__get8(j->s);
   3375                  break;
   3376               }
   3377            }
   3378            // if we reach eof without hitting a marker, stbi__get_marker() below will fail and we'll eventually return 0
   3379         }
   3380      } else if (stbi__DNL(m)) {
   3381         int Ld = stbi__get16be(j->s);
   3382         stbi__uint32 NL = stbi__get16be(j->s);
   3383         if (Ld != 4) return stbi__err("bad DNL len", "Corrupt JPEG");
   3384         if (NL != j->s->img_y) return stbi__err("bad DNL height", "Corrupt JPEG");
   3385      } else {
   3386         if (!stbi__process_marker(j, m)) return 0;
   3387      }
   3388      m = stbi__get_marker(j);
   3389   }
   3390   if (j->progressive)
   3391      stbi__jpeg_finish(j);
   3392   return 1;
   3393}
   3394
   3395// static jfif-centered resampling (across block boundaries)
   3396
   3397typedef stbi_uc *(*resample_row_func)(stbi_uc *out, stbi_uc *in0, stbi_uc *in1,
   3398                                    int w, int hs);
   3399
   3400#define stbi__div4(x) ((stbi_uc) ((x) >> 2))
   3401
   3402static stbi_uc *resample_row_1(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
   3403{
   3404   STBI_NOTUSED(out);
   3405   STBI_NOTUSED(in_far);
   3406   STBI_NOTUSED(w);
   3407   STBI_NOTUSED(hs);
   3408   return in_near;
   3409}
   3410
   3411static stbi_uc* stbi__resample_row_v_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
   3412{
   3413   // need to generate two samples vertically for every one in input
   3414   int i;
   3415   STBI_NOTUSED(hs);
   3416   for (i=0; i < w; ++i)
   3417      out[i] = stbi__div4(3*in_near[i] + in_far[i] + 2);
   3418   return out;
   3419}
   3420
   3421static stbi_uc*  stbi__resample_row_h_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
   3422{
   3423   // need to generate two samples horizontally for every one in input
   3424   int i;
   3425   stbi_uc *input = in_near;
   3426
   3427   if (w == 1) {
   3428      // if only one sample, can't do any interpolation
   3429      out[0] = out[1] = input[0];
   3430      return out;
   3431   }
   3432
   3433   out[0] = input[0];
   3434   out[1] = stbi__div4(input[0]*3 + input[1] + 2);
   3435   for (i=1; i < w-1; ++i) {
   3436      int n = 3*input[i]+2;
   3437      out[i*2+0] = stbi__div4(n+input[i-1]);
   3438      out[i*2+1] = stbi__div4(n+input[i+1]);
   3439   }
   3440   out[i*2+0] = stbi__div4(input[w-2]*3 + input[w-1] + 2);
   3441   out[i*2+1] = input[w-1];
   3442
   3443   STBI_NOTUSED(in_far);
   3444   STBI_NOTUSED(hs);
   3445
   3446   return out;
   3447}
   3448
   3449#define stbi__div16(x) ((stbi_uc) ((x) >> 4))
   3450
   3451static stbi_uc *stbi__resample_row_hv_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
   3452{
   3453   // need to generate 2x2 samples for every one in input
   3454   int i,t0,t1;
   3455   if (w == 1) {
   3456      out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
   3457      return out;
   3458   }
   3459
   3460   t1 = 3*in_near[0] + in_far[0];
   3461   out[0] = stbi__div4(t1+2);
   3462   for (i=1; i < w; ++i) {
   3463      t0 = t1;
   3464      t1 = 3*in_near[i]+in_far[i];
   3465      out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
   3466      out[i*2  ] = stbi__div16(3*t1 + t0 + 8);
   3467   }
   3468   out[w*2-1] = stbi__div4(t1+2);
   3469
   3470   STBI_NOTUSED(hs);
   3471
   3472   return out;
   3473}
   3474
   3475#if defined(STBI_SSE2) || defined(STBI_NEON)
   3476static stbi_uc *stbi__resample_row_hv_2_simd(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
   3477{
   3478   // need to generate 2x2 samples for every one in input
   3479   int i=0,t0,t1;
   3480
   3481   if (w == 1) {
   3482      out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
   3483      return out;
   3484   }
   3485
   3486   t1 = 3*in_near[0] + in_far[0];
   3487   // process groups of 8 pixels for as long as we can.
   3488   // note we can't handle the last pixel in a row in this loop
   3489   // because we need to handle the filter boundary conditions.
   3490   for (; i < ((w-1) & ~7); i += 8) {
   3491#if defined(STBI_SSE2)
   3492      // load and perform the vertical filtering pass
   3493      // this uses 3*x + y = 4*x + (y - x)
   3494      __m128i zero  = _mm_setzero_si128();
   3495      __m128i farb  = _mm_loadl_epi64((__m128i *) (in_far + i));
   3496      __m128i nearb = _mm_loadl_epi64((__m128i *) (in_near + i));
   3497      __m128i farw  = _mm_unpacklo_epi8(farb, zero);
   3498      __m128i nearw = _mm_unpacklo_epi8(nearb, zero);
   3499      __m128i diff  = _mm_sub_epi16(farw, nearw);
   3500      __m128i nears = _mm_slli_epi16(nearw, 2);
   3501      __m128i curr  = _mm_add_epi16(nears, diff); // current row
   3502
   3503      // horizontal filter works the same based on shifted vers of current
   3504      // row. "prev" is current row shifted right by 1 pixel; we need to
   3505      // insert the previous pixel value (from t1).
   3506      // "next" is current row shifted left by 1 pixel, with first pixel
   3507      // of next block of 8 pixels added in.
   3508      __m128i prv0 = _mm_slli_si128(curr, 2);
   3509      __m128i nxt0 = _mm_srli_si128(curr, 2);
   3510      __m128i prev = _mm_insert_epi16(prv0, t1, 0);
   3511      __m128i next = _mm_insert_epi16(nxt0, 3*in_near[i+8] + in_far[i+8], 7);
   3512
   3513      // horizontal filter, polyphase implementation since it's convenient:
   3514      // even pixels = 3*cur + prev = cur*4 + (prev - cur)
   3515      // odd  pixels = 3*cur + next = cur*4 + (next - cur)
   3516      // note the shared term.
   3517      __m128i bias  = _mm_set1_epi16(8);
   3518      __m128i curs = _mm_slli_epi16(curr, 2);
   3519      __m128i prvd = _mm_sub_epi16(prev, curr);
   3520      __m128i nxtd = _mm_sub_epi16(next, curr);
   3521      __m128i curb = _mm_add_epi16(curs, bias);
   3522      __m128i even = _mm_add_epi16(prvd, curb);
   3523      __m128i odd  = _mm_add_epi16(nxtd, curb);
   3524
   3525      // interleave even and odd pixels, then undo scaling.
   3526      __m128i int0 = _mm_unpacklo_epi16(even, odd);
   3527      __m128i int1 = _mm_unpackhi_epi16(even, odd);
   3528      __m128i de0  = _mm_srli_epi16(int0, 4);
   3529      __m128i de1  = _mm_srli_epi16(int1, 4);
   3530
   3531      // pack and write output
   3532      __m128i outv = _mm_packus_epi16(de0, de1);
   3533      _mm_storeu_si128((__m128i *) (out + i*2), outv);
   3534#elif defined(STBI_NEON)
   3535      // load and perform the vertical filtering pass
   3536      // this uses 3*x + y = 4*x + (y - x)
   3537      uint8x8_t farb  = vld1_u8(in_far + i);
   3538      uint8x8_t nearb = vld1_u8(in_near + i);
   3539      int16x8_t diff  = vreinterpretq_s16_u16(vsubl_u8(farb, nearb));
   3540      int16x8_t nears = vreinterpretq_s16_u16(vshll_n_u8(nearb, 2));
   3541      int16x8_t curr  = vaddq_s16(nears, diff); // current row
   3542
   3543      // horizontal filter works the same based on shifted vers of current
   3544      // row. "prev" is current row shifted right by 1 pixel; we need to
   3545      // insert the previous pixel value (from t1).
   3546      // "next" is current row shifted left by 1 pixel, with first pixel
   3547      // of next block of 8 pixels added in.
   3548      int16x8_t prv0 = vextq_s16(curr, curr, 7);
   3549      int16x8_t nxt0 = vextq_s16(curr, curr, 1);
   3550      int16x8_t prev = vsetq_lane_s16(t1, prv0, 0);
   3551      int16x8_t next = vsetq_lane_s16(3*in_near[i+8] + in_far[i+8], nxt0, 7);
   3552
   3553      // horizontal filter, polyphase implementation since it's convenient:
   3554      // even pixels = 3*cur + prev = cur*4 + (prev - cur)
   3555      // odd  pixels = 3*cur + next = cur*4 + (next - cur)
   3556      // note the shared term.
   3557      int16x8_t curs = vshlq_n_s16(curr, 2);
   3558      int16x8_t prvd = vsubq_s16(prev, curr);
   3559      int16x8_t nxtd = vsubq_s16(next, curr);
   3560      int16x8_t even = vaddq_s16(curs, prvd);
   3561      int16x8_t odd  = vaddq_s16(curs, nxtd);
   3562
   3563      // undo scaling and round, then store with even/odd phases interleaved
   3564      uint8x8x2_t o;
   3565      o.val[0] = vqrshrun_n_s16(even, 4);
   3566      o.val[1] = vqrshrun_n_s16(odd,  4);
   3567      vst2_u8(out + i*2, o);
   3568#endif
   3569
   3570      // "previous" value for next iter
   3571      t1 = 3*in_near[i+7] + in_far[i+7];
   3572   }
   3573
   3574   t0 = t1;
   3575   t1 = 3*in_near[i] + in_far[i];
   3576   out[i*2] = stbi__div16(3*t1 + t0 + 8);
   3577
   3578   for (++i; i < w; ++i) {
   3579      t0 = t1;
   3580      t1 = 3*in_near[i]+in_far[i];
   3581      out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
   3582      out[i*2  ] = stbi__div16(3*t1 + t0 + 8);
   3583   }
   3584   out[w*2-1] = stbi__div4(t1+2);
   3585
   3586   STBI_NOTUSED(hs);
   3587
   3588   return out;
   3589}
   3590#endif
   3591
   3592static stbi_uc *stbi__resample_row_generic(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
   3593{
   3594   // resample with nearest-neighbor
   3595   int i,j;
   3596   STBI_NOTUSED(in_far);
   3597   for (i=0; i < w; ++i)
   3598      for (j=0; j < hs; ++j)
   3599         out[i*hs+j] = in_near[i];
   3600   return out;
   3601}
   3602
   3603// this is a reduced-precision calculation of YCbCr-to-RGB introduced
   3604// to make sure the code produces the same results in both SIMD and scalar
   3605#define stbi__float2fixed(x)  (((int) ((x) * 4096.0f + 0.5f)) << 8)
   3606static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
   3607{
   3608   int i;
   3609   for (i=0; i < count; ++i) {
   3610      int y_fixed = (y[i] << 20) + (1<<19); // rounding
   3611      int r,g,b;
   3612      int cr = pcr[i] - 128;
   3613      int cb = pcb[i] - 128;
   3614      r = y_fixed +  cr* stbi__float2fixed(1.40200f);
   3615      g = y_fixed + (cr*-stbi__float2fixed(0.71414f)) + ((cb*-stbi__float2fixed(0.34414f)) & 0xffff0000);
   3616      b = y_fixed                                     +   cb* stbi__float2fixed(1.77200f);
   3617      r >>= 20;
   3618      g >>= 20;
   3619      b >>= 20;
   3620      if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
   3621      if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
   3622      if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
   3623      out[0] = (stbi_uc)r;
   3624      out[1] = (stbi_uc)g;
   3625      out[2] = (stbi_uc)b;
   3626      out[3] = 255;
   3627      out += step;
   3628   }
   3629}
   3630
   3631#if defined(STBI_SSE2) || defined(STBI_NEON)
   3632static void stbi__YCbCr_to_RGB_simd(stbi_uc *out, stbi_uc const *y, stbi_uc const *pcb, stbi_uc const *pcr, int count, int step)
   3633{
   3634   int i = 0;
   3635
   3636#ifdef STBI_SSE2
   3637   // step == 3 is pretty ugly on the final interleave, and i'm not convinced
   3638   // it's useful in practice (you wouldn't use it for textures, for example).
   3639   // so just accelerate step == 4 case.
   3640   if (step == 4) {
   3641      // this is a fairly straightforward implementation and not super-optimized.
   3642      __m128i signflip  = _mm_set1_epi8(-0x80);
   3643      __m128i cr_const0 = _mm_set1_epi16(   (short) ( 1.40200f*4096.0f+0.5f));
   3644      __m128i cr_const1 = _mm_set1_epi16( - (short) ( 0.71414f*4096.0f+0.5f));
   3645      __m128i cb_const0 = _mm_set1_epi16( - (short) ( 0.34414f*4096.0f+0.5f));
   3646      __m128i cb_const1 = _mm_set1_epi16(   (short) ( 1.77200f*4096.0f+0.5f));
   3647      __m128i y_bias = _mm_set1_epi8((char) (unsigned char) 128);
   3648      __m128i xw = _mm_set1_epi16(255); // alpha channel
   3649
   3650      for (; i+7 < count; i += 8) {
   3651         // load
   3652         __m128i y_bytes = _mm_loadl_epi64((__m128i *) (y+i));
   3653         __m128i cr_bytes = _mm_loadl_epi64((__m128i *) (pcr+i));
   3654         __m128i cb_bytes = _mm_loadl_epi64((__m128i *) (pcb+i));
   3655         __m128i cr_biased = _mm_xor_si128(cr_bytes, signflip); // -128
   3656         __m128i cb_biased = _mm_xor_si128(cb_bytes, signflip); // -128
   3657
   3658         // unpack to short (and left-shift cr, cb by 8)
   3659         __m128i yw  = _mm_unpacklo_epi8(y_bias, y_bytes);
   3660         __m128i crw = _mm_unpacklo_epi8(_mm_setzero_si128(), cr_biased);
   3661         __m128i cbw = _mm_unpacklo_epi8(_mm_setzero_si128(), cb_biased);
   3662
   3663         // color transform
   3664         __m128i yws = _mm_srli_epi16(yw, 4);
   3665         __m128i cr0 = _mm_mulhi_epi16(cr_const0, crw);
   3666         __m128i cb0 = _mm_mulhi_epi16(cb_const0, cbw);
   3667         __m128i cb1 = _mm_mulhi_epi16(cbw, cb_const1);
   3668         __m128i cr1 = _mm_mulhi_epi16(crw, cr_const1);
   3669         __m128i rws = _mm_add_epi16(cr0, yws);
   3670         __m128i gwt = _mm_add_epi16(cb0, yws);
   3671         __m128i bws = _mm_add_epi16(yws, cb1);
   3672         __m128i gws = _mm_add_epi16(gwt, cr1);
   3673
   3674         // descale
   3675         __m128i rw = _mm_srai_epi16(rws, 4);
   3676         __m128i bw = _mm_srai_epi16(bws, 4);
   3677         __m128i gw = _mm_srai_epi16(gws, 4);
   3678
   3679         // back to byte, set up for transpose
   3680         __m128i brb = _mm_packus_epi16(rw, bw);
   3681         __m128i gxb = _mm_packus_epi16(gw, xw);
   3682
   3683         // transpose to interleave channels
   3684         __m128i t0 = _mm_unpacklo_epi8(brb, gxb);
   3685         __m128i t1 = _mm_unpackhi_epi8(brb, gxb);
   3686         __m128i o0 = _mm_unpacklo_epi16(t0, t1);
   3687         __m128i o1 = _mm_unpackhi_epi16(t0, t1);
   3688
   3689         // store
   3690         _mm_storeu_si128((__m128i *) (out + 0), o0);
   3691         _mm_storeu_si128((__m128i *) (out + 16), o1);
   3692         out += 32;
   3693      }
   3694   }
   3695#endif
   3696
   3697#ifdef STBI_NEON
   3698   // in this version, step=3 support would be easy to add. but is there demand?
   3699   if (step == 4) {
   3700      // this is a fairly straightforward implementation and not super-optimized.
   3701      uint8x8_t signflip = vdup_n_u8(0x80);
   3702      int16x8_t cr_const0 = vdupq_n_s16(   (short) ( 1.40200f*4096.0f+0.5f));
   3703      int16x8_t cr_const1 = vdupq_n_s16( - (short) ( 0.71414f*4096.0f+0.5f));
   3704      int16x8_t cb_const0 = vdupq_n_s16( - (short) ( 0.34414f*4096.0f+0.5f));
   3705      int16x8_t cb_const1 = vdupq_n_s16(   (short) ( 1.77200f*4096.0f+0.5f));
   3706
   3707      for (; i+7 < count; i += 8) {
   3708         // load
   3709         uint8x8_t y_bytes  = vld1_u8(y + i);
   3710         uint8x8_t cr_bytes = vld1_u8(pcr + i);
   3711         uint8x8_t cb_bytes = vld1_u8(pcb + i);
   3712         int8x8_t cr_biased = vreinterpret_s8_u8(vsub_u8(cr_bytes, signflip));
   3713         int8x8_t cb_biased = vreinterpret_s8_u8(vsub_u8(cb_bytes, signflip));
   3714
   3715         // expand to s16
   3716         int16x8_t yws = vreinterpretq_s16_u16(vshll_n_u8(y_bytes, 4));
   3717         int16x8_t crw = vshll_n_s8(cr_biased, 7);
   3718         int16x8_t cbw = vshll_n_s8(cb_biased, 7);
   3719
   3720         // color transform
   3721         int16x8_t cr0 = vqdmulhq_s16(crw, cr_const0);
   3722         int16x8_t cb0 = vqdmulhq_s16(cbw, cb_const0);
   3723         int16x8_t cr1 = vqdmulhq_s16(crw, cr_const1);
   3724         int16x8_t cb1 = vqdmulhq_s16(cbw, cb_const1);
   3725         int16x8_t rws = vaddq_s16(yws, cr0);
   3726         int16x8_t gws = vaddq_s16(vaddq_s16(yws, cb0), cr1);
   3727         int16x8_t bws = vaddq_s16(yws, cb1);
   3728
   3729         // undo scaling, round, convert to byte
   3730         uint8x8x4_t o;
   3731         o.val[0] = vqrshrun_n_s16(rws, 4);
   3732         o.val[1] = vqrshrun_n_s16(gws, 4);
   3733         o.val[2] = vqrshrun_n_s16(bws, 4);
   3734         o.val[3] = vdup_n_u8(255);
   3735
   3736         // store, interleaving r/g/b/a
   3737         vst4_u8(out, o);
   3738         out += 8*4;
   3739      }
   3740   }
   3741#endif
   3742
   3743   for (; i < count; ++i) {
   3744      int y_fixed = (y[i] << 20) + (1<<19); // rounding
   3745      int r,g,b;
   3746      int cr = pcr[i] - 128;
   3747      int cb = pcb[i] - 128;
   3748      r = y_fixed + cr* stbi__float2fixed(1.40200f);
   3749      g = y_fixed + cr*-stbi__float2fixed(0.71414f) + ((cb*-stbi__float2fixed(0.34414f)) & 0xffff0000);
   3750      b = y_fixed                                   +   cb* stbi__float2fixed(1.77200f);
   3751      r >>= 20;
   3752      g >>= 20;
   3753      b >>= 20;
   3754      if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
   3755      if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
   3756      if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
   3757      out[0] = (stbi_uc)r;
   3758      out[1] = (stbi_uc)g;
   3759      out[2] = (stbi_uc)b;
   3760      out[3] = 255;
   3761      out += step;
   3762   }
   3763}
   3764#endif
   3765
   3766// set up the kernels
   3767static void stbi__setup_jpeg(stbi__jpeg *j)
   3768{
   3769   j->idct_block_kernel = stbi__idct_block;
   3770   j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_row;
   3771   j->resample_row_hv_2_kernel = stbi__resample_row_hv_2;
   3772
   3773#ifdef STBI_SSE2
   3774   if (stbi__sse2_available()) {
   3775      j->idct_block_kernel = stbi__idct_simd;
   3776      j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
   3777      j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
   3778   }
   3779#endif
   3780
   3781#ifdef STBI_NEON
   3782   j->idct_block_kernel = stbi__idct_simd;
   3783   j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
   3784   j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
   3785#endif
   3786}
   3787
   3788// clean up the temporary component buffers
   3789static void stbi__cleanup_jpeg(stbi__jpeg *j)
   3790{
   3791   stbi__free_jpeg_components(j, j->s->img_n, 0);
   3792}
   3793
   3794typedef struct
   3795{
   3796   resample_row_func resample;
   3797   stbi_uc *line0,*line1;
   3798   int hs,vs;   // expansion factor in each axis
   3799   int w_lores; // horizontal pixels pre-expansion
   3800   int ystep;   // how far through vertical expansion we are
   3801   int ypos;    // which pre-expansion row we're on
   3802} stbi__resample;
   3803
   3804// fast 0..255 * 0..255 => 0..255 rounded multiplication
   3805static stbi_uc stbi__blinn_8x8(stbi_uc x, stbi_uc y)
   3806{
   3807   unsigned int t = x*y + 128;
   3808   return (stbi_uc) ((t + (t >>8)) >> 8);
   3809}
   3810
   3811static stbi_uc *load_jpeg_image(stbi__jpeg *z, int *out_x, int *out_y, int *comp, int req_comp)
   3812{
   3813   int n, decode_n, is_rgb;
   3814   z->s->img_n = 0; // make stbi__cleanup_jpeg safe
   3815
   3816   // validate req_comp
   3817   if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
   3818
   3819   // load a jpeg image from whichever source, but leave in YCbCr format
   3820   if (!stbi__decode_jpeg_image(z)) { stbi__cleanup_jpeg(z); return NULL; }
   3821
   3822   // determine actual number of components to generate
   3823   n = req_comp ? req_comp : z->s->img_n >= 3 ? 3 : 1;
   3824
   3825   is_rgb = z->s->img_n == 3 && (z->rgb == 3 || (z->app14_color_transform == 0 && !z->jfif));
   3826
   3827   if (z->s->img_n == 3 && n < 3 && !is_rgb)
   3828      decode_n = 1;
   3829   else
   3830      decode_n = z->s->img_n;
   3831
   3832   // nothing to do if no components requested; check this now to avoid
   3833   // accessing uninitialized coutput[0] later
   3834   if (decode_n <= 0) { stbi__cleanup_jpeg(z); return NULL; }
   3835
   3836   // resample and color-convert
   3837   {
   3838      int k;
   3839      unsigned int i,j;
   3840      stbi_uc *output;
   3841      stbi_uc *coutput[4] = { NULL, NULL, NULL, NULL };
   3842
   3843      stbi__resample res_comp[4];
   3844
   3845      for (k=0; k < decode_n; ++k) {
   3846         stbi__resample *r = &res_comp[k];
   3847
   3848         // allocate line buffer big enough for upsampling off the edges
   3849         // with upsample factor of 4
   3850         z->img_comp[k].linebuf = (stbi_uc *) stbi__malloc(z->s->img_x + 3);
   3851         if (!z->img_comp[k].linebuf) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
   3852
   3853         r->hs      = z->img_h_max / z->img_comp[k].h;
   3854         r->vs      = z->img_v_max / z->img_comp[k].v;
   3855         r->ystep   = r->vs >> 1;
   3856         r->w_lores = (z->s->img_x + r->hs-1) / r->hs;
   3857         r->ypos    = 0;
   3858         r->line0   = r->line1 = z->img_comp[k].data;
   3859
   3860         if      (r->hs == 1 && r->vs == 1) r->resample = resample_row_1;
   3861         else if (r->hs == 1 && r->vs == 2) r->resample = stbi__resample_row_v_2;
   3862         else if (r->hs == 2 && r->vs == 1) r->resample = stbi__resample_row_h_2;
   3863         else if (r->hs == 2 && r->vs == 2) r->resample = z->resample_row_hv_2_kernel;
   3864         else                               r->resample = stbi__resample_row_generic;
   3865      }
   3866
   3867      // can't error after this so, this is safe
   3868      output = (stbi_uc *) stbi__malloc_mad3(n, z->s->img_x, z->s->img_y, 1);
   3869      if (!output) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
   3870
   3871      // now go ahead and resample
   3872      for (j=0; j < z->s->img_y; ++j) {
   3873         stbi_uc *out = output + n * z->s->img_x * j;
   3874         for (k=0; k < decode_n; ++k) {
   3875            stbi__resample *r = &res_comp[k];
   3876            int y_bot = r->ystep >= (r->vs >> 1);
   3877            coutput[k] = r->resample(z->img_comp[k].linebuf,
   3878                                     y_bot ? r->line1 : r->line0,
   3879                                     y_bot ? r->line0 : r->line1,
   3880                                     r->w_lores, r->hs);
   3881            if (++r->ystep >= r->vs) {
   3882               r->ystep = 0;
   3883               r->line0 = r->line1;
   3884               if (++r->ypos < z->img_comp[k].y)
   3885                  r->line1 += z->img_comp[k].w2;
   3886            }
   3887         }
   3888         if (n >= 3) {
   3889            stbi_uc *y = coutput[0];
   3890            if (z->s->img_n == 3) {
   3891               if (is_rgb) {
   3892                  for (i=0; i < z->s->img_x; ++i) {
   3893                     out[0] = y[i];
   3894                     out[1] = coutput[1][i];
   3895                     out[2] = coutput[2][i];
   3896                     out[3] = 255;
   3897                     out += n;
   3898                  }
   3899               } else {
   3900                  z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
   3901               }
   3902            } else if (z->s->img_n == 4) {
   3903               if (z->app14_color_transform == 0) { // CMYK
   3904                  for (i=0; i < z->s->img_x; ++i) {
   3905                     stbi_uc m = coutput[3][i];
   3906                     out[0] = stbi__blinn_8x8(coutput[0][i], m);
   3907                     out[1] = stbi__blinn_8x8(coutput[1][i], m);
   3908                     out[2] = stbi__blinn_8x8(coutput[2][i], m);
   3909                     out[3] = 255;
   3910                     out += n;
   3911                  }
   3912               } else if (z->app14_color_transform == 2) { // YCCK
   3913                  z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
   3914                  for (i=0; i < z->s->img_x; ++i) {
   3915                     stbi_uc m = coutput[3][i];
   3916                     out[0] = stbi__blinn_8x8(255 - out[0], m);
   3917                     out[1] = stbi__blinn_8x8(255 - out[1], m);
   3918                     out[2] = stbi__blinn_8x8(255 - out[2], m);
   3919                     out += n;
   3920                  }
   3921               } else { // YCbCr + alpha?  Ignore the fourth channel for now
   3922                  z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
   3923               }
   3924            } else
   3925               for (i=0; i < z->s->img_x; ++i) {
   3926                  out[0] = out[1] = out[2] = y[i];
   3927                  out[3] = 255; // not used if n==3
   3928                  out += n;
   3929               }
   3930         } else {
   3931            if (is_rgb) {
   3932               if (n == 1)
   3933                  for (i=0; i < z->s->img_x; ++i)
   3934                     *out++ = stbi__compute_y(coutput[0][i], coutput[1][i], coutput[2][i]);
   3935               else {
   3936                  for (i=0; i < z->s->img_x; ++i, out += 2) {
   3937                     out[0] = stbi__compute_y(coutput[0][i], coutput[1][i], coutput[2][i]);
   3938                     out[1] = 255;
   3939                  }
   3940               }
   3941            } else if (z->s->img_n == 4 && z->app14_color_transform == 0) {
   3942               for (i=0; i < z->s->img_x; ++i) {
   3943                  stbi_uc m = coutput[3][i];
   3944                  stbi_uc r = stbi__blinn_8x8(coutput[0][i], m);
   3945                  stbi_uc g = stbi__blinn_8x8(coutput[1][i], m);
   3946                  stbi_uc b = stbi__blinn_8x8(coutput[2][i], m);
   3947                  out[0] = stbi__compute_y(r, g, b);
   3948                  out[1] = 255;
   3949                  out += n;
   3950               }
   3951            } else if (z->s->img_n == 4 && z->app14_color_transform == 2) {
   3952               for (i=0; i < z->s->img_x; ++i) {
   3953                  out[0] = stbi__blinn_8x8(255 - coutput[0][i], coutput[3][i]);
   3954                  out[1] = 255;
   3955                  out += n;
   3956               }
   3957            } else {
   3958               stbi_uc *y = coutput[0];
   3959               if (n == 1)
   3960                  for (i=0; i < z->s->img_x; ++i) out[i] = y[i];
   3961               else
   3962                  for (i=0; i < z->s->img_x; ++i) { *out++ = y[i]; *out++ = 255; }
   3963            }
   3964         }
   3965      }
   3966      stbi__cleanup_jpeg(z);
   3967      *out_x = z->s->img_x;
   3968      *out_y = z->s->img_y;
   3969      if (comp) *comp = z->s->img_n >= 3 ? 3 : 1; // report original components, not output
   3970      return output;
   3971   }
   3972}
   3973
   3974static void *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
   3975{
   3976   unsigned char* result;
   3977   stbi__jpeg* j = (stbi__jpeg*) stbi__malloc(sizeof(stbi__jpeg));
   3978   if (!j) return stbi__errpuc("outofmem", "Out of memory");
   3979   STBI_NOTUSED(ri);
   3980   j->s = s;
   3981   stbi__setup_jpeg(j);
   3982   result = load_jpeg_image(j, x,y,comp,req_comp);
   3983   STBI_FREE(j);
   3984   return result;
   3985}
   3986
   3987static int stbi__jpeg_test(stbi__context *s)
   3988{
   3989   int r;
   3990   stbi__jpeg* j = (stbi__jpeg*)stbi__malloc(sizeof(stbi__jpeg));
   3991   if (!j) return stbi__err("outofmem", "Out of memory");
   3992   j->s = s;
   3993   stbi__setup_jpeg(j);
   3994   r = stbi__decode_jpeg_header(j, STBI__SCAN_type);
   3995   stbi__rewind(s);
   3996   STBI_FREE(j);
   3997   return r;
   3998}
   3999
   4000static int stbi__jpeg_info_raw(stbi__jpeg *j, int *x, int *y, int *comp)
   4001{
   4002   if (!stbi__decode_jpeg_header(j, STBI__SCAN_header)) {
   4003      stbi__rewind( j->s );
   4004      return 0;
   4005   }
   4006   if (x) *x = j->s->img_x;
   4007   if (y) *y = j->s->img_y;
   4008   if (comp) *comp = j->s->img_n >= 3 ? 3 : 1;
   4009   return 1;
   4010}
   4011
   4012static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp)
   4013{
   4014   int result;
   4015   stbi__jpeg* j = (stbi__jpeg*) (stbi__malloc(sizeof(stbi__jpeg)));
   4016   if (!j) return stbi__err("outofmem", "Out of memory");
   4017   j->s = s;
   4018   result = stbi__jpeg_info_raw(j, x, y, comp);
   4019   STBI_FREE(j);
   4020   return result;
   4021}
   4022#endif
   4023
   4024// public domain zlib decode    v0.2  Sean Barrett 2006-11-18
   4025//    simple implementation
   4026//      - all input must be provided in an upfront buffer
   4027//      - all output is written to a single output buffer (can malloc/realloc)
   4028//    performance
   4029//      - fast huffman
   4030
   4031#ifndef STBI_NO_ZLIB
   4032
   4033// fast-way is faster to check than jpeg huffman, but slow way is slower
   4034#define STBI__ZFAST_BITS  9 // accelerate all cases in default tables
   4035#define STBI__ZFAST_MASK  ((1 << STBI__ZFAST_BITS) - 1)
   4036#define STBI__ZNSYMS 288 // number of symbols in literal/length alphabet
   4037
   4038// zlib-style huffman encoding
   4039// (jpegs packs from left, zlib from right, so can't share code)
   4040typedef struct
   4041{
   4042   stbi__uint16 fast[1 << STBI__ZFAST_BITS];
   4043   stbi__uint16 firstcode[16];
   4044   int maxcode[17];
   4045   stbi__uint16 firstsymbol[16];
   4046   stbi_uc  size[STBI__ZNSYMS];
   4047   stbi__uint16 value[STBI__ZNSYMS];
   4048} stbi__zhuffman;
   4049
   4050stbi_inline static int stbi__bitreverse16(int n)
   4051{
   4052  n = ((n & 0xAAAA) >>  1) | ((n & 0x5555) << 1);
   4053  n = ((n & 0xCCCC) >>  2) | ((n & 0x3333) << 2);
   4054  n = ((n & 0xF0F0) >>  4) | ((n & 0x0F0F) << 4);
   4055  n = ((n & 0xFF00) >>  8) | ((n & 0x00FF) << 8);
   4056  return n;
   4057}
   4058
   4059stbi_inline static int stbi__bit_reverse(int v, int bits)
   4060{
   4061   STBI_ASSERT(bits <= 16);
   4062   // to bit reverse n bits, reverse 16 and shift
   4063   // e.g. 11 bits, bit reverse and shift away 5
   4064   return stbi__bitreverse16(v) >> (16-bits);
   4065}
   4066
   4067static int stbi__zbuild_huffman(stbi__zhuffman *z, const stbi_uc *sizelist, int num)
   4068{
   4069   int i,k=0;
   4070   int code, next_code[16], sizes[17];
   4071
   4072   // DEFLATE spec for generating codes
   4073   memset(sizes, 0, sizeof(sizes));
   4074   memset(z->fast, 0, sizeof(z->fast));
   4075   for (i=0; i < num; ++i)
   4076      ++sizes[sizelist[i]];
   4077   sizes[0] = 0;
   4078   for (i=1; i < 16; ++i)
   4079      if (sizes[i] > (1 << i))
   4080         return stbi__err("bad sizes", "Corrupt PNG");
   4081   code = 0;
   4082   for (i=1; i < 16; ++i) {
   4083      next_code[i] = code;
   4084      z->firstcode[i] = (stbi__uint16) code;
   4085      z->firstsymbol[i] = (stbi__uint16) k;
   4086      code = (code + sizes[i]);
   4087      if (sizes[i])
   4088         if (code-1 >= (1 << i)) return stbi__err("bad codelengths","Corrupt PNG");
   4089      z->maxcode[i] = code << (16-i); // preshift for inner loop
   4090      code <<= 1;
   4091      k += sizes[i];
   4092   }
   4093   z->maxcode[16] = 0x10000; // sentinel
   4094   for (i=0; i < num; ++i) {
   4095      int s = sizelist[i];
   4096      if (s) {
   4097         int c = next_code[s] - z->firstcode[s] + z->firstsymbol[s];
   4098         stbi__uint16 fastv = (stbi__uint16) ((s << 9) | i);
   4099         z->size [c] = (stbi_uc     ) s;
   4100         z->value[c] = (stbi__uint16) i;
   4101         if (s <= STBI__ZFAST_BITS) {
   4102            int j = stbi__bit_reverse(next_code[s],s);
   4103            while (j < (1 << STBI__ZFAST_BITS)) {
   4104               z->fast[j] = fastv;
   4105               j += (1 << s);
   4106            }
   4107         }
   4108         ++next_code[s];
   4109      }
   4110   }
   4111   return 1;
   4112}
   4113
   4114// zlib-from-memory implementation for PNG reading
   4115//    because PNG allows splitting the zlib stream arbitrarily,
   4116//    and it's annoying structurally to have PNG call ZLIB call PNG,
   4117//    we require PNG read all the IDATs and combine them into a single
   4118//    memory buffer
   4119
   4120typedef struct
   4121{
   4122   stbi_uc *zbuffer, *zbuffer_end;
   4123   int num_bits;
   4124   stbi__uint32 code_buffer;
   4125
   4126   char *zout;
   4127   char *zout_start;
   4128   char *zout_end;
   4129   int   z_expandable;
   4130
   4131   stbi__zhuffman z_length, z_distance;
   4132} stbi__zbuf;
   4133
   4134stbi_inline static int stbi__zeof(stbi__zbuf *z)
   4135{
   4136   return (z->zbuffer >= z->zbuffer_end);
   4137}
   4138
   4139stbi_inline static stbi_uc stbi__zget8(stbi__zbuf *z)
   4140{
   4141   return stbi__zeof(z) ? 0 : *z->zbuffer++;
   4142}
   4143
   4144static void stbi__fill_bits(stbi__zbuf *z)
   4145{
   4146   do {
   4147      if (z->code_buffer >= (1U << z->num_bits)) {
   4148        z->zbuffer = z->zbuffer_end;  /* treat this as EOF so we fail. */
   4149        return;
   4150      }
   4151      z->code_buffer |= (unsigned int) stbi__zget8(z) << z->num_bits;
   4152      z->num_bits += 8;
   4153   } while (z->num_bits <= 24);
   4154}
   4155
   4156stbi_inline static unsigned int stbi__zreceive(stbi__zbuf *z, int n)
   4157{
   4158   unsigned int k;
   4159   if (z->num_bits < n) stbi__fill_bits(z);
   4160   k = z->code_buffer & ((1 << n) - 1);
   4161   z->code_buffer >>= n;
   4162   z->num_bits -= n;
   4163   return k;
   4164}
   4165
   4166static int stbi__zhuffman_decode_slowpath(stbi__zbuf *a, stbi__zhuffman *z)
   4167{
   4168   int b,s,k;
   4169   // not resolved by fast table, so compute it the slow way
   4170   // use jpeg approach, which requires MSbits at top
   4171   k = stbi__bit_reverse(a->code_buffer, 16);
   4172   for (s=STBI__ZFAST_BITS+1; ; ++s)
   4173      if (k < z->maxcode[s])
   4174         break;
   4175   if (s >= 16) return -1; // invalid code!
   4176   // code size is s, so:
   4177   b = (k >> (16-s)) - z->firstcode[s] + z->firstsymbol[s];
   4178   if (b >= STBI__ZNSYMS) return -1; // some data was corrupt somewhere!
   4179   if (z->size[b] != s) return -1;  // was originally an assert, but report failure instead.
   4180   a->code_buffer >>= s;
   4181   a->num_bits -= s;
   4182   return z->value[b];
   4183}
   4184
   4185stbi_inline static int stbi__zhuffman_decode(stbi__zbuf *a, stbi__zhuffman *z)
   4186{
   4187   int b,s;
   4188   if (a->num_bits < 16) {
   4189      if (stbi__zeof(a)) {
   4190         return -1;   /* report error for unexpected end of data. */
   4191      }
   4192      stbi__fill_bits(a);
   4193   }
   4194   b = z->fast[a->code_buffer & STBI__ZFAST_MASK];
   4195   if (b) {
   4196      s = b >> 9;
   4197      a->code_buffer >>= s;
   4198      a->num_bits -= s;
   4199      return b & 511;
   4200   }
   4201   return stbi__zhuffman_decode_slowpath(a, z);
   4202}
   4203
   4204static int stbi__zexpand(stbi__zbuf *z, char *zout, int n)  // need to make room for n bytes
   4205{
   4206   char *q;
   4207   unsigned int cur, limit, old_limit;
   4208   z->zout = zout;
   4209   if (!z->z_expandable) return stbi__err("output buffer limit","Corrupt PNG");
   4210   cur   = (unsigned int) (z->zout - z->zout_start);
   4211   limit = old_limit = (unsigned) (z->zout_end - z->zout_start);
   4212   if (UINT_MAX - cur < (unsigned) n) return stbi__err("outofmem", "Out of memory");
   4213   while (cur + n > limit) {
   4214      if(limit > UINT_MAX / 2) return stbi__err("outofmem", "Out of memory");
   4215      limit *= 2;
   4216   }
   4217   q = (char *) STBI_REALLOC_SIZED(z->zout_start, old_limit, limit);
   4218   STBI_NOTUSED(old_limit);
   4219   if (q == NULL) return stbi__err("outofmem", "Out of memory");
   4220   z->zout_start = q;
   4221   z->zout       = q + cur;
   4222   z->zout_end   = q + limit;
   4223   return 1;
   4224}
   4225
   4226static const int stbi__zlength_base[31] = {
   4227   3,4,5,6,7,8,9,10,11,13,
   4228   15,17,19,23,27,31,35,43,51,59,
   4229   67,83,99,115,131,163,195,227,258,0,0 };
   4230
   4231static const int stbi__zlength_extra[31]=
   4232{ 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };
   4233
   4234static const int stbi__zdist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,
   4235257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0};
   4236
   4237static const int stbi__zdist_extra[32] =
   4238{ 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13};
   4239
   4240static int stbi__parse_huffman_block(stbi__zbuf *a)
   4241{
   4242   char *zout = a->zout;
   4243   for(;;) {
   4244      int z = stbi__zhuffman_decode(a, &a->z_length);
   4245      if (z < 256) {
   4246         if (z < 0) return stbi__err("bad huffman code","Corrupt PNG"); // error in huffman codes
   4247         if (zout >= a->zout_end) {
   4248            if (!stbi__zexpand(a, zout, 1)) return 0;
   4249            zout = a->zout;
   4250         }
   4251         *zout++ = (char) z;
   4252      } else {
   4253         stbi_uc *p;
   4254         int len,dist;
   4255         if (z == 256) {
   4256            a->zout = zout;
   4257            return 1;
   4258         }
   4259         z -= 257;
   4260         len = stbi__zlength_base[z];
   4261         if (stbi__zlength_extra[z]) len += stbi__zreceive(a, stbi__zlength_extra[z]);
   4262         z = stbi__zhuffman_decode(a, &a->z_distance);
   4263         if (z < 0) return stbi__err("bad huffman code","Corrupt PNG");
   4264         dist = stbi__zdist_base[z];
   4265         if (stbi__zdist_extra[z]) dist += stbi__zreceive(a, stbi__zdist_extra[z]);
   4266         if (zout - a->zout_start < dist) return stbi__err("bad dist","Corrupt PNG");
   4267         if (zout + len > a->zout_end) {
   4268            if (!stbi__zexpand(a, zout, len)) return 0;
   4269            zout = a->zout;
   4270         }
   4271         p = (stbi_uc *) (zout - dist);
   4272         if (dist == 1) { // run of one byte; common in images.
   4273            stbi_uc v = *p;
   4274            if (len) { do *zout++ = v; while (--len); }
   4275         } else {
   4276            if (len) { do *zout++ = *p++; while (--len); }
   4277         }
   4278      }
   4279   }
   4280}
   4281
   4282static int stbi__compute_huffman_codes(stbi__zbuf *a)
   4283{
   4284   static const stbi_uc length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 };
   4285   stbi__zhuffman z_codelength;
   4286   stbi_uc lencodes[286+32+137];//padding for maximum single op
   4287   stbi_uc codelength_sizes[19];
   4288   int i,n;
   4289
   4290   int hlit  = stbi__zreceive(a,5) + 257;
   4291   int hdist = stbi__zreceive(a,5) + 1;
   4292   int hclen = stbi__zreceive(a,4) + 4;
   4293   int ntot  = hlit + hdist;
   4294
   4295   memset(codelength_sizes, 0, sizeof(codelength_sizes));
   4296   for (i=0; i < hclen; ++i) {
   4297      int s = stbi__zreceive(a,3);
   4298      codelength_sizes[length_dezigzag[i]] = (stbi_uc) s;
   4299   }
   4300   if (!stbi__zbuild_huffman(&z_codelength, codelength_sizes, 19)) return 0;
   4301
   4302   n = 0;
   4303   while (n < ntot) {
   4304      int c = stbi__zhuffman_decode(a, &z_codelength);
   4305      if (c < 0 || c >= 19) return stbi__err("bad codelengths", "Corrupt PNG");
   4306      if (c < 16)
   4307         lencodes[n++] = (stbi_uc) c;
   4308      else {
   4309         stbi_uc fill = 0;
   4310         if (c == 16) {
   4311            c = stbi__zreceive(a,2)+3;
   4312            if (n == 0) return stbi__err("bad codelengths", "Corrupt PNG");
   4313            fill = lencodes[n-1];
   4314         } else if (c == 17) {
   4315            c = stbi__zreceive(a,3)+3;
   4316         } else if (c == 18) {
   4317            c = stbi__zreceive(a,7)+11;
   4318         } else {
   4319            return stbi__err("bad codelengths", "Corrupt PNG");
   4320         }
   4321         if (ntot - n < c) return stbi__err("bad codelengths", "Corrupt PNG");
   4322         memset(lencodes+n, fill, c);
   4323         n += c;
   4324      }
   4325   }
   4326   if (n != ntot) return stbi__err("bad codelengths","Corrupt PNG");
   4327   if (!stbi__zbuild_huffman(&a->z_length, lencodes, hlit)) return 0;
   4328   if (!stbi__zbuild_huffman(&a->z_distance, lencodes+hlit, hdist)) return 0;
   4329   return 1;
   4330}
   4331
   4332static int stbi__parse_uncompressed_block(stbi__zbuf *a)
   4333{
   4334   stbi_uc header[4];
   4335   int len,nlen,k;
   4336   if (a->num_bits & 7)
   4337      stbi__zreceive(a, a->num_bits & 7); // discard
   4338   // drain the bit-packed data into header
   4339   k = 0;
   4340   while (a->num_bits > 0) {
   4341      header[k++] = (stbi_uc) (a->code_buffer & 255); // suppress MSVC run-time check
   4342      a->code_buffer >>= 8;
   4343      a->num_bits -= 8;
   4344   }
   4345   if (a->num_bits < 0) return stbi__err("zlib corrupt","Corrupt PNG");
   4346   // now fill header the normal way
   4347   while (k < 4)
   4348      header[k++] = stbi__zget8(a);
   4349   len  = header[1] * 256 + header[0];
   4350   nlen = header[3] * 256 + header[2];
   4351   if (nlen != (len ^ 0xffff)) return stbi__err("zlib corrupt","Corrupt PNG");
   4352   if (a->zbuffer + len > a->zbuffer_end) return stbi__err("read past buffer","Corrupt PNG");
   4353   if (a->zout + len > a->zout_end)
   4354      if (!stbi__zexpand(a, a->zout, len)) return 0;
   4355   memcpy(a->zout, a->zbuffer, len);
   4356   a->zbuffer += len;
   4357   a->zout += len;
   4358   return 1;
   4359}
   4360
   4361static int stbi__parse_zlib_header(stbi__zbuf *a)
   4362{
   4363   int cmf   = stbi__zget8(a);
   4364   int cm    = cmf & 15;
   4365   /* int cinfo = cmf >> 4; */
   4366   int flg   = stbi__zget8(a);
   4367   if (stbi__zeof(a)) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
   4368   if ((cmf*256+flg) % 31 != 0) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
   4369   if (flg & 32) return stbi__err("no preset dict","Corrupt PNG"); // preset dictionary not allowed in png
   4370   if (cm != 8) return stbi__err("bad compression","Corrupt PNG"); // DEFLATE required for png
   4371   // window = 1 << (8 + cinfo)... but who cares, we fully buffer output
   4372   return 1;
   4373}
   4374
   4375static const stbi_uc stbi__zdefault_length[STBI__ZNSYMS] =
   4376{
   4377   8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
   4378   8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
   4379   8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
   4380   8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
   4381   8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
   4382   9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
   4383   9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
   4384   9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
   4385   7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7, 7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,8
   4386};
   4387static const stbi_uc stbi__zdefault_distance[32] =
   4388{
   4389   5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5
   4390};
   4391/*
   4392Init algorithm:
   4393{
   4394   int i;   // use <= to match clearly with spec
   4395   for (i=0; i <= 143; ++i)     stbi__zdefault_length[i]   = 8;
   4396   for (   ; i <= 255; ++i)     stbi__zdefault_length[i]   = 9;
   4397   for (   ; i <= 279; ++i)     stbi__zdefault_length[i]   = 7;
   4398   for (   ; i <= 287; ++i)     stbi__zdefault_length[i]   = 8;
   4399
   4400   for (i=0; i <=  31; ++i)     stbi__zdefault_distance[i] = 5;
   4401}
   4402*/
   4403
   4404static int stbi__parse_zlib(stbi__zbuf *a, int parse_header)
   4405{
   4406   int final, type;
   4407   if (parse_header)
   4408      if (!stbi__parse_zlib_header(a)) return 0;
   4409   a->num_bits = 0;
   4410   a->code_buffer = 0;
   4411   do {
   4412      final = stbi__zreceive(a,1);
   4413      type = stbi__zreceive(a,2);
   4414      if (type == 0) {
   4415         if (!stbi__parse_uncompressed_block(a)) return 0;
   4416      } else if (type == 3) {
   4417         return 0;
   4418      } else {
   4419         if (type == 1) {
   4420            // use fixed code lengths
   4421            if (!stbi__zbuild_huffman(&a->z_length  , stbi__zdefault_length  , STBI__ZNSYMS)) return 0;
   4422            if (!stbi__zbuild_huffman(&a->z_distance, stbi__zdefault_distance,  32)) return 0;
   4423         } else {
   4424            if (!stbi__compute_huffman_codes(a)) return 0;
   4425         }
   4426         if (!stbi__parse_huffman_block(a)) return 0;
   4427      }
   4428   } while (!final);
   4429   return 1;
   4430}
   4431
   4432static int stbi__do_zlib(stbi__zbuf *a, char *obuf, int olen, int exp, int parse_header)
   4433{
   4434   a->zout_start = obuf;
   4435   a->zout       = obuf;
   4436   a->zout_end   = obuf + olen;
   4437   a->z_expandable = exp;
   4438
   4439   return stbi__parse_zlib(a, parse_header);
   4440}
   4441
   4442STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen)
   4443{
   4444   stbi__zbuf a;
   4445   char *p = (char *) stbi__malloc(initial_size);
   4446   if (p == NULL) return NULL;
   4447   a.zbuffer = (stbi_uc *) buffer;
   4448   a.zbuffer_end = (stbi_uc *) buffer + len;
   4449   if (stbi__do_zlib(&a, p, initial_size, 1, 1)) {
   4450      if (outlen) *outlen = (int) (a.zout - a.zout_start);
   4451      return a.zout_start;
   4452   } else {
   4453      STBI_FREE(a.zout_start);
   4454      return NULL;
   4455   }
   4456}
   4457
   4458STBIDEF char *stbi_zlib_decode_malloc(char const *buffer, int len, int *outlen)
   4459{
   4460   return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen);
   4461}
   4462
   4463STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header)
   4464{
   4465   stbi__zbuf a;
   4466   char *p = (char *) stbi__malloc(initial_size);
   4467   if (p == NULL) return NULL;
   4468   a.zbuffer = (stbi_uc *) buffer;
   4469   a.zbuffer_end = (stbi_uc *) buffer + len;
   4470   if (stbi__do_zlib(&a, p, initial_size, 1, parse_header)) {
   4471      if (outlen) *outlen = (int) (a.zout - a.zout_start);
   4472      return a.zout_start;
   4473   } else {
   4474      STBI_FREE(a.zout_start);
   4475      return NULL;
   4476   }
   4477}
   4478
   4479STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, char const *ibuffer, int ilen)
   4480{
   4481   stbi__zbuf a;
   4482   a.zbuffer = (stbi_uc *) ibuffer;
   4483   a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
   4484   if (stbi__do_zlib(&a, obuffer, olen, 0, 1))
   4485      return (int) (a.zout - a.zout_start);
   4486   else
   4487      return -1;
   4488}
   4489
   4490STBIDEF char *stbi_zlib_decode_noheader_malloc(char const *buffer, int len, int *outlen)
   4491{
   4492   stbi__zbuf a;
   4493   char *p = (char *) stbi__malloc(16384);
   4494   if (p == NULL) return NULL;
   4495   a.zbuffer = (stbi_uc *) buffer;
   4496   a.zbuffer_end = (stbi_uc *) buffer+len;
   4497   if (stbi__do_zlib(&a, p, 16384, 1, 0)) {
   4498      if (outlen) *outlen = (int) (a.zout - a.zout_start);
   4499      return a.zout_start;
   4500   } else {
   4501      STBI_FREE(a.zout_start);
   4502      return NULL;
   4503   }
   4504}
   4505
   4506STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen)
   4507{
   4508   stbi__zbuf a;
   4509   a.zbuffer = (stbi_uc *) ibuffer;
   4510   a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
   4511   if (stbi__do_zlib(&a, obuffer, olen, 0, 0))
   4512      return (int) (a.zout - a.zout_start);
   4513   else
   4514      return -1;
   4515}
   4516#endif
   4517
   4518// public domain "baseline" PNG decoder   v0.10  Sean Barrett 2006-11-18
   4519//    simple implementation
   4520//      - only 8-bit samples
   4521//      - no CRC checking
   4522//      - allocates lots of intermediate memory
   4523//        - avoids problem of streaming data between subsystems
   4524//        - avoids explicit window management
   4525//    performance
   4526//      - uses stb_zlib, a PD zlib implementation with fast huffman decoding
   4527
   4528#ifndef STBI_NO_PNG
   4529typedef struct
   4530{
   4531   stbi__uint32 length;
   4532   stbi__uint32 type;
   4533} stbi__pngchunk;
   4534
   4535static stbi__pngchunk stbi__get_chunk_header(stbi__context *s)
   4536{
   4537   stbi__pngchunk c;
   4538   c.length = stbi__get32be(s);
   4539   c.type   = stbi__get32be(s);
   4540   return c;
   4541}
   4542
   4543static int stbi__check_png_header(stbi__context *s)
   4544{
   4545   static const stbi_uc png_sig[8] = { 137,80,78,71,13,10,26,10 };
   4546   int i;
   4547   for (i=0; i < 8; ++i)
   4548      if (stbi__get8(s) != png_sig[i]) return stbi__err("bad png sig","Not a PNG");
   4549   return 1;
   4550}
   4551
   4552typedef struct
   4553{
   4554   stbi__context *s;
   4555   stbi_uc *idata, *expanded, *out;
   4556   int depth;
   4557} stbi__png;
   4558
   4559
   4560enum {
   4561   STBI__F_none=0,
   4562   STBI__F_sub=1,
   4563   STBI__F_up=2,
   4564   STBI__F_avg=3,
   4565   STBI__F_paeth=4,
   4566   // synthetic filters used for first scanline to avoid needing a dummy row of 0s
   4567   STBI__F_avg_first,
   4568   STBI__F_paeth_first
   4569};
   4570
   4571static stbi_uc first_row_filter[5] =
   4572{
   4573   STBI__F_none,
   4574   STBI__F_sub,
   4575   STBI__F_none,
   4576   STBI__F_avg_first,
   4577   STBI__F_paeth_first
   4578};
   4579
   4580static int stbi__paeth(int a, int b, int c)
   4581{
   4582   int p = a + b - c;
   4583   int pa = abs(p-a);
   4584   int pb = abs(p-b);
   4585   int pc = abs(p-c);
   4586   if (pa <= pb && pa <= pc) return a;
   4587   if (pb <= pc) return b;
   4588   return c;
   4589}
   4590
   4591static const stbi_uc stbi__depth_scale_table[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };
   4592
   4593// create the png data from post-deflated data
   4594static int stbi__create_png_image_raw(stbi__png *a, stbi_uc *raw, stbi__uint32 raw_len, int out_n, stbi__uint32 x, stbi__uint32 y, int depth, int color)
   4595{
   4596   int bytes = (depth == 16? 2 : 1);
   4597   stbi__context *s = a->s;
   4598   stbi__uint32 i,j,stride = x*out_n*bytes;
   4599   stbi__uint32 img_len, img_width_bytes;
   4600   int k;
   4601   int img_n = s->img_n; // copy it into a local for later
   4602
   4603   int output_bytes = out_n*bytes;
   4604   int filter_bytes = img_n*bytes;
   4605   int width = x;
   4606
   4607   STBI_ASSERT(out_n == s->img_n || out_n == s->img_n+1);
   4608   a->out = (stbi_uc *) stbi__malloc_mad3(x, y, output_bytes, 0); // extra bytes to write off the end into
   4609   if (!a->out) return stbi__err("outofmem", "Out of memory");
   4610
   4611   if (!stbi__mad3sizes_valid(img_n, x, depth, 7)) return stbi__err("too large", "Corrupt PNG");
   4612   img_width_bytes = (((img_n * x * depth) + 7) >> 3);
   4613   img_len = (img_width_bytes + 1) * y;
   4614
   4615   // we used to check for exact match between raw_len and img_len on non-interlaced PNGs,
   4616   // but issue #276 reported a PNG in the wild that had extra data at the end (all zeros),
   4617   // so just check for raw_len < img_len always.
   4618   if (raw_len < img_len) return stbi__err("not enough pixels","Corrupt PNG");
   4619
   4620   for (j=0; j < y; ++j) {
   4621      stbi_uc *cur = a->out + stride*j;
   4622      stbi_uc *prior;
   4623      int filter = *raw++;
   4624
   4625      if (filter > 4)
   4626         return stbi__err("invalid filter","Corrupt PNG");
   4627
   4628      if (depth < 8) {
   4629         if (img_width_bytes > x) return stbi__err("invalid width","Corrupt PNG");
   4630         cur += x*out_n - img_width_bytes; // store output to the rightmost img_len bytes, so we can decode in place
   4631         filter_bytes = 1;
   4632         width = img_width_bytes;
   4633      }
   4634      prior = cur - stride; // bugfix: need to compute this after 'cur +=' computation above
   4635
   4636      // if first row, use special filter that doesn't sample previous row
   4637      if (j == 0) filter = first_row_filter[filter];
   4638
   4639      // handle first byte explicitly
   4640      for (k=0; k < filter_bytes; ++k) {
   4641         switch (filter) {
   4642            case STBI__F_none       : cur[k] = raw[k]; break;
   4643            case STBI__F_sub        : cur[k] = raw[k]; break;
   4644            case STBI__F_up         : cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
   4645            case STBI__F_avg        : cur[k] = STBI__BYTECAST(raw[k] + (prior[k]>>1)); break;
   4646            case STBI__F_paeth      : cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(0,prior[k],0)); break;
   4647            case STBI__F_avg_first  : cur[k] = raw[k]; break;
   4648            case STBI__F_paeth_first: cur[k] = raw[k]; break;
   4649         }
   4650      }
   4651
   4652      if (depth == 8) {
   4653         if (img_n != out_n)
   4654            cur[img_n] = 255; // first pixel
   4655         raw += img_n;
   4656         cur += out_n;
   4657         prior += out_n;
   4658      } else if (depth == 16) {
   4659         if (img_n != out_n) {
   4660            cur[filter_bytes]   = 255; // first pixel top byte
   4661            cur[filter_bytes+1] = 255; // first pixel bottom byte
   4662         }
   4663         raw += filter_bytes;
   4664         cur += output_bytes;
   4665         prior += output_bytes;
   4666      } else {
   4667         raw += 1;
   4668         cur += 1;
   4669         prior += 1;
   4670      }
   4671
   4672      // this is a little gross, so that we don't switch per-pixel or per-component
   4673      if (depth < 8 || img_n == out_n) {
   4674         int nk = (width - 1)*filter_bytes;
   4675         #define STBI__CASE(f) \
   4676             case f:     \
   4677                for (k=0; k < nk; ++k)
   4678         switch (filter) {
   4679            // "none" filter turns into a memcpy here; make that explicit.
   4680            case STBI__F_none:         memcpy(cur, raw, nk); break;
   4681            STBI__CASE(STBI__F_sub)          { cur[k] = STBI__BYTECAST(raw[k] + cur[k-filter_bytes]); } break;
   4682            STBI__CASE(STBI__F_up)           { cur[k] = STBI__BYTECAST(raw[k] + prior[k]); } break;
   4683            STBI__CASE(STBI__F_avg)          { cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-filter_bytes])>>1)); } break;
   4684            STBI__CASE(STBI__F_paeth)        { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],prior[k],prior[k-filter_bytes])); } break;
   4685            STBI__CASE(STBI__F_avg_first)    { cur[k] = STBI__BYTECAST(raw[k] + (cur[k-filter_bytes] >> 1)); } break;
   4686            STBI__CASE(STBI__F_paeth_first)  { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],0,0)); } break;
   4687         }
   4688         #undef STBI__CASE
   4689         raw += nk;
   4690      } else {
   4691         STBI_ASSERT(img_n+1 == out_n);
   4692         #define STBI__CASE(f) \
   4693             case f:     \
   4694                for (i=x-1; i >= 1; --i, cur[filter_bytes]=255,raw+=filter_bytes,cur+=output_bytes,prior+=output_bytes) \
   4695                   for (k=0; k < filter_bytes; ++k)
   4696         switch (filter) {
   4697            STBI__CASE(STBI__F_none)         { cur[k] = raw[k]; } break;
   4698            STBI__CASE(STBI__F_sub)          { cur[k] = STBI__BYTECAST(raw[k] + cur[k- output_bytes]); } break;
   4699            STBI__CASE(STBI__F_up)           { cur[k] = STBI__BYTECAST(raw[k] + prior[k]); } break;
   4700            STBI__CASE(STBI__F_avg)          { cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k- output_bytes])>>1)); } break;
   4701            STBI__CASE(STBI__F_paeth)        { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k- output_bytes],prior[k],prior[k- output_bytes])); } break;
   4702            STBI__CASE(STBI__F_avg_first)    { cur[k] = STBI__BYTECAST(raw[k] + (cur[k- output_bytes] >> 1)); } break;
   4703            STBI__CASE(STBI__F_paeth_first)  { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k- output_bytes],0,0)); } break;
   4704         }
   4705         #undef STBI__CASE
   4706
   4707         // the loop above sets the high byte of the pixels' alpha, but for
   4708         // 16 bit png files we also need the low byte set. we'll do that here.
   4709         if (depth == 16) {
   4710            cur = a->out + stride*j; // start at the beginning of the row again
   4711            for (i=0; i < x; ++i,cur+=output_bytes) {
   4712               cur[filter_bytes+1] = 255;
   4713            }
   4714         }
   4715      }
   4716   }
   4717
   4718   // we make a separate pass to expand bits to pixels; for performance,
   4719   // this could run two scanlines behind the above code, so it won't
   4720   // intefere with filtering but will still be in the cache.
   4721   if (depth < 8) {
   4722      for (j=0; j < y; ++j) {
   4723         stbi_uc *cur = a->out + stride*j;
   4724         stbi_uc *in  = a->out + stride*j + x*out_n - img_width_bytes;
   4725         // unpack 1/2/4-bit into a 8-bit buffer. allows us to keep the common 8-bit path optimal at minimal cost for 1/2/4-bit
   4726         // png guarante byte alignment, if width is not multiple of 8/4/2 we'll decode dummy trailing data that will be skipped in the later loop
   4727         stbi_uc scale = (color == 0) ? stbi__depth_scale_table[depth] : 1; // scale grayscale values to 0..255 range
   4728
   4729         // note that the final byte might overshoot and write more data than desired.
   4730         // we can allocate enough data that this never writes out of memory, but it
   4731         // could also overwrite the next scanline. can it overwrite non-empty data
   4732         // on the next scanline? yes, consider 1-pixel-wide scanlines with 1-bit-per-pixel.
   4733         // so we need to explicitly clamp the final ones
   4734
   4735         if (depth == 4) {
   4736            for (k=x*img_n; k >= 2; k-=2, ++in) {
   4737               *cur++ = scale * ((*in >> 4)       );
   4738               *cur++ = scale * ((*in     ) & 0x0f);
   4739            }
   4740            if (k > 0) *cur++ = scale * ((*in >> 4)       );
   4741         } else if (depth == 2) {
   4742            for (k=x*img_n; k >= 4; k-=4, ++in) {
   4743               *cur++ = scale * ((*in >> 6)       );
   4744               *cur++ = scale * ((*in >> 4) & 0x03);
   4745               *cur++ = scale * ((*in >> 2) & 0x03);
   4746               *cur++ = scale * ((*in     ) & 0x03);
   4747            }
   4748            if (k > 0) *cur++ = scale * ((*in >> 6)       );
   4749            if (k > 1) *cur++ = scale * ((*in >> 4) & 0x03);
   4750            if (k > 2) *cur++ = scale * ((*in >> 2) & 0x03);
   4751         } else if (depth == 1) {
   4752            for (k=x*img_n; k >= 8; k-=8, ++in) {
   4753               *cur++ = scale * ((*in >> 7)       );
   4754               *cur++ = scale * ((*in >> 6) & 0x01);
   4755               *cur++ = scale * ((*in >> 5) & 0x01);
   4756               *cur++ = scale * ((*in >> 4) & 0x01);
   4757               *cur++ = scale * ((*in >> 3) & 0x01);
   4758               *cur++ = scale * ((*in >> 2) & 0x01);
   4759               *cur++ = scale * ((*in >> 1) & 0x01);
   4760               *cur++ = scale * ((*in     ) & 0x01);
   4761            }
   4762            if (k > 0) *cur++ = scale * ((*in >> 7)       );
   4763            if (k > 1) *cur++ = scale * ((*in >> 6) & 0x01);
   4764            if (k > 2) *cur++ = scale * ((*in >> 5) & 0x01);
   4765            if (k > 3) *cur++ = scale * ((*in >> 4) & 0x01);
   4766            if (k > 4) *cur++ = scale * ((*in >> 3) & 0x01);
   4767            if (k > 5) *cur++ = scale * ((*in >> 2) & 0x01);
   4768            if (k > 6) *cur++ = scale * ((*in >> 1) & 0x01);
   4769         }
   4770         if (img_n != out_n) {
   4771            int q;
   4772            // insert alpha = 255
   4773            cur = a->out + stride*j;
   4774            if (img_n == 1) {
   4775               for (q=x-1; q >= 0; --q) {
   4776                  cur[q*2+1] = 255;
   4777                  cur[q*2+0] = cur[q];
   4778               }
   4779            } else {
   4780               STBI_ASSERT(img_n == 3);
   4781               for (q=x-1; q >= 0; --q) {
   4782                  cur[q*4+3] = 255;
   4783                  cur[q*4+2] = cur[q*3+2];
   4784                  cur[q*4+1] = cur[q*3+1];
   4785                  cur[q*4+0] = cur[q*3+0];
   4786               }
   4787            }
   4788         }
   4789      }
   4790   } else if (depth == 16) {
   4791      // force the image data from big-endian to platform-native.
   4792      // this is done in a separate pass due to the decoding relying
   4793      // on the data being untouched, but could probably be done
   4794      // per-line during decode if care is taken.
   4795      stbi_uc *cur = a->out;
   4796      stbi__uint16 *cur16 = (stbi__uint16*)cur;
   4797
   4798      for(i=0; i < x*y*out_n; ++i,cur16++,cur+=2) {
   4799         *cur16 = (cur[0] << 8) | cur[1];
   4800      }
   4801   }
   4802
   4803   return 1;
   4804}
   4805
   4806static int stbi__create_png_image(stbi__png *a, stbi_uc *image_data, stbi__uint32 image_data_len, int out_n, int depth, int color, int interlaced)
   4807{
   4808   int bytes = (depth == 16 ? 2 : 1);
   4809   int out_bytes = out_n * bytes;
   4810   stbi_uc *final;
   4811   int p;
   4812   if (!interlaced)
   4813      return stbi__create_png_image_raw(a, image_data, image_data_len, out_n, a->s->img_x, a->s->img_y, depth, color);
   4814
   4815   // de-interlacing
   4816   final = (stbi_uc *) stbi__malloc_mad3(a->s->img_x, a->s->img_y, out_bytes, 0);
   4817   if (!final) return stbi__err("outofmem", "Out of memory");
   4818   for (p=0; p < 7; ++p) {
   4819      int xorig[] = { 0,4,0,2,0,1,0 };
   4820      int yorig[] = { 0,0,4,0,2,0,1 };
   4821      int xspc[]  = { 8,8,4,4,2,2,1 };
   4822      int yspc[]  = { 8,8,8,4,4,2,2 };
   4823      int i,j,x,y;
   4824      // pass1_x[4] = 0, pass1_x[5] = 1, pass1_x[12] = 1
   4825      x = (a->s->img_x - xorig[p] + xspc[p]-1) / xspc[p];
   4826      y = (a->s->img_y - yorig[p] + yspc[p]-1) / yspc[p];
   4827      if (x && y) {
   4828         stbi__uint32 img_len = ((((a->s->img_n * x * depth) + 7) >> 3) + 1) * y;
   4829         if (!stbi__create_png_image_raw(a, image_data, image_data_len, out_n, x, y, depth, color)) {
   4830            STBI_FREE(final);
   4831            return 0;
   4832         }
   4833         for (j=0; j < y; ++j) {
   4834            for (i=0; i < x; ++i) {
   4835               int out_y = j*yspc[p]+yorig[p];
   4836               int out_x = i*xspc[p]+xorig[p];
   4837               memcpy(final + out_y*a->s->img_x*out_bytes + out_x*out_bytes,
   4838                      a->out + (j*x+i)*out_bytes, out_bytes);
   4839            }
   4840         }
   4841         STBI_FREE(a->out);
   4842         image_data += img_len;
   4843         image_data_len -= img_len;
   4844      }
   4845   }
   4846   a->out = final;
   4847
   4848   return 1;
   4849}
   4850
   4851static int stbi__compute_transparency(stbi__png *z, stbi_uc tc[3], int out_n)
   4852{
   4853   stbi__context *s = z->s;
   4854   stbi__uint32 i, pixel_count = s->img_x * s->img_y;
   4855   stbi_uc *p = z->out;
   4856
   4857   // compute color-based transparency, assuming we've
   4858   // already got 255 as the alpha value in the output
   4859   STBI_ASSERT(out_n == 2 || out_n == 4);
   4860
   4861   if (out_n == 2) {
   4862      for (i=0; i < pixel_count; ++i) {
   4863         p[1] = (p[0] == tc[0] ? 0 : 255);
   4864         p += 2;
   4865      }
   4866   } else {
   4867      for (i=0; i < pixel_count; ++i) {
   4868         if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
   4869            p[3] = 0;
   4870         p += 4;
   4871      }
   4872   }
   4873   return 1;
   4874}
   4875
   4876static int stbi__compute_transparency16(stbi__png *z, stbi__uint16 tc[3], int out_n)
   4877{
   4878   stbi__context *s = z->s;
   4879   stbi__uint32 i, pixel_count = s->img_x * s->img_y;
   4880   stbi__uint16 *p = (stbi__uint16*) z->out;
   4881
   4882   // compute color-based transparency, assuming we've
   4883   // already got 65535 as the alpha value in the output
   4884   STBI_ASSERT(out_n == 2 || out_n == 4);
   4885
   4886   if (out_n == 2) {
   4887      for (i = 0; i < pixel_count; ++i) {
   4888         p[1] = (p[0] == tc[0] ? 0 : 65535);
   4889         p += 2;
   4890      }
   4891   } else {
   4892      for (i = 0; i < pixel_count; ++i) {
   4893         if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
   4894            p[3] = 0;
   4895         p += 4;
   4896      }
   4897   }
   4898   return 1;
   4899}
   4900
   4901static int stbi__expand_png_palette(stbi__png *a, stbi_uc *palette, int len, int pal_img_n)
   4902{
   4903   stbi__uint32 i, pixel_count = a->s->img_x * a->s->img_y;
   4904   stbi_uc *p, *temp_out, *orig = a->out;
   4905
   4906   p = (stbi_uc *) stbi__malloc_mad2(pixel_count, pal_img_n, 0);
   4907   if (p == NULL) return stbi__err("outofmem", "Out of memory");
   4908
   4909   // between here and free(out) below, exitting would leak
   4910   temp_out = p;
   4911
   4912   if (pal_img_n == 3) {
   4913      for (i=0; i < pixel_count; ++i) {
   4914         int n = orig[i]*4;
   4915         p[0] = palette[n  ];
   4916         p[1] = palette[n+1];
   4917         p[2] = palette[n+2];
   4918         p += 3;
   4919      }
   4920   } else {
   4921      for (i=0; i < pixel_count; ++i) {
   4922         int n = orig[i]*4;
   4923         p[0] = palette[n  ];
   4924         p[1] = palette[n+1];
   4925         p[2] = palette[n+2];
   4926         p[3] = palette[n+3];
   4927         p += 4;
   4928      }
   4929   }
   4930   STBI_FREE(a->out);
   4931   a->out = temp_out;
   4932
   4933   STBI_NOTUSED(len);
   4934
   4935   return 1;
   4936}
   4937
   4938static int stbi__unpremultiply_on_load_global = 0;
   4939static int stbi__de_iphone_flag_global = 0;
   4940
   4941STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
   4942{
   4943   stbi__unpremultiply_on_load_global = flag_true_if_should_unpremultiply;
   4944}
   4945
   4946STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
   4947{
   4948   stbi__de_iphone_flag_global = flag_true_if_should_convert;
   4949}
   4950
   4951#ifndef STBI_THREAD_LOCAL
   4952#define stbi__unpremultiply_on_load  stbi__unpremultiply_on_load_global
   4953#define stbi__de_iphone_flag  stbi__de_iphone_flag_global
   4954#else
   4955static STBI_THREAD_LOCAL int stbi__unpremultiply_on_load_local, stbi__unpremultiply_on_load_set;
   4956static STBI_THREAD_LOCAL int stbi__de_iphone_flag_local, stbi__de_iphone_flag_set;
   4957
   4958STBIDEF void stbi__unpremultiply_on_load_thread(int flag_true_if_should_unpremultiply)
   4959{
   4960   stbi__unpremultiply_on_load_local = flag_true_if_should_unpremultiply;
   4961   stbi__unpremultiply_on_load_set = 1;
   4962}
   4963
   4964STBIDEF void stbi_convert_iphone_png_to_rgb_thread(int flag_true_if_should_convert)
   4965{
   4966   stbi__de_iphone_flag_local = flag_true_if_should_convert;
   4967   stbi__de_iphone_flag_set = 1;
   4968}
   4969
   4970#define stbi__unpremultiply_on_load  (stbi__unpremultiply_on_load_set           \
   4971                                       ? stbi__unpremultiply_on_load_local      \
   4972                                       : stbi__unpremultiply_on_load_global)
   4973#define stbi__de_iphone_flag  (stbi__de_iphone_flag_set                         \
   4974                                ? stbi__de_iphone_flag_local                    \
   4975                                : stbi__de_iphone_flag_global)
   4976#endif // STBI_THREAD_LOCAL
   4977
   4978static void stbi__de_iphone(stbi__png *z)
   4979{
   4980   stbi__context *s = z->s;
   4981   stbi__uint32 i, pixel_count = s->img_x * s->img_y;
   4982   stbi_uc *p = z->out;
   4983
   4984   if (s->img_out_n == 3) {  // convert bgr to rgb
   4985      for (i=0; i < pixel_count; ++i) {
   4986         stbi_uc t = p[0];
   4987         p[0] = p[2];
   4988         p[2] = t;
   4989         p += 3;
   4990      }
   4991   } else {
   4992      STBI_ASSERT(s->img_out_n == 4);
   4993      if (stbi__unpremultiply_on_load) {
   4994         // convert bgr to rgb and unpremultiply
   4995         for (i=0; i < pixel_count; ++i) {
   4996            stbi_uc a = p[3];
   4997            stbi_uc t = p[0];
   4998            if (a) {
   4999               stbi_uc half = a / 2;
   5000               p[0] = (p[2] * 255 + half) / a;
   5001               p[1] = (p[1] * 255 + half) / a;
   5002               p[2] = ( t   * 255 + half) / a;
   5003            } else {
   5004               p[0] = p[2];
   5005               p[2] = t;
   5006            }
   5007            p += 4;
   5008         }
   5009      } else {
   5010         // convert bgr to rgb
   5011         for (i=0; i < pixel_count; ++i) {
   5012            stbi_uc t = p[0];
   5013            p[0] = p[2];
   5014            p[2] = t;
   5015            p += 4;
   5016         }
   5017      }
   5018   }
   5019}
   5020
   5021#define STBI__PNG_TYPE(a,b,c,d)  (((unsigned) (a) << 24) + ((unsigned) (b) << 16) + ((unsigned) (c) << 8) + (unsigned) (d))
   5022
   5023static int stbi__parse_png_file(stbi__png *z, int scan, int req_comp)
   5024{
   5025   stbi_uc palette[1024], pal_img_n=0;
   5026   stbi_uc has_trans=0, tc[3]={0};
   5027   stbi__uint16 tc16[3];
   5028   stbi__uint32 ioff=0, idata_limit=0, i, pal_len=0;
   5029   int first=1,k,interlace=0, color=0, is_iphone=0;
   5030   stbi__context *s = z->s;
   5031
   5032   z->expanded = NULL;
   5033   z->idata = NULL;
   5034   z->out = NULL;
   5035
   5036   if (!stbi__check_png_header(s)) return 0;
   5037
   5038   if (scan == STBI__SCAN_type) return 1;
   5039
   5040   for (;;) {
   5041      stbi__pngchunk c = stbi__get_chunk_header(s);
   5042      switch (c.type) {
   5043         case STBI__PNG_TYPE('C','g','B','I'):
   5044            is_iphone = 1;
   5045            stbi__skip(s, c.length);
   5046            break;
   5047         case STBI__PNG_TYPE('I','H','D','R'): {
   5048            int comp,filter;
   5049            if (!first) return stbi__err("multiple IHDR","Corrupt PNG");
   5050            first = 0;
   5051            if (c.length != 13) return stbi__err("bad IHDR len","Corrupt PNG");
   5052            s->img_x = stbi__get32be(s);
   5053            s->img_y = stbi__get32be(s);
   5054            if (s->img_y > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
   5055            if (s->img_x > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
   5056            z->depth = stbi__get8(s);  if (z->depth != 1 && z->depth != 2 && z->depth != 4 && z->depth != 8 && z->depth != 16)  return stbi__err("1/2/4/8/16-bit only","PNG not supported: 1/2/4/8/16-bit only");
   5057            color = stbi__get8(s);  if (color > 6)         return stbi__err("bad ctype","Corrupt PNG");
   5058            if (color == 3 && z->depth == 16)                  return stbi__err("bad ctype","Corrupt PNG");
   5059            if (color == 3) pal_img_n = 3; else if (color & 1) return stbi__err("bad ctype","Corrupt PNG");
   5060            comp  = stbi__get8(s);  if (comp) return stbi__err("bad comp method","Corrupt PNG");
   5061            filter= stbi__get8(s);  if (filter) return stbi__err("bad filter method","Corrupt PNG");
   5062            interlace = stbi__get8(s); if (interlace>1) return stbi__err("bad interlace method","Corrupt PNG");
   5063            if (!s->img_x || !s->img_y) return stbi__err("0-pixel image","Corrupt PNG");
   5064            if (!pal_img_n) {
   5065               s->img_n = (color & 2 ? 3 : 1) + (color & 4 ? 1 : 0);
   5066               if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
   5067               if (scan == STBI__SCAN_header) return 1;
   5068            } else {
   5069               // if paletted, then pal_n is our final components, and
   5070               // img_n is # components to decompress/filter.
   5071               s->img_n = 1;
   5072               if ((1 << 30) / s->img_x / 4 < s->img_y) return stbi__err("too large","Corrupt PNG");
   5073               // if SCAN_header, have to scan to see if we have a tRNS
   5074            }
   5075            break;
   5076         }
   5077
   5078         case STBI__PNG_TYPE('P','L','T','E'):  {
   5079            if (first) return stbi__err("first not IHDR", "Corrupt PNG");
   5080            if (c.length > 256*3) return stbi__err("invalid PLTE","Corrupt PNG");
   5081            pal_len = c.length / 3;
   5082            if (pal_len * 3 != c.length) return stbi__err("invalid PLTE","Corrupt PNG");
   5083            for (i=0; i < pal_len; ++i) {
   5084               palette[i*4+0] = stbi__get8(s);
   5085               palette[i*4+1] = stbi__get8(s);
   5086               palette[i*4+2] = stbi__get8(s);
   5087               palette[i*4+3] = 255;
   5088            }
   5089            break;
   5090         }
   5091
   5092         case STBI__PNG_TYPE('t','R','N','S'): {
   5093            if (first) return stbi__err("first not IHDR", "Corrupt PNG");
   5094            if (z->idata) return stbi__err("tRNS after IDAT","Corrupt PNG");
   5095            if (pal_img_n) {
   5096               if (scan == STBI__SCAN_header) { s->img_n = 4; return 1; }
   5097               if (pal_len == 0) return stbi__err("tRNS before PLTE","Corrupt PNG");
   5098               if (c.length > pal_len) return stbi__err("bad tRNS len","Corrupt PNG");
   5099               pal_img_n = 4;
   5100               for (i=0; i < c.length; ++i)
   5101                  palette[i*4+3] = stbi__get8(s);
   5102            } else {
   5103               if (!(s->img_n & 1)) return stbi__err("tRNS with alpha","Corrupt PNG");
   5104               if (c.length != (stbi__uint32) s->img_n*2) return stbi__err("bad tRNS len","Corrupt PNG");
   5105               has_trans = 1;
   5106               if (z->depth == 16) {
   5107                  for (k = 0; k < s->img_n; ++k) tc16[k] = (stbi__uint16)stbi__get16be(s); // copy the values as-is
   5108               } else {
   5109                  for (k = 0; k < s->img_n; ++k) tc[k] = (stbi_uc)(stbi__get16be(s) & 255) * stbi__depth_scale_table[z->depth]; // non 8-bit images will be larger
   5110               }
   5111            }
   5112            break;
   5113         }
   5114
   5115         case STBI__PNG_TYPE('I','D','A','T'): {
   5116            if (first) return stbi__err("first not IHDR", "Corrupt PNG");
   5117            if (pal_img_n && !pal_len) return stbi__err("no PLTE","Corrupt PNG");
   5118            if (scan == STBI__SCAN_header) { s->img_n = pal_img_n; return 1; }
   5119            if ((int)(ioff + c.length) < (int)ioff) return 0;
   5120            if (ioff + c.length > idata_limit) {
   5121               stbi__uint32 idata_limit_old = idata_limit;
   5122               stbi_uc *p;
   5123               if (idata_limit == 0) idata_limit = c.length > 4096 ? c.length : 4096;
   5124               while (ioff + c.length > idata_limit)
   5125                  idata_limit *= 2;
   5126               STBI_NOTUSED(idata_limit_old);
   5127               p = (stbi_uc *) STBI_REALLOC_SIZED(z->idata, idata_limit_old, idata_limit); if (p == NULL) return stbi__err("outofmem", "Out of memory");
   5128               z->idata = p;
   5129            }
   5130            if (!stbi__getn(s, z->idata+ioff,c.length)) return stbi__err("outofdata","Corrupt PNG");
   5131            ioff += c.length;
   5132            break;
   5133         }
   5134
   5135         case STBI__PNG_TYPE('I','E','N','D'): {
   5136            stbi__uint32 raw_len, bpl;
   5137            if (first) return stbi__err("first not IHDR", "Corrupt PNG");
   5138            if (scan != STBI__SCAN_load) return 1;
   5139            if (z->idata == NULL) return stbi__err("no IDAT","Corrupt PNG");
   5140            // initial guess for decoded data size to avoid unnecessary reallocs
   5141            bpl = (s->img_x * z->depth + 7) / 8; // bytes per line, per component
   5142            raw_len = bpl * s->img_y * s->img_n /* pixels */ + s->img_y /* filter mode per row */;
   5143            z->expanded = (stbi_uc *) stbi_zlib_decode_malloc_guesssize_headerflag((char *) z->idata, ioff, raw_len, (int *) &raw_len, !is_iphone);
   5144            if (z->expanded == NULL) return 0; // zlib should set error
   5145            STBI_FREE(z->idata); z->idata = NULL;
   5146            if ((req_comp == s->img_n+1 && req_comp != 3 && !pal_img_n) || has_trans)
   5147               s->img_out_n = s->img_n+1;
   5148            else
   5149               s->img_out_n = s->img_n;
   5150            if (!stbi__create_png_image(z, z->expanded, raw_len, s->img_out_n, z->depth, color, interlace)) return 0;
   5151            if (has_trans) {
   5152               if (z->depth == 16) {
   5153                  if (!stbi__compute_transparency16(z, tc16, s->img_out_n)) return 0;
   5154               } else {
   5155                  if (!stbi__compute_transparency(z, tc, s->img_out_n)) return 0;
   5156               }
   5157            }
   5158            if (is_iphone && stbi__de_iphone_flag && s->img_out_n > 2)
   5159               stbi__de_iphone(z);
   5160            if (pal_img_n) {
   5161               // pal_img_n == 3 or 4
   5162               s->img_n = pal_img_n; // record the actual colors we had
   5163               s->img_out_n = pal_img_n;
   5164               if (req_comp >= 3) s->img_out_n = req_comp;
   5165               if (!stbi__expand_png_palette(z, palette, pal_len, s->img_out_n))
   5166                  return 0;
   5167            } else if (has_trans) {
   5168               // non-paletted image with tRNS -> source image has (constant) alpha
   5169               ++s->img_n;
   5170            }
   5171            STBI_FREE(z->expanded); z->expanded = NULL;
   5172            // end of PNG chunk, read and skip CRC
   5173            stbi__get32be(s);
   5174            return 1;
   5175         }
   5176
   5177         default:
   5178            // if critical, fail
   5179            if (first) return stbi__err("first not IHDR", "Corrupt PNG");
   5180            if ((c.type & (1 << 29)) == 0) {
   5181               #ifndef STBI_NO_FAILURE_STRINGS
   5182               // not threadsafe
   5183               static char invalid_chunk[] = "XXXX PNG chunk not known";
   5184               invalid_chunk[0] = STBI__BYTECAST(c.type >> 24);
   5185               invalid_chunk[1] = STBI__BYTECAST(c.type >> 16);
   5186               invalid_chunk[2] = STBI__BYTECAST(c.type >>  8);
   5187               invalid_chunk[3] = STBI__BYTECAST(c.type >>  0);
   5188               #endif
   5189               return stbi__err(invalid_chunk, "PNG not supported: unknown PNG chunk type");
   5190            }
   5191            stbi__skip(s, c.length);
   5192            break;
   5193      }
   5194      // end of PNG chunk, read and skip CRC
   5195      stbi__get32be(s);
   5196   }
   5197}
   5198
   5199static void *stbi__do_png(stbi__png *p, int *x, int *y, int *n, int req_comp, stbi__result_info *ri)
   5200{
   5201   void *result=NULL;
   5202   if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
   5203   if (stbi__parse_png_file(p, STBI__SCAN_load, req_comp)) {
   5204      if (p->depth <= 8)
   5205         ri->bits_per_channel = 8;
   5206      else if (p->depth == 16)
   5207         ri->bits_per_channel = 16;
   5208      else
   5209         return stbi__errpuc("bad bits_per_channel", "PNG not supported: unsupported color depth");
   5210      result = p->out;
   5211      p->out = NULL;
   5212      if (req_comp && req_comp != p->s->img_out_n) {
   5213         if (ri->bits_per_channel == 8)
   5214            result = stbi__convert_format((unsigned char *) result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
   5215         else
   5216            result = stbi__convert_format16((stbi__uint16 *) result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
   5217         p->s->img_out_n = req_comp;
   5218         if (result == NULL) return result;
   5219      }
   5220      *x = p->s->img_x;
   5221      *y = p->s->img_y;
   5222      if (n) *n = p->s->img_n;
   5223   }
   5224   STBI_FREE(p->out);      p->out      = NULL;
   5225   STBI_FREE(p->expanded); p->expanded = NULL;
   5226   STBI_FREE(p->idata);    p->idata    = NULL;
   5227
   5228   return result;
   5229}
   5230
   5231static void *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
   5232{
   5233   stbi__png p;
   5234   p.s = s;
   5235   return stbi__do_png(&p, x,y,comp,req_comp, ri);
   5236}
   5237
   5238static int stbi__png_test(stbi__context *s)
   5239{
   5240   int r;
   5241   r = stbi__check_png_header(s);
   5242   stbi__rewind(s);
   5243   return r;
   5244}
   5245
   5246static int stbi__png_info_raw(stbi__png *p, int *x, int *y, int *comp)
   5247{
   5248   if (!stbi__parse_png_file(p, STBI__SCAN_header, 0)) {
   5249      stbi__rewind( p->s );
   5250      return 0;
   5251   }
   5252   if (x) *x = p->s->img_x;
   5253   if (y) *y = p->s->img_y;
   5254   if (comp) *comp = p->s->img_n;
   5255   return 1;
   5256}
   5257
   5258static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp)
   5259{
   5260   stbi__png p;
   5261   p.s = s;
   5262   return stbi__png_info_raw(&p, x, y, comp);
   5263}
   5264
   5265static int stbi__png_is16(stbi__context *s)
   5266{
   5267   stbi__png p;
   5268   p.s = s;
   5269   if (!stbi__png_info_raw(&p, NULL, NULL, NULL))
   5270	   return 0;
   5271   if (p.depth != 16) {
   5272      stbi__rewind(p.s);
   5273      return 0;
   5274   }
   5275   return 1;
   5276}
   5277#endif
   5278
   5279// Microsoft/Windows BMP image
   5280
   5281#ifndef STBI_NO_BMP
   5282static int stbi__bmp_test_raw(stbi__context *s)
   5283{
   5284   int r;
   5285   int sz;
   5286   if (stbi__get8(s) != 'B') return 0;
   5287   if (stbi__get8(s) != 'M') return 0;
   5288   stbi__get32le(s); // discard filesize
   5289   stbi__get16le(s); // discard reserved
   5290   stbi__get16le(s); // discard reserved
   5291   stbi__get32le(s); // discard data offset
   5292   sz = stbi__get32le(s);
   5293   r = (sz == 12 || sz == 40 || sz == 56 || sz == 108 || sz == 124);
   5294   return r;
   5295}
   5296
   5297static int stbi__bmp_test(stbi__context *s)
   5298{
   5299   int r = stbi__bmp_test_raw(s);
   5300   stbi__rewind(s);
   5301   return r;
   5302}
   5303
   5304
   5305// returns 0..31 for the highest set bit
   5306static int stbi__high_bit(unsigned int z)
   5307{
   5308   int n=0;
   5309   if (z == 0) return -1;
   5310   if (z >= 0x10000) { n += 16; z >>= 16; }
   5311   if (z >= 0x00100) { n +=  8; z >>=  8; }
   5312   if (z >= 0x00010) { n +=  4; z >>=  4; }
   5313   if (z >= 0x00004) { n +=  2; z >>=  2; }
   5314   if (z >= 0x00002) { n +=  1;/* >>=  1;*/ }
   5315   return n;
   5316}
   5317
   5318static int stbi__bitcount(unsigned int a)
   5319{
   5320   a = (a & 0x55555555) + ((a >>  1) & 0x55555555); // max 2
   5321   a = (a & 0x33333333) + ((a >>  2) & 0x33333333); // max 4
   5322   a = (a + (a >> 4)) & 0x0f0f0f0f; // max 8 per 4, now 8 bits
   5323   a = (a + (a >> 8)); // max 16 per 8 bits
   5324   a = (a + (a >> 16)); // max 32 per 8 bits
   5325   return a & 0xff;
   5326}
   5327
   5328// extract an arbitrarily-aligned N-bit value (N=bits)
   5329// from v, and then make it 8-bits long and fractionally
   5330// extend it to full full range.
   5331static int stbi__shiftsigned(unsigned int v, int shift, int bits)
   5332{
   5333   static unsigned int mul_table[9] = {
   5334      0,
   5335      0xff/*0b11111111*/, 0x55/*0b01010101*/, 0x49/*0b01001001*/, 0x11/*0b00010001*/,
   5336      0x21/*0b00100001*/, 0x41/*0b01000001*/, 0x81/*0b10000001*/, 0x01/*0b00000001*/,
   5337   };
   5338   static unsigned int shift_table[9] = {
   5339      0, 0,0,1,0,2,4,6,0,
   5340   };
   5341   if (shift < 0)
   5342      v <<= -shift;
   5343   else
   5344      v >>= shift;
   5345   STBI_ASSERT(v < 256);
   5346   v >>= (8-bits);
   5347   STBI_ASSERT(bits >= 0 && bits <= 8);
   5348   return (int) ((unsigned) v * mul_table[bits]) >> shift_table[bits];
   5349}
   5350
   5351typedef struct
   5352{
   5353   int bpp, offset, hsz;
   5354   unsigned int mr,mg,mb,ma, all_a;
   5355   int extra_read;
   5356} stbi__bmp_data;
   5357
   5358static int stbi__bmp_set_mask_defaults(stbi__bmp_data *info, int compress)
   5359{
   5360   // BI_BITFIELDS specifies masks explicitly, don't override
   5361   if (compress == 3)
   5362      return 1;
   5363
   5364   if (compress == 0) {
   5365      if (info->bpp == 16) {
   5366         info->mr = 31u << 10;
   5367         info->mg = 31u <<  5;
   5368         info->mb = 31u <<  0;
   5369      } else if (info->bpp == 32) {
   5370         info->mr = 0xffu << 16;
   5371         info->mg = 0xffu <<  8;
   5372         info->mb = 0xffu <<  0;
   5373         info->ma = 0xffu << 24;
   5374         info->all_a = 0; // if all_a is 0 at end, then we loaded alpha channel but it was all 0
   5375      } else {
   5376         // otherwise, use defaults, which is all-0
   5377         info->mr = info->mg = info->mb = info->ma = 0;
   5378      }
   5379      return 1;
   5380   }
   5381   return 0; // error
   5382}
   5383
   5384static void *stbi__bmp_parse_header(stbi__context *s, stbi__bmp_data *info)
   5385{
   5386   int hsz;
   5387   if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') return stbi__errpuc("not BMP", "Corrupt BMP");
   5388   stbi__get32le(s); // discard filesize
   5389   stbi__get16le(s); // discard reserved
   5390   stbi__get16le(s); // discard reserved
   5391   info->offset = stbi__get32le(s);
   5392   info->hsz = hsz = stbi__get32le(s);
   5393   info->mr = info->mg = info->mb = info->ma = 0;
   5394   info->extra_read = 14;
   5395
   5396   if (info->offset < 0) return stbi__errpuc("bad BMP", "bad BMP");
   5397
   5398   if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) return stbi__errpuc("unknown BMP", "BMP type not supported: unknown");
   5399   if (hsz == 12) {
   5400      s->img_x = stbi__get16le(s);
   5401      s->img_y = stbi__get16le(s);
   5402   } else {
   5403      s->img_x = stbi__get32le(s);
   5404      s->img_y = stbi__get32le(s);
   5405   }
   5406   if (stbi__get16le(s) != 1) return stbi__errpuc("bad BMP", "bad BMP");
   5407   info->bpp = stbi__get16le(s);
   5408   if (hsz != 12) {
   5409      int compress = stbi__get32le(s);
   5410      if (compress == 1 || compress == 2) return stbi__errpuc("BMP RLE", "BMP type not supported: RLE");
   5411      if (compress >= 4) return stbi__errpuc("BMP JPEG/PNG", "BMP type not supported: unsupported compression"); // this includes PNG/JPEG modes
   5412      if (compress == 3 && info->bpp != 16 && info->bpp != 32) return stbi__errpuc("bad BMP", "bad BMP"); // bitfields requires 16 or 32 bits/pixel
   5413      stbi__get32le(s); // discard sizeof
   5414      stbi__get32le(s); // discard hres
   5415      stbi__get32le(s); // discard vres
   5416      stbi__get32le(s); // discard colorsused
   5417      stbi__get32le(s); // discard max important
   5418      if (hsz == 40 || hsz == 56) {
   5419         if (hsz == 56) {
   5420            stbi__get32le(s);
   5421            stbi__get32le(s);
   5422            stbi__get32le(s);
   5423            stbi__get32le(s);
   5424         }
   5425         if (info->bpp == 16 || info->bpp == 32) {
   5426            if (compress == 0) {
   5427               stbi__bmp_set_mask_defaults(info, compress);
   5428            } else if (compress == 3) {
   5429               info->mr = stbi__get32le(s);
   5430               info->mg = stbi__get32le(s);
   5431               info->mb = stbi__get32le(s);
   5432               info->extra_read += 12;
   5433               // not documented, but generated by photoshop and handled by mspaint
   5434               if (info->mr == info->mg && info->mg == info->mb) {
   5435                  // ?!?!?
   5436                  return stbi__errpuc("bad BMP", "bad BMP");
   5437               }
   5438            } else
   5439               return stbi__errpuc("bad BMP", "bad BMP");
   5440         }
   5441      } else {
   5442         // V4/V5 header
   5443         int i;
   5444         if (hsz != 108 && hsz != 124)
   5445            return stbi__errpuc("bad BMP", "bad BMP");
   5446         info->mr = stbi__get32le(s);
   5447         info->mg = stbi__get32le(s);
   5448         info->mb = stbi__get32le(s);
   5449         info->ma = stbi__get32le(s);
   5450         if (compress != 3) // override mr/mg/mb unless in BI_BITFIELDS mode, as per docs
   5451            stbi__bmp_set_mask_defaults(info, compress);
   5452         stbi__get32le(s); // discard color space
   5453         for (i=0; i < 12; ++i)
   5454            stbi__get32le(s); // discard color space parameters
   5455         if (hsz == 124) {
   5456            stbi__get32le(s); // discard rendering intent
   5457            stbi__get32le(s); // discard offset of profile data
   5458            stbi__get32le(s); // discard size of profile data
   5459            stbi__get32le(s); // discard reserved
   5460         }
   5461      }
   5462   }
   5463   return (void *) 1;
   5464}
   5465
   5466
   5467static void *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
   5468{
   5469   stbi_uc *out;
   5470   unsigned int mr=0,mg=0,mb=0,ma=0, all_a;
   5471   stbi_uc pal[256][4];
   5472   int psize=0,i,j,width;
   5473   int flip_vertically, pad, target;
   5474   stbi__bmp_data info;
   5475   STBI_NOTUSED(ri);
   5476
   5477   info.all_a = 255;
   5478   if (stbi__bmp_parse_header(s, &info) == NULL)
   5479      return NULL; // error code already set
   5480
   5481   flip_vertically = ((int) s->img_y) > 0;
   5482   s->img_y = abs((int) s->img_y);
   5483
   5484   if (s->img_y > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
   5485   if (s->img_x > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
   5486
   5487   mr = info.mr;
   5488   mg = info.mg;
   5489   mb = info.mb;
   5490   ma = info.ma;
   5491   all_a = info.all_a;
   5492
   5493   if (info.hsz == 12) {
   5494      if (info.bpp < 24)
   5495         psize = (info.offset - info.extra_read - 24) / 3;
   5496   } else {
   5497      if (info.bpp < 16)
   5498         psize = (info.offset - info.extra_read - info.hsz) >> 2;
   5499   }
   5500   if (psize == 0) {
   5501      if (info.offset != s->callback_already_read + (s->img_buffer - s->img_buffer_original)) {
   5502        return stbi__errpuc("bad offset", "Corrupt BMP");
   5503      }
   5504   }
   5505
   5506   if (info.bpp == 24 && ma == 0xff000000)
   5507      s->img_n = 3;
   5508   else
   5509      s->img_n = ma ? 4 : 3;
   5510   if (req_comp && req_comp >= 3) // we can directly decode 3 or 4
   5511      target = req_comp;
   5512   else
   5513      target = s->img_n; // if they want monochrome, we'll post-convert
   5514
   5515   // sanity-check size
   5516   if (!stbi__mad3sizes_valid(target, s->img_x, s->img_y, 0))
   5517      return stbi__errpuc("too large", "Corrupt BMP");
   5518
   5519   out = (stbi_uc *) stbi__malloc_mad3(target, s->img_x, s->img_y, 0);
   5520   if (!out) return stbi__errpuc("outofmem", "Out of memory");
   5521   if (info.bpp < 16) {
   5522      int z=0;
   5523      if (psize == 0 || psize > 256) { STBI_FREE(out); return stbi__errpuc("invalid", "Corrupt BMP"); }
   5524      for (i=0; i < psize; ++i) {
   5525         pal[i][2] = stbi__get8(s);
   5526         pal[i][1] = stbi__get8(s);
   5527         pal[i][0] = stbi__get8(s);
   5528         if (info.hsz != 12) stbi__get8(s);
   5529         pal[i][3] = 255;
   5530      }
   5531      stbi__skip(s, info.offset - info.extra_read - info.hsz - psize * (info.hsz == 12 ? 3 : 4));
   5532      if (info.bpp == 1) width = (s->img_x + 7) >> 3;
   5533      else if (info.bpp == 4) width = (s->img_x + 1) >> 1;
   5534      else if (info.bpp == 8) width = s->img_x;
   5535      else { STBI_FREE(out); return stbi__errpuc("bad bpp", "Corrupt BMP"); }
   5536      pad = (-width)&3;
   5537      if (info.bpp == 1) {
   5538         for (j=0; j < (int) s->img_y; ++j) {
   5539            int bit_offset = 7, v = stbi__get8(s);
   5540            for (i=0; i < (int) s->img_x; ++i) {
   5541               int color = (v>>bit_offset)&0x1;
   5542               out[z++] = pal[color][0];
   5543               out[z++] = pal[color][1];
   5544               out[z++] = pal[color][2];
   5545               if (target == 4) out[z++] = 255;
   5546               if (i+1 == (int) s->img_x) break;
   5547               if((--bit_offset) < 0) {
   5548                  bit_offset = 7;
   5549                  v = stbi__get8(s);
   5550               }
   5551            }
   5552            stbi__skip(s, pad);
   5553         }
   5554      } else {
   5555         for (j=0; j < (int) s->img_y; ++j) {
   5556            for (i=0; i < (int) s->img_x; i += 2) {
   5557               int v=stbi__get8(s),v2=0;
   5558               if (info.bpp == 4) {
   5559                  v2 = v & 15;
   5560                  v >>= 4;
   5561               }
   5562               out[z++] = pal[v][0];
   5563               out[z++] = pal[v][1];
   5564               out[z++] = pal[v][2];
   5565               if (target == 4) out[z++] = 255;
   5566               if (i+1 == (int) s->img_x) break;
   5567               v = (info.bpp == 8) ? stbi__get8(s) : v2;
   5568               out[z++] = pal[v][0];
   5569               out[z++] = pal[v][1];
   5570               out[z++] = pal[v][2];
   5571               if (target == 4) out[z++] = 255;
   5572            }
   5573            stbi__skip(s, pad);
   5574         }
   5575      }
   5576   } else {
   5577      int rshift=0,gshift=0,bshift=0,ashift=0,rcount=0,gcount=0,bcount=0,acount=0;
   5578      int z = 0;
   5579      int easy=0;
   5580      stbi__skip(s, info.offset - info.extra_read - info.hsz);
   5581      if (info.bpp == 24) width = 3 * s->img_x;
   5582      else if (info.bpp == 16) width = 2*s->img_x;
   5583      else /* bpp = 32 and pad = 0 */ width=0;
   5584      pad = (-width) & 3;
   5585      if (info.bpp == 24) {
   5586         easy = 1;
   5587      } else if (info.bpp == 32) {
   5588         if (mb == 0xff && mg == 0xff00 && mr == 0x00ff0000 && ma == 0xff000000)
   5589            easy = 2;
   5590      }
   5591      if (!easy) {
   5592         if (!mr || !mg || !mb) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
   5593         // right shift amt to put high bit in position #7
   5594         rshift = stbi__high_bit(mr)-7; rcount = stbi__bitcount(mr);
   5595         gshift = stbi__high_bit(mg)-7; gcount = stbi__bitcount(mg);
   5596         bshift = stbi__high_bit(mb)-7; bcount = stbi__bitcount(mb);
   5597         ashift = stbi__high_bit(ma)-7; acount = stbi__bitcount(ma);
   5598         if (rcount > 8 || gcount > 8 || bcount > 8 || acount > 8) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
   5599      }
   5600      for (j=0; j < (int) s->img_y; ++j) {
   5601         if (easy) {
   5602            for (i=0; i < (int) s->img_x; ++i) {
   5603               unsigned char a;
   5604               out[z+2] = stbi__get8(s);
   5605               out[z+1] = stbi__get8(s);
   5606               out[z+0] = stbi__get8(s);
   5607               z += 3;
   5608               a = (easy == 2 ? stbi__get8(s) : 255);
   5609               all_a |= a;
   5610               if (target == 4) out[z++] = a;
   5611            }
   5612         } else {
   5613            int bpp = info.bpp;
   5614            for (i=0; i < (int) s->img_x; ++i) {
   5615               stbi__uint32 v = (bpp == 16 ? (stbi__uint32) stbi__get16le(s) : stbi__get32le(s));
   5616               unsigned int a;
   5617               out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mr, rshift, rcount));
   5618               out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mg, gshift, gcount));
   5619               out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mb, bshift, bcount));
   5620               a = (ma ? stbi__shiftsigned(v & ma, ashift, acount) : 255);
   5621               all_a |= a;
   5622               if (target == 4) out[z++] = STBI__BYTECAST(a);
   5623            }
   5624         }
   5625         stbi__skip(s, pad);
   5626      }
   5627   }
   5628
   5629   // if alpha channel is all 0s, replace with all 255s
   5630   if (target == 4 && all_a == 0)
   5631      for (i=4*s->img_x*s->img_y-1; i >= 0; i -= 4)
   5632         out[i] = 255;
   5633
   5634   if (flip_vertically) {
   5635      stbi_uc t;
   5636      for (j=0; j < (int) s->img_y>>1; ++j) {
   5637         stbi_uc *p1 = out +      j     *s->img_x*target;
   5638         stbi_uc *p2 = out + (s->img_y-1-j)*s->img_x*target;
   5639         for (i=0; i < (int) s->img_x*target; ++i) {
   5640            t = p1[i]; p1[i] = p2[i]; p2[i] = t;
   5641         }
   5642      }
   5643   }
   5644
   5645   if (req_comp && req_comp != target) {
   5646      out = stbi__convert_format(out, target, req_comp, s->img_x, s->img_y);
   5647      if (out == NULL) return out; // stbi__convert_format frees input on failure
   5648   }
   5649
   5650   *x = s->img_x;
   5651   *y = s->img_y;
   5652   if (comp) *comp = s->img_n;
   5653   return out;
   5654}
   5655#endif
   5656
   5657// Targa Truevision - TGA
   5658// by Jonathan Dummer
   5659#ifndef STBI_NO_TGA
   5660// returns STBI_rgb or whatever, 0 on error
   5661static int stbi__tga_get_comp(int bits_per_pixel, int is_grey, int* is_rgb16)
   5662{
   5663   // only RGB or RGBA (incl. 16bit) or grey allowed
   5664   if (is_rgb16) *is_rgb16 = 0;
   5665   switch(bits_per_pixel) {
   5666      case 8:  return STBI_grey;
   5667      case 16: if(is_grey) return STBI_grey_alpha;
   5668               // fallthrough
   5669      case 15: if(is_rgb16) *is_rgb16 = 1;
   5670               return STBI_rgb;
   5671      case 24: // fallthrough
   5672      case 32: return bits_per_pixel/8;
   5673      default: return 0;
   5674   }
   5675}
   5676
   5677static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp)
   5678{
   5679    int tga_w, tga_h, tga_comp, tga_image_type, tga_bits_per_pixel, tga_colormap_bpp;
   5680    int sz, tga_colormap_type;
   5681    stbi__get8(s);                   // discard Offset
   5682    tga_colormap_type = stbi__get8(s); // colormap type
   5683    if( tga_colormap_type > 1 ) {
   5684        stbi__rewind(s);
   5685        return 0;      // only RGB or indexed allowed
   5686    }
   5687    tga_image_type = stbi__get8(s); // image type
   5688    if ( tga_colormap_type == 1 ) { // colormapped (paletted) image
   5689        if (tga_image_type != 1 && tga_image_type != 9) {
   5690            stbi__rewind(s);
   5691            return 0;
   5692        }
   5693        stbi__skip(s,4);       // skip index of first colormap entry and number of entries
   5694        sz = stbi__get8(s);    //   check bits per palette color entry
   5695        if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) {
   5696            stbi__rewind(s);
   5697            return 0;
   5698        }
   5699        stbi__skip(s,4);       // skip image x and y origin
   5700        tga_colormap_bpp = sz;
   5701    } else { // "normal" image w/o colormap - only RGB or grey allowed, +/- RLE
   5702        if ( (tga_image_type != 2) && (tga_image_type != 3) && (tga_image_type != 10) && (tga_image_type != 11) ) {
   5703            stbi__rewind(s);
   5704            return 0; // only RGB or grey allowed, +/- RLE
   5705        }
   5706        stbi__skip(s,9); // skip colormap specification and image x/y origin
   5707        tga_colormap_bpp = 0;
   5708    }
   5709    tga_w = stbi__get16le(s);
   5710    if( tga_w < 1 ) {
   5711        stbi__rewind(s);
   5712        return 0;   // test width
   5713    }
   5714    tga_h = stbi__get16le(s);
   5715    if( tga_h < 1 ) {
   5716        stbi__rewind(s);
   5717        return 0;   // test height
   5718    }
   5719    tga_bits_per_pixel = stbi__get8(s); // bits per pixel
   5720    stbi__get8(s); // ignore alpha bits
   5721    if (tga_colormap_bpp != 0) {
   5722        if((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16)) {
   5723            // when using a colormap, tga_bits_per_pixel is the size of the indexes
   5724            // I don't think anything but 8 or 16bit indexes makes sense
   5725            stbi__rewind(s);
   5726            return 0;
   5727        }
   5728        tga_comp = stbi__tga_get_comp(tga_colormap_bpp, 0, NULL);
   5729    } else {
   5730        tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3) || (tga_image_type == 11), NULL);
   5731    }
   5732    if(!tga_comp) {
   5733      stbi__rewind(s);
   5734      return 0;
   5735    }
   5736    if (x) *x = tga_w;
   5737    if (y) *y = tga_h;
   5738    if (comp) *comp = tga_comp;
   5739    return 1;                   // seems to have passed everything
   5740}
   5741
   5742static int stbi__tga_test(stbi__context *s)
   5743{
   5744   int res = 0;
   5745   int sz, tga_color_type;
   5746   stbi__get8(s);      //   discard Offset
   5747   tga_color_type = stbi__get8(s);   //   color type
   5748   if ( tga_color_type > 1 ) goto errorEnd;   //   only RGB or indexed allowed
   5749   sz = stbi__get8(s);   //   image type
   5750   if ( tga_color_type == 1 ) { // colormapped (paletted) image
   5751      if (sz != 1 && sz != 9) goto errorEnd; // colortype 1 demands image type 1 or 9
   5752      stbi__skip(s,4);       // skip index of first colormap entry and number of entries
   5753      sz = stbi__get8(s);    //   check bits per palette color entry
   5754      if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
   5755      stbi__skip(s,4);       // skip image x and y origin
   5756   } else { // "normal" image w/o colormap
   5757      if ( (sz != 2) && (sz != 3) && (sz != 10) && (sz != 11) ) goto errorEnd; // only RGB or grey allowed, +/- RLE
   5758      stbi__skip(s,9); // skip colormap specification and image x/y origin
   5759   }
   5760   if ( stbi__get16le(s) < 1 ) goto errorEnd;      //   test width
   5761   if ( stbi__get16le(s) < 1 ) goto errorEnd;      //   test height
   5762   sz = stbi__get8(s);   //   bits per pixel
   5763   if ( (tga_color_type == 1) && (sz != 8) && (sz != 16) ) goto errorEnd; // for colormapped images, bpp is size of an index
   5764   if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
   5765
   5766   res = 1; // if we got this far, everything's good and we can return 1 instead of 0
   5767
   5768errorEnd:
   5769   stbi__rewind(s);
   5770   return res;
   5771}
   5772
   5773// read 16bit value and convert to 24bit RGB
   5774static void stbi__tga_read_rgb16(stbi__context *s, stbi_uc* out)
   5775{
   5776   stbi__uint16 px = (stbi__uint16)stbi__get16le(s);
   5777   stbi__uint16 fiveBitMask = 31;
   5778   // we have 3 channels with 5bits each
   5779   int r = (px >> 10) & fiveBitMask;
   5780   int g = (px >> 5) & fiveBitMask;
   5781   int b = px & fiveBitMask;
   5782   // Note that this saves the data in RGB(A) order, so it doesn't need to be swapped later
   5783   out[0] = (stbi_uc)((r * 255)/31);
   5784   out[1] = (stbi_uc)((g * 255)/31);
   5785   out[2] = (stbi_uc)((b * 255)/31);
   5786
   5787   // some people claim that the most significant bit might be used for alpha
   5788   // (possibly if an alpha-bit is set in the "image descriptor byte")
   5789   // but that only made 16bit test images completely translucent..
   5790   // so let's treat all 15 and 16bit TGAs as RGB with no alpha.
   5791}
   5792
   5793static void *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
   5794{
   5795   //   read in the TGA header stuff
   5796   int tga_offset = stbi__get8(s);
   5797   int tga_indexed = stbi__get8(s);
   5798   int tga_image_type = stbi__get8(s);
   5799   int tga_is_RLE = 0;
   5800   int tga_palette_start = stbi__get16le(s);
   5801   int tga_palette_len = stbi__get16le(s);
   5802   int tga_palette_bits = stbi__get8(s);
   5803   int tga_x_origin = stbi__get16le(s);
   5804   int tga_y_origin = stbi__get16le(s);
   5805   int tga_width = stbi__get16le(s);
   5806   int tga_height = stbi__get16le(s);
   5807   int tga_bits_per_pixel = stbi__get8(s);
   5808   int tga_comp, tga_rgb16=0;
   5809   int tga_inverted = stbi__get8(s);
   5810   // int tga_alpha_bits = tga_inverted & 15; // the 4 lowest bits - unused (useless?)
   5811   //   image data
   5812   unsigned char *tga_data;
   5813   unsigned char *tga_palette = NULL;
   5814   int i, j;
   5815   unsigned char raw_data[4] = {0};
   5816   int RLE_count = 0;
   5817   int RLE_repeating = 0;
   5818   int read_next_pixel = 1;
   5819   STBI_NOTUSED(ri);
   5820   STBI_NOTUSED(tga_x_origin); // @TODO
   5821   STBI_NOTUSED(tga_y_origin); // @TODO
   5822
   5823   if (tga_height > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
   5824   if (tga_width > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
   5825
   5826   //   do a tiny bit of precessing
   5827   if ( tga_image_type >= 8 )
   5828   {
   5829      tga_image_type -= 8;
   5830      tga_is_RLE = 1;
   5831   }
   5832   tga_inverted = 1 - ((tga_inverted >> 5) & 1);
   5833
   5834   //   If I'm paletted, then I'll use the number of bits from the palette
   5835   if ( tga_indexed ) tga_comp = stbi__tga_get_comp(tga_palette_bits, 0, &tga_rgb16);
   5836   else tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3), &tga_rgb16);
   5837
   5838   if(!tga_comp) // shouldn't really happen, stbi__tga_test() should have ensured basic consistency
   5839      return stbi__errpuc("bad format", "Can't find out TGA pixelformat");
   5840
   5841   //   tga info
   5842   *x = tga_width;
   5843   *y = tga_height;
   5844   if (comp) *comp = tga_comp;
   5845
   5846   if (!stbi__mad3sizes_valid(tga_width, tga_height, tga_comp, 0))
   5847      return stbi__errpuc("too large", "Corrupt TGA");
   5848
   5849   tga_data = (unsigned char*)stbi__malloc_mad3(tga_width, tga_height, tga_comp, 0);
   5850   if (!tga_data) return stbi__errpuc("outofmem", "Out of memory");
   5851
   5852   // skip to the data's starting position (offset usually = 0)
   5853   stbi__skip(s, tga_offset );
   5854
   5855   if ( !tga_indexed && !tga_is_RLE && !tga_rgb16 ) {
   5856      for (i=0; i < tga_height; ++i) {
   5857         int row = tga_inverted ? tga_height -i - 1 : i;
   5858         stbi_uc *tga_row = tga_data + row*tga_width*tga_comp;
   5859         stbi__getn(s, tga_row, tga_width * tga_comp);
   5860      }
   5861   } else  {
   5862      //   do I need to load a palette?
   5863      if ( tga_indexed)
   5864      {
   5865         if (tga_palette_len == 0) {  /* you have to have at least one entry! */
   5866            STBI_FREE(tga_data);
   5867            return stbi__errpuc("bad palette", "Corrupt TGA");
   5868         }
   5869
   5870         //   any data to skip? (offset usually = 0)
   5871         stbi__skip(s, tga_palette_start );
   5872         //   load the palette
   5873         tga_palette = (unsigned char*)stbi__malloc_mad2(tga_palette_len, tga_comp, 0);
   5874         if (!tga_palette) {
   5875            STBI_FREE(tga_data);
   5876            return stbi__errpuc("outofmem", "Out of memory");
   5877         }
   5878         if (tga_rgb16) {
   5879            stbi_uc *pal_entry = tga_palette;
   5880            STBI_ASSERT(tga_comp == STBI_rgb);
   5881            for (i=0; i < tga_palette_len; ++i) {
   5882               stbi__tga_read_rgb16(s, pal_entry);
   5883               pal_entry += tga_comp;
   5884            }
   5885         } else if (!stbi__getn(s, tga_palette, tga_palette_len * tga_comp)) {
   5886               STBI_FREE(tga_data);
   5887               STBI_FREE(tga_palette);
   5888               return stbi__errpuc("bad palette", "Corrupt TGA");
   5889         }
   5890      }
   5891      //   load the data
   5892      for (i=0; i < tga_width * tga_height; ++i)
   5893      {
   5894         //   if I'm in RLE mode, do I need to get a RLE stbi__pngchunk?
   5895         if ( tga_is_RLE )
   5896         {
   5897            if ( RLE_count == 0 )
   5898            {
   5899               //   yep, get the next byte as a RLE command
   5900               int RLE_cmd = stbi__get8(s);
   5901               RLE_count = 1 + (RLE_cmd & 127);
   5902               RLE_repeating = RLE_cmd >> 7;
   5903               read_next_pixel = 1;
   5904            } else if ( !RLE_repeating )
   5905            {
   5906               read_next_pixel = 1;
   5907            }
   5908         } else
   5909         {
   5910            read_next_pixel = 1;
   5911         }
   5912         //   OK, if I need to read a pixel, do it now
   5913         if ( read_next_pixel )
   5914         {
   5915            //   load however much data we did have
   5916            if ( tga_indexed )
   5917            {
   5918               // read in index, then perform the lookup
   5919               int pal_idx = (tga_bits_per_pixel == 8) ? stbi__get8(s) : stbi__get16le(s);
   5920               if ( pal_idx >= tga_palette_len ) {
   5921                  // invalid index
   5922                  pal_idx = 0;
   5923               }
   5924               pal_idx *= tga_comp;
   5925               for (j = 0; j < tga_comp; ++j) {
   5926                  raw_data[j] = tga_palette[pal_idx+j];
   5927               }
   5928            } else if(tga_rgb16) {
   5929               STBI_ASSERT(tga_comp == STBI_rgb);
   5930               stbi__tga_read_rgb16(s, raw_data);
   5931            } else {
   5932               //   read in the data raw
   5933               for (j = 0; j < tga_comp; ++j) {
   5934                  raw_data[j] = stbi__get8(s);
   5935               }
   5936            }
   5937            //   clear the reading flag for the next pixel
   5938            read_next_pixel = 0;
   5939         } // end of reading a pixel
   5940
   5941         // copy data
   5942         for (j = 0; j < tga_comp; ++j)
   5943           tga_data[i*tga_comp+j] = raw_data[j];
   5944
   5945         //   in case we're in RLE mode, keep counting down
   5946         --RLE_count;
   5947      }
   5948      //   do I need to invert the image?
   5949      if ( tga_inverted )
   5950      {
   5951         for (j = 0; j*2 < tga_height; ++j)
   5952         {
   5953            int index1 = j * tga_width * tga_comp;
   5954            int index2 = (tga_height - 1 - j) * tga_width * tga_comp;
   5955            for (i = tga_width * tga_comp; i > 0; --i)
   5956            {
   5957               unsigned char temp = tga_data[index1];
   5958               tga_data[index1] = tga_data[index2];
   5959               tga_data[index2] = temp;
   5960               ++index1;
   5961               ++index2;
   5962            }
   5963         }
   5964      }
   5965      //   clear my palette, if I had one
   5966      if ( tga_palette != NULL )
   5967      {
   5968         STBI_FREE( tga_palette );
   5969      }
   5970   }
   5971
   5972   // swap RGB - if the source data was RGB16, it already is in the right order
   5973   if (tga_comp >= 3 && !tga_rgb16)
   5974   {
   5975      unsigned char* tga_pixel = tga_data;
   5976      for (i=0; i < tga_width * tga_height; ++i)
   5977      {
   5978         unsigned char temp = tga_pixel[0];
   5979         tga_pixel[0] = tga_pixel[2];
   5980         tga_pixel[2] = temp;
   5981         tga_pixel += tga_comp;
   5982      }
   5983   }
   5984
   5985   // convert to target component count
   5986   if (req_comp && req_comp != tga_comp)
   5987      tga_data = stbi__convert_format(tga_data, tga_comp, req_comp, tga_width, tga_height);
   5988
   5989   //   the things I do to get rid of an error message, and yet keep
   5990   //   Microsoft's C compilers happy... [8^(
   5991   tga_palette_start = tga_palette_len = tga_palette_bits =
   5992         tga_x_origin = tga_y_origin = 0;
   5993   STBI_NOTUSED(tga_palette_start);
   5994   //   OK, done
   5995   return tga_data;
   5996}
   5997#endif
   5998
   5999// *************************************************************************************************
   6000// Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB
   6001
   6002#ifndef STBI_NO_PSD
   6003static int stbi__psd_test(stbi__context *s)
   6004{
   6005   int r = (stbi__get32be(s) == 0x38425053);
   6006   stbi__rewind(s);
   6007   return r;
   6008}
   6009
   6010static int stbi__psd_decode_rle(stbi__context *s, stbi_uc *p, int pixelCount)
   6011{
   6012   int count, nleft, len;
   6013
   6014   count = 0;
   6015   while ((nleft = pixelCount - count) > 0) {
   6016      len = stbi__get8(s);
   6017      if (len == 128) {
   6018         // No-op.
   6019      } else if (len < 128) {
   6020         // Copy next len+1 bytes literally.
   6021         len++;
   6022         if (len > nleft) return 0; // corrupt data
   6023         count += len;
   6024         while (len) {
   6025            *p = stbi__get8(s);
   6026            p += 4;
   6027            len--;
   6028         }
   6029      } else if (len > 128) {
   6030         stbi_uc   val;
   6031         // Next -len+1 bytes in the dest are replicated from next source byte.
   6032         // (Interpret len as a negative 8-bit int.)
   6033         len = 257 - len;
   6034         if (len > nleft) return 0; // corrupt data
   6035         val = stbi__get8(s);
   6036         count += len;
   6037         while (len) {
   6038            *p = val;
   6039            p += 4;
   6040            len--;
   6041         }
   6042      }
   6043   }
   6044
   6045   return 1;
   6046}
   6047
   6048static void *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri, int bpc)
   6049{
   6050   int pixelCount;
   6051   int channelCount, compression;
   6052   int channel, i;
   6053   int bitdepth;
   6054   int w,h;
   6055   stbi_uc *out;
   6056   STBI_NOTUSED(ri);
   6057
   6058   // Check identifier
   6059   if (stbi__get32be(s) != 0x38425053)   // "8BPS"
   6060      return stbi__errpuc("not PSD", "Corrupt PSD image");
   6061
   6062   // Check file type version.
   6063   if (stbi__get16be(s) != 1)
   6064      return stbi__errpuc("wrong version", "Unsupported version of PSD image");
   6065
   6066   // Skip 6 reserved bytes.
   6067   stbi__skip(s, 6 );
   6068
   6069   // Read the number of channels (R, G, B, A, etc).
   6070   channelCount = stbi__get16be(s);
   6071   if (channelCount < 0 || channelCount > 16)
   6072      return stbi__errpuc("wrong channel count", "Unsupported number of channels in PSD image");
   6073
   6074   // Read the rows and columns of the image.
   6075   h = stbi__get32be(s);
   6076   w = stbi__get32be(s);
   6077
   6078   if (h > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
   6079   if (w > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
   6080
   6081   // Make sure the depth is 8 bits.
   6082   bitdepth = stbi__get16be(s);
   6083   if (bitdepth != 8 && bitdepth != 16)
   6084      return stbi__errpuc("unsupported bit depth", "PSD bit depth is not 8 or 16 bit");
   6085
   6086   // Make sure the color mode is RGB.
   6087   // Valid options are:
   6088   //   0: Bitmap
   6089   //   1: Grayscale
   6090   //   2: Indexed color
   6091   //   3: RGB color
   6092   //   4: CMYK color
   6093   //   7: Multichannel
   6094   //   8: Duotone
   6095   //   9: Lab color
   6096   if (stbi__get16be(s) != 3)
   6097      return stbi__errpuc("wrong color format", "PSD is not in RGB color format");
   6098
   6099   // Skip the Mode Data.  (It's the palette for indexed color; other info for other modes.)
   6100   stbi__skip(s,stbi__get32be(s) );
   6101
   6102   // Skip the image resources.  (resolution, pen tool paths, etc)
   6103   stbi__skip(s, stbi__get32be(s) );
   6104
   6105   // Skip the reserved data.
   6106   stbi__skip(s, stbi__get32be(s) );
   6107
   6108   // Find out if the data is compressed.
   6109   // Known values:
   6110   //   0: no compression
   6111   //   1: RLE compressed
   6112   compression = stbi__get16be(s);
   6113   if (compression > 1)
   6114      return stbi__errpuc("bad compression", "PSD has an unknown compression format");
   6115
   6116   // Check size
   6117   if (!stbi__mad3sizes_valid(4, w, h, 0))
   6118      return stbi__errpuc("too large", "Corrupt PSD");
   6119
   6120   // Create the destination image.
   6121
   6122   if (!compression && bitdepth == 16 && bpc == 16) {
   6123      out = (stbi_uc *) stbi__malloc_mad3(8, w, h, 0);
   6124      ri->bits_per_channel = 16;
   6125   } else
   6126      out = (stbi_uc *) stbi__malloc(4 * w*h);
   6127
   6128   if (!out) return stbi__errpuc("outofmem", "Out of memory");
   6129   pixelCount = w*h;
   6130
   6131   // Initialize the data to zero.
   6132   //memset( out, 0, pixelCount * 4 );
   6133
   6134   // Finally, the image data.
   6135   if (compression) {
   6136      // RLE as used by .PSD and .TIFF
   6137      // Loop until you get the number of unpacked bytes you are expecting:
   6138      //     Read the next source byte into n.
   6139      //     If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
   6140      //     Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
   6141      //     Else if n is 128, noop.
   6142      // Endloop
   6143
   6144      // The RLE-compressed data is preceded by a 2-byte data count for each row in the data,
   6145      // which we're going to just skip.
   6146      stbi__skip(s, h * channelCount * 2 );
   6147
   6148      // Read the RLE data by channel.
   6149      for (channel = 0; channel < 4; channel++) {
   6150         stbi_uc *p;
   6151
   6152         p = out+channel;
   6153         if (channel >= channelCount) {
   6154            // Fill this channel with default data.
   6155            for (i = 0; i < pixelCount; i++, p += 4)
   6156               *p = (channel == 3 ? 255 : 0);
   6157         } else {
   6158            // Read the RLE data.
   6159            if (!stbi__psd_decode_rle(s, p, pixelCount)) {
   6160               STBI_FREE(out);
   6161               return stbi__errpuc("corrupt", "bad RLE data");
   6162            }
   6163         }
   6164      }
   6165
   6166   } else {
   6167      // We're at the raw image data.  It's each channel in order (Red, Green, Blue, Alpha, ...)
   6168      // where each channel consists of an 8-bit (or 16-bit) value for each pixel in the image.
   6169
   6170      // Read the data by channel.
   6171      for (channel = 0; channel < 4; channel++) {
   6172         if (channel >= channelCount) {
   6173            // Fill this channel with default data.
   6174            if (bitdepth == 16 && bpc == 16) {
   6175               stbi__uint16 *q = ((stbi__uint16 *) out) + channel;
   6176               stbi__uint16 val = channel == 3 ? 65535 : 0;
   6177               for (i = 0; i < pixelCount; i++, q += 4)
   6178                  *q = val;
   6179            } else {
   6180               stbi_uc *p = out+channel;
   6181               stbi_uc val = channel == 3 ? 255 : 0;
   6182               for (i = 0; i < pixelCount; i++, p += 4)
   6183                  *p = val;
   6184            }
   6185         } else {
   6186            if (ri->bits_per_channel == 16) {    // output bpc
   6187               stbi__uint16 *q = ((stbi__uint16 *) out) + channel;
   6188               for (i = 0; i < pixelCount; i++, q += 4)
   6189                  *q = (stbi__uint16) stbi__get16be(s);
   6190            } else {
   6191               stbi_uc *p = out+channel;
   6192               if (bitdepth == 16) {  // input bpc
   6193                  for (i = 0; i < pixelCount; i++, p += 4)
   6194                     *p = (stbi_uc) (stbi__get16be(s) >> 8);
   6195               } else {
   6196                  for (i = 0; i < pixelCount; i++, p += 4)
   6197                     *p = stbi__get8(s);
   6198               }
   6199            }
   6200         }
   6201      }
   6202   }
   6203
   6204   // remove weird white matte from PSD
   6205   if (channelCount >= 4) {
   6206      if (ri->bits_per_channel == 16) {
   6207         for (i=0; i < w*h; ++i) {
   6208            stbi__uint16 *pixel = (stbi__uint16 *) out + 4*i;
   6209            if (pixel[3] != 0 && pixel[3] != 65535) {
   6210               float a = pixel[3] / 65535.0f;
   6211               float ra = 1.0f / a;
   6212               float inv_a = 65535.0f * (1 - ra);
   6213               pixel[0] = (stbi__uint16) (pixel[0]*ra + inv_a);
   6214               pixel[1] = (stbi__uint16) (pixel[1]*ra + inv_a);
   6215               pixel[2] = (stbi__uint16) (pixel[2]*ra + inv_a);
   6216            }
   6217         }
   6218      } else {
   6219         for (i=0; i < w*h; ++i) {
   6220            unsigned char *pixel = out + 4*i;
   6221            if (pixel[3] != 0 && pixel[3] != 255) {
   6222               float a = pixel[3] / 255.0f;
   6223               float ra = 1.0f / a;
   6224               float inv_a = 255.0f * (1 - ra);
   6225               pixel[0] = (unsigned char) (pixel[0]*ra + inv_a);
   6226               pixel[1] = (unsigned char) (pixel[1]*ra + inv_a);
   6227               pixel[2] = (unsigned char) (pixel[2]*ra + inv_a);
   6228            }
   6229         }
   6230      }
   6231   }
   6232
   6233   // convert to desired output format
   6234   if (req_comp && req_comp != 4) {
   6235      if (ri->bits_per_channel == 16)
   6236         out = (stbi_uc *) stbi__convert_format16((stbi__uint16 *) out, 4, req_comp, w, h);
   6237      else
   6238         out = stbi__convert_format(out, 4, req_comp, w, h);
   6239      if (out == NULL) return out; // stbi__convert_format frees input on failure
   6240   }
   6241
   6242   if (comp) *comp = 4;
   6243   *y = h;
   6244   *x = w;
   6245
   6246   return out;
   6247}
   6248#endif
   6249
   6250// *************************************************************************************************
   6251// Softimage PIC loader
   6252// by Tom Seddon
   6253//
   6254// See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format
   6255// See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/
   6256
   6257#ifndef STBI_NO_PIC
   6258static int stbi__pic_is4(stbi__context *s,const char *str)
   6259{
   6260   int i;
   6261   for (i=0; i<4; ++i)
   6262      if (stbi__get8(s) != (stbi_uc)str[i])
   6263         return 0;
   6264
   6265   return 1;
   6266}
   6267
   6268static int stbi__pic_test_core(stbi__context *s)
   6269{
   6270   int i;
   6271
   6272   if (!stbi__pic_is4(s,"\x53\x80\xF6\x34"))
   6273      return 0;
   6274
   6275   for(i=0;i<84;++i)
   6276      stbi__get8(s);
   6277
   6278   if (!stbi__pic_is4(s,"PICT"))
   6279      return 0;
   6280
   6281   return 1;
   6282}
   6283
   6284typedef struct
   6285{
   6286   stbi_uc size,type,channel;
   6287} stbi__pic_packet;
   6288
   6289static stbi_uc *stbi__readval(stbi__context *s, int channel, stbi_uc *dest)
   6290{
   6291   int mask=0x80, i;
   6292
   6293   for (i=0; i<4; ++i, mask>>=1) {
   6294      if (channel & mask) {
   6295         if (stbi__at_eof(s)) return stbi__errpuc("bad file","PIC file too short");
   6296         dest[i]=stbi__get8(s);
   6297      }
   6298   }
   6299
   6300   return dest;
   6301}
   6302
   6303static void stbi__copyval(int channel,stbi_uc *dest,const stbi_uc *src)
   6304{
   6305   int mask=0x80,i;
   6306
   6307   for (i=0;i<4; ++i, mask>>=1)
   6308      if (channel&mask)
   6309         dest[i]=src[i];
   6310}
   6311
   6312static stbi_uc *stbi__pic_load_core(stbi__context *s,int width,int height,int *comp, stbi_uc *result)
   6313{
   6314   int act_comp=0,num_packets=0,y,chained;
   6315   stbi__pic_packet packets[10];
   6316
   6317   // this will (should...) cater for even some bizarre stuff like having data
   6318    // for the same channel in multiple packets.
   6319   do {
   6320      stbi__pic_packet *packet;
   6321
   6322      if (num_packets==sizeof(packets)/sizeof(packets[0]))
   6323         return stbi__errpuc("bad format","too many packets");
   6324
   6325      packet = &packets[num_packets++];
   6326
   6327      chained = stbi__get8(s);
   6328      packet->size    = stbi__get8(s);
   6329      packet->type    = stbi__get8(s);
   6330      packet->channel = stbi__get8(s);
   6331
   6332      act_comp |= packet->channel;
   6333
   6334      if (stbi__at_eof(s))          return stbi__errpuc("bad file","file too short (reading packets)");
   6335      if (packet->size != 8)  return stbi__errpuc("bad format","packet isn't 8bpp");
   6336   } while (chained);
   6337
   6338   *comp = (act_comp & 0x10 ? 4 : 3); // has alpha channel?
   6339
   6340   for(y=0; y<height; ++y) {
   6341      int packet_idx;
   6342
   6343      for(packet_idx=0; packet_idx < num_packets; ++packet_idx) {
   6344         stbi__pic_packet *packet = &packets[packet_idx];
   6345         stbi_uc *dest = result+y*width*4;
   6346
   6347         switch (packet->type) {
   6348            default:
   6349               return stbi__errpuc("bad format","packet has bad compression type");
   6350
   6351            case 0: {//uncompressed
   6352               int x;
   6353
   6354               for(x=0;x<width;++x, dest+=4)
   6355                  if (!stbi__readval(s,packet->channel,dest))
   6356                     return 0;
   6357               break;
   6358            }
   6359
   6360            case 1://Pure RLE
   6361               {
   6362                  int left=width, i;
   6363
   6364                  while (left>0) {
   6365                     stbi_uc count,value[4];
   6366
   6367                     count=stbi__get8(s);
   6368                     if (stbi__at_eof(s))   return stbi__errpuc("bad file","file too short (pure read count)");
   6369
   6370                     if (count > left)
   6371                        count = (stbi_uc) left;
   6372
   6373                     if (!stbi__readval(s,packet->channel,value))  return 0;
   6374
   6375                     for(i=0; i<count; ++i,dest+=4)
   6376                        stbi__copyval(packet->channel,dest,value);
   6377                     left -= count;
   6378                  }
   6379               }
   6380               break;
   6381
   6382            case 2: {//Mixed RLE
   6383               int left=width;
   6384               while (left>0) {
   6385                  int count = stbi__get8(s), i;
   6386                  if (stbi__at_eof(s))  return stbi__errpuc("bad file","file too short (mixed read count)");
   6387
   6388                  if (count >= 128) { // Repeated
   6389                     stbi_uc value[4];
   6390
   6391                     if (count==128)
   6392                        count = stbi__get16be(s);
   6393                     else
   6394                        count -= 127;
   6395                     if (count > left)
   6396                        return stbi__errpuc("bad file","scanline overrun");
   6397
   6398                     if (!stbi__readval(s,packet->channel,value))
   6399                        return 0;
   6400
   6401                     for(i=0;i<count;++i, dest += 4)
   6402                        stbi__copyval(packet->channel,dest,value);
   6403                  } else { // Raw
   6404                     ++count;
   6405                     if (count>left) return stbi__errpuc("bad file","scanline overrun");
   6406
   6407                     for(i=0;i<count;++i, dest+=4)
   6408                        if (!stbi__readval(s,packet->channel,dest))
   6409                           return 0;
   6410                  }
   6411                  left-=count;
   6412               }
   6413               break;
   6414            }
   6415         }
   6416      }
   6417   }
   6418
   6419   return result;
   6420}
   6421
   6422static void *stbi__pic_load(stbi__context *s,int *px,int *py,int *comp,int req_comp, stbi__result_info *ri)
   6423{
   6424   stbi_uc *result;
   6425   int i, x,y, internal_comp;
   6426   STBI_NOTUSED(ri);
   6427
   6428   if (!comp) comp = &internal_comp;
   6429
   6430   for (i=0; i<92; ++i)
   6431      stbi__get8(s);
   6432
   6433   x = stbi__get16be(s);
   6434   y = stbi__get16be(s);
   6435
   6436   if (y > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
   6437   if (x > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
   6438
   6439   if (stbi__at_eof(s))  return stbi__errpuc("bad file","file too short (pic header)");
   6440   if (!stbi__mad3sizes_valid(x, y, 4, 0)) return stbi__errpuc("too large", "PIC image too large to decode");
   6441
   6442   stbi__get32be(s); //skip `ratio'
   6443   stbi__get16be(s); //skip `fields'
   6444   stbi__get16be(s); //skip `pad'
   6445
   6446   // intermediate buffer is RGBA
   6447   result = (stbi_uc *) stbi__malloc_mad3(x, y, 4, 0);
   6448   if (!result) return stbi__errpuc("outofmem", "Out of memory");
   6449   memset(result, 0xff, x*y*4);
   6450
   6451   if (!stbi__pic_load_core(s,x,y,comp, result)) {
   6452      STBI_FREE(result);
   6453      result=0;
   6454   }
   6455   *px = x;
   6456   *py = y;
   6457   if (req_comp == 0) req_comp = *comp;
   6458   result=stbi__convert_format(result,4,req_comp,x,y);
   6459
   6460   return result;
   6461}
   6462
   6463static int stbi__pic_test(stbi__context *s)
   6464{
   6465   int r = stbi__pic_test_core(s);
   6466   stbi__rewind(s);
   6467   return r;
   6468}
   6469#endif
   6470
   6471// *************************************************************************************************
   6472// GIF loader -- public domain by Jean-Marc Lienher -- simplified/shrunk by stb
   6473
   6474#ifndef STBI_NO_GIF
   6475typedef struct
   6476{
   6477   stbi__int16 prefix;
   6478   stbi_uc first;
   6479   stbi_uc suffix;
   6480} stbi__gif_lzw;
   6481
   6482typedef struct
   6483{
   6484   int w,h;
   6485   stbi_uc *out;                 // output buffer (always 4 components)
   6486   stbi_uc *background;          // The current "background" as far as a gif is concerned
   6487   stbi_uc *history;
   6488   int flags, bgindex, ratio, transparent, eflags;
   6489   stbi_uc  pal[256][4];
   6490   stbi_uc lpal[256][4];
   6491   stbi__gif_lzw codes[8192];
   6492   stbi_uc *color_table;
   6493   int parse, step;
   6494   int lflags;
   6495   int start_x, start_y;
   6496   int max_x, max_y;
   6497   int cur_x, cur_y;
   6498   int line_size;
   6499   int delay;
   6500} stbi__gif;
   6501
   6502static int stbi__gif_test_raw(stbi__context *s)
   6503{
   6504   int sz;
   6505   if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8') return 0;
   6506   sz = stbi__get8(s);
   6507   if (sz != '9' && sz != '7') return 0;
   6508   if (stbi__get8(s) != 'a') return 0;
   6509   return 1;
   6510}
   6511
   6512static int stbi__gif_test(stbi__context *s)
   6513{
   6514   int r = stbi__gif_test_raw(s);
   6515   stbi__rewind(s);
   6516   return r;
   6517}
   6518
   6519static void stbi__gif_parse_colortable(stbi__context *s, stbi_uc pal[256][4], int num_entries, int transp)
   6520{
   6521   int i;
   6522   for (i=0; i < num_entries; ++i) {
   6523      pal[i][2] = stbi__get8(s);
   6524      pal[i][1] = stbi__get8(s);
   6525      pal[i][0] = stbi__get8(s);
   6526      pal[i][3] = transp == i ? 0 : 255;
   6527   }
   6528}
   6529
   6530static int stbi__gif_header(stbi__context *s, stbi__gif *g, int *comp, int is_info)
   6531{
   6532   stbi_uc version;
   6533   if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8')
   6534      return stbi__err("not GIF", "Corrupt GIF");
   6535
   6536   version = stbi__get8(s);
   6537   if (version != '7' && version != '9')    return stbi__err("not GIF", "Corrupt GIF");
   6538   if (stbi__get8(s) != 'a')                return stbi__err("not GIF", "Corrupt GIF");
   6539
   6540   stbi__g_failure_reason = "";
   6541   g->w = stbi__get16le(s);
   6542   g->h = stbi__get16le(s);
   6543   g->flags = stbi__get8(s);
   6544   g->bgindex = stbi__get8(s);
   6545   g->ratio = stbi__get8(s);
   6546   g->transparent = -1;
   6547
   6548   if (g->w > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
   6549   if (g->h > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
   6550
   6551   if (comp != 0) *comp = 4;  // can't actually tell whether it's 3 or 4 until we parse the comments
   6552
   6553   if (is_info) return 1;
   6554
   6555   if (g->flags & 0x80)
   6556      stbi__gif_parse_colortable(s,g->pal, 2 << (g->flags & 7), -1);
   6557
   6558   return 1;
   6559}
   6560
   6561static int stbi__gif_info_raw(stbi__context *s, int *x, int *y, int *comp)
   6562{
   6563   stbi__gif* g = (stbi__gif*) stbi__malloc(sizeof(stbi__gif));
   6564   if (!g) return stbi__err("outofmem", "Out of memory");
   6565   if (!stbi__gif_header(s, g, comp, 1)) {
   6566      STBI_FREE(g);
   6567      stbi__rewind( s );
   6568      return 0;
   6569   }
   6570   if (x) *x = g->w;
   6571   if (y) *y = g->h;
   6572   STBI_FREE(g);
   6573   return 1;
   6574}
   6575
   6576static void stbi__out_gif_code(stbi__gif *g, stbi__uint16 code)
   6577{
   6578   stbi_uc *p, *c;
   6579   int idx;
   6580
   6581   // recurse to decode the prefixes, since the linked-list is backwards,
   6582   // and working backwards through an interleaved image would be nasty
   6583   if (g->codes[code].prefix >= 0)
   6584      stbi__out_gif_code(g, g->codes[code].prefix);
   6585
   6586   if (g->cur_y >= g->max_y) return;
   6587
   6588   idx = g->cur_x + g->cur_y;
   6589   p = &g->out[idx];
   6590   g->history[idx / 4] = 1;
   6591
   6592   c = &g->color_table[g->codes[code].suffix * 4];
   6593   if (c[3] > 128) { // don't render transparent pixels;
   6594      p[0] = c[2];
   6595      p[1] = c[1];
   6596      p[2] = c[0];
   6597      p[3] = c[3];
   6598   }
   6599   g->cur_x += 4;
   6600
   6601   if (g->cur_x >= g->max_x) {
   6602      g->cur_x = g->start_x;
   6603      g->cur_y += g->step;
   6604
   6605      while (g->cur_y >= g->max_y && g->parse > 0) {
   6606         g->step = (1 << g->parse) * g->line_size;
   6607         g->cur_y = g->start_y + (g->step >> 1);
   6608         --g->parse;
   6609      }
   6610   }
   6611}
   6612
   6613static stbi_uc *stbi__process_gif_raster(stbi__context *s, stbi__gif *g)
   6614{
   6615   stbi_uc lzw_cs;
   6616   stbi__int32 len, init_code;
   6617   stbi__uint32 first;
   6618   stbi__int32 codesize, codemask, avail, oldcode, bits, valid_bits, clear;
   6619   stbi__gif_lzw *p;
   6620
   6621   lzw_cs = stbi__get8(s);
   6622   if (lzw_cs > 12) return NULL;
   6623   clear = 1 << lzw_cs;
   6624   first = 1;
   6625   codesize = lzw_cs + 1;
   6626   codemask = (1 << codesize) - 1;
   6627   bits = 0;
   6628   valid_bits = 0;
   6629   for (init_code = 0; init_code < clear; init_code++) {
   6630      g->codes[init_code].prefix = -1;
   6631      g->codes[init_code].first = (stbi_uc) init_code;
   6632      g->codes[init_code].suffix = (stbi_uc) init_code;
   6633   }
   6634
   6635   // support no starting clear code
   6636   avail = clear+2;
   6637   oldcode = -1;
   6638
   6639   len = 0;
   6640   for(;;) {
   6641      if (valid_bits < codesize) {
   6642         if (len == 0) {
   6643            len = stbi__get8(s); // start new block
   6644            if (len == 0)
   6645               return g->out;
   6646         }
   6647         --len;
   6648         bits |= (stbi__int32) stbi__get8(s) << valid_bits;
   6649         valid_bits += 8;
   6650      } else {
   6651         stbi__int32 code = bits & codemask;
   6652         bits >>= codesize;
   6653         valid_bits -= codesize;
   6654         // @OPTIMIZE: is there some way we can accelerate the non-clear path?
   6655         if (code == clear) {  // clear code
   6656            codesize = lzw_cs + 1;
   6657            codemask = (1 << codesize) - 1;
   6658            avail = clear + 2;
   6659            oldcode = -1;
   6660            first = 0;
   6661         } else if (code == clear + 1) { // end of stream code
   6662            stbi__skip(s, len);
   6663            while ((len = stbi__get8(s)) > 0)
   6664               stbi__skip(s,len);
   6665            return g->out;
   6666         } else if (code <= avail) {
   6667            if (first) {
   6668               return stbi__errpuc("no clear code", "Corrupt GIF");
   6669            }
   6670
   6671            if (oldcode >= 0) {
   6672               p = &g->codes[avail++];
   6673               if (avail > 8192) {
   6674                  return stbi__errpuc("too many codes", "Corrupt GIF");
   6675               }
   6676
   6677               p->prefix = (stbi__int16) oldcode;
   6678               p->first = g->codes[oldcode].first;
   6679               p->suffix = (code == avail) ? p->first : g->codes[code].first;
   6680            } else if (code == avail)
   6681               return stbi__errpuc("illegal code in raster", "Corrupt GIF");
   6682
   6683            stbi__out_gif_code(g, (stbi__uint16) code);
   6684
   6685            if ((avail & codemask) == 0 && avail <= 0x0FFF) {
   6686               codesize++;
   6687               codemask = (1 << codesize) - 1;
   6688            }
   6689
   6690            oldcode = code;
   6691         } else {
   6692            return stbi__errpuc("illegal code in raster", "Corrupt GIF");
   6693         }
   6694      }
   6695   }
   6696}
   6697
   6698// this function is designed to support animated gifs, although stb_image doesn't support it
   6699// two back is the image from two frames ago, used for a very specific disposal format
   6700static stbi_uc *stbi__gif_load_next(stbi__context *s, stbi__gif *g, int *comp, int req_comp, stbi_uc *two_back)
   6701{
   6702   int dispose;
   6703   int first_frame;
   6704   int pi;
   6705   int pcount;
   6706   STBI_NOTUSED(req_comp);
   6707
   6708   // on first frame, any non-written pixels get the background colour (non-transparent)
   6709   first_frame = 0;
   6710   if (g->out == 0) {
   6711      if (!stbi__gif_header(s, g, comp,0)) return 0; // stbi__g_failure_reason set by stbi__gif_header
   6712      if (!stbi__mad3sizes_valid(4, g->w, g->h, 0))
   6713         return stbi__errpuc("too large", "GIF image is too large");
   6714      pcount = g->w * g->h;
   6715      g->out = (stbi_uc *) stbi__malloc(4 * pcount);
   6716      g->background = (stbi_uc *) stbi__malloc(4 * pcount);
   6717      g->history = (stbi_uc *) stbi__malloc(pcount);
   6718      if (!g->out || !g->background || !g->history)
   6719         return stbi__errpuc("outofmem", "Out of memory");
   6720
   6721      // image is treated as "transparent" at the start - ie, nothing overwrites the current background;
   6722      // background colour is only used for pixels that are not rendered first frame, after that "background"
   6723      // color refers to the color that was there the previous frame.
   6724      memset(g->out, 0x00, 4 * pcount);
   6725      memset(g->background, 0x00, 4 * pcount); // state of the background (starts transparent)
   6726      memset(g->history, 0x00, pcount);        // pixels that were affected previous frame
   6727      first_frame = 1;
   6728   } else {
   6729      // second frame - how do we dispose of the previous one?
   6730      dispose = (g->eflags & 0x1C) >> 2;
   6731      pcount = g->w * g->h;
   6732
   6733      if ((dispose == 3) && (two_back == 0)) {
   6734         dispose = 2; // if I don't have an image to revert back to, default to the old background
   6735      }
   6736
   6737      if (dispose == 3) { // use previous graphic
   6738         for (pi = 0; pi < pcount; ++pi) {
   6739            if (g->history[pi]) {
   6740               memcpy( &g->out[pi * 4], &two_back[pi * 4], 4 );
   6741            }
   6742         }
   6743      } else if (dispose == 2) {
   6744         // restore what was changed last frame to background before that frame;
   6745         for (pi = 0; pi < pcount; ++pi) {
   6746            if (g->history[pi]) {
   6747               memcpy( &g->out[pi * 4], &g->background[pi * 4], 4 );
   6748            }
   6749         }
   6750      } else {
   6751         // This is a non-disposal case eithe way, so just
   6752         // leave the pixels as is, and they will become the new background
   6753         // 1: do not dispose
   6754         // 0:  not specified.
   6755      }
   6756
   6757      // background is what out is after the undoing of the previou frame;
   6758      memcpy( g->background, g->out, 4 * g->w * g->h );
   6759   }
   6760
   6761   // clear my history;
   6762   memset( g->history, 0x00, g->w * g->h );        // pixels that were affected previous frame
   6763
   6764   for (;;) {
   6765      int tag = stbi__get8(s);
   6766      switch (tag) {
   6767         case 0x2C: /* Image Descriptor */
   6768         {
   6769            stbi__int32 x, y, w, h;
   6770            stbi_uc *o;
   6771
   6772            x = stbi__get16le(s);
   6773            y = stbi__get16le(s);
   6774            w = stbi__get16le(s);
   6775            h = stbi__get16le(s);
   6776            if (((x + w) > (g->w)) || ((y + h) > (g->h)))
   6777               return stbi__errpuc("bad Image Descriptor", "Corrupt GIF");
   6778
   6779            g->line_size = g->w * 4;
   6780            g->start_x = x * 4;
   6781            g->start_y = y * g->line_size;
   6782            g->max_x   = g->start_x + w * 4;
   6783            g->max_y   = g->start_y + h * g->line_size;
   6784            g->cur_x   = g->start_x;
   6785            g->cur_y   = g->start_y;
   6786
   6787            // if the width of the specified rectangle is 0, that means
   6788            // we may not see *any* pixels or the image is malformed;
   6789            // to make sure this is caught, move the current y down to
   6790            // max_y (which is what out_gif_code checks).
   6791            if (w == 0)
   6792               g->cur_y = g->max_y;
   6793
   6794            g->lflags = stbi__get8(s);
   6795
   6796            if (g->lflags & 0x40) {
   6797               g->step = 8 * g->line_size; // first interlaced spacing
   6798               g->parse = 3;
   6799            } else {
   6800               g->step = g->line_size;
   6801               g->parse = 0;
   6802            }
   6803
   6804            if (g->lflags & 0x80) {
   6805               stbi__gif_parse_colortable(s,g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1);
   6806               g->color_table = (stbi_uc *) g->lpal;
   6807            } else if (g->flags & 0x80) {
   6808               g->color_table = (stbi_uc *) g->pal;
   6809            } else
   6810               return stbi__errpuc("missing color table", "Corrupt GIF");
   6811
   6812            o = stbi__process_gif_raster(s, g);
   6813            if (!o) return NULL;
   6814
   6815            // if this was the first frame,
   6816            pcount = g->w * g->h;
   6817            if (first_frame && (g->bgindex > 0)) {
   6818               // if first frame, any pixel not drawn to gets the background color
   6819               for (pi = 0; pi < pcount; ++pi) {
   6820                  if (g->history[pi] == 0) {
   6821                     g->pal[g->bgindex][3] = 255; // just in case it was made transparent, undo that; It will be reset next frame if need be;
   6822                     memcpy( &g->out[pi * 4], &g->pal[g->bgindex], 4 );
   6823                  }
   6824               }
   6825            }
   6826
   6827            return o;
   6828         }
   6829
   6830         case 0x21: // Comment Extension.
   6831         {
   6832            int len;
   6833            int ext = stbi__get8(s);
   6834            if (ext == 0xF9) { // Graphic Control Extension.
   6835               len = stbi__get8(s);
   6836               if (len == 4) {
   6837                  g->eflags = stbi__get8(s);
   6838                  g->delay = 10 * stbi__get16le(s); // delay - 1/100th of a second, saving as 1/1000ths.
   6839
   6840                  // unset old transparent
   6841                  if (g->transparent >= 0) {
   6842                     g->pal[g->transparent][3] = 255;
   6843                  }
   6844                  if (g->eflags & 0x01) {
   6845                     g->transparent = stbi__get8(s);
   6846                     if (g->transparent >= 0) {
   6847                        g->pal[g->transparent][3] = 0;
   6848                     }
   6849                  } else {
   6850                     // don't need transparent
   6851                     stbi__skip(s, 1);
   6852                     g->transparent = -1;
   6853                  }
   6854               } else {
   6855                  stbi__skip(s, len);
   6856                  break;
   6857               }
   6858            }
   6859            while ((len = stbi__get8(s)) != 0) {
   6860               stbi__skip(s, len);
   6861            }
   6862            break;
   6863         }
   6864
   6865         case 0x3B: // gif stream termination code
   6866            return (stbi_uc *) s; // using '1' causes warning on some compilers
   6867
   6868         default:
   6869            return stbi__errpuc("unknown code", "Corrupt GIF");
   6870      }
   6871   }
   6872}
   6873
   6874static void *stbi__load_gif_main_outofmem(stbi__gif *g, stbi_uc *out, int **delays)
   6875{
   6876   STBI_FREE(g->out);
   6877   STBI_FREE(g->history);
   6878   STBI_FREE(g->background);
   6879
   6880   if (out) STBI_FREE(out);
   6881   if (delays && *delays) STBI_FREE(*delays);
   6882   return stbi__errpuc("outofmem", "Out of memory");
   6883}
   6884
   6885static void *stbi__load_gif_main(stbi__context *s, int **delays, int *x, int *y, int *z, int *comp, int req_comp)
   6886{
   6887   if (stbi__gif_test(s)) {
   6888      int layers = 0;
   6889      stbi_uc *u = 0;
   6890      stbi_uc *out = 0;
   6891      stbi_uc *two_back = 0;
   6892      stbi__gif g;
   6893      int stride;
   6894      int out_size = 0;
   6895      int delays_size = 0;
   6896
   6897      STBI_NOTUSED(out_size);
   6898      STBI_NOTUSED(delays_size);
   6899
   6900      memset(&g, 0, sizeof(g));
   6901      if (delays) {
   6902         *delays = 0;
   6903      }
   6904
   6905      do {
   6906         u = stbi__gif_load_next(s, &g, comp, req_comp, two_back);
   6907         if (u == (stbi_uc *) s) u = 0;  // end of animated gif marker
   6908
   6909         if (u) {
   6910            *x = g.w;
   6911            *y = g.h;
   6912            ++layers;
   6913            stride = g.w * g.h * 4;
   6914
   6915            if (out) {
   6916               void *tmp = (stbi_uc*) STBI_REALLOC_SIZED( out, out_size, layers * stride );
   6917               if (!tmp)
   6918                  return stbi__load_gif_main_outofmem(&g, out, delays);
   6919               else {
   6920                   out = (stbi_uc*) tmp;
   6921                   out_size = layers * stride;
   6922               }
   6923
   6924               if (delays) {
   6925                  int *new_delays = (int*) STBI_REALLOC_SIZED( *delays, delays_size, sizeof(int) * layers );
   6926                  if (!new_delays)
   6927                     return stbi__load_gif_main_outofmem(&g, out, delays);
   6928                  *delays = new_delays;
   6929                  delays_size = layers * sizeof(int);
   6930               }
   6931            } else {
   6932               out = (stbi_uc*)stbi__malloc( layers * stride );
   6933               if (!out)
   6934                  return stbi__load_gif_main_outofmem(&g, out, delays);
   6935               out_size = layers * stride;
   6936               if (delays) {
   6937                  *delays = (int*) stbi__malloc( layers * sizeof(int) );
   6938                  if (!*delays)
   6939                     return stbi__load_gif_main_outofmem(&g, out, delays);
   6940                  delays_size = layers * sizeof(int);
   6941               }
   6942            }
   6943            memcpy( out + ((layers - 1) * stride), u, stride );
   6944            if (layers >= 2) {
   6945               two_back = out - 2 * stride;
   6946            }
   6947
   6948            if (delays) {
   6949               (*delays)[layers - 1U] = g.delay;
   6950            }
   6951         }
   6952      } while (u != 0);
   6953
   6954      // free temp buffer;
   6955      STBI_FREE(g.out);
   6956      STBI_FREE(g.history);
   6957      STBI_FREE(g.background);
   6958
   6959      // do the final conversion after loading everything;
   6960      if (req_comp && req_comp != 4)
   6961         out = stbi__convert_format(out, 4, req_comp, layers * g.w, g.h);
   6962
   6963      *z = layers;
   6964      return out;
   6965   } else {
   6966      return stbi__errpuc("not GIF", "Image was not as a gif type.");
   6967   }
   6968}
   6969
   6970static void *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
   6971{
   6972   stbi_uc *u = 0;
   6973   stbi__gif g;
   6974   memset(&g, 0, sizeof(g));
   6975   STBI_NOTUSED(ri);
   6976
   6977   u = stbi__gif_load_next(s, &g, comp, req_comp, 0);
   6978   if (u == (stbi_uc *) s) u = 0;  // end of animated gif marker
   6979   if (u) {
   6980      *x = g.w;
   6981      *y = g.h;
   6982
   6983      // moved conversion to after successful load so that the same
   6984      // can be done for multiple frames.
   6985      if (req_comp && req_comp != 4)
   6986         u = stbi__convert_format(u, 4, req_comp, g.w, g.h);
   6987   } else if (g.out) {
   6988      // if there was an error and we allocated an image buffer, free it!
   6989      STBI_FREE(g.out);
   6990   }
   6991
   6992   // free buffers needed for multiple frame loading;
   6993   STBI_FREE(g.history);
   6994   STBI_FREE(g.background);
   6995
   6996   return u;
   6997}
   6998
   6999static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp)
   7000{
   7001   return stbi__gif_info_raw(s,x,y,comp);
   7002}
   7003#endif
   7004
   7005// *************************************************************************************************
   7006// Radiance RGBE HDR loader
   7007// originally by Nicolas Schulz
   7008#ifndef STBI_NO_HDR
   7009static int stbi__hdr_test_core(stbi__context *s, const char *signature)
   7010{
   7011   int i;
   7012   for (i=0; signature[i]; ++i)
   7013      if (stbi__get8(s) != signature[i])
   7014          return 0;
   7015   stbi__rewind(s);
   7016   return 1;
   7017}
   7018
   7019static int stbi__hdr_test(stbi__context* s)
   7020{
   7021   int r = stbi__hdr_test_core(s, "#?RADIANCE\n");
   7022   stbi__rewind(s);
   7023   if(!r) {
   7024       r = stbi__hdr_test_core(s, "#?RGBE\n");
   7025       stbi__rewind(s);
   7026   }
   7027   return r;
   7028}
   7029
   7030#define STBI__HDR_BUFLEN  1024
   7031static char *stbi__hdr_gettoken(stbi__context *z, char *buffer)
   7032{
   7033   int len=0;
   7034   char c = '\0';
   7035
   7036   c = (char) stbi__get8(z);
   7037
   7038   while (!stbi__at_eof(z) && c != '\n') {
   7039      buffer[len++] = c;
   7040      if (len == STBI__HDR_BUFLEN-1) {
   7041         // flush to end of line
   7042         while (!stbi__at_eof(z) && stbi__get8(z) != '\n')
   7043            ;
   7044         break;
   7045      }
   7046      c = (char) stbi__get8(z);
   7047   }
   7048
   7049   buffer[len] = 0;
   7050   return buffer;
   7051}
   7052
   7053static void stbi__hdr_convert(float *output, stbi_uc *input, int req_comp)
   7054{
   7055   if ( input[3] != 0 ) {
   7056      float f1;
   7057      // Exponent
   7058      f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8));
   7059      if (req_comp <= 2)
   7060         output[0] = (input[0] + input[1] + input[2]) * f1 / 3;
   7061      else {
   7062         output[0] = input[0] * f1;
   7063         output[1] = input[1] * f1;
   7064         output[2] = input[2] * f1;
   7065      }
   7066      if (req_comp == 2) output[1] = 1;
   7067      if (req_comp == 4) output[3] = 1;
   7068   } else {
   7069      switch (req_comp) {
   7070         case 4: output[3] = 1; /* fallthrough */
   7071         case 3: output[0] = output[1] = output[2] = 0;
   7072                 break;
   7073         case 2: output[1] = 1; /* fallthrough */
   7074         case 1: output[0] = 0;
   7075                 break;
   7076      }
   7077   }
   7078}
   7079
   7080static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
   7081{
   7082   char buffer[STBI__HDR_BUFLEN];
   7083   char *token;
   7084   int valid = 0;
   7085   int width, height;
   7086   stbi_uc *scanline;
   7087   float *hdr_data;
   7088   int len;
   7089   unsigned char count, value;
   7090   int i, j, k, c1,c2, z;
   7091   const char *headerToken;
   7092   STBI_NOTUSED(ri);
   7093
   7094   // Check identifier
   7095   headerToken = stbi__hdr_gettoken(s,buffer);
   7096   if (strcmp(headerToken, "#?RADIANCE") != 0 && strcmp(headerToken, "#?RGBE") != 0)
   7097      return stbi__errpf("not HDR", "Corrupt HDR image");
   7098
   7099   // Parse header
   7100   for(;;) {
   7101      token = stbi__hdr_gettoken(s,buffer);
   7102      if (token[0] == 0) break;
   7103      if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
   7104   }
   7105
   7106   if (!valid)    return stbi__errpf("unsupported format", "Unsupported HDR format");
   7107
   7108   // Parse width and height
   7109   // can't use sscanf() if we're not using stdio!
   7110   token = stbi__hdr_gettoken(s,buffer);
   7111   if (strncmp(token, "-Y ", 3))  return stbi__errpf("unsupported data layout", "Unsupported HDR format");
   7112   token += 3;
   7113   height = (int) strtol(token, &token, 10);
   7114   while (*token == ' ') ++token;
   7115   if (strncmp(token, "+X ", 3))  return stbi__errpf("unsupported data layout", "Unsupported HDR format");
   7116   token += 3;
   7117   width = (int) strtol(token, NULL, 10);
   7118
   7119   if (height > STBI_MAX_DIMENSIONS) return stbi__errpf("too large","Very large image (corrupt?)");
   7120   if (width > STBI_MAX_DIMENSIONS) return stbi__errpf("too large","Very large image (corrupt?)");
   7121
   7122   *x = width;
   7123   *y = height;
   7124
   7125   if (comp) *comp = 3;
   7126   if (req_comp == 0) req_comp = 3;
   7127
   7128   if (!stbi__mad4sizes_valid(width, height, req_comp, sizeof(float), 0))
   7129      return stbi__errpf("too large", "HDR image is too large");
   7130
   7131   // Read data
   7132   hdr_data = (float *) stbi__malloc_mad4(width, height, req_comp, sizeof(float), 0);
   7133   if (!hdr_data)
   7134      return stbi__errpf("outofmem", "Out of memory");
   7135
   7136   // Load image data
   7137   // image data is stored as some number of sca
   7138   if ( width < 8 || width >= 32768) {
   7139      // Read flat data
   7140      for (j=0; j < height; ++j) {
   7141         for (i=0; i < width; ++i) {
   7142            stbi_uc rgbe[4];
   7143           main_decode_loop:
   7144            stbi__getn(s, rgbe, 4);
   7145            stbi__hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp);
   7146         }
   7147      }
   7148   } else {
   7149      // Read RLE-encoded data
   7150      scanline = NULL;
   7151
   7152      for (j = 0; j < height; ++j) {
   7153         c1 = stbi__get8(s);
   7154         c2 = stbi__get8(s);
   7155         len = stbi__get8(s);
   7156         if (c1 != 2 || c2 != 2 || (len & 0x80)) {
   7157            // not run-length encoded, so we have to actually use THIS data as a decoded
   7158            // pixel (note this can't be a valid pixel--one of RGB must be >= 128)
   7159            stbi_uc rgbe[4];
   7160            rgbe[0] = (stbi_uc) c1;
   7161            rgbe[1] = (stbi_uc) c2;
   7162            rgbe[2] = (stbi_uc) len;
   7163            rgbe[3] = (stbi_uc) stbi__get8(s);
   7164            stbi__hdr_convert(hdr_data, rgbe, req_comp);
   7165            i = 1;
   7166            j = 0;
   7167            STBI_FREE(scanline);
   7168            goto main_decode_loop; // yes, this makes no sense
   7169         }
   7170         len <<= 8;
   7171         len |= stbi__get8(s);
   7172         if (len != width) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("invalid decoded scanline length", "corrupt HDR"); }
   7173         if (scanline == NULL) {
   7174            scanline = (stbi_uc *) stbi__malloc_mad2(width, 4, 0);
   7175            if (!scanline) {
   7176               STBI_FREE(hdr_data);
   7177               return stbi__errpf("outofmem", "Out of memory");
   7178            }
   7179         }
   7180
   7181         for (k = 0; k < 4; ++k) {
   7182            int nleft;
   7183            i = 0;
   7184            while ((nleft = width - i) > 0) {
   7185               count = stbi__get8(s);
   7186               if (count > 128) {
   7187                  // Run
   7188                  value = stbi__get8(s);
   7189                  count -= 128;
   7190                  if (count > nleft) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); }
   7191                  for (z = 0; z < count; ++z)
   7192                     scanline[i++ * 4 + k] = value;
   7193               } else {
   7194                  // Dump
   7195                  if (count > nleft) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); }
   7196                  for (z = 0; z < count; ++z)
   7197                     scanline[i++ * 4 + k] = stbi__get8(s);
   7198               }
   7199            }
   7200         }
   7201         for (i=0; i < width; ++i)
   7202            stbi__hdr_convert(hdr_data+(j*width + i)*req_comp, scanline + i*4, req_comp);
   7203      }
   7204      if (scanline)
   7205         STBI_FREE(scanline);
   7206   }
   7207
   7208   return hdr_data;
   7209}
   7210
   7211static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp)
   7212{
   7213   char buffer[STBI__HDR_BUFLEN];
   7214   char *token;
   7215   int valid = 0;
   7216   int dummy;
   7217
   7218   if (!x) x = &dummy;
   7219   if (!y) y = &dummy;
   7220   if (!comp) comp = &dummy;
   7221
   7222   if (stbi__hdr_test(s) == 0) {
   7223       stbi__rewind( s );
   7224       return 0;
   7225   }
   7226
   7227   for(;;) {
   7228      token = stbi__hdr_gettoken(s,buffer);
   7229      if (token[0] == 0) break;
   7230      if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
   7231   }
   7232
   7233   if (!valid) {
   7234       stbi__rewind( s );
   7235       return 0;
   7236   }
   7237   token = stbi__hdr_gettoken(s,buffer);
   7238   if (strncmp(token, "-Y ", 3)) {
   7239       stbi__rewind( s );
   7240       return 0;
   7241   }
   7242   token += 3;
   7243   *y = (int) strtol(token, &token, 10);
   7244   while (*token == ' ') ++token;
   7245   if (strncmp(token, "+X ", 3)) {
   7246       stbi__rewind( s );
   7247       return 0;
   7248   }
   7249   token += 3;
   7250   *x = (int) strtol(token, NULL, 10);
   7251   *comp = 3;
   7252   return 1;
   7253}
   7254#endif // STBI_NO_HDR
   7255
   7256#ifndef STBI_NO_BMP
   7257static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp)
   7258{
   7259   void *p;
   7260   stbi__bmp_data info;
   7261
   7262   info.all_a = 255;
   7263   p = stbi__bmp_parse_header(s, &info);
   7264   if (p == NULL) {
   7265      stbi__rewind( s );
   7266      return 0;
   7267   }
   7268   if (x) *x = s->img_x;
   7269   if (y) *y = s->img_y;
   7270   if (comp) {
   7271      if (info.bpp == 24 && info.ma == 0xff000000)
   7272         *comp = 3;
   7273      else
   7274         *comp = info.ma ? 4 : 3;
   7275   }
   7276   return 1;
   7277}
   7278#endif
   7279
   7280#ifndef STBI_NO_PSD
   7281static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp)
   7282{
   7283   int channelCount, dummy, depth;
   7284   if (!x) x = &dummy;
   7285   if (!y) y = &dummy;
   7286   if (!comp) comp = &dummy;
   7287   if (stbi__get32be(s) != 0x38425053) {
   7288       stbi__rewind( s );
   7289       return 0;
   7290   }
   7291   if (stbi__get16be(s) != 1) {
   7292       stbi__rewind( s );
   7293       return 0;
   7294   }
   7295   stbi__skip(s, 6);
   7296   channelCount = stbi__get16be(s);
   7297   if (channelCount < 0 || channelCount > 16) {
   7298       stbi__rewind( s );
   7299       return 0;
   7300   }
   7301   *y = stbi__get32be(s);
   7302   *x = stbi__get32be(s);
   7303   depth = stbi__get16be(s);
   7304   if (depth != 8 && depth != 16) {
   7305       stbi__rewind( s );
   7306       return 0;
   7307   }
   7308   if (stbi__get16be(s) != 3) {
   7309       stbi__rewind( s );
   7310       return 0;
   7311   }
   7312   *comp = 4;
   7313   return 1;
   7314}
   7315
   7316static int stbi__psd_is16(stbi__context *s)
   7317{
   7318   int channelCount, depth;
   7319   if (stbi__get32be(s) != 0x38425053) {
   7320       stbi__rewind( s );
   7321       return 0;
   7322   }
   7323   if (stbi__get16be(s) != 1) {
   7324       stbi__rewind( s );
   7325       return 0;
   7326   }
   7327   stbi__skip(s, 6);
   7328   channelCount = stbi__get16be(s);
   7329   if (channelCount < 0 || channelCount > 16) {
   7330       stbi__rewind( s );
   7331       return 0;
   7332   }
   7333   STBI_NOTUSED(stbi__get32be(s));
   7334   STBI_NOTUSED(stbi__get32be(s));
   7335   depth = stbi__get16be(s);
   7336   if (depth != 16) {
   7337       stbi__rewind( s );
   7338       return 0;
   7339   }
   7340   return 1;
   7341}
   7342#endif
   7343
   7344#ifndef STBI_NO_PIC
   7345static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp)
   7346{
   7347   int act_comp=0,num_packets=0,chained,dummy;
   7348   stbi__pic_packet packets[10];
   7349
   7350   if (!x) x = &dummy;
   7351   if (!y) y = &dummy;
   7352   if (!comp) comp = &dummy;
   7353
   7354   if (!stbi__pic_is4(s,"\x53\x80\xF6\x34")) {
   7355      stbi__rewind(s);
   7356      return 0;
   7357   }
   7358
   7359   stbi__skip(s, 88);
   7360
   7361   *x = stbi__get16be(s);
   7362   *y = stbi__get16be(s);
   7363   if (stbi__at_eof(s)) {
   7364      stbi__rewind( s);
   7365      return 0;
   7366   }
   7367   if ( (*x) != 0 && (1 << 28) / (*x) < (*y)) {
   7368      stbi__rewind( s );
   7369      return 0;
   7370   }
   7371
   7372   stbi__skip(s, 8);
   7373
   7374   do {
   7375      stbi__pic_packet *packet;
   7376
   7377      if (num_packets==sizeof(packets)/sizeof(packets[0]))
   7378         return 0;
   7379
   7380      packet = &packets[num_packets++];
   7381      chained = stbi__get8(s);
   7382      packet->size    = stbi__get8(s);
   7383      packet->type    = stbi__get8(s);
   7384      packet->channel = stbi__get8(s);
   7385      act_comp |= packet->channel;
   7386
   7387      if (stbi__at_eof(s)) {
   7388          stbi__rewind( s );
   7389          return 0;
   7390      }
   7391      if (packet->size != 8) {
   7392          stbi__rewind( s );
   7393          return 0;
   7394      }
   7395   } while (chained);
   7396
   7397   *comp = (act_comp & 0x10 ? 4 : 3);
   7398
   7399   return 1;
   7400}
   7401#endif
   7402
   7403// *************************************************************************************************
   7404// Portable Gray Map and Portable Pixel Map loader
   7405// by Ken Miller
   7406//
   7407// PGM: http://netpbm.sourceforge.net/doc/pgm.html
   7408// PPM: http://netpbm.sourceforge.net/doc/ppm.html
   7409//
   7410// Known limitations:
   7411//    Does not support comments in the header section
   7412//    Does not support ASCII image data (formats P2 and P3)
   7413
   7414#ifndef STBI_NO_PNM
   7415
   7416static int      stbi__pnm_test(stbi__context *s)
   7417{
   7418   char p, t;
   7419   p = (char) stbi__get8(s);
   7420   t = (char) stbi__get8(s);
   7421   if (p != 'P' || (t != '5' && t != '6')) {
   7422       stbi__rewind( s );
   7423       return 0;
   7424   }
   7425   return 1;
   7426}
   7427
   7428static void *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
   7429{
   7430   stbi_uc *out;
   7431   STBI_NOTUSED(ri);
   7432
   7433   ri->bits_per_channel = stbi__pnm_info(s, (int *)&s->img_x, (int *)&s->img_y, (int *)&s->img_n);
   7434   if (ri->bits_per_channel == 0)
   7435      return 0;
   7436
   7437   if (s->img_y > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
   7438   if (s->img_x > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
   7439
   7440   *x = s->img_x;
   7441   *y = s->img_y;
   7442   if (comp) *comp = s->img_n;
   7443
   7444   if (!stbi__mad4sizes_valid(s->img_n, s->img_x, s->img_y, ri->bits_per_channel / 8, 0))
   7445      return stbi__errpuc("too large", "PNM too large");
   7446
   7447   out = (stbi_uc *) stbi__malloc_mad4(s->img_n, s->img_x, s->img_y, ri->bits_per_channel / 8, 0);
   7448   if (!out) return stbi__errpuc("outofmem", "Out of memory");
   7449   stbi__getn(s, out, s->img_n * s->img_x * s->img_y * (ri->bits_per_channel / 8));
   7450
   7451   if (req_comp && req_comp != s->img_n) {
   7452      out = stbi__convert_format(out, s->img_n, req_comp, s->img_x, s->img_y);
   7453      if (out == NULL) return out; // stbi__convert_format frees input on failure
   7454   }
   7455   return out;
   7456}
   7457
   7458static int      stbi__pnm_isspace(char c)
   7459{
   7460   return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
   7461}
   7462
   7463static void     stbi__pnm_skip_whitespace(stbi__context *s, char *c)
   7464{
   7465   for (;;) {
   7466      while (!stbi__at_eof(s) && stbi__pnm_isspace(*c))
   7467         *c = (char) stbi__get8(s);
   7468
   7469      if (stbi__at_eof(s) || *c != '#')
   7470         break;
   7471
   7472      while (!stbi__at_eof(s) && *c != '\n' && *c != '\r' )
   7473         *c = (char) stbi__get8(s);
   7474   }
   7475}
   7476
   7477static int      stbi__pnm_isdigit(char c)
   7478{
   7479   return c >= '0' && c <= '9';
   7480}
   7481
   7482static int      stbi__pnm_getinteger(stbi__context *s, char *c)
   7483{
   7484   int value = 0;
   7485
   7486   while (!stbi__at_eof(s) && stbi__pnm_isdigit(*c)) {
   7487      value = value*10 + (*c - '0');
   7488      *c = (char) stbi__get8(s);
   7489   }
   7490
   7491   return value;
   7492}
   7493
   7494static int      stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp)
   7495{
   7496   int maxv, dummy;
   7497   char c, p, t;
   7498
   7499   if (!x) x = &dummy;
   7500   if (!y) y = &dummy;
   7501   if (!comp) comp = &dummy;
   7502
   7503   stbi__rewind(s);
   7504
   7505   // Get identifier
   7506   p = (char) stbi__get8(s);
   7507   t = (char) stbi__get8(s);
   7508   if (p != 'P' || (t != '5' && t != '6')) {
   7509       stbi__rewind(s);
   7510       return 0;
   7511   }
   7512
   7513   *comp = (t == '6') ? 3 : 1;  // '5' is 1-component .pgm; '6' is 3-component .ppm
   7514
   7515   c = (char) stbi__get8(s);
   7516   stbi__pnm_skip_whitespace(s, &c);
   7517
   7518   *x = stbi__pnm_getinteger(s, &c); // read width
   7519   stbi__pnm_skip_whitespace(s, &c);
   7520
   7521   *y = stbi__pnm_getinteger(s, &c); // read height
   7522   stbi__pnm_skip_whitespace(s, &c);
   7523
   7524   maxv = stbi__pnm_getinteger(s, &c);  // read max value
   7525   if (maxv > 65535)
   7526      return stbi__err("max value > 65535", "PPM image supports only 8-bit and 16-bit images");
   7527   else if (maxv > 255)
   7528      return 16;
   7529   else
   7530      return 8;
   7531}
   7532
   7533static int stbi__pnm_is16(stbi__context *s)
   7534{
   7535   if (stbi__pnm_info(s, NULL, NULL, NULL) == 16)
   7536	   return 1;
   7537   return 0;
   7538}
   7539#endif
   7540
   7541static int stbi__info_main(stbi__context *s, int *x, int *y, int *comp)
   7542{
   7543   #ifndef STBI_NO_JPEG
   7544   if (stbi__jpeg_info(s, x, y, comp)) return 1;
   7545   #endif
   7546
   7547   #ifndef STBI_NO_PNG
   7548   if (stbi__png_info(s, x, y, comp))  return 1;
   7549   #endif
   7550
   7551   #ifndef STBI_NO_GIF
   7552   if (stbi__gif_info(s, x, y, comp))  return 1;
   7553   #endif
   7554
   7555   #ifndef STBI_NO_BMP
   7556   if (stbi__bmp_info(s, x, y, comp))  return 1;
   7557   #endif
   7558
   7559   #ifndef STBI_NO_PSD
   7560   if (stbi__psd_info(s, x, y, comp))  return 1;
   7561   #endif
   7562
   7563   #ifndef STBI_NO_PIC
   7564   if (stbi__pic_info(s, x, y, comp))  return 1;
   7565   #endif
   7566
   7567   #ifndef STBI_NO_PNM
   7568   if (stbi__pnm_info(s, x, y, comp))  return 1;
   7569   #endif
   7570
   7571   #ifndef STBI_NO_HDR
   7572   if (stbi__hdr_info(s, x, y, comp))  return 1;
   7573   #endif
   7574
   7575   // test tga last because it's a crappy test!
   7576   #ifndef STBI_NO_TGA
   7577   if (stbi__tga_info(s, x, y, comp))
   7578       return 1;
   7579   #endif
   7580   return stbi__err("unknown image type", "Image not of any known type, or corrupt");
   7581}
   7582
   7583static int stbi__is_16_main(stbi__context *s)
   7584{
   7585   #ifndef STBI_NO_PNG
   7586   if (stbi__png_is16(s))  return 1;
   7587   #endif
   7588
   7589   #ifndef STBI_NO_PSD
   7590   if (stbi__psd_is16(s))  return 1;
   7591   #endif
   7592
   7593   #ifndef STBI_NO_PNM
   7594   if (stbi__pnm_is16(s))  return 1;
   7595   #endif
   7596   return 0;
   7597}
   7598
   7599#ifndef STBI_NO_STDIO
   7600STBIDEF int stbi_info(char const *filename, int *x, int *y, int *comp)
   7601{
   7602    FILE *f = stbi__fopen(filename, "rb");
   7603    int result;
   7604    if (!f) return stbi__err("can't fopen", "Unable to open file");
   7605    result = stbi_info_from_file(f, x, y, comp);
   7606    fclose(f);
   7607    return result;
   7608}
   7609
   7610STBIDEF int stbi_info_from_file(FILE *f, int *x, int *y, int *comp)
   7611{
   7612   int r;
   7613   stbi__context s;
   7614   long pos = ftell(f);
   7615   stbi__start_file(&s, f);
   7616   r = stbi__info_main(&s,x,y,comp);
   7617   fseek(f,pos,SEEK_SET);
   7618   return r;
   7619}
   7620
   7621STBIDEF int stbi_is_16_bit(char const *filename)
   7622{
   7623    FILE *f = stbi__fopen(filename, "rb");
   7624    int result;
   7625    if (!f) return stbi__err("can't fopen", "Unable to open file");
   7626    result = stbi_is_16_bit_from_file(f);
   7627    fclose(f);
   7628    return result;
   7629}
   7630
   7631STBIDEF int stbi_is_16_bit_from_file(FILE *f)
   7632{
   7633   int r;
   7634   stbi__context s;
   7635   long pos = ftell(f);
   7636   stbi__start_file(&s, f);
   7637   r = stbi__is_16_main(&s);
   7638   fseek(f,pos,SEEK_SET);
   7639   return r;
   7640}
   7641#endif // !STBI_NO_STDIO
   7642
   7643STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
   7644{
   7645   stbi__context s;
   7646   stbi__start_mem(&s,buffer,len);
   7647   return stbi__info_main(&s,x,y,comp);
   7648}
   7649
   7650STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *c, void *user, int *x, int *y, int *comp)
   7651{
   7652   stbi__context s;
   7653   stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
   7654   return stbi__info_main(&s,x,y,comp);
   7655}
   7656
   7657STBIDEF int stbi_is_16_bit_from_memory(stbi_uc const *buffer, int len)
   7658{
   7659   stbi__context s;
   7660   stbi__start_mem(&s,buffer,len);
   7661   return stbi__is_16_main(&s);
   7662}
   7663
   7664STBIDEF int stbi_is_16_bit_from_callbacks(stbi_io_callbacks const *c, void *user)
   7665{
   7666   stbi__context s;
   7667   stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
   7668   return stbi__is_16_main(&s);
   7669}
   7670
   7671#endif // STB_IMAGE_IMPLEMENTATION
   7672
   7673/*
   7674   revision history:
   7675      2.20  (2019-02-07) support utf8 filenames in Windows; fix warnings and platform ifdefs
   7676      2.19  (2018-02-11) fix warning
   7677      2.18  (2018-01-30) fix warnings
   7678      2.17  (2018-01-29) change sbti__shiftsigned to avoid clang -O2 bug
   7679                         1-bit BMP
   7680                         *_is_16_bit api
   7681                         avoid warnings
   7682      2.16  (2017-07-23) all functions have 16-bit variants;
   7683                         STBI_NO_STDIO works again;
   7684                         compilation fixes;
   7685                         fix rounding in unpremultiply;
   7686                         optimize vertical flip;
   7687                         disable raw_len validation;
   7688                         documentation fixes
   7689      2.15  (2017-03-18) fix png-1,2,4 bug; now all Imagenet JPGs decode;
   7690                         warning fixes; disable run-time SSE detection on gcc;
   7691                         uniform handling of optional "return" values;
   7692                         thread-safe initialization of zlib tables
   7693      2.14  (2017-03-03) remove deprecated STBI_JPEG_OLD; fixes for Imagenet JPGs
   7694      2.13  (2016-11-29) add 16-bit API, only supported for PNG right now
   7695      2.12  (2016-04-02) fix typo in 2.11 PSD fix that caused crashes
   7696      2.11  (2016-04-02) allocate large structures on the stack
   7697                         remove white matting for transparent PSD
   7698                         fix reported channel count for PNG & BMP
   7699                         re-enable SSE2 in non-gcc 64-bit
   7700                         support RGB-formatted JPEG
   7701                         read 16-bit PNGs (only as 8-bit)
   7702      2.10  (2016-01-22) avoid warning introduced in 2.09 by STBI_REALLOC_SIZED
   7703      2.09  (2016-01-16) allow comments in PNM files
   7704                         16-bit-per-pixel TGA (not bit-per-component)
   7705                         info() for TGA could break due to .hdr handling
   7706                         info() for BMP to shares code instead of sloppy parse
   7707                         can use STBI_REALLOC_SIZED if allocator doesn't support realloc
   7708                         code cleanup
   7709      2.08  (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
   7710      2.07  (2015-09-13) fix compiler warnings
   7711                         partial animated GIF support
   7712                         limited 16-bpc PSD support
   7713                         #ifdef unused functions
   7714                         bug with < 92 byte PIC,PNM,HDR,TGA
   7715      2.06  (2015-04-19) fix bug where PSD returns wrong '*comp' value
   7716      2.05  (2015-04-19) fix bug in progressive JPEG handling, fix warning
   7717      2.04  (2015-04-15) try to re-enable SIMD on MinGW 64-bit
   7718      2.03  (2015-04-12) extra corruption checking (mmozeiko)
   7719                         stbi_set_flip_vertically_on_load (nguillemot)
   7720                         fix NEON support; fix mingw support
   7721      2.02  (2015-01-19) fix incorrect assert, fix warning
   7722      2.01  (2015-01-17) fix various warnings; suppress SIMD on gcc 32-bit without -msse2
   7723      2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
   7724      2.00  (2014-12-25) optimize JPG, including x86 SSE2 & NEON SIMD (ryg)
   7725                         progressive JPEG (stb)
   7726                         PGM/PPM support (Ken Miller)
   7727                         STBI_MALLOC,STBI_REALLOC,STBI_FREE
   7728                         GIF bugfix -- seemingly never worked
   7729                         STBI_NO_*, STBI_ONLY_*
   7730      1.48  (2014-12-14) fix incorrectly-named assert()
   7731      1.47  (2014-12-14) 1/2/4-bit PNG support, both direct and paletted (Omar Cornut & stb)
   7732                         optimize PNG (ryg)
   7733                         fix bug in interlaced PNG with user-specified channel count (stb)
   7734      1.46  (2014-08-26)
   7735              fix broken tRNS chunk (colorkey-style transparency) in non-paletted PNG
   7736      1.45  (2014-08-16)
   7737              fix MSVC-ARM internal compiler error by wrapping malloc
   7738      1.44  (2014-08-07)
   7739              various warning fixes from Ronny Chevalier
   7740      1.43  (2014-07-15)
   7741              fix MSVC-only compiler problem in code changed in 1.42
   7742      1.42  (2014-07-09)
   7743              don't define _CRT_SECURE_NO_WARNINGS (affects user code)
   7744              fixes to stbi__cleanup_jpeg path
   7745              added STBI_ASSERT to avoid requiring assert.h
   7746      1.41  (2014-06-25)
   7747              fix search&replace from 1.36 that messed up comments/error messages
   7748      1.40  (2014-06-22)
   7749              fix gcc struct-initialization warning
   7750      1.39  (2014-06-15)
   7751              fix to TGA optimization when req_comp != number of components in TGA;
   7752              fix to GIF loading because BMP wasn't rewinding (whoops, no GIFs in my test suite)
   7753              add support for BMP version 5 (more ignored fields)
   7754      1.38  (2014-06-06)
   7755              suppress MSVC warnings on integer casts truncating values
   7756              fix accidental rename of 'skip' field of I/O
   7757      1.37  (2014-06-04)
   7758              remove duplicate typedef
   7759      1.36  (2014-06-03)
   7760              convert to header file single-file library
   7761              if de-iphone isn't set, load iphone images color-swapped instead of returning NULL
   7762      1.35  (2014-05-27)
   7763              various warnings
   7764              fix broken STBI_SIMD path
   7765              fix bug where stbi_load_from_file no longer left file pointer in correct place
   7766              fix broken non-easy path for 32-bit BMP (possibly never used)
   7767              TGA optimization by Arseny Kapoulkine
   7768      1.34  (unknown)
   7769              use STBI_NOTUSED in stbi__resample_row_generic(), fix one more leak in tga failure case
   7770      1.33  (2011-07-14)
   7771              make stbi_is_hdr work in STBI_NO_HDR (as specified), minor compiler-friendly improvements
   7772      1.32  (2011-07-13)
   7773              support for "info" function for all supported filetypes (SpartanJ)
   7774      1.31  (2011-06-20)
   7775              a few more leak fixes, bug in PNG handling (SpartanJ)
   7776      1.30  (2011-06-11)
   7777              added ability to load files via callbacks to accomidate custom input streams (Ben Wenger)
   7778              removed deprecated format-specific test/load functions
   7779              removed support for installable file formats (stbi_loader) -- would have been broken for IO callbacks anyway
   7780              error cases in bmp and tga give messages and don't leak (Raymond Barbiero, grisha)
   7781              fix inefficiency in decoding 32-bit BMP (David Woo)
   7782      1.29  (2010-08-16)
   7783              various warning fixes from Aurelien Pocheville
   7784      1.28  (2010-08-01)
   7785              fix bug in GIF palette transparency (SpartanJ)
   7786      1.27  (2010-08-01)
   7787              cast-to-stbi_uc to fix warnings
   7788      1.26  (2010-07-24)
   7789              fix bug in file buffering for PNG reported by SpartanJ
   7790      1.25  (2010-07-17)
   7791              refix trans_data warning (Won Chun)
   7792      1.24  (2010-07-12)
   7793              perf improvements reading from files on platforms with lock-heavy fgetc()
   7794              minor perf improvements for jpeg
   7795              deprecated type-specific functions so we'll get feedback if they're needed
   7796              attempt to fix trans_data warning (Won Chun)
   7797      1.23    fixed bug in iPhone support
   7798      1.22  (2010-07-10)
   7799              removed image *writing* support
   7800              stbi_info support from Jetro Lauha
   7801              GIF support from Jean-Marc Lienher
   7802              iPhone PNG-extensions from James Brown
   7803              warning-fixes from Nicolas Schulz and Janez Zemva (i.stbi__err. Janez (U+017D)emva)
   7804      1.21    fix use of 'stbi_uc' in header (reported by jon blow)
   7805      1.20    added support for Softimage PIC, by Tom Seddon
   7806      1.19    bug in interlaced PNG corruption check (found by ryg)
   7807      1.18  (2008-08-02)
   7808              fix a threading bug (local mutable static)
   7809      1.17    support interlaced PNG
   7810      1.16    major bugfix - stbi__convert_format converted one too many pixels
   7811      1.15    initialize some fields for thread safety
   7812      1.14    fix threadsafe conversion bug
   7813              header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
   7814      1.13    threadsafe
   7815      1.12    const qualifiers in the API
   7816      1.11    Support installable IDCT, colorspace conversion routines
   7817      1.10    Fixes for 64-bit (don't use "unsigned long")
   7818              optimized upsampling by Fabian "ryg" Giesen
   7819      1.09    Fix format-conversion for PSD code (bad global variables!)
   7820      1.08    Thatcher Ulrich's PSD code integrated by Nicolas Schulz
   7821      1.07    attempt to fix C++ warning/errors again
   7822      1.06    attempt to fix C++ warning/errors again
   7823      1.05    fix TGA loading to return correct *comp and use good luminance calc
   7824      1.04    default float alpha is 1, not 255; use 'void *' for stbi_image_free
   7825      1.03    bugfixes to STBI_NO_STDIO, STBI_NO_HDR
   7826      1.02    support for (subset of) HDR files, float interface for preferred access to them
   7827      1.01    fix bug: possible bug in handling right-side up bmps... not sure
   7828              fix bug: the stbi__bmp_load() and stbi__tga_load() functions didn't work at all
   7829      1.00    interface to zlib that skips zlib header
   7830      0.99    correct handling of alpha in palette
   7831      0.98    TGA loader by lonesock; dynamically add loaders (untested)
   7832      0.97    jpeg errors on too large a file; also catch another malloc failure
   7833      0.96    fix detection of invalid v value - particleman@mollyrocket forum
   7834      0.95    during header scan, seek to markers in case of padding
   7835      0.94    STBI_NO_STDIO to disable stdio usage; rename all #defines the same
   7836      0.93    handle jpegtran output; verbose errors
   7837      0.92    read 4,8,16,24,32-bit BMP files of several formats
   7838      0.91    output 24-bit Windows 3.0 BMP files
   7839      0.90    fix a few more warnings; bump version number to approach 1.0
   7840      0.61    bugfixes due to Marc LeBlanc, Christopher Lloyd
   7841      0.60    fix compiling as c++
   7842      0.59    fix warnings: merge Dave Moore's -Wall fixes
   7843      0.58    fix bug: zlib uncompressed mode len/nlen was wrong endian
   7844      0.57    fix bug: jpg last huffman symbol before marker was >9 bits but less than 16 available
   7845      0.56    fix bug: zlib uncompressed mode len vs. nlen
   7846      0.55    fix bug: restart_interval not initialized to 0
   7847      0.54    allow NULL for 'int *comp'
   7848      0.53    fix bug in png 3->4; speedup png decoding
   7849      0.52    png handles req_comp=3,4 directly; minor cleanup; jpeg comments
   7850      0.51    obey req_comp requests, 1-component jpegs return as 1-component,
   7851              on 'test' only check type, not whether we support this variant
   7852      0.50  (2006-11-19)
   7853              first released version
   7854*/
   7855
   7856
   7857/*
   7858------------------------------------------------------------------------------
   7859This software is available under 2 licenses -- choose whichever you prefer.
   7860------------------------------------------------------------------------------
   7861ALTERNATIVE A - MIT License
   7862Copyright (c) 2017 Sean Barrett
   7863Permission is hereby granted, free of charge, to any person obtaining a copy of
   7864this software and associated documentation files (the "Software"), to deal in
   7865the Software without restriction, including without limitation the rights to
   7866use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
   7867of the Software, and to permit persons to whom the Software is furnished to do
   7868so, subject to the following conditions:
   7869The above copyright notice and this permission notice shall be included in all
   7870copies or substantial portions of the Software.
   7871THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
   7872IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
   7873FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
   7874AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
   7875LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
   7876OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
   7877SOFTWARE.
   7878------------------------------------------------------------------------------
   7879ALTERNATIVE B - Public Domain (www.unlicense.org)
   7880This is free and unencumbered software released into the public domain.
   7881Anyone is free to copy, modify, publish, use, compile, sell, or distribute this
   7882software, either in source code form or as a compiled binary, for any purpose,
   7883commercial or non-commercial, and by any means.
   7884In jurisdictions that recognize copyright laws, the author or authors of this
   7885software dedicate any and all copyright interest in the software to the public
   7886domain. We make this dedication for the benefit of the public at large and to
   7887the detriment of our heirs and successors. We intend this dedication to be an
   7888overt act of relinquishment in perpetuity of all present and future rights to
   7889this software under copyright law.
   7890THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
   7891IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
   7892FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
   7893AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
   7894ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
   7895WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
   7896------------------------------------------------------------------------------
   7897*/