// Copyright (c) 2014-2025 The Khronos Group Inc.
// Copyright notice at https://www.khronos.org/registry/speccopyright.html
[[ASTC]]
== ASTC Compressed Texture Image Formats

_This description is derived from the Khronos
link:https://www.khronos.org/registry/gles/extensions/OES/OES_texture_compression_astc.txt[OES_texture_compression_astc]
OpenGL extension._

=== What is ASTC?

ASTC stands for Adaptive Scalable Texture Compression.
The ASTC formats form a family of related compressed texture image
formats. They are all derived from a common set of definitions.

ASTC textures may be either 2D or 3D.

ASTC textures may be encoded using either high or low dynamic range.
Low dynamic range images may optionally be specified using the sRGB
transfer function for the _RGB_ channels.

Two sub-profiles (``LDR Profile'' and ``HDR Profile'') may be implemented,
which support only 2D images at low or high dynamic range respectively.

ASTC textures may be encoded as 1, 2, 3 or 4 components, but they are
all decoded into _RGBA_. ASTC has a variable block size.

<<<

=== Design Goals

The design goals for the format are as follows:

* Random access. This is a must for any texture compression format.
* Bit-exact decode. This is a must for conformance testing and
  reproducibility.
* Suitable for mobile use. The format should be suitable for both
  desktop and mobile GPU environments. It should be low bandwidth
  and low in area.
* Flexible choice of bit rate. Current formats only offer a few bit
  rates, leaving content developers with only coarse control over
  the size/quality trade-off.
* Scalable and long-lived. The format should support existing _R_, _RG_,
  _RGB_ and _RGBA_ image types, and also have high ``headroom'', allowing
  continuing use for several years and the ability to innovate in
  encoders. Part of this is the choice to include HDR and 3D.
* Feature orthogonality. The choices for the various features of the
  format are all orthogonal to each other. This has three effects:
  first, it allows a large, flexible configuration space; second,
  it makes that space easier to understand; and third, it makes
  verification easier.
* Best in class at given bit rate. It should beat or match the current
  best in class for peak signal-to-noise ratio (PSNR) at all bit rates.
* Fast decode. Texel throughput for a cached texture should be one
  texel decode per clock cycle per decoder. Parallel decoding of several
  texels from the same block should be possible at incremental cost.
* Low bandwidth. The encoding scheme should ensure that memory access
  is kept to a minimum, cache reuse is high and memory bandwidth for
  the format is low.
* Low area. It must occupy comparable die size to competing formats.

NOTE: There are a number of implementations in the wild which have
small inaccuracies in the decoded result.
Future hardware should be bit-exact, so software should rely on the
behavior documented in this specification.

<<<

=== Basic Concepts

ASTC is a block-based lossy compression format. The compressed image
is divided into a number of blocks of uniform size, which makes it
possible to quickly determine which block a given texel resides in.

Each block has a fixed memory footprint of 128 bits, but these bits
can represent varying numbers of texels (the block ``footprint'').

NOTE: The term ``block footprint'' in ASTC refers to the same concept
as ``compressed texel block dimensions'' elsewhere in the Data Format
Specification.

Block footprint sizes are not confined to powers-of-two, and are
also not confined to be square. They may be 2D, in which case the
block dimensions range from 4 to 12 texels, or 3D, in which case
the block dimensions range from 3 to 6 texels.

Decoding one texel requires only the data from a single block. This
simplifies cache design, reduces bandwidth and improves encoder throughput.

[NOTE]
====
ASTC has multiple types of ``mode'', which should not be conflated:

* ASTC has three _operation modes_: sRGB (which is inherently LDR),
linear LDR and HDR.
These operation modes apply to the entire texture, and are configured by
the user or implementation.
<<astc_ldr_profile,LDR profile>> indicates that HDR operation mode is
unavailable; otherwise implementations may treat linear LDR operation
mode as HDR operation mode if not restricted by other configuration.

* ASTC has three _decode modes_: ++decode_float16++, ++decode_unorm8++ and
++decode_rgb9e5++.
These decode modes indicate how texels are processed during the decoding
operation, and, when available, are configured by the user for the entire
texture.
Implementations must provide at least the ++decode_float16++ decode mode.

* ASTC texel blocks have different _block modes_: these allow different
combinations of endpoints and other configuration data on a per-block
basis.

* ASTC texel blocks have multiple _endpoint modes_: these represent
different ways to encode the interpolation endpoints in a texel block.
Some _endpoint modes_ support HDR values and are collectively known as
_HDR endpoint modes_, and others are restricted to the LDR range and are
collectively known as _LDR endpoint modes_.
_HDR endpoint modes_ are decoded as the error color except in _HDR operation mode_.

* Some _HDR endpoint modes_ have further configuration _mode bits_.

* An ASTC texel block may optionally operate in _dual-plane mode_,
which allows different weights for different endpoint components.

====

<<<
=== Block Encoding

To understand how the blocks are stored and decoded, it is useful to start
with a simple example, and then introduce additional features.

The simplest block encoding starts by defining two color ``endpoints''. The
endpoints define two colors, and a number of additional colors are generated
by interpolating between them. We can define these colors using 1, 2, 3,
or 4 components (usually corresponding to _R_, _RG_, _RGB_ and _RGBA_ textures),
and using low or high dynamic range.

We then store a color interpolant weight for each texel in the image, which
specifies how to calculate the color to use. From this, a weighted average
of the two endpoint colors is used to generate the intermediate color,
which is the returned color for this texel.
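
As an informative illustration of the weighted average described above, the following sketch interpolates between two endpoint colors. The function name ++interpolate_endpoints++ and the use of a floating-point weight normalized to 0..1 are assumptions made for readability; this is not the exact integer arithmetic a conforming decoder performs.

```python
def interpolate_endpoints(e0, e1, weight):
    """Return the weighted average of two endpoint colors.

    e0, e1 -- endpoint colors as equal-length tuples of floats
    weight -- interpolant in the range 0.0..1.0 (0 selects e0, 1 selects e1)
    """
    if len(e0) != len(e1):
        raise ValueError("endpoints must have the same number of components")
    return tuple((1.0 - weight) * a + weight * b for a, b in zip(e0, e1))


# Two RGBA endpoints: opaque black and opaque white.
black = (0.0, 0.0, 0.0, 1.0)
white = (1.0, 1.0, 1.0, 1.0)
print(interpolate_endpoints(black, white, 0.25))  # (0.25, 0.25, 0.25, 1.0)
```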

There are several different ways of specifying the endpoint colors, and the
weights, but once they have been defined, calculation of the texel colors
proceeds identically for all of them. Each block is free to choose whichever
encoding scheme best represents its color endpoints, within the constraint
that all the data fits within the 128 bit block.

For blocks which have a large number of texels (e.g. a 12{times}12 block), there is
not enough space to explicitly store a weight for every texel. In this case,
a sparser grid with fewer weights is stored, and interpolation is used to
determine the effective weight to be used for each texel position. This allows
very low bit rates to be used with acceptable quality. This can also be used
to more efficiently encode blocks with low detail, or with strong vertical
or horizontal features.

For blocks which have a mixture of disparate colors, a single line in the
color space is not a good fit to the colors of the pixels in the original
image. It is therefore possible to partition the texels into multiple sets,
the pixels within each set having similar colors. For each of these
``partitions'', we specify separate endpoint pairs, and choose which pair of
endpoints to use for a particular texel by looking up the _partition index_
from a partitioning pattern table. In ASTC, this partition table is actually
implemented as a function.

The endpoint encoding for each partition is independent.

For blocks which have uncorrelated channels -- for example an image with a
transparency mask, or an image used as a normal map -- it may be necessary
to specify two weights for each texel. Interpolation between the components
of the endpoint colors can then proceed independently for each ``plane'' of
the image. The assignment of channels to planes is selectable.

Since each of the above options is independent, it is possible to specify any
combination of channels, endpoint color encoding, weight encoding,
interpolation, multiple partitions and single or dual planes.

Since these values are specified per block, it is important that they are
represented with the minimum possible number of bits. As a result, these
values are packed together in ways which can be difficult to read, but
which are nevertheless highly amenable to hardware decode.

All of the values used as weights and color endpoint values can be specified
with a variable number of bits. The encoding scheme used allows a fine-grained
trade-off between weight bits and color endpoint bits using ``integer
sequence encoding''. This can pack adjacent values together, allowing us to
use fractional numbers of bits per value.
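
To see how packing adjacent values yields fractional bit counts, the informative sketch below compares the average cost of a packed base-3 or base-5 digit with the theoretical minimum. The group sizes used (five trits sharing 8 bits, three quints sharing 7 bits) are stated here as assumptions of the sketch, and the helper name ++bits_per_value++ is hypothetical.

```python
import math


def bits_per_value(packed_bits, values_per_group):
    """Average storage cost, in bits, of one value in a packed group."""
    return packed_bits / values_per_group


# Assumption for illustration: five trits share 8 bits, three quints share 7 bits.
trit_cost = bits_per_value(8, 5)
quint_cost = bits_per_value(7, 3)

# A group only fits if the number of combinations does not exceed 2^bits.
print(3 ** 5, "combinations fit in", 2 ** 8)
print(5 ** 3, "combinations fit in", 2 ** 7)

# Compare against the information-theoretic minimum log2(base).
print(trit_cost, math.log2(3))
print(quint_cost, math.log2(5))
```

Under these assumptions a trit costs 1.6 bits rather than the 2 bits needed to store it individually.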

Finally, a block may be just a single color. This is a so-called ``void
extent block'' and has a special coding which also allows it to identify
nearby regions of single color. This may be used to short-circuit fetching of
what would be identical blocks, and further reduce memory bandwidth.


<<<

=== sRGB, Linear LDR and HDR Operation Modes

The decoding process can be simplified if it is known in advance that
sRGB output or HDR support is required.
This selection is therefore included as part of the global configuration.
Implementations must support at least sRGB operation mode, and either or
both of linear LDR and HDR operation mode as described in <<astc_ldr_profile>>
and <<astc_hdr_profile>>; linear LDR operation mode is non-optional unless
an implementation supports HDR operation mode and no interface is provided
to select between linear LDR and HDR operation modes.
Decoders may not support decode modes other than ++decode_float16++, and
may substitute HDR operation mode for a requested linear LDR operation mode
if not constrained by ++decode_unorm8++ support.

The operation modes differ in various ways, as shown in
<<astc-differences-between-ldr-and-hdr-modes>>.

[[astc-differences-between-ldr-and-hdr-modes]]
.ASTC differences between sRGB, linear LDR and HDR operation modes
[options="header"]
|======================
|  | sRGB operation mode | Linear LDR operation mode | HDR operation mode
| Endpoint decoding precision | 8 bits | 16 bits | 16 bits
| _HDR endpoint mode_ results | Error color | Error color | As decoded
|======================

Except in sRGB mode, the type of the values returned by the decoding process
for different operation modes is determined by the decode mode as shown in
<<astc_decode_modes>>.

[[astc_decode_modes]]
.ASTC decode modes
[options="header",cols="1,2,1"]
|================
| Decode mode | Linear LDR operation mode | HDR operation mode
| ++decode_float16++ 2+| Vector of FP16 values
| ++decode_unorm8++ | Vector of 8-bit unsigned normalized values | Invalid
| ++decode_rgb9e5++ 2+| Vector using a shared exponent format
|================

++decode_float16++ returns a value encoded as <<16bitfp,16-bit floating point>>.

++decode_unorm8++ returns a value encoded as an unsigned, normalized 8-bit value
-- that is, the encoded value in the range 0..1.0 is the ++unorm++ integer value
divided by 255.

++decode_rgb9e5++ returns an RGB result encoded as a 32-bit value which contains
three channel mantissae with a shared exponent, as provided as an example in
<<ConversionFormulae>> (and, were it written to memory, described by the example
descriptor block <<exampledescriptor_e5b9g9r9>>).

For sRGB operation mode, the decode mode is ignored, and the decoding always
returns a vector of 8-bit unsigned normalized values.
To treat the result as linear-light results, the decoded values for
_R&prime;_, _G&prime;_ and _B&prime;_ (in the range 0..1) are then
converted with the <<TRANSFER_SRGB_EOTF,sRGB EOTF>>.
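
The conversion described above can be sketched as follows. This is an informative example only: the function names are hypothetical, and the piecewise formula is the standard sRGB EOTF written out in floating point. Alpha is normalized but not passed through the transfer function, since the sRGB transfer function applies only to the _RGB_ channels.

```python
def unorm8_to_float(value):
    """Convert an 8-bit unsigned normalized integer to the range 0.0..1.0."""
    if not 0 <= value <= 255:
        raise ValueError("value must be in the range 0..255")
    return value / 255.0


def srgb_eotf(encoded):
    """Convert a nonlinear sRGB value in 0.0..1.0 to linear light."""
    if encoded <= 0.04045:
        return encoded / 12.92
    return ((encoded + 0.055) / 1.055) ** 2.4


def srgb_texel_to_linear(r8, g8, b8, a8):
    """Linearize R', G' and B'; alpha is only normalized."""
    rgb = tuple(srgb_eotf(unorm8_to_float(c)) for c in (r8, g8, b8))
    return rgb + (unorm8_to_float(a8),)


print(srgb_texel_to_linear(255, 0, 128, 255))
```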

Using the ++decode_unorm8++ decoding mode in HDR operation mode gives
undefined results.

NOTE: Some implementations do not provide an explicit way to distinguish
linear LDR operation mode from HDR operation mode, and automatically
select HDR behavior in ++decode_float16++ or ++decode_rgb9e5++ decode modes.
Such implementations may still support ++decode_unorm8++ by implicitly
selecting linear LDR operation mode when this decode mode is chosen.

The error color for each operation mode is determined as shown in
<<astc_error_colors>>.

[[astc_error_colors]]
.ASTC error colors
[options="header"]
|================
| Decode mode        | sRGB operation mode | Linear LDR operation mode | HDR operation mode
| ++decode_float16++ | Magenta   | Magenta  or _NaNs_  | Magenta or _NaNs_
| ++decode_unorm8++  | Magenta   | Magenta             | Invalid
| ++decode_rgb9e5++  | Magenta   | Magenta             | Magenta
|================

The magenta error color is opaque fully-saturated magenta
(_R_,_G_,_B_,_A_)&nbsp;=&nbsp;(1.0,&nbsp;0.0,&nbsp;1.0,&nbsp;1.0).
This has been chosen as it is much more noticeable than black or white, and occurs
far less often in valid images.

The _NaNs_ error color is a vector of four _NaNs_
(_R_,_G_,_B_,_A_) = (_NaN_, _NaN_, _NaN_, _NaN_).
In this case, the recommended _NaN_ value returned is the bit pattern 0xFFFF.
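
The two error colors can be written down directly, as in the informative sketch below. The constant and function names are hypothetical; the sketch uses the ++e++ (half-precision) format code of the Python ++struct++ module to confirm that the bit pattern 0xFFFF is indeed a _NaN_.

```python
import math
import struct

# Opaque fully-saturated magenta, as (R, G, B, A).
MAGENTA = (1.0, 0.0, 1.0, 1.0)

# Recommended NaN bit pattern for the NaNs error color.
NAN_FP16_BITS = 0xFFFF


def fp16_bits_to_float(bits):
    """Reinterpret a 16-bit pattern as an IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


nan_error_color = tuple(fp16_bits_to_float(NAN_FP16_BITS) for _ in range(4))
print(all(math.isnan(c) for c in nan_error_color))  # True
```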

The error color is returned as an informative response to invalid
conditions, including invalid block encodings or use of reserved _endpoint
modes_.
Future, forward-compatible extensions to ASTC
may define valid interpretations of these conditions, which will decode to
some other color.
Therefore, encoders and applications must not rely on invalid encodings as
a way of generating the error color.

=== Configuration Summary

The global configuration data for the format are as follows:

*   Block dimension (2D or 3D)
*   Block footprint size
*   sRGB, linear LDR or HDR operation mode*
*   Decode mode

(*Subject to an implementation implicitly promoting linear LDR operation mode to HDR operation mode.)

The data specified per block are as follows:

*   Texel weight grid size
*   Texel weight range
*   Texel weight values
*   Number of partitions
*   Partition pattern index
*   Color _endpoint modes_ (includes LDR or HDR selection)
*   Color endpoint data
*   Number of planes
*   Plane-to-channel assignment

<<<

=== Decode Procedure

To decode one texel:

-------------------
(Optimization: If within known void-extent, immediately return single color)

Find block containing texel
Read block mode
If void-extent block, store void extent and immediately return single color

For each plane in image
  If block mode requires infill
    Find and decode stored weights adjacent to texel, unquantize and interpolate
  Else
    Find and decode weight for texel, and unquantize

Read number of partitions
If number of partitions > 1
  Read partition table pattern index
  Look up partition number from pattern

Read color endpoint mode and endpoint data for selected partition
Unquantize color endpoints
Interpolate color endpoints using weight (or weights in dual-plane mode)
Return interpolated color
-------------------

<<<

=== Block Determination and Bit Rates

The block footprint is a global setting for any given texture, and is
therefore not encoded in the individual blocks.

For 2D textures, the block footprint's width and height are selectable
from a number of predefined sizes, namely 4, 5, 6, 8, 10 and 12 pixels.

For square and nearly-square blocks, this gives the bit rates in
<<astc_footprint_2D>>.

[[astc_footprint_2D]]
.ASTC 2D footprint and bit rates
[cols="1,1,1,1",width="50%"]
|======================
2+^| *Footprint* .2+^.^| *Bit Rate* .2+^.^| *Increment*
^| *Width* ^| *Height*
^| 4         ^| 4         ^| 8.00     ^| 125%
^| 5         ^| 4         ^| 6.40     ^| 125%
^| 5         ^| 5         ^| 5.12     ^| 120%
^| 6         ^| 5         ^| 4.27     ^| 120%
^| 6         ^| 6         ^| 3.56     ^| 114%
^| 8         ^| 5         ^| 3.20     ^| 120%
^| 8         ^| 6         ^| 2.67     ^| 105%
^| 10        ^| 5         ^| 2.56     ^| 120%
^| 10        ^| 6         ^| 2.13     ^| 107%
^| 8         ^| 8         ^| 2.00     ^| 125%
^| 10        ^| 8         ^| 1.60     ^| 125%
^| 10        ^| 10        ^| 1.28     ^| 120%
^| 12        ^| 10        ^| 1.07     ^| 120%
^| 12        ^| 12        ^| 0.89     ^|
|=============

The ``Increment'' column indicates the ratio of bit rate against the next
lower available rate. A consistent value in this column indicates an even
spread of bit rates.

For 3D textures, the block footprint's width, height and depth are
selectable from a number of predefined sizes, namely 3, 4, 5, and 6 pixels.

For cubic and near-cubic blocks, this gives the bit rates in
<<astc_footprint_3D>>.

[[astc_footprint_3D]]
.ASTC 3D footprint and bit rates
[cols="1,1,1,1,1",width="60%"]
|======================
3+^| *Block Footprint* .2+^.^| *Bit Rate* .2+^.^| *Increment*
^| *Width* ^| *Height* ^| *Depth*
^| 3     ^| 3      ^| 3     ^| 4.74 ^|   133%
^| 4     ^| 3      ^| 3     ^| 3.56 ^|   133%
^| 4     ^| 4      ^| 3     ^| 2.67 ^|   133%
^| 4     ^| 4      ^| 4     ^| 2.00 ^|   125%
^| 5     ^| 4      ^| 4     ^| 1.60 ^|   125%
^| 5     ^| 5      ^| 4     ^| 1.28 ^|   125%
^| 5     ^| 5      ^| 5     ^| 1.02 ^|   120%
^| 6     ^| 5      ^| 5     ^| 0.85 ^|   120%
^| 6     ^| 6      ^| 5     ^| 0.71 ^|   120%
^| 6     ^| 6      ^| 6     ^| 0.59 ^|
|======================

The full profile supports only those block footprints listed in
<<astc_footprint_2D>> and <<astc_footprint_3D>>. Other block sizes are not supported.
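
The bit rates in both tables follow from dividing the fixed 128-bit block size by the number of texels in the footprint. The informative sketch below reproduces a few table entries; the function name ++bit_rate++ is hypothetical.

```python
BLOCK_BITS = 128  # every ASTC block occupies 128 bits


def bit_rate(*footprint):
    """Bits per texel for a 2D (w, h) or 3D (w, h, d) block footprint."""
    texels = 1
    for dimension in footprint:
        texels *= dimension
    return BLOCK_BITS / texels


for footprint in [(4, 4), (6, 6), (12, 12), (3, 3, 3), (6, 6, 6)]:
    print(footprint, round(bit_rate(*footprint), 2))
```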

For images which are not an integer multiple of the block size, additional
texels are added to the edges with maximum X and Y (and Z for 3D textures).
These texels may be any color, as they will not be accessed.

Although the block dimensions are not all powers of two, it is possible to
calculate block addresses and pixel addresses within the block, for legal
image sizes, without undue complexity.

Given an image which is _W_ {times} _H_ {times} _D_ pixels in size, with block size
_w_ {times} _h_ {times} _d_, the size of the image in blocks is:

[latexmath]
++++++++++++++++++++++
\begin{align*}
\textrm{B}_\textrm{w} & = \left\lceil { W \over w } \right\rceil \\
\textrm{B}_\textrm{h} & = \left\lceil { H \over h } \right\rceil \\
\textrm{B}_\textrm{d} & = \left\lceil { D \over d } \right\rceil
\end{align*}
++++++++++++++++++++++

For a 3D image built from 2D slices, each 2D slice is a single texel thick,
so that for an image which is _W_ {times} _H_ {times} _D_ pixels in size, with block size
_w_ {times} _h_, the size of the image in blocks is:

[latexmath]
++++++++++++++++++++++
\begin{align*}
\textrm{B}_\textrm{w} & = \left\lceil { W \over w } \right\rceil \\
\textrm{B}_\textrm{h} & = \left\lceil { H \over h } \right\rceil \\
\textrm{B}_\textrm{d} & = D
\end{align*}
++++++++++++++++++++++
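
The two sets of formulae above can be sketched in code as follows. This is an informative example: the function names are hypothetical, and passing ++d=1++ models the case of a 3D image built from 2D slices. The storage size assumes 16 bytes (128 bits) per block.

```python
def blocks_in_image(W, H, D, w, h, d=1):
    """Return (Bw, Bh, Bd), rounding each dimension up to whole blocks.

    With d=1 this also covers a 3D image built from 2D slices, where Bd = D.
    """
    def ceil_div(n, m):
        return -(-n // m)

    return ceil_div(W, w), ceil_div(H, h), ceil_div(D, d)


def compressed_size_bytes(W, H, D, w, h, d=1):
    """Total storage at 16 bytes (128 bits) per block."""
    Bw, Bh, Bd = blocks_in_image(W, H, D, w, h, d)
    return Bw * Bh * Bd * 16


print(blocks_in_image(100, 50, 1, 6, 6))        # (17, 9, 1)
print(compressed_size_bytes(100, 50, 1, 6, 6))  # 2448
```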

=== Block Layout

Each block in the image is stored as a single 128-bit block in memory. These
blocks are laid out in raster order, starting with the block at (0,&nbsp;0,&nbsp;0), then
ordered sequentially by X, Y and finally Z (if present). They are aligned to
128-bit boundaries in memory.

The bits in the block are labeled in little-endian order -- the byte at the
lowest address contains bits 0..7. Bit 0 is the least significant bit in the
byte.
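
The raster ordering and bit labeling described above can be sketched as follows. The helper names ++block_offset++ and ++read_bits++ are hypothetical, and the sketch assumes the compressed data starts at offset 0 with no padding between blocks.

```python
BLOCK_BYTES = 16  # 128 bits


def block_offset(bx, by, bz, blocks_wide, blocks_high):
    """Byte offset of block (bx, by, bz) in raster order: X, then Y, then Z."""
    index = (bz * blocks_high + by) * blocks_wide + bx
    return index * BLOCK_BYTES


def read_bits(block, low_bit, count):
    """Extract `count` bits starting at bit `low_bit` of a 16-byte block.

    The byte at the lowest address holds bits 0..7, so the block is
    interpreted as one little-endian 128-bit integer.
    """
    if len(block) != BLOCK_BYTES:
        raise ValueError("an ASTC block is exactly 16 bytes")
    value = int.from_bytes(block, "little")
    return (value >> low_bit) & ((1 << count) - 1)


block = bytes([0xFC, 0x01] + [0] * 14)
print(block_offset(2, 1, 0, blocks_wide=17, blocks_high=9))  # 304
print(hex(read_bits(block, 0, 9)))                           # 0x1fc
```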

Each block has the same basic layout, shown in <<astc-block-layout>>.

[[astc-block-layout]]
.ASTC block layout
[width="97%",cols="1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1"]
|====================
^| ~127~ ^| ~126~ ^| ~125~ ^| ~124~ ^| ~123~ ^| ~122~ ^| ~121~ ^| ~120~ ^| ~119~ ^| ~118~ ^| ~117~ ^| ~116~ ^| ~115~ ^| ~114~ ^| ~113~ ^| ~112~
12+| &nbsp; Texel weight data (variable width) 4+| &nbsp; Fill direction latexmath:[$\rightarrow$]
^| ~111~ ^| ~110~ ^| ~109~ ^| ~108~ ^| ~107~ ^| ~106~ ^| ~105~ ^| ~104~ ^| ~103~ ^| ~102~ ^| ~101~ ^| ~100~ ^| ~99~ ^| ~98~ ^| ~97~ ^| ~96~
16+| &nbsp; Texel weight data
^| ~95~ ^| ~94~ ^| ~93~ ^| ~92~ ^| ~91~ ^| ~90~ ^| ~89~ ^| ~88~ ^| ~87~ ^| ~86~ ^| ~85~ ^| ~84~ ^| ~83~ ^| ~82~ ^| ~81~ ^| ~80~
16+| &nbsp; Texel weight data
^| ~79~ ^| ~78~ ^| ~77~ ^| ~76~ ^| ~75~ ^| ~74~ ^| ~73~ ^| ~72~ ^| ~71~ ^| ~70~ ^| ~69~ ^| ~68~ ^| ~67~ ^| ~66~ ^| ~65~ ^| ~64~
16+| &nbsp; Texel weight data
^| ~63~ ^| ~62~ ^| ~61~ ^| ~60~ ^| ~59~ ^| ~58~ ^| ~57~ ^| ~56~ ^| ~55~ ^| ~54~ ^| ~53~ ^| ~52~ ^| ~51~ ^| ~50~ ^| ~49~ ^| ~48~
5+| 6+| &nbsp; More config data 5+|
^| ~47~ ^| ~46~ ^| ~45~ ^| ~44~ ^| ~43~ ^| ~42~ ^| ~41~ ^| ~40~ ^| ~39~ ^| ~38~ ^| ~37~ ^| ~36~ ^| ~35~ ^| ~34~ ^| ~33~ ^| ~32~
8+| &nbsp; latexmath:[$\leftarrow$] Fill direction 8+| &nbsp; Color endpoint data
^| ~31~ ^| ~30~ ^| ~29~ ^| ~28~ ^| ~27~ ^| ~26~ ^| ~25~ ^| ~24~ ^| ~23~ ^| ~22~ ^| ~21~ ^| ~20~ ^| ~19~ ^| ~18~ ^| ~17~ ^| ~16~
4+| 12+| &nbsp; Extra configuration data
^| ~15~ ^| ~14~ ^| ~13~ ^| ~12~ ^| ~11~ ^| ~10~ ^| ~9~ ^| ~8~ ^| ~7~ ^| ~6~ ^| ~5~ ^| ~4~ ^| ~3~ ^| ~2~ ^| ~1~ ^| ~0~
3+| &nbsp; Extra 2+| &nbsp; _Part_ 11+| &nbsp; _Block mode_
|====================

Since the size of the ``texel weight data'' field is variable, the
positions shown for the ``more config data'' field and ``color endpoint
data'' field are only representative and not fixed.

The _block mode_ field specifies how the texel weight data is encoded.

The _part_ field specifies the number of partitions, minus one. If
dual-plane mode is enabled, the number of partitions must be 3 or fewer.
If 4 partitions are specified, the error color is returned for all
texels in the block.

The size and layout of the extra configuration data depends on the
number of partitions, and the number of planes in the image, as shown
in <<astc-single-partition-layout>> and <<astc-multi-partition-layout>>
(only the bottom 32 bits are shown), or whether the texel block is
a void-extent block, as described in <<astc-void-extent-blocks>>.

==== ASTC Single-Partition Encoding

[[astc-single-partition-layout]]
.ASTC single-partition block layout
[width="97%",cols="1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1"]
|====================
^| ~31~ ^| ~30~ ^| ~29~ ^| ~28~ ^| ~27~ ^| ~26~ ^| ~25~ ^| ~24~ ^| ~23~ ^| ~22~ ^| ~21~ ^| ~20~ ^| ~19~ ^| ~18~ ^| ~17~ ^| ~16~
15+| &nbsp; Color endpoint data ^| _CEM_
^| ~15~ ^| ~14~ ^| ~13~ ^| ~12~ ^| ~11~ ^| ~10~ ^| ~9~ ^| ~8~ ^| ~7~ ^| ~6~ ^| ~5~ ^| ~4~ ^| ~3~ ^| ~2~ ^| ~1~ ^| ~0~
3+| &nbsp; _CEM_ ^| 0 ^| 0 11+| &nbsp; _Block mode_
|====================

_CEM_ is the _color endpoint mode_ field, which determines how the color
endpoint data is encoded.

If dual-plane mode is active, the _color component selector_ bits appear
directly below the weight bits.
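
The fixed-position fields of a single-partition block can be extracted as in the informative sketch below. The function name is hypothetical, and the sketch assumes the caller has already established that the block is not a void-extent block.

```python
def parse_single_partition_header(block):
    """Extract (block_mode, partitions, cem) from a 16-byte block.

    Only meaningful when the block is not a void-extent block and the
    part field is 0 (a single partition).
    """
    value = int.from_bytes(block, "little")
    block_mode = value & 0x7FF               # bits [10..0]
    partitions = ((value >> 11) & 0x3) + 1   # part field stores count minus one
    if partitions != 1:
        raise ValueError("block uses the multi-partition layout")
    cem = (value >> 13) & 0xF                # bits [16..13]
    return block_mode, partitions, cem


# Example block: block mode 0x053, one partition, CEM 8.
example = (0x053 | (8 << 13)).to_bytes(16, "little")
print(parse_single_partition_header(example))  # (83, 1, 8)
```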

==== ASTC Multi-Partition Encoding

[[astc-multi-partition-layout]]
.ASTC multi-partition block layout
[width="97%",cols="1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1"]
|====================
^| ~31~ ^| ~30~ ^| ~29~ ^| ~28~ ^| ~27~ ^| ~26~ ^| ~25~ ^| ~24~ ^| ~23~ ^| ~22~ ^| ~21~ ^| ~20~ ^| ~19~ ^| ~18~ ^| ~17~ ^| ~16~
3+| 6+| &nbsp; _CEM_ 7+| &nbsp; _Partition index_
^| ~15~ ^| ~14~ ^| ~13~ ^| ~12~ ^| ~11~ ^| ~10~ ^| ~9~ ^| ~8~ ^| ~7~ ^| ~6~ ^| ~5~ ^| ~4~ ^| ~3~ ^| ~2~ ^| ~1~ ^| ~0~
3+| &nbsp; _Partition index_ 2+| &nbsp; _Part_ 11+| &nbsp; _Block mode_
|====================

The _partition index_ field specifies which partition layout to use. _CEM_ is
the first 6 bits of _color endpoint mode_ information for the various
partitions. For _endpoint modes_ which require more than 6 bits of _CEM_ data, the
additional bits appear at a variable position directly beneath the texel
weight data.

If dual-plane mode is active, the _color component selector_ bits then appear
directly below the additional _CEM_ bits.
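
For a multi-partition block, the fields at fixed positions can be read as in the informative sketch below. The function name is hypothetical; any additional _CEM_ bits and the _color component selector_ bits sit at variable positions and are not handled here.

```python
def parse_multi_partition_header(block):
    """Extract (partitions, partition_index, cem_low) from a 16-byte block.

    cem_low holds only the first 6 bits of the CEM information.
    """
    value = int.from_bytes(block, "little")
    partitions = ((value >> 11) & 0x3) + 1     # part field stores count minus one
    if partitions == 1:
        raise ValueError("block uses the single-partition layout")
    partition_index = (value >> 13) & 0x3FF    # bits [22..13]
    cem_low = (value >> 23) & 0x3F             # bits [28..23]
    return partitions, partition_index, cem_low


# Example block: 2 partitions, partition index 0x155, CEM bits 100101.
example = ((1 << 11) | (0x155 << 13) | (0b100101 << 23)).to_bytes(16, "little")
print(parse_multi_partition_header(example))
```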

==== ASTC Void-Extent Block Encoding

The final special case is that if bits [8..0] of the block are ``111111100'',
then the block is a void-extent block, which has a separate encoding
described in <<astc-void-extent-blocks>>.
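
The void-extent test above amounts to a comparison of the low nine bits of the block, as in this informative sketch (the names are hypothetical):

```python
VOID_EXTENT_PATTERN = 0b111111100  # required value of bits [8..0]


def is_void_extent(block):
    """True if bits [8..0] of a 16-byte block mark it as a void-extent block."""
    value = int.from_bytes(block, "little")
    return (value & 0x1FF) == VOID_EXTENT_PATTERN


print(is_void_extent(bytes([0xFC, 0x01] + [0] * 14)))  # True
print(is_void_extent(bytes(16)))                       # False
```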

<<<

=== Block Mode

The _block mode_ field specifies the width, height and depth of the grid of
weights, what range of values they use, and whether dual weight planes are
present. Since some of these are not represented using powers of two (there
are 12 possible weight widths, for example), and not all combinations are
allowed, this is not a simple bit packing. However, it can be unpacked
quickly in hardware.

The weight ranges are encoded using a 3-bit range value {rho}, which is
interpreted together with a low/high-precision bit _P_, as shown in
<<astc-weight-range-encodings>>.
Each weight value is encoded using the specified number of Trits, Quints
and Bits. The details of this encoding can be found in
<<astc-integer-sequence-encoding>>.

[[astc-weight-range-encodings]]
.ASTC weight range encodings
[width="97%",cols="1,2,1,1,1,2,1,1,1"]
|====================
1.2+^.^| *{rho}^2..0^* 4+^.^| *Low-precision range (_P_=0)* 4+^.^| *High-precision range (_P_=1)*
^.^| *Weight range* ^.^| *Trits* ^.^| *Quints* ^.^| *Bits* ^.^| *Weight range* ^.^| *Trits* ^.^| *Quints* ^.^| *Bits*
^| 000 4+^| Invalid 4+^| Invalid
^| 001 4+^| Invalid 4+^| Invalid
^| 010 ^| 0..1 ^| ^| ^| 1 ^| 0..9 ^| ^| 1 ^| 1
^| 011 ^| 0..2 ^| 1 ^| ^| ^| 0..11 ^| 1 ^| ^| 2
^| 100 ^| 0..3 ^| ^| ^| 2 ^| 0..15 ^| ^| ^| 4
^| 101 ^| 0..4 ^| ^| 1 ^| ^| 0..19 ^| ^| 1 ^| 2
^| 110 ^| 0..5 ^| 1 ^| ^| 1 ^| 0..23 ^| 1 ^| ^| 3
^| 111 ^| 0..7 ^| ^| ^| 3 ^| 0..31 ^| ^| ^| 5
|====================
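
The weight range table can be expressed as a lookup keyed on {rho} and _P_, as in the informative sketch below. The names are hypothetical. The loop at the end checks that each range size equals 3^trits^ {times} 5^quints^ {times} 2^bits^, which is a property of the table entries themselves.

```python
# (rho, P) -> (maximum weight value, trits, quints, bits)
WEIGHT_RANGES = {
    (0b010, 0): (1, 0, 0, 1),
    (0b011, 0): (2, 1, 0, 0),
    (0b100, 0): (3, 0, 0, 2),
    (0b101, 0): (4, 0, 1, 0),
    (0b110, 0): (5, 1, 0, 1),
    (0b111, 0): (7, 0, 0, 3),
    (0b010, 1): (9, 0, 1, 1),
    (0b011, 1): (11, 1, 0, 2),
    (0b100, 1): (15, 0, 0, 4),
    (0b101, 1): (19, 0, 1, 2),
    (0b110, 1): (23, 1, 0, 3),
    (0b111, 1): (31, 0, 0, 5),
}


def weight_range(rho, p):
    """Return (max_value, trits, quints, bits), or None for invalid encodings."""
    return WEIGHT_RANGES.get((rho, p))


# Consistency check: each range holds 3^trits * 5^quints * 2^bits values.
for max_value, trits, quints, bits in WEIGHT_RANGES.values():
    assert max_value + 1 == (3 ** trits) * (5 ** quints) * (2 ** bits)

print(weight_range(0b110, 1))  # (23, 1, 0, 3)
```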

For 2D blocks, the _block mode_ field is laid out as shown in
<<astc-2d-block-mode-layout>>.

[[astc-2d-block-mode-layout]]
.ASTC 2D _block mode_ layout, weight grid width and height
[width="97%",cols="1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1"]
|====================
^| ~10~ ^| ~9~ ^| ~8~ ^| ~7~ ^| ~6~ ^| ~5~ ^| ~4~ ^| ~3~ ^| ~2~ ^| ~1~ ^| ~0~ 2+^| *W~width~* 2+^| *W~height~* 3+^| *Notes*
^| _D~P~_ ^| _P_ 2+^| _W_ 2+^| _H_ ^| {rho}^0^ ^| 0 ^| 0 ^| {rho}^2^ ^| {rho}^1^ 2+^| _W_+4 2+^| _H_+2 3+|
^| _D~P~_ ^| _P_ 2+^| _W_ 2+^| _H_ ^| {rho}^0^ ^| 0 ^| 1 ^| {rho}^2^ ^| {rho}^1^ 2+^| _W_+8 2+^| _H_+2 3+|
^| _D~P~_ ^| _P_ 2+^| _H_ 2+^| _W_ ^| {rho}^0^ ^| 1 ^| 0 ^| {rho}^2^ ^| {rho}^1^ 2+^| _W_+2 2+^| _H_+8 3+|
^| _D~P~_ ^| _P_ ^| 0 ^| _H_ 2+^| _W_ ^| {rho}^0^ ^| 1 ^| 1 ^| {rho}^2^ ^| {rho}^1^ 2+^| _W_+2 2+^| _H_+6 3+|
^| _D~P~_ ^| _P_ ^| 1 ^| _W_ 2+^| _H_ ^| {rho}^0^ ^| 1 ^| 1 ^| {rho}^2^ ^| {rho}^1^ 2+^| _W_+2 2+^| _H_+2 3+|
^| _D~P~_ ^| _P_ ^| 0 ^| 0 2+^| _H_ ^| {rho}^0^ ^| {rho}^2^ ^| {rho}^1^ ^| 0 ^| 0 2+^| 12 2+^| _H_+2 3+|
^| _D~P~_ ^| _P_ ^| 0 ^| 1 2+^| _W_ ^| {rho}^0^ ^| {rho}^2^ ^| {rho}^1^ ^| 0 ^| 0 2+^| _W_+2 2+^| 12 3+|
^| _D~P~_ ^| _P_ ^| 1 ^| 1 ^| 0 ^| 0 ^| {rho}^0^ ^| {rho}^2^ ^| {rho}^1^ ^| 0 ^| 0 2+^| 6 2+^| 10 3+|
^| _D~P~_ ^| _P_ ^| 1 ^| 1 ^| 0 ^| 1 ^| {rho}^0^ ^| {rho}^2^ ^| {rho}^1^ ^| 0 ^| 0 2+^| 10 2+^| 6 3+|
2+^| _H_ ^| 1 ^| 0 2+^| _W_ ^| {rho}^0^ ^| {rho}^2^ ^| {rho}^1^ ^| 0 ^| 0 2+^| _W_+6 2+^| _H_+6 3+^| _D~P~_=0, _P_=0
^| x ^| x ^| 1 ^| 1 ^| 1 ^| 1 ^| 1 ^| 1 ^| 1 ^| 0 ^| 0 2+^| - 2+^| - 3+^| Void-extent
^| x ^| x ^| 1 ^| 1 ^| 1 ^| x ^| x ^| x ^| x ^| 0 ^| 0 2+^| - 2+^| - 3+^| Reserved*
^| x ^| x ^| x ^| x ^| x ^| x ^| x ^| 0 ^| 0 ^| 0 ^| 0 2+^| - 2+^| - 3+^| Reserved
|====================

Note that, because {rho} values 000 and 001 are invalid (see
<<astc-weight-range-encodings>>), bits {rho}^2^ and {rho}^1^ cannot both be
zero, which disambiguates the first five rows from the rest of the table.

Bit positions with a value of x are ignored for purposes of determining
if a block is a void-extent block or reserved, but may have defined
encodings for specific void-extent blocks.

The penultimate row of <<astc-2d-block-mode-layout>> is reserved only if
bits [5..2] are not all 1. If bits [5..2] are all 1, the _block mode_
instead encodes a void-extent block (as shown in the previous row).
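
The 2D _block mode_ table can be unpacked as in the informative sketch below, which follows the table row by row. The function name and the choice to return ++None++ for void-extent and reserved encodings are assumptions of the sketch. It does not check the decoded grid against the block footprint.

```python
def decode_block_mode_2d(mode):
    """Decode an 11-bit 2D block mode.

    Returns (width, height, rho, p, dual_plane) for the weight grid,
    or None for void-extent and reserved encodings.
    """
    dual_plane = (mode >> 10) & 1
    p = (mode >> 9) & 1
    a = (mode >> 5) & 0x3            # bits [6..5]
    rho = (mode >> 4) & 1            # rho bit 0 always sits in bit 4

    if mode & 0x3:                   # first five rows: rho bits 2..1 in bits [1..0]
        rho |= (mode & 0x3) << 1
        b = (mode >> 7) & 0x3        # bits [8..7]
        row = (mode >> 2) & 0x3
        if row == 0:
            width, height = b + 4, a + 2
        elif row == 1:
            width, height = b + 8, a + 2
        elif row == 2:
            width, height = a + 2, b + 8
        elif mode & 0x100:           # bit 8 set: bit 7 is the width field
            width, height = (b & 1) + 2, a + 2
        else:                        # bit 8 clear: bit 7 is the height field
            width, height = a + 2, (b & 1) + 6
    else:                            # other rows: rho bits 2..1 in bits [3..2]
        if ((mode >> 2) & 0x3) == 0:
            return None              # reserved
        rho |= ((mode >> 2) & 0x3) << 1
        selector = (mode >> 7) & 0x3
        if selector == 0:
            width, height = 12, a + 2
        elif selector == 1:
            width, height = a + 2, 12
        elif selector == 2:
            width, height = a + 6, ((mode >> 9) & 0x3) + 6
            dual_plane, p = 0, 0     # bits [10..9] hold the height field here
        elif a == 0:
            width, height = 6, 10
        elif a == 1:
            width, height = 10, 6
        else:
            return None              # void-extent or reserved
    return width, height, rho, p, dual_plane


print(decode_block_mode_2d(0x041))  # (4, 4, 2, 0, 0)
```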

<<<

For 3D blocks, the _block mode_ field is laid out as shown in <<astc-3d-block-mode-layout>>.

[[astc-3d-block-mode-layout]]
.ASTC 3D _block mode_ layout, weight grid width, height and depth
[width="97%",cols="1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1"]
|====================
^| ~10~ ^| ~9~ ^| ~8~ ^| ~7~ ^| ~6~ ^| ~5~ ^| ~4~ ^| ~3~ ^| ~2~ ^| ~1~ ^| ~0~ 2+^| *W~width~* 2+^| *W~height~* 2+^| *W~depth~* 3+^| *Notes*
^| _D~P~_  ^| _P_ 2+^| _H_ 2+^| _W_ ^| {rho}^0^ 2+^| _D_ ^| {rho}^2^ ^| {rho}^1^ 2+^| _W_+2 2+^| _H_+2 2+^| _D_+2 3+^|
2+^| _H_ ^| 0 ^| 0 2+^| _D_ ^| {rho}^0^ ^| {rho}^2^ ^| {rho}^1^ ^| 0 ^| 0 2+^| 6 2+^| _H_+2 2+^| _D_+2 3+^| _D~P~_=0, _P_=0
2+^| _D_ ^| 0 ^| 1 2+^| _W_ ^| {rho}^0^ ^| {rho}^2^ ^| {rho}^1^ ^| 0 ^| 0 2+^| _W_+2 2+^| 6 2+^| _D_+2 3+^| _D~P~_=0, _P_=0
2+^| _H_ ^| 1 ^| 0 2+^| _W_ ^| {rho}^0^ ^| {rho}^2^ ^| {rho}^1^ ^| 0 ^| 0 2+^| _W_+2 2+^| _H_+2 2+^| 6 3+^| _D~P~_=0, _P_=0
^| _D~P~_ ^| _P_ ^| 1 ^| 1 ^| 0 ^| 0  ^| {rho}^0^ ^| {rho}^2^ ^| {rho}^1^ ^| 0 ^| 0 2+^| 6 2+^| 2 2+^| 2 3+^|
^| _D~P~_ ^| _P_ ^| 1 ^| 1 ^| 0 ^| 1  ^| {rho}^0^ ^| {rho}^2^ ^| {rho}^1^ ^| 0 ^| 0 2+^| 2 2+^| 6 2+^| 2 3+^|
^| _D~P~_ ^| _P_ ^| 1 ^| 1 ^| 1 ^| 0  ^| {rho}^0^ ^| {rho}^2^ ^| {rho}^1^ ^| 0 ^| 0 2+^| 2 2+^| 2 2+^| 6 3+^|
^| x ^| x ^| 1 ^| 1 ^| 1 ^| 1  ^| 1  ^| 1  ^| 1  ^| 0 ^| 0 2+^| - 2+^| - 2+^| - 3+^| Void-extent
^| x ^| x ^| 1 ^| 1 ^| 1 ^| 1  ^| x  ^| x  ^| x  ^| 0 ^| 0 2+^| - 2+^| - 2+^| - 3+^| Reserved*
^| x ^| x ^| x ^| x ^| x ^| x  ^| x  ^| 0  ^| 0  ^| 0 ^| 0 2+^| - 2+^| - 2+^| - 3+^| Reserved
|====================

The _D~P~_ bit is set to indicate dual-plane mode.
In dual-plane mode, the maximum allowed number of partitions is 3.

The penultimate row of <<astc-3d-block-mode-layout>> is reserved only
if bits [4..2] are not all 1. If bits [4..2] are all 1, the _block mode_
instead encodes a void-extent block (as shown in the previous row).

The size of the weight grid in each dimension must be less than or equal
to the corresponding dimension of the block footprint.
If the grid size is greater than the footprint dimension in any axis, then
this is an illegal block encoding and all texels will decode to the error
color.

=== Color Endpoint Mode

If the texel block encodes a single partition, the _color endpoint mode_ (_CEM_)
field stores one of 16 possible values. Each of these specifies how many raw
data values are encoded, and how to convert these raw values into two _RGBA_ color
endpoints. They can be summarized as shown in <<astc-color-endpoint-modes>>.

[[astc-color-endpoint-modes]]
.ASTC _color endpoint modes_
[cols="2,11,2",options="header",width="55%"]
|====================
^| _CEM_ | &nbsp; Description                        ^| Class
>|  0 &nbsp;&nbsp; | &nbsp; LDR Luminance, direct              ^| 0
>|  1 &nbsp;&nbsp; | &nbsp; LDR Luminance, base+offset         ^| 0
>|  2 &nbsp;&nbsp; | &nbsp; HDR Luminance, large range         ^| 0
>|  3 &nbsp;&nbsp; | &nbsp; HDR Luminance, small range         ^| 0
>|  4 &nbsp;&nbsp; | &nbsp; LDR Luminance+Alpha, direct        ^| 1
>|  5 &nbsp;&nbsp; | &nbsp; LDR Luminance+Alpha, base+offset   ^| 1
>|  6 &nbsp;&nbsp; | &nbsp; LDR _RGB_, base+scale                ^| 1
>|  7 &nbsp;&nbsp; | &nbsp; HDR _RGB_, base+scale                ^| 1
>|  8 &nbsp;&nbsp; | &nbsp; LDR _RGB_, direct                    ^| 2
>|  9 &nbsp;&nbsp; | &nbsp; LDR _RGB_, base+offset               ^| 2
>| 10 &nbsp;&nbsp; | &nbsp; LDR _RGB_, base+scale plus two _A_     ^| 2
>| 11 &nbsp;&nbsp; | &nbsp; HDR _RGB_, direct                    ^| 2
>| 12 &nbsp;&nbsp; | &nbsp; LDR _RGBA_, direct                   ^| 3
>| 13 &nbsp;&nbsp; | &nbsp; LDR _RGBA_, base+offset              ^| 3
>| 14 &nbsp;&nbsp; | &nbsp; HDR _RGB_, direct + LDR Alpha        ^| 3
>| 15 &nbsp;&nbsp; | &nbsp; HDR _RGB_, direct + HDR Alpha        ^| 3
|====================
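
The Class column in the table is simply the top two bits of the 4-bit _CEM_ value, and the set of _HDR endpoint modes_ can be read off the descriptions. The informative sketch below captures both; the names are hypothetical. The raw value count of 2 {times} (class + 1) is not stated in this section and is included as an assumption of the sketch.

```python
# CEM values whose description in the table begins with "HDR".
HDR_ENDPOINT_MODES = frozenset({2, 3, 7, 11, 14, 15})


def cem_class(cem):
    """Class of a 4-bit color endpoint mode: the top two bits of the value."""
    if not 0 <= cem <= 15:
        raise ValueError("CEM must be in the range 0..15")
    return cem >> 2


def is_hdr_endpoint_mode(cem):
    """True if the mode is one of the HDR endpoint modes."""
    return cem in HDR_ENDPOINT_MODES


def raw_value_count(cem):
    """Assumed number of raw data values for a mode: 2 per class step."""
    return 2 * (cem_class(cem) + 1)


for cem in (0, 6, 11, 15):
    print(cem, cem_class(cem), is_hdr_endpoint_mode(cem), raw_value_count(cem))
```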

If the texel block encodes multiple partitions, the _CEM_ field is of variable width,
from 6 to 14 bits. The lowest 2 bits of the _CEM_ field specify how the _endpoint mode_
for each partition is calculated as shown in <<astc-multi-partition-color-endpoint-modes>>.

[[astc-multi-partition-color-endpoint-modes]]
.ASTC multi-partition _color endpoint modes_
[cols="1,5",options="header",width="45%"]
|====================
^| Value | &nbsp; Meaning
^| 00 | &nbsp; All color endpoint pairs are of the same

&nbsp; type; a full 4-bit _CEM_ is stored in block

&nbsp; bits [28..25]
    and is used for all partitions
^| 01 | &nbsp; All endpoint pairs are of class 0 or 1
^| 10 | &nbsp; All endpoint pairs are of class 1 or 2
^| 11 | &nbsp; All endpoint pairs are of class 2 or 3
|====================

If the _CEM_ selector value in bits [24..23] is not 00,
then data layout is as shown in <<astc-multi-partition-color-endpoint-mode-layout>>
and <<astc-multi-partition-color-endpoint-mode-layout-2>>.

[[astc-multi-partition-color-endpoint-mode-layout]]
.ASTC multi-partition _color endpoint mode_ layout
[cols="2,1,2,1,1,1,1,1,1,1,1,1",width="90%"]
|====================
^| *Part* ^| ^| ^| ~n~ ^| ~m~ ^| ~l~ ^| ~k~ ^| ~j~ ^| ~i~ ^| ~h~ ^| ~g~ ^|
^| 2 ^| ... ^| Weight 2+^| M~1~ 6+| | ...
^| 3 ^| ... ^| Weight 2+^| M~2~ 2+^| M~1~ ^| M~0~ 3+| | ...
^| 4 ^| ... ^| Weight 2+^| M~3~ 2+^| M~2~ 2+^| M~1~ 2+^| M~0~ | ...
|====================
[[astc-multi-partition-color-endpoint-mode-layout-2]]
.ASTC multi-partition _color endpoint mode_ layout (2)
[cols="2,1,1,1,1,1,1",width="60%"]
|====================
^| *Part* ^| ~28~ ^| ~27~ ^| ~26~ ^| ~25~ ^| ~24~ ^| ~23~
^| 2 2+^| M~0~ ^| C~1~ ^| C~0~ 2+^| _CEM_
^| 3 ^| M~0~ ^| C~2~ ^| C~1~ ^| C~0~ 2+^| _CEM_
^| 4 ^| C~3~ ^| C~2~ ^| C~1~ ^| C~0~ 2+^| _CEM_
|====================

In this view, each partition _i_ has two fields. C~_i_~ is the class selector
bit, choosing between the two possible _CEM_ classes (0 indicates the
lower of the two classes), and M~_i_~ is a two-bit field specifying the low
bits of the _color endpoint mode_ within that class. The additional bits
appear at a variable bit position, immediately below the texel weight
data.
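
As a minimal sketch of how these fields combine, the _Class_ column of the
_color endpoint modes_ table can be read as the upper two bits of the 4-bit
_CEM_: a selector value of 01, 10 or 11 then gives a base class of 0, 1 or 2,
C~_i_~ adds 0 or 1 to it, and M~_i_~ supplies the low two bits.
The function name `astc_partition_cem` and its parameters are illustrative
assumptions, not part of the format definition.

```c
/* Hypothetical helper: combine the 2-bit CEM selector (block bits [24..23],
 * values 1..3), the class selector bit C_i and the 2-bit field M_i into the
 * 4-bit color endpoint mode of partition i.  A selector of 0 is not handled
 * here, because a full 4-bit CEM is then stored directly in the block. */
int astc_partition_cem(int selector, int c_i, int m_i)
{
    int base_class = selector - 1;            /* 01 -> 0, 10 -> 1, 11 -> 2 */
    int cem_class  = base_class + (c_i & 1);  /* lower or upper class      */
    return (cem_class << 2) | (m_i & 3);      /* class in the upper 2 bits */
}
```

For example, a selector of 10 with C~_i_~ = 1 and M~_i_~ = 0 selects class 2, mode 8
(LDR _RGB_, direct).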

The ranges used for the data values are not explicitly specified.
Instead, they are derived from the number of available bits remaining
after the configuration data and weight data have been specified.

Details of the decoding procedure for Color Endpoints can be found in
<<astc-endpoint-unquantization>>.

<<<
[[astc-integer-sequence-encoding]]
=== Integer Sequence Encoding

Both the weight data and the endpoint color data are variable width, and
are specified using a sequence of integer values. The range of each
value in a sequence (e.g. a color weight) is constrained.

Since it is often the case that the most efficient range for these
values is not a power of two, each value sequence is encoded using a
technique known as ``integer sequence encoding''. This allows efficient,
hardware-friendly packing and unpacking of values with non-power-of-two
ranges.

In a sequence, each value has an identical range. The range is specified
in one of the forms shown in <<astc-range-forms>> and
<<astc-range-encodings>>.

[[astc-range-forms]]
.ASTC range forms
[options="header",width="65%",cols="2,3,3"]
|====================
| &nbsp; Value range        | &nbsp; MSB encoding           | &nbsp; LSB encoding
| &nbsp; latexmath:[$0 \dots 2^n-1$]   | &nbsp; -                      | &nbsp; _n_-bit value _m_ (_n_ {leq} 8)
| &nbsp; latexmath:[$0 \dots (3 \times 2^n)-1$]  | &nbsp; Base-3 ``trit'' value _t_  | &nbsp; _n_-bit value _m_ (_n_ {leq} 6)
| &nbsp; latexmath:[$0 \dots (5 \times 2^n)-1$]  | &nbsp; Base-5 ``quint'' value _q_ | &nbsp; _n_-bit value _m_ (_n_ {leq} 5)
|====================

[[astc-range-encodings]]
.ASTC encoding for different ranges
[options="header",width="60%",cols="4,3,2,5"]
|====================
| &nbsp; Value range         | &nbsp; Value | &nbsp; Block | &nbsp; Packed block size
| &nbsp; latexmath:[$0 \dots 2^n-1$]        | &nbsp; _m_   | &nbsp; 1      | &nbsp; _n_
| &nbsp; latexmath:[$0 \dots (3 \times 2^n)-1$] | &nbsp; latexmath:[$t \times 2^n + m$]   | &nbsp; 5 | &nbsp; 8 {plus} 5 {times} _n_
| &nbsp; latexmath:[$0 \dots (5 \times 2^n)-1$] | &nbsp; latexmath:[$q \times 2^n + m$]   | &nbsp; 3 | &nbsp; 7 {plus} 3 {times} _n_
|====================

Since 3^5^ is 243, it is possible to pack five trits into 8 bits (which has
256 possible values), so a trit can effectively be encoded as 1.6 bits.
Similarly, since 5^3^ is 125, it is possible to pack three quints into
7 bits (which has 128 possible values), so a quint can be encoded as
2.33 bits.

The encoding scheme packs the trits or quints, and then interleaves the _n_
additional bits in positions that satisfy the requirements of an
arbitrary-length stream. This makes it possible to correctly specify
lists of values whose length is not an integer multiple of 3 or 5 values.
It also makes it easy to access any individual value within the stream directly.

If there are insufficient bits in the stream to fill the final block, then
unused (higher-order) bits are assumed to be 0 when decoding.
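
The packed block sizes above determine how many bits a whole sequence occupies.
The following sketch is illustrative only: the names `ise_sequence_bits` and
`enum ise_form` are assumptions, and it assumes that a partially filled final
block stores only as many bits of the packed trit or quint field as the values
actually present require, with the remaining higher-order bits read as 0.

```c
/* Encoding forms from the range table; the names are illustrative. */
enum ise_form { ISE_BITS, ISE_TRITS, ISE_QUINTS };

/* Number of bits occupied by `count` values that each carry n LSB bits. */
int ise_sequence_bits(enum ise_form form, int n, int count)
{
    switch (form) {
    case ISE_TRITS:  return (8 * count + 4) / 5 + n * count; /* ceil(8c/5) */
    case ISE_QUINTS: return (7 * count + 2) / 3 + n * count; /* ceil(7c/3) */
    default:         return n * count;
    }
}
```

A full block of five trit-based values with _n_ = 4 therefore takes 28 bits, and
a full block of three quint-based values with _n_ = 4 takes 19 bits, matching the
packed block sizes in the table.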

To decode the bits for value number _i_ in a sequence of bits _b_, both
indexed from 0, perform the following:

If the range is encoded as _n_ bits per value, then the value is bits
latexmath:[$b^{i\times n + n - 1 .. i \times n}$] -- a simple multiplexing operation.

If the range is encoded using a trit, then each block contains 5 values
(v~0~ to v~4~), each of which contains a trit (t~0~ to t~4~) and a corresponding
LSB value (m~0~ to m~4~). The first bit of the packed block is bit
latexmath:[$\left\lfloor {i\over 5} \right\rfloor \times (8 + 5 \times n)$]. The bits in the
block are packed as shown in <<astc-trit-based-packing>> (in this example, _n_ is 4).

[[astc-trit-based-packing]]
.ASTC trit-based packing
[width="80%"]
|===========================================
4+^| ^| ~27~ ^| ~26~ ^| ~25~ ^| ~24~ ^| ~23~ ^| ~22~ ^| ~21~ ^| ~20~ ^| ~19~ ^| ~18~ ^| ~17~ ^| ~16~
4+^| ^| T^7^ 4+^|     m~4~        ^| T^6^ ^| T^5^ 4+^|     m~3~        ^| T^4^
^| ~15~ ^| ~14~ ^| ~13~ ^| ~12~ ^| ~11~ ^| ~10~ ^|  ~9~ ^|  ~8~ ^|  ~7~ ^|  ~6~ ^|  ~5~ ^|  ~4~ ^|  ~3~ ^|  ~2~ ^|  ~1~ ^|  ~0~
4+^|    m~2~        ^| T^3^ ^| T^2^ 4+^|      m~1~       ^| T^1^ ^| T^0^ 4+^|      m~0~       ^|
|===========================================

<<<

The five trits t~0~ to t~4~ are obtained by bit manipulations of the 8 bits
T^7..0^ as follows:

-----
if T[4:2] = 111
    C = { T[7:5], T[1:0] }; t4 = t3 = 2
else
    C = T[4:0]
    if T[6:5] = 11
        t4 = 2; t3 = T[7]
    else
        t4 = T[7]; t3 = T[6:5]

if C[1:0] = 11
    t2 = 2; t1 = C[4]; t0 = { C[3], C[2]&~C[3] }
else if C[3:2] = 11
    t2 = 2; t1 = 2; t0 = C[1:0]
else
    t2 = C[4]; t1 = C[3:2]; t0 = { C[1], C[0]&~C[1] }
-----
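
The trit procedure can be sketched in C as shown below.
This is a non-normative illustration: `take_bits` and `ise_trit_value` are
hypothetical names, the packed block is passed in the low bits of a 64-bit
integer, and the field order is assumed to generalize from the _n_ = 4 layout
above to other values of _n_ (each m~_j_~ simply being _n_ bits wide).

```c
#include <stdint.h>

/* Hypothetical helper: read `count` bits of `block` starting at bit *pos
 * (bit 0 is the first bit of the packed block) and advance *pos. */
unsigned take_bits(uint64_t block, int *pos, int count)
{
    unsigned v = (unsigned)((block >> *pos) & ((1ull << count) - 1));
    *pos += count;
    return v;
}

/* Return value k (0..4) of a trit block whose LSB fields are n bits wide. */
int ise_trit_value(uint64_t block, int n, int k)
{
    static const int t_bits[5] = { 2, 2, 1, 2, 1 }; /* T bits after each m */
    unsigned m[5], T = 0, C;
    int t[5], pos = 0, t_pos = 0;

    /* Gather m0, T[1:0], m1, T[3:2], m2, T[4], m3, T[6:5], m4, T[7]. */
    for (int j = 0; j < 5; j++) {
        m[j] = take_bits(block, &pos, n);
        T |= take_bits(block, &pos, t_bits[j]) << t_pos;
        t_pos += t_bits[j];
    }

    /* Upper trits t4 and t3, and the 5-bit intermediate value C. */
    if (((T >> 2) & 7) == 7) {
        C = (((T >> 5) & 7) << 2) | (T & 3);
        t[4] = t[3] = 2;
    } else {
        C = T & 0x1F;
        if (((T >> 5) & 3) == 3) { t[4] = 2; t[3] = (T >> 7) & 1; }
        else                     { t[4] = (T >> 7) & 1; t[3] = (T >> 5) & 3; }
    }

    /* Lower trits t2, t1 and t0 from C. */
    unsigned c0 = C & 1, c1 = (C >> 1) & 1, c2 = (C >> 2) & 1, c3 = (C >> 3) & 1;
    if ((C & 3) == 3) {
        t[2] = 2; t[1] = (C >> 4) & 1; t[0] = (c3 << 1) | (c2 & ~c3);
    } else if (((C >> 2) & 3) == 3) {
        t[2] = 2; t[1] = 2; t[0] = C & 3;
    } else {
        t[2] = (C >> 4) & 1; t[1] = (C >> 2) & 3; t[0] = (c1 << 1) | (c0 & ~c1);
    }

    /* Value is t * 2^n + m, as in the range encoding table. */
    return (t[k] << n) + (int)m[k];
}
```

With _n_ = 0, the packed block 0xFF decodes to the trits (t~0~..t~4~) = (2, 1, 2, 2, 2).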

If the range is encoded using a quint, then each block contains 3 values
(v~0~ to v~2~), each of which contains a quint (q~0~ to q~2~) and a corresponding
LSB value (m~0~ to m~2~). The first bit of the packed block is bit
latexmath:[$ \left\lfloor {i\over 3} \right\rfloor \times (7+3 \times n)$].

The bits in the block are packed as described in <<astc-quint-based-packing>> and
<<astc-quint-based-packing-2>> (in this example, _n_ is 4).

[[astc-quint-based-packing]]
.ASTC quint-based packing
[width="20%"]
|===========
^| ~18~ ^| ~17~ ^| ~16~
^| Q^6^ ^| Q^5^ ^| m~2~
|===========

[[astc-quint-based-packing-2]]
.ASTC quint-based packing (2)
[width="80%"]
|===========
^| ~15~ ^| ~14~ ^| ~13~ ^| ~12~ ^| ~11~ ^| ~10~ ^| ~9~ ^| ~8~ ^| ~7~ ^| ~6~ ^| ~5~ ^| ~4~ ^| ~3~ ^| ~2~ ^| ~1~ ^| ~0~
3+^|  m~2~       ^| Q^4^ ^| Q^3^ 4+^|     m~1~        ^| Q^2^ ^| Q^1^ ^| Q^0^ 4+^| m~0~
|===========

The three quints q~0~ to q~2~ are obtained by bit manipulations of the 7 bits
Q^6..0^ as follows:

-----
if Q[2:1] = 11 and Q[6:5] = 00
    q2 = { Q[0], Q[4]&~Q[0], Q[3]&~Q[0] }; q1 = q0 = 4
else
    if Q[2:1] = 11
        q2 = 4; C = { Q[4:3], ~Q[6:5], Q[0] }
    else
        q2 = Q[6:5]; C = Q[4:0]

    if C[2:0] = 101
        q1 = 4; q0 = C[4:3]
    else
        q1 = C[4:3]; q0 = C[2:0]
-----
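
The quint procedure can be sketched in the same way.
As before, this is only an illustration: `take_bits` and `ise_quint_value` are
hypothetical names, and the field order (m~0~, Q^2..0^, m~1~, Q^4..3^, m~2~,
Q^6..5^) is assumed to generalize from the _n_ = 4 layout above.

```c
#include <stdint.h>

/* Hypothetical helper: read `count` bits of `block` starting at bit *pos
 * (bit 0 is the first bit of the packed block) and advance *pos. */
unsigned take_bits(uint64_t block, int *pos, int count)
{
    unsigned v = (unsigned)((block >> *pos) & ((1ull << count) - 1));
    *pos += count;
    return v;
}

/* Return value k (0..2) of a quint block whose LSB fields are n bits wide. */
int ise_quint_value(uint64_t block, int n, int k)
{
    static const int q_bits[3] = { 3, 2, 2 }; /* Q bits after each m */
    unsigned m[3], Q = 0, C;
    int q[3], pos = 0, q_pos = 0;

    /* Gather m0, Q[2:0], m1, Q[4:3], m2, Q[6:5]. */
    for (int j = 0; j < 3; j++) {
        m[j] = take_bits(block, &pos, n);
        Q |= take_bits(block, &pos, q_bits[j]) << q_pos;
        q_pos += q_bits[j];
    }

    if (((Q >> 1) & 3) == 3 && ((Q >> 5) & 3) == 0) {
        unsigned b0   = Q & 1;
        unsigned keep = ~b0 & 1;  /* 0 when Q[0] is set, else 1 */
        q[2] = (int)((b0 << 2) | ((((Q >> 4) & 1) & keep) << 1)
                               |  (((Q >> 3) & 1) & keep));
        q[1] = q[0] = 4;
    } else {
        if (((Q >> 1) & 3) == 3) {
            q[2] = 4;
            C = (((Q >> 3) & 3) << 3) | ((~(Q >> 5) & 3) << 1) | (Q & 1);
        } else {
            q[2] = (Q >> 5) & 3;
            C = Q & 0x1F;
        }
        if ((C & 7) == 5) { q[1] = 4;            q[0] = (C >> 3) & 3; }
        else              { q[1] = (C >> 3) & 3; q[0] = C & 7; }
    }

    /* Value is q * 2^n + m, as in the range encoding table. */
    return (q[k] << n) + (int)m[k];
}
```

With _n_ = 0, the packed block 0x7F decodes to the quints (q~0~, q~1~, q~2~) = (1, 3, 4).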

Both these procedures ensure a valid decoding for every possible bit pattern:
all 256 values of T^7..0^ and all 128 values of Q^6..0^ (even though a few
decode to duplicate results). They can also be implemented
efficiently in software using small tables.

Encoding methods are not specified here, although table-based mechanisms
work well.

<<<

[[astc-endpoint-unquantization]]
=== Endpoint Unquantization

Each color endpoint is specified as a sequence of integers in a given
range. These values are packed using integer sequence encoding, as a
stream of bits stored from just above the configuration data, and
growing upwards.

Once unpacked, the values must be unquantized from their storage range,
returning them to a standard range of 0..255.

For bit-only representations with fewer than 8 bits, this is simple bit replication
from the most significant bit of the value.
For example, the three-bit value ++C2 C1 C0++ would become ++C2 C1 C0 C2 C1 C0 C2 C1++
(from most- to least-significant bit).
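
A minimal sketch of this replication, for any bit count from 1 to 8, is shown
below. The function name `replicate_to_8_bits` is an illustrative assumption.

```c
/* Expand an n-bit value (1 <= n <= 8) to 8 bits by repeating its bit
 * pattern, starting from the most significant bit of the result. */
int replicate_to_8_bits(int v, int n)
{
    int out = 0;
    /* Place copies of v at descending positions; the last copy may be
     * truncated, which is expressed as a right shift. */
    for (int shift = 8 - n; shift > -n; shift -= n)
        out |= (shift >= 0) ? (v << shift) : (v >> -shift);
    return out & 0xFF;
}
```

For the three-bit value 101 this gives 10110110 (0xB6); the all-ones value of
any width always maps to 0xFF, and 0 always maps to 0.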

For trit or quint-based representations, this involves a set of bit
manipulations and adjustments to avoid the expense of full-width
multipliers. This procedure ensures correct scaling, but scrambles
the order of the decoded values relative to the encoded values.
This must be compensated for using a table in the encoder.

The initial inputs to the procedure are denoted ++A++ (9 bits), ++B++
(9 bits), ++C++ (9 bits) and ++D++ (3 bits), and are decoded using the
range as described in <<astc-color-unquantization-parameters>>.

[[astc-color-unquantization-parameters]]
.ASTC color unquantization parameters
[cols="4,3,4,3,5,6,6,3,8",width="95%",options="header"]
|============
^| Range  ^| #Trits ^| #Quints ^| #Bits ^| Bit layout ^| ++A++ ^| ++B++ ^| ++C++ ^| ++D++
^| 0..5   ^| 1 ^|   ^| 1 ^| a      ^| aaaaaaaaa ^| 000000000 ^| 204 ^| Trit value
^| 0..9   ^|   ^| 1 ^| 1 ^| a      ^| aaaaaaaaa ^| 000000000 ^| 113 ^| Quint value
^| 0..11  ^| 1 ^|   ^| 2 ^| ba     ^| aaaaaaaaa ^| b000b0bb0 ^| 93  ^| Trit value
^| 0..19  ^|   ^| 1 ^| 2 ^| ba     ^| aaaaaaaaa ^| b0000bb00 ^| 54  ^| Quint value
^| 0..23  ^| 1 ^|   ^| 3 ^| cba    ^| aaaaaaaaa ^| cb000cbcb ^| 44  ^| Trit value
^| 0..39  ^|   ^| 1 ^| 3 ^| cba    ^| aaaaaaaaa ^| cb0000cbc ^| 26  ^| Quint value
^| 0..47  ^| 1 ^|   ^| 4 ^| dcba   ^| aaaaaaaaa ^| dcb000dcb ^| 22  ^| Trit value
^| 0..79  ^|   ^| 1 ^| 4 ^| dcba   ^| aaaaaaaaa ^| dcb0000dc ^| 13  ^| Quint value
^| 0..95  ^| 1 ^|   ^| 5 ^| edcba  ^| aaaaaaaaa ^| edcb000ed ^| 11  ^| Trit value
^| 0..159 ^|   ^| 1 ^| 5 ^| edcba  ^| aaaaaaaaa ^| edcb0000e ^| 6   ^| Quint value
^| 0..191 ^| 1 ^|   ^| 6 ^| fedcba ^| aaaaaaaaa ^| fedcb000f ^| 5   ^| Trit value
|============

These are then processed as follows:

-----
unq = D * C + B;
unq = unq ^ A;
unq = (A & 0x80) | (unq >> 2);
-----

Note that the multiply in the first line is nearly trivial as it only
needs to multiply by 0, 1, 2, 3 or 4.
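
The table and procedure above can be combined into a small data-driven sketch.
It is illustrative only: the names `color_rows` and `unquantize_color` are
assumptions, and each ++B++ pattern is stored as a string in which a letter
stands for the corresponding bit of the LSB value _m_ (`a` being bit 0).

```c
/* One row of the color unquantization parameter table. */
struct color_row { int quint; int bits; const char *b; int c; };

static const struct color_row color_rows[] = {
    { 0, 1, "000000000", 204 }, { 1, 1, "000000000", 113 },
    { 0, 2, "b000b0bb0",  93 }, { 1, 2, "b0000bb00",  54 },
    { 0, 3, "cb000cbcb",  44 }, { 1, 3, "cb0000cbc",  26 },
    { 0, 4, "dcb000dcb",  22 }, { 1, 4, "dcb0000dc",  13 },
    { 0, 5, "edcb000ed",  11 }, { 1, 5, "edcb0000e",   6 },
    { 0, 6, "fedcb000f",   5 },
};

/* Unquantize one color value to 0..255.
 * quint: 0 for a trit-based range, 1 for a quint-based range.
 * bits:  number of LSB bits n; d: trit or quint value; m: LSB value.
 * Returns -1 if the table has no matching row. */
int unquantize_color(int quint, int bits, int d, int m)
{
    for (unsigned i = 0; i < sizeof color_rows / sizeof color_rows[0]; i++) {
        const struct color_row *r = &color_rows[i];
        if (r->quint != quint || r->bits != bits)
            continue;

        int a = (m & 1) ? 0x1FF : 0;      /* A: bit a replicated 9 times */
        int b = 0;                        /* B: built from the pattern   */
        for (int j = 0; j < 9; j++) {
            char ch = r->b[j];
            if (ch != '0')
                b |= ((m >> (ch - 'a')) & 1) << (8 - j);
        }

        int unq = d * r->c + b;
        unq ^= a;
        return (a & 0x80) | (unq >> 2);
    }
    return -1;
}
```

For the range 0..5 (one trit plus one bit), the six encoded values _t_ {times} 2 {plus} _m_
= 0..5 unquantize to 0, 255, 51, 204, 102 and 153, which illustrates the
scrambled ordering mentioned above.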

<<<

=== Endpoint Interpretation

The decoding method used depends on the _color endpoint mode_ (_CEM_) field,
which specifies how many values are used to represent the endpoint.

The _CEM_ field also specifies how to take the _n_ unquantized color endpoint
values v~0~ to v~n-1~ and convert them into two _RGBA_ color endpoints e~0~
and e~1~.

The methods can be summarized as shown in <<astc-color-endpoint-values-by-mode>>.

[[astc-color-endpoint-values-by-mode]]
.ASTC color endpoint values by mode
[options="header",cols="2,3,11,1",width="55%"]
|==============
^| _CEM_ ^| Range | &nbsp; Description                         ^| _n_
>| 0  &nbsp; ^| LDR | &nbsp; Luminance, direct                      ^| 2
>| 1  &nbsp; ^| LDR | &nbsp; Luminance, base+offset                 ^| 2
>| 2  &nbsp; ^| HDR | &nbsp; Luminance, large range                 ^| 2
>| 3  &nbsp; ^| HDR | &nbsp; Luminance, small range                 ^| 2
>| 4  &nbsp; ^| LDR | &nbsp; Luminance+Alpha, direct                ^| 4
>| 5  &nbsp; ^| LDR | &nbsp; Luminance+Alpha, base+offset           ^| 4
>| 6  &nbsp; ^| LDR | &nbsp; _RGB_, base+scale                        ^| 4
>| 7  &nbsp; ^| HDR | &nbsp; _RGB_, base+scale                        ^| 4
>| 8  &nbsp; ^| LDR | &nbsp; _RGB_, direct                            ^| 6
>| 9  &nbsp; ^| LDR | &nbsp; _RGB_, base+offset                       ^| 6
>| 10 &nbsp; ^| LDR | &nbsp; _RGB_, base+scale plus two _A_             ^| 6
>| 11 &nbsp; ^| HDR | &nbsp; _RGB_                                    ^| 6
>| 12 &nbsp; ^| LDR | &nbsp; _RGBA_, direct                           ^| 8
>| 13 &nbsp; ^| LDR | &nbsp; _RGBA_, base+offset                      ^| 8
>| 14 &nbsp; ^| HDR | &nbsp; _RGB_ + LDR Alpha                        ^| 8
>| 15 &nbsp; ^| HDR | &nbsp; _RGB_ + HDR Alpha                        ^| 8
|==============

Color endpoint mode 14 is special in that the alpha values are interpolated linearly,
but the color components are interpolated logarithmically. This is the
only _color endpoint_ mode with mixed-mode operation, and will return the
error color if encountered in LDR _operation mode_.

==== LDR Endpoint Decoding

Decode the different LDR _endpoint modes_ as follows:

===== LDR Endpoint Mode 0:  Luminance, direct

-----
e0 = (v0, v0, v0, 0xFF);
e1 = (v1, v1, v1, 0xFF);
-----

===== LDR Endpoint Mode 1:  Luminance, base+offset

-----
L0 = (v0 >> 2) | (v1 & 0xC0);
L1 = L0 + (v1 & 0x3F);
if (L1 > 0xFF) { L1 = 0xFF; }
e0 = (L0, L0, L0, 0xFF);
e1 = (L1, L1, L1, 0xFF);
-----

===== LDR Endpoint Mode 4:  Luminance+Alpha, direct

-----
e0 = (v0, v0, v0, v2);
e1 = (v1, v1, v1, v3);
-----

===== LDR Endpoint Mode 5:  Luminance+Alpha, base+offset

-----
bit_transfer_signed( v1, v0 );
bit_transfer_signed( v3, v2 );
e0 = (v0, v0, v0, v2);
e1 = (v0 + v1, v0 + v1, v0 + v1, v2 + v3);
clamp_unorm8( e0 );
clamp_unorm8( e1 );
-----

===== LDR Endpoint Mode 6:  _RGB_, base+scale

-----
e0 = (v0 * v3 >> 8, v1 * v3 >> 8, v2 * v3 >> 8, 0xFF);
e1 = (v0, v1, v2, 0xFF);
-----

===== LDR Endpoint Mode 8:  _RGB_, direct

-----
s0 = v0 + v2 + v4;
s1 = v1 + v3 + v5;
if (s1 >= s0) {
    e0 = (v0, v2, v4, 0xFF);
    e1 = (v1, v3, v5, 0xFF);
} else {
    e0 = blue_contract( v1, v3, v5, 0xFF );
    e1 = blue_contract( v0, v2, v4, 0xFF );
}
-----

===== LDR Endpoint Mode 9:  _RGB_, base+offset

-----
bit_transfer_signed( v1, v0 );
bit_transfer_signed( v3, v2 );
bit_transfer_signed( v5, v4 );
if (v1 + v3 + v5 >= 0) {
    e0 = (v0, v2, v4, 0xFF);
    e1 = (v0 + v1, v2 + v3, v4 + v5, 0xFF);
} else {
    e0 = blue_contract( v0 + v1, v2 + v3, v4 + v5, 0xFF );
    e1 = blue_contract( v0, v2, v4, 0xFF );
}
clamp_unorm8( e0 );
clamp_unorm8( e1 );
-----

===== LDR Endpoint Mode 10: _RGB_, base+scale plus two _A_

-----
e0 = (v0 * v3 >> 8, v1 * v3 >> 8, v2 * v3 >> 8, v4);
e1 = (v0, v1, v2, v5);
-----

===== LDR Endpoint Mode 12: _RGBA_, direct

-----
s0 = v0 + v2 + v4;
s1 = v1 + v3 + v5;
if (s1 >= s0) {
    e0 = (v0, v2, v4, v6);
    e1 = (v1, v3, v5, v7);
} else {
    e0 = blue_contract( v1, v3, v5, v7 );
    e1 = blue_contract( v0, v2, v4, v6 );
}
-----

===== LDR Endpoint Mode 13: _RGBA_, base+offset

-----
bit_transfer_signed( v1, v0 );
bit_transfer_signed( v3, v2 );
bit_transfer_signed( v5, v4 );
bit_transfer_signed( v7, v6 );
if (v1 + v3 + v5 >= 0) {
    e0 = (v0, v2, v4, v6);
    e1 = (v0 + v1, v2 + v3, v4 + v5, v6 + v7);
} else {
    e0 = blue_contract( v0 + v1, v2 + v3, v4 + v5, v6 + v7 );
    e1 = blue_contract( v0, v2, v4, v6 );
}
clamp_unorm8( e0 ); clamp_unorm8( e1 );
-----

The `bit_transfer_signed()` procedure transfers a bit from one value (_a_)
to another (_b_). Initially, both _a_ and _b_ are in the range 0..255.
After calling this procedure, _a_'s range becomes -32..31, and _b_ remains
in the range 0..255. Note that, as is often the case, this is easier to
express in hardware than in C:

-----
void bit_transfer_signed( int& a, int& b )
{
    b >>= 1;
    b |= a & 0x80;
    a >>= 1;
    a &= 0x3F;
    if ((a & 0x20) != 0) { a -= 0x40; }
}
-----

The `blue_contract()` procedure is used to give additional precision to
_RGB_ colors near gray:

-----
color blue_contract( int r, int g, int b, int a )
{
    color c;
    c.r = (r + b) >> 1;
    c.g = (g + b) >> 1;
    c.b = b;
    c.a = a;
    return c;
}
-----

The `clamp_unorm8()` procedure is used to clamp a color into
8-bit unsigned normalized fixed-point range:

-----
void clamp_unorm8( color& c )
{
    if (c.r < 0) { c.r = 0; } else if (c.r > 255) { c.r = 255; }
    if (c.g < 0) { c.g = 0; } else if (c.g > 255) { c.g = 255; }
    if (c.b < 0) { c.b = 0; } else if (c.b > 255) { c.b = 255; }
    if (c.a < 0) { c.a = 0; } else if (c.a > 255) { c.a = 255; }
}
-----
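
To show how the three helper procedures fit together, the following sketch
decodes LDR endpoint mode 9 (_RGB_, base+offset) in plain C.
The type and function names (`color`, `endpoints`, `decode_ldr_mode9`) are
illustrative assumptions; the arithmetic follows the listings above, with
pointers and return values standing in for reference parameters.

```c
typedef struct { int r, g, b, a; } color;
typedef struct { color e0, e1; } endpoints;

/* Move the top bit of *a into *b, leaving *a as a signed 6-bit offset. */
static void bit_transfer_signed(int *a, int *b)
{
    *b >>= 1;
    *b |= *a & 0x80;
    *a >>= 1;
    *a &= 0x3F;
    if ((*a & 0x20) != 0) { *a -= 0x40; }
}

static color blue_contract(int r, int g, int b, int a)
{
    color c = { (r + b) >> 1, (g + b) >> 1, b, a };
    return c;
}

static int clamp255(int x) { return x < 0 ? 0 : (x > 255 ? 255 : x); }

static color clamp_unorm8(color c)
{
    color out = { clamp255(c.r), clamp255(c.g), clamp255(c.b), clamp255(c.a) };
    return out;
}

/* Decode LDR endpoint mode 9 from six unquantized values. */
endpoints decode_ldr_mode9(int v0, int v1, int v2, int v3, int v4, int v5)
{
    endpoints p;
    bit_transfer_signed(&v1, &v0);
    bit_transfer_signed(&v3, &v2);
    bit_transfer_signed(&v5, &v4);
    if (v1 + v3 + v5 >= 0) {
        color base = { v0, v2, v4, 0xFF };
        color sum  = { v0 + v1, v2 + v3, v4 + v5, 0xFF };
        p.e0 = base;
        p.e1 = sum;
    } else {
        p.e0 = blue_contract(v0 + v1, v2 + v3, v4 + v5, 0xFF);
        p.e1 = blue_contract(v0, v2, v4, 0xFF);
    }
    p.e0 = clamp_unorm8(p.e0);
    p.e1 = clamp_unorm8(p.e1);
    return p;
}
```

With the inputs (200, 20, 100, 40, 50, 60) the offsets are all positive, so the
result is e~0~ = (100, 50, 25, 255) and e~1~ = (110, 70, 55, 255).
With (200, 126, 100, 126, 50, 126) each offset decodes to -1, the sum is
negative, and the blue-contracted result is e~0~ = (61, 36, 24, 255) and
e~1~ = (62, 37, 25, 255).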

<<<

==== HDR Endpoint Decoding

For _HDR endpoint modes_, color values are represented in a 12-bit
pseudo-logarithmic representation.

===== HDR Endpoint Mode 2

Endpoint mode 2 represents luminance-only data with a large range. It encodes
using two values (v~0~, v~1~). The complete decoding procedure is as follows:

-----
if (v1 >= v0) {
    y0 = (v0 << 4);
    y1 = (v1 << 4);
} else {
    y0 = (v1 << 4) + 8;
    y1 = (v0 << 4) - 8;
}
// Construct RGBA result (0x780 is 1.0f)
e0 = (y0, y0, y0, 0x780);
e1 = (y1, y1, y1, 0x780);
-----

===== HDR Endpoint Mode 3

Endpoint mode 3 represents luminance-only data with a small range. It packs the
bits for a base luminance value, together with an offset, into two values
(v~0~, v~1~), according to <<astc-hdr-mode-3-value-layout>>.

[[astc-hdr-mode-3-value-layout]]
.ASTC HDR endpoint mode 3 value layout
[width="60%",cols="2h,1,1,1,1,1,1,1,1"]
|=====================
^| *Value*  ^| ~7~ ^| ~6~ ^| ~5~ ^| ~4~ ^| ~3~ ^| ~2~ ^| ~1~ ^| ~0~
^| v~0~     ^| M  7+^|         L^6..0^
^| v~1~     4+^|    X^3..0^     4+^|   d^3..0^
|=====================

The bit field marked as X allocates different bits to L or d depending
on the value of the mode bit M.

The complete decoding procedure is as follows:

-----
// Check mode bit and extract.
if ((v0 & 0x80) !=0) {
    y0 = ((v1 & 0xE0) << 4) | ((v0 & 0x7F) << 2);
    d  =  (v1 & 0x1F) << 2;
} else {
    y0 = ((v1 & 0xF0) << 4) | ((v0 & 0x7F) << 1);
    d  =  (v1 & 0x0F) << 1;
}

// Add delta and clamp
y1 = y0 + d;
if (y1 > 0xFFF) { y1 = 0xFFF; }

// Construct RGBA result (0x780 is 1.0f)
e0 = (y0, y0, y0, 0x780);
e1 = (y1, y1, y1, 0x780);
-----

<<<

===== HDR Endpoint Mode 7

Endpoint mode 7 packs the bits for a base _RGB_ value, a scale factor, and some
mode bits into the four values (v~0~, v~1~, v~2~, v~3~), as shown in
<<astc-hdr-mode-7-value-layout>>.

[[astc-hdr-mode-7-value-layout]]
.ASTC HDR endpoint mode 7 value layout
[width="50%",cols="3,1,1,1,1,1,1,1,1"]
|===================
^| *Value* ^| ~7~ ^| ~6~ ^| ~5~ ^| ~4~ ^| ~3~ ^| ~2~ ^| ~1~ ^| ~0~
^|v~0~     2+^|M^3..2^ 6+^|       _R_^5..0^
^|v~1~     ^|M^1^ ^|X^0^ ^|X^1^ 5+^|      _G_^4..0^
^|v~2~     ^|M^0^ ^|X^2^ ^|X^3^ 5+^|      _B_^4..0^
^|v~3~     ^|X^4^ ^|X^5^ ^|X^6^ 5+^|      S^4..0^
|===================

The mode bits M^0^ to M^3^ are a packed representation of an endpoint bit
mode, together with the major component index. For modes 0 to 4, the
component (red, green, or blue) with the largest magnitude is identified,
and the values swizzled to ensure that it is decoded from the red channel.

The endpoint bit mode is used to determine the number of bits assigned
to each component of the endpoint, and the destination of each of the
extra bits X^0^ to X^6^, as shown in <<astc-hdr-mode-7-endpoint-bit-mode>>.

[[astc-hdr-mode-7-endpoint-bit-mode]]
.ASTC HDR mode 7 endpoint bit mode
[width="85%",cols="2h,1,1,1,1,1,2,2,2,2,2,2,2"]
|===============
| 4+^| *Number of bits* | 7+^| *Destination of extra bits*
^| Mode   ^| *_R_*  ^| *_G_* ^| *_B_* ^| *S* |     ^| *X^0^* ^| *X^1^* ^| *X^2^* ^| *X^3^* ^| *X^4^* ^| *X^5^* ^| *X^6^*
^| 0      ^| 11 ^| 5 ^| 5 ^| 7 ^|     ^| _R_^9^ ^| _R_^8^ ^| _R_^7^ ^| _R_^10^ ^| _R_^6^ ^| S^6^  ^| S^5^
^| 1      ^| 11 ^| 6 ^| 6 ^| 5 ^|     ^| _R_^8^ ^| _G_^5^ ^| _R_^7^ ^| _B_^5^  ^| _R_^6^ ^| _R_^10^ ^| _R_^9^
^| 2      ^| 10 ^| 5 ^| 5 ^| 8 ^|     ^| _R_^9^ ^| _R_^8^ ^| _R_^7^ ^| _R_^6^  ^| S^7^ ^| S^6^  ^| S^5^
^| 3      ^| 9  ^| 6 ^| 6 ^| 7 ^|     ^| _R_^8^ ^| _G_^5^ ^| _R_^7^ ^| _B_^5^  ^| _R_^6^ ^| S^6^  ^| S^5^
^| 4      ^| 8  ^| 7 ^| 7 ^| 6 ^|     ^| _G_^6^ ^| _G_^5^ ^| _B_^6^ ^| _B_^5^  ^| _R_^6^ ^| _R_^7^  ^| S^5^
^| 5      ^| 7  ^| 7 ^| 7 ^| 7 ^|     ^| _G_^6^ ^| _G_^5^ ^| _B_^6^ ^| _B_^5^  ^| _R_^6^ ^| S^6^  ^| S^5^
|===============

As noted before, this appears complex when expressed in C, but is much
easier to achieve in hardware: bit masking, extraction, shifting
and assignment usually end up as a single wire or multiplexer.

The complete decoding procedure is as follows:

<<<

-----
// Extract mode bits and unpack to major component and mode.
int majcomp;
int mode;
int modeval = ((v0 & 0xC0) >> 6) | ((v1 & 0x80) >> 5) | ((v2 & 0x80) >> 4);

if ((modeval & 0xC) != 0xC) {
    majcomp = modeval >> 2; mode = modeval & 3;
} else if (modeval != 0xF) {
    majcomp = modeval & 3;  mode = 4;
} else {
    majcomp = 0; mode = 5;
}

// Extract low-order bits of r, g, b, and s.
int red  = v0 & 0x3f; int green = v1 & 0x1f;
int blue = v2 & 0x1f; int scale = v3 & 0x1f;

// Extract high-order bits, which may be assigned depending on mode
int x0 = (v1 >> 6) & 1; int x1 = (v1 >> 5) & 1; int x2 = (v2 >> 6) & 1;
int x3 = (v2 >> 5) & 1; int x4 = (v3 >> 7) & 1; int x5 = (v3 >> 6) & 1;
int x6 = (v3 >> 5) & 1;

// Now move the high-order xs into the right place.
int ohm = 1 << mode;
if (ohm & 0x30) { green |= x0 << 6; }
if (ohm & 0x3A) { green |= x1 << 5; }
if (ohm & 0x30) { blue |= x2 << 6; }
if (ohm & 0x3A) { blue |= x3 << 5; }
if (ohm & 0x3D) { scale |= x6 << 5; }
if (ohm & 0x2D) { scale |= x5 << 6; }
if (ohm & 0x04) { scale |= x4 << 7; }
if (ohm & 0x3B) { red |= x4 << 6; }
if (ohm & 0x04) { red |= x3 << 6; }
if (ohm & 0x10) { red |= x5 << 7; }
if (ohm & 0x0F) { red |= x2 << 7; }
if (ohm & 0x05) { red |= x1 << 8; }
if (ohm & 0x0A) { red |= x0 << 8; }
if (ohm & 0x05) { red |= x0 << 9; }
if (ohm & 0x02) { red |= x6 << 9; }
if (ohm & 0x01) { red |= x3 << 10; }
if (ohm & 0x02) { red |= x5 << 10; }

// Shift the bits to the top of the 12-bit result.
static const int shamts[6] = { 1, 1, 2, 3, 4, 5 };
int shamt = shamts[mode];
red <<= shamt; green <<= shamt; blue <<= shamt; scale <<= shamt;

// Minor components are stored as differences
if (mode != 5) { green = red - green; blue = red - blue; }

// Swizzle major component into place
if (majcomp == 1) { swap( red, green ); }
if (majcomp == 2) { swap( red, blue ); }

// Clamp output values, set alpha to 1.0
e1.r = clamp( red, 0, 0xFFF );
e1.g = clamp( green, 0, 0xFFF );
e1.b = clamp( blue, 0, 0xFFF );
e1.alpha = 0x780;
e0.r = clamp( red - scale, 0, 0xFFF );
e0.g = clamp( green - scale, 0, 0xFFF );
e0.b = clamp( blue - scale, 0, 0xFFF );
e0.alpha = 0x780;
-----

===== HDR Endpoint Mode 11

Endpoint mode 11 specifies two _RGB_ values, which it calculates from a number of
bitfields (a, b~0~, b~1~, c, d~0~ and d~1~) which are packed together with some
mode bits into the six values (v~0~, v~1~, v~2~, v~3~, v~4~, v~5~) as shown in
<<astc-hdr-mode-11-value-layout>>.

[[astc-hdr-mode-11-value-layout]]
.ASTC HDR mode 11 value layout
[cols="3h,1,1,1,1,1,1,1,1",width="50%"]
|======================
^| *Value*  ^| ~7~ ^| ~6~ ^| ~5~ ^| ~4~ ^| ~3~ ^| ~2~ ^| ~1~ ^| ~0~
^| v~0~     8+^|            a^7..0^
^| v~1~     ^|m~0~  ^|a^8^ 6+^|      c^5..0^
^| v~2~     ^|m~1~  ^|X^0^ 6+^|     b~0~^5..0^
^| v~3~     ^|m~2~  ^|X^1^ 6+^|     b~1~^5..0^
^| v~4~     ^|mj~0~ ^|X^2^ ^|X^4^ 5+^|     d~0~^4..0^
^| v~5~     ^|mj~1~ ^|X^3^ ^|X^5^ 5+^|     d~1~^4..0^
|======================

If the major component bits mj^1..0^ are both 1, then the _RGB_ values
are specified directly by <<astc-hdr-mode-11-direct-value-layout>>.

[[astc-hdr-mode-11-direct-value-layout]]
.ASTC HDR mode 11 direct value layout
[cols="3h,1,1,1,1,1,1,1,1",width="50%"]
|===================
^| *Value*  ^| ~7~ ^| ~6~ ^| ~5~ ^| ~4~ ^| ~3~ ^| ~2~ ^| ~1~ ^| ~0~
^| v~0~     8+^|         _R_~0~^11..4^
^| v~1~     8+^|         _R_~1~^11..4^
^| v~2~     8+^|         _G_~0~^11..4^
^| v~3~     8+^|         _G_~1~^11..4^
^| v~4~     ^| 1 7+^|    _B_~0~^11..5^
^| v~5~     ^| 1 7+^|    _B_~1~^11..5^
|===================

The mode bits m^2..0^ specify the bit allocation for the different
values, and the destinations of the extra bits X^0^ to X^5^ as shown
in <<astc-hdr-mode-11-endpoint-bit-mode>>.

[[astc-hdr-mode-11-endpoint-bit-mode]]
.ASTC HDR mode 11 endpoint bit mode
[cols="2h,1,1,1,1,1,2,2,2,2,2,2",width="85%"]
|=================
^|     4+^|  *Number of bits* | 6+^| *Destination of extra bits*
^| Mode  ^|  *a*  ^| *b* ^| *c* ^| *d* ^|  ^| *X^0^* ^| *X^1^* ^| *X^2^* ^| *X^3^* ^| *X^4^* ^| *X^5^*
^| 0     ^|  9  ^| 7 ^| 6 ^| 7 ^|  ^|  b~0~^6^ ^| b~1~^6^ ^| d~0~^6^ ^| d~1~^6^ ^| d~0~^5^ ^| d~1~^5^
^| 1     ^|  9  ^| 8 ^| 6 ^| 6 ^|  ^|  b~0~^6^ ^| b~1~^6^ ^| b~0~^7^ ^| b~1~^7^ ^| d~0~^5^ ^| d~1~^5^
^| 2     ^|  10 ^| 6 ^| 7 ^| 7 ^|  ^|  a^9^  ^| c^6^  ^| d~0~^6^ ^| d~1~^6^ ^| d~0~^5^ ^| d~1~^5^
^| 3     ^|  10 ^| 7 ^| 7 ^| 6 ^|  ^|  b~0~^6^ ^| b~1~^6^ ^| a^9^  ^| c^6^  ^| d~0~^5^ ^| d~1~^5^
^| 4     ^|  11 ^| 8 ^| 6 ^| 5 ^|  ^|  b~0~^6^ ^| b~1~^6^ ^| b~0~^7^ ^| b~1~^7^ ^| a^9^  ^| a^10^
^| 5     ^|  11 ^| 6 ^| 8 ^| 6 ^|  ^|  a^9^  ^| a^10^ ^| c^7^  ^| c^6^  ^| d~0~^5^ ^| d~1~^5^
^| 6     ^|  12 ^| 7 ^| 7 ^| 5 ^|  ^|  b~0~^6^ ^| b~1~^6^ ^| a^11^ ^| c^6^  ^| a^9^  ^| a^10^
^| 7     ^|  12 ^| 6 ^| 7 ^| 6 ^|  ^|  a^9^  ^| a^10^ ^| a^11^ ^| c^6^  ^| d~0~^5^ ^| d~1~^5^
|=================

The complete decoding procedure is as follows:

<<<

-----
// Find major component
int majcomp = ((v4 & 0x80) >> 7) | ((v5 & 0x80) >> 6);

// Deal with simple case first
if (majcomp == 3) {
    e0 = (v0 << 4, v2 << 4, (v4 & 0x7f) << 5, 0x780);
    e1 = (v1 << 4, v3 << 4, (v5 & 0x7f) << 5, 0x780);
    return;
}

// Decode mode, parameters.
int mode = ((v1&0x80)>>7) | ((v2&0x80)>>6) | ((v3&0x80)>>5);
int va  = v0 | ((v1 & 0x40) << 2);
int vb0 = v2 & 0x3f; int vb1 = v3 & 0x3f;
int vc  = v1 & 0x3f;
int vd0 = v4 & 0x7f; int vd1 = v5 & 0x7f;

// Assign top bits of vd0, vd1.
static const int dbitstab[8] = {7,6,7,6,5,6,5,6};
vd0 = signextend( vd0, dbitstab[mode] );
vd1 = signextend( vd1, dbitstab[mode] );

// Extract and place extra bits
int x0 = (v2 >> 6) & 1;
int x1 = (v3 >> 6) & 1;
int x2 = (v4 >> 6) & 1;
int x3 = (v5 >> 6) & 1;
int x4 = (v4 >> 5) & 1;
int x5 = (v5 >> 5) & 1;

int ohm = 1 << mode;
if (ohm & 0xA4) { va |= x0 << 9; }
if (ohm & 0x08) { va |= x2 << 9; }
if (ohm & 0x50) { va |= x4 << 9; }
if (ohm & 0x50) { va |= x5 << 10; }
if (ohm & 0xA0) { va |= x1 << 10; }
if (ohm & 0xC0) { va |= x2 << 11; }
if (ohm & 0x04) { vc |= x1 << 6; }
if (ohm & 0xE8) { vc |= x3 << 6; }
if (ohm & 0x20) { vc |= x2 << 7; }
if (ohm & 0x5B) { vb0 |= x0 << 6; }
if (ohm & 0x5B) { vb1 |= x1 << 6; }
if (ohm & 0x12) { vb0 |= x2 << 7; }
if (ohm & 0x12) { vb1 |= x3 << 7; }

// Now shift up so that major component is at top of 12-bit value
int shamt = (mode >> 1) ^ 3;
va <<= shamt; vb0 <<= shamt; vb1 <<= shamt;
vc <<= shamt; vd0 <<= shamt; vd1 <<= shamt;

e1.r = clamp( va, 0, 0xFFF );
e1.g = clamp( va - vb0, 0, 0xFFF );
e1.b = clamp( va - vb1, 0, 0xFFF );
e1.alpha = 0x780;
e0.r = clamp( va - vc, 0, 0xFFF );
e0.g = clamp( va - vb0 - vc - vd0, 0, 0xFFF );
e0.b = clamp( va - vb1 - vc - vd1, 0, 0xFFF );
e0.alpha = 0x780;

if (majcomp == 1)      { swap( e0.r, e0.g ); swap( e1.r, e1.g ); }
else if (majcomp == 2) { swap( e0.r, e0.b ); swap( e1.r, e1.b ); }
-----
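
The listing above uses the helpers `signextend()` and `clamp()` without
defining them. The sketch below shows one plausible reading, offered as an
assumption rather than a normative definition: `signextend( v, bits )` treats
the low `bits` bits of `v` as a two's complement number and ignores any higher
bits, which matters because `vd0` and `vd1` may still contain extra bits that
are assigned to other fields.

```c
/* Interpret the low `bits` bits of v as a two's complement number.
 * Higher bits of v are discarded before the sign is applied. */
int signextend(int v, int bits)
{
    int sign = 1 << (bits - 1);
    v &= (sign << 1) - 1;        /* keep only the low `bits` bits */
    return (v ^ sign) - sign;    /* flip and subtract the sign bit */
}

/* Limit v to the inclusive range lo..hi. */
int clamp(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}
```

The xor-and-subtract idiom is the same one used for the alpha delta in
endpoint mode 15.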

===== HDR Endpoint Mode 14

Endpoint mode 14 specifies two _RGBA_ values, using the eight values
(v~0~, v~1~, v~2~, v~3~, v~4~, v~5~, v~6~, v~7~).
First, the _RGB_ values are decoded from (v~0~..v~5~) using the method
from endpoint mode 11, then the alpha values are filled in from v~6~ and v~7~:

-----
// Decode RGB as for mode 11
(e0, e1) = decode_mode_11( v0, v1, v2, v3, v4, v5 );

// Now fill in the alphas
e0.alpha = v6;
e1.alpha = v7;
-----

Note that in this _endpoint mode_, the alpha values are interpreted (and
interpolated) as 8-bit unsigned normalized values, as in the LDR modes.
This is the only _HDR endpoint mode_ that exhibits this behavior.

===== HDR Endpoint Mode 15

Endpoint mode 15 specifies two _RGBA_ values, using the eight values (v~0~, v~1~,
v~2~, v~3~, v~4~, v~5~, v~6~, v~7~).
First, the _RGB_ values are decoded from (v~0~..v~5~) using the method
from endpoint mode 11.
The alpha values are stored in values v~6~ and v~7~ as a mode and
two values which are interpreted according to the mode _M_, as shown
in <<astc-hdr-mode-15-alpha-value-layout>>.

[[astc-hdr-mode-15-alpha-value-layout]]
.ASTC HDR mode 15 alpha value layout
[cols="h,1,1,1,1,1,1,1,1",width="75%"]
|====================
^| *Value*  ^| ~7~ ^| ~6~ ^| ~5~ ^| ~4~ ^| ~3~ ^| ~2~ ^| ~1~ ^| ~0~
^| v~6~     ^| _M_^0^ 7+^|        A^6..0^
^| v~7~     ^| _M_^1^ 7+^|        B^6..0^
|====================

The alpha values are decoded from v~6~ and v~7~ as follows:

-----
// Decode RGB as for endpoint mode 11
(e0, e1) = decode_mode_11( v0, v1, v2, v3, v4, v5 );

// Extract mode bits
mode = ((v6 >> 7) & 1) | ((v7 >> 6) & 2);
v6 &= 0x7F;
v7 &= 0x7F;

if (mode == 3) {
    // Directly specify alphas
    e0.alpha = v6 << 5;
    e1.alpha = v7 << 5;
} else {
    // Transfer bits from v7 to v6 and sign extend v7.
    v6 |= (v7 << (mode + 1)) & 0x780;
    v7 &= (0x3F >> mode);
    v7 ^= 0x20 >> mode;
    v7 -= 0x20 >> mode;
    v6 <<= (4 - mode);
    v7 <<= (4 - mode);

    // Add delta and clamp
    v7 += v6;
    v7 = clamp( v7, 0, 0xFFF );
    e0.alpha = v6;
    e1.alpha = v7;
}
-----

Note that in this _endpoint mode_, the alpha values are interpreted
(and interpolated) as 12-bit HDR values, in the same way as any other
HDR component.

[[astc-weight-decoding]]
=== Weight Decoding

The weight information is stored as a stream of bits, growing downwards
from the most significant bit in the block. Bit _n_ in the stream is thus
bit 127-_n_ in the block.

For each position in the weight grid, a value (in the specified range)
is packed into the stream. These are ordered in a raster pattern
starting from position (0,0,0), with the X dimension increasing fastest,
and the Z dimension increasing slowest. If dual-plane mode is selected,
both weights are emitted together for each position, plane 0 first,
then plane 1.
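
As a sketch of this ordering, the position of a weight within the sequence of
stored values can be computed as follows. The function name
`weight_stream_index` and its parameters are illustrative assumptions.

```c
/* Index, within the sequence of stored weights, of the weight for grid
 * position (x, y, z) and the given plane (0 or 1).
 * grid_w and grid_h are the weight grid width and height. */
int weight_stream_index(int x, int y, int z, int plane,
                        int grid_w, int grid_h, int dual_plane)
{
    int position = (z * grid_h + y) * grid_w + x;  /* X fastest, Z slowest */
    return dual_plane ? 2 * position + plane : position;
}
```

In a 4 {times} 4 grid, position (0,1,0) is value number 4 in single-plane mode;
in dual-plane mode the plane 1 weight of position (1,0,0) is value number 3.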

[[astc-weight-unquantization]]
=== Weight Unquantization

Each weight plane is specified as a sequence of integers in a given
range. These values are packed using integer sequence encoding.

Once unpacked, the values must be unquantized from their storage
range, returning them to a standard range of 0..64. The procedure
for doing so is similar to the color endpoint unquantization.

First, we unquantize the actual stored weight values to the range 0..63.

For bit-only representations (1 to 5 bits), this is simple bit
replication from the most significant bit of the value.
For example, a 3-bit weight ++C2 C1 C0++ becomes the 6-bit value
++C2 C1 C0 C2 C1 C0++ (from most- to least-significant bit).

For trit or quint-based representations, this involves a set of bit
manipulations and adjustments to avoid the expense of full-width
multipliers.

For representations with no additional bits, the results are as shown
in <<astc-weight-unquantization-values>>.

[[astc-weight-unquantization-values]]
.ASTC weight unquantization values
[cols="3,1,1,1,1,1",width="40%",options="header"]
|===========
^| Range  ^| 0  ^| 1  ^| 2  ^| 3  ^| 4
^| 0..2   ^| 0  ^| 32 ^| 63 ^| -  ^| -
^| 0..4   ^| 0  ^| 16 ^| 32 ^| 47 ^| 63
|===========

For other values, we calculate the initial inputs to a bit manipulation
procedure. These are denoted ++A++ (7 bits), ++B++ (7 bits), ++C++ (7 bits),
and ++D++ (3 bits) and are decoded using the range as shown in
<<astc-weight-unquantization-parameters>>.

[[astc-weight-unquantization-parameters]]
.ASTC weight unquantization parameters
[cols="4,3,4,3,5,6,6,3,8",width="95%",options="header"]
|============
^| Range  ^| #Trits ^| #Quints ^| #Bits ^| Bit layout ^| ++A++ ^| ++B++ ^| ++C++ ^| ++D++
^| 0..5   ^| 1 ^|   ^| 1  ^| a      ^| aaaaaaa ^| 0000000 ^| 50 ^| Trit value
^| 0..9   ^|   ^| 1 ^| 1  ^| a      ^| aaaaaaa ^| 0000000 ^| 28 ^| Quint value
^| 0..11  ^| 1 ^|   ^| 2  ^| ba     ^| aaaaaaa ^| b000b0b ^| 23 ^| Trit value
^| 0..19  ^|   ^| 1 ^| 2  ^| ba     ^| aaaaaaa ^| b0000b0 ^| 13 ^| Quint value
^| 0..23  ^| 1 ^|   ^| 3  ^| cba    ^| aaaaaaa ^| cb000cb ^| 11 ^| Trit value
|============

These are then processed as follows:

-----
unq = D * C + B;
unq = unq ^ A;
unq = (A & 0x20) | (unq >> 2);
-----

Note that the multiply in the first line is nearly trivial as it only
needs to multiply by 0, 1, 2, 3 or 4.

As a final step, for all types of value, the range is expanded from
0..63 up to 0..64 as follows:

-----
if (unq > 32) { unq += 1; }
-----

This allows the implementation to use 64 as a divisor during interpolation,
which is much easier than using 63.
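
The unquantization steps above can be sketched in C as follows.
This is an illustrative sketch rather than normative text: the function name
`astc_unquantize_weight` and the way a range is described to it (`kind`,
`n_bits`) are assumptions made for this example, and the stored value is
assumed to have already been split into its bit field and its trit or
quint value.

```c
#include <stdint.h>

/* Illustrative helper (not part of the specification): unquantize one
 * stored weight to 0..64.
 *   kind   : 0 = bits only, 3 = one trit, 5 = one quint
 *   n_bits : plain bits in the range (1..5 bits only, 0..3 with a trit,
 *            0..2 with a quint)
 *   bits   : stored bit field, with bit 'a' as the least significant bit
 *   tq     : trit (0..2) or quint (0..4) value; ignored when kind == 0
 * Returns -1 for a combination that is not listed in the tables above. */
int astc_unquantize_weight(int kind, int n_bits, uint32_t bits, uint32_t tq)
{
    static const uint8_t trit_only[3]  = { 0, 32, 63 };
    static const uint8_t quint_only[5] = { 0, 16, 32, 47, 63 };
    uint32_t unq;

    if (kind == 0) {
        if (n_bits < 1 || n_bits > 5) return -1;
        bits &= (1u << n_bits) - 1;
        /* Replicate the n-bit pattern until 6 or more bits are filled,
         * then keep the top 6 bits. */
        uint32_t r = 0;
        int filled = 0;
        while (filled < 6) { r = (r << n_bits) | bits; filled += n_bits; }
        unq = r >> (filled - 6);
    } else if (n_bits == 0) {
        if (kind == 3 && tq < 3)      unq = trit_only[tq];
        else if (kind == 5 && tq < 5) unq = quint_only[tq];
        else return -1;
    } else {
        uint32_t a = bits & 1, b = (bits >> 1) & 1, c = (bits >> 2) & 1;
        uint32_t A = a ? 0x7F : 0x00;                 /* aaaaaaa */
        uint32_t B, C;
        if      (kind == 3 && n_bits == 1) { B = 0; C = 50; }
        else if (kind == 5 && n_bits == 1) { B = 0; C = 28; }
        else if (kind == 3 && n_bits == 2) { B = (b << 6) | (b << 2) | b; C = 23; }  /* b000b0b */
        else if (kind == 5 && n_bits == 2) { B = (b << 6) | (b << 1);     C = 13; }  /* b0000b0 */
        else if (kind == 3 && n_bits == 3) { B = (c << 6) | (b << 5) | (c << 1) | b; C = 11; } /* cb000cb */
        else return -1;
        unq = tq * C + B;
        unq ^= A;
        unq = (A & 0x20) | (unq >> 2);
    }
    if (unq > 32) unq += 1;   /* expand 0..63 to 0..64 */
    return (int)unq;
}
```

For instance, under these assumptions the 3-bit stored value 5 (binary 101)
replicates to binary 101101 (45) and is then expanded to 46.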

=== Weight Infill

After unquantization, the weights are subject to weight selection and
infill. The infill method is used to calculate the weight for a texel
position, based on the weights in the stored weight grid array (which
may be a different size).
The procedure below must be followed exactly, to ensure bit exact
results.

The block size is specified as three dimensions along the _s_, _t_ and
_r_ axes (B~s~, B~t~, B~r~).
Texel coordinates within the block (b~s~,&nbsp;b~t~,&nbsp;b~r~) can
have values from 0 to one less than the block dimension in that axis.
For each block dimension, we compute scale factors (_D~s~_, _D~t~_, _D~r~_):

[latexmath]
+++++++++++++
\begin{align*}
D_s = \left\lfloor {\left(1024 + \left\lfloor { \textrm{B}_\textrm{s} \over 2 }\right\rfloor\right) \over (\textrm{B}_\textrm{s}-1) } \right\rfloor \\
D_t = \left\lfloor {\left(1024 + \left\lfloor { \textrm{B}_\textrm{t} \over 2 }\right\rfloor\right) \over (\textrm{B}_\textrm{t}-1) } \right\rfloor \\
D_r = \left\lfloor {\left(1024 + \left\lfloor { \textrm{B}_\textrm{r} \over 2 }\right\rfloor\right) \over (\textrm{B}_\textrm{r}-1) } \right\rfloor
\end{align*}
+++++++++++++

Since the block dimensions are constrained, these are easily looked up
in a table.
These scale factors are then used to scale the (b~s~, b~t~, b~r~) coordinates
to a homogeneous coordinate (c~s~, c~t~, c~r~):

-----
cs = Ds * bs;
ct = Dt * bt;
cr = Dr * br;
-----

This homogeneous coordinate (c~s~, c~t~, c~r~) is then scaled again to
give a coordinate (g~s~, g~t~, g~r~) in the weight-grid space.
The weight-grid is of size (W~width~, W~height~, W~depth~), as
specified in the _block mode_ field (<<astc-2d-block-mode-layout>> and
<<astc-3d-block-mode-layout>>):

-----
gs = (cs * (Wwidth - 1) + 32) >> 6;
gt = (ct * (Wheight - 1) + 32) >> 6;
gr = (cr * (Wdepth - 1) + 32) >> 6;
-----

The resulting coordinates may be in the range 0..176. These are interpreted
as 4:4 unsigned fixed point numbers in the range 0.0 .. 11.0.

If we label the integral parts of these (j~s~, j~t~, j~r~) and the fractional
parts (f~s~, f~t~, f~r~), then:

-----
js = gs >> 4; fs = gs & 0x0F;
jt = gt >> 4; ft = gt & 0x0F;
jr = gr >> 4; fr = gr & 0x0F;
-----

These values are then used to interpolate between the stored weights.
This process differs for 2D and 3D.

For 2D, bilinear interpolation is used:

-----
v0 = js + jt * Wwidth;
p00 = decode_weight( v0 );
p01 = decode_weight( v0 + 1 );
p10 = decode_weight( v0 + Wwidth );
p11 = decode_weight( v0 + Wwidth + 1 );
-----

The function `decode_weight(n)` decodes the _n_^th^ weight in the stored
weight stream.
The values p~00~ to p~11~ are the weights at the corner of the square
in which the texel position resides. These are then weighted using the
fractional position to produce the effective weight _i_ as follows:

-----
w11 = (fs * ft + 8) >> 4;
w10 = ft - w11;
w01 = fs - w11;
w00 = 16 - fs - ft + w11;
i = (p00 * w00 + p01 * w01 + p10 * w10 + p11 * w11 + 8) >> 4;
-----
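
The 2D infill path can be sketched end to end in C as shown below.
This is a sketch under stated assumptions, not a reference implementation:
the names `astc_infill_2d` and `fetch_weight_2d` are hypothetical, the grid
weights are assumed to be already decoded and unquantized into a row-major
array (standing in for `decode_weight(n)`), and both block dimensions are
assumed to be at least 2.

```c
#include <stdint.h>

/* Stand-in for decode_weight(n). Indices past the end of the grid return 0;
 * this sketch relies on such neighbours only being multiplied by a zero
 * factor, because the fractional part is 0 at the last grid row or column. */
static int fetch_weight_2d(const uint8_t *grid, int count, int n)
{
    return (n < count) ? grid[n] : 0;
}

/* Effective weight i for texel (bs, bt) of a Bs x Bt block whose stored
 * weight grid is Wwidth x Wheight. */
int astc_infill_2d(const uint8_t *grid, int Wwidth, int Wheight,
                   int Bs, int Bt, int bs, int bt)
{
    int count = Wwidth * Wheight;

    /* Scale factors and homogeneous coordinates. */
    int Ds = (1024 + Bs / 2) / (Bs - 1);
    int Dt = (1024 + Bt / 2) / (Bt - 1);
    int cs = Ds * bs;
    int ct = Dt * bt;

    /* Weight-grid coordinates in 4:4 fixed point. */
    int gs = (cs * (Wwidth  - 1) + 32) >> 6;
    int gt = (ct * (Wheight - 1) + 32) >> 6;
    int js = gs >> 4, fs = gs & 0x0F;
    int jt = gt >> 4, ft = gt & 0x0F;

    /* Four surrounding grid weights. */
    int v0  = js + jt * Wwidth;
    int p00 = fetch_weight_2d(grid, count, v0);
    int p01 = fetch_weight_2d(grid, count, v0 + 1);
    int p10 = fetch_weight_2d(grid, count, v0 + Wwidth);
    int p11 = fetch_weight_2d(grid, count, v0 + Wwidth + 1);

    /* Bilinear blend with factors that sum to 16. */
    int w11 = (fs * ft + 8) >> 4;
    int w10 = ft - w11;
    int w01 = fs - w11;
    int w00 = 16 - fs - ft + w11;
    return (p00 * w00 + p01 * w01 + p10 * w10 + p11 * w11 + 8) >> 4;
}
```

When the weight grid has the same dimensions as the block, every fractional
part is zero and the function simply returns the stored weight for that texel.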

For 3D, simplex interpolation is used as it is cheaper than a na&iuml;ve
trilinear interpolation. First, we pick some parameters for the interpolation
based on comparisons of the fractional parts of the texel
position as shown in <<astc-simplex-interpolation-parameters>>.

[[astc-simplex-interpolation-parameters]]
.ASTC simplex interpolation parameters
[options="header",width="90%",cols="1,1,1,2,2,1,1,1,1"]
|==================
^| _f~s~_ > _f~t~_ ^| _f~t~_ > _f~r~_ ^| _f~s~_ > _f~r~_  ^| _s_~1~ ^| _s_~2~ ^| _w_~0~    ^| _w_~1~    ^| _w_~2~    ^| _w_~3~
^| True   ^| True   ^| _True_ ^| 1   ^| W~width~   ^| 16 - _f~s~_  ^| _f~s~_ - _f~t~_  ^| _f~t~_ - _f~r~_  ^| _f~r~_
^| False  ^| _True_ ^| True   ^| W~width~   ^| 1   ^| 16 - _f~t~_  ^| _f~t~_ - _f~s~_  ^| _f~s~_ - _f~r~_ ^| _f~r~_
^| _True_ ^| False  ^| True   ^| 1   ^| W~width~ {times} W~height~ ^| 16 - _f_~s~  ^| _f~s~_ - _f~r~_ ^| _f~r~_ - _f~t~_ ^| _f~t~_
^| True   ^| _False_  ^| False  ^| W~width~ {times} W~height~ ^| 1   ^| 16 - _f~r~_  ^| _f~r~_ - _f~s~_  ^| _f~s~_ - _f~t~_  ^| _f~t~_
^| _False_  ^| True   ^| False  ^| W~width~   ^| W~width~ {times} W~height~ ^| 16 - _f~t~_  ^| _f~t~_ - _f~r~_  ^| _f~r~_ - _f~s~_  ^| _f~s~_
^| False  ^| False  ^| _False_  ^| W~width~ {times} W~height~ ^| W~width~   ^| 16 - _f~r~_  ^| _f~r~_ - _f~t~_  ^| _f~t~_ - _f~s~_ ^| _f~s~_
|==================

Italicized test results are implied by the others. The effective
weight _i_ is then calculated as:

-----
v0 = js + jt * Wwidth + jr * Wwidth * Wheight;
p0 = decode_weight( v0 );
p1 = decode_weight( v0 + s1 );
p2 = decode_weight( v0 + s1 + s2 );
p3 = decode_weight( v0 + Wwidth * Wheight + Wwidth + 1 );
i = (p0 * w0 + p1 * w1 + p2 * w2 + p3 * w3 + 8) >> 4;
-----
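
A possible C rendering of the simplex selection is sketched below.
The names `astc_simplex_weight` and `fetch_weight_3d` are hypothetical, the
grid is assumed to be an already unquantized array ordered with _s_ fastest
and _r_ slowest, and the integral and fractional coordinates are assumed to
have been computed as described earlier in this section.

```c
#include <stdint.h>

/* Stand-in for the stored-weight lookup; out-of-range indices return 0 and
 * are assumed to be multiplied only by a zero factor. */
static int fetch_weight_3d(const uint8_t *grid, int count, int n)
{
    return (n < count) ? grid[n] : 0;
}

int astc_simplex_weight(const uint8_t *grid,
                        int Wwidth, int Wheight, int Wdepth,
                        int js, int jt, int jr,
                        int fs, int ft, int fr)
{
    int N = Wwidth * Wheight;      /* stride of one step along r */
    int count = N * Wdepth;
    int s1, s2, w0, w1, w2, w3;

    /* Pack the three comparisons into one selector, in table column order. */
    int sel = ((fs > ft) << 2) | ((ft > fr) << 1) | (fs > fr);
    switch (sel) {
    case 7: s1 = 1;      s2 = Wwidth; w0 = 16 - fs; w1 = fs - ft; w2 = ft - fr; w3 = fr; break;
    case 3: s1 = Wwidth; s2 = 1;      w0 = 16 - ft; w1 = ft - fs; w2 = fs - fr; w3 = fr; break;
    case 5: s1 = 1;      s2 = N;      w0 = 16 - fs; w1 = fs - fr; w2 = fr - ft; w3 = ft; break;
    case 4: s1 = N;      s2 = 1;      w0 = 16 - fr; w1 = fr - fs; w2 = fs - ft; w3 = ft; break;
    case 2: s1 = Wwidth; s2 = N;      w0 = 16 - ft; w1 = ft - fr; w2 = fr - fs; w3 = fs; break;
    case 0: s1 = N;      s2 = Wwidth; w0 = 16 - fr; w1 = fr - ft; w2 = ft - fs; w3 = fs; break;
    default: return -1;  /* selectors 1 and 6 are contradictory and cannot occur */
    }

    int v0 = js + jt * Wwidth + jr * N;
    int p0 = fetch_weight_3d(grid, count, v0);
    int p1 = fetch_weight_3d(grid, count, v0 + s1);
    int p2 = fetch_weight_3d(grid, count, v0 + s1 + s2);
    int p3 = fetch_weight_3d(grid, count, v0 + N + Wwidth + 1);
    return (p0 * w0 + p1 * w1 + p2 * w2 + p3 * w3 + 8) >> 4;
}
```

In every row of the table the four factors sum to 16, so the result stays
within the 0..64 range of the stored weights.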

<<<

[[astc_weight_application]]
=== Weight Application

Once the effective weight _i_ for the texel has been calculated, the color
endpoints are interpolated and expanded.

==== _LDR endpoint modes_

For _LDR endpoint modes_, each color component C is calculated from the
corresponding 8-bit endpoint components C~0~ and C~1~ as follows:

If sRGB conversion is not enabled, C~0~ and C~1~ are first expanded
to 16 bits by bit replication:

-----
C0 = (C0 << 8) | C0;
C1 = (C1 << 8) | C1;
-----

If sRGB conversion is enabled, C~0~ and C~1~ are expanded to 16 bits
differently, as follows:

-----
C0 = (C0 << 8) | 0x80;
C1 = (C1 << 8) | 0x80;
-----

C~0~ and C~1~ are then interpolated to produce a UNORM16 result C:

-----
C = floor( (C0 * (64 - i) + C1 * i + 32) / 64 );
-----

If sRGB conversion is not enabled and the decode mode is ++decode_float16++,
then if C = 65535 the final result is 1.0 (0x3C00); otherwise C is
divided by 65536 and the infinite-precision result of the division is
converted to FP16 with round-to-zero semantics.

If sRGB conversion is not enabled and the decode mode is ++decode_unorm8++,
then the top 8 bits of the interpolation result for the _R_, _G_, _B_
and _A_ channels are used as the final result.
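
The expansion and interpolation of a single LDR component can be sketched as
follows. The function names are hypothetical; `c0` and `c1` are the 8-bit
endpoint components, `i` is the effective weight in the range 0..64, and
`srgb` is non-zero when sRGB conversion is enabled.

```c
#include <stdint.h>

/* Returns the UNORM16 interpolation result C for one component. */
uint16_t astc_ldr_interpolate(uint8_t c0, uint8_t c1, int i, int srgb)
{
    /* Expand to 16 bits: bit replication, or a fixed 0x80 low byte for sRGB. */
    uint32_t C0 = ((uint32_t)c0 << 8) | (srgb ? 0x80u : c0);
    uint32_t C1 = ((uint32_t)c1 << 8) | (srgb ? 0x80u : c1);

    /* Integer division by 64 implements the floor() in the text. */
    return (uint16_t)((C0 * (uint32_t)(64 - i) + C1 * (uint32_t)i + 32) / 64);
}

/* decode_unorm8 keeps the top 8 bits of the UNORM16 result. */
uint8_t astc_unorm16_top8(uint16_t C)
{
    return (uint8_t)(C >> 8);
}
```

With endpoints 0 and 255 and a weight of 32, this sketch yields 0x8000 in the
non-sRGB case, whose top 8 bits are 128.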

If sRGB conversion is not enabled and the decode mode is ++decode_rgb9e5++,
then the final result is a combination of the (UNORM16) values of C
for the three color components (_C~r~_, _C~g~_ and _C~b~_) computed as follows:

-----
int lz = clz17( Cr | Cg | Cb | 1 );
if (Cr == 65535) {
    Cr = 65536;
    lz = 0;
}
if (Cg == 65535) {
    Cg = 65536;
    lz = 0;
}
if (Cb == 65535) {
    Cb = 65536;
    lz = 0;
}
Cr <<= lz;
Cg <<= lz;
Cb <<= lz;
Cr = (Cr >> 8) & 0x1FF;
Cg = (Cg >> 8) & 0x1FF;
Cb = (Cb >> 8) & 0x1FF;
uint32_t exponent = 16 - lz;
uint32_t texel = (exponent << 27) | (Cb << 18) | (Cg << 9) | Cr;
-----

The ++clz17()++ function counts leading zeroes in a 17-bit value.
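
One way to write `clz17()` in portable C, together with the packing steps
wrapped in a function, is sketched below. The loop-based implementation and
the wrapper name `astc_ldr_to_rgb9e5` are choices made for this example;
a real decoder might use a compiler intrinsic instead.

```c
#include <stdint.h>

/* Count leading zeroes in a 17-bit value, where bit 16 is the most
 * significant bit. Returns 17 if the value is 0. */
int clz17(uint32_t v)
{
    int n = 0;
    for (uint32_t mask = 1u << 16; mask != 0 && (v & mask) == 0; mask >>= 1) {
        n++;
    }
    return n;
}

/* Pack three UNORM16 components into a shared-exponent texel. */
uint32_t astc_ldr_to_rgb9e5(uint32_t Cr, uint32_t Cg, uint32_t Cb)
{
    int lz = clz17(Cr | Cg | Cb | 1);

    /* 65535 represents exactly 1.0, so promote it to 65536 (bit 16 set). */
    if (Cr == 65535) { Cr = 65536; lz = 0; }
    if (Cg == 65535) { Cg = 65536; lz = 0; }
    if (Cb == 65535) { Cb = 65536; lz = 0; }

    /* Normalize so the largest component uses bit 16, then keep 9 bits. */
    Cr = ((Cr << lz) >> 8) & 0x1FF;
    Cg = ((Cg << lz) >> 8) & 0x1FF;
    Cb = ((Cb << lz) >> 8) & 0x1FF;

    uint32_t exponent = (uint32_t)(16 - lz);
    return (exponent << 27) | (Cb << 18) | (Cg << 9) | Cr;
}
```

An input of 65535 in all three components produces an exponent of 16 and
mantissas of 256, which the shared-exponent format interprets as 1.0.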

If sRGB conversion is enabled, then the decoding mode is ignored
and the top 8 bits of the interpolation result for the _R_, _G_ and _B_
channels are treated as an unsigned, normalized 8-bit value to be
converted with the <<TRANSFER_SRGB_EOTF,sRGB EOTF>>.
The top 8 bits of the _A_ channel are used directly as the final
unsigned, normalized 8-bit result.

==== _HDR endpoint modes_

For _HDR endpoint modes_, color values are represented in a 12-bit
pseudo-logarithmic representation, and interpolation occurs in a
piecewise-approximate logarithmic manner as follows:

In sRGB and Linear LDR operation modes, the error color is returned.

In _HDR endpoint modes_, the color components from each endpoint, C~0~ and C~1~, are
initially shifted left 4 bits to become 16-bit integer values and these
are interpolated in the same way as LDR. The 16-bit value C is then
decomposed into the top five bits, E, and the bottom 11 bits M, which
are then processed and recombined with E to form the final value C~f~:

-----
C = floor( (C0 * (64 - i) + C1 * i + 32) / 64 );
E = (C & 0xF800) >> 11;
M = C & 0x7FF;

if (M < 512) {
    Mt = 3 * M;
} else if (M >= 1536) {
    Mt = 5 * M - 2048;
} else {
    Mt = 4 * M - 512;
}

Cf = (E << 10) + (Mt >> 3);
-----

This interpolation is a considerably closer approximation to a
logarithmic space than simple 16-bit interpolation.

This final value C~f~ is interpreted as an IEEE FP16 value. If the result
is +Inf or NaN, it is converted to the bit pattern 0x7BFF, which is the
largest representable finite value.
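
The HDR path for one component, including the final clamp, can be sketched as
a single C function. The name `astc_hdr_interpolate` is hypothetical; `c0` and
`c1` are assumed to be the 12-bit endpoint components and `i` the effective
weight in the range 0..64.

```c
#include <stdint.h>

/* Returns the FP16 bit pattern Cf for one color component. */
uint16_t astc_hdr_interpolate(uint16_t c0, uint16_t c1, int i)
{
    /* Shift the 12-bit endpoints up to 16 bits and interpolate as for LDR. */
    uint32_t C0 = (uint32_t)(c0 & 0xFFF) << 4;
    uint32_t C1 = (uint32_t)(c1 & 0xFFF) << 4;
    uint32_t C  = (C0 * (uint32_t)(64 - i) + C1 * (uint32_t)i + 32) / 64;

    uint32_t E = (C & 0xF800) >> 11;   /* top five bits */
    uint32_t M = C & 0x7FF;            /* bottom 11 bits */
    uint32_t Mt;

    /* Piecewise-linear remapping of the mantissa. */
    if (M < 512) {
        Mt = 3 * M;
    } else if (M >= 1536) {
        Mt = 5 * M - 2048;
    } else {
        Mt = 4 * M - 512;
    }

    uint32_t Cf = (E << 10) + (Mt >> 3);

    /* An all-ones exponent means +Inf or NaN: clamp to the largest finite value. */
    if ((Cf & 0x7C00) == 0x7C00) {
        Cf = 0x7BFF;
    }
    return (uint16_t)Cf;
}
```

For example, two identical endpoints of 0x800 give C = 0x8000, E = 16 and
M = 0, so the result is 0x4000, which is 2.0 in FP16.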

<<<
If the decode mode is ++decode_rgb9e5++, then the final result is a
combination of the (IEEE FP16) values of C~f~ for the three color components
(_C~r~_, _C~g~_ and _C~b~_) computed as follows:

-----
if (Cr > 0x7c00) {
    Cr = 0;
} else if (Cr == 0x7c00) {
    Cr = 0x7bff;
}
if (Cg > 0x7c00) {
    Cg = 0;
} else if (Cg == 0x7c00) {
    Cg = 0x7bff;
}
if (Cb > 0x7c00) {
    Cb = 0;
} else if (Cb == 0x7c00) {
    Cb = 0x7bff;
}

int Re = (Cr >> 10) & 0x1F;
int Ge = (Cg >> 10) & 0x1F;
int Be = (Cb >> 10) & 0x1F;
int Rex = Re == 0 ? 1 : Re;
int Gex = Ge == 0 ? 1 : Ge;
int Bex = Be == 0 ? 1 : Be;
int Xm = ((Cr | Cg | Cb) & 0x200) >> 9;
int Xe = Re | Ge | Be;
uint32_t rshift, gshift, bshift, expo;

if (Xe == 0) {
    expo = rshift = gshift = bshift = Xm;
} else if (Re >= Ge && Re >= Be) {
    expo = Rex + 1;
    rshift = 2;
    gshift = Rex - Gex + 2;
    bshift = Rex - Bex + 2;
} else if (Ge >= Be) {
    expo = Gex + 1;
    rshift = Gex - Rex + 2;
    gshift = 2;
    bshift = Gex - Bex + 2;
} else {
    expo = Bex + 1;
    rshift = Bex - Rex + 2;
    gshift = Bex - Gex + 2;
    bshift = 2;
}

int Rm = (Cr & 0x3FF) | (Re == 0 ? 0 : 0x400);
int Gm = (Cg & 0x3FF) | (Ge == 0 ? 0 : 0x400);
int Bm = (Cb & 0x3FF) | (Be == 0 ? 0 : 0x400);
Rm = (Rm >> rshift) & 0x1FF;
Gm = (Gm >> gshift) & 0x1FF;
Bm = (Bm >> bshift) & 0x1FF;

uint32_t texel = (expo << 27) | (Bm << 18) | (Gm << 9) | (Rm << 0);
-----

<<<

=== Dual-Plane Decoding

If dual-plane mode is disabled, all of the endpoint components are interpolated
using the same weight value.

If dual-plane mode is enabled, two weights are stored with each texel.
One component is then selected to use the second weight for interpolation,
instead of the first weight. The first weight is then used for all other
components.

The component to treat specially is indicated using the 2-bit _Color
Component Selector_ (_CCS_) field as shown in <<astc-dual-plane-color-component-selector-values>>.

[[astc-dual-plane-color-component-selector-values]]
.ASTC dual plane _color component selector_ values
[options="header",width="40%"]
|==========
^| Value  ^| Weight 0 ^| Weight 1
^| 0      ^|   _GBA_  ^|    _R_
^| 1      ^|   _RBA_  ^|    _G_
^| 2      ^|   _RGA_  ^|    _B_
^| 3      ^|   _RGB_  ^|    _A_
|==========

The _CCS_ bits are stored at a variable position directly below the weight
bits and any additional _CEM_ bits.
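
The selection rule amounts to a single comparison, as in the following sketch.
The function name and the component numbering (0 = _R_, 1 = _G_, 2 = _B_,
3 = _A_, chosen to line up with the _CCS_ values in the table) are assumptions
made for this example.

```c
/* Pick the effective weight used to interpolate one color component.
 *   dual_plane : non-zero if dual-plane mode is enabled
 *   ccs        : 2-bit Color Component Selector (0..3)
 *   component  : 0 = R, 1 = G, 2 = B, 3 = A (illustrative numbering)
 *   weight0/1  : effective weights from the first and second plane */
int astc_component_weight(int dual_plane, int ccs, int component,
                          int weight0, int weight1)
{
    if (dual_plane && component == ccs) {
        return weight1;
    }
    return weight0;
}
```

A common use of this mode is a _CCS_ of 3, where _A_ is interpolated with its
own weight while _R_, _G_ and _B_ share the first one.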

=== Partition Pattern Generation

When multiple partitions are active, each texel position is assigned a
_partition index_. This _partition index_ is calculated using a seed (the
partition pattern index), the texel's _x_, _y_, _z_ position within
the block, and the number of partitions.
An additional argument, `small_block`, is set to 1 if the number of
texels in the block is less than 31, otherwise it is set to 0.

This function is specified in terms of _x_, _y_ and _z_ in order to
support 3D textures.
For 2D textures and texture slices, _z_ will always be 0.

The full partition selection algorithm is as follows:

<<<
-----
int select_partition( int seed, int x, int y, int z,
                      int partitioncount, int small_block )
{
    if (small_block) {
        x <<= 1;
        y <<= 1;
        z <<= 1;
    }
    seed += (partitioncount - 1) * 1024;
    uint32_t rnum = hash52( seed );
    uint8_t seed1  =  rnum        & 0xF;
    uint8_t seed2  = (rnum >>  4) & 0xF;
    uint8_t seed3  = (rnum >>  8) & 0xF;
    uint8_t seed4  = (rnum >> 12) & 0xF;
    uint8_t seed5  = (rnum >> 16) & 0xF;
    uint8_t seed6  = (rnum >> 20) & 0xF;
    uint8_t seed7  = (rnum >> 24) & 0xF;
    uint8_t seed8  = (rnum >> 28) & 0xF;
    uint8_t seed9  = (rnum >> 18) & 0xF;
    uint8_t seed10 = (rnum >> 22) & 0xF;
    uint8_t seed11 = (rnum >> 26) & 0xF;
    uint8_t seed12 = ((rnum >> 30) | (rnum << 2)) & 0xF;

    seed1  *= seed1;    seed2  *= seed2;
    seed3  *= seed3;    seed4  *= seed4;
    seed5  *= seed5;    seed6  *= seed6;
    seed7  *= seed7;    seed8  *= seed8;
    seed9  *= seed9;    seed10 *= seed10;
    seed11 *= seed11;   seed12 *= seed12;

    int sh1, sh2, sh3;
    if (seed & 1) {
        sh1 = (seed & 2 ? 4 : 5);
        sh2 = (partitioncount == 3 ? 6 : 5);
    } else {
        sh1 = (partitioncount == 3 ? 6 : 5);
        sh2 = (seed & 2 ? 4 : 5);
    }
    sh3 = (seed & 0x10) ? sh1 : sh2;

    seed1 >>= sh1; seed2  >>= sh2; seed3  >>= sh1; seed4  >>= sh2;
    seed5 >>= sh1; seed6  >>= sh2; seed7  >>= sh1; seed8  >>= sh2;
    seed9 >>= sh3; seed10 >>= sh3; seed11 >>= sh3; seed12 >>= sh3;

    int a = seed1 * x + seed2 * y + seed11 * z + (rnum >> 14);
    int b = seed3 * x + seed4 * y + seed12 * z + (rnum >> 10);
    int c = seed5 * x + seed6 * y + seed9  * z + (rnum >>  6);
    int d = seed7 * x + seed8 * y + seed10 * z + (rnum >>  2);

    a &= 0x3F; b &= 0x3F; c &= 0x3F; d &= 0x3F;

    if (partitioncount < 4) { d = 0; }
    if (partitioncount < 3) { c = 0; }

    if (a >= b && a >= c && a >= d) { return 0; }
    else if (b >= c && b >= d) { return 1; }
    else if (c >= d) { return 2; }
    else { return 3; }
}
-----

As has been observed before, the bit selections are much easier to
express in hardware than in C.

<<<

The seed is expanded using a hash function `hash52()`, which is defined
as follows:

-----
uint32_t hash52( uint32_t p )
{
    p ^= p >> 15;  p -= p << 17;  p += p << 7; p += p <<  4;
    p ^= p >>  5;  p += p << 16;  p ^= p >> 7; p ^= p >> 3;
    p ^= p <<  6;  p ^= p >> 17;
    return p;
}
-----

This assumes that all operations act on unsigned 32-bit values.

=== Data Size Determination

The size of the data used to represent color endpoints is not
explicitly specified. Instead, it is determined from the _block mode_ and
number of partitions as follows:

-----
config_bits = 17;
if (num_partitions > 1) {
    if (single_CEM) { config_bits = 29; }
    else { config_bits = 25 + 3 * num_partitions; }
}

num_weights = Wwidth * Wheight * Wdepth; // size of weight grid

if (dual_plane) {
    config_bits += 2;
    num_weights *= 2;
}

weight_bits = ceil( num_weights * 8 * trits_in_weight_range / 5 ) +
              ceil( num_weights * 7 * quints_in_weight_range / 3 ) +
              num_weights * bits_in_weight_range;

remaining_bits = 128 - config_bits - weight_bits;

num_CEM_pairs = base_CEM_class + 1 + count_bits( extra_CEM_bits );
-----

The _CEM_ value range is then looked up from a table indexed by remaining
bits and `num_CEM_pairs`.
This table is initialized such that the range is as large as possible,
consistent with the constraint that the number of bits required to
encode `num_CEM_pairs` pairs of values is not more than the number of
remaining bits.

An equivalent iterative algorithm would be:

-----
num_CEM_values = num_CEM_pairs*2;

for(range = each possible CEM range in descending order of size)
{
    CEM_bits = ceil( num_CEM_values * 8 * trits_in_CEM_range / 5 ) +
               ceil( num_CEM_values * 7 * quints_in_CEM_range / 3 ) +
               num_CEM_values * bits_in_CEM_range;

    if (CEM_bits <= remaining_bits) { break; }
}
return range;
-----

In cases where this procedure results in unallocated bits, these bits
are not read by the decoding process and can have any value.
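
The iterative search can be made concrete as in the sketch below.
The list of candidate ranges is an assumption of this example: it is taken to
be the set of color endpoint ranges from 0..5 up to 0..255 defined by the
integer sequence encoding elsewhere in this chapter, and should be checked
against that definition. The names `ise_bits`, `cem_ranges` and
`astc_cem_range` are hypothetical.

```c
#include <stddef.h>

/* Bits needed to store n values using the given number of trits, quints
 * and plain bits per value. The integer expressions implement ceil(). */
static int ise_bits(int n, int trits, int quints, int bits)
{
    return (n * 8 * trits + 4) / 5 + (n * 7 * quints + 2) / 3 + n * bits;
}

struct cem_range { int max_value, trits, quints, bits; };

/* Assumed candidate ranges, in descending order of size. */
static const struct cem_range cem_ranges[] = {
    {255, 0, 0, 8}, {191, 1, 0, 6}, {159, 0, 1, 5}, {127, 0, 0, 7},
    { 95, 1, 0, 5}, { 79, 0, 1, 4}, { 63, 0, 0, 6}, { 47, 1, 0, 4},
    { 39, 0, 1, 3}, { 31, 0, 0, 5}, { 23, 1, 0, 3}, { 19, 0, 1, 2},
    { 15, 0, 0, 4}, { 11, 1, 0, 2}, {  9, 0, 1, 1}, {  7, 0, 0, 3},
    {  5, 1, 0, 1}
};

/* Returns the largest value of the chosen range (for example 255 for
 * 0..255), or -1 if not even the smallest range fits. */
int astc_cem_range(int num_CEM_pairs, int remaining_bits)
{
    int num_CEM_values = num_CEM_pairs * 2;
    for (size_t k = 0; k < sizeof cem_ranges / sizeof cem_ranges[0]; k++) {
        const struct cem_range *r = &cem_ranges[k];
        if (ise_bits(num_CEM_values, r->trits, r->quints, r->bits) <= remaining_bits) {
            return r->max_value;
        }
    }
    return -1;
}
```

With 4 pairs (8 values) and 40 remaining bits, the first range that fits is
0..31 at 5 bits per value, so this sketch returns 31.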

[[astc-void-extent-blocks]]
=== Void-Extent Blocks

A void-extent block is a block encoded with a single color.
It also specifies some additional information about the extent of the
single-color area beyond this block, which can optionally be used by a
decoder to reduce or prevent redundant block fetches.
Void-extent blocks must be supported, but decoders are not required to
check the void extents.

In the HDR case, if the decode mode is ++decode_rgb9e5++, then any negative
color component values are set to 0 before conversion to the shared
exponent format (as described in <<astc_weight_application>>).

The layout of a 2D void-extent block is as shown in <<astc-2d-void-extent-block-layout-overview>>.

[[astc-2d-void-extent-block-layout-overview]]
.ASTC 2D void-extent block layout overview
[cols="1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1",width="97%"]
|=================
^| ~127~ ^| ~126~ ^| ~125~ ^| ~124~ ^| ~123~ ^| ~122~ ^| ~121~ ^| ~120~ ^| ~119~ ^| ~118~ ^| ~117~ ^| ~116~ ^| ~115~ ^| ~114~ ^| ~113~ ^| ~112~
16+^|                 Block color _A_ component^15..0^
^| ~111~ ^| ~110~ ^| ~109~ ^| ~108~ ^| ~107~ ^| ~106~ ^| ~105~ ^| ~104~ ^| ~103~ ^| ~102~ ^| ~101~ ^| ~100~ ^| ~99~ ^| ~98~ ^| ~97~ ^| ~96~
16+^|                 Block color _B_ component^15..0^
^| ~95~ ^| ~94~ ^| ~93~ ^| ~92~ ^| ~91~ ^| ~90~ ^| ~89~ ^| ~88~ ^| ~87~ ^| ~86~ ^| ~85~ ^| ~84~ ^| ~83~ ^| ~82~ ^| ~81~ ^| ~80~
16+^|                 Block color _G_ component^15..0^
^| ~79~ ^| ~78~ ^| ~77~ ^| ~76~ ^| ~75~ ^| ~74~ ^| ~73~ ^| ~72~ ^| ~71~ ^| ~70~ ^| ~69~ ^| ~68~ ^| ~67~ ^| ~66~ ^| ~65~ ^| ~64~
16+^|                 Block color _R_ component^15..0^
^| ~63~ ^| ~62~ ^| ~61~ ^| ~60~ ^| ~59~ ^| ~58~ ^| ~57~ ^| ~56~ ^| ~55~ ^| ~54~ ^| ~53~ ^| ~52~ ^| ~51~ ^| ~50~ ^| ~49~ ^| ~48~
13+^|    Void-extent maximum _t_ coordinate^12..0^              3+^|    Min _t_ coord^12..10^
^| ~47~ ^| ~46~ ^| ~45~ ^| ~44~ ^| ~43~ ^| ~42~ ^| ~41~ ^| ~40~ ^| ~39~ ^| ~38~ ^| ~37~ ^| ~36~ ^| ~35~ ^| ~34~ ^| ~33~ ^| ~32~
10+^| Void-extent minimum _t_ coordinate^9..0^       6+^|   Void-extent maximum _s_ coordinate^12..7^
^| ~31~ ^| ~30~ ^| ~29~ ^| ~28~ ^| ~27~ ^| ~26~ ^| ~25~ ^| ~24~ ^| ~23~ ^| ~22~ ^| ~21~ ^| ~20~ ^| ~19~ ^| ~18~ ^| ~17~ ^| ~16~
7+^| Void-extent maximum _s_ coordinate^6..0^    9+^|  Void-extent minimum _s_ coordinate^12..4^
^| ~15~ ^| ~14~ ^| ~13~ ^| ~12~ ^| ~11~ ^| ~10~ ^| ~9~ ^| ~8~ ^| ~7~ ^| ~6~ ^| ~5~ ^| ~4~ ^| ~3~ ^| ~2~ ^| ~1~ ^| ~0~
4+^| Minimum _s_ coordinate^3..0^ ^| 1 ^| 1 ^| D ^| 1 ^| 1 ^| 1 ^| 1 ^| 1 ^| 1 ^| 1 ^| 0 ^| 0
|=================

The layout of a 3D void-extent block is as shown in <<astc-3d-void-extent-block-layout-overview>>.

[[astc-3d-void-extent-block-layout-overview]]
.ASTC 3D void-extent block layout overview
[cols="1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1",width="97%"]
|=================
^| ~127~ ^| ~126~ ^| ~125~ ^| ~124~ ^| ~123~ ^| ~122~ ^| ~121~ ^| ~120~ ^| ~119~ ^| ~118~ ^| ~117~ ^| ~116~ ^| ~115~ ^| ~114~ ^| ~113~ ^| ~112~
16+^|                 Block color _A_ component^15..0^
^| ~111~ ^| ~110~ ^| ~109~ ^| ~108~ ^| ~107~ ^| ~106~ ^| ~105~ ^| ~104~ ^| ~103~ ^| ~102~ ^| ~101~ ^| ~100~ ^| ~99~ ^| ~98~ ^| ~97~ ^| ~96~
16+^|                 Block color _B_ component^15..0^
^| ~95~ ^| ~94~ ^| ~93~ ^| ~92~ ^| ~91~ ^| ~90~ ^| ~89~ ^| ~88~ ^| ~87~ ^| ~86~ ^| ~85~ ^| ~84~ ^| ~83~ ^| ~82~ ^| ~81~ ^| ~80~
16+^|                 Block color _G_ component^15..0^
^| ~79~ ^| ~78~ ^| ~77~ ^| ~76~ ^| ~75~ ^| ~74~ ^| ~73~ ^| ~72~ ^| ~71~ ^| ~70~ ^| ~69~ ^| ~68~ ^| ~67~ ^| ~66~ ^| ~65~ ^| ~64~
16+^|                 Block color _R_ component^15..0^
^| ~63~ ^| ~62~ ^| ~61~ ^| ~60~ ^| ~59~ ^| ~58~ ^| ~57~ ^| ~56~ ^| ~55~ ^| ~54~ ^| ~53~ ^| ~52~ ^| ~51~ ^| ~50~ ^| ~49~ ^| ~48~
9+^|   Void-extent maximum _r_ coordinate^8..0^   7+^|    Void-extent minimum _r_ coordinate^8..2^
^| ~47~ ^| ~46~ ^| ~45~ ^| ~44~ ^| ~43~ ^| ~42~ ^| ~41~ ^| ~40~ ^| ~39~ ^| ~38~ ^| ~37~ ^| ~36~ ^| ~35~ ^| ~34~ ^| ~33~ ^| ~32~
2+^| Min _r_ coord^1..0^ 9+^|   Void-extent maximum _t_ coordinate^8..0^    5+^|  Void-extent min _t_ coordinate^8..4^
^| ~31~ ^| ~30~ ^| ~29~ ^| ~28~ ^| ~27~ ^| ~26~ ^| ~25~ ^| ~24~ ^| ~23~ ^| ~22~ ^| ~21~ ^| ~20~ ^| ~19~ ^| ~18~ ^| ~17~ ^| ~16~
4+^| Minimum _t_ coordinate^3..0^   9+^|  Void-extent maximum _s_ coordinate^8..0^ 3+^| Min _s_ coord^8..6^
^| ~15~ ^| ~14~ ^| ~13~ ^| ~12~ ^| ~11~ ^| ~10~ ^| ~9~ ^| ~8~ ^| ~7~ ^| ~6~ ^| ~5~ ^| ~4~ ^| ~3~ ^| ~2~ ^| ~1~ ^| ~0~
6+^| Void-extent minimum _s_ coordinate^5..0^       ^| D ^| 1 ^| 1 ^| 1 ^| 1 ^| 1 ^| 1 ^| 1 ^| 0 ^| 0
|=================

Bit 9 is the Dynamic Range flag, which indicates the format in which
colors are stored. A 0 value indicates LDR, and a 1 indicates HDR.

For LDR encodings the color components are stored in the same UNORM16
format as the output of the interpolator.
The values stored here directly provide the LDR values of C defined in
<<astc_weight_application>>.
These values follow the same post-interpolator path as non-void-extent
LDR blocks, with regard to handling of sRGB conversion and decoding
modes.

For HDR encodings the color components are stored as FP16 values.
The values stored here directly provide the HDR values of C~f~ defined
in <<astc_weight_application>>.
These values follow the same post-interpolator path as non-void-extent
HDR blocks, with regard to handling of decoding modes.

If a void-extent block with HDR values is decoded in LDR operation
mode, then the result will be the error color, opaque magenta, for
all texels within the block.

In the HDR case, if the color component values are infinity or NaN, this
will result in undefined behavior. As usual, this must not lead to an
API's interruption or termination.

Bits 10 and 11 are reserved and must be 1.

The minimum and maximum coordinate values are treated as unsigned
integers and then normalized into the range 0..1 (by dividing by 2^13^-1
or 2^9^-1, for 2D and 3D respectively). The maximum values for each
dimension must be greater than the corresponding minimum values,
unless they are all all-1s.

If all the coordinates are all-1s, then the void extent is ignored,
and the block is simply a constant-color block.
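
The field extraction and coordinate checks for a 2D void-extent block can be
sketched as follows. This example assumes that `lo` holds bits 63..0 of the
128-bit block with bit 0 in the least significant position; the function,
structure and enumerator names are hypothetical.

```c
#include <stdint.h>

enum {
    VE_NOT_VOID_EXTENT = 0,  /* bits 8..0 do not match the void-extent pattern */
    VE_ILLEGAL,              /* reserved bits or coordinates are invalid */
    VE_CONSTANT_COLOR,       /* all coordinates are all-1s: extent ignored */
    VE_VALID_EXTENT          /* coordinates describe a usable extent */
};

struct ve2d {
    unsigned min_s, max_s, min_t, max_t;  /* 13-bit unsigned coordinates */
    int hdr;                              /* Dynamic Range flag (bit 9) */
};

int astc_parse_void_extent_2d(uint64_t lo, struct ve2d *out)
{
    if ((lo & 0x1FF) != 0x1FC) return VE_NOT_VOID_EXTENT;
    if (((lo >> 10) & 0x3) != 0x3) return VE_ILLEGAL;  /* bits 10 and 11 must be 1 */

    out->hdr   = (int)((lo >> 9) & 1);
    out->min_s = (unsigned)((lo >> 12) & 0x1FFF);
    out->max_s = (unsigned)((lo >> 25) & 0x1FFF);
    out->min_t = (unsigned)((lo >> 38) & 0x1FFF);
    out->max_t = (unsigned)((lo >> 51) & 0x1FFF);

    if (out->min_s == 0x1FFF && out->max_s == 0x1FFF &&
        out->min_t == 0x1FFF && out->max_t == 0x1FFF) {
        return VE_CONSTANT_COLOR;
    }
    if (out->min_s >= out->max_s || out->min_t >= out->max_t) {
        return VE_ILLEGAL;
    }
    return VE_VALID_EXTENT;
}
```

A decoder that chooses to use the extent would then divide each coordinate by
2^13^-1 to obtain the normalized range.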

The existence of single-color blocks with void extents must not produce
results different from those obtained if these single-color blocks are
defined without void-extents. Any situation in which the results would
differ is invalid. Results from invalid void extents are undefined.

If a void-extent appears in a MIPmap level other than the most detailed
one, then the extent will apply to all of the more detailed levels too.
This allows decoders to avoid sampling more detailed MIPmaps.

If the more detailed MIPmap level is not a constant color in this region,
then the block may be marked as constant color, but without a void extent,
as detailed above.

If a void-extent extends to the edge of a texture, then filtered texture
colors may not be the same color as that specified in the block, due to
texture border colors, wrapping, or cube face wrapping.

Care must be taken when updating or extracting partial image data that
void-extents in the image do not become invalid.

<<<

=== Illegal Encodings

In ASTC, there is a variety of ways to encode an illegal block. Decoders
are required to recognize all illegal blocks and emit the standard error
color value upon encountering an illegal block.

Here is a comprehensive list of situations that represent illegal block
encodings:

*   The _block mode_ specified is one of the block modes explicitly
    listed as Reserved.
*   A 2D void-extent block that has any of the reserved bits not
    set to 1.
*   A _block mode_ has been specified that would require more than
    64 weights total.
*   A _block mode_ has been specified that would require more than
    96 bits for integer sequence encoding of the weight grid.
*   A _block mode_ has been specified that would require fewer than
    24 bits for integer sequence encoding of the weight grid.
*   The size of the weight grid exceeds the size of the block footprint
    in any dimension.
*   _Color endpoint modes_ have been specified such that the color
    integer sequence encoding would require more than 18 integers.
*   The number of bits available for color endpoint encoding after all
    the other fields have been counted is less than
    latexmath:[$\left\lceil{13\times C\over 5}\right\rceil$] where _C_
    is the number of color endpoint integers (this would restrict color
    integers to a range smaller than 0..5, which is not supported).
*   Dual-plane mode is enabled for a block with 4 partitions.
*   Void-extent blocks where the low coordinate for some texture axis
    is greater than or equal to the high coordinate.
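
The weight-grid rules from this list can be checked with a few integer
comparisons, as in the sketch below. It covers only the weight count, weight
bit count and grid size conditions, not the full list; the function name and
the way the weight range is passed in (`trits`, `quints`, `bits` per weight)
are assumptions made for this example.

```c
/* Returns 1 if the weight grid passes the checks below, 0 otherwise.
 *   Wwidth, Wheight, Wdepth : weight grid size (Wdepth = 1 for 2D)
 *   Bs, Bt, Br              : block footprint (Br = 1 for 2D)
 *   dual_plane              : non-zero if two weights are stored per grid point
 *   trits, quints, bits     : composition of the weight range */
int astc_weight_grid_is_legal(int Wwidth, int Wheight, int Wdepth,
                              int Bs, int Bt, int Br, int dual_plane,
                              int trits, int quints, int bits)
{
    /* The grid must not exceed the block footprint in any dimension. */
    if (Wwidth > Bs || Wheight > Bt || Wdepth > Br) return 0;

    int num_weights = Wwidth * Wheight * Wdepth;
    if (dual_plane) num_weights *= 2;
    if (num_weights > 64) return 0;

    /* Integer forms of the ceil() terms used for weight_bits. */
    int weight_bits = (num_weights * 8 * trits + 4) / 5 +
                      (num_weights * 7 * quints + 2) / 3 +
                      num_weights * bits;
    if (weight_bits > 96 || weight_bits < 24) return 0;

    return 1;
}
```

For example, a 4 {times} 4 grid of 2-bit weights on a 4 {times} 4 block needs
32 bits and passes, whereas an 8 {times} 8 grid of 2-bit weights needs 128
bits and is rejected.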

Note also that, in LDR operation mode, a block which has both HDR and
LDR _endpoint modes_ assigned to different partitions is not an error block.
Only those texels which belong to the HDR partition will result in the
error color.
Texels belonging to an LDR partition will be decoded as normal.

[[astc_ldr_profile]]
=== LDR Profile Support

In order to ease verification and accelerate adoption, an LDR-only
subset of the full ASTC specification has been made available.

Implementations of this LDR Profile must satisfy the following requirements:

*   All textures with valid encodings for LDR Profile must decode
    identically using either an LDR Profile, HDR Profile, or Full Profile
    decoder.
*   All features included only in the HDR Profile or Full Profile must be
    treated as reserved in the LDR Profile, and return the error color on
    decoding.
*   Any sequence of API calls valid for the LDR Profile must also be valid
    for the HDR Profile or Full Profile and return identical results when
    given a texture encoded for the LDR Profile.

The feature subset for the LDR profile is:

*   2D textures only.
*   Only those block sizes listed in <<astc_footprint_2D>> are supported.
*   sRGB and linear LDR operation modes only.
*   Only _LDR endpoint modes_ must be supported, namely modes
    0, 1, 4, 5, 6, 8, 9, 10, 12, 13.
*   Decoding from a HDR endpoint results in the error color.
*   LDR, 2D void-extent blocks must be supported.

[[astc_hdr_profile]]
=== HDR Profile Support

In order to ease verification and accelerate adoption, a second subset
of the full ASTC specification has been made available, known as the
HDR profile.

Implementations of the HDR Profile must satisfy the following requirements:

*   The HDR profile is a superset of the LDR profile and therefore all
    valid LDR encodings must decode identically using a HDR profile
    decoder.
*   All textures with valid encodings for HDR Profile must decode
    identically using either a HDR Profile or Full Profile decoder.
*   All features included only in the Full Profile must be treated as
    reserved in the HDR Profile, and return the error color on decoding.
*   Any sequence of API calls valid for the HDR Profile must also be valid
    for the Full Profile and return identical results when given a texture
    encoded for the HDR Profile.

The feature subset for the HDR profile is:

*   2D textures only.
*   Only those block sizes listed in <<astc_footprint_2D>> are supported.
*   All _endpoint modes_ must be supported.
*   2D void-extent blocks must be supported.