Metadata-Version: 2.4
Name: gfloat
Version: 0.5.2
Summary: Generic floating point handling in Python
Author-email: Andrew Fitzgibbon <awf@fitzgibbon.ie>
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Development Status :: 3 - Alpha
Requires-Python: >=3.8.1
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy
Requires-Dist: more_itertools
Requires-Dist: array-api-compat
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: nbval; extra == "dev"
Requires-Dist: ml_dtypes; extra == "dev"
Requires-Dist: jaxlib; extra == "dev"
Requires-Dist: jax; extra == "dev"
Requires-Dist: torch; extra == "dev"
Requires-Dist: array-api-strict; extra == "dev"
Requires-Dist: torchao==0.10; extra == "dev"
Requires-Dist: airium; extra == "dev"
Requires-Dist: pandas; extra == "dev"
Requires-Dist: matplotlib; extra == "dev"
Requires-Dist: pre-commit; extra == "dev"
Requires-Dist: black; extra == "dev"
Requires-Dist: mypy; extra == "dev"
Requires-Dist: black[jupyter]; extra == "dev"
Requires-Dist: isort; extra == "dev"
Requires-Dist: sphinx==7.1.2; extra == "dev"
Requires-Dist: sphinx-rtd-theme==1.3.0rc1; extra == "dev"
Requires-Dist: sphinx_paramlinks; extra == "dev"
Requires-Dist: myst_nb; extra == "dev"
Dynamic: license-file

<!-- Copyright (c) 2024 Graphcore Ltd. All rights reserved. -->

# gfloat: Generic floating-point types in Python

An implementation of generic floating point encode/decode logic,
handling various current and proposed floating point types:

 - [IEEE 754](https://en.wikipedia.org/wiki/IEEE_754): Binary16, Binary32
 - [OCP Float8](https://www.opencompute.org/documents/ocp-8-bit-floating-point-specification-ofp8-revision-1-0-2023-06-20-pdf): E5M2, E4M3
 - [IEEE WG P3109](https://github.com/P3109/Public/blob/main/Shared%20Reports/IEEE%20WG%20P3109%20Interim%20Report.pdf): P3109_{K}p{P} for K > 2, and 1 <= P < K.
 - [OCP MX Formats](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf): E2M1, M2M3, E3M2, E8M0, INT8, and the MX block formats.

The library favours readability and extensibility over speed (although the *_ndarray functions are reasonably fast for large arrays, see the [benchmarking  notebook](docs/source/04-benchmark.ipynb)).
For other implementations of these datatypes more focused on speed see, for example, [ml_dtypes](https://github.com/jax-ml/ml_dtypes),
[bitstring](https://github.com/scott-griffiths/bitstring),
[MX PyTorch Emulation Library](https://github.com/microsoft/microxcaling).

See https://gfloat.readthedocs.io for documentation, or dive into the notebooks to explore the formats.

For example, here's a table from the [02-value-stats](docs/source/02-value-stats.ipynb) notebook:

|name|B: Bits in the format|P: Precision in bits|E: Exponent field width in bits|Exact in float16?|Exact in float32?|0<x<1|1<x<Inf|minSubnormal|maxSubnormal|minNormal|maxNormal| |
|--- |--- |--- |--- |--- |--- |--- |--- |--- |--- |--- |--- |--- |
| name         |   B |   P |   E | rt16   | rt32   |   lt1 |   gt1 | minSubnormal   | maxSubnormal   | minNormal   | maxNormal     |
|--------------|-----|-----|-----|--------|--------|-------|-------|----------------|----------------|-------------|---------------|
| p3109_k3p2sf |   3 |   2 |   1 | True   | True   |     1 |     1 | 0.5            | 0.5            | 1           | 1.5           |
| ocp_e2m1     |   4 |   2 |   2 | True   | True   |     1 |     5 | 0.5            | 0.5            | 1           | 6             |
| p3109_k4p2sf |   4 |   2 |   2 | True   | True   |     3 |     3 | 0.25           | 0.25           | 0.5         | 3             |
| ocp_e2m3     |   6 |   4 |   2 | True   | True   |     7 |    23 | 0.125          | 0.875          | 1           | 7.5           |
| ocp_e3m2     |   6 |   3 |   3 | True   | True   |    11 |    19 | 0.0625         | 0.1875         | 0.25        | 28            |
| p3109_k6p3sf |   6 |   3 |   3 | True   | True   |    15 |    15 | 0.03125        | 0.09375        | 0.125       | 14            |
| p3109_k6p4sf |   6 |   4 |   2 | True   | True   |    15 |    15 | 0.0625         | 0.4375         | 0.5         | 3.75          |
| ocp_e4m3     |   8 |   4 |   4 | True   | True   |    55 |    70 | 2^-9           | 7/4*2^-7       | 0.015625    | 448           |
| ocp_e5m2     |   8 |   3 |   5 | True   | True   |    59 |    63 | 2^-16          | 3/2*2^-15      | 2^-14       | 57344         |
| p3109_k8p1se |   8 |   1 |   7 | False  | True   |    63 |    62 | n/a            | n/a            | 2^-63       | 2^62          |
| p3109_k8p1ue |   8 |   1 |   8 | False  | True   |   127 |   125 | n/a            | n/a            | 2^-127      | 2^125         |
| p3109_k8p3se |   8 |   3 |   5 | True   | True   |    63 |    62 | 2^-17          | 3/2*2^-16      | 2^-15       | 49152         |
| p3109_k8p3sf |   8 |   3 |   5 | True   | True   |    63 |    63 | 2^-17          | 3/2*2^-16      | 2^-15       | 57344         |
| p3109_k8p3ue |   8 |   3 |   6 | False  | True   |   127 |   125 | 2^-33          | 3/2*2^-32      | 2^-31       | 5/4*2^31      |
| p3109_k8p3uf |   8 |   3 |   6 | False  | True   |   127 |   126 | 2^-33          | 3/2*2^-32      | 2^-31       | 3/2*2^31      |
| p3109_k8p4se |   8 |   4 |   4 | True   | True   |    63 |    62 | 2^-10          | 7/4*2^-8       | 0.0078125   | 224           |
| p3109_k8p4sf |   8 |   4 |   4 | True   | True   |    63 |    63 | 2^-10          | 7/4*2^-8       | 0.0078125   | 240           |
| p3109_k8p4ue |   8 |   4 |   5 | True   | True   |   127 |   125 | 2^-18          | 7/4*2^-16      | 2^-15       | 53248         |
| p3109_k8p4uf |   8 |   4 |   5 | True   | True   |   127 |   126 | 2^-18          | 7/4*2^-16      | 2^-15       | 57344         |
| p3109_k8p7sf |   8 |   7 |   1 | True   | True   |    63 |    63 | 0.015625       | 63/32*2^-1     | 1           | 127/64*2^0    |
| p3109_k8p8uf |   8 |   8 |   1 | True   | True   |   127 |   126 | 0.0078125      | 127/64*2^-1    | 1           | 127/64*2^0    |
| binary16     |  16 |  11 |   5 | True   | True   | 15359 | 16383 | 2^-24          | 1023/512*2^-15 | 2^-14       | 65504         |
| bfloat16     |  16 |   8 |   8 | False  | True   | 16255 | 16383 | 2^-133         | 127/64*2^-127  | 2^-126      | 255/128*2^127 |
| ocp_e8m0     |   8 |   1 |   8 | False  | True   |   127 |   127 | n/a            | n/a            | 2^-127      | 2^127         |
| ocp_int8     |   8 |   8 |   0 | True   | True   |    63 |    63 | 0.015625       | 127/64*2^0     | n/a         | n/a           |
#### Notes

All NaNs are the same, with no distinction between signalling or quiet,
or between differently encoded NaNs.
