Eric Radman : a Journal

Size and Units

Historically, data size was interpreted as a power of 2 or a power of 10 depending on context. For example, 4 KB of storage or memory is 4096 bytes, whereas a bandwidth of 4 Kbps means 4000 bits/second.

A new set of difficult-to-pronounce terms (the IEC binary prefixes) has been devised to remove this ambiguity:

kibibyte (KiB)
mebibyte (MiB)
gibibyte (GiB)
tebibyte (TiB)
pebibyte (PiB)
exbibyte (EiB)

Charts

Most charting libraries are based on D3, which abbreviates numbers in powers of 10 using SI prefixes (k, M, G, ...).

File Age Distribution

While there is no way to change the X or Y scale in Plotly to use base-2 notation, text labels can be set manually:

var data = [{
  values: ys['size'],   /* bytes */
  text: ys['size_hum'], /* KiB, MiB, ... */
  // ...
}]

The same strategy works for a pie chart:

var data = [{
  // ...
  textinfo: "percent",
  textposition: "inside",
  hovertemplate: '%{label}<br>%{text}<br>%{percent:.1%}<extra></extra>',
}]
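
The human-readable labels passed as text in the traces above have to be computed before plotting. A minimal sketch of one way to do that in JavaScript follows. The helper name formatIEC, its digits parameter, the " B" suffix for values under 1024, and the sample sizes are all assumptions made for illustration, not part of Plotly or of the original data.

```javascript
// Hypothetical helper: format a byte count with IEC binary prefixes.
// Not a Plotly API; the name and signature are assumptions.
function formatIEC(bytes, digits = 1) {
  const iecPrefixes = ["Ki", "Mi", "Gi", "Ti", "Pi", "Ei"];
  let size = bytes;
  let index = -1;
  // Divide by 1024 until the value is below 1024 or prefixes run out.
  while (Math.abs(size) >= 1024 && index < iecPrefixes.length - 1) {
    size /= 1024;
    index += 1;
  }
  if (index < 0) {
    // Under 1 KiB: report plain bytes (an assumption of this sketch).
    return `${bytes} B`;
  }
  return `${size.toFixed(digits)} ${iecPrefixes[index]}B`;
}

// Made-up values standing in for ys['size'].
const sizes = [512, 4096, 1572864, 1073741824];
// Labels standing in for ys['size_hum'].
const labels = sizes.map((n) => formatIEC(n));
console.log(labels);
```

The resulting array can be assigned to the text property of a trace, leaving values in raw bytes so the chart geometry is unaffected.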

Conversion

In some cases a lookup table can be used to provide a conversion:

SELECT
  unit,
  power(10, base_10) AS "base 10",
  power(2, base_2) AS "base 2"
FROM
  (VALUES
    ('k', 3, 10),
    ('M', 6, 20),
    ('G', 9, 30),
    ('T', 12, 40),
    ('P', 15, 50)
  ) powers(unit, base_10, base_2);
  unit |    base 10    |        base 2
 ------+---------------+-----------------------
  k    |          1000 |                  1024
  M    |       1000000 |               1048576
  G    |    1000000000 |            1073741824
  T    | 1000000000000 |         1099511627776
  P    |         1e+15 | 1.125899906842624e+15
  (5 rows)

Since the difference compounds with each prefix (2.4% at k, 4.9% at M, 7.4% at G, 10% at T, 12.6% at P), there is no single conversion factor.
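
The compounding can be checked with a few lines of arithmetic. This is only an illustrative sketch: each step up in prefix multiplies the base-2 to base-10 ratio by another 1024/1000. The variable names are made up for the example.

```javascript
// Ratio of the base-2 value to the base-10 value for each prefix.
// Every step multiplies the previous ratio by 1024 / 1000.
const units = ["k", "M", "G", "T", "P"];
const ratios = [];
let ratio = 1;
for (const unit of units) {
  ratio *= 1024 / 1000;
  ratios.push({ unit, ratio });
}

for (const entry of ratios) {
  console.log(entry.unit, entry.ratio.toFixed(3));
}
```

The ratio starts at 1.024 for k and grows to roughly 1.126 for P, so a factor that is right for one prefix is wrong for every other.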

Conversion Functions

PostgreSQL provides pg_size_pretty(), but this only prints whole numbers. Arbitrary precision may be calculated using a custom function:

CREATE FUNCTION size_iec(IN byte_size bigint, IN round_num int)
returns text
language plpgsql
AS
$$
DECLARE
    size numeric := byte_size;
    power_base smallint := 1024;
    prefixes text[] := ARRAY['K', 'M', 'G', 'T', 'P', 'E'];
    counter smallint := 1;
    output text;
BEGIN
    output := byte_size::text;
    WHILE abs(byte_size) >= power_base AND counter <= array_length(prefixes, 1) LOOP
        byte_size := byte_size / power_base;
        size := size / power_base;
        output := concat(round(size, round_num)::text, ' ', prefixes[counter], 'iB');
        counter := counter + 1;
    END LOOP;
    RETURN output;
END;
$$;