📌 Systems Cheatsheet

Essential Powers of Two

Powers of two are fundamental in computing because they frequently represent limits, capacities, and common units of measurement. The following table consolidates the powers of two that are most useful for quick recall.

📝 $2^{n}$
$2^{8} \rightarrow$ u8
$2^{16} \rightarrow$ u16
$2^{32} \rightarrow$ u32
$2^{64} \rightarrow$ u64
$2^{10}$ Bytes $\rightarrow$ 1 KiB
$2^{20}$ Bytes $\rightarrow$ 1 MiB
$2^{30}$ Bytes $\rightarrow$ 1 GiB

$2^n$	Value	Common Use Case / Significance
$2^8$	$256$	Number of distinct values in an 8-bit unsigned integer (`u8`), i.e. the byte values 0-255; the maximum value is $2^8 - 1 = 255$. Standard ASCII uses only 128 of these values.
$2^{10}$	$1,024$	Exactly 1 kibibyte (KiB), often loosely called a kilobyte. Also relevant for stack sizes and memory allocation.
$2^{16}$	$65,536$	Number of distinct values in a 16-bit unsigned integer (`u16`); the maximum value is 65,535, which is also the upper end of the network port range (0-65535).
$2^{20}$	$1,048,576$	Exactly 1 mebibyte (MiB), often loosely called a megabyte. Roughly a million in binary contexts.
$2^{24}$	$\approx 16.7 \times 10^6$	Number of colors in 24-bit RGB color depth (True Color). Exactly 16,777,216, approximated as 16.7 million.
$2^{32}$	$\approx 4.29 \times 10^9$	Number of distinct values in a 32-bit unsigned integer (`u32`, maximum $2^{32} - 1$); total address space for IPv4; common for file IDs and large counters. Approximated as 4.29 billion.
$2^{64}$	$\approx 1.84 \times 10^{19}$	Number of distinct values in a 64-bit unsigned integer (`u64`, maximum $2^{64} - 1$); used for extremely large IDs, precise timestamps (e.g., nanoseconds since epoch), and representing vast amounts of memory.
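As a quick check on the table, the sketch below computes each power of two alongside the matching unsigned maximum, $2^n - 1$. The helper name `unsigned_max` is made up for this illustration.

```python
def unsigned_max(bits: int) -> int:
    """Largest value an unsigned integer of the given width can hold."""
    # There are 2**bits distinct bit patterns, numbered starting from 0.
    return (1 << bits) - 1


if __name__ == "__main__":
    for bits in (8, 10, 16, 20, 24, 32, 64):
        # 1 << bits equals 2**bits, the count of distinct values.
        print(f"2^{bits:<2} = {1 << bits:>26,}   max = {unsigned_max(bits):,}")
```

The off-by-one between "number of values" and "maximum value" is the detail most worth memorizing: a `u8` has 256 values but tops out at 255.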

Core Computer Storage Units

Understanding the fundamental units of digital information and their conversions is crucial in computer science and engineering. Unlike the decimal system (base-10) used in everyday life, computers primarily operate using the binary system (base-2). This often leads to differences in how “kilo,” “mega,” and “giga” are interpreted in computing contexts compared to their standard metric definitions.

📝 Note
1 bit $\to \hspace{0.1em}$ $0$ or $1$
$8$ bits $\to \hspace{0.1em}$ $1$ Byte
💻 IEC
$2^{10}$ Bytes $\to \hspace{0.1em}$ 1 KiB
$2^{20}$ Bytes $\to \hspace{0.1em}$ 1 MiB
$2^{30}$ Bytes $\to \hspace{0.1em}$ 1 GiB
used during programming for memory estimations
📏 SI
1000 bytes $ \rightarrow $ 1 KB
1000 KB $ \rightarrow $ 1 MB
1000 MB $ \rightarrow $ 1 GB
used by hardware manufacturers for reporting hard drive capacities

A consolidated reference for common bit and byte units is shown below:

Unit	Value / Description	Notes
1 Bit	The smallest unit of data	Represents a binary digit, either 0 or 1.
1 Byte (B)	8 bits	The fundamental addressable unit of memory. Commonly stores a single character (e.g., ASCII).
1 Kilobyte (KB/KiB)	1 KiB = $2^{10}$ = 1,024 Bytes	1 KB (SI) = 1,000 bytes; 1 KiB (kibibyte) = 1,024 bytes.
1 Megabyte (MB/MiB)	1 MiB = $2^{20}$ = 1,048,576 Bytes	1 MB (SI) = 1 million bytes (1000 KB); 1 MiB (mebibyte) = 1024 KiB.
1 Gigabyte (GB/GiB)	1 GiB = $2^{30}$ = 1,073,741,824 Bytes	1 GB (SI) = 1 billion bytes (1000 MB); 1 GiB (gibibyte) = 1024 MiB.
1 Terabyte (TB/TiB)	1 TiB = $2^{40}$ = 1,099,511,627,776 Bytes	1 TB (SI) = 1 trillion bytes (1000 GB); 1 TiB (tebibyte) = 1024 GiB.
Cache Line (e.g., x86-64)	Typically 64 Bytes	The smallest unit of data that can be transferred between main memory and CPU cache. Optimizing for cache lines is critical for performance.
Page Size (e.g., x86-64)	Typically 4 KiB (4,096 Bytes)	The smallest unit of memory that the operating system manages in virtual memory. Memory is allocated and protected in pages.
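Because memory is handed out in whole pages and moved in whole cache lines, a common back-of-the-envelope step is rounding a byte count up to the next unit. The sketch below assumes the typical x86-64 values from the table (4,096-byte pages, 64-byte cache lines); the function names are hypothetical.

```python
PAGE_SIZE = 4096   # bytes; typical x86-64 page size (assumed, not queried from the OS)
CACHE_LINE = 64    # bytes; typical x86-64 cache line (assumed)


def units_needed(n_bytes: int, unit: int) -> int:
    """Number of whole units (pages, cache lines) needed to hold n_bytes."""
    # Integer ceiling division: avoids floating point entirely.
    return (n_bytes + unit - 1) // unit


def round_up(n_bytes: int, unit: int) -> int:
    """Round n_bytes up to the next multiple of unit."""
    return units_needed(n_bytes, unit) * unit


if __name__ == "__main__":
    print(units_needed(4097, PAGE_SIZE))   # one byte past a page needs a second page
    print(round_up(100, CACHE_LINE))       # 100 bytes span two cache lines
```

On a real system the page size should be queried rather than assumed, since some platforms use larger pages.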

Prefix Comparison

It’s worth noting the distinction between the traditional binary prefixes (powers of 2) and the decimal prefixes (powers of 10) used in the International System of Units (SI).

Binary Prefixes (IEC Standard): KiB (kibibyte), MiB (mebibyte), GiB (gibibyte), TiB (tebibyte) use powers of $2^{10}$ $\to \hspace{0.1em}$ precise for computer memory and storage during programming.
Decimal Prefixes (SI Standard): KB (kilobyte), MB (megabyte), GB (gigabyte), TB (terabyte) typically use powers of $10^3$ $\to \hspace{0.1em}$ used by hard drive manufacturers to express capacity, leading to slight discrepancies with the actual binary capacity reported by operating systems.

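For example, a 1 TB hard drive holds $10^{12}$ bytes, which is about 9% less than 1 TiB ($2^{40}$ bytes): roughly 0.909 TiB, or about 931 GiB when an operating system reports the capacity in binary units.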
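The gap between the two prefix systems is easy to compute directly. This is a minimal sketch; the constant and function names are chosen here for illustration only.

```python
KIB, MIB, GIB, TIB = 2**10, 2**20, 2**30, 2**40   # IEC binary units, in bytes
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12      # SI decimal units, in bytes


def in_binary_units(n_bytes: int, binary_unit: int) -> float:
    """Express a byte count in the given binary unit (e.g. GIB or TIB)."""
    return n_bytes / binary_unit


if __name__ == "__main__":
    print(f"1 TB = {in_binary_units(TB, TIB):.3f} TiB")  # 0.909 TiB
    print(f"1 TB = {in_binary_units(TB, GIB):.1f} GiB")  # 931.3 GiB
```

The discrepancy grows with each prefix: about 2.4% at the kilo level, 4.9% at mega, 7.4% at giga, and 10% at tera (measured relative to the decimal unit).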

Primitive Data Types: (64-bit Systems)

Understanding the characteristics of primitive data types—their size in memory and the range of values they can hold—is fundamental for efficient programming, especially in systems-level languages like C++ and Rust. This knowledge is crucial for memory optimization, avoiding overflow errors, and ensuring data integrity. While Python’s dynamic typing abstracts many of these details, it’s still beneficial to grasp the underlying concepts.

The tables below consolidate information on common primitive types, their typical sizes on a 64-bit architecture, and their value ranges, along with comparisons across C++, Rust, and Python.

Signed Data Types

These types can represent both positive and negative values. They use a bit (typically the most significant one) to indicate the sign.

Type Category	C++ Type(s)	Rust Type(s)	Size (Bytes)	Signed Range (Approximate)	Notes
Integer (8-bit)	`int8_t`, `char` (often)	`i8`	1	$-128$ to $127$	Used for small integer values; the sign of C++ `char` can be platform-dependent.
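Integer (16-bit)	`short`, `int16_t`	`i16`	2	$-32,768$ to $32,767$	Common for smaller counters and compact fields. Port numbers (0-65535) do not fit and need the unsigned `u16`.
Integer (32-bit)	`int`, `int32_t`	`i32`	4	$-2.147 \text{ Billion}$ to $2.147 \text{ Billion}$	A standard integer size on many systems. Note that `long` is 8 bytes on 64-bit Linux and macOS but 4 bytes on 64-bit Windows.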
Integer (64-bit)	`long long`, `int64_t`	`i64`	8	$\pm9.22 \text{ Quintillion}$	Used for large numbers, timestamps, and unique IDs.
Pointer-sized Integer	`ptrdiff_t`	`isize`	8	$-2^{63}$ to $2^{63}-1$ (on 64-bit)	Represents the difference between two pointers, allowing for negative offsets.
Floating Point (Single)	`float`	`f32`	4	$\sim\pm3.4 \times 10^{38}$	IEEE 754 type with ~7 decimal digits of precision.
Floating Point (Double)	`double`	`f64`	8	$\sim\pm1.8 \times 10^{308}$	IEEE 754 type with ~15 decimal digits of precision.

Unsigned Data Types

These types can only represent non-negative values (zero and positive numbers), allowing them to store larger positive values than their signed counterparts of the same size.

Type Category	C++ Type(s)	Rust Type(s)	Size (Bytes)	Unsigned Range (Approximate)	Notes
Character	`unsigned char`, `char`	`char`, `u8`	1 (C++) 4 (Rust)	C++: 0 to 255; Rust: 0 to $0x10FFFF$	Rust’s `char` is a 4-byte Unicode scalar value (surrogate code points are excluded). C++ `char` is always 1 byte, but whether plain `char` is signed is platform-dependent. `u8` is used for byte manipulation.
Integer (8-bit)	`uint8_t`	`u8`	1	$0$ to $255$	Ideal for representing a single byte of data.
Integer (16-bit)	`uint16_t`	`u16`	2	$0$ to $65,535$	Useful for data that won’t exceed this limit, like image dimensions.
Integer (32-bit)	`uint32_t`	`u32`	4	$0$ to $4.29 \text{ Billion}$	Commonly used for IPv4 addresses and file IDs.
Integer (64-bit)	`uint64_t`	`u64`	8	$0$ to $18.4 \text{ Quintillion}$	Essential for large counts or bit manipulation on 64-bit values.
Pointer-sized Integer	`size_t`	`usize`	8	$0$ to $2^{64}-1$ (on 64-bit)	The standard type for memory sizes and collection indices.
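The ranges in the signed and unsigned tables all follow from the bit width alone. A minimal sketch, assuming two's-complement signed integers (which is what Rust and modern C++ use); the function names are illustrative.

```python
from typing import Tuple


def signed_range(bits: int) -> Tuple[int, int]:
    """(min, max) of a two's-complement signed integer of the given width."""
    half = 1 << (bits - 1)          # half of the 2**bits patterns are negative
    return -half, half - 1


def unsigned_range(bits: int) -> Tuple[int, int]:
    """(min, max) of an unsigned integer of the given width."""
    return 0, (1 << bits) - 1


if __name__ == "__main__":
    for bits in (8, 16, 32, 64):
        lo, hi = signed_range(bits)
        print(f"i{bits}: {lo:,} to {hi:,}   u{bits}: 0 to {unsigned_range(bits)[1]:,}")
```

Note the asymmetry: a signed type holds one more negative value than positive, which is why negating the minimum value overflows.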

Special Types

These types don’t fit into the signed/unsigned numeric categories but are fundamental building blocks. Python’s int and str types are included here as they are dynamically sized and don’t have fixed-width signed/unsigned counterparts.

Type Category	C++ Type(s)	Rust Type(s)	Python Type(s)	Size (Bytes)	Notes
Boolean	`bool`	`bool`	`bool`	1 (C++/Rust)	Stores a simple `true` or `false` value. Python’s `bool` is a full object and occupies more than 1 byte.
Raw Pointer	`T*` (e.g., `int*`)	`*const T`, `*mut T`	N/A	8 (on 64-bit)	Stores a memory address. Its “value” is an address, not a signed/unsigned number.
Character	`char`	`char`	`str` (len 1)	1 B (C++), 4 B (Rust)	Python uses strings for single characters, and a one-character `str` is an object that carries per-object overhead well beyond 1 byte. C++ `char` is typically an 8-bit integer representing an ASCII character. Rust `char` is a 32-bit Unicode Scalar Value.
Dynamic String	`std::string`	`String`	`str`	Varies	A heap-allocated data structure for text. Python `str` is immutable.
String View/Slice	`std::string_view`	`&str`	(Slicing `str`)	Varies	An immutable, non-owning view into a part of a string.

Best Practices

Integer Sizing:
- C++ & Rust: Offer fixed-width integers (e.g., int32_t, i32) which guarantee the same size on any platform. For portability and unambiguous code in C++, it is recommended to use these over generic types like int or long.
- Python: Its int type has arbitrary precision, meaning it automatically uses more memory as needed to store larger numbers and has no practical size limit.
- Mixing Signed & Unsigned Types: This is a classic C++ bug. In a condition like i < n where i is a signed int and n is an unsigned int, the usual arithmetic conversions turn i into an unsigned value, so i = -1 becomes a huge positive number and the comparison is unexpectedly false. Rust’s strict type system prevents this by disallowing operations between different integer types without an explicit cast.
Characters:
- The char type is fundamentally different across languages. In C++, char is a 1-byte type that holds a single code unit, typically an ASCII value or one byte of a UTF-8 sequence. In Rust, char is always 4 bytes and represents any Unicode Scalar Value, making it suitable for text in all languages.
Pointer-Sized Integers:
- usize (Rust) and size_t (C++) are crucial for indexing and memory-related calculations. Their size matches the system’s address space (8 bytes on a 64-bit system), ensuring they can always hold the size of any object in memory.
Strings:
- All three languages provide dynamic, heap-allocated strings (std::string, String, str). std::string and String are mutable; Python’s str is immutable, so every modification creates a new string.
- Rust and modern C++ also offer lightweight, immutable string “views” or “slices” (&str and std::string_view) for efficient, non-owning access to string data.
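The signed/unsigned pitfall above can be modeled in Python, whose `int` never overflows. The sketch below imitates the C++ conversion of a signed value to a fixed-width unsigned one (well-defined as reduction modulo $2^n$) with a bit mask. The helpers `as_unsigned` and `as_signed` are hypothetical names, and this models only the conversion, not C++ signed overflow, which is undefined behavior.

```python
def as_unsigned(value: int, bits: int = 32) -> int:
    """Reinterpret value as an unsigned integer of the given width (modulo 2**bits)."""
    return value & ((1 << bits) - 1)


def as_signed(value: int, bits: int = 32) -> int:
    """Reinterpret the low `bits` bits of value as a two's-complement signed integer."""
    value &= (1 << bits) - 1
    # If the sign bit is set, the pattern represents value - 2**bits.
    return value - (1 << bits) if value >> (bits - 1) else value


if __name__ == "__main__":
    i, n = -1, 10
    print(i < n)                # True: Python compares the mathematical values
    print(as_unsigned(i) < n)   # False: -1 becomes 4294967295 as a 32-bit unsigned value
    print(as_unsigned(255 + 1, bits=8))  # 0: an 8-bit unsigned counter wraps around
    print(2**64 + 1)            # Python's int simply grows past 64 bits
```

In Rust the same wrap-around has to be requested explicitly (for example with `wrapping_add`), while a plain `+` that overflows panics in debug builds.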

Think in…	To Achieve…
Bytes, Bits, & Pages	A clear understanding of memory layout, alignment, and fragmentation.
Powers of Two	Quick and accurate estimations for storage, memory, and latency.
Signed vs. Unsigned	Prevention of subtle overflow and comparison bugs in loops and conditions.
Platform-Sized Pointers	Optimized indexing and correct interaction with low-level system APIs.
Stack vs. Heap	Proper memory management: use the stack for fast, scoped, fixed-size data and the heap for large or dynamically-sized data.
Endianness	Correct data serialization and network communication by understanding byte order (Big vs. Little-Endian).
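To make the endianness row concrete, the sketch below serializes the same 32-bit value in both byte orders using only the Python standard library. The sample value `0xDEADBEEF` is arbitrary.

```python
import struct

value = 0xDEADBEEF

# Big-endian ("network byte order"): most significant byte first.
big = value.to_bytes(4, byteorder="big")
# Little-endian (the native order on x86-64): least significant byte first.
little = value.to_bytes(4, byteorder="little")

if __name__ == "__main__":
    print(big.hex())      # deadbeef
    print(little.hex())   # efbeadde

    # struct.pack produces the same bytes; ">" is big-endian, "<" is little-endian.
    print(struct.pack(">I", value) == big)
    print(struct.pack("<I", value) == little)

    # Reading little-endian bytes with the wrong order yields a different number.
    print(hex(int.from_bytes(little, byteorder="big")))  # 0xefbeadde
```

Stating the byte order explicitly at every serialization boundary, as done here, avoids depending on the host's native order.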

Quick Memory Estimation Guide

This table provides quick, practical estimates for the memory footprint of common data types and system-level structures, essential for performance-aware software design.

Item / Data Structure	Approximate Size	Notes
Large Data Collections
1 million 32-bit integers	~4 MB	(1,000,000 items × 4 bytes/item)
100 million 32-bit floats	~400 MB	(100,000,000 items × 4 bytes/item)
A 10-character string	~10 to 40 bytes	1 byte/char for ASCII; 1-4 bytes/char for UTF-8.
A 100-character string	~100 to 400 bytes	Does not include metadata overhead from string objects.
UNIX Timestamp	8 bytes	Typically stored as a 64-bit integer (`i64`).

Identifiers & Hashes
IPv4 Address	4 bytes	A 32-bit numerical address (`uint32_t`).
64-bit Pointer	8 bytes	The size of a memory address on a 64-bit system.
UUID (v4)	16 bytes	A 128-bit universally unique identifier.
SHA-256 Hash	32 bytes	A 256-bit secure hash algorithm output.
Hex String Representation	1 byte per 2 hex digits	Example: The string `"0xDEADBEEF"` represents 4 bytes of data (the 8 hex digits after the `0x` prefix), although the text itself occupies 10 bytes as ASCII.
ASCII Character	1 byte
UTF-8 Character	1 to 4 bytes

System-Level Constants
Memory Page	4 KiB	(4,096 bytes) The basic unit of memory managed by the OS.
Typical L1/L2 Cache Line	64 bytes	The smallest unit of data transferred between memory and a CPU cache.
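Several rows in the table can be verified with a few lines of standard-library Python. This is a rough sketch: `collection_bytes` and `utf8_len` are illustrative helper names, and the collection estimate counts only the raw payload, not container or object overhead.

```python
import hashlib
import uuid


def collection_bytes(count: int, bytes_per_item: int) -> int:
    """Raw payload size of `count` fixed-width items, ignoring container overhead."""
    return count * bytes_per_item


def utf8_len(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


if __name__ == "__main__":
    n = collection_bytes(1_000_000, 4)                   # 1 million 32-bit integers
    print(f"{n / 10**6:.2f} MB, {n / 2**20:.2f} MiB")   # 4.00 MB, 3.81 MiB

    print(utf8_len("hello"))           # ASCII: 1 byte per character
    print(utf8_len("h\u00e9llo"))      # the accented letter takes 2 bytes
    print(utf8_len("\U0001F600"))      # an emoji takes 4 bytes

    print(len(uuid.uuid4().bytes))             # 16 bytes
    print(hashlib.sha256(b"").digest_size)     # 32 bytes
    print(len(bytes.fromhex("DEADBEEF")))      # 8 hex digits decode to 4 bytes
```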

Memory Allocation

Indexing & Memory Allocation

Use the Right Type for Sizes: For indexing or representing object sizes, use size_t in C++ and usize in Rust. Rust defines usize as pointer-sized, and size_t matches the pointer width on mainstream platforms (e.g., 64 bits on a 64-bit system).
Stack vs. Heap Allocation: Know when to use the heap. Prefer the heap (Vec, Box in Rust) when dealing with large amounts of data or when the data’s size is not known at compile time. A typical thread’s stack size is small (1-8 MB).

Stack vs Heap

Feature	Stack	Heap
Allocation Time	Fast (LIFO)	Slower (dynamic alloc/free)
Lifetime	Auto (scope-bound)	Manual or owner-managed (free/delete in C++, or released by RAII/drop when the owner goes out of scope)
Size	Limited (MBs)	Large (GBs)
Location	Per-thread	Shared
Use Case	Local vars, small arrays	Dynamic structures, `Box`, `Vec`

Cold Estimations

✅ Estimate RAM usage of a struct with multiple fields
✅ Say how much memory vector<int> of size 1M uses
✅ Recall signed/unsigned integer ranges instantly
✅ Explain overflow behavior in Rust vs C++ vs Python
✅ Index an array in Rust with usize and know why
✅ Know what sizeof(int*) is on your target system
✅ Use std::numeric_limits<T>::max() or T::MAX in Rust
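For the first item on the checklist, struct size is more than the sum of its fields because of alignment padding. The sketch below is a simplified model of C-style layout, assuming each field's alignment equals the value passed in and that fields stay in declaration order (as in C++ or a Rust `#[repr(C)]` struct; default Rust layout may reorder fields). The function `struct_size` is a hypothetical helper.

```python
from typing import Iterable, Tuple


def struct_size(fields: Iterable[Tuple[int, int]]) -> int:
    """Size in bytes of a C-style struct, given (size, alignment) per field in order."""
    offset = 0
    max_align = 1
    for size, align in fields:
        # Insert padding so the field starts at a multiple of its alignment.
        offset = (offset + align - 1) // align * align
        offset += size
        max_align = max(max_align, align)
    # Tail padding keeps every element aligned when the struct is used in an array.
    return (offset + max_align - 1) // max_align * max_align


if __name__ == "__main__":
    U8, U32, U64 = (1, 1), (4, 4), (8, 8)
    print(struct_size([U8, U32, U64]))  # 16: 3 bytes of padding after the u8
    print(struct_size([U8, U64, U8]))   # 24: padding after each u8
    print(struct_size([U64, U8, U8]))   # 16: same fields, better order
```

Ordering fields from largest to smallest alignment is the usual way to minimize padding when the layout is under your control.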