Prerequisites
Before diving in, you should understand:
- Linear algebra (vectors, matrices, linear transformations)
- Modular arithmetic
- Basic probability theory
- Computational complexity (P, NP, polynomial vs exponential)
Part 1: What is a Lattice?
Formal Definition
A lattice L is a discrete additive subgroup of R^n. More practically:
Given n linearly independent vectors b₁, b₂, ..., bₙ ∈ R^n (the basis), the lattice L is:
L = { Σᵢ aᵢbᵢ : aᵢ ∈ Z }All integer linear combinations of the basis vectors.
Visual Intuition (2D Example)
b₂ = (1, 2)
↗
●───●───●───●───●
/ / / / /
●───●───●───●───●
/ / / / /
●───●───●───●───● → b₁ = (2, 0)
/ / / / /
●───●───●───●───●
The dots (●) are lattice points - all integer combinations of b₁ and b₂.
Example: 3b₁ + 2b₂ = 3(2,0) + 2(1,2) = (6,0) + (2,4) = (8,4) ← a lattice pointKey Properties
1. Discreteness:
- Lattice points don't cluster - minimum distance between any two points
- This is what makes lattice problems hard
2. Infiniteness:
- Lattice extends infinitely in all directions
- But we work with finite representations (the basis)
3. Multiple Bases:
- Same lattice can have many different bases
- Some bases are "better" (more orthogonal) than others
- Finding good bases is computationally hard!
Basis Quality: The Gram-Schmidt Perspective
For basis B = {b₁, ..., bₙ}, compute Gram-Schmidt orthogonalization B* = {b₁*, ..., bₙ*}:
b₁* = b₁
b₂* = b₂ - proj(b₂, b₁*)
b₃* = b₃ - proj(b₃, b₁*) - proj(b₃, b₂*)
...Good basis: All bᵢ* have similar lengths → nearly orthogonal
Bad basis: Some bᵢ* are very short → highly non-orthogonal
GOOD BASIS BAD BASIS
↑ b₂ ↗ b₂ (almost parallel to b₁)
| /
| /
----+---→ b₁ --/----→ b₁
Easy to work with Hard to work withPart 2: Hard Lattice Problems
The Shortest Vector Problem (SVP)
Definition: Given a lattice basis B, find the shortest non-zero vector in the lattice.
Find: v ∈ L such that ||v|| is minimal and v ≠ 0
● ● ● ● ●
● ● ● ● ●
● ● O ● ● O = origin
● ↑ ● ● ● ↑ = shortest vector (what we're looking for)
● ● ● ● ●Computational Hardness:
- Best classical algorithm: ~2^(0.292n) time (lattice sieving)
- Best quantum algorithm: ~2^(0.265n) time (only small speedup!)
- No known polynomial-time quantum algorithm
Why it's hard:
- Exponentially many lattice points to consider
- Bad basis hides short vectors
- Even approximating within polynomial factor is hard (under standard assumptions)
The Closest Vector Problem (CVP)
Definition: Given a lattice basis B and a target point t, find the lattice point closest to t.
Find: v ∈ L such that ||v - t|| is minimal
● ● ● ● ●
● ● ● ● ●
● ● ● X ● X = target point t
● ● ↑ ● ● ↑ = closest lattice point (what we're looking for)
● ● ● ● ●Relationship to SVP:
- CVP is at least as hard as SVP
- Many crypto constructions use CVP-like problems
- Solving CVP with small error is the basis of LWE
The Learning With Errors Problem (LWE)
This is the foundation of ML-KEM and ML-DSA.
Setup:
- Secret vector s ∈ Z_q^n
- Public matrix A ∈ Z_q^(m×n) (random)
- Error vector e ∈ Z_q^m (small, from discrete Gaussian)
The LWE Problem:
Given (A, b = As + e mod q), find s.
A s e b
┌─────────┐ ┌───┐ ┌───┐ ┌───┐
│ random │ × │ ? │ + │small│ = │ │
│ matrix │ │ │ │noise│ │ │
└─────────┘ └───┘ └───┘ └───┘
(public) (secret) (secret) (public)
You see: A and b
You want to find: s
Problem: The noise e makes this hardWhy LWE is Hard:
Without noise: Solve linear system (easy, Gaussian elimination)
With noise: The error makes the system "over-determined" in a useless way
Without noise: With noise:
b = As b = As + e
A⁻¹b = s ✓ A⁻¹b = s + A⁻¹e ✗
(A⁻¹e is huge, not useful)The Lattice Connection:
LWE can be reformulated as a CVP problem on a specific lattice:
Lattice L = {(y, yA) : y ∈ Z^m}
Target t = (b, 0)
Closest vector reveals sDecision LWE vs Search LWE
Search LWE: Given (A, b = As + e), find s
Decision LWE: Distinguish (A, As + e) from (A, random b)
Both are believed equally hard (reductions exist).
Ring-LWE and Module-LWE
Ring-LWE (RLWE):
- Work in polynomial ring R = Z[X]/(X^n + 1)
- Smaller keys (polynomial instead of matrix)
- Used in earlier PQC proposals
Module-LWE (MLWE):
- Hybrid: Module of rank k over ring R
- Balance between structure and security
- Used in ML-KEM and ML-DSA (the NIST standards)
Standard LWE: Matrix A is m×n random integers
Ring-LWE: Single polynomial a (implicitly defines structured matrix)
Module-LWE: k×k matrix of polynomials (structured but more flexible)
Security vs Efficiency Tradeoff:
Standard LWE ←────────────────────→ Ring-LWE
Most secure Most efficient
Largest keys Smallest keys
Module-LWE
(Sweet spot - NIST choice)Part 3: Security Reductions
What is a Security Reduction?
A reduction proves: "If you can break my crypto scheme, you can solve a hard problem."
Contrapositive: "If the hard problem is hard, my scheme is secure."
Formally:
If there exists an adversary A that breaks scheme S with advantage ε,
then there exists an algorithm B that solves hard problem H with advantage ε'.
A(breaks S) → B(solves H)
If H is hard (no efficient B), then S is secure (no efficient A).ML-KEM Security Reduction (Simplified)
Theorem: ML-KEM is IND-CCA2 secure in the ROM if Module-LWE is hard.
IND-CCA2 = Indistinguishability under Adaptive Chosen-Ciphertext Attack
ROM = Random Oracle Model (hash functions are "ideal")
Reduction:
Adversary that breaks ML-KEM → Algorithm that solves Module-LWE
Key insight: The encryption/decryption structure of ML-KEM
is designed so that breaking it requires solving MLWE.Concrete Security Estimates
Security isn't just "hard" - we need concrete bit security estimates.
Current estimates for ML-KEM-768:
- Best known classical attack: ~2^182 operations
- Best known quantum attack: ~2^164 operations (Grover-like speedup only)
- Target: 192-bit classical security
How estimates are made:
- Analyze best known attacks (lattice sieving, BKZ algorithm)
- Estimate running time based on lattice dimension and modulus
- Apply cryptanalytic improvements over time (estimates decrease!)
NIST Security Levels:
Level 1: At least as hard as AES-128 (~2^143 quantum)
Level 3: At least as hard as AES-192 (~2^207 quantum)
Level 5: At least as hard as AES-256 (~2^272 quantum)
ML-KEM-512: Level 1
ML-KEM-768: Level 3 ← Recommended default
ML-KEM-1024: Level 5Part 4: How ML-KEM Works (The Math)
Parameters (ML-KEM-768)
n = 256 (polynomial degree)
k = 3 (module rank)
q = 3329 (modulus, prime)
η₁ = 2 (secret key noise parameter)
η₂ = 2 (encryption noise parameter)
dᵤ = 10 (ciphertext compression, u)
dᵥ = 4 (ciphertext compression, v)Key Generation
1. Sample matrix A ∈ R_q^(k×k) from seed ρ (public)
2. Sample secret s ∈ R_q^k with coefficients from CBD(η₁)
3. Sample error e ∈ R_q^k with coefficients from CBD(η₁)
4. Compute t = As + e mod q
Public key: (ρ, t) -- seed for A and the vector t
Private key: s -- the secret vector
A s e t
┌─────────┐ ┌───┐ ┌───┐ ┌───┐
│k×k poly │ × │ k │ + │ k │ = │ k │
│ matrix │ │poly│ │poly│ │poly│
└─────────┘ └───┘ └───┘ └───┘
(from ρ) (secret) (small) (public)CBD(η) = Centered Binomial Distribution:
Sample 2η bits, count ones in each half, return difference.
For η=2: Sample 4 bits, coefficients ∈ {-2, -1, 0, 1, 2}
Encapsulation (Encrypt a Shared Secret)
Input: Public key (ρ, t), randomness m
1. Generate (K̄, r) = G(m) where G is a hash function
2. Sample r ∈ R_q^k with coefficients from CBD(η₁) using r as seed
3. Sample e₁ ∈ R_q^k with coefficients from CBD(η₂)
4. Sample e₂ ∈ R_q with coefficients from CBD(η₂)
5. Compute u = Aᵀr + e₁
6. Compute v = tᵀr + e₂ + ⌈q/2⌋·m̄ (m̄ is the message/key encoded)
7. Compress: c₁ = Compress(u, dᵤ), c₂ = Compress(v, dᵥ)
Ciphertext: c = (c₁, c₂)
Shared secret: K = KDF(K̄, H(c))
The v component "encodes" the message by adding ⌈q/2⌋ (half the modulus)
for each 1-bit and 0 for each 0-bit. This creates a "gap" that survives noise.Decapsulation (Recover the Shared Secret)
Input: Private key s, ciphertext c = (c₁, c₂)
1. Decompress: u = Decompress(c₁, dᵤ), v = Decompress(c₂, dᵥ)
2. Compute m̄' = v - sᵀu
3. Decode: Round m̄' to nearest multiple of ⌈q/2⌋ to recover m
The magic:
v - sᵀu = tᵀr + e₂ + ⌈q/2⌋m̄ - sᵀ(Aᵀr + e₁)
= (As + e)ᵀr + e₂ + ⌈q/2⌋m̄ - sᵀAᵀr - sᵀe₁
= sᵀAᵀr + eᵀr + e₂ + ⌈q/2⌋m̄ - sᵀAᵀr - sᵀe₁
= eᵀr + e₂ - sᵀe₁ + ⌈q/2⌋m̄
= small_noise + ⌈q/2⌋m̄
The noise terms (eᵀr, e₂, sᵀe₁) are small enough that they don't
flip the message bits when rounding to ⌈q/2⌋.Why Quantum Computers Can't Break This
To break ML-KEM, an attacker must either:
1. Find s from (A, As + e):
- This is MLWE - no efficient quantum algorithm known
- Best quantum attack: ~2^0.265n (still exponential)
2. Find r from (Aᵀ, u = Aᵀr + e₁):
- Same MLWE problem
- Also exponential for quantum
Shor's algorithm doesn't help because:
- It exploits periodic structure in modular exponentiation
- Lattice problems have no such exploitable structure
- The "noise" in LWE specifically destroys algebraic structurePart 5: How ML-DSA Works (The Math)
High-Level Approach
ML-DSA is a "Fiat-Shamir with Aborts" signature scheme based on Module-LWE.
Classic Fiat-Shamir:
1. Commit to random value
2. Get challenge from hash
3. Respond using secret
ML-DSA addition: "Abort and retry" if response would leak secretParameters (ML-DSA-65)
n = 256 (polynomial degree)
k = 6 (rows in A)
ℓ = 5 (columns in A)
q = 8380417 (modulus, prime)
η = 4 (secret key bound)
τ = 49 (challenge weight)
β = 196 (rejection bound)
γ₁ = 2^19 (commitment range)
γ₂ = (q-1)/32 (low-order rounding)Key Generation
1. Sample matrix A ∈ R_q^(k×ℓ) from seed ρ
2. Sample secret vectors s₁ ∈ R_q^ℓ and s₂ ∈ R_q^k (small coefficients)
3. Compute t = As₁ + s₂ mod q
4. Split t into (t₁, t₀) using Power2Round
Public key: (ρ, t₁)
Private key: (ρ, s₁, s₂, t₀)Signing (Fiat-Shamir with Aborts)
Input: Message M, private key (ρ, s₁, s₂, t₀)
Loop:
1. Sample y ∈ R_q^ℓ uniformly with ||y||∞ < γ₁
2. Compute w = Ay mod q
3. Extract high bits: w₁ = HighBits(w)
4. Compute challenge: c̃ = H(μ || w₁) where μ = H(tr || M)
5. Sample c ∈ R_q with τ non-zero ±1 coefficients based on c̃
6. Compute z = y + cs₁
7. Check rejection conditions:
- If ||z||∞ ≥ γ₁ - β, ABORT and restart
- If ||LowBits(w - cs₂)||∞ ≥ γ₂ - β, ABORT and restart
8. Compute hint h = MakeHint(w - cs₂, ct₀)
Output: Signature σ = (c̃, z, h)
The rejection sampling prevents leaking information about s₁.
Without it, many signatures would reveal the secret key!Verification
Input: Message M, public key (ρ, t₁), signature σ = (c̃, z, h)
1. Check ||z||∞ < γ₁ - β
2. Regenerate A from ρ
3. Compute μ = H(tr || M)
4. Sample c from c̃
5. Compute w₁' = UseHint(h, Az - ct₁)
6. Check c̃ = H(μ || w₁')
Accept if all checks pass.Part 6: Number Theoretic Transform (NTT)
Why We Need It
Polynomial multiplication in R_q = Z_q[X]/(X^n + 1) is expensive:
- Naive: O(n²) operations
- With NTT: O(n log n) operations
For n = 256, that's 65,536 vs ~2,048 operations!
The NTT Concept
NTT is like FFT but over finite fields instead of complex numbers.
Regular polynomial: a(X) = a₀ + a₁X + a₂X² + ... + aₙ₋₁Xⁿ⁻¹
NTT representation: â = (a(ω⁰), a(ω¹), a(ω²), ..., a(ωⁿ⁻¹))
Where ω is a primitive 2n-th root of unity mod q.
Key property: Multiplication in NTT domain is pointwise!
â · b̂ = (a(ω⁰)b(ω⁰), a(ω¹)b(ω¹), ..., a(ωⁿ⁻¹)b(ωⁿ⁻¹))The ML-KEM/ML-DSA Optimization
For q = 3329 (ML-KEM) and q = 8380417 (ML-DSA), there exists ω such that ωⁿ = -1.
This allows efficient "negacyclic" NTT for the ring Z_q[X]/(X^n + 1).
Algorithm: Cooley-Tukey NTT
for len = 1 to n/2 by len *= 2:
for start = 0 to n-1 by 2*len:
zeta = powers_of_omega[...]
for j = start to start+len-1:
t = zeta * a[j + len]
a[j + len] = a[j] - t
a[j] = a[j] + tPart 7: Implementation Considerations
Constant-Time Implementation
Why: Side-channel attacks can extract secrets from timing variations.
Rule: Execution time must not depend on secret data.
// BAD - timing depends on secret bit
if (secret_bit) {
do_something(); // Takes time T1
} else {
do_other(); // Takes time T2
}
// GOOD - constant time selection
result = (secret_bit & something) | (~secret_bit & other);Avoiding Timing Leaks in ML-KEM
Critical operations:
- CBD sampling - must be constant-time
- Polynomial multiplication - NTT is naturally constant-time
- Compression/decompression - careful implementation needed
- Rejection sampling in encapsulation - use constant-time comparison
Memory Safety
// CRITICAL: Zero sensitive data after use
void ml_kem_keygen(...) {
uint8_t secret_seed[32];
// ... use secret_seed ...
// Zero before return
secure_zero(secret_seed, 32);
}
// Use volatile to prevent compiler optimization
void secure_zero(void *ptr, size_t len) {
volatile uint8_t *p = ptr;
while (len--) *p++ = 0;
}Random Number Generation
ML-KEM/ML-DSA security depends on quality randomness!
Requirements:
1. Cryptographically secure (not rand())
2. Sufficient entropy
3. Properly seeded
Acceptable sources:
- /dev/urandom (Linux)
- getrandom() syscall
- BCryptGenRandom (Windows)
- Hardware RNG (with DRBG)
NOT acceptable:
- rand(), random()
- time-based seeds
- predictable sourcesPart 8: Attacks and Cryptanalysis
Known Lattice Attacks
1. LLL Algorithm (Lenstra-Lenstra-Lovász):
- Finds "short" vectors in polynomial time
- But only approximation factor ~2^(n/2) - not enough to break crypto
- Used as subroutine in stronger attacks
2. BKZ (Block Korkine-Zolotarev):
- Stronger than LLL
- Parameterized by block size β
- Time ~2^(0.292β) for best variant
- This is what parameters are designed to resist
3. Lattice Sieving:
- Find many short vectors, combine them
- Best asymptotic complexity: ~2^(0.292n)
- Memory-intensive
4. Hybrid Attacks:
- Combine lattice reduction with brute force
- Guess some secret coefficients, reduce on the rest
- Considered in parameter selection
Why Quantum Doesn't Help Much
Classical BKZ: ~2^(0.292n)
Quantum BKZ: ~2^(0.265n) ← Only ~10% improvement in exponent
This is because:
1. Grover's algorithm gives √ speedup for search
2. But lattice sieving is not pure search
3. The "quantum speedup" for lattice problems is modest
4. Still exponential in dimension
Compare to Shor's:
Classical factoring: ~2^(n^(1/3))
Quantum factoring: ~n^3 ← Exponential to polynomial!
Lattice problems don't have exploitable structure for Shor-like attacks.Current Research Directions
Algebraic attacks on Ring/Module-LWE:
- Exploit structure of polynomial ring
- So far: no significant break
- Active research area
Side-channel attacks:
- Power analysis during NTT
- Timing attacks on rejection sampling
- Fault injection attacks
- Implementation security is critical
Quantum cryptanalysis:
- Are there better quantum algorithms for lattices?
- Current consensus: unlikely, but not proven
- This is why NIST also standardized hash-based signatures (SLH-DSA)
Summary: What You Need to Remember
Core Concepts
- Lattices are discrete grids in high-dimensional space
- SVP/CVP are hard problems - no efficient classical or quantum algorithms
- LWE turns lattice hardness into encryption/signatures
- Module-LWE balances security and efficiency - used in NIST standards
Key Formulas
LWE: b = As + e (mod q)
Public: (A, b)
Secret: s
Hard because: noise e prevents solving linear system
ML-KEM security: Based on Module-LWE hardness
ML-DSA security: Based on Module-LWE + SelfTargetMSIS hardnessParameters to Know
| Scheme | n | k | q | Security Level |
|---|---|---|---|---|
| ML-KEM-768 | 256 | 3 | 3329 | Level 3 (192-bit) |
| ML-DSA-65 | 256 | 6×5 | 8380417 | Level 3 (192-bit) |
Why This Matters
Understanding the math lets you:
- Evaluate whether new attacks affect your security
- Make informed parameter choices
- Understand implementation requirements
- Assess research developments
- Credibly advise on PQC security