From 8c3789dc317524e9fd7e9ba310c538a4cf897b86 Mon Sep 17 00:00:00 2001
From: Adrian Utrilla <adrianutrilla@gmail.com>
Date: Fri, 13 Oct 2017 14:04:40 -0700
Subject: [PATCH] Explain how Shamir's Secret Sharing works

---
 shamir/README.md | 163 +++++++++++++++++++++++++++++++++++++++++++++++
 shamir/shamir.go |  38 ++++++++++-
 2 files changed, 199 insertions(+), 2 deletions(-)

diff --git a/shamir/README.md b/shamir/README.md
index f73198baa..d92f3b557 100644
--- a/shamir/README.md
+++ b/shamir/README.md
@@ -1 +1,164 @@
 Forked from [Vault](https://github.com/hashicorp/vault/tree/master/shamir)
+
+## How it works
+
+We want to split a secret into parts.
+
+Any two points on the cartesian plane define a line. Three points define a
+parabola. Four points define a cubic curve, and so on. In general, `n` points
+define an function of degree `(n - 1)`. If our secret was somehow an function of
+degree `(n - 1)`, we could just compute `n` different points of that function
+and give `n` different people one point each. In order to recover the secret,
+then we'd need all the `n` points. If we wanted to, we could compute more than n
+points, but even then still only `n` points out of our whole set of computed
+points would be required to recover the function.
+
+A concrete example: our secret is the function `y = 2x + 1`. This function is of
+degree 1, so we need at least 2 points to define it. For example, let's set
+`x = 1`, `x = 2` and `x = 3`. From this follows that `y = 3`, `y = 5` and
+`y = 7`, respectively. Now, with the information that our secret is of degree 1,
+we can use any 2 of the 3 points we computed to recover our original function.
+For example, let's use the points `x = 1; y = 3` and `x = 2; y = 5`.
+We know that first degree functions are lines, defined by their slope and their
+intersection point with the y axis. We can easily compute the slope given our
+two points: it's the change in `y` divided by the change in `x`:
+`(5 - 3)/(2 - 1) = 2`. Now, knowing the slope we can compute the intersection
+point with the `y` axis by "working our way back". We know that at `x = 1`,
+`y` equals `3`, so naturally because the slope is `2`, at `x = 0`, `y` must be
+`1`.
+
+## Lagrange interpolation
+
+The method we've used for this isn't very general: it only works for polynomials
+of degree 1. Lagrange interpolation is a more general way that lets us obtain
+the function of degree `(n - 1)` that passes through `n` arbitrary points.
+
+Understanding how to perform Lagrange interpolation isn't really necessary to
+understand Shamir's Secret Sharing: it's enough to know that there's only one
+function of degree `(n - 1)` that passes through `n` given points and that
+computing this function given the points is computationally efficient.
+
+But for those interested, here's an explanation:
+
+Let's say our points are `(x_0, y_0),...,(x_j, y_j),...,(x_(n-1), y_(n-1))`.
+Then, the Lagrange polynomial `L(x)`, the polynomial we're looking for, is
+defined as follows:
+
+`L(x) = sum from j=0 to j=(n-1) of {y_j * l_j(x)}`
+
+and `l_j(x) = product from m=0 to m=(n-1) except when m=j of {(x - x_m)/(x_j - x_m)}`
+
+A concrete example, with 3 points:
+
+```
+x_0 = 1   y_0 = 1
+x_1 = 2   y_1 = 4
+x_2 = 3   y_2 = 9
+```
+
+Let's apply the formula:
+
+```
+L(x) =
+        y_0 * l_0(x) +
+        y_1 * l_1(x) +
+        y_2 * l_2(x)
+```
+
+Substitute `y_j` for the actual value:
+
+```
+L(x) =
+        1 * l_0(x) +
+        4 * l_1(x) +
+        9 * l_2(x)
+```
+
+Replace `l_j(x)`:
+
+```
+l_0(x) = (x - 2)/(1 - 2) * (x - 3)/(1 - 3) =  0.5x^2 - 2.5x + 3
+l_1(x) = (x - 1)/(2 - 1) * (x - 3)/(2 - 3) = -   x^2 +   4x - 3
+l_2(x) = (x - 1)/(3 - 1) * (x - 2)/(3 - 2) =  0.5x^2 - 1.5x + 1
+```
+
+```
+
+L(x) =
+        1 * ( 0.5x^2 - 2.5x + 3) +
+        4 * (   -x^2 +   4x - 3) +
+        9 * ( 0.5x^2 - 1.5x + 1)
+```
+
+```
+
+L(x) =
+        ( 0.5x^2 -  2.5x +  3) +
+        (  -4x^2 +   16x - 12) +
+        ( 4.5x^2 - 13.5x +  9)
+     =       x^2 +    0x +  0
+     = x^2
+```
+
+So the polynomial we were looking for is `y = x^2`.
+
+
+## Splitting a secret
+
+So we have the ability of splitting a function into parts, but in the context
+of computing we generally want to split a number, not a function. For this,
+let's define a function of degree `threshold`. `threshold` is the amount of
+parts we want to require in order to recover the secret. Let's set the parameter
+of degree zero to our secret `S` and make the rest of the parameters random:
+
+`y = ax^(threshold) + bx^(threshold-1) + ... + zx^1 + S`
+
+With `a, b, ...` random.
+
+Then, we want to generate our parts. For this, we evaluate our function at as
+many points as we want parts. For example, say our secret is 123, we want 5
+parts and a threshold of 2. Because the threshold is 2, we're going to need a
+polynomial of degree 2:
+
+`y = ax^2 + bx + 123`
+
+We randomly set `a = 7` and `b = 1`:
+
+`y = 7x^2 + x + 123`
+
+Because we want 5 parts, we need to compute 5 points:
+
+```
+x = 0 -> y = 123 # woops! This is the secret itself. Let's not use that one.
+x = 1 -> y = 131
+x = 2 -> y = 153
+x = 3 -> y = 189
+x = 4 -> y = 239
+x = 5 -> y = 303
+```
+
+And that's it. Each of the computed points is one part of the secret.
+
+## Combining a secret
+
+Now that we have our parts, we have to define a way to recover them. Using
+the example from the previous section, we only need any two points out of the
+five we created to recover the secret, because we set the threshold to two.
+So with any two of the five points we created, we can recover the original
+polynomial, and because the secret is the free term in the polynomial, we can
+recover the secret.
+
+## Finite fields
+
+In the previous examples we've only used integers, and this unfortunately has
+a flaw. First of all, it's impossible to uniformly sample integers to get
+random coefficients for our generated polynomial. Additionally, if we don't
+operate in a finite field, information about the secret is leaked for every part
+someone recovers.
+
+For these reasons, Vault's implementation of Shamir's Secret Sharing uses finite
+field arithmetic, specifically in GF(2^8), with 229 as the generator. GF(2^8)
+has 256 elements, so using this we can only split one byte at a time. This is
+not a problem, though, as we can just split each byte in our secret
+independently. This implementation uses tables to speed up the execution of
+finite field arithmetic.
diff --git a/shamir/shamir.go b/shamir/shamir.go
index d6f5137e5..6312a1f48 100644
--- a/shamir/shamir.go
+++ b/shamir/shamir.go
@@ -1,5 +1,16 @@
 package shamir
 
+// Some comments in this file were written by @autrilla
+// The code was written by HashiCorp as part of Vault.
+
+// This implementation of Shamir's Secret Sharing matches the definition
+// of the scheme. Other tools used, such as GF(2^8) arithmetic, Lagrange
+// interpolation and Horner's method also match their definitions and should
+// therefore be correct.
+// More information about Shamir's Secret Sharing and Lagrange interpolation
+// can be found in README.md
+
+
 import (
 	"crypto/rand"
 	"crypto/subtle"
@@ -40,6 +51,8 @@ func makePolynomial(intercept, degree uint8) (polynomial, error) {
 }
 
 // evaluate returns the value of the polynomial for the given x
+// Uses Horner's method <https://en.wikipedia.org/wiki/Horner%27s_method> to
+// evaluate the polynomial at point x
 func (p *polynomial) evaluate(x uint8) uint8 {
 	// Special case the origin
 	if x == 0 {
@@ -58,6 +71,9 @@ func (p *polynomial) evaluate(x uint8) uint8 {
 
 // interpolatePolynomial takes N sample points and returns
 // the value at a given x using a lagrange interpolation.
+// An implementation of Lagrange interpolation
+// <https://en.wikipedia.org/wiki/Lagrange_polynomial>
+// For this particular implementation, x is always 0
 func interpolatePolynomial(x_samples, y_samples []uint8, x uint8) uint8 {
 	limit := len(x_samples)
 	var result, basis uint8
@@ -79,6 +95,7 @@ func interpolatePolynomial(x_samples, y_samples []uint8, x uint8) uint8 {
 }
 
 // div divides two numbers in GF(2^8)
+// GF(2^8) division using log/exp tables
 func div(a, b uint8) uint8 {
 	if b == 0 {
 		// leaks some timing information but we don't care anyways as this
@@ -109,6 +126,7 @@ func div(a, b uint8) uint8 {
 }
 
 // mult multiplies two numbers in GF(2^8)
+// GF(2^8) multiplication using log/exp tables
 func mult(a, b uint8) (out uint8) {
 	var goodVal, zero uint8
 	log_a := logTable[a]
@@ -168,7 +186,11 @@ func Split(secret []byte, parts, threshold int) ([][]byte, error) {
 		return nil, fmt.Errorf("cannot split an empty secret")
 	}
 
-	// Generate random list of x coordinates
+	// Generate random x coordinates for computing points. I don't know
+	// why random x coordinates are used, and I also don't know why
+	// a non-cryptographically secure source of randomness is used.
+	// As far as I know the x coordinates do not need to be random.
+
 	mathrand.Seed(time.Now().UnixNano())
 	xCoordinates := mathrand.Perm(255)
 
@@ -177,6 +199,10 @@ func Split(secret []byte, parts, threshold int) ([][]byte, error) {
 	// output is {y1, y2, .., yN, x}.
 	out := make([][]byte, parts)
 	for idx := range out {
+		// Store the x coordinate for each part as its last byte
+		// Add 1 to the xCoordinate because if the x coordinate is 0,
+		// then the result of evaluating the polynomial at that point
+		// will be our secret
 		out[idx] = make([]byte, len(secret)+1)
 		out[idx][len(secret)] = uint8(xCoordinates[idx]) + 1
 	}
@@ -186,6 +212,8 @@ func Split(secret []byte, parts, threshold int) ([][]byte, error) {
 	// a single byte as the intercept of the polynomial, so we must
 	// use a new polynomial for each byte.
 	for idx, val := range secret {
+		// Create a random polynomial for each point.
+		// This polynomial crosses the y axis at `val`.
 		p, err := makePolynomial(val, uint8(threshold-1))
 		if err != nil {
 			return nil, fmt.Errorf("failed to generate polynomial: %v", err)
@@ -195,7 +223,10 @@ func Split(secret []byte, parts, threshold int) ([][]byte, error) {
 		// We cheat by encoding the x value once as the final index,
 		// so that it only needs to be stored once.
 		for i := 0; i < parts; i++ {
+			// Add 1 to the xCoordinate because if it's 0,
+			// then the result of p.evaluate(x) will be our secret
 			x := uint8(xCoordinates[i]) + 1
+			// Evaluate the polynomial at x
 			y := p.evaluate(x)
 			out[i][idx] = y
 		}
@@ -233,6 +264,8 @@ func Combine(parts [][]byte) ([]byte, error) {
 
 	// Set the x value for each sample and ensure no x_sample values are the same,
 	// otherwise div() can be unhappy
+	// Check that we don't have any duplicate parts, that is, two or
+	// more parts with the same x coordinate.
 	checkMap := map[byte]bool{}
 	for i, part := range parts {
 		samp := part[firstPartLen-1]
@@ -250,7 +283,8 @@ func Combine(parts [][]byte) ([]byte, error) {
 			y_samples[i] = part[idx]
 		}
 
-		// Interpolte the polynomial and compute the value at 0
+		// Use Lagrange interpolation to retrieve the free term
+		// of the original polynomial
 		val := interpolatePolynomial(x_samples, y_samples, 0)
 
 		// Evaluate the 0th value to get the intercept