Encoding information using Reed-Solomon codes
Reed-Solomon codes are non-binary, block, and error correcting codes may be used in storing information to avoid damaged data loss. It should be noted, that this approach will not be rational in many cases, but it helps to realize the noiseless coding with a variable percentage of recoverable information.
Using Reed- Solomon codes one needs redundancy as doubled size of recovered data. Let me explain with on example: if we have a sequence of 10 characters and want to be able to repair errors in 3eh of them, then we need to store 10 3 * 2 = 16 characters. Variables, that we will use: n = 10, the number of information symbols; t = 3, maximal number of recovered symbols; g = 16, length of the encoded sequence. Thus, the formula can be written as: g = n + t * 2.
While encoding and decoding data, all arithmetic operations are performed in Galois fields. There applied the so-called polynomial arithmetic or Galois field arithmetic. Thus, the result of any operation is also part of this field. Specific Galois field consists of a fixed range of numbers. Galois field is defined by prime number p, which is called characteristic of the field, and positive integer m. The order, or number of elements is the form pm. When m = 1, a field is called prime field. In cases when m>1, to construct the field we need monic irreducible of degree m. GF (pm) - the designation of the Galois field.
To work with the digital data is natural to use p = 2 as characteristic of the field. If m = 1 each element of code sequence is a bit, if m = 8 - 8 bits, ie byte. Actually Reed-Solomon codes operate on bytes and are the most common.
Before moving on to the encoding and decoding operations we need todeal with the Galois field arithmetic on example of GF (23). This field consists of the numbers from 0 to 7.
The simplest is the operation of addition, which is a simple bitwise exclusive disjunction (XOR).
Unfortunately, multiplication is much more difficult to implement, to do it, we need to convert a number in polynomial form.
As you can see the number in polynomial form is a polynomial whose coefficients are the values of bits in the binary representation of the number.
Multiplication of two numbers in polynomial form:
Firstly, it should be noted that even in polynomial form addition is exclusive disjunction, and therefore x2+x2=0 . Secondly, the multiplication result 27 is not included in GF (23) (it consists of numbers from 0 to 7, as mentioned above). To deal with this problem it is necessary to use the generator polynomial. The generator polynomial is irreducible (by analogy with prime numbers is divisible by 1 and itself). For example we will use generating polynomial: (x)=x3+x+1 .
It is also assumed that x satisfies the equation f(x)=x3+x+1=0 .
The end of multiplication example:
The same result can be obtained as remainder of division of the obtained polynomial by the generator polynomial:
Division in polynomial form is quite hard to understand. It's much better to execute division using multiplication table:
Exponentiation table of Galois field’s elements is also very impotant. Exponentiation is carried out in polynomial form, similar to multiplication.
Exponentiation table has cycle: seventh power corresponds to zeroth, then eighth corresponds to first, etc. You may want to check it out.
In Galois fields there exists a primitive element - an element whose powers include all non-zero elements of field. Exponentiation table shows that this condition correspond to all the elements (well, except for one of course). However, this does not always happen, for example, view exponentiation table for GF (16).
For fields we are considering (with characteristic 2) 2 is always chosen as primitive element. Considering his property, any element of field can be expressed as power of primitive member.
Considering the cycle of exponentiation table, we can try again to multiply number:
5∙7=26∙25=26+5=211=211 mod 7=24=6
We got the same result.
Now we can try division:
This result is also matches with previous.
And in conclusion let’s try exponentiation:
5^2=(26)2=26∙2=212=212 mod 7=25=7
And again result matches.
This approach to multiplication and division is much easier than the real operation with the use of polynomials, and there is no need to store a large multiplication table, but a string of powers of primitive element.
So, before coding we chose GF(q), where q=pm. The length of coding sequence should be q-1. Thus, in case of using GF(8) coding sequence consists of 7 elements. Further we should determine which elements would be informational, and which of them would be check (redundant). At the very beginning, we said that the amount of redundant elements should be twice greater than the number of wrong elements, which we want to restore. If we need to correct twofold error (t = 2 – is the error multiplicity), then we should use four check symbols. Let’s apply this to our example: of seven elements there needed four redundant, and therefore three information elements to correct twofold error. The code sequence is as follows:
C=(c0, c1, c2, c3, c4, c5, c6), where c0, c1, c2 – informational, c3, c4, c5, c6 – redundant.
I want to draw attention to the fact that the correction of double error in the code sequence of the seven elements means that you can deal with error, the probability of which is not more than per = 2/7 ≈ 0,29. If the probability of mistake is grater, you need to increase the redundancy, otherwise recovery of distorted information will not work.
Let’s encode the sequence С=( 4, 6, 7, 0, 0, 0, 0), the last four elements – redundant and equal to zero.
C in polynomial form:
Formula for encoding: cj'=C(zj) , where z=2 – is a primitive element.
We got encoded sequence: C'=(5,2,5,3,3,2,4). In polynomial form: С(x)=5∙x0+2∙x1+5∙x2+3∙x3+3∙x4+2∙x5+4∙x6.
Formula for encoding: cj =C' (z-j) .
с1=C' (2-1 )=C' (5)=5+2∙5+5∙7+3∙6+3∙3+2∙4+4∙2=5+1+6+1+5+3+3=6
с2=C' (2-2 )=C' (7)=⋯=7
с3=C' (2-3 )=C' (6)=⋯=0
с4=C' (2-4 )=C' (3)=⋯=0
с5=C' (2-5 )=C' (4)=⋯=0
с6=C' (2-6 )=C' (2)=⋯=0
While decoding we got (4, 6, 7, 0, 0, 0, 0), which is equal to original. To check if there are some mistakes it's enough to look at redundant elements. If they are still equal to zero, then there is no errors.
Error is another sequence which is added to the encoded. Assume, that error-vector looks like: f' =(0, 0, 5, 0, 3, 0, 0), then encoded sequence with mistake is:
Let’s try to decode the given codeword: Cf'=5∙x0+2∙x1+0∙x2+3∙x3+0∙x4+2∙x5+4∙x6=5+2∙x1+3∙x3+2∙x5+4∙x6
Cf=(2, 5, 7, 6, 5, 5, 3) – decoded sequence. As you can see, the last four elements are not equal to zero, which indicates an error. Firstly it's necessary to determine the position of distorted elements. For this we need to calculate the error location polynomial. His roots indicate positions of errors. In matrix form error location polynomial looks like L=[1, l1, l2, … lt]. As far as in our example t=2, so L=[1, l1, l2].
Last four symbols of encoded sequence:
Let’s form a matrix and column-vector, to calculate L. In general case:
For our example:
Calculate M-1 , considering, that we use Galois field arithmetic:
Just in case we will verify the calculations:
L in polynomial form:
Calculate roots with simple search:
So errors are in positions c2' and c4'.
Now we need to find the correct values. Bring L(x) to normal form:
Error-vector looks like this:
Create convolution for f0, f1, f2, and calculate them:
So F=(6, 3, 0, 6, 5, 5, 3). Since error was added to encoded sequence we need to encode F:
Since we already know positions of errors, it’s enough to calculate only and , all others will be equal to zero. But in order to accurately verify this let's honestly count all values:
Finally f'=(0, 0, 5, 0, 3, 0, 0). Add error-vector to wrong encoded sequence:
As result we got correct encoded sequence.