# N-ary m-sequence generator in Python

Maximum length sequences (or m-sequences) are bit sequences generated using maximal LFSRs (Linear Feedback Shift Registers) and are so called because they are periodic and reproduce every binary sequence that can be represented by the shift registers (i.e., for length-m registers they produce a sequence of length $2^{m}-1$). Although originally defined for binary sequences, the concept of m-sequence can be extended to sequences over $GF(n), n>2$.

Despite being deterministically generated, m-sequences have nice statistical properties, which resemble those of white noise (they are spectrally flat, with the exception of a near-zero constant term). For this reason, m-sequences are often referred to as pseudo random noise code (PN code). Unfortunately though, unpredictability is not among those properties (see below).

The configuration of the feedback taps in a binary LFSR can be expressed in finite field arithmetic as a polynomial in $GF(2^{n})$. The LFSR is maximal-length if and only if the corresponding feedback polynomial is primitive. The linear complexity (LC) of a given periodic sequence is the number of cells in the shortest LFSR that can generate that sequence. The correlation between two sequences is the complex inner product of the first sequence with a shifted version of the second sequence. It is called an autocorrelation if the two sequences are the same or a cross-correlation if they are distinct. Particularly interesting for spread spectrum application are sequences (not necessarily m-sequences) that exhibit low autocorrelation and low cross-correlation (the ideal correlation function would then be the Kronecker’s delta).

Apart from spread spectrum communications, m-sequences are used in a variety of areas including digital watermarking and cryptography. As m-sequences are derived from linear recurrence relations — which lead to fairly easy cryptanalysis — they must be processed by nonlinear functions in order to be used in cryptographic applications. A good example is Trivium, a 80-bit stream cipher that consists of three shift registers of different lengths.

```#!/usr/bin/python3
#
# Author: Joao H de A Franco (jhafranco@acm.org)
#
# Description: N-ary sequence generator in Python
#
# Version: 1
#
# Date: 2014-02-14
#
#          (CC BY-NC-SA 3.0)
#===========================================================

from functools import reduce
from fractions import Fraction
from cmath import rect,pi

def flatList(s,q):
"""Converts (list of) polynomials into (list of) elements
of finite field mapped as integers.
s: list of polynomials
q: order of finite field"""
if type(s[0]) is int:
return s
elif type(s[0]) is list:
return [reduce(lambda i,j:(i*q)+j,e) for e in s]
else:
raise TypeError

def printListOfNumbers(S,L,*X):
"""Print list of numbers according to their particular
format (integer, float or complex)"""
print(S+" [",end="")
for i,e in enumerate(L):
if type(e) is complex:
print("{:5.3f}{:+5.3f}i".format(e.real,e.imag),end="")
elif type(e) is float:
print("{:5.3f}".format(e),end="")
elif type(e) is int:
print("{:d}".format(e),end="")
elif type(e) is Fraction:
print("{:s}".format(e),end="")
else:
raise TypeError
if i<len(L)-1:
print(", ",end="")
print("]",*X)

def xac(s,q=None,d=None,n=None):
"""Evaluates the complex auto-correlation of a periodic
sequence with increasing delays within the period.
Input length must correspond to a full period.
s: q-ary periodic sequence
q: order of finite field
d: maximum denominator allowed in fractions
n: number of decimal places in floating point numbers"""
return xcc(s,s,q,d,n)

def xcc(s1,s2,q=None,d=None,n=None):
"""Evaluates the complex cross-correlation between two
equally periodic q-ary sequences with increasing delays
within the period. Input length must correspond to a full
period.
s: q-ary periodic sequence
q: order of finite field
d: maximum denominator allowed in fractions
n: number of decimal places in floating point numbers"""

def cc(s1,s2):
"""Evaluates the complex correlation between two equally
periodic numerical sequences.
s1,s2: q-ary periodic sequences"""
assert type(s1[0]) == type(s2[0])
if type(s1[0]) is list:
s3 = [[(j-i)%q for i,j in zip(u,v)] for u,v in zip(s1,s2)]
s4 = [reduce(lambda x,y:(x*q)+y,e) for e in s3]
z = sum(rect(1,2*pi*i/q) for i in s4)/len(s1)
elif type(s1[0]) is int:
z = sum(rect(1,2*pi*(j-i)/q) for i,j in zip(s1,s2))/len(s1)
else:
raise TypeError
zr,zi = round(z.real,n),round(z.imag,n)
if abs(zi%1)<10**-n:
if abs(zr-round(zr))<10**-n:
return int(zr)
elif Fraction(z.real).limit_denominator().denominator<=d:
return Fraction(z.real).limit_denominator()
else:
return zr
else:
return complex(zr,zi)

q = 2 if q is None else q
d = 30 if d is None else d
n = 3 if n is None else n
assert len(s1) == len(s2)
return [cc(s1,s2[i:]+s2[:i]) for i in range(len(s1))]

def LFSR(P,S,M,N,K):
"""Outputs K-ary sequence with N elements. Each element is
derived from M successive values of the LFSR sequence
generated by polynomial P and initial state S. Polynomial
P is represented by a list of coefficients in decreasing
power order."""
def LFSR2():
"""Generates linear K-ary sequence according to polynomial
P and initial state S. If P is primitive, sequence length
is exactly one period."""
seq,st = [S[-1]],S
for j in range(K**len(S)-2):
st0 = sum([i*j for i,j in zip(st,P[1:])])%K
st = [st0]+st[:-1]
seq += [st[-1]]
return seq

assert len(P) > 1 and len(P)-1 == len(S)
s = LFSR2()
L = len(s)
assert M <= L
return [s[i%L] if M == 1 else (s[i%L:]+s[:i%L])[:M] for i in range(N)]

if __name__ == "__main__":

# Binary sequence over GF(2) generated by the LFSR
# defined by non primitive polynomial x⁴+x³+x+1.
# This is not a m-sequence since its period is 6
# (<15 = 2⁴-1)
s01 = LFSR([1,1,0,1,1],[0,0,0,1],1,15,2)
print("s01 =",s01,"\n")
```
``````s01 = [1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0]
``````
```    # Sequence over GF(2⁴) derived from sequence "s01".
# The polynomial elements were also mapped to integers
# in Z16, in order to show the period.
s04 = LFSR([1,1,0,1,1],[0,0,0,1],4,15,2)
print("s04 =",s04,"\n")
print("s04 =",flatList(s04,2),"\n")
```
``````s04 = [[1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 1], [0, 1, 1, 1],
[1, 1, 1, 0], [1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1],
[0, 0, 1, 1], [0, 1, 1, 1], [1, 1, 1, 0], [1, 1, 0, 0],
[1, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0]]

s04 = [8, 1, 3, 7, 14, 12, 8, 1, 3, 7, 14, 12, 9, 2, 4]
``````
```    # m-sequence over GF(2) generated by the LFSR defined
# by the primitive polynomial x⁴+x+1. Its period is
# equal to 2⁴-1=15
s11 = LFSR([1,1,0,0,1],[0,0,0,1],1,15,2)
print("s11 =",s11,"\n")
```
``````s11 = [1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0]
``````
```    # Autocorrelation of m-sequence "s11". This function
# is two-valued and has the same period as sequence
# "s11"
printListOfNumbers("ac(s11) =",xac(s11),"\n")
```
``````ac(s11) = [1, -1/15, -1/15, -1/15, -1/15, -1/15, -1/15, -1/15,
-1/15, -1/15, -1/15, -1/15, -1/15, -1/15, -1/15]
``````
```    # Sequence over GF(2²) derived from m-sequence "s11"
s12 = LFSR([1,0,0,1,1],[0,0,0,1],2,15,2)
print("s12 =",s12,"\n")
```
``````s12 = [[1, 0], [0, 0], [0, 0], [0, 1], [1, 0], [0, 0], [0, 1],
[1, 1], [1, 0], [0, 1], [1, 0], [0, 1], [1, 1], [1, 1], [1, 1]]
``````
```
# Autocorrelation of sequence "s12"
printListOfNumbers("ac(s12) =",xac(s12,4),"\n")
```
``````ac(s12) = [1, 7/15, 7/15, 7/15, 7/15, 7/15, 7/15, 7/15, 7/15,
7/15, 7/15, 7/15, 7/15, 7/15, 7/15]
``````
```    # Sequence over GF(2³) derived from m-sequence "s11"
s13 = LFSR([1,0,0,1,1],[0,0,0,1],3,15,2)
print("s13 =",s13,"\n")
```
``````s13 = [[1, 0, 0], [0, 0, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0],
[0, 0, 1], [0, 1, 1], [1, 1, 0], [1, 0, 1], [0, 1, 0],
[1, 0, 1], [0, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 0]]
``````
```    # Autocorrelation of sequence "s13"
printListOfNumbers("ac(s13) =",xac(s13,8,d=1000),"\n")
```
``````ac(s13) = [1, 0.844, 0.844, 0.844, 0.844, 0.844, 0.844, 0.844,
0.844, 0.844, 0.844, 0.844, 0.844, 0.844, 0.844]
``````
```    # Sequence over GF(2⁴) derived from the m-sequence "s11"
s14 = LFSR([1,0,0,1,1],[0,0,0,1],4,15,2)
print("s14 =",s14,"\n")
```
``````s14 = [[1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0],
[1, 0, 0, 1], [0, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 1],
[1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 1], [0, 1, 1, 1],
[1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 0, 0]]
``````
```    # Autocorrelation of sequence "s14"
printListOfNumbers("ac(s14) =",xac(s14,16,d=1000),"\n")
```
``````ac(s14) = [1, 0.959, 0.959, 0.959, 0.959, 0.959, 0.959, 0.959,
0.959, 0.959, 0.959, 0.959, 0.959, 0.959, 0.959]
``````
```    # Binary sequence over GF(2) generated by the LFSR
# defined by the primitive polynomial (x⁴+x³+1).
# This is also a m-sequence with period equal to
# 15 = 2⁴-1
s21 = LFSR([1,0,0,1,1],[0,0,0,1],1,15,2)
print("s21 =",s21,"\n")
```
``````s21 = [1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1]
``````
```    # Autocorrelation of sequence "s21"
printListOfNumbers("ac(s21) =",xac(s21),"\n")
```
``````ac(s21) = [1, -1/15, -1/15, -1/15, -1/15, -1/15, -1/15,
-1/15, -1/15, -1/15, -1/15, -1/15, -1/15, -1/15, -1/15]
``````
```    # Cross-correlation between m-sequences "s11" and "s21".
# This function is four-valued and has the same period
# as sequences "s11" and "s21"
printListOfNumbers("cc(s11,s21) =",xcc(s11,s21),"\n")
```
``````cc(s11,s21) = [-1/15, 1/5, -1/15, 7/15, 1/5, -1/3, -1/15,
1/5, 7/15, -1/3, 1/5, -1/3, -1/3, -1/15, -1/15]
``````
```    # m-sequence over GF(3) generated by the LFSR defined
# by the primitive polynomial x²+2x+1. Its period is
# equal to 3²-1=8
s31 = LFSR([1,2,1],[1,0],1,8,3)
print("s31 =",s31,"\n")
```
``````s31 = [0, 1, 2, 2, 0, 2, 1, 1]
``````
```    # Autocorrelation of sequence "s31"
printListOfNumbers("ac(s31) =",xac(s31,3),"\n")
```
``````ac(s31) = [1, -1/8, -1/8, -1/8, -1/8, -1/8, -1/8, -1/8]
``````
```    # Sequence over GF(3²) derived from m-sequence "s31"
s32 = LFSR([1,2,1],[1,0],2,8,3)
print("s32 =",s32,"\n")
```
``````s32 = [[0, 1], [1, 2], [2, 2], [2, 0], [0, 2], [2, 1], [1, 1], [1, 0]]
``````
```    # Autocorrelation of sequence "s32"
printListOfNumbers("ac(s32) =",xac(s32,3),"\n")
```
``````ac(s32) = [1, -1/8, -1/8, -1/8, -1/8, -1/8, -1/8, -1/8]
``````
```    # Sequence over GF(3³) derived from m-sequence "s31"
s33 = LFSR([1,2,1],[1,0],3,8,3)
print("s33 =",s33,"\n")
```
``````s33 = [[0, 1, 2], [1, 2, 2], [2, 2, 0], [2, 0, 2],
[0, 2, 1], [2, 1, 1], [1, 1, 0], [1, 0, 1]]
``````
```    # Autocorrelation of sequence "s33"
printListOfNumbers("ac(s33) =",xac(s33,3),"\n")
```
``````ac(s33) = [1, -1/8, -1/8, -1/8, -1/8, -1/8, -1/8, -1/8]
``````
```    # m-sequence over GF(3) generated by the LFSR defined
# by the primitive polynomial x³+x²+2. Its period is
# equal to 3³-1=26
s41 = LFSR([1,1,0,2],[1,0,0],1,26,3)
print("s41 =",s41,"\n")
```
``````s41 = [0, 0, 1, 1, 1, 0, 2, 1, 1, 2, 1, 0, 1, 0, 0, 2, 2, 2, 0, 1,
2, 2, 1, 2, 0, 2]
``````
```    # Autocorrelation of m-sequence "s41". This function is
# two-valued and has the same period as sequence "s41"
printListOfNumbers("ac(s41) =",xac(s41,3),"\n")
```
``````ac(s41) = [1, -1/26, -1/26, -1/26, -1/26, -1/26, -1/26, -1/26, -1/26,
-1/26, -1/26, -1/26, -1/26, -1/26, -1/26, -1/26, -1/26, -1/26,
-1/26, -1/26, -1/26, -1/26, -1/26, -1/26, -1/26, -1/26]
``````
```    # Sequence over GF(3²) derived from m-sequence "s41"
s42 = LFSR([1,1,0,2],[1,0,0],2,26,3)
print("s42 =",s42,"\n")
```
``````s42 = [[0, 0], [0, 1], [1, 1], [1, 1], [1, 0], [0, 2],
[2, 1], [1, 1], [1, 2], [2, 1], [1, 0], [0, 1],
[1, 0], [0, 0], [0, 2], [2, 2], [2, 2], [2, 0],
[0, 1], [1, 2], [2, 2], [2, 1], [1, 2], [2, 0],
[0, 2], [2, 0]]
``````
```    # Autocorrelation of sequence "s42"
printListOfNumbers("ac(s42) =",xac(s42,9),"\n")
```
``````ac(s42) = [1, 0.701, 0.701, 0.701, 0.701, 0.701, 0.701,
0.701, 0.701, 0.701, 0.701, 0.701, 0.701, 0.838,
0.701, 0.701, 0.701, 0.701, 0.701, 0.701, 0.701,
0.701, 0.701, 0.701, 0.701, 0.701]
``````
```    # Sequence over GF(3³) derived from m-sequence "s31"
s43 = LFSR([1,1,0,2],[1,0,0],3,26,3)
print("s43 =",s43,"\n")
```
``````s43 = [[0, 0, 1], [0, 1, 1], [1, 1, 1], [1, 1, 0],
[1, 0, 2], [0, 2, 1], [2, 1, 1], [1, 1, 2],
[1, 2, 1], [2, 1, 0], [1, 0, 1], [0, 1, 0],
[1, 0, 0], [0, 0, 2], [0, 2, 2], [2, 2, 2],
[2, 2, 0], [2, 0, 1], [0, 1, 2], [1, 2, 2],
[2, 2, 1], [2, 1, 2], [1, 2, 0], [2, 0, 2],
[0, 2, 0], [2, 0, 0]]
``````
```    # Autocorrelation of sequence "s43"
printListOfNumbers("ac(s43) =",xac(s43,27),"\n")
```
``````ac(s43) = [1, 0.963, 0.963, 0.963, 0.963, 0.963, 0.963,
0.963, 0.963, 0.963, 0.963, 0.963, 0.963, 0.981,
0.963, 0.963, 0.963, 0.963, 0.963, 0.963, 0.963,
0.963, 0.963, 0.963, 0.963, 0.963]
``````
```    # Sequence over Z20 with ideal autocorrelation function
# (Kronecker's delta)
s51 = [t**2%20 for t in range(10)]
print("s51 =",s51,"\n")
```
``````s51 = [0, 1, 4, 9, 16, 5, 16, 9, 4, 1]
``````
```    # Autocorrelation of sequence "s51"
printListOfNumbers("xac(s51) =",xac(s51,20),"\n")
```
``````xac(s51) = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
``````
```    # Sequence over Z7 with ideal autocorrelation function
# (Kronecker's delta)
s61 = [(t*(t+1)//2)%7 for t in range(7)]
print("s61 =",s61,"\n")
```
``````s61 = [0, 1, 3, 6, 3, 1, 0]
``````
```    # Autocorrelation of sequence "s61"
printListOfNumbers("xac(s61) =",xac(s61,7),"\n")
```
``````xac(s61) = [1, 0, 0, 0, 0, 0, 0]
``````

# An elegant algorithm for calculating Hamming distance

Generally speaking, the Hamming distance between two strings of the same length is the number of positions in which the corresponding symbols are different. This metric, proposed by Richard W. Hamming in his seminal paper “Error Detecting and Error Correcting Codes”, Bell System Technical Journal 26(2):147-160 (1950), also expresses the least number of (intended or unintended) substitutions needed to change one string to another.

Besides being used in computer- and communications-related fields such as information theory, coding theory, and cryptography, the Hamming distance concept has also found its way into genomics for the comparison of genomic sequences. Its importance can also be judged by the fact that modern microprocessors have a specific instruction (POPCNT, “population count”) for counting the number of bits set to 1.

Assuming two bit strings of equal length, x and y, the “obvious” algorithm to calculate the Hamming distance between them is to count the number of “1” bits in the result of the expression “x xor y”, as shown in the following Python code:

```def hamming1(x,y):
"""Calculate the Hamming distance between two bit strings"""
assert len(x) == len(y)
count,z = 0,int(x,2)^int(y,2)
while z:
if z&1:
count += 1
z >>= 1
return count
```

While searching the web the other day, I have come across a very elegant algorithm for the same purpose, which can be coded in Python as follows:

```def hamming2(x,y):
"""Calculate the Hamming distance between two bit strings"""
assert len(x) == len(y)
count,z = 0,int(x,2)^int(y,2)
while z:
count += 1
z &= z-1 # magic!
return count
```

Apart from being 50% faster than the first one, its simplicity is awesome! After being counted, the lowest-order nonzero bit is cleared like magic! When I started a new search later, this time looking for the algorithm’s authorship, I finally learned that it was proposed by Peter Wegner in 1960 (“A technique for counting ones in a binary computer“, Communications of the ACM 3 (5): 322).

Where did I find this information? The answer was, talking again about “obvious” things, in the very same Wikipedia’s entry for “Hamming distance“! This little gem has been there for at least a year (according to the entry’s history), just waiting to be discovered! Quoting Antoine de Saint Exupéry, “You know you have achieved perfection in design, not when you have nothing more to add, but when you have nothing more to take away.”