Norms

Although the main result of playing with discriminants in number fields is the existence of integral bases, there are a few other important consequences. For one, proving the finiteness of the class group is based on some juggling around with norms of ideals.

There are three ways of defining norms of ideals, which turn out to be equivalent. The first is by trying to extend the definition of the norm of an element by multiplication; but for that we need to first show that the class group is finite. The second is by letting the norm of an ideal I of O(K) be the number of elements of O(K)/I. The third is by playing with integral bases more.

The proof that every ring of integers of a number field has an integral basis relies on two ideas: first, that O(K) has n elements that are linearly independent over Q (where n is as usual the degree of K), and second, that if such a set is not an integral basis then we can find another set of smaller discriminant. Part two works for any nonzero ideal of O(K) – just do a global search and replace and write I instead of O(K).

For part one, the key fact is that every nonzero ideal I contains a nonzero rational integer, say m (the norm of any nonzero element of I will do). If X = {x1, x2, …, x(n)} is a linearly independent subset of O(K), then so is mX = {mx1, mx2, …, mx(n)}, since m is a nonzero element of Q. Also, since m is in I and the x(i)’s are in O(K), every mx(i) is in I.

Now, given the prominence of the discriminant of smallest absolute value in the proof, it makes sense to give it a special name: the discriminant of K, denoted d(K). The proof shows that every set of n elements of O(K) with discriminant d(K) is an integral basis. In fact, the converse holds: every integral basis has discriminant d(K).

To see why, let A be the matrix corresponding to some integral basis X of discriminant d(K). Let Y = {y1… y(n)} be another integral basis, and write y(j) as b(1j)x1 + b(2j)x2 + … + b(nj)x(n). Let B be the matrix of the elements b(ij). B has integer elements, since X is an integral basis.

We can multiply matrices. The multiplication is weird at first, because it’s based on the original definition of matrices as linear functions on vector spaces. If AB = C, then c(ij) = a(i1)b(1j) + a(i2)b(2j) + … + a(in)b(nj). In general, AB != BA; for example, if A = [1, 0; 1, 1], B = [2, 0; 0, 1], then AB = [2, 0; 2, 1], but BA = [2, 0; 1, 1]. But in fact, |AB| = |BA| = |A|*|B|.
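
To make the multiplication rule concrete, here is a minimal Python sketch that multiplies the two matrices A and B from the example and checks the determinant identity. The helper names matmul and det2 are made up for this illustration, and det2 only handles 2-by-2 matrices.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows:
    c(ij) = a(i1)b(1j) + a(i2)b(2j) + ... + a(in)b(nj)."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det2(M):
    """Determinant of a 2-by-2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


A = [[1, 0], [1, 1]]
B = [[2, 0], [0, 1]]

AB = matmul(A, B)
BA = matmul(B, A)

print(AB)                                   # [[2, 0], [2, 1]]
print(BA)                                   # [[2, 0], [1, 1]]
print(det2(AB), det2(BA), det2(A) * det2(B))  # 2 2 2
```

The two products differ, but all three determinants agree.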

The reason you should care about this is that the matrix corresponding to Y is just AB, by the definition of the b(ij)’s as coefficients of the y(j)’s. The conjugates don’t give us any problems, since for every rational number k and every embedding w, k*w(x) = w(kx). So disc(Y) = |AB|^2 = (|A|^2)(|B|^2) = disc(X)(|B|^2).

Since B has integer entries, |B| is an integer, and it is nonzero because Y is linearly independent. So |B|^2 is a positive integer, and all discriminants of integral bases have the same sign. Further, disc(Y) is clearly divisible by disc(X). But by the same argument with the roles of X and Y swapped, disc(X) is divisible by disc(Y), so that disc(X) = disc(Y). It’s justified to define d(K) to be disc(X) for any integral basis X, then.

By the same argument, we get that disc(I) is the same regardless of which integral basis for I we choose. Now let B be the integer matrix that writes an integral basis of I in terms of an integral basis of O(K), so that disc(I) = d(K)(|B|^2). We can recover |B|^2 as disc(I)/d(K), and then take a square root. There are two possible square roots, corresponding to the two possible signs of |B|; we can just take the positive one, and call it N(I).
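
As a sanity check on the last few paragraphs, here is a small Python sketch in one specific field. It assumes K = Q(i), so that O(K) is the Gaussian integers with integral basis {1, i} and the two embeddings are the identity and complex conjugation. The function disc, the bases X and Y, and the ideal I = (1 + i) are all choices made for this illustration.

```python
import math


def disc(basis):
    """Discriminant of a pair (x1, x2) of Gaussian integers: the square of
    the determinant of [[x1, x2], [conj(x1), conj(x2)]]."""
    x1, x2 = basis
    det = x1 * x2.conjugate() - x2 * x1.conjugate()
    # det * det is a rational integer here; round() removes the float wrapper.
    return int(round((det * det).real))


X = (1 + 0j, 1j)              # the integral basis {1, i}
Y = (2 + 1j, 1 + 1j)          # another integral basis: B = [2, 1; 1, 1], |B| = 1
I_basis = (1 + 1j, -1 + 1j)   # basis of the ideal (1 + i): (1 + i)*1, (1 + i)*i

d_K = disc(X)
d_I = disc(I_basis)
norm_I = math.isqrt(d_I // d_K)   # positive square root of |B|^2

print(d_K, disc(Y))   # -4 -4
print(d_I, norm_I)    # -16 2
```

Both integral bases of O(K) give discriminant -4, and disc(I)/d(K) = 4 gives N(I) = 2 for this ideal.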

If c is a nonzero element of O(K), then to find the norm of (c), let’s look for a good integral basis of (c). If X is an integral basis of O(K), then cX is an integral basis of (c). All elements of cX are clearly in (c), and we can write every a in O(K) as a sum of x(i)’s times integers, which lets us write ca as a sum of the cx(i)’s times integers. If A is the matrix of X, then the matrix of cX has a first row equal to that of X times w1(c) = c, a second row equal to that of X times w2(c), and so on. So the matrix has determinant |A|*w1(c)*w2(c)*…*w(n)(c) = |A|*N(c). Comparing with the matrix AB of cX, we get |B| = N(c), so the norm of (c) is the absolute value of the norm of c.
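
The computation above can be tried out in a quadratic field. The sketch below assumes K = Q(sqrt(2)), whose ring of integers has integral basis {1, sqrt(2)}, and represents c = a + b*sqrt(2) by the pair (a, b). The function names are invented for this example.

```python
def mult_matrix(a, b, d=2):
    """Matrix B writing the basis cX = {c, c*sqrt(d)} of the ideal (c) in
    terms of X = {1, sqrt(d)}, where c = a + b*sqrt(d).
    Column j holds the coordinates of c*x(j):
      c * 1       = a   + b*sqrt(d)
      c * sqrt(d) = b*d + a*sqrt(d)
    """
    return [[a, b * d],
            [b, a]]


def det2(M):
    """Determinant of a 2-by-2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def element_norm(a, b, d=2):
    """N(c) = w1(c)*w2(c) = (a + b*sqrt(d)) * (a - b*sqrt(d))."""
    return a * a - d * b * b


def ideal_norm(a, b, d=2):
    """N((c)) as the absolute value of |B|."""
    return abs(det2(mult_matrix(a, b, d)))


for a, b in [(3, 1), (1, 2), (1, 1)]:
    print((a, b), element_norm(a, b), ideal_norm(a, b))
# (3, 1) 7 7
# (1, 2) -7 7
# (1, 1) -1 1
```

The element 1 + 2*sqrt(2) has negative norm -7, while the ideal it generates has norm 7. The last line is a unit: 1 + sqrt(2) generates all of O(K), which has norm 1.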

Finally, N(I) is equal to the size of O(K)/I. Take integral bases X of O(K) and Y of I, and let B be the integer matrix that writes Y in terms of X, as before. We can swap two rows or two columns of B, or add an integer multiple of one row (or column) to another, and none of this will change the absolute value of |B|. Further, it won’t change I: playing with rows will change X to another integral basis of O(K) but do nothing to Y, and playing with columns will change Y to another integral basis of I.

There’s a procedure for performing these operations in a way that will change B to a diagonal matrix D, that is, one with nonzero values only on the diagonal running from top-left to bottom-right. Then |D| is just the product of the diagonal entries, d11*d22*…*d(nn). The new bases satisfy y(i) = d(ii)x(i), so regarding O(K) just as a group and I as a subgroup, we get that O(K)/I has the same structure as Z/d11Z * Z/d22Z * … * Z/d(nn)Z, which has |d11*d22*…*d(nn)| elements. Since |D| and |B| agree up to sign, this number is N(I).
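
Here is one possible way to carry out such a procedure, as a hedged Python sketch rather than the standard algorithm: it repeatedly moves the smallest nonzero entry to the pivot position and reduces its row and column with integer steps, using only swaps and "add an integer multiple" operations. It stops at a diagonal matrix and does not try to put the diagonal entries in any canonical order. The names diagonalize and quotient_size are made up for this example, and the first test matrix is the one for the ideal (2 + i) in the Gaussian integers, which is an assumed example.

```python
def diagonalize(M):
    """Return a diagonal integer matrix obtained from the square integer
    matrix M by row/column swaps and by adding integer multiples of one
    row (or column) to another."""
    M = [row[:] for row in M]          # work on a copy
    n = len(M)
    for t in range(n):
        while True:
            # Smallest nonzero entry (in absolute value) of the block M[t:, t:].
            entries = [(abs(M[i][j]), i, j)
                       for i in range(t, n) for j in range(t, n)
                       if M[i][j] != 0]
            if not entries:
                return M               # remaining block is zero
            _, pi, pj = min(entries)
            M[t], M[pi] = M[pi], M[t]  # row swap brings the pivot to row t
            for row in M:              # column swap brings it to column t
                row[t], row[pj] = row[pj], row[t]
            p = M[t][t]
            # Row operations: reduce the entries below the pivot modulo p.
            for i in range(t + 1, n):
                q = M[i][t] // p
                M[i] = [x - q * y for x, y in zip(M[i], M[t])]
            # Column operations: reduce the entries right of the pivot modulo p.
            for j in range(t + 1, n):
                q = M[t][j] // p
                for row in M:
                    row[j] -= q * row[t]
            done = (all(M[i][t] == 0 for i in range(t + 1, n)) and
                    all(M[t][j] == 0 for j in range(t + 1, n)))
            if done:
                break                  # otherwise a smaller pivot now exists
    return M


def quotient_size(D):
    """|d11 * d22 * ... * d(nn)| for a diagonal matrix D."""
    size = 1
    for i in range(len(D)):
        size *= abs(D[i][i])
    return size


B = [[2, -1], [1, 2]]      # writes {2 + i, (2 + i)*i} in terms of {1, i}
D = diagonalize(B)
print(D, quotient_size(D))  # [[-1, 0], [0, 5]] 5
```

The loop terminates because every pass that leaves a nonzero remainder also produces a strictly smaller pivot. In the example the quotient has 5 elements, which matches |B| = 2*2 + 1*1 = 5.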