The 1970s corpus contains 425,600 tokens in total and consists of the following text classes:
| Text class | Filename beginning | Number of tokens | Per cent of corpus |
|---|---|---|---|
| Newspapers | (several, see below) | 168,500 | 40 % |
| Fiction | ilu | 257,100 | 60 % |
Newspaper texts are from the following titles:
| Newspaper | Filename beginning | Number of tokens | Per cent of newspapers | Per cent of corpus |
|---|---|---|---|---|
| Edasi | ed | 27,000 | 16 % | 6 % |
| Kodumaa | km | 10,600 | 6 % | 2.5 % |
| Noorte Hääl | nh | 37,000 | 22 % | 9 % |
| Punane Täht | pt | 3,800 | 2 % | 1 % |
| Rahva Hääl | rh | 60,500 | 36 % | 14 % |
| Sirp ja Vasar | sv | 21,500 | 13 % | 5 % |
| Õhtuleht | ol | 8,000 | 5 % | 2 % |
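The percentage columns can be recomputed from the token counts. The following is a minimal sketch, not part of the corpus tooling: the variable and function names are illustrative assumptions, and only the figures are taken from the tables above. The per-title counts as listed sum to 168,400, which is 100 tokens short of the stated newspaper total of 168,500, presumably because the counts are rounded; the sketch uses the stated totals as denominators.

```python
# Token counts copied from the tables above; all names here are illustrative.
CORPUS_TOTAL = 425_600      # whole 1970s corpus
NEWSPAPER_TOTAL = 168_500   # newspaper text class, as stated in the first table

# Filename beginning -> number of tokens
newspaper_tokens = {
    "ed": 27_000,   # Edasi
    "km": 10_600,   # Kodumaa
    "nh": 37_000,   # Noorte Hääl
    "pt": 3_800,    # Punane Täht
    "rh": 60_500,   # Rahva Hääl
    "sv": 21_500,   # Sirp ja Vasar
    "ol": 8_000,    # Õhtuleht
}


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


def newspaper_shares(tokens=newspaper_tokens):
    """Map each filename beginning to (% of newspapers, % of corpus)."""
    return {
        prefix: (share(n, NEWSPAPER_TOTAL), share(n, CORPUS_TOTAL))
        for prefix, n in tokens.items()
    }


for prefix, (of_news, of_corpus) in newspaper_shares().items():
    print(f"{prefix}: {of_news:.1f} % of newspapers, {of_corpus:.1f} % of corpus")
```

Printing one decimal place makes it easier to see how the whole-number percentages in the table were rounded.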