The 1890s corpus, 348,000 tokens in total, consists of the following text classes:

| Text class | File name prefix | Number of tokens | Percent of corpus |
|---|---|---|---|
| Newspapers | aja | 193,000 | 55 % |
| Fiction | ilu | 155,000 | 45 % |

The newspaper texts come from the following titles:

| Newspaper | File name prefix | Number of tokens | Percent of newspaper texts | Percent of corpus |
|---|---|---|---|---|
| Eesti Postimees | epo | 36,600 | 19 % | 11 % |
| Olewik | ole | 33,400 | 17 % | 10 % |
| Postimees | pos | 48,000 | 25 % | 14 % |
| Ristirahwa pühapäewa leht | rip | 2,100 | 1 % | 1 % |
| Sakala | sak | 5,300 | 3 % | 2 % |
| Walgus | val | 60,500 | 31 % | 17 % |
| Wirmaline | vir | 7,100 | 4 % | 2 % |
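
The percentages in both tables follow directly from the token counts. As a minimal sketch, assuming the counts are keyed by file name prefix and that shares are rounded to whole percent, they could be recomputed as shown here. The names `TEXT_CLASSES`, `NEWSPAPERS` and `shares` are illustrative only and are not part of the corpus tooling.

```python
# Token counts copied from the tables above, keyed by file name prefix.
TEXT_CLASSES = {"aja": 193_000, "ilu": 155_000}
NEWSPAPERS = {
    "epo": 36_600,
    "ole": 33_400,
    "pos": 48_000,
    "rip": 2_100,
    "sak": 5_300,
    "val": 60_500,
    "vir": 7_100,
}


def shares(counts, total):
    """Return each count as a percentage of total, rounded to a whole number."""
    return {name: round(100 * n / total) for name, n in counts.items()}


corpus_total = sum(TEXT_CLASSES.values())
newspaper_total = sum(NEWSPAPERS.values())

class_share = shares(TEXT_CLASSES, corpus_total)
paper_share_of_newspapers = shares(NEWSPAPERS, newspaper_total)
paper_share_of_corpus = shares(NEWSPAPERS, corpus_total)

for prefix, tokens in NEWSPAPERS.items():
    print(
        f"{prefix}: {tokens:,} tokens, "
        f"{paper_share_of_newspapers[prefix]} % of newspapers, "
        f"{paper_share_of_corpus[prefix]} % of corpus"
    )
```

The per-title counts sum to the newspaper class total, so the same check also confirms that the two tables are consistent with each other.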