An XML index of all the documents with their proposed dates is already complete. This date index can be combined with various word indices to provide a temporal context for that data. An XML index of all the words in the word index from Zaliznjak 2004, with all of the vowels marked up to indicate their Common Slavic etymology, is currently in production. Also being developed is an XML index of all the given names (as denoted by личн., which includes nicknames) extracted from the word index, marked up for gender and classified as attested (based on data from Tupikov), compositional, or "other", for those names that fall into neither category.
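As a rough illustration of how the date index might be combined with another index, the following sketch joins a name index to a date index by document identifier. The element names, attribute names, document ids, name forms, and dates are all placeholders assumed for this example; they are not the project's actual schema or data.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical date index: one <doc> per document with a proposed date range.
# The ids and years are placeholders, not real data.
DATE_INDEX = """
<dates>
  <doc id="d1" notBefore="1100" notAfter="1120"/>
  <doc id="d2" notBefore="1180" notAfter="1200"/>
</dates>
"""

# Hypothetical name index: each <name> records gender, classification,
# and the documents in which it occurs.
NAME_INDEX = """
<names>
  <name form="NAME_A" gender="m" class="attested">
    <ref doc="d1"/><ref doc="d2"/>
  </name>
  <name form="NAME_B" gender="f" class="compositional">
    <ref doc="d2"/>
  </name>
</names>
"""


def load_dates(xml_text):
    """Map each document id to its (notBefore, notAfter) year range."""
    root = ET.fromstring(xml_text)
    return {
        doc.get("id"): (int(doc.get("notBefore")), int(doc.get("notAfter")))
        for doc in root.iter("doc")
    }


def names_with_dates(names_xml, dates):
    """Attach a date range to every occurrence of every name."""
    result = defaultdict(list)
    for name in ET.fromstring(names_xml).iter("name"):
        for ref in name.iter("ref"):
            doc_id = ref.get("doc")
            if doc_id in dates:  # skip references to undated documents
                result[name.get("form")].append((doc_id, *dates[doc_id]))
    return dict(result)


dates = load_dates(DATE_INDEX)
joined = names_with_dates(NAME_INDEX, dates)
```

Keeping the dates in a separate index, as described above, means the same lookup could be reused with any other index that refers to documents by identifier.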
The full word index, as currently being developed, is intended first for historical phonological research, whereas the name index will serve as the basis for reconstructing a "social network" for medieval Novgorod, looking for patterns in communication between different groups of individuals (different genders, different kinds of names, etc.). Further planned extensions to the project include expanding the etymological information to cover all segments in the corpus and marking up inflectional morphemes to enable a quantitative analysis of historical morphology in the Novgorod dialect.
All of the materials (the XML indices, as well as the XSLT that will be written to generate the results) will be released under a Creative Commons Attribution-Share Alike license, which will enable scholars to repurpose the materials for other research as they see fit. We also encourage others to develop their own materials based on ours and to share them with the larger community of scholars via our project website.