This project planned to create an XML index of every word in the word index of Zaliznjak 2004, with each vowel marked up to indicate its Common Slavic etymology. It also planned a second XML index of all the given names extracted from that word index (those labeled личн., a label that also covers nicknames), marked up for gender and classified as attested (based on data from Tupikov), compositional, or "other" for names that fall into neither category.
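
As a rough illustration of how a name index of this kind might be queried, the sketch below tallies entries by gender and classification. The element name `name`, the attributes `gender` and `category`, and the placeholder entries are all assumptions made for this example; they are not the project's actual schema or data.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical schema: element names, attribute names, and entry values are
# placeholders for illustration, not the project's actual markup or data.
SAMPLE_INDEX = """\
<nameIndex>
  <name gender="m" category="attested">PLACEHOLDER-1</name>
  <name gender="f" category="compositional">PLACEHOLDER-2</name>
  <name gender="m" category="other">PLACEHOLDER-3</name>
  <name gender="m" category="attested">PLACEHOLDER-4</name>
</nameIndex>
"""


def tally_names(xml_text):
    """Count name entries by (gender, category) pair."""
    root = ET.fromstring(xml_text)
    return Counter(
        (entry.get("gender"), entry.get("category"))
        for entry in root.iter("name")
    )


counts = tally_names(SAMPLE_INDEX)
for (gender, category), n in sorted(counts.items()):
    print(gender, category, n)
```

A tally like this is only a starting point; the project itself envisaged XSLT for generating results, and the same grouping could be expressed there instead.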

The initial use case for the full word index was historical phonological research, whereas the name index was intended as the basis for reconstructing a "social network" for medieval Novgorod, looking for patterns in communication between different groups of individuals (different genders, different kinds of names, etc.). Further planned extensions to the project included expanding the etymological information to cover all segments in the corpus and marking up inflectional morphemes to enable a quantitative analysis of historical morphology in the Novgorod dialect.

All of the materials (the XML indices, as well as the XSLT to be written to generate the results) were to be released under a Creative Commons Attribution-Share Alike license, which would enable scholars to repurpose the materials for other research as they see fit.