The Art and Science of Coding AHPN Documents
The coding, from my perspective, is the heart of the project. I say this, because the coding team has the responsibility of selecting documents according to the random sample, recording the documents’ contents, and applying the criteria to convert that content into an entry in a quantitative database. Not to mention the fact that this team has the privilege of being in direct contact with the documents.
At present, because of advanced organizational processes, not everyone has a chance to hold an original document in their hands. The quantitative study had many advantages in this regard; since we started work in parallel with the archival organization, we had the chance to explore the 95 percent* of documents that had not passed through any process.
It was not easy to be dedicated to coding for many years, through many steps and stages. People came and went and the coding must always be the same, with excellent and replicable results, no matter who the coder was.
Another characteristic of this team was its composition. The team’s size was always an average of 12 people, aged between 18 and 30 years; they had critical knowledge and experiences (with different academic backgrounds); and finally, the team always had a balance of men and women.
All these elements allow me to conclude that the encoding process (sample extraction and encoding of documents) is the backbone of the quantitative project, since our application and understanding of the manuals and methodology (taking care not to make interpretations) ensured that during the analysis we had the elements necessary to generate and calculate estimates. Sure, to get to that point there are several important intermediate steps, which will be discussed later, such as weight calculation, generating inferences, and the analysis itself.
While a person is working as a coder and facing the document it is difficult to imagine the final product, and it is difficult to see how the document will be useful for further analysis—especially because it is only one document among millions. Recall that since July 2006 (when the process started) and now (November 2013), seven years have passed, as have several stages in the investigation.
Also you need to keep in mind the broad and important context of study—the 36 years of conflict and the multiple offices that produced documents during the conflict. The structure of the National Police, the kinds of information they recorded, and the ways in which they recorded that information, changed over time. So no single document can represent an entire document type, source, or time period. It was important that the random sample selected a wide variety of documents.
In a qualitative study, we may consider a single a key document, but sampling involves a collection of documents. This is where we link the two efforts, quantitative and qualitative, as there may be doubts about the reliability of a single document. But if we compare and contrast it with statistics that indicate others of that type or with those characteristics, or that time period, it gives a different weight to the individual document.
Alongside the coding effort, during the research, we applied a procedure that should be mentioned: inter-rater reliability. Inter-rater reliability, or IRR, is a mathematical measure that indicates how consistent we coders were when applying the manuals, vocabularies, and criteria when completing the coding form. In other words, reliability means that coders analyzing the same document provide consistent answers with the help of, for example, templates and alphanumeric codes.
To re-cap, the keys to successful coding at AHPN are:
- The design of the coding instruments, including an agreed-upon format, manuals, and a clear strategy to recover data from each document
- Clear research questions that helped us to always keep in mind what we’re looking for and why we’re meticulously recording document information. We also have to take the time to write, since our research can only achieve our objectives if it is shared with a wider audience
- Application of the sampling design and coding methods by an evolving team who consistently maintained high standards of accuracy and precision
For my next post, I’ll write about how you can’t ask too much of a document—but this doesn’t mean the documents don’t have the elements necessary to find other things.
* The 95 percent estimate is based on a calculation from the pilot study of how many documents in the sample had been filed by the Archive process.
Photos: Christine Grillo 2013.
THE AHPN SERIES by Carolina López
Quantitative Research at the AHPN Guatemala
The Story of One Document Inside the AHPN
The Art and Science of Coding AHPN Documents
Learning Day by Day: Quantitative Research at the AHPN
IRR: Agreement among Coders is Key
The AHPN: Home of Stories Old and New
The Great Lessons in Research at the Archive
Ten Years and Counting in Guatemala
The Case of Ana Lucrecia Orellana Stormont