Data Archaeology for Human Rights in Central America: HRDAG Collaborates with UWCHR

Phil Neff frolicking in Muir Woods

Patrick Ball is kicking himself for a decision he made almost 25 years ago. “I was clever, but I wasn’t smart,” he says ruefully, as he considers the labyrinth of tables and ASCII-encoded keystrings he used to design a database of human rights violations for the pioneering Salvadoran non-governmental Human Rights Commission (CDHES). Now I’m sitting in his office in San Francisco’s Mission District watching over his shoulder, and trying to keep up, as he bangs out code to decipher the priceless data contained in these old files. Created in 1991 and 1992, during the last days of El Salvador’s internal armed conflict, the files detail thousands of human rights violations, as well as information on their perpetrators. This is data archaeology in action.

I first learned of Patrick’s work with HRDAG in 2013, during the trial of José Efraín Ríos Montt and Mauricio Rodríguez Sánchez for the Guatemalan genocide. I had recently returned home to Seattle after two years working with the Network in Solidarity with the People of Guatemala, coordinating teams of international observers accompanying human rights defenders, among them the witnesses and lawyers at the center of the genocide trial. I was unemployed and feeling the effects of reverse culture shock, and it was a strange comfort to be able to follow, via live-stream from my bedroom in Seattle, the proceedings of the trial back in Guatemala.

For weeks, more than a hundred Maya Ixil survivors brought forward their searing testimony, and a roster of international experts complemented their eyewitness accounts. One of those experts was HRDAG’s Dr. Patrick Ball, who presented a statistical report supporting the charge of genocide, including the finding that, in little more than a year of Ríos Montt’s time as head of state (1982-1983), the Guatemalan military killed 5.5% of the indigenous population in the Maya Ixil municipalities of Nebaj, Chajul, and Cotzal, compared to 0.7% of the non-indigenous population. On May 10, 2013, in an historic ruling, the former dictator was convicted of genocide, a verdict tragically reversed by Guatemala’s Constitutional Court days later. Today, nearly three years later, a new trial has yet to open.

Eventually my life in Seattle got back on track, and I now have the privilege to work at the University of Washington Center for Human Rights as coordinator of Unfinished Sentences, the Center’s project supporting human rights organizations and survivors’ groups in El Salvador. In fact, it was UWCHR Director Angelina Snodgrass Godoy’s study-abroad course in Guatemala which first inspired my commitment to human rights movements in Central America.

One of our first major projects was the analysis and publication of the Yellow Book, a secret document created by Salvadoran military intelligence during the 1980s, containing the names and photographs of nearly 2,000 Salvadorans considered “delinquent terrorists,” including not only leaders of the FMLN guerrillas, but also human rights advocates, labor activists, and civilian political figures. We knew that many of the people profiled in the Yellow Book had been detained, tortured, killed or forcibly disappeared during El Salvador’s internal armed conflict. But how many? Some of them we knew personally, like Hector Bernabé Recinos, a member of the Committee of Ex-Political Prisoners (Ex-COPPES), our partners in researching the Yellow Book. “To live to see this book, it makes you feel happy to be alive, that they weren’t able to kill you,” Hector told us. “Because the decision to eliminate you had been close.”

What about the many hundreds of people whom we could not interview about their experiences, who had not survived, or whom we knew only as names and blurry photos in a photocopy of a three-decade-old document? I tried matching the names in the Yellow Book with lists of victims published as an annex to the report of the Salvadoran Truth Commission. I was doing everything wrong: fiddling with spreadsheets, overwriting my data, confusing myself with logical conundrums. What if I found a name in the Yellow Book that matched a killing reported to the Truth Commission, but then later found another Truth Commission report of the arrest of someone with the same name at a later date? How could I know which matches were correct, and which just seemed correct?

I called up Patrick Ball. Actually, I emailed him, and I found that despite his workload at HRDAG, he was eager to help. In fact, Patrick’s roots at the intersection of human rights and data science are in El Salvador, where he worked as a human rights accompanier with Peace Brigades International, and later designed databases used by Salvadoran human rights organizations. Patrick and colleagues at HRDAG compared the Yellow Book with four such historical databases of human rights violations in El Salvador.

In total, approximately 45% of names in the Yellow Book matched reports of detentions, torture, killings, or disappearances. HRDAG’s process was scientifically rigorous and statistically sound, lending authority and legitimacy to the UWCHR’s report on the Yellow Book. I believe that HRDAG’s quantitative findings, which made headlines in El Salvador, drove much of the considerable wave of interest received by the publication. More than a year later, it is still the most popular post on our website, and we continually receive emails from Salvadorans who have identified themselves or a loved one in the Yellow Book, and who want to share details of their experiences.

Which brings me, finally, to Patrick’s office at HRDAG HQ in San Francisco. For the first two weeks of January, I’ve joined him for an intensive training with the goal of revitalizing a trove of databases regarding the Salvadoran conflict. These include databases created by the CDHES, the U.N. Truth Commission, and the U.S.-Salvadoran NGO El Rescate, among others. At the University of Washington Center for Human Rights, we hope to use these to extract information about the military hierarchy at the time of specific human rights abuses which our partners in El Salvador are investigating. But first we have to figure out how to open them.

It turns out that opening these old files and saving them in a modern format isn’t that hard, thanks to LibreOffice, which lets us switch seamlessly from an outdated MS-DOS character encoding to the contemporary UTF-8 standard. This is important, because Patrick’s “clever” method back in the early 1990s was to use funky four-character codes like ‘fQF╣’ or ‘p  σ’ to encode important information: a date; or the name, rank, or post of a particular military officer. We will need to replace these codes with something that is easier to work with using modern data analysis frameworks.
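The same re-encoding step can also be scripted rather than done by hand. What follows is a minimal sketch in Python, assuming the legacy files use code page 437 (Python’s built-in `cp437` codec). The function names and the sample bytes are illustrative and are not taken from the actual CDHES files.

```python
from pathlib import Path


def decode_dos_field(raw: bytes) -> str:
    """Decode bytes from a DOS-era file, assuming code page 437."""
    return raw.decode("cp437")


def convert_file(src: Path, dst: Path) -> None:
    """Re-save a cp437 text file as UTF-8 (hypothetical helper)."""
    text = src.read_bytes().decode("cp437")
    # Write bytes directly so line endings are passed through unchanged.
    dst.write_bytes(text.encode("utf-8"))


# Illustrative bytes only: the four bytes that spell 'fQF╣' in cp437.
sample = bytes([0x66, 0x51, 0x46, 0xB9])

as_cp437 = decode_dos_field(sample)   # box-drawing character preserved
as_latin1 = sample.decode("latin-1")  # wrong codec: last character differs

print(as_cp437)
print(as_latin1)
print(as_cp437.encode("utf-8"))
```

Decoding with the wrong codec does not raise an error here; it silently produces a different final character, which is why the code page matters so much for keys like these.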

This will take us from the Wikipedia page for the archaic code page 437 to an IPython Notebook, where we (mostly Patrick, to be honest) will reverse engineer a function that reveals ‘fQF╣’ to be the date 1983-01-13. Patrick will also design a function to replace a short, encoding-sensitive string like ‘p  σ’ with a long, unique hash: say, ‘5726e79addff2af59fc4dd5bd356b66f’. While 1990s Patrick had to worry about conserving every single byte, 2016 Patrick is more concerned with mathematical consistency and elegance. Luckily, today’s computers can handle it!
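The key-replacement idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Patrick’s actual function: it assumes the hash is an MD5 digest (the 32-character hexadecimal example above has the length of one, but the algorithm is not named) computed over the code’s original cp437 bytes, and `stable_key` is a hypothetical name. The date-decoding function is not sketched, since the scheme behind it is not described here.

```python
import hashlib


def stable_key(code: str) -> str:
    """Return a 32-character hex digest for a short legacy code.

    Assumption: hash the cp437 byte form, so the result depends only on
    the bytes stored in the old file, not on how the text is displayed.
    """
    return hashlib.md5(code.encode("cp437")).hexdigest()


# Build a lookup table from legacy codes to long, encoding-safe keys.
# The codes are illustrative; exact whitespace in the real keys is unknown.
legacy_codes = ["fQF\u2563", "p  \u03c3"]
key_for = {code: stable_key(code) for code in legacy_codes}

for code, key in key_for.items():
    print(repr(code), "->", key)
```

Because the digest contains only the characters 0-9 and a-f, it survives any later change of encoding, and the same input code always maps to the same key.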

Meanwhile, I’m studying the basics of the Unix command line and software development, and pondering Patrick’s axioms on subjects like project architecture, for example: “Never mix data and logic!” I’ll learn to use Git to collaborate remotely with Patrick and others, and the Python pandas library for data analysis. All this just to start the process of working with one dusty (digitally speaking, at least) historical database. Back in Seattle, I will have to think creatively about what kinds of queries these databases can answer, how to put them in conversation with each other, and how to effectively and responsibly share the results.

Each record in these databases represents a life cut short, or altered forever; a family still waiting for a sign of their loved one; a perpetrator who may still walk the halls of power. Exhuming data is worlds away from exhuming mass graves. But as countries like Guatemala and El Salvador continue to confront the violence of the recent past, both processes have the potential to contribute to the same struggles for justice.

Phil Neff is project coordinator for Unfinished Sentences at the University of Washington Center for Human Rights, and a board member of the Network in Solidarity with the People of Guatemala. Follow him on Twitter: @cascadiasolid

