Principled Data Processing: Goals and Practices for Auditable, Replicable, Scalable, and Transparent Data Work (CSSS Workshop, 11/1/2018)
Posted: 10/26/2018 (Local Events)
Principled Data Processing: goals and practices for auditable, replicable, scalable, and transparent data work
Patrick Ball, Human Rights Data Analysis Group
If we have the data and the code, it should be easy to re-calculate results from work we did in the past. In most projects, this turns out to be difficult or impossible. In this workshop, we will discuss principles for data processing: transparency, auditability, replicability, and scalability. I’ll propose a series of practices that help bring our work closer to these principles, including:
- A task is a quantum of workflow
- Standardizing small tasks
- Using basic unix tools to standardize and link tasks
- Executable documentation: if it runs, it’s true
- Separating data and logic
- Testing: unit-level, file-level, project-level
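As a rough illustration of a few of these practices (a small task, data kept apart from logic, and unit-level and file-level tests), here is a minimal Python sketch. It is not taken from the workshop materials: the `input/` and `output/` layout, the file name `records.csv`, the `id` and `name` columns, and every function name are assumptions made only for this example.

```python
import csv
import tempfile
from pathlib import Path


def normalize_name(raw):
    """Logic only: collapse runs of whitespace and uppercase a name."""
    return " ".join(raw.split()).upper()


def read_rows(path):
    """Read a CSV file into a list of dicts keyed by column name."""
    with Path(path).open(newline="", encoding="utf-8") as src:
        return list(csv.DictReader(src))


def run_task(input_path, output_path):
    """One small task: read one input file, write one output file."""
    rows = read_rows(input_path)
    for row in rows:
        row["name"] = normalize_name(row["name"])
    Path(output_path).parent.mkdir(parents=True, exist_ok=True)
    with Path(output_path).open("w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=["id", "name"])
        writer.writeheader()
        writer.writerows(rows)


def check_output(input_path, output_path):
    """File-level test: the task must neither drop nor add records."""
    n_in = len(read_rows(input_path))
    n_out = len(read_rows(output_path))
    if n_in != n_out:
        raise ValueError(f"record count changed: {n_in} in, {n_out} out")


def demo():
    """Run the task in a throwaway directory and return the cleaned names."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical task layout: data lives in input/ and output/,
        # while the logic lives in the functions above.
        input_path = Path(tmp) / "input" / "records.csv"
        output_path = Path(tmp) / "output" / "records.csv"
        input_path.parent.mkdir(parents=True)
        input_path.write_text(
            "id,name\n1,  ana   maria \n2,jose  luis\n", encoding="utf-8"
        )
        run_task(input_path, output_path)
        check_output(input_path, output_path)
        return [row["name"] for row in read_rows(output_path)]


if __name__ == "__main__":
    # Unit-level test of the logic, independent of any file.
    assert normalize_name("  ana   maria ") == "ANA MARIA"
    print(demo())
```

Because the task only reads from one path and writes to another, it can be rerun from scratch at any time, and the file-level check fails loudly if a later change starts dropping records.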
Patrick Ball has spent more than twenty-five years conducting quantitative analysis for truth commissions, non-governmental organizations, international criminal tribunals, and United Nations missions in El Salvador, Ethiopia, Guatemala, Haiti, South Africa, Chad, Sri Lanka, East Timor, Sierra Leone, Kosovo, Liberia, Perú, Colombia, the Democratic Republic of Congo, and Syria. Patrick has provided expert testimony in several trials, including those of Slobodan Milošević, the former President of Serbia; José Efraín Ríos Montt, the former de facto president of Guatemala; and Hissène Habré, the former President of Chad.
In 1991, Patrick founded the Human Rights Data Analysis Group (HRDAG), where he currently serves as Director of Research.
In 2018, Patrick received the Karl E. Peace Award for Outstanding Statistical Contributions for the Betterment of Society; in 2015, the Claremont Graduate University awarded Patrick a Doctor of Science (honoris causa); in 2014, he was elected a Fellow of the American Statistical Association; in 2005, the Electronic Frontier Foundation gave him their Pioneer Award; and in 2003, the ACM gave him the Eugene Lawler Award for Humanitarian Contributions within Computer Science and Informatics.
Patrick is on the Advisory Council of Security Force Monitor, a project of the Columbia Law School Human Rights Institute; a Fellow at the Human Rights Center at Berkeley Law of the University of California-Berkeley; and a Research Fellow at Carnegie Mellon’s Center for Human Rights Science. Patrick received his bachelor of arts degree from Columbia University, and his doctorate from the University of Michigan.
Time: 12:30-2:30 PM
Location: SAV, Room 117