Mauricio Sadinle, Assistant Professor, Department of Biostatistics
In the context of a civil war, it is common for multiple organizations to maintain registries of reported human rights violations. These registries can, in principle, be used to address the common question of how many violations occurred due to the war. While not all violations are reported, the number of unique violations reported across organizations provides a lower bound on the total. Obtaining this number is often a nontrivial task, since violation reports typically lack unique identifiers and suffer from name misspellings, typing errors, missing values, and other data quality issues. Probabilistic record linkage is used to identify coreferent reports, that is, reports that refer to the same violation, and to quantify the uncertainty in the linkage process. Under additional assumptions, capture-recapture (also known as multiple-systems estimation) can be used to estimate the total number of violations from the results of the linkage step. In this talk, we review record linkage and capture-recapture models, and present a Bayesian approach that incorporates the linkage uncertainty into the estimation of the total number of violations. We present a case study in which our goal is to estimate the number of civilian casualties from the civil war in El Salvador.
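
As a rough illustration of the capture-recapture idea mentioned above, the sketch below applies the classical two-list Lincoln-Petersen estimator. It is not the Bayesian approach presented in the talk: it assumes the linkage is known without error, that the two registries record violations independently, and that every violation is equally likely to be reported. The function names and the counts are hypothetical and are not drawn from the El Salvador data.

```python
def unique_reports(n1: int, n2: int, m: int) -> int:
    """Lower bound on the total: distinct violations seen in either registry.

    n1, n2: number of reports in registry 1 and registry 2
    m: number of violations linked as appearing in both (assumed error-free)
    """
    return n1 + n2 - m


def lincoln_petersen(n1: int, n2: int, m: int) -> float:
    """Classical two-list estimate of the total number of violations.

    Under independence, the share of registry 2 also found in registry 1
    (m / n2) estimates registry 1's coverage (n1 / N), so N is about n1 * n2 / m.
    """
    if m <= 0:
        raise ValueError("need at least one linked pair to estimate the total")
    return n1 * n2 / m


# Hypothetical toy counts, for illustration only.
n1, n2, m = 200, 150, 60
print(unique_reports(n1, n2, m))    # lower bound from the linked registries
print(lincoln_petersen(n1, n2, m))  # estimated total, including unreported
```

With these toy counts the lower bound is 290 and the estimated total is 500. Because both figures depend directly on the number of linked pairs `m`, errors in the linkage step carry over into the estimate, which is why propagating linkage uncertainty matters.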