Data Debiasing with Datamodels (D3M): Improving Subgroup Robustness via Data Selection
Debiasing data using TRAK
- problem: data bias can cause worse performance on specific subgroups. Standard debiasing methods (e.g. dataset balancing) throw away a large share of the data, which isn’t ideal.
- method: uses a predictive data attribution method (TRAK) to find which training points most hurt the worst group’s performance on a small validation set, then removes those points and retrains
- this paper seems mostly just like an application of TRAK
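
Rough sketch of the selection step from the method bullet above, assuming the attribution scores are already computed (e.g. by TRAK) and stored as a plain array. The function name `select_points_to_remove`, the array layout, and the mean-over-worst-group aggregation are my own simplifications for illustration, not the paper's exact scoring rule and not the TRAK library API.

```python
import numpy as np

def select_points_to_remove(attributions, val_groups, val_correct, k):
    """Pick the k training points that most hurt the worst-performing group.

    Hypothetical helper; the conventions below are assumptions:
      attributions: (n_val, n_train) array, where attributions[i, j] estimates
          how much training point j helps (positive) or hurts (negative)
          the model on validation example i.
      val_groups:   (n_val,) group label of each validation example.
      val_correct:  (n_val,) bool, whether the model got each example right.
    """
    attributions = np.asarray(attributions, dtype=float)
    val_groups = np.asarray(val_groups)
    val_correct = np.asarray(val_correct, dtype=bool)

    # Find the group with the lowest validation accuracy.
    groups = np.unique(val_groups)
    accuracy = np.array([val_correct[val_groups == g].mean() for g in groups])
    worst_group = groups[np.argmin(accuracy)]

    # Average each training point's estimated effect over the worst group.
    worst_effect = attributions[val_groups == worst_group].mean(axis=0)

    # Most negative average effect = most harmful; take the k lowest.
    remove_idx = np.argsort(worst_effect)[:k]
    return worst_group, remove_idx


# Toy usage with made-up numbers: 4 validation examples, 5 training points.
toy_attr = np.array([
    [0.2,  0.1, 0.0,  0.1, 0.0],
    [0.1,  0.2, 0.1,  0.0, 0.1],
    [0.1, -0.9, 0.2, -0.1, 0.3],
    [0.0, -0.7, 0.1, -0.5, 0.2],
])
toy_groups = np.array([0, 0, 1, 1])
toy_correct = np.array([True, True, False, True])

worst, to_remove = select_points_to_remove(toy_attr, toy_groups, toy_correct, k=2)
# Indices of the training points that would be kept for retraining.
to_keep = np.setdiff1d(np.arange(toy_attr.shape[1]), to_remove)
```

After this step the model would be retrained on `to_keep` only. How many points to drop (`k`) is left as a free parameter here.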