Estimating the impact of unknown unknowns on aggregate query results