Yelp Scraper Harvesting Guide

Hi guys,
 
Yesterday I ran a campaign to harvest a specific niche and it scraped 150K records. After cleaning them up I was left with only 70 records, because the harvest was full of duplicates. My question is: can I somehow prevent duplicates from being harvested in the first place? For example, could I set a condition based on email, name, or website, so that I only get one record per name, website, or address? Otherwise it is a total waste of time, and I wouldn’t want the same to happen again, as I am after one unique email per company / organization.
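
To make the condition concrete, this is roughly the rule I mean, written as a small Python sketch that I could run on an exported list after the harvest. It is not a feature of the scraper itself, and the field names ("name", "email", "website") and sample records are only assumptions about what an export might look like.

```python
from urllib.parse import urlparse


def normalize_email(value):
    # Treat "Info@Acme.example " and "info@acme.example" as the same email.
    return value.strip().lower()


def normalize_website(value):
    # Reduce a URL to its bare host so that http/https, "www." and
    # different page paths on the same site all count as one website.
    value = value.strip().lower()
    if "://" not in value:
        value = "http://" + value
    host = urlparse(value).netloc
    return host[4:] if host.startswith("www.") else host


def dedupe(records, field, normalize):
    # Keep the first record seen for each normalized value of `field`.
    # Records with an empty value are dropped (an assumption: a record
    # without an email or website is of no use for this purpose).
    seen = set()
    unique = []
    for record in records:
        key = normalize(record.get(field) or "")
        if not key or key in seen:
            continue
        seen.add(key)
        unique.append(record)
    return unique


# Hypothetical sample data, only to show the behavior.
records = [
    {"name": "Acme Plumbing", "email": "info@acme.example",
     "website": "http://www.acme.example/contact"},
    {"name": "ACME Plumbing", "email": "Info@Acme.example ",
     "website": "https://acme.example"},
    {"name": "Bolt Electric", "email": "hello@bolt.example",
     "website": "bolt.example"},
    {"name": "No Contact", "email": "", "website": ""},
]

by_email = dedupe(records, "email", normalize_email)
by_website = dedupe(records, "website", normalize_website)
```

With the sample above, both `by_email` and `by_website` keep one record for Acme and one for Bolt. What I am really hoping for is a way to apply this kind of rule during harvesting rather than afterwards.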

Also, for harvesting, please share a footprint that will ensure the keywords I harvest don’t return duplicate URLs or duplicate emails.

Thanks,