Best Practices for Using Identified URLs from GSA PI in SER

Hi,

I scrape my own targets with ScrapeBox 24/7.
After scraping 5-10 GB of data, I trim to root, de-duplicate, and send the resulting list to GSA PI.
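
For reference, the trim-to-root and de-duplicate step is roughly what the Python sketch below does. ScrapeBox handles this itself, so the function names (`trim_to_root`, `trim_and_dedupe`) are hypothetical, and the normalisation rules (lowercasing the host, defaulting to http when the scheme is missing, treating http and https as separate roots) are my assumptions, not a description of how ScrapeBox works internally.

```python
from urllib.parse import urlsplit


def trim_to_root(url):
    """Reduce a URL to scheme://host/ (hypothetical helper).

    Returns None for blank lines or anything without a host.
    """
    url = url.strip()
    if not url:
        return None
    # Assumption: lines without a scheme are treated as http.
    if "://" not in url:
        url = "http://" + url
    try:
        parts = urlsplit(url)
    except ValueError:
        # urlsplit rejects some malformed hosts (e.g. broken IPv6 brackets).
        return None
    if not parts.netloc:
        return None
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}/"


def trim_and_dedupe(urls):
    """Trim every URL to its root and keep the first occurrence of each root."""
    seen = set()
    roots = []
    for url in urls:
        root = trim_to_root(url)
        if root is not None and root not in seen:
            seen.add(root)
            roots.append(root)
    return roots
```

For a 5-10 GB scrape the list would be read from a file line by line instead of being held in memory as a Python list, but the logic stays the same.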

GSA PI feeds the identified list to my GSA SER instance, which verifies the targets and uses them for link building.

Now, my questions are:

*Question 1*
Should I keep adding new targets to the existing identified list, or should I occasionally wipe it and start over with a fresh GSA PI identified list?

I ask because, over time, GSA SER has already tried all the targets, and nothing is left in the identified list to verify.

If I should wipe it, how often? Or, how large do you let your identified list grow before you wipe it?

I am running GSA SER on a dedicated server with 2000 threads and getting 100+ LPM, which should give you an idea of how soon I would need to wipe the list.

*Question 2*
I am using the SB Link Extractor to scrape the initial targets, and I believe I am getting too few unique targets.

How can I increase the number of unique targets?

Thanks in advance.