Questioning a DB Consultant and their recommendations

We have hired a systems design consultant to help use architect a new CRM system, bringing together several business needs into one system, instead of the several we currently have.

It’s the typical project that a company spends a fortune on, and then discovers it doesn’t do what it thought it would. I’m trying to make sure that doesn’t happen to us.

Could I have some community feedback on some parts of the solution our consultant has recommended? Personally I’m unsure, but I’d defer to their expertise, and to this community’s collective knowledge.

Our User Tables

Some of our users have verified accounts with us. Some will have bought tickets through a third party provider. Others will simply reach out to our customer service, and may not be customers yet.

One way of doing this would be to use three tables: VerifiedUsers (data from our verified user accounts), TicketPurchasers (data pulled from our ticket provider – also verified), and Individuals (essentially email addresses used to contact our customer service).

Obviously it would be great to make connections between these tables, so we can pull together as much information about our customers/users as possible.

This all seems fine to me so far. Where I get a little unsure is that the consultant has suggested the Individuals table should be the master table.

This table is populated with the most “dirty” data: The least verified data sent from people with the lowest at stake (potentially not even customers, sometimes just so they can rant at our customer service team).

Also, if someone has several email addresses they use to reach out to customer service, how are we supposed to make a reliable trustworthy connection between the TicketPurchasers and VerifiedUser tables? It just seems likely that one person could easily have several rows of data in the Individuals table, and we would never really know.

Question: Does this make sense, even just from a database integrity point of view?

Building our own analytics

The consultant has also recommended we start creating our own analytics. This ostensibly makes sense as well: It would be great to be able to collate every touch point our customers/users make.

The idea would be that a table would be filled with a user_id (presumably taken from the Individuals table), an action_id (to reference the action they took — eg. “Logged into their account”, “Contacted customer service”) and a timestamp.

This data could be potentially interesting to look at (although I’d like to sit down and ensure it’s useful, too), but after a few quick calculations it seems would could fill 1,000,000 rows in a year.

If we continue growing as a company (as we wish to), this table of analytics could easily hit 10,000,000 rows in a few years, and just keep growing.

This scares me.

For a start, I don’t like the idea of a table just growing and growing. It makes me uneasy. Secondly, running live operations on a table of millions of rows (which is what would be required), could be prohibitively resource intensive. (Yes, I guess it depends how much my organisation is prepared to spend on this.)

Question: Are my concerns valid, or is it my inexperience?

Thanks for any help!