I am developing an eLearning application using MySql, which has the following schema. My issue with handling millions of rows in tables like attempted_questions and question_choices.
Let say I have 1000 users and 100000 questions. then the number of rows in question_choices will become 400000(no of choices 4 x no of questions 100000).
If each user attempts question 2 times, the number of rows in attempted_questions = 2 x 1000 users x 400000 questions, which will be 800000000 rows. In attempted_questions table, I need to track all attempts of the users.
Some of the most regularly used queries are:
SELECT COUNT(DISTINCT question_id) FROM attempted_questions WHERE user_id = 1 SELECT COUNT(DISTINCT question_id) FROM attempted_questions WHERE user_id = 1 AND subject_id = 1
SELECT s.id, s.name, qc.qn_count, aq.attended_cnt FROM subjects s LEFT JOIN (SELECT COUNT(q.id) as qn_count, q.subject_id FROM questions q GROUP BY q.subject_id) qc ON qc.subject_id = s.id LEFT JOIN (SELECT COUNT(DISTINCT question_id) as attended_cnt, subject_id FROM attempted_questions WHERE user_id = 1 GROUP BY subject_id) aq ON aq.subject_id = s.id
How can I optimize the DB for this much data?
what are the issues with my DB design that may arise when the application grows?