I have a table that dictates who was on a project, from a startdate to an enddate. I’d like some help writing a query that will return the number of "active" users at the end of every month, for the past 12 months.
CREATE TABLE `roster` ( `id` int(11) NOT NULL, `user_id` int(11) DEFAULT NULL, `project_id` int(11) DEFAULT NULL, `start_date` datetime NOT NULL, `end_date` datetime NOT NULL, `closed_date` datetime DEFAULT NULL, PRIMARY KEY (`id`) );
id user_id project_id start_date end_date closed_date 1 1 1 2019-05-27 00:00:00 2021-02-01 00:00:00 NULL 2 2 2 2020-05-27 00:00:00 2021-02-01 00:00:00 2020-02-05 00:00:00 3 3 3 2020-05-27 00:00:00 2024-02-01 00:00:00 2020-02-05 00:00:00
And the result would be something that shows how many distinct users had an active project within each month (only since the start of this year).
So for the dataset above, we can see that all 3 projects were "active" in the month January 2021, because the end date is in the future.
2020-01-31 | 3
The last bit of complexity is that sometimes a project can be closed before the EndDate and I’d like to exclude any users who’s endDate is in the future but the project has actually closed.
For example in the dataset above, the third project has a end date of Feb 2024, however the project closed in Feb 2021. So technically the person was active in 2021 Jan and Feb, but not March.
p.s Hoping to get an answer in Postgres (I’m using Redshift)