User’s Online Time and Period with no Users Online

Hello good people of stackoverflow!

I need help with writing a query, to identify online time grouped by users, and also to identify which are the periods where there were 0 users online and for how long.

Note that I have only read access for this database.

The challenge I am facing right now, is that there can be simultaneous sessions recorded for a user.
For example within a day’s observation period(there is also a primary key of log_id that follows the chronological order of login_ts for every new session started):

User login_ts logout_ts
Kadaj 2023-05-01 08:00:00 2023-05-01 13:00:00
Kadaj 2023-05-01 10:00:00 2023-05-01 12:00:00
Kadaj 2023-05-01 11:00:00 2023-05-01 15:00:00
Kadaj 2023-05-01 16:00:00 2023-05-01 18:00:00

Challenge 1 – Total Online Time in Day

Observation period: ‘2023-05-01 00:00:00’ to ‘2023-05-01 23:59:59’ or ‘2023-06-01 00:00:00’

If only 1 session is allowed per user I could have gotten the sum of the duration of each session:

User login_ts logout_ts duration
Kadaj 2023-05-01 08:00:00 2023-05-01 13:00:00 5
Kadaj 2023-05-01 10:00:00 2023-05-01 12:00:00 2
Kadaj 2023-05-01 11:00:00 2023-05-01 15:00:00 3
Kadaj 2023-05-01 16:00:00 2023-05-01 18:00:00 2

Which will make up of 12 hours, but because the sessions overlap each other the actual online time is: 8am to 3pm + 4pm to 6pm = 9 hours

Expected Results

User duration
Kadaj 9
Yazoo 10
Loz 3

Challenge 2 – No Users Online Time

Assuming I want to find out for 2023-05-01(Observation period same as above), what were the periods where there were no users online:

Expected results

no_user_start_time no_user_end_time duration
2023-05-01 00:00:00 2023-05-01 08:00:00 8
2023-05-01 15:00:00 2023-05-01 16:00:00 1
2023-05-01 18:00:00 2023-05-01 23:59:59 (or 2023-06-01 00:00:00) 6

The examples I have given is to simplify the obstacle I am trying to cross, I have used a series of sorting with LAG, LEAD, GENERATE_SERIES, PARTITION functions but I keep hitting roadblocks and cannot come up with the logic to achieve my results(depending if a second/third session wraps around the original or if it overlaps at the head or tail of the session, lag and lead can produce very different results).

The table is much much larger than my example with more users and also have concurrent login sessions with overlapping timestamps.

I have something that works for Challenge 1, just looking for a second opinion to improve my query.

I need help with Challenge 2 where I am not sure how I can approach the observation period where there are no active login sessions.

All second opinions are greatly appreciated, please enlighten me on your thought process please 😀

Leave a Reply

Your email address will not be published.