Tag storage when using HBM as an L4 cache

I’m reading through Computer Architecture: A Quantitative Approach, and I’m trying to understand the issues surrounding tag storage when using HBM as an L4 cache. The primary concern mentioned in the book is “where do the tags reside?”, since the sheer capacity of the HBM (1 GiB) combined with a small block size (64 B) results in a large amount of tag storage (96 MiB).
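To check that I'm reading those numbers correctly, here is a quick sketch of the arithmetic. The function name is my own, and the per-block figure is only what the book's totals imply; I'm not sure how those bytes split between tag bits and state bits.

```python
GIB = 1 << 30
MIB = 1 << 20


def implied_tag_bytes_per_block(cache_bytes, block_bytes, total_tag_bytes):
    """Return (number of blocks, tag bytes per block) implied by the totals."""
    num_blocks = cache_bytes // block_bytes
    return num_blocks, total_tag_bytes // num_blocks


# Figures quoted from the book: 1 GiB cache, 64 B blocks, 96 MiB of tags.
num_blocks, per_block = implied_tag_bytes_per_block(1 * GIB, 64, 96 * MIB)

print(num_blocks)  # 16777216 blocks (2**24)
print(per_block)   # 6 bytes, i.e. 48 bits of tag and state per block
```

So every 64 B block carries about 6 bytes of metadata, and 2^24 blocks of that add up to the 96 MiB.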

The book first suggests that the tags might live outside of HBM, but notes that there is not enough SRAM capacity to store them. It then suggests keeping the tags in the HBM itself as a solution. However, in L1-L3 SRAM caches, don’t the tags already reside in each of the caches themselves? Why would HBM be any different? Is it because we’re somehow unable to perform hit detection/tag comparison on the HBM DRAM chips themselves (as is done in SRAM caches), so the tags need to be processed outside of HBM?