So, in my 2D game I push render commands to a list, sort them by z, and then execute them from farthest to closest. I wonder if I should worry about overdraw. In my case I guess I'll average 3 to 4 overdraws per pixel: I always clear the background to sky blue, which already fills every pixel, then I draw clouds, then terrain, then entities, then UI.

So maybe I could write a custom system that clips some of those elements, at least the background fill, based on what the closer layers cover. Would that be reasonable, or complete overkill? I won't have complex shaders, so I guess that frees up some GPU compute.
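To make the setup concrete, here is a rough sketch of the pipeline described here, plus a crude CPU-side way to estimate the average overdraw. Everything in it is an assumption for illustration: the names `RenderCommand`, `sort_back_to_front`, and `average_overdraw` are made up, each command is treated as an axis-aligned rectangle, larger z is taken to mean farther away, and the clear counts as one write per pixel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenderCommand:
    """Hypothetical render command: a screen-space rectangle at a depth."""
    name: str
    z: float  # assumption: larger z means farther from the camera
    x: int
    y: int
    w: int
    h: int


def sort_back_to_front(commands):
    """Order commands farthest first, so closer ones are drawn on top."""
    return sorted(commands, key=lambda c: c.z, reverse=True)


def average_overdraw(commands, width, height):
    """Average writes per pixel, counting the background clear as one write."""
    counts = [[1] * width for _ in range(height)]  # the clear touches every pixel
    for c in commands:
        # Clamp the rectangle to the screen before counting its pixels.
        for py in range(max(0, c.y), min(height, c.y + c.h)):
            row = counts[py]
            for px in range(max(0, c.x), min(width, c.x + c.w)):
                row[px] += 1
    return sum(sum(row) for row in counts) / (width * height)


# Tiny 8x4 "screen" with the layers from the post, pushed in arbitrary order.
commands = [
    RenderCommand("ui", 0, 0, 0, 8, 1),
    RenderCommand("terrain", 20, 0, 2, 8, 2),
    RenderCommand("entity", 10, 2, 1, 2, 2),
    RenderCommand("clouds", 30, 0, 0, 8, 2),
]

ordered = sort_back_to_front(commands)
print([c.name for c in ordered])
print(average_overdraw(ordered, 8, 4))
```

Running the same count on real frame data (with actual sprite bounds instead of these toy rectangles) would show whether the 3 to 4 guess holds before deciding if any clipping system is worth building.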