In 2021 Lyft wrote a series of blogposts starting with https://eng.lyft.com/scaling-productivity-on-microservices-at-lyft-part-1-a2f5d9a77813 that document what feels like a very common story.
They decomposed a #monolith, initially ran "all services on one box", found the issues with that, and then evolved to an #EnvoyProxy mesh setup.
This feels very similar to Uber's SLATE (https://www.uber.com/en-GB/blog/simplifying-developer-testing-through-slate/), but hooking into staging, not production.
@coldclimate Do you already have a lot of microservices and are trying to wrangle them, or are you looking at starting to build a lot of microservices?
@coldclimate You should simultaneously push back on adding more, and consolidate into chunkier services. Actual "micro" services are a really *really* bad idea for almost everyone.
@coldclimate Also, you already need tracing more than you realise.