At Credit Karma, my team maintains a service powered by a multi node Akka cluster. Akka is a framework for building distributed, concurrent, and message-driven systems in Scala. It helps us easily scale our service, and gives us some resilience when problems inevitably happen in production. In this post, I’ll cover two problems - auto down and quarantine - and share our lessons learned.
Three best practices helped us successfully pull off a 24-hour coding event that accelerated the development of our proprietary prediction software.