If you missed Spark Summit, catch this talk from Abdulla about how to build predictive recommendation models and ensure type safety using Apache Spark DataFrames. See how Credit Karma's choice of metric evaluation helps us calibrate models to obtain the best global result, and hear our lessons learned when we scaled our model development environment to handle Terabyte-scale data with thousands of features.
Credit Karma leverages data for over 60 million members to deliver a personalized user experience. To do this, we rely largely on Scala and Akka to do the heavy lifting. Powerful tools, however, demand some mastery on how to use them.
When we were considering how to push 700k events per minute from Kafka into our data warehouse, Vertica, we learned these lessons about how to choose the best framework for high throughput.
At Credit Karma, my team maintains a service powered by a multi node Akka cluster. Akka is a framework for building distributed, concurrent, and message-driven systems in Scala. It helps us easily scale our service, and gives us some resilience when problems inevitably happen in production. In this post, I’ll cover two problems - auto down and quarantine - and share our lessons learned.