Powered By Scalding

Want to be added to this page? Send a tweet to @scalding or open an issue.

Company Scalding Use Case Code
Twitter We use Scalding often, for everything from custom ad targeting algorithms, market insight, click prediction, traffic quality to PageRank on the Twitter graph. We hope you will use it too! -
Spotify We use Scalding for almost everything including music recommendation features like Discover Weekly & Release Radar, key business metrics, analytics and content catalogue. -
Etsy We’re starting to use Scalding alongside the JRuby Cascading stack described here. More to come as we use it further. -
eBay We use Scalding in our Search organization for ad-hoc data analysis jobs as well as more mature data pipelines that feed our production systems. -
Snowplow Analytics Our data validation & enrichment process for event analytics is built on top of Scalding. GitHub
PredictionIO Machine-learning algorithms built on top of Scalding. GitHub
Gatling We’ve just rebuilt our reports generation module on top of Scalding. Handy API on top of an efficient engine. GitHub
SoundCloud We use Scalding in our search and recommendations production pipelines to pre- and post-process data for various machine learning and graph-based learning algorithms. We also use Scalding for ad-hoc and regular jobs run over production logs for things like click tracking and quality evaluation on search results and recommendations. -
Sonar Our platform is built on Hadoop, Scalding, Cassandra and Storm. See Sonar’s job listings. -
BSkyB Sky is using Scalding on Hadoop and utilizing HBase through the SpyGlass library for statistical analysis, content-related jobs and reporting. -
LivePerson LivePerson’s data science group is using Scalding on Hadoop to develop machine learning algorithms and big data analysis. -
Sharethrough Sharethrough uses Scalding throughout our production data infrastructure. We use it for everything from advertiser reporting and ML feature engineering, to ad targeting and click forecasting. -
LinkedIn Scalding is being used at LinkedIn both at the Product Data Science team and the Email Experience team. -
Stripe Stripe uses Scalding for ETL and machine learning to support our analytics and fraud prevention teams. -
Move Move uses Scalding on Hadoop for advanced analytics and personalization for Realtor.com and its mobile real estate apps. -
Tapad Tapad uses Scalding to manage productized analytics and reporting, internal ad-hoc data mining, and to support our data science team’s research and development efforts. -
CrowdStrike CrowdStrike employs Scalding in our data science and data mining pipelines as part of our big data security platforms in research, development, product and customer endpoints. We have plans to open source our Scalding API (AWS, EMR) on GitHub. -
Tumblr Tumblr uses Scalding as a sort of MVC framework for Hadoop. Applications include recommendations/discovery, spam detection, and general ETL. -
Elance Elance uses Scalding for constructing data sets for search ranking, recommendation systems, and other modeling problems. -
Commonwealth Bank Of Australia Commbank uses Scalding as a key component within its big data infrastructure, both on the ETL side and for the implementation of data science pipelines for building various predictive models. GitHub
Sabre Labs Sabre Labs uses Scalding for ETL and ad hoc data analysis of trip information.  
gutefrage.net gutefrage.net uses Scalding for its Data Products and general ETL flows.
MediaMath MediaMath uses Scalding to power its Data Platform, the centralized data store that powers our ad hoc analytics, client log delivery and new optimization/insight-based products.  
The Search Party The Search Party is using Scalding to build production machine learning libraries for clustering, recommendation and text analysis of recruitment-related data. Scalding is a breath of fresh air!
Opower Opower uses Scalding and KijiExpress to analyze the world’s energy data and extract machine learning-based insights that power behavior change.  
Barclays Barclays uses Scalding for Data Warehousing, ETL and data transformation into columnar (query-optimized) data formats.
Devsisters Devsisters uses Scalding for game log analysis (1264)