Using NLP to Evaluate 100 Million Global Webpages Daily to Contextually Target Consumers
The contextual targeting team at The Trade Desk (TTD) needs to analyze 100 million web pages per day. Fifty percent of those pages are non-English, which meant that half of the content was not being properly analyzed or targeted intelligently. TTD attempted to build a model using Spark NLP, but the package could not scale: GPU utilization was low and the solution was cost-prohibitive. TTD engaged Databricks in early 2022 to build an NLP model on Databricks, and our teams partnered closely. Together we built a solution using distributed inference (150-200 GPUs running at 80%+ utilization). Each day, the Databricks solution translates 50 million web pages in more than 35 languages, two hundred times faster and at a fraction of the cost. This enables TTD teams to standardize on English for their contextual targeting ML models, so TTD can now be a one-stop shop for its customers’ global advertising needs.
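To give a feel for the scale, here is a rough back-of-the-envelope sketch in Python of the average translation rate each GPU would need to sustain. The page volume, non-English share, and GPU counts come from the abstract above. The even spread of work over 24 hours and across GPUs is an assumption made purely for illustration, and the function name is hypothetical; real traffic and batching would look different.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def pages_per_gpu_per_second(pages_per_day: int, gpu_count: int) -> float:
    """Average rate one GPU must sustain, assuming an even split of work
    across GPUs and across the whole day (an illustrative assumption)."""
    return pages_per_day / gpu_count / SECONDS_PER_DAY


# Figures taken from the abstract.
total_pages = 100_000_000   # web pages analyzed per day
non_english_share = 0.5     # fifty percent are non-English
pages_to_translate = int(total_pages * non_english_share)

# The abstract quotes a range of 150-200 GPUs; check both ends.
for gpus in (150, 200):
    rate = pages_per_gpu_per_second(pages_to_translate, gpus)
    print(f"{gpus} GPUs: about {rate:.2f} pages per second per GPU")
```

Under these assumptions the average works out to roughly 3.86 pages per second per GPU at 150 GPUs and roughly 2.89 at 200 GPUs. The estimate ignores page length, peak traffic, and the 80%+ utilization figure, so it is a floor on required throughput rather than a sizing guide.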
The Trade Desk is headquartered in Ventura, California. It is the largest independent demand-side platform in the world, competing against Google, Facebook, and others. Unlike traditional marketing, programmatic marketing is driven by real-time, split-second decisions based on user identity, device information, and other data points. It enables highly personalized consumer experiences and improves return on investment for companies and advertisers.
Talk by: Xuefu Wang and Mark Lee
Connect with us: Website: https://databricks.com
Twitter: https://twitter.com/databricks
LinkedIn: https://www.linkedin.com/company/databricks
Instagram: https://www.instagram.com/databricksinc
Facebook: https://www.facebook.com/databricksinc