By 2015, Databricks, the company behind Apache Spark, had a product problem hiding inside a marketing problem. Downloads were booming. Revenue wasn’t. Cloud giants were happily monetizing the open-source technology Databricks had built, while Databricks itself struggled to convert popularity into a business. CEO Ali Ghodsi reportedly considered walking away entirely.
What saved Databricks wasn’t a better algorithm. It was a category creation strategy that gave the market language it didn’t have before, paired with product moves gutsy enough to back that language up.
That combination took the company to a $134 billion valuation. Here’s what I think every growth leader should take from it.
The Trap of Being “Best in Class”
Enterprise data in the late 2010s was split into two camps.
Data lakes stored raw, unstructured data cheaply. Data scientists loved them. Governance teams did not.
Data warehouses, championed by rivals like Snowflake, were clean and queryable but expensive and rigid.
Most companies bought both. They ended up stitching together brittle pipelines just to keep data consistent across two disconnected systems.
Databricks could have kept competing as “the best managed Spark.” That would have been a mistake. Being the best version of a known thing caps your ceiling. You inherit the market’s existing frame, and someone else usually owns that frame already.
Instead, Databricks decided to build the bridge between the two camps itself.
A Product Bet Before There Was a Word for It
This is the part people skip when they talk about Databricks’ marketing win. The marketing had nothing to sell without the product first taking a real risk.
The team built Delta Lake, a storage layer that brought database-grade reliability to raw cloud storage. It was a technical bet that data science and business intelligence were converging, and that one environment needed to serve both the engineer writing Python and the analyst writing SQL.
I’ve sat in enough roadmap debates to know how hard that call is. You’re asking engineering to build into a competitor’s core turf, with no guarantee the market will care.
Marketing can’t manufacture credibility out of nothing. It amplifies what’s already true about the product. Databricks earned the right to make a big claim before it made one.
Naming the Category
Here’s where the category creation strategy actually took shape.
Databricks first described its new architecture as “unified data analytics.” It was accurate. It was also forgettable. Prospects didn’t respond to it, and by most accounts, that indifference was visible in early focus groups.
CMO Rick Schultz and the founding team kept iterating instead of settling. That’s a detail worth sitting with. Founders staying hands-on in positioning work, long after most companies hand it off, is rare, and it mattered here.
The eventual answer was the “data lakehouse.” Part data lake, part data warehouse. Instantly legible, slightly provocative, and impossible to confuse with anything else on the market.
Before the term even existed, Databricks had already been publishing content about the cost and complexity of running separate silos. That’s a sequencing lesson I use with my own teams constantly: agree on the problem before you introduce the solution’s name. A category label dropped on an unconvinced market lands as jargon. A category label dropped on a market you’ve already primed lands as relief.
Message Stamina, Not Message Comfort
Analysts and legacy vendors dismissed the lakehouse as marketing spin. That’s normal. Any real category claim invites pushback, because it’s asking incumbents to accept a new frame they don’t control.
Most marketing teams flinch here. They soften the language, chase consensus, or quietly retire the term after two quarters of resistance.
Databricks did the opposite. Schultz has called it message stamina, and I think that’s the right name for it. The company used white papers, technical content, a global tour, and an activated open-source community to repeat the same core idea across dozens of formats.
There’s a rule of thumb I’ve seen play out again and again in enterprise B2B: a buyer needs to hear a message seven to nine times before it actually sticks. Most marketers give up around attempt three, convinced the message has failed. It hasn’t failed. It’s just early.
The real proof point came when customers like Capital One started describing their own “lakehouse strategies” publicly. Once your customers are using your vocabulary unprompted, you’ve stopped marketing a product and started defining a category. At that point, competitors are forced to respond on terms you set.
Winning the Practitioner Before the Executive
While the lakehouse message circulated, Databricks ran a disciplined, unglamorous demand engine underneath it.
Early on, the company skipped the C-suite almost entirely. It targeted data scientists, data engineers, and their direct managers, the people who actually touch the data every day.
That’s a sequencing choice worth noticing. Selling to the CIO first, before your product has technical champions, is often a vanity move. It looks efficient on paper and produces shallow deals in practice.
Open-source engagement across Spark and MLflow wasn’t treated as community goodwill. It fed intent-scoring models that told the sales team exactly which accounts were already primed. Combine that with a distribution deal that made Azure Databricks a first-party Microsoft service, and you get a pipeline built on genuine technical trust, not cold outreach.
Adapting the Story for the AI Moment
Generative AI threatened to make even a well-earned category feel dated overnight.
Databricks answered on both fronts again. On the product side, the roughly $1.3 billion acquisition of MosaicML in 2023 bought the infrastructure to train and serve large language models. On the marketing side, the lakehouse story evolved into the Data Intelligence Platform.
The messaging shift was smart for a specific reason: it read enterprise anxiety correctly. Companies were nervous about handing proprietary data to closed AI vendors. “Your data, your AI” answered that fear directly instead of chasing hype language.
What This Means for the Rest of Us
You don’t need a category as big as the lakehouse to use this playbook.
Name the problem your buyers already feel but haven’t articulated. Prove your product solves it before you go loud. Then repeat the message well past the point where your own team is sick of hearing it.
The companies that win category battles aren’t always the ones with the best technology. They’re the ones willing to say the same true thing, out loud, for longer than everyone else is comfortable with.