Introduction
a data company actually grow?
This week, what would have been news a year ago was no longer news. Snowflake invested in AtScale, a provider of semantic layer services, in a strategic investment in the waning company’s history. An odd move, given the commitment to the Open Semantic Interchange or “OSI” (yet another acronym, or .yaa), which appears to be MetricFlow masquerading as something else.
Meanwhile, Databricks, the AI and Data company, invested in AI-winner and all-round VC paramour Lovable, the rapidly growing vibe-coding company from Sweden.
Starting a venture arm is a tried-and-tested route for enterprises. Everyone from Walmart and Hitachi to banks like JPMorgan and Goldman Sachs, and of course the hyperscalers themselves (MSFT, GOOG), has a venture arm (though surprisingly not AWS).
The benefits are clear. An investment into a round can give the right of first refusal. It gives both parties influence around complementary roadmap features as well as clear distribution advantages. “Synergy” is the word used in boardrooms, though it is the less insidious and friendlier younger brother of the central cost cutting so prevalent in PE rather than venture-backed businesses.
It should therefore come as no surprise to see that Databricks is branching out beyond Data. After all (and Ali has been very open about this), the team understands that the way to grow the company is through new use cases, most notably AI. While Dolly was a flop, the jury is out on the partnership with OpenAI. AI/BI, as well as Databricks Apps, are promising initiatives designed to bring more friends into the tent, beyond the core SYSADMIN cluster administrators.
Snowflake, meanwhile, may be attempting a similar tack but with differing levels of success. Aside from Streamlit, it isn’t clear what value its acquisitions are really bringing. Openflow, Neolithic NiFi under the hood, is not well received. Rather, it is the internal developments, such as the embedding of dbt Core into the Snowflake platform, that appear to be gaining more traction.
In this article, we’ll dive into the different factors at play and make some predictions for 2026. Let’s get stuck in!
Growth by use cases
Databricks has a problem. A big problem. And that’s equity.
As the fourth-largest privately held company in the world, at the tender age of 12 its employees require liquidity. And liquidity is expensive (see this excellent article).
To make good on its internal commitments, Databricks needed perhaps $5bn+ when it did this raise. The amount it needs per year is significant. It is therefore simply not an option to stop raising money without firing employees and cutting costs.
The growth is staggering. In the latest Series L (!) the company cites 55% year-on-year growth, leading to a valuation of over $130bn. The company must continue to raise money to pay its opex and equity, but there is another constraint, which is valuation. At this point Databricks’ ability to raise money is practically a bellwether for the industry, and so there is a vested interest for everyone involved (the list is huge) to keep things up.

The dream is to continue growing the company, as this will sustain the valuation: valuations are tied to revenue growth. Which brings us back to use cases.
The clear use cases, as shown here, are roughly:
- Big data processing and Spark
- Within this, Machine Learning workloads
- AI workloads
- Data warehousing
- Ingestion or Lakeflow (Arcion, we suspect, was perhaps a bit early)
- Business Intelligence
- Applications
It is worth noting these sectors are all forecast to grow at around 15–30% all in, per the vast majority of market reports (an example here). This reflects the underlying demand for more data, more automation, and more efficiency, which I believe is ultimately justified, especially in the age of AI.

It would appear to indicate, therefore, that the bottom or “floor” for Databricks would be growth of about 15–30%, and with it perhaps a 40% haircut to valuation multiples (assuming linear correlation; yes, yes, assumptions, assumptions; some more info here), barring of course any exogenous shocks to the system such as OpenAI going out of business or war.
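As a rough illustration of the haircut logic, here is a minimal sketch. It reads “linear correlation” as the multiple scaling one-for-one with the growth rate, which is an assumption on my part rather than anything the article specifies, and the function name `haircut_if_proportional` is made up for this example. The 55% starting point is the growth rate cited for the latest round.

```python
def haircut_if_proportional(current_growth: float, floor_growth: float) -> float:
    """Fractional cut to the multiple IF it scales one-for-one with growth (assumption)."""
    return 1 - floor_growth / current_growth


current = 0.55  # growth rate cited for the latest round

# The two ends of the 15-30% market growth range used as the "floor".
for floor in (0.15, 0.30):
    cut = haircut_if_proportional(current, floor)
    print(f"Growth {current:.0%} -> {floor:.0%}: multiple falls by about {cut:.0%}")
```

Under this strict-proportionality reading, the top of the range lands in the same ballpark as the 40% quoted, while the bottom of the range implies a much deeper cut. A linear fit with an intercept would soften both numbers.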
This is hardly concerning as a bear case, which makes me wonder: what is the bull?
The bull lies in the two A’s: AI use cases and Applications.
AI as a way out
If Databricks can successfully partner with the model providers and become the de facto engine for hosting models and running the associated workflows, it could be huge.
Napkin maths: revenue is at a $4.8bn run-rate (RR), growing at 55%. Say we’re growing at 30% in steady state; we’re missing 25%. 25% of $4.8bn is $1.2bn. Where can this come from? Supposedly existing AI products and existing warehousing are already over $2bn (see here). What happens next year when Databricks is at $6bn and we need to grow 50% and therefore need $3bn? Is the business going to double the AI part?
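The arithmetic above can be reproduced in a few lines. This is only a sketch: the 30% steady-state rate, the $6bn figure and the 50% target are the round numbers used in the paragraph, and the function name `growth_gap_bn` is hypothetical.

```python
def growth_gap_bn(run_rate_bn: float, target_growth: float, organic_growth: float) -> float:
    """New revenue ($bn) needed beyond what organic growth alone would deliver."""
    return run_rate_bn * (target_growth - organic_growth)


# This year: $4.8bn run-rate, 55% reported growth vs. an assumed 30% steady state.
gap_now = growth_gap_bn(4.8, 0.55, 0.30)

# Next year (round numbers from the text): $6bn run-rate that must grow 50% in total.
needed_next = 6.0 * 0.50

print(f"Gap this year: ${gap_now:.1f}bn")
print(f"New revenue needed next year: ${needed_next:.1f}bn")
```

Changing the assumed steady-state rate shows how sensitive the gap is: every five points of organic growth is worth roughly a quarter of a billion dollars at this run-rate.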
Confluent is a benchmark. It is the largest Kafka/stream processing company, with revenue of about $1.1bn annualised. It grows about 25% y-o-y but traded at about 8x revenue and sold to IBM for $11bn, so about 10x revenue. Even with its loyal fanbase and strong adoption for AI use cases (see for example marketecture from Sean Falconer), it would still struggle to put another $250m of annual growth on every year.
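To put the benchmark next to the Databricks gap, a quick sketch using only the approximate figures quoted in the text. The variable names are illustrative, and the “Confluent-years” comparison is my framing rather than a standard metric.

```python
# Figures quoted in the text (all approximate, in $bn).
confluent_revenue_bn = 1.1
confluent_growth = 0.25
ibm_price_bn = 11.0

# Implied deal multiple, and the revenue Confluent adds in a year at that growth rate.
deal_multiple = ibm_price_bn / confluent_revenue_bn
added_per_year_bn = confluent_revenue_bn * confluent_growth

# The $1.2bn Databricks gap expressed in "Confluent-years" of growth (illustrative framing).
databricks_gap_bn = 1.2
confluents_needed = databricks_gap_bn / added_per_year_bn

print(f"Deal multiple: {deal_multiple:.1f}x revenue")
print(f"Revenue added per year: ${added_per_year_bn * 1000:.0f}m")
print(f"Gap in Confluent-years of growth: {confluents_needed:.1f}")
```

On these numbers, closing the gap means adding several times the annual growth of the largest streaming company, every year.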
Applications are another story. Those who build data-intensive applications are not those who typically build internal-facing products, a task usually borne by in-house teams of software engineers or consultants. These are teams that already know how to do this, and know how to do it well, with existing technology specifically designed for the purpose, namely core engineering primitives like React, Postgres (self-hosted) and FastAPI.

A data engineer could log in to Lovable, spin up Neon Postgres, a declarative Spark ETL pipeline, and a front-end in Databricks. They could. But will they want to add this to their ever-increasing backlog? I’m not sure.
The point is that the core business is not growing fast enough to sustain the current valuation, so more lines of business are required. Databricks is like a golden goose at the craps table, who continues to avoid rolling the unutterable number. They will now continue making more and more bets, while all those around the table continue to benefit.
Databricks is topped out as a data-only company.
We’ve written before about ways they could have moved out of this. Spark Structured Streaming was an obvious choice, but the ship has sailed, and it is companies like Aiven and Ververica that are now in pole position for the Flink race.
📚 Read: What not to miss in Real-time Data and AI in 2025 📚
To become a model-serving company or an ‘AI Cloud’ also seems a tall order. CoreWeave, Lambda, and of course Nebius are all on track to really challenge the hyperscalers here.
An AI cloud is fundamentally driven by a high availability of GPU-optimised compute. This doesn’t just mean leasing EC2 instances from Jeff Bezos. It means sliding into Jensen Huang’s DMs and buying a ton of GPUs.
Nebius has about 20,000, with another 30,000 on the way; this Yahoo report thinks the numbers are higher. All the AI Clouds rent space in data centres as well as building their own. Inference, unlike Spark, is not a commodity because of the immense software, hardware, and logistical challenges required.
Let us not forget that Nebius owns just over 25% of ClickHouse, both teams being very software engineering-led and Russian; the Yandex Alumni Club.
If there is one thing we have learned, it is that it is easier to go up the value chain than down it. I wrote about this funnel perhaps two years ago now, but it seems truer than ever.

Snowflake easily eats into dbt. Databricks has easily eaten into Snowflake’s warehouse revenue. Microsoft will eat into Databricks’. And in turn, with raw data centre power, NVIDIA and Meta partnerships, and an army of the best developers in the business, Nebius can eat into the hyperscalers.
Data warehousing under attack
With every passing day, proprietary data warehousing platforms seem more and more unlikely to be the technical end for AI and Data infrastructure.
Salesforce is increasing levies, databases are supporting cross-query capabilities, CDOs are running DuckDB in Snowflake itself.
Even Bill Inmon acknowledges warehousing companies missed the warehousing!
While these platforms are convenient, there is a scale at which enterprises and even late-stage start-ups are demanding greater openness, greater flexibility and cheaper compute.
At Orchestra we’ve seen this first-hand. The companies adopting technologies such as Iceberg are overwhelmingly large. From the largest telecom providers to the Booking.coms of this world (who happen to use and love Snowflake; more on this later), traditional data warehousing is unlikely to continue dominating the share of budget it has done for the last decade.
There are a few ways Snowflake has also tried to expand its core offering:
- Support for managed Iceberg; open compute engine
- Data cataloging (Select Star)
- Applications (Streamlit)
- Spark and other forms of compute like containers
- AI agents for Analysts, AKA Snowflake Intelligence
- Transformation (i.e. dbt)
Paradoxically for a proprietary engine provider, it would appear that Iceberg is a significant growth avenue, as well as AI. See more from TT here.
Snowflake customers love it.
Data Pangea
I think the definitions of the pioneers, early adopters, late adopters, and laggards are changing.
Pioneers now include a heavy real-time component and an AI-first approach to the stack. This is likely to revert to Machine Learning as people realise AI is not a hammer for every nail.
These companies want to partner with a few large vendors, and have a high appetite for building as well as buying software. They will have at least one vendor in the streaming/AI, query engine and analytics space. A good example is Booking.com, or perhaps Fresha, who use Snowflake, StarRocks, and Kafka (I loved the article below).
📚 Read: Exploring how modern streaming tools power the next generation of analytics with StarRocks. 📚
Early Adopters will have the traditional analytics stack and then one other area. They lack the scale to fully buy in to an enterprise-wide data and AI strategy, so they focus on those use cases they know work: Automation, Reporting.
The old “early adopters” would have had the Andreessen Horowitz data stack. That, I’m afraid, is no longer cool, or in. That was the old architecture. The late adopters have the last stack.
The laggards? Who knows. They will probably go with whoever their CTO knows the most. Be it Informatica (see this incredible Reddit post), Fabric, or perhaps even GCP!
The next step: chaos for smaller vendors
Lots of companies are changing tack. Secoda was acquired by Atlassian, Select Star was acquired by Snowflake. Arch.dev, the creators of Meltano, shut down and handed the project to Matatika. From the large companies to the small, slowing revenue growth combined with huge pressure from bloated VC rounds makes building a “Modern Data Stack”-style company an untenable approach.
📚 Read: The Final Voyage of the Modern Data Stack | Can the Context Layer for AI provide catalogs with the last chopper out of Saigon? 📚
What would happen when the Databricks and Snowflake growth numbers finally start to slow, as we argue here they must?
What would happen if there was a large exogenous market shock or OpenAI ran out of money faster than expected?
What happens as Salesforce increases taxes and hence tools like Fivetran and dbt increase in price even more?
A perfect storm for migrations and re-architecting is brewing. Data infrastructure is extremely sticky, which means in difficult times, companies raise prices. EC2 spot instances have not really changed much in price over the years, and so neither has data infra compute; and yet even AWS is raising prices of GPUs.
The marginal cost of onboarding an additional tool is becoming very high. We used to build everything ourselves as it was the only way. But having one tool for every problem doesn’t work either.

We should not forget that Parkinson’s law applies to IT budgets too. Whatever the budget is, the budget gets spent. Imagine if you had a tool that helped you automate more things with AI while reducing your warehouse bill and reducing your BI licenses (often a large 25–50% P&L budget line): what do you do?
You don’t pat yourself on the back; you spend it. You spend it on more stuff, doing more stuff. You will probably push your Databricks and Snowflake bill back up. But you will have more to show for it.
Consolidation is driving funds back into centres of gravity. These are Snowflake, Databricks, GCP, AWS and Microsoft (and to a lesser extent, Palantir). This spells chaos for most smaller vendors.
Conclusion: brace for simpler architecture
The Salesforce Tax is a pivotal moment in our industry. Companies like Salesforce, SAP, and ServiceNow all have an immense amount of data and enough clout to keep it there.
Any Data Person who has done a migration from Salesforce to NetSuite knows that migrating these tools is probably the biggest, most expensive, and most painful move anyone faces in their professional career.
Salesforce charging infrastructure service providers fees will raise prices, which in turn, combined with the increasingly precarious house of cards we see in AI and Data, all points towards huge consolidation.
ServiceNow’s acquisition of data.world, I think, provides some clarity into why we will see data teams make more use of existing tooling, simplifying architecture in the process. data.world is a provider of knowledge graphs and ontologies. By mapping the ServiceNow data schema to an ontology, a gargantuan task, ServiceNow could end up with half-decent AI and agents running within ServiceNow.
Agentforce and Data360 are Salesforce’s attempt, and supposedly already have $1.4bn in revenue, though we suspect that includes a lot of legacy too.
These providers do not really want data running around as AI use cases in Snowflake or Databricks. They want the Procurement Specialists, Finance Professionals, and Marketing Gurus staying in their platforms, and they have the means to make them stay.
This is not financial advice and this is not a crazy prediction. To predict that Snowflake and Databricks will end up growing more in line with the analyst consensus is hardly challenging.
But the idea that the largest data companies’ growth is on the verge of slowing is challenging. It challenges the rhetoric. It challenges the AI maximalist discourse.
We are entering the era of the Great Data Closure. While the AI maximalists dream of a borderless future, the reality is a heavy ceiling built by the incumbents’ gravity. In this new landscape, the winner isn’t the one with the best set of tools, but the people who make the most of what they have.
About Me
I’m the CEO of Orchestra. We help Data People build, run and monitor their pipelines easily.
You can find me on LinkedIn here.