Apple AI’s Platform Pivot Potential

Apple AI is delayed, and Apple may be trying to do too much; what the company ought to do is empower developers to make AI applications.

Mar 10, 2025 - 13:01

It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of Light, it was the season of Darkness, it was the spring of hope, it was the winter of despair, we had everything before us, we had nothing before us, we were all going direct to Heaven, we were all going direct the other way — in short, the period was so far like the present period that some of its noisiest authorities insisted on its being received, for good or for evil, in the superlative degree of comparison only.
— Charles Dickens, A Tale of Two Cities

Apple’s Bad Week

Apple has had the worst of weeks when it comes to AI. Consider this commercial which the company was running incessantly last fall:

In case you missed the fine print in the commercial, it reads:

Apple Intelligence coming fall 2024 with Siri and device language set to U.S. English. Some features and languages will be coming over the next year.

“Next year” is doing a lot of work, now that the specific feature detailed in this commercial — Siri’s ability to glean information from sources like your calendar — are officially delayed. Here is the statement Apple gave to John Gruber at Daring Fireball:

Siri helps our users find what they need and get things done quickly, and in just the past six months, we’ve made Siri more conversational, introduced new features like type to Siri and product knowledge, and added an integration with ChatGPT. We’ve also been working on a more personalized Siri, giving it more awareness of your personal context, as well as the ability to take action for you within and across your apps. It’s going to take us longer than we thought to deliver on these features and we anticipate rolling them out in the coming year.

It was a pretty big surprise, even at the time, that Apple, a company renowned for its secrecy, was so heavily advertising features that did not yet exist; I also, in full disclosure, thought it was all an excellent idea. From my post-WWDC Update:

The key part here is the “understanding personal context” bit: Apple Intelligence will know more about you than any other AI, because your phone knows more about you than any other device (and knows what you are looking at whenever you invoke Apple Intelligence); this, by extension, explains why the infrastructure and privacy parts are so important.
What this means is that Apple Intelligence is by-and-large focused on specific use cases where that knowledge is useful; that means the problem space that Apple Intelligence is trying to solve is constrained and grounded — both figuratively and literally — in areas where it is much less likely that the AI screws up. In other words, Apple is addressing a space that is very useful, that only they can address, and which also happens to be “safe” in terms of reputation risk. Honestly, it almost seems unfair — or, to put it another way, it speaks to what a massive advantage there is for a trusted platform. Apple gets to solve real problems in meaningful ways with low risk, and that’s exactly what they are doing.
Contrast this to what OpenAI is trying to accomplish with its GPT models, or Google with Gemini, or Anthropic with Claude: those large language models are trying to incorporate all of the available public knowledge to know everything; it’s a dramatically larger and more difficult problem space, which is why they get stuff wrong. There is also a lot of stuff that they don’t know because that information is locked away — like all of the information on an iPhone. That’s not to say these models aren’t useful: they are far more capable and knowledgable than what Apple is trying to build for anything that does not rely on personal context; they are also all trying to achieve the same things.

So is Apple more incompetent than these companies, or was my evaluation of the problem space incorrect? Much of the commentary this week assumes point one, but as Simon Willison notes, you shouldn’t discount point two:

I have a hunch that this delay might relate to security. These new Apple Intelligence features involve Siri responding to requests to access information in applications and then performing actions on the user’s behalf. This is the worst possible combination for prompt injection attacks! Any time an LLM-based system has access to private data, tools it can call, and exposure to potentially malicious instructions (like emails and text messages from untrusted strangers) there’s a significant risk that an attacker might subvert those tools and use them to damage or exfiltrating a user’s data.

Willison links to a previous piece of his on the risk of prompt injections; to summarize the problem, if your on-device LLM is parsing your emails, what happens if one of those emails contains malicious text perfectly tuned to make your on-device AI do something you don’t want it to? We intuitively get why code injections are bad news; LLMs expand the attack surface to text generally; Apple Intelligence, by being deeply interwoven into the system, expands the attack surface to your entire device, and all of that precious content it has unique access to.

Needless to say, I regret not raising this point last June, but I’m sure my regret pales in comparison to Apple executives and whoever had to go on YouTube to pull that commercial over the weekend.

Apple’s Great Week

Apple has had the best of weeks when it comes to AI. Consider their new hardware announcements, particularly the Mac Studio and its available M3 Ultra; from the company’s press release:

Apple today announced M3 Ultra, the highest-performing chip it has ever created, offering the most powerful CPU and GPU in a Mac, double the Neural Engine cores, and the most unified memory ever in a personal computer. M3 Ultra also features Thunderbolt 5 with more than 2x the bandwidth per port for faster connectivity and robust expansion. M3 Ultra is built using Apple’s innovative UltraFusion packaging architecture, which links two M3 Max dies over 10,000 high-speed connections that offer low latency and high bandwidth. This allows the system to treat the combined dies as a single, unified chip for massive performance while maintaining Apple’s industry-leading power efficiency. UltraFusion brings together a total of 184 billion transistors to take the industry-leading capabilities of the new Mac Studio to new heights.
“M3 Ultra is the pinnacle of our scalable system-on-a-chip architecture, aimed specifically at users who run the most heavily threaded and bandwidth-intensive applications,” said Johny Srouji, Apple’s senior vice president of Hardware Technologies. “Thanks to its 32-core CPU, massive GPU, support for the most unified memory ever in a personal computer, Thunderbolt 5 connectivity, and industry-leading power efficiency, there’s no other chip like M3 Ultra.”

That Apple released a new Ultra chip wasn’t a shock, given there was an M1 Ultra and M2 Ultra; almost everything about this specific announcement, however, was a surprise.

Start with the naming. Apple chip names have two components: M_ refers to the core type, and the suffix to the configuration of those cores. Therefore, to use the M1 series of chips as an example:

	Perf Cores	Efficiency Cores	GPU Cores	Max RAM	Bandwidth
M1	4	4	8	16GB	70 GB/s
M1 Pro	8	4	16	32GB	200 GB/s
M1 Max	8	2	32	64GB	400 GB/s
M1 Ultra	16	4	64	128GB	800 GB/s

The “M1” cores in question were the “Firestorm” high-performance core, “Icestorm” energy-efficient core, and a not-publicly-named GPU core; all three of these cores debuted first on the A14 Bionic chip, which shipped in the iPhone 12.

The suffix meanwhile, referred to some combination of increased core count (both CPU and GPU), as well as an increased number of memory controllers and associated bandwidth (and, in the case of the M1 series, faster RAM). The Ultra, notably, was simply two Max chips fused together; that’s why all of the numbers simply double.

The M2 was broadly similar to the M1, at least in terms of the relative performance of the different suffixes. The M2 Ultra, for example, simply doubled up the M2 Max. The M3 Ultra, however, is unique when it comes to max RAM:

	Perf Cores	Efficiency Cores	GPU Cores	Controllers	Max RAM	Bandwidth
M3	4	4	10	8	32GB	100 GB/S
M3 Pro	6	6	18	12	48GB	150 GB/s
M3 Max	12	4	40	32	128GB	400 GB/s
M3 Ultra	24	8	80	64	512GB	800 GB/s

I can’t completely vouch for every number on this table (which was sourced from Wikipedia), as Apple hasn’t yet released the full technical details of the M3 Ultra, and it’s not yet available for testing. What seems likely, however, is that instead of simply doubling up the M3 Max, Apple also reworked the memory controllers to address double the memory. That also explains why the M3 Ultra came out so much later than the rest of the family — indeed, the Mac Studio base chip is actually the M4 Max.

The wait was worth it, however: what makes Apple’s chip architecture unique is that that RAM is shared by the CPU and GPU, and not in the carve-out way like integrated graphics of old; rather, every part of the chip — including the Neural Processing Units, which I didn’t include on these tables — has full access to (almost¹) all of the memory all of the time.

What that means in practical terms is that Apple just shipped the best consumer-grade AI computer ever. A Mac Studio with an M3 Ultra chip and 512GB RAM can run a 4-bit quantized version of DeepSeek R1 — a state-of-the-art open-source reasoning model — right on your desktop. It’s not perfect — quantization reduces precision, and the memory bandwidth is a bottleneck that limits performance — but this is something you simply can’t do with a standalone Nvidia chip, pro or consumer. The former can, of course, be interconnected, giving you superior performance, but that costs hundreds of thousands of dollars all-in; the only real alternative for home use would be a server CPU and gobs of RAM, but that’s even slower, and you have to put it together yourself.

Apple didn’t, of course, explicitly design the M3 Ultra for R1; the architectural decisions undergirding this chip were surely made years ago. In fact, if you want to include the critical decision to pursue a unified memory architecture, then your timeline has to extend back to the late 2000s, whenever the key architectural decisions were made for Apple’s first A4 chip, which debuted in the original iPad in 2010.

Regardless, the fact of the matter is that you can make a strong case that Apple is the best consumer hardware company in AI, and this week affirmed that reality.

Apple Intelligence vs. Apple Silicon

It’s probably a coincidence that the delay in Apple Intelligence and the release of the M3 Ultra happened in the same week, but it’s worth comparing and contrasting why one looks foolish and one looks wise.

Apple Silicon

Start with the latter: Tony Fadell told me the origin story of Apple Silicon in a 2022 Stratechery Interview; the context of the following quote was his effusive praise for Samsung, which made the chips for the iPod and the first several models of the iPhone:

Samsung was an incredible partner. Even though they got sued, they were an incredible partner, they had to exist for the iPod to be as successful and for the iPhone to even exist. That happened. During that time, obviously Samsung was rising up in terms of its smartphones and Android and all that stuff, and that’s where things fell apart.
At the same time, there was the strategic thing going on with Intel versus ARM in the iPad, and then ultimately iPhone where there’s that fractious showdown that I had with various people at Apple, including Steve, which was Steve wanted to go Intel for the iPad and ultimately the iPhone because that’s the way we went with the Mac and that was successful. And I was saying, “No, no, no, no! Absolutely not!” And I was screaming about it and that’s when Steve was, well after Intel lost the challenge, that’s when Steve was like, “Well, we’re going to go do our own ARM.” And that’s where we bought P.A. Semi.
So there was the Samsung thing happening, the Intel thing happening, and then it’s like we need to be the master of our own destiny. We can’t just have Samsung supplying our processors because they’re going to end up in their products. Intel can’t deliver low power embedded the way we would need it and have the culture of quick turns, they were much more standard product and non custom products and then we also have this, “We got to have our own strategy to best everyone”. So all of those things came together to make what happened happen to then ultimately say we need somebody like TSMC to build more and more of our chips. I just want to say, never any of these things are independently decisions, they were all these things tied together for that to pop out of the oven, so to speak.

This is such a humbling story for me as a strategy analyst; I’d like to spin up this marvelous narrative about Apple’s foresight with Apple Silicon, but like so many things in business, it turns out the best consumer AI chips were born out of pragmatic realities like Intel not being competitive in mobile, and Samsung becoming a smartphone competitor.

Ultimately, though, the effort is characterized by four critical qualities:

Time: Apple has been working on Apple Silicon for 17 years.

Motivation: Apple was motivated to build Apple Silicon because having competitive and differentiated mobile chips was deemed essential to their business.

Differentiation: Apple’s differentiation had always been rooted in the integration of hardware and software, and controlling their own chips let them do exactly that, wringing out unprecedented efficiency in particular.

Evolution: The M3 Ultra isn’t Apple’s first chip; it’s not even the first M chip; heck, it’s not even the first M3! It’s the result of 17 years of iteration and experimentation.

Apple Intelligence

Notice how these qualities differ when it comes to Apple Intelligence:

Time: The number one phrase that has been used to characterize Apple’s response to the ChatGPT moment in November 2022 is flat-footed, and that matches what I have heard anecdotally. That, by extension, means that Apple has been working on Apple Intelligence for at most 28 months, and that is almost certainly generous, given that the company likely took a good amount of time to figure out what its approach would be. That not nothing — xAI went from company formation to Grok 3 in 19 months — but it’s certainly not 17 years!

Motivation: If you look at Apple’s earnings calls in the wake of ChatGPT, February 2023, May 2023, and August 2023, all contain some variation of “AI and machine learning have been integrated into our products for years, and we’ll continue to be thoughtful about how we implement them”; finally in November 2023 CEO Tim Cook said the company was working on something new:

In terms of generative AI, we have — obviously, we have work going on. I’m not going to get into details about what it is, because, as you know, we don’t — we really don’t do that. But you can bet that we’re investing, we’re investing quite a bit, we’re going to do it responsibly and it will — you will see product advancements over time that where the — those technologies are at the heart of them.

First, this obviously has bearing on the “time” point above; secondly, one certainly gets the sense that Apple, after tons of industry hype and incessant questions from analysts, very much representing the concerns of shareholders, felt like they had no choice but to be doing something with generative AI. In other words — and yes, this is very much driving with the rearview mirror — Apple didn’t seem to be working on generative AI because they felt it was essential to their product vision, but rather because they had to keep up with what everyone else was doing.

Differentiation: This is the most alluring part of the Apple Intelligence vision, which I myself hyped up from the beginning: Apple’s exclusive access to its users’ private information. What is interesting to consider, however, beyond the security implications, is the difference between “exclusivity” and “integration”.

Consider your address book: the iOS SDK included the Contacts API, which gave any app on the system full access to your contacts without requiring explicit user permission. This was essential to the early success of services like WhatsApp, which cleverly bootstrapped your network by using phone numbers as unique IDs; this meant that pre-existing username-based networks like Skype or AIM were actually at a disadvantage on iOS. iMessage did the same thing when it launched in 2011, and then Apple started requiring user permission to access your contacts in 2012.

Even this amount of access, however, paled in comparison to the Mac, where developers could access information from anywhere on the system. iOS, on the other hand, put Apps in sandboxes, cut off from other apps and system information outside of APIs like the Contacts API, all of which have become more and more restricted over time. Apple made these decisions for very good reasons, to be clear: iOS is a much safer and secure environment than macOS; increased restrictions generally mean increased privacy, albeit at the cost of decreased competition.

Still, it’s worth pointing out that exclusive access to data is downstream of a policy choice to exclude third parties; this is distinct from the sort of hardware and software integration that Apple can exclusively deliver in the pursuit of superior performance. This distinction is subtle, to be sure, but I think it’s notable that Apple Silicon’s differentiation was in the service of building a competitive moat, while Apple Intelligence’s differentiation was about maintaining one.

Evolution: From one perspective, Apple Intelligence is the opposite of an evolved system: Apple put together an entire suite of generative AI capabilities, and aimed to launch them all in iOS 18. Some of these, like text manipulation and message summaries, were straightforward and made it out the door without a problem; others, particularly the reimagined Siri and its integration with 3rd party apps and your personal data, are now delayed. It appears Apple tried to do too much all at once.

The Incumbent Advantage

At the same time, it’s not as if Siri is new; the voice assistant launched in 2011, alongside iMessage. In fact, though, Siri has always tried to do too much too soon; I wrote last week about the differences between Siri and Alexa, and how Amazon was wise to focus their product development on the basics — speed and accuracy — while making Alexa “dumber” than Siri tried to be, particularly in its insistence on precise wording instead of attempting to figure out what you meant.

To that end, this speaks to how Apple could have been more conservative in its generative AI approach (and, I fear, Amazon too, given my skepticism of Alexa+): simply make a Siri that works. The fact of the matter is that Siri has always struggled with delivering on its promised functionality, but a lot of its shortcomings could have been solved by generative AI. Apple, however, promised much more than this at last year’s WWDC: Siri wasn’t simply going to work better, it was actually going to understand and integrate your personal data and 3rd-party apps in a way that had never been done before.

Again, I applauded this at the time, so this is very much Monday-morning quarterbacking. I increasingly suspect, however, we are seeing a symptom of big-company disease that I hadn’t previously considered: while one failure state in the face of new technology is moving too slowly, the opposite failure state is assuming you can do too much too quickly, when simply delivering the basics would be more than good enough.

Consider home automation: the big three players in the space are Siri and Alexa and Google Assistant. What makes these companies important is not simply that they have devices you can put in your home and talk to, but also that there is an entire ecosystem of products work with them. Given that, consider two possible products in the space:

OpenAI releases a ChatGPT speaker that you can talk to and interact with; it works brilliantly and controls, well, it doesn’t control anything, because the ecosystem hasn’t adopted it. OpenAI would need to work diligently to build out partnerships with everyone from curtain makers to smart light to locks and more; that’s hard enough in its own right, and even more difficult when you consider that many of these objects are only installed once and updated rarely.
Apple or Amazon or Google update their voice assistants with basic LLMs. Now, instead of needing to use precise language, you can just say whatever you want, and the assistant can figure it out, along with all of the other LLM niceties like asking about random factoids.

In this scenario the Apple/Amazon/Google assistants are superior, even if their underlying LLMs are worse, or less capable than OpenAI’s offering, because what the companies are selling is not a standalone product but an ecosystem. That’s the benefit of being a big incumbent company: you have other advantages you can draw on beyond your product chops.

What is striking about new Siri — and, I worry, Alexa+ — is the extent to which they are focused on being compelling products in their own right. It’s very clever for Siri to remember who I had coffee with; it’s very useful — and probably much more doable — to reliably turn my lights on and off. Apple (and I suspect Amazon) should have absolutely nailed the latter before promising to deliver the former.

If you want to be generous to Apple you could make the case that this was what they were trying to deliver with the Siri Intents expansion: developers could already expose parts of their app to Siri for things like music playback, and new Siri was to build on that framework to enhance its knowledge about a user’s context to provide useful answers. This, though, put Apple firmly in control of the interaction layer, diminishing and commoditizing apps; that’s what an Aggregator does, but what if Apple went in a different direction?

An AI Platform

While my clearest delineation of the difference between Aggregators and Platforms is probably in A Framework for Regulating Competition on the Internet, perhaps the most romantic was in Tech’s Two Philosophies:

There is certainly an argument to be made that these two philosophies arise out of their historical context; it is no accident that Apple and Microsoft, the two “bicycle of the mind” companies, were founded only a year apart, and for decades had broadly similar business models: sure, Microsoft licensed software, while Apple sold software-differentiated hardware, but both were and are at their core personal computer companies and, by extension, platforms.

Google and Facebook, on the other hand, are products of the Internet, and the Internet leads not to platforms but to Aggregators. While platforms need 3rd parties to make them useful and build their moat through the creation of ecosystems, Aggregators attract end users by virtue of their inherent usefulness and, over time, leave suppliers no choice but to follow the Aggregators’ dictates if they wish to reach end users.

The business model follows from these fundamental differences: a platform provider has no room for ads, because the primary function of a platform is to provide a stage for the applications that users actually need to shine. Aggregators, on the other hand, particularly Google and Facebook, deal in information, and ads are simply another type of information. Moreover, because the critical point of differentiation for Aggregators is the number of users on their platform, advertising is the only possible business model; there is no more important feature when it comes to widespread adoption than being “free.”
Still, that doesn’t make the two philosophies any less real: Google and Facebook have always been predicated on doing things for the user, just as Microsoft and Apple have been built on enabling users and developers to make things completely unforeseen.

I said this was romantic, but the reality of Apple’s relationship with developers, particularly over the last few years as the growth of the iPhone has slowed, has been considerably more antagonistic. Apple gives lip service to the role developers played in making the iPhone a compelling platform — and in collectively forming a moat for iOS and Android — but its actions suggest that Apple views developers as a commodity: necessary in aggregate, but mostly a pain in the ass individually.

This is all very unfortunate, because Apple — in conjunction with its developers — is being presented with an incredible opportunity by AI, and it’s one that takes them back to their roots: to be a platform.

Start with the hardware: while the M3 Ultra is the biggest beast on the block, all of Apple’s M chips are highly capable, particularly if you have plenty of RAM. I happen to have an M2 MacBook Pro with 96GB of memory (I maxed out for this specific use case), which lets me run Mixtral 8x22B, an open-source model from Mistral with 141 billion parameters, at 4-bit quantization; I asked it a few questions:

You don’t need to actually try and read the screen-clipping; the output is pretty good, albeit not nearly as detailed and compelling as what you might expect from a frontier model. What’s amazing is that it exists at all: that answer was produced on my computer with my M2 chip, not in the cloud on an Nvidia datacenter GPU. I didn’t need to pay a subscription, or worry about rate limits. It’s my model on my device.

What’s arguably even more impressive is seeing models run on your iPhone:

This is a much smaller model, and correspondingly less capable, but the fact it is running locally on a phone is amazing!

Apple is doing the same thing with the models that undergird Apple Intelligence — some models run on your device, and others on Apple’s Private Cloud Compute — but those models aren’t directly accessible by developers; Apple only exposes writing tools, image playground, and Genmoji. And, of course, they ask for your app’s data for Siri, so they can be the AI Aggregator. If a developer wants to do something unique, they need to bring their own model, which is not only very large, but hard to optimize for a specific device.

What Apple should do instead is make their models — both local and in Private Cloud Compute — fully accessible to developers to make whatever they want. Don’t limit them to cutesy-yet-annoying frameworks like Genmoji or sanitized-yet-buggy image generators, and don’t assume that the only entity that can create something compelling using developer data is the developer of Siri; instead return to the romanticism of platforms: enabling users and developers to make things completely unforeseen. This is something only Apple could do, and, frankly, it’s something the entire AI industry needs.

When the M1 chip was released I wrote an Article called Apple’s Shifting Differentiation. It explained that while Apple had always been about the integration of hardware and software, the company’s locus of differentiation had shifted over time:

When OS X first came out, Apple’s differentiation was software: Apple hardware was stuck on PowerPC chips, woefully behind Intel’s best offerings, but developers in particular were lured by OS X’s beautiful UI and Unix underpinnings.
When Apple moved to Intel chips, its hardware was just as fast as Windows hardware, allowing its software differentiation to truly shine.
Over time, as more and more applications moved to the web, the software differences came to matter less and less; that’s why the M1 chip was important for the Mac’s future.

Apple has the opportunity with AI to press its hardware advantage: because Apple controls the entire device, they can guarantee to developers the presence of particular models at a particular level of performance, backed by Private Cloud Compute; this, by extension, would encourage developers to experiment and build new kinds of applications that only run on Apple devices.

This doesn’t necessarily preclude finally getting new Siri to work; the opportunity Apple is pursuing continues to make sense. At the same time, the implication of the company’s differentiation shifting to hardware is that the most important job for Apple’s software is to get out of the way; to use Apple’s history as analogy, Siri is the PowerPC of Apple’s AI efforts, but this is a self-imposed shortcoming. Apple is uniquely positioned to not do everything itself; instead of seeing developers as the enemy, Apple should deputize them and equip them in a way no one else in technology can.

Apple reserves some memory for the CPU at all times, so that the computer can actually run ↩