Is the tech giant taking on Gemini with its new MM1 AI virtual assistant model?
Apple is launching a new artificial intelligence model that it says takes image and text understanding on smart devices to new “state-of-the-art” levels.
With the powerful MM1 (short for “multimodal,” version 1), Apple aims to improve the iPhone user experience by building more flexible and intelligent systems on large language model technology.
According to Apple, the MM1 family of models scales up to 30 billion parameters and can follow user prompts and search across text and images to deliver the best results for a query. Earlier in 2024, Apple CEO Tim Cook alluded to a “significant announcement” on the artificial intelligence front that would be a “major breakthrough.”
Many Apple influencers and experts believe the company will replace the current version of Siri, and that may be coming true. Just as Google replaced its Assistant with Gemini, Apple could replace Siri with a new LLM-based feature for the iPhone that some Apple trackers have dubbed “Siri 2.0.”
Apple didn’t take the typical route of announcing a new AI feature via press release. Instead, it introduced MM1 in a research paper titled “MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training.”
Let’s look at Apple’s new AI model and see what potential it brings to the table.
Apple MM1 Defined
MM1 is a new Apple multimodal training approach that leverages a mix of data, including synthetic image captions, to boost performance on everyday tasks involving text and images. The technology ramps up performance and reduces the need for follow-up queries, getting the end user to the optimal result faster. Apple suggests MM1 would be transparent to regular Siri users: all they’d likely notice is how quickly a query gets a response and how little back-and-forth with the software it takes to get there.
“By utilizing a diverse dataset comprising image-caption pairs, interleaved image-text documents, and text-only data, Apple claims the MM1 model sets a new standard in AI’s ability to perform tasks such as image captioning, visual question answering, and natural language inference with a high level of accuracy,” MacRumors notes in a March 18 post.
How It Works
In the Apple research paper, the engineers write that researchers no longer need to rely on a single approach to achieve state-of-the-art performance. With MM1, different combinations of training data and model architectures can deliver top query and prompt results.
By training on a “diverse dataset spanning visual and linguistic information,” Apple’s engineers could improve the virtual assistant user experience, combining image-caption data, visual question answering, text-only data, and natural language inference to return the desired response quickly and accurately.
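To make the idea of a mixed training diet concrete, here is a minimal, hypothetical sketch in Python. The three source types mirror those named in the paper (image-caption pairs, interleaved image-text documents, and text-only data), but the sampling weights, record formats, and function names are illustrative assumptions, not Apple’s actual training recipe.

```python
import random

# Illustrative only: the three source types mirror those named in the
# MM1 paper, but the weights and comments here are invented for
# clarity, not Apple's actual training recipe.
DATA_MIX = [
    ("image_caption", 0.45),  # short captions paired with single images
    ("interleaved", 0.45),    # web-style documents with images embedded in text
    ("text_only", 0.10),      # plain documents, no images
]

def sample_source(mix=DATA_MIX):
    """Pick the data source for the next training batch, weighted by the mix."""
    names, weights = zip(*mix)
    return random.choices(names, weights=weights, k=1)[0]

# Over many batches the model sees a blend of visual and textual data,
# which is what gives a multimodal LLM both captioning and reasoning skill.
counts = {name: 0 for name, _ in DATA_MIX}
for _ in range(10_000):
    counts[sample_source()] += 1
print(counts)  # counts land roughly in proportion to the configured weights
```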
“Thanks to large-scale pre-training, MM1 enjoys appealing properties such as enhanced in-context learning and multi-image reasoning, enabling few-shot chain-of-thought prompting,” the paper noted.
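“Few-shot chain-of-thought prompting” simply means showing the model a handful of worked examples, reasoning steps included, before asking the real question, so it imitates that step-by-step style. Apple has not published an MM1 API, so the sketch below is generic: `build_cot_prompt` and the worked examples are invented for illustration, and the resulting string would be fed to whatever multimodal LLM you have access to, alongside the referenced images.

```python
# Generic few-shot chain-of-thought prompt construction. The examples
# and function name below are hypothetical, not an Apple API.
FEW_SHOT_EXAMPLES = """\
Q: Image 1 shows 3 apples and image 2 shows 2 apples. How many apples in total?
A: Image 1 has 3 apples. Image 2 has 2 apples. 3 + 2 = 5. The answer is 5.

Q: Image 1 shows a red ball; image 2 shows the same ball under a table. Where is the ball now?
A: In image 2 the ball sits beneath the table. The answer is under the table.
"""

def build_cot_prompt(question: str) -> str:
    """Prepend worked, step-by-step examples so the model imitates that reasoning style."""
    return f"{FEW_SHOT_EXAMPLES}\nQ: {question}\nA:"

prompt = build_cot_prompt("Image 1 has 4 birds and image 2 has 1 bird. How many birds?")
print(prompt)  # feed this prompt, plus the referenced images, to the model
```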
On MM1 Replacing Siri
An AI-powered Siri has long been a topic of discussion among the Apple user base, spurred by reports of Apple’s massive investment in AI talent and technology resources in the past year.
Similar rumors emerged when Apple released Ferret, its open-source multimodal AI model built to identify and reason about regions and objects within images.
“We view AI and machine learning as fundamental technologies, and they’re integral to virtually every product we ship,” Cook said earlier this year.
MM1 looks to be a capable step up from the AI technology behind today’s Siri. While the research paper doesn’t mention Siri specifically, Apple appears to be heading in that direction with AI software that requires minimal prompting.
In the Apple user community, buzz has been building around LLMs that can learn from user interactions and run seamlessly on the iPhone. The research paper hints at a Siri 2.0-style LLM that operates “on-device” and works within Apple’s personal data privacy protections to get users exactly the information they need, quickly and efficiently.
With MM1, Apple seems to be on that path. However, only time will tell exactly what the company has in store for its 1.46 billion iPhone users.
It is evident that Apple plans to use AI to change how users interact with their digital devices, and its new large language model work could be a precursor of things to come.
Brian O’Connell, a former Wall Street bond trader, is a prominent figure in the finance industry. He has authored two best-selling books, ‘The 401k Millionaire’ and ‘CNBC’s Creating Wealth’, demonstrating his deep knowledge of finance and investing.
Brian is also a finance and business writer for esteemed national platforms and publications, including CNN, TheStreet.com, CBS News, The Wall Street Journal, U.S. News & World Report, Forbes, and Fox News.