Google launches Gemini, most powerful model to date in AI race

LAS VEGAS – The race to make the most useful generative artificial intelligence (AI) assistant for workers has gone up a notch, with Google launching Gemini 1.5.

The foundation model, labelled a “game changer” and the most powerful Google has produced, processes larger quantities of data than any other available model to deliver more contextualised answers in less time.

Simply put, the algorithm could crunch the equivalent of one hour of video, 11 hours of audio, 700,000 words, or 30,000 lines of code in response to a user’s single prompt.

By comparison, Anthropic’s Claude 2.1 handles up to one-fifth of Gemini 1.5’s one million tokens of data in a single prompt, while OpenAI’s GPT-4 handles one-eighth.
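As a rough illustration, the fractions quoted above can be converted into token counts with a few lines of Python. This is only arithmetic on the figures in this report: the function name is hypothetical, and the vendors’ published limits may differ from these rounded values.

```python
# Illustrative arithmetic only, based on the figures quoted in the article.
GEMINI_1_5_TOKENS = 1_000_000  # "one million tokens", as reported


def implied_tokens(denominator: int) -> int:
    """Return the token count equal to 1/denominator of Gemini 1.5's capacity.

    Hypothetical helper for this sketch; not part of any vendor API.
    """
    return GEMINI_1_5_TOKENS // denominator


# Fractions as stated in the article (assumed to be rounded approximations).
implied = {
    "Claude 2.1 (one-fifth)": implied_tokens(5),
    "GPT-4 (one-eighth)": implied_tokens(8),
}

for name, tokens in implied.items():
    print(f"{name}: about {tokens:,} tokens")
```

On these figures, one-fifth works out to about 200,000 tokens and one-eighth to about 125,000.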

The new model is also the only multimodal one on the market, meaning it can take in and produce content not only in text but also in audio, video and code.

Gemini 1.5 – first revealed in February – was among almost 30 new products and product improvements shown by the tech giant’s cloud division at its annual Google Cloud Next conference in Las Vegas on April 9.

News of its roll-out immediately sparked headlines touting that the California-based firm is back in the game – even in the lead – in the race for AI supremacy after a hesitant start and a series of disappointing demonstrations with previous models.

Dr Chirag Dekate, an analyst at research firm Gartner, called the new Gemini a game changer: “You can actually take your videos, video files, and ask multimodal questions. You cannot do this with any of the chatty GPT models or any of the other models that exist.”

Other product launches at the conference included new processing chips, security features, integration with more third-party models, customised AI assistants for customer service or data analytics, and even a text-to-video tool for Google Workspace called Vids.

One thread runs through them all, noted Dr Dekate – AI.

“What you’re starting to see emerge is Google changing the cloud narrative,” he said. “It is absolutely a new data cloud, where everything you do will be in preparation for AI transformation and augmenting.”

Google Cloud chief executive Thomas Kurian told about 30,000 conference delegates who filled the Michelob Ultra Arena and adjacent halls that Gemini’s improvements would let enterprises do what was impossible before.

“For example, a gaming company could now offer video analysis to support players, even offer tips for players. Or an insurance company could combine video images and text to create an incident report and automate the claims process,” he said.

Google Cloud, which turned profitable for the first time in the first three months of 2023, has grown rapidly and accounted for 3.6 per cent of parent company Alphabet’s US$23.7 billion (S$31.9 billion) operating income by the final quarter of the same year.

Its turnover comes from businesses that pay fees and subscriptions for its services such as cloud hosting, cyber security and analytics services, and enhanced work tools such as Gmail, Google Docs and Google Drive.

The importance of Mr Kurian and his division relative to Google’s consumer units such as Search and YouTube is expected to grow, as the hefty costs of running AI are likely to be borne by business budgets.

Mr Ray Wang, chief executive of Constellation Research, said Google has the longest history in doing AI, and is the only operator to own its full stack of AI infrastructure, data, models and solutions.

But it will need better sales execution, he said. “Right now, the best team in sales is Microsoft.”

For now, Google has the lead.

Mr Wang said: “The question is, will Microsoft make enough of a dent to show that they will build a team that does AI? Right now, they are relying on OpenAI. In the future, Microsoft is going to have a team. That’s really the race.”

Dr Dekate concurred with Mr Wang about Google’s lead: “Microsoft is going to have to wait until OpenAI comes out with multimodal models before they can do anything like this.”

But the crown is not won by the size of the model.

Dr Dekate said: “The cloud provider that manages to simplify enterprise experiences around generative AI enables faster delivery of generative AI innovation to solve business problems; that is the vendor that is going to win the generative AI market share.”

Dr Nitin Mayande, co-founder of marketing analytics start-up Tellagence, is both a Google vendor and user.

He said Google’s language model delivered more consistent and better results than others he has tried, and that there is not much difference in pricing between them.

“So after trying a lot of other things, we have settled on Google,” he said. “But in the future, what might become a potential competition to Gemini is the integration of many smaller systems.”

The tech firm that best helps businesses integrate these models would be the winner, Dr Mayande added.

However, victory may not mean much eventually: “Five years down the line, will (gen AI) be the only game in town? I don’t think so.”

Firms could settle on a concoction of models that combine not just generative AI systems, but also other technologies such as neural networks, Dr Mayande said.

While generative AI models are known for creating new content from the data they are fed, neural network systems tend to focus on adaptive learning to improve continuously.

They are inspired by the way human brains work and, according to Dr Mayande, are probably less likely to “hallucinate”, or generate false information.
