This week has been full of AI developments, largely driven by OpenAI’s activities, including a controversial statement from CEO Sam Altman, the wide rollout of Advanced Voice Mode, speculation about a new 5GW data center, significant personnel changes, and ambitious restructuring initiatives.
Meanwhile, the rest of the AI industry continued to move at its own rapid pace, churning out new models and research. Here’s a recap of some other notable AI developments from the past week.
Google Gemini Updates
On Tuesday, Google announced updates to its Gemini model series, releasing two new production-ready models: Gemini-1.5-Pro-002 and Gemini-1.5-Flash-002. These build on the previous releases and deliver better overall performance, with notable gains in math, long-context handling, and vision tasks. Google reports a 7% improvement on the MMLU-Pro benchmark and a 20% improvement on math-related tasks. However, as long-time Ars Technica readers may know, AI benchmark results don’t always translate into real-world usefulness.
Along with these model improvements, Google has significantly cut prices for Gemini 1.5 Pro, reducing input token costs by 64% and output token costs by 52% for prompts under 128,000 tokens. AI researcher Simon Willison noted on his blog: “For comparison, GPT-4o costs $5 per million tokens for input and $15 per million for output, and Claude 3.5 Sonnet charges $3 per million for input and $15 per million for output. Gemini 1.5 Pro was already the cheapest of the flagship models, and now it’s even cheaper.”
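To put those numbers in perspective, here’s a quick back-of-the-envelope comparison in Python. The GPT-4o and Claude 3.5 Sonnet rates come from the quote above; the post-cut Gemini 1.5 Pro figures (roughly $1.25 per million input tokens and $5 per million output tokens for prompts under 128,000 tokens) are our own estimate based on the announced 64% and 52% reductions, so treat them as illustrative rather than official pricing.

```python
# Back-of-the-envelope API cost comparison.
# GPT-4o and Claude 3.5 Sonnet rates are taken from the quote above;
# the Gemini 1.5 Pro rates are an assumed estimate of the post-cut pricing.
PRICES = {  # (input $ per 1M tokens, output $ per 1M tokens)
    "gpt-4o": (5.00, 15.00),
    "claude-3.5-sonnet": (3.00, 15.00),
    "gemini-1.5-pro-002": (1.25, 5.00),  # assumption: post-reduction rates, prompts < 128K tokens
}

def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Total dollar cost for a workload with the given token counts."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens / 1e6) * in_rate + (output_tokens / 1e6) * out_rate

# Example workload: 10 million input tokens and 2 million output tokens per month.
for model in PRICES:
    print(f"{model}: ${job_cost(model, 10_000_000, 2_000_000):.2f}")
```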
Google has also raised rate limits: Gemini 1.5 Flash can now handle 2,000 requests per minute and Gemini 1.5 Pro up to 1,000. The latest versions also deliver twice the output speed and one-third the latency of their predecessors, making it easier and more cost-effective for developers to build Gemini into their applications.
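If you want to try the new releases yourself, the models are reachable through Google’s google-generativeai Python SDK. The snippet below is a minimal sketch, not official sample code; the model ID string “gemini-1.5-pro-002” and the placeholder API key are assumptions you’ll need to adjust for your own setup.

```python
# Minimal sketch of calling the new Gemini 1.5 Pro release through Google's
# google-generativeai SDK (pip install google-generativeai).
# The model ID string and the placeholder API key are assumptions.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # replace with a real Gemini API key

model = genai.GenerativeModel("gemini-1.5-pro-002")  # assumed ID for the -002 release
response = model.generate_content("Summarize the main trade-offs of long-context LLMs.")
print(response.text)
```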
Meta releases Llama 3.2
Meta on Wednesday announced the release of Llama 3.2, a significant update to its much-discussed family of open-weight AI models. The update includes vision-capable large language models (LLMs) with 11 billion and 90 billion parameters, as well as smaller, text-only models with 1 billion and 3 billion parameters optimized for edge and mobile devices. Meta claims the vision models rival the best closed-source models on image recognition and visual understanding tasks, while the smaller models outperform their competitors on various text-based benchmarks.
Simon Willison reported notable results from his testing with the smaller 3.2 models. Meanwhile, AI researcher Ethan Mollick demonstrated Llama 3.2’s capabilities on his iPhone using an app called PocketPal.
Meta also rolled out its first official “Llama Stack” distributions, designed to streamline development and deployment across different platforms. As with previous releases, Meta offers the models as a free download, subject to some licensing restrictions. The new models support expanded context windows of up to 128,000 tokens.
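For those who want to poke at the smaller text-only models locally, here’s a minimal sketch using a recent version of Hugging Face Transformers. The model ID meta-llama/Llama-3.2-3B-Instruct (and the license acceptance the gated download requires) is an assumption; swap in whichever checkpoint you actually have access to.

```python
# Minimal sketch: run one of the small text-only Llama 3.2 models locally with
# Hugging Face Transformers. The model ID below is an assumption; the checkpoint
# is gated, so you may need to accept Meta's license on Hugging Face first.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-3B-Instruct",  # assumed Hugging Face model ID
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "user", "content": "In one sentence, what is an open-weight model?"}
]
result = generator(messages, max_new_tokens=100)
# Recent Transformers versions return the full chat; the last message is the reply.
print(result[0]["generated_text"][-1]["content"])
```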
Google’s AlphaChip improves chip design
On Thursday, Google DeepMind announced a major advance in AI-driven microchip design with AlphaChip. The project grew out of a research effort begun in 2020 and uses a reinforcement learning approach to generate chip layouts. Google has used AlphaChip to design highly efficient layouts for the last three generations of its Tensor Processing Units (TPUs), the specialized chips it builds to accelerate AI workloads. Google says AlphaChip can produce high-quality chip layouts in a matter of hours, a task that typically takes human engineers weeks or months. (Nvidia is also reportedly using AI in its chip design work.)
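To make the idea concrete, here’s a heavily simplified toy sketch (not AlphaChip’s actual code) of how chip placement can be framed as a sequential decision problem: blocks are placed one at a time on a grid, and the finished layout is scored by total wire length, the kind of reward signal a reinforcement learning agent would be trained to maximize. The block names, netlist, and random “policy” are all made up for illustration.

```python
# Toy illustration (not AlphaChip's actual code) of chip placement as a
# sequential decision problem: place one block at a time on a small grid and
# score the layout by total wire length, the reward an RL agent would optimize.
import itertools
import random

GRID = 4  # hypothetical 4x4 placement grid
BLOCKS = ["cpu", "cache", "io", "dsp"]
NETS = [("cpu", "cache"), ("cpu", "io"), ("cache", "dsp")]  # made-up connectivity

def wire_length(placement):
    """Sum of Manhattan distances between connected blocks (lower is better)."""
    return sum(
        abs(placement[a][0] - placement[b][0]) + abs(placement[a][1] - placement[b][1])
        for a, b in NETS
    )

def random_policy_episode():
    """Place blocks one by one at random free cells, then score the result."""
    free = list(itertools.product(range(GRID), range(GRID)))
    placement = {}
    for block in BLOCKS:
        cell = random.choice(free)
        free.remove(cell)
        placement[block] = cell
    return placement, -wire_length(placement)  # negative length as a reward

# Sample many episodes and keep the best layout found by the random "policy";
# a trained agent would instead learn where to place each block.
best = max((random_policy_episode() for _ in range(1000)), key=lambda x: x[1])
print("best layout:", best[0], "reward:", best[1])
```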
Importantly, Google has also made a pre-trained checkpoint of AlphaChip available on GitHub, sharing the model weights with the public. The company notes that AlphaChip’s influence already extends beyond its own labs, with companies such as MediaTek adopting and building on the technology for their own chip designs. According to Google, AlphaChip has sparked a new wave of research into AI for chip design, potentially transforming every stage of the process, from computer architecture to manufacturing.
While these are just some of the highlights, they represent the rapid pace and continued innovation within the AI sector. We’ll see what next week brings, as the industry shows no signs of slowing down.