Elon Musk’s answer to ChatGPT is getting an update to improve its math and coding skills, among other things. Musk’s xAI has made Grok-1.5 available to early testers, with “improved capabilities and reasoning” and the ability to handle longer contexts. The company claims that it now compares favorably against GPT-4, Gemini Pro 1.5, and Claude 3 Opus in a number of categories.
Based on xAI’s data, Grok-1.5 is a significant advance over Grok-1. Its score on the MATH benchmark more than doubled, rising to 50.6 percent. On GSM8K (math word problems) it climbed from 62.9 percent to 90 percent, and on HumanEval (coding) from 63.2 percent to 74.1 percent. Those scores still trail Gemini Pro 1.5, GPT-4, and Claude 3 Opus slightly, with the exception of HumanEval, where Grok-1.5 beats every rival except Claude 3 Opus.
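As a rough illustration of the gains quoted above, the following Python sketch computes the percentage-point improvements from the GSM8K and HumanEval figures in this article. The dictionary layout and the names `scores` and `point_gain` are illustrative assumptions, not anything published by xAI.

```python
# Benchmark scores as quoted in the article (percent).
# The structure and key names are hypothetical, chosen for illustration.
scores = {
    "GSM8K": {"grok_1": 62.9, "grok_1_5": 90.0},
    "HumanEval": {"grok_1": 63.2, "grok_1_5": 74.1},
}


def point_gain(before: float, after: float) -> float:
    """Return the improvement in percentage points, rounded to one decimal."""
    return round(after - before, 1)


gains = {
    name: point_gain(s["grok_1"], s["grok_1_5"]) for name, s in scores.items()
}

for name, gain in gains.items():
    print(f"{name}: +{gain} percentage points")
```

On these figures the GSM8K gain (27.1 points) is considerably larger than the HumanEval gain (10.9 points).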
It can also process long contexts of up to 128K tokens within its context window, meaning it can draw on information from more sources to understand a situation. “This allows Grok to have an increased memory capacity of up to 16 times the previous context length, enabling it to utilize information from substantially longer documents,” the company said.
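The “16 times” figure implies what the earlier context length must have been. Here is a minimal sketch of that arithmetic in Python, assuming “128K” means 128 × 1,024 tokens (the article does not say whether K is 1,000 or 1,024); the variable names are illustrative.

```python
TOKENS_PER_K = 1024  # assumption: "K" is 1,024 rather than 1,000
SCALE = 16           # "up to 16 times the previous context length"

new_context = 128 * TOKENS_PER_K        # 131,072 tokens
previous_context = new_context // SCALE  # implied earlier context length

print(previous_context)  # 8192
```

Under that assumption, the earlier context length works out to 8,192 tokens, or 8K.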
xAI didn’t detail Grok’s progress in other areas where it may still be lagging, such as academic benchmarks and multimodal capabilities. And Grok-1.5 may not hold its position for long. GPT-5 is set to arrive sometime this summer, promising a feature set that “makes it feel like you are communicating with a person rather than a machine,” according to OpenAI.
Currently, Grok is only available to users of the Premium+ tier on X (formerly Twitter), though Elon Musk recently promised to open it up to X’s regular Premium users. The company also recently open-sourced its Grok chatbot, after Musk sued OpenAI and Sam Altman for allegedly abandoning OpenAI’s non-profit mission.