Code Llama 70B, Meta AI’s most capable code generation model to date, has a new version available. Slightly faster and more accurate than its predecessor, the new model is one of the largest open-source AI models for code generation.
With a large context window of 100,000 tokens, Code Llama 70B can process and generate longer, more complex code in a variety of languages, including C++, Python, PHP, and Java. It has been trained on 500 billion tokens of code and code-related data.
Based on Llama 2, one of the biggest general-purpose large language models (LLMs) in the world, Code Llama 70B has been fine-tuned for code generation. Like other transformer-based models, it relies on self-attention, a mechanism that helps it capture the relationships and dependencies within code.
Uphill battle
Another highlight of the new model is CodeLlama-70B-Instruct, a variant fine-tuned to understand natural language instructions and generate code accordingly.
Meta CEO Mark Zuckerberg stated, “The ability to code has also proven to be important for AI models to process information in other domains more rigorously and logically. I’m proud of the progress here, and looking forward to including these advances in Llama 3 and future models as well.”
Code Llama 70B is available to download for free under the same license as Llama 2 and previous Code Llama models, allowing both researchers and commercial users to use and modify it.
Despite the improvements, Meta faces an uphill challenge in trying to win over developers currently using GitHub Copilot, the leading AI tool for developers created by GitHub and OpenAI. Many developers are also wary of Meta and its data collection practices, and plenty aren’t fans of AI-generated code in the first place. Such code often requires serious debugging, and it can leave non-programmers shipping code they are happy with but don’t understand, leading to problems down the line.