How does Code Llama 70B model compare with GitHub Copilot?
The Hindu, February 14, 2024 03:46 pm IST

On a mission to build open-source language models, Meta AI released an update to its range of Code Llama models on January 29. On the HumanEval benchmark, a dataset of 164 programming problems that measures the functional correctness and logic of code-generation models, Code Llama 70B scores 65.2, far below GPT-4's score of 85.4. The smaller Code Llama models have nevertheless proved a strong foundation: after their release last year, models such as Phind Model V7, built on top of a fine-tuned Code Llama-34B, came close to GPT-4 in performance. On a Hugging Face leaderboard of the best open-source AI models for coding, two versions of the Phind Code Llama, the 34B V2 and the 34B V1, rank within the top five.
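To see what "functional correctness" means in HumanEval terms, the sketch below shows the general shape of the evaluation: a prompt is completed by the model, and the sample passes only if the hidden unit tests all hold. The task and helper function here are illustrative examples, not an actual HumanEval problem or the benchmark's real harness.

```python
# A minimal sketch of HumanEval-style functional-correctness checking.
# The problem below is hypothetical, not drawn from the real dataset.

problem = {
    "prompt": 'def add(a, b):\n    """Return the sum of a and b."""\n',
    "test": "assert add(2, 3) == 5\nassert add(-1, 1) == 0",
}

# A model's generated completion for the prompt (written by hand here).
completion = "    return a + b\n"

def passes(problem, completion):
    """Execute prompt + completion, then run the unit tests.

    Returns True if every assertion holds, i.e. the sample "passes".
    """
    namespace = {}
    try:
        exec(problem["prompt"] + completion, namespace)
        exec(problem["test"], namespace)
        return True
    except Exception:
        return False

# pass@1, the metric behind scores like 65.2 vs 85.4, is the fraction
# of problems whose single generated sample passes all of its tests.
print(passes(problem, completion))  # True for this correct completion
```

A score of 65.2 on this metric means roughly 65% of the 164 problems were solved by a passing sample, which is why the gap to GPT-4's 85.4 is significant in practice.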