This week, developer tooling vendor Replit unveiled a set of initiatives aimed at democratizing AI for developers. Replit’s cloud-based software development platform has over 20 million users, and the company has steadily enhanced its generative AI capabilities over the past year, including the introduction of the GhostWriter AI code completion tool and a strategic partnership with Google. So far, access to GhostWriter has been limited to a select group of Replit users, but that’s about to change.
As of October 9, Replit is incorporating GhostWriter directly into its core platform, making the generative AI code completion tool accessible to all of its users, a move the company is labeling “AI for all.” Alongside the GhostWriter integration, Replit also announced a new version of its purpose-built open source generative AI large language model (LLM) for coding, known as replit-code-v1.5-3b.
Replit’s open source coding LLM is positioned as a competitive alternative to the StarCoder LLM, jointly developed by ServiceNow and Hugging Face, as well as Meta’s Code Llama 7B.
Amjad Masad, CEO of Replit, emphasized the company’s accessibility mission during a live-streamed session at the AI Engineer Summit: “Our mission is to empower the next billion developers, and we didn’t want to create a world where some people have access to GhostWriter while others don’t.”
With this integration, Replit is shedding the GhostWriter name entirely and instead establishing AI as a core feature available to all users. According to Masad, Replit has users worldwide coding on various devices, including laptops and mobile phones. Now, all of these users can become AI-enhanced developers.
Masad believes that this deployment of AI-enhanced coding will be one of the largest in the world. He anticipates significant demands on both GPU and CPU resources.
Replit’s generative AI capabilities are not a thin layer on top of someone else’s LLM; rather, they are built on open source technology the company developed itself. Michele Catasta, VP of AI at Replit, explained during a live-streamed session at the AI Engineer Summit that the code completion feature relies on a bespoke large language model trained on open source code from GitHub as well as code developed by the Replit user community.
In May, Replit introduced the replit-code-v1-3b LLM; the newly unveiled replit-code-v1.5-3b significantly expands its capabilities, having been trained on 1 trillion tokens of code spanning 30 programming languages.
Catasta highlighted the importance of data quality in the model’s training, underscoring the meticulous curation work Replit has done. Hardware also played a vital role: training ran on 128 of Nvidia’s highly sought-after H100-80G GPUs. To Replit’s knowledge, this is the first openly released model announced to have been trained on H100s, a noteworthy milestone.