CodeMaker AI has demonstrated the capability of its fine-tuned AI model to autonomously recreate a 90,000-line software library with 91% similarity to the original. The CodeMaker AI system processed 3,251 files and generated 90,063 lines of code in 1 hour and 42 minutes, at a total cost of $265.73. The generated artifact has been published on GitHub (source). Estimates based on the COCOMO model suggest that manually writing such code would take approximately 25 years.
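To make the figures above easier to check, here is a minimal sketch of the arithmetic. The announcement does not say which COCOMO variant or coefficients were used, so the Basic COCOMO "organic" coefficients (a = 2.4, b = 1.05) are an assumption, and the function and variable names are illustrative only. Under that assumption the estimate comes out at a little under 23 person-years, the same order of magnitude as the roughly 25 years quoted.

```python
# Figures taken from the announcement.
LINES_GENERATED = 90_063
RUNTIME_MINUTES = 1 * 60 + 42   # 1 hour and 42 minutes
TOTAL_COST_USD = 265.73


def cocomo_basic_effort(kloc: float, a: float = 2.4, b: float = 1.05) -> float:
    """Effort in person-months using Basic COCOMO: a * KLOC ** b.

    The defaults are the standard "organic" mode coefficients. Which
    coefficients the announcement relied on is not stated (assumption).
    """
    return a * kloc ** b


# Estimated manual effort, converted from person-months to person-years.
effort_person_months = cocomo_basic_effort(LINES_GENERATED / 1000)
effort_person_years = effort_person_months / 12

# Throughput and unit cost of the generation run.
lines_per_minute = LINES_GENERATED / RUNTIME_MINUTES
cost_per_line_usd = TOTAL_COST_USD / LINES_GENERATED

print(f"Estimated manual effort: {effort_person_years:.1f} person-years")
print(f"Generation throughput:   {lines_per_minute:.0f} lines per minute")
print(f"Generation cost:         ${cost_per_line_usd:.4f} per line")
```

Choosing different coefficients changes the result considerably. The "semi-detached" mode (a = 3.0, b = 1.12), for example, yields a noticeably higher estimate, so the effort figure is best read as an order-of-magnitude comparison.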

This achievement was made possible through the development of a custom fine-tuning pipeline that enables the AI model to be trained on an entire codebase. The same pipeline allows users to fine-tune their own models, tailoring them to specific codebases, coding styles, and domains.

The primary goal of this experiment was to validate the accuracy and reliability of the fine-tuned models. These models can now be applied across a range of features, including automated code generation, code completion, and enhanced chat experiences.

This development builds on earlier experiments with smaller-scale Java codebases. Those efforts focused on recreating the implementation code of the open-source libraries Mockito (source) and JUnit (source).

CodeMaker AI, founded in March 2023 and based in Vancouver, BC, is a startup dedicated to providing innovative tools and automation solutions for software developers. The company's offerings focus on streamlining the writing, testing, and documentation of source code.

CodeMaker AI Inc.
Jakub Narloch
email: contact@codemaker.ai
website: https://codemaker.ai