dc.description.abstract | Despite significant advancements in Large Language Model (LLM) optimization techniques, challenges remain in applying these models to real-world software development tasks. Noble Saji Mathews and Meiyappan Nagappan's Tgen framework demonstrates that Test-Driven Development (TDD) can enhance the functionality and robustness of code generated by LLMs. Building on previous research, this study aims to extend the implementation of TDD in code generation, specifically for backend systems, in this case the Pertamina University scholarship evaluation system. Pertamina University manages various scholarships and over 400 awardees, necessitating an efficient scholarship management system. This research aims to evaluate how TDD enhances LLM-generated backend code and to explore test code variations that can maximize LLM potential. We gather functional requirements from stakeholders and the project manager, then create test cases based on these requirements. The study involves three analyses. The first analysis compares LLM-generated backend code using both unstandardized and standardized test code, where standardized test code follows specific implementation rules. The second analysis examines the ability of LLMs to use test cases to generate test code, which is then used to generate backend code; this is compared with backend code generated using human-written test code. The third analysis involves performance testing to compare LLM-generated backend code with human-written backend code, assessing how LLMs can potentially outperform manual code development. Code generated with TDD passes all given test cases with 100% accuracy, and TDD results improved by 28.5% when the standardized test code was used. Code generated from LLM-generated test code passes only 41% of the test cases, 59 percentage points lower than code generated from human-written test code. The results indicate that LLM-generated code demonstrates competitive performance in terms of throughput, response time, and error rate when compared with human-written code. Internal quality analysis was performed using static code analysis tools to compare the internal quality of human- and LLM-generated backend code; the analysis shows that LLM-generated code has zero issues in terms of security, reliability, and maintainability, while human-written code has five maintainability issues. These findings indicate that TDD implementation with well-written tests can improve the code generation capability of LLMs. However, the ability of LLMs to produce their own test cases and test code still falls short of expectations, making it unrealistic for programmers to rely solely on LLMs to generate backend code without any manual intervention. | en_US