Teaching Large Language Models to Self-Debug

Large language models have achieved impressive performance on code generation, but producing correct code in a single attempt remains challenging. To address this problem, the authors propose SELF-DEBUGGING, which teaches a large language model to debug its own predicted program via few-shot demonstrations. SELF-DEBUGGING achieves state-of-the-art performance on several code generation benchmarks, including the Spider dataset for text-to-SQL generation, TransCoder for C++-to-Python translation, and MBPP for text-to-Python generation. By leveraging execution feedback messages and reusing failed predictions, SELF-DEBUGGING notably improves sample efficiency and can match or outperform baseline models that generate more than 10 candidate programs.
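
To make the idea concrete, here is a minimal Python sketch of the generate-execute-refine cycle the paper describes: the model proposes a program, the program is run against unit tests, and on failure the failed prediction plus the execution feedback are fed back to the model for another attempt. This is an illustrative assumption of the loop's shape, not the paper's actual implementation: the `llm` callable, the prompt strings, and `MAX_DEBUG_TURNS` are all hypothetical stand-ins for the few-shot prompts and turn budgets used in the paper.

```python
import subprocess
import tempfile

MAX_DEBUG_TURNS = 3  # assumed budget; the paper evaluates several turn limits


def run_unit_tests(code: str, tests: str) -> tuple[bool, str]:
    """Execute the candidate program against unit tests and capture feedback.

    Returns (passed, feedback). The feedback string plays the role of the
    execution feedback message that self-debugging feeds back to the model.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code + "\n" + tests)
        path = f.name
    try:
        proc = subprocess.run(
            ["python", path], capture_output=True, text=True, timeout=10
        )
    except subprocess.TimeoutExpired:
        return False, "Execution timed out."
    return proc.returncode == 0, (proc.stderr or proc.stdout)


def self_debug(llm, task: str, tests: str) -> str:
    """Minimal self-debugging loop: generate, execute, refine on failure."""
    # Initial generation from the task description (few-shot prompt omitted).
    code = llm(f"Write a Python function for this task:\n{task}")
    for _ in range(MAX_DEBUG_TURNS):
        passed, feedback = run_unit_tests(code, tests)
        if passed:
            break  # accept the first program that passes all tests
        # Reuse the failed prediction: show the model its own program along
        # with the execution feedback and ask for a corrected version.
        code = llm(
            f"Task:\n{task}\n\nYour program:\n{code}\n\n"
            f"It failed with:\n{feedback}\n\nFix the program."
        )
    return code
```

The key point of the sketch is the second `llm` call: rather than sampling a fresh candidate from scratch, the failed program and its feedback message are reused as context, which is where the sample-efficiency gains come from.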