Python TDD with ChatGPT

By Christian Prokopp on 2023-02-02

Programming with ChatGPT using an iterative approach is difficult, as I have demonstrated previously. Maybe ChatGPT can benefit from test-driven development (TDD). Could it aid LLMs as it does humans?

Test taking

Previously, I wrote Python code with ChatGPT, starting simple and adding more detail with each step. That worked initially and then started to fall apart over time. Let us try a simpler example and define tests before asking ChatGPT to write code to pass them. Writing tests first is a good practice that helps many developers.

First, I outline the behaviour and ask for the appropriate tests.

> Define a number of pytests for a function called is_asin. The function is_asin takes a string and returns a true or false. It returns true if the passed parameter passes the definition of an ASIN (Amazon Standard Identification Number). An ASIN is an alphanumeric identifier, 10 characters long and always uppercase. Write all the unit tests we need for the function is_asin.

ChatGPT > Define a number of pytests for a function called is_asin

Not bad, but two tests are wrong, and I prefer the happy path test at the beginning.

> Remove the tests that return false for strings with only numbers or only characters. These are true and not false cases. Also move the test that returns true to the first place.

ChatGPT > Remove the tests that return false for strings with only numbers or only characters.

Better, let us have the unit test code.

> Write the code for the tests using pytest.

ChatGPT > Write the code for the tests using pytest.

One issue is that the happy path only uses letters. We need a better test input string.

> Update the string used in test_valid_asin to use both numbers and letters. Output all the tests.

ChatGPT > Update the string used in test_valid_asin to use both numbers and letters. Output all the tests.

Good, now we can generate the function code.

> Write the is_asin function to pass all the tests and use type annotation.

ChatGPT > Write the is_asin function to pass all the tests and use type annotation.

Good try, but this is not going to pass. Let us see if ChatGPT can fix it without further details.

> Do you have any specific recommendations to improve is_asin?

ChatGPT > Do you have any specific recommendations to improve is_asin?

The documentation is a good addition and probably something to ask for from the beginning. It should help ChatGPT to stay on track as we iterate. However, as expected, the happy path test fails when executed. Let us see if we can get it fixed by being vague.

> The test test_valid_asin fails. Can you fix is_asin?

ChatGPT > The test test_valid_asin fails. Can you fix is_asin?

It is still failing. Let us give some more detail in simple language, without wordsmithing it for ChatGPT's benefit.

> The test test_valid_asin fails because is_asin should check all characters to be numbers or upper case characters. But it checks if they are both at the same time which cannot be true. Can you fix is_asin?

ChatGPT > The test test_valid_asin fails because is_asin should check all characters to be numbers or upper case characters.
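To illustrate the kind of bug described in the prompt, here is a hypothetical reconstruction. ChatGPT's generated code is not reproduced here, so the function name is_asin_buggy, its body, and the sample string are assumptions. The point is only that requiring each character to be a digit and an uppercase letter at the same time can never succeed.

```python
def is_asin_buggy(candidate: str) -> bool:
    """Hypothetical reconstruction of the flawed check, not ChatGPT's actual code."""
    if len(candidate) != 10:
        return False
    # Bug: no single character is both a digit and an uppercase letter,
    # so this condition is False for every character.
    return all(ch.isdigit() and ch.isupper() for ch in candidate)


# Made-up sample that should count as valid under the article's definition.
print(is_asin_buggy("B08N5WRWNW"))  # False
print(is_asin_buggy("1234567890"))  # False
```

With a check like this, every non-empty input is rejected, which is why the happy path test fails while the negative tests still pass.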

It still fails. Let us state a solution instead to help ChatGPT.

> The test test_valid_asin fails because is_asin should check all characters to be either an upper case letter or a number. Fix it.

ChatGPT > The test test_valid_asin fails because is_asin should check all characters to be either an upper case letter or a number. Fix it.
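A minimal sketch of a function that satisfies the definition used in the prompts is shown below. It is not ChatGPT's output. Restricting the check to ASCII letters and digits is an assumption on my part, since Python's str.isupper and str.isdigit also accept some non-ASCII characters, and the sample strings are made up.

```python
import string


def is_asin(candidate: str) -> bool:
    """Return True if candidate is 10 characters long and each one is
    either an uppercase ASCII letter or a digit (assumed definition)."""
    if len(candidate) != 10:
        return False
    # Each character must be EITHER an uppercase letter OR a digit.
    return all(
        ch in string.ascii_uppercase or ch in string.digits
        for ch in candidate
    )


print(is_asin("B08N5WRWNW"))  # True
print(is_asin("b08n5wrwnw"))  # False
```

The only change in logic compared with a check that demands both properties at once is the "or" between the two character tests.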

Success.

The great part is that we can generate a good list of tests from a description and some decent code to pass them. Clearly, that can be useful for developers in the future. ChatGPT also needs precise language where possible, which forces the developer to reflect on how the problem breaks down and how it is described as part of the development work.

However, the gap for ChatGPT is that it lacks an understanding of what it generates. Even simple functions like this one trip it up, so the user needs the experience and the ability to understand the code in detail.

As it stands, ChatGPT can help expert users with simple tasks. The opportunity for a significant productivity boost lies in moving both dials, i.e. ideally helping inexpert users with complex tasks. The interesting question is whether this future is one, five, ten or more years away.


Christian Prokopp, PhD, is an experienced data and AI advisor and founder who has worked with Cloud Computing, Data and AI for decades, from hands-on engineering in startups to senior executive positions in global corporations. You can contact him at christian@bolddata.biz for inquiries.