Bing Chat argues and lies when it gets code wrong

By Christian Prokopp on 2023-02-11

Microsoft could follow Google, whose Bard demo blunder wiped around $100bn off Alphabet's market value. I tried the new ChatGPT-powered Bing Chat feature, which was great until it went disastrously wrong. It even argued with me while being wrong and made up source code.

Pinocchio

To try Bing Chat, you must join a waiting list, and then you are forced to use Edge. On the positive side, the experience is better than ChatGPT's. Bing Chat is snappier, shows what it searches in the backend, suggests ways to continue the conversation, and occasionally cites references supporting its code and claims. Or so I thought.

Bing Chat > Create a Python class AthenaQuery that caches the cursor object in the initialisation. The __init__ method should take the region, staging directory, and optional caching time. AthenaQuery also exposes a method for querying Athena which takes the query and database as the parameter.

I asked it to write a Python class to query Athena. The result looked good at first, but things went wrong when I asked it to stream results into a Feather file. Specifically, it used an 'append' flag with the 'pyarrow.feather.write_feather' method, a flag that appears neither in the documentation nor in the references it cited. One of those references used 'fastparquet.write()', which does have an append flag; it may have confused the two.
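
For context, here is a rough, hand-written sketch of the kind of class I asked for, using the PyAthena library. This is my own approximation, not Bing Chat's output; the per-database cursor cache and the unused caching-time parameter are my simplifications.

    # Hand-written sketch using PyAthena; NOT Bing Chat's output.
    from pyathena import connect


    class AthenaQuery:
        def __init__(self, region, staging_dir, cache_seconds=0):
            self.region = region
            self.staging_dir = staging_dir
            # Optional caching time from the prompt; unused in this sketch.
            self.cache_seconds = cache_seconds
            self._cursors = {}  # one cached cursor per database

        def query(self, query, database):
            # Create and cache a cursor for this database on first use.
            if database not in self._cursors:
                self._cursors[database] = connect(
                    region_name=self.region,
                    s3_staging_dir=self.staging_dir,
                    schema_name=database,
                ).cursor()
            cursor = self._cursors[database]
            cursor.execute(query)
            return cursor.fetchall()

    # Usage (bucket and region are illustrative):
    # AthenaQuery("eu-west-1", "s3://my-bucket/athena/").query("SELECT 1", "default")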

Bing Chat > save_feather() function with a temp file using append in write_feather

Bing Chat > Are you sure you can use append in the write_feather function?

When I gave Bing Chat a chance to correct itself, I was surprised when it wrote, "I am not wrong.", and backed its claim with source code that unknowingly proved it wrong. To top it off, when I asked where it got the code from, it correctly pointed me to the 'pyarrow/feather.py' source on GitHub. But the source code there differs completely from the code it showed me.
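
For reference, 'write_feather' in pyarrow at the time had roughly the signature 'write_feather(df, dest, compression=None, compression_level=None, chunksize=None, version=2)', with no 'append' anywhere. If you want to build a Feather file incrementally, one option is Arrow's record-batch writer, since Feather V2 is the Arrow IPC file format. A minimal sketch, with column names and data of my own invention:

    # Streaming record batches into a Feather V2 (Arrow IPC) file without
    # any (non-existent) append flag; names and data are illustrative.
    import pyarrow as pa
    import pyarrow.feather as feather

    schema = pa.schema([("id", pa.int64()), ("name", pa.string())])
    with pa.ipc.new_file("results.feather", schema) as writer:
        for chunk in ({"id": [1, 2], "name": ["a", "b"]},
                      {"id": [3], "name": ["c"]}):
            writer.write_batch(pa.RecordBatch.from_pydict(chunk, schema=schema))

    # Feather V2 files are Arrow IPC files, so this reads back normally.
    table = feather.read_table("results.feather")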

Bing Chat > You are wrong append is not a parameter in the write_feather signature.

Bing Chat > Where did you find this source code?

In summary, Bing Chat generated code for me. And:

  1. It invented a non-existent flag and feature in an open-source library.
  2. It refused to back down and argued it was right when given a chance to correct itself.
  3. Its proof (source code) for being right showed it was wrong.
  4. Worse, the proof was made up, and the referenced source is entirely different.

That is devastating. It produced incorrect code, failed to understand its mistake, and faked source code while citing a reference that contradicts it. It did everything it could to throw me off and get things wrong.


Christian Prokopp, PhD, is an experienced data and AI advisor and founder who has worked with cloud computing, data and AI for decades, from hands-on engineering in startups to senior executive positions in global corporations. You can contact him at christian@bolddata.biz for inquiries.