How to create a Data Dictionary using ChatGPT

By Christian Prokopp on 2023-02-03

ChatGPT can combine data with natural language and has extensive information about most subjects. That lends itself to novel applications such as creating informative data dictionaries.


Let us ask ChatGPT for a public dataset we can use for this how-to.

> List public CSV datasets with links that I could use to demonstrate your ability to create a data dictionary from a CSV file.

(Screenshot: ChatGPT's response to the request for public CSV datasets.)

Next, I downloaded one CSV file from the wine dataset and took a sample. ChatGPT can easily create a simple data dictionary table from it. With a more carefully worded prompt, though, it can make some valuable additions. For example, we can ask for SQL types, units of measure, descriptions expanded by ChatGPT's general know-how, and a summary for the table.
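Taking the sample can be scripted. The following is a minimal sketch using only Python's standard library; the function name `sample_csv`, the semicolon delimiter, and the rows in `RAW` are illustrative assumptions, not the exact file or values used for this article.

```python
import csv
import io


def sample_csv(text: str, n_rows: int = 5, delimiter: str = ",") -> str:
    """Return the header plus the first n_rows data rows as CSV text."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    for i, row in enumerate(reader):
        if i > n_rows:  # row 0 is the header, so stop after n_rows data rows
            break
        writer.writerow(row)
    return out.getvalue()


# Illustrative rows only (assumed column names and values).
RAW = (
    "fixed acidity;pH;alcohol;quality\n"
    "7.4;3.51;9.4;5\n"
    "7.8;3.20;9.8;5\n"
    "7.8;3.26;9.8;5\n"
)

print(sample_csv(RAW, n_rows=2, delimiter=";"))
```

A handful of rows is usually enough for the column names and value formats to come through while keeping the prompt short.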

> Create a data dictionary from the wine quality dataset for the red wine quality. Add a column for SQL data types and favour DECIMAL over FLOAT. Add a column for the Unit of Measure. Create description fields using your knowledge of red wine for each column with at least two sentences each. Make them sound natural and not repetitive. Precede the data dictionary table with a summary paragraph for data users.
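If you want to reuse this prompt for other tables, it can be assembled from a template. This is a rough sketch: `build_prompt` and its parameters are hypothetical names, and appending the CSV sample to the prompt is an assumption about how the sample reaches ChatGPT.

```python
def build_prompt(table_name: str, csv_sample: str, min_sentences: int = 2) -> str:
    """Assemble a data dictionary prompt followed by the CSV sample."""
    instructions = [
        f"Create a data dictionary for the {table_name} table from the CSV sample below.",
        "Add a column for SQL data types and favour DECIMAL over FLOAT.",
        "Add a column for the Unit of Measure.",
        "Write a natural, non-repetitive description of at least "
        f"{min_sentences} sentences for each column.",
        "Precede the data dictionary table with a summary paragraph for data users.",
    ]
    # The sample goes last so the instructions are read first.
    return " ".join(instructions) + "\n\nCSV sample:\n" + csv_sample.strip() + "\n"


# Illustrative sample only (assumed column names and values).
print(build_prompt("red wine quality", "pH;quality\n3.51;5\n"))
```

Keeping each requirement as a separate sentence makes it easy to add or drop one, such as the unit of measure column, without rewriting the whole prompt.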

(Screenshots 1/3 to 3/3: ChatGPT's response to the data dictionary request.)

The output is remarkable. ChatGPT added three of the four columns using context and its knowledge base. Naturally, you would want to verify the details to ensure they fit your purpose, but it is an impressive first draft.
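One way to verify the SQL types is to infer them independently from the sample and compare. The sketch below is a simple heuristic, not a definitive rule: `infer_sql_type` is a hypothetical helper, the values are illustrative, and the precision it reports only covers the rows it has seen.

```python
from decimal import Decimal, InvalidOperation


def infer_sql_type(values: list[str]) -> str:
    """Guess a SQL type from sample values, preferring DECIMAL over FLOAT."""
    if not values:
        raise ValueError("need at least one sample value")
    try:
        for v in values:
            int(v)
        return "INTEGER"
    except ValueError:
        pass  # at least one value is not a whole number
    try:
        numbers = [Decimal(v) for v in values]
    except InvalidOperation:
        numbers = []  # at least one value is not numeric at all
    if not numbers or not all(n.is_finite() for n in numbers):
        return f"VARCHAR({max(len(v) for v in values)})"
    # Scale: most digits seen after the decimal point.
    scale = max(max(0, -n.as_tuple().exponent) for n in numbers)
    # Whole digits: most digits seen before the decimal point (at least 1).
    whole = max(
        max(1, len(n.as_tuple().digits) + n.as_tuple().exponent) for n in numbers
    )
    return f"DECIMAL({whole + scale},{scale})"


# Illustrative values only.
print(infer_sql_type(["7.4", "7.8", "11.2"]))
print(infer_sql_type(["5", "6", "5"]))
```

Where the inferred type and ChatGPT's suggestion disagree, the column is worth a closer look against the full dataset before the dictionary is published.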


Christian Prokopp, PhD, is an experienced data and AI advisor and founder who has worked with Cloud Computing, Data and AI for decades, from hands-on engineering in startups to senior executive positions in global corporations. You can contact him at christian@bolddata.biz for inquiries.