By Christian Prokopp on 2023-02-03
ChatGPT can combine data with natural language and has extensive information about most subjects. That lends itself to novel applications such as creating informative data dictionaries.
Let us ask ChatGPT for a public dataset we can use for this how-to.
> List public CSV datasets with links that I could use to demonstrate your ability to create a data dictionary from a CSV file.
Next, I downloaded one CSV file from the wine dataset and took a sample. ChatGPT can easily create a simple data dictionary table from it. But if we expand the question with some thought, it can make some valuable additions. For example, we can add SQL types, units of measure, descriptions expanded by ChatGPT's general know-how, and a summary for the table.
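Taking a sample keeps the input small enough to paste into a chat. The following is a minimal sketch of one way to do it with Python's standard library; it is not necessarily how the sample for this post was drawn. The function name `sample_csv`, the semicolon delimiter, and the demo rows are assumptions for illustration only.

```python
import csv
import io
import random


def sample_csv(text, n_rows=10, delimiter=",", seed=0):
    """Return the header plus up to n_rows randomly chosen data rows as CSV text."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header, data = rows[0], rows[1:]
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    picked = rng.sample(data, min(n_rows, len(data)))
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(picked)
    return out.getvalue()


# Tiny made-up stand-in for the downloaded file (values are illustrative only).
demo = "fixed acidity;pH;quality\n7.4;3.51;5\n7.8;3.20;5\n11.2;3.16;6\n"
print(sample_csv(demo, n_rows=2, delimiter=";"))
```

For a real file, read its contents with `open(path, newline="").read()` and pass the delimiter the file actually uses.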
> Create a data dictionary from the wine quality dataset for the red wine quality. Add a column for SQL data types and favour DECIMAL over FLOAT. Add a column for the Unit of Measure. Create description fields using your knowledge of red wine for each column with at least two sentences each. Make them sound natural and not repetitive. Precede the data dictionary table with a summary paragraph for data users.
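A prompt like this is easier to adjust when each request is kept as a separate item. The sketch below assembles the prompt above from its parts and appends a CSV sample as context. The helper `build_prompt`, the "CSV sample:" label, and the demo rows are hypothetical, and whether the sample was pasted alongside the prompt in the original session is an assumption.

```python
def build_prompt(csv_sample, requests):
    """Join the individual requests and append the CSV sample as context."""
    instruction = " ".join(r.strip() for r in requests)
    return "\n".join([instruction, "", "CSV sample:", csv_sample.strip()])


# The requests are the sentences of the prompt above, one per item.
requests = [
    "Create a data dictionary from the wine quality dataset for the red wine quality.",
    "Add a column for SQL data types and favour DECIMAL over FLOAT.",
    "Add a column for the Unit of Measure.",
    "Create description fields using your knowledge of red wine for each column "
    "with at least two sentences each.",
    "Make them sound natural and not repetitive.",
    "Precede the data dictionary table with a summary paragraph for data users.",
]

# Made-up sample rows, for illustration only.
demo_sample = "fixed acidity;pH;quality\n7.4;3.51;5\n7.8;3.20;5\n"
print(build_prompt(demo_sample, requests))
```

Keeping the requests in a list makes it simple to drop or reorder one and compare how the output changes.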
The output is remarkable. ChatGPT added three of the four columns using context and its knowledge base. Naturally, you would want to verify the details to ensure they fit your purpose, but it is an impressive first draft.
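One detail that is simple to verify mechanically is whether a suggested `DECIMAL(precision, scale)` type can hold the values in the file. The following is a minimal sketch, assuming finite numeric strings. The helpers `fits_decimal` and `misfits`, the proposed type, and the sample values are hypothetical and not taken from ChatGPT's output.

```python
from decimal import Decimal


def fits_decimal(value, precision, scale):
    """True if value fits DECIMAL(precision, scale) without losing digits."""
    d = Decimal(value)
    frac_digits = max(0, -d.as_tuple().exponent)  # digits after the point
    whole = abs(int(d))
    int_digits = len(str(whole)) if whole else 0  # digits before the point
    return frac_digits <= scale and int_digits <= precision - scale


def misfits(values, precision, scale):
    """Return the values that do not fit the proposed type."""
    return [v for v in values if not fits_decimal(v, precision, scale)]


# Hypothetical check: would DECIMAL(4, 2) hold these illustrative values?
values = ["7.4", "3.51", "0.9978"]
print(misfits(values, 4, 2))  # prints ['0.9978']
```

Running such a check against the full column rather than a sample would catch outliers that the sample never showed.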
Christian Prokopp, PhD, is an experienced data and AI advisor and founder who has worked with Cloud Computing, Data and AI for decades, from hands-on engineering in startups to senior executive positions in global corporations. You can contact him at firstname.lastname@example.org for inquiries.