Fix AttributeError: 'DatasetDict' object has no attribute 'to' in Hugging Face Datasets

If you load a dataset and try to export it with “to_json” or “to_csv”, or read its “features”, you may get this error:

AttributeError: 'DatasetDict' object has no attribute 'to'

The first step is to check the structure of the dataset by printing it:

DatasetDict({
    train: Dataset({
        features: ['id', 'category', 'instruction', 'context', 'response'],
        num_rows: 15015
    })
})
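If you would rather check this in code than read the printed output, here is a minimal sketch. It assumes only that a DatasetDict behaves like a regular Python dict keyed by split name; `split_names` is a hypothetical helper, not part of the library.

```python
def split_names(ds):
    """Return the split names if `ds` is dict-like, else an empty list.

    A DatasetDict is a dict of splits, so its keys are names such as
    "train". A single Dataset is not a dict, so it yields [].
    """
    if isinstance(ds, dict):
        return list(ds.keys())
    return []
```

A non-empty result means you are holding a DatasetDict and have to pick one of those keys before exporting.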

In this case, the DatasetDict has only a train split (usually there are others too, such as test or validation). To export it to JSON, call the method on that split:

ds['train'].to_json('train.json')

So yes: select a split by its key in the DatasetDict first, then call the method on the resulting Dataset.
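When a DatasetDict has several splits, the same idea can be applied in a loop. The sketch below is one possible way to do it, not an official API: `export_splits` and the `<split>.json` file naming are assumptions made for illustration, and the function relies only on each split having a `to_json(path)` method.

```python
from pathlib import Path


def export_splits(dataset_dict, out_dir="."):
    """Write each split to <out_dir>/<split>.json and return the paths.

    `dataset_dict` is any dict-like object whose values expose
    `to_json(path)`, which is how a DatasetDict of Dataset objects behaves.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = {}
    for split_name, split in dataset_dict.items():
        path = out_dir / f"{split_name}.json"
        # The export method is called on the split, not on the DatasetDict.
        split.to_json(str(path))
        paths[split_name] = path
    return paths
```

With the dataset from this post, `export_splits(ds, "out")` would write `out/train.json`. Swapping `to_json` for `to_csv` follows the same pattern.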
