Datasetbuilder Local Download FileNotFoundError #7001

Open
purefall opened this issue Jun 25, 2024 · 1 comment
purefall commented Jun 25, 2024

Describe the bug

I was trying to download a dataset and save it as Parquet, following the Hugging Face tutorial. However, during execution I hit a FileNotFoundError.

I debugged the code and it looks like there is a bug:
the builder first creates a .incomplete folder, and before its contents are moved, the following code deletes the directory:
Code
As a result, I get:

FileNotFoundError: [Errno 2] No such file or directory: '~/data/Parquet/.incomplete '

Steps to reproduce the bug

from datasets import load_dataset_builder
from pathlib import Path

parquet_dir = "~/data/Parquet/" 
Path(parquet_dir).mkdir(parents=True, exist_ok=True)
builder = load_dataset_builder(
    "rotten_tomatoes",
)
builder.download_and_prepare(parquet_dir, file_format="parquet")

Expected behavior

Downloads the files and saves as parquet

Environment info

Ubuntu,
Python 3.10

datasets 2.19.1

purefall (Author) commented

OK, it seems the solution is to pass the directory string without the trailing "/", which in my case is:

parquet_dir = "~/data/Parquet"

Still, I think this is weird behavior...
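
For anyone hitting the same error, here is a minimal sketch of the workaround. It assumes, judging only from the path in the error message, that the temporary directory name is formed by appending ".incomplete" to the output directory string, so a trailing "/" places it inside the target directory instead of next to it. That is a guess, not confirmed against the datasets source. `normalize_output_dir` is a hypothetical helper, not part of datasets. Note also that pathlib does not expand "~" on its own, so the sketch calls `expanduser()` explicitly.

```python
from pathlib import Path


def normalize_output_dir(path_str: str) -> str:
    """Hypothetical helper: expand "~" and drop any trailing path separator."""
    # Path() discards trailing slashes; expanduser() resolves a leading "~".
    return str(Path(path_str).expanduser())


with_slash = "~/data/Parquet/"
without_slash = "~/data/Parquet"

# Assumption: the temporary directory is the output string plus ".incomplete".
# With a trailing slash the result is a child of the output directory;
# without it, the result is a sibling.
print(with_slash + ".incomplete")
print(without_slash + ".incomplete")

# Both spellings normalize to the same absolute path with no trailing slash.
parquet_dir = normalize_output_dir(with_slash)
print(parquet_dir == normalize_output_dir(without_slash))
```

The normalized `parquet_dir` can then be passed to `Path(parquet_dir).mkdir(parents=True, exist_ok=True)` and `builder.download_and_prepare(parquet_dir, file_format="parquet")` as in the reproduction steps above.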
