See how easy it is to get some images from this zoo of different datasets
Published
October 18, 2021
I want to try using YOLO + CLIP for object detection. To do this I need images that have bounding boxes around the objects in them. Normally the Open Images dataset is good for this, but I haven’t downloaded it recently and I have less than 500 GB of download allowance left for the month, so it would be hard to fit that in and still use the internet normally.
While it’s possible to download the images using AWS, there is also the FiftyOne dataset zoo. I’m going to trigger the download for some of the images while I check out the zoo. If you want to download from AWS you can find the URLs here.
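One way to keep a zoo download within a bandwidth budget is to ask for only a capped slice of Open Images. The sketch below assumes the zoo dataset name `open-images-v6` and its `split`, `label_types` and `max_samples` options; the helper names and the sample count are made up for illustration, and this is not necessarily the download used in this post.

```python
def open_images_subset_kwargs(max_samples=100):
    """Keyword arguments that limit the zoo download to a small detection subset."""
    return {
        "split": "validation",          # avoid the much larger train split
        "label_types": ["detections"],  # only fetch bounding box labels
        "max_samples": max_samples,     # cap on the number of images fetched
    }


def load_open_images_subset(max_samples=100):
    # Imported lazily so the helper above works without fiftyone installed.
    import fiftyone.zoo as foz

    return foz.load_zoo_dataset(
        "open-images-v6", **open_images_subset_kwargs(max_samples)
    )


# Hypothetical sample count, chosen only for illustration.
kwargs = open_images_subset_kwargs(max_samples=50)
print(kwargs)
```

Calling `load_open_images_subset(50)` would then start the actual download, so it is left out of the snippet.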
During that download I’m going to look at the library. One immediate problem is that the latest version of kaleido fails to install on Python 3.9. I can’t find any ticket about this, so I’ve resorted to downgrading it to version 0.2.1. After that the library installed successfully.
#hide_output
import fiftyone.zoo as foz
dataset = foz.load_zoo_dataset("quickstart")
dataset.persistent = True
Dataset already downloaded
Loading existing dataset 'quickstart'. To reload from disk, either delete the existing dataset or provide a custom `dataset_name` to use
I kinda hate libraries that put things in my home folder. So that’s already a black mark against this. I want to be able to customize the download location.
#hide_output
import fiftyone.zoo as foz
dataset = foz.load_zoo_dataset("quickstart")
dataset.persistent = True
Dataset already downloaded
Loading existing dataset 'quickstart'. To reload from disk, either delete the existing dataset or provide a custom `dataset_name` to use
I’ve opened a bug about this as it directly contradicts the documented behaviour. The configuration modification is documented here. The specific field is documented here. I have updated the field and the update has applied, yet it is ignored by the dataset.
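For reference, here is a minimal sketch of the kind of configuration change involved. It assumes the field in question is `dataset_zoo_dir` and uses the `FIFTYONE_CONFIG_PATH` and `FIFTYONE_DATASET_ZOO_DIR` environment variables; the paths and the helper name are hypothetical. It only writes the settings and does not show whether the library honours them, which is the behaviour in question here.

```python
import json
import os
import tempfile
from pathlib import Path


def write_fiftyone_config(config_path: Path, zoo_dir: Path) -> dict:
    """Write a FiftyOne-style JSON config that points the zoo at zoo_dir."""
    config = {"dataset_zoo_dir": str(zoo_dir)}  # assumed field name
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2))
    return config


# Hypothetical locations outside the home folder, for illustration only.
base = Path(tempfile.mkdtemp())
config_path = base / "fiftyone" / "config.json"
zoo_dir = base / "data" / "fiftyone-zoo"

written = write_fiftyone_config(config_path, zoo_dir)

# Set these before importing fiftyone, since the config is read at import time
# (an assumption about the library, worth checking against its documentation).
os.environ["FIFTYONE_CONFIG_PATH"] = str(config_path)
os.environ["FIFTYONE_DATASET_ZOO_DIR"] = str(zoo_dir)

print(config_path.read_text())
```

After this, `import fiftyone as fo` followed by `print(fo.config.dataset_zoo_dir)` would be one way to confirm the value was picked up.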