DataLoader and DataSets
PyTorch provides helper classes for loading, shuffling, and augmenting data. This section covers how they work.
Data loading in PyTorch can be separated into 2 parts:
The data must be wrapped in a class that inherits from the Dataset parent class and overrides the __getitem__ and __len__ methods. Note that at this point the data is not loaded into memory; PyTorch loads only what is needed.
A DataLoader then actually reads the data and puts it into memory.
The example shown here is going to be used to load data from our driverless car demo.
Dataset parent class
Let's create a class that inherits from the Dataset class. It provides methods to fetch a single item and to report the number of items, but it does not load the whole dataset into memory.
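A minimal sketch of such a class is shown here. The original code for the driverless car demo is not reproduced, so the class name `DrivingDataset`, the helper `make_fake_frames`, and the one-tensor-file-per-frame layout are all assumptions made for illustration. The point is that `__init__` stores only file paths and labels, while `__getitem__` reads one frame from disk when it is asked for.

```python
import os
import tempfile

import torch
from torch.utils.data import Dataset


class DrivingDataset(Dataset):
    """Hypothetical dataset: one saved tensor file per camera frame, plus a steering angle."""

    def __init__(self, frame_paths, steering_angles):
        # Only the file paths and labels are kept here, not the frames themselves.
        self.frame_paths = list(frame_paths)
        self.steering_angles = list(steering_angles)

    def __len__(self):
        # Number of items in the dataset.
        return len(self.frame_paths)

    def __getitem__(self, index):
        # The frame is read from disk only when this index is requested.
        frame = torch.load(self.frame_paths[index])
        angle = torch.tensor(self.steering_angles[index], dtype=torch.float32)
        return frame, angle


def make_fake_frames(directory, count):
    """Write `count` random 3x8x8 tensors to disk as stand-ins for camera frames."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"frame_{i}.pt")
        torch.save(torch.rand(3, 8, 8), path)
        paths.append(path)
    return paths


# Build a small fake dataset in a temporary directory (illustration only).
tmp_dir = tempfile.mkdtemp()
paths = make_fake_frames(tmp_dir, 10)
angles = [0.1 * i for i in range(10)]

dataset = DrivingDataset(paths, angles)
frame, angle = dataset[3]
print(len(dataset), frame.shape, angle.item())
```

Indexing with `dataset[3]` calls `__getitem__(3)`, and `len(dataset)` calls `__len__`; these are the only two methods a DataLoader needs from a map-style dataset.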
Instantiating the dataset and passing it to the DataLoader
PyTorch now handles the shuffling and the loading of your data for you. Loading can run in parallel worker processes when num_workers is greater than 0.
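As a rough illustration, the sketch below wraps a toy dataset in a `DataLoader` and iterates over shuffled batches. `SquaresDataset` is a hypothetical stand-in, not the dataset from the demo. `num_workers=0` keeps loading in the main process so the snippet runs anywhere; a positive value would start that many worker processes instead.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class SquaresDataset(Dataset):
    """Toy dataset (hypothetical): item i is the pair (i, i*i), computed on demand."""

    def __init__(self, size):
        self.size = size

    def __len__(self):
        return self.size

    def __getitem__(self, index):
        x = torch.tensor([float(index)])
        y = torch.tensor([float(index * index)])
        return x, y


dataset = SquaresDataset(10)

# shuffle=True draws the indices in a new random order on every epoch.
# num_workers=0 loads in the main process; a positive value uses worker processes.
loader = DataLoader(dataset, batch_size=4, shuffle=True, num_workers=0)

batch_sizes = []
seen = []
for x_batch, y_batch in loader:
    # Each batch stacks individual items along a new first dimension.
    batch_sizes.append(x_batch.shape[0])
    seen.extend(x_batch.flatten().tolist())

print(batch_sizes)
```

With 10 items and a batch size of 4, the last batch is smaller than the others. Passing `drop_last=True` would discard it instead.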
Transformation
PyTorch also has a mechanism to apply simple transformations to images, provided by the torchvision.transforms package.
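The usual pattern is to hand the dataset a callable and apply it inside `__getitem__`. The sketch below shows that pattern with hand-written callables so that it depends only on `torch`. `Compose`, `horizontal_flip`, `scale_to_unit`, and `FrameDataset` are illustrative names defined here, not library code. In practice `torchvision.transforms` offers ready-made equivalents that plug into the same `transform` argument.

```python
import torch
from torch.utils.data import Dataset


class Compose:
    """Illustrative helper: apply a list of callables one after another."""

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, image):
        for transform in self.transforms:
            image = transform(image)
        return image


def horizontal_flip(image):
    # Mirror a C x H x W tensor along its last (width) dimension.
    return torch.flip(image, dims=[-1])


def scale_to_unit(image):
    # Convert uint8 pixel values in [0, 255] to float32 values in [0, 1].
    return image.float() / 255.0


class FrameDataset(Dataset):
    """Hypothetical dataset that applies an optional transform to each item."""

    def __init__(self, frames, transform=None):
        self.frames = frames
        self.transform = transform

    def __len__(self):
        return len(self.frames)

    def __getitem__(self, index):
        image = self.frames[index]
        if self.transform is not None:
            # The transform runs each time an item is fetched.
            image = self.transform(image)
        return image


# Two tiny 1 x 3 x 4 "images" with pixel values 0, 20, ..., 220.
frames = [torch.arange(12, dtype=torch.uint8).reshape(1, 3, 4) * 20 for _ in range(2)]

dataset = FrameDataset(frames, transform=Compose([horizontal_flip, scale_to_unit]))
transformed = dataset[0]
print(transformed.dtype, transformed.shape)
```

Because the transform runs inside `__getitem__`, a random augmentation would produce a different result each time the same index is fetched, which is how augmentation varies across epochs.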