PyTuna

  • Team: Jonathan Ko, Pranshu Chaturvedi, Andrew Chen, Kanav Kalucha, Vicki Xu

  • PyTorch Summer Hackathon, July-August 2020

  • Judging in progress

SUMMARY

In the past decade, we've made massive strides in model accuracy and architecture design, to the point where most people can apply an out-of-the-box model to a dataset in under half an hour. Even so, out of the 24 million programmers in the world, just 300,000 are proficient in AI. One reason is that a model's effectiveness is gated by how well its data is prepared, and the preprocessing step is where many programmers, beginners and experts alike, struggle the most. As any programmer knows, "garbage in, garbage out": a model is only as good as the data that goes in.

However, there are few hard-and-fast rules or workflows to follow when preprocessing a dataset, especially with images. There are so many options to choose from — augmentation, normalization, object detection, Gaussian blur — and little consensus on how to use them.

PyTuna is a PyTorch framework that acts as a wizard to guide data scientists through preprocessing any image dataset. Under the hood, PyTuna has a convolutional neural network trained on 50 world-class image datasets, paired with the preprocessing strategies used for each of them by the most frequently cited research papers and the most successful Kaggle notebooks. Given any dataset, PyTuna first samples a representative subset. Then, it uses this network to predict a set of preprocessing techniques well-suited to the dataset. Next, the PyTuna wizard walks through each technique with the user, automating, educating, and informing at each step. PyTuna makes the case for each technique, but the user gets the final say. Finally, PyTuna returns a list of preprocessed images in PyTorch tensor format, ready to be fed into a model.
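The workflow above can be sketched as a short pipeline. Note that `sample_subset`, `predict_techniques`, and `run_wizard` are hypothetical names standing in for PyTuna's internals, and the predictor is stubbed out rather than a real ConvNet.

```python
import random
import torch

def sample_subset(dataset, k=100):
    """Step 1: draw a representative random sample from the dataset."""
    return random.sample(dataset, min(k, len(dataset)))

def predict_techniques(subset):
    """Step 2: stand-in for PyTuna's ConvNet, which would inspect the
    sampled images and suggest preprocessing techniques. Here it just
    returns a fixed, illustrative list."""
    return ["normalize", "horizontal_flip", "gaussian_blur"]

def run_wizard(images, techniques, approve=lambda t: True):
    """Steps 3-4: walk through each suggested technique, letting the
    user accept or reject it (``approve`` models the user's final say),
    then return the images as PyTorch tensors. The actual preprocessing
    is elided; each image is simply converted to a float tensor."""
    applied = [t for t in techniques if approve(t)]
    tensors = [torch.as_tensor(img, dtype=torch.float32) for img in images]
    return tensors, applied

# Usage: a toy "dataset" of 8x8 single-channel images as nested lists.
dataset = [[[0.0] * 8 for _ in range(8)] for _ in range(10)]
subset = sample_subset(dataset, k=5)
suggested = predict_techniques(subset)
tensors, applied = run_wizard(dataset, suggested)
```

The key design point is the `approve` hook: PyTuna suggests and explains, but every technique passes through the user before it is applied.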