Date of Graduation

5-2019

Level of Access

Open Access Thesis

Department or Program

Computer Science

First Advisor

Eric Chown

Abstract

Isolation-Based Scene Generation (IBSG) is a process for creating synthetic datasets made to train machine learning detectors and classifiers. In this project, we formalize the IBSG process and describe the scenarios—object detection and object classification given audio or image input—in which it can be useful. We then look at the Stanford Street View House Number (SVHN) dataset and build several different IBSG training datasets based on existing SVHN data. We try to improve the compositing algorithm used to build the IBSG dataset so that models trained with synthetic data perform as well as models trained with the original SVHN training dataset. We find that the SVHN datasets that perform best are composited from isolations extracted from existing training data, leading us to suggest that IBSG be used in situations where a researcher wants to train a model with only a small amount of real, unlabeled training data.

Available for download on Tuesday, May 26, 2020
