Adversarial Structural Learning: Approximating Training Data for Multi-Variate Predictions
In this thesis, we address two important characteristics of prediction tasks in many real-world problems by developing an adversarial classification framework. The first is structured data, a feature of many real-world applications (e.g., computer vision, bioinformatics, natural language processing) that requires structured prediction methods to deal with interrelated variables. The second is the lack of annotated data, which is associated with unreliable accuracy and high annotation cost. We first introduce adversarial data augmentation (ADA) for object detection to demonstrate the adversarial approach and its benefits for computer vision without employing structural constraints on the output. Instead, the adversarial distribution over detected objects is shaped entirely by the evaluation performance measure. Our approach avoids the non-convexity of empirical risk minimization for loss functions specialized for computer vision tasks (e.g., overlap over 70% being treated as correct), while providing strong theoretical guarantees (e.g., Fisher consistency). We find significant improvements in efficiency and predictive performance compared with other methods across different deep architectures and features. As a next step, we study two structured prediction tasks: multi-label classification and bipartite matching. For the first task, we consider learning in edge-weighted graphs by proposing an adversarial robust cut framework. We explain two applications of our approach, in supervised multi-label classification and in semi-supervised binary classification, and find better prediction performance, a tighter loss bound, and better time efficiency compared with state-of-the-art methods. Next, as our third contribution, we investigate modeling bipartite matching problems in our adversarial framework. We apply our approach to a video tracking application and demonstrate the efficiency and Fisher consistency of our method.
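As an illustration of the thresholded overlap criterion mentioned above for object detection, the following minimal sketch treats a predicted box as correct when its overlap with the ground-truth box exceeds 70%. It assumes that overlap is measured as intersection over union (IoU) on axis-aligned boxes given as (x1, y1, x2, y2); the function names `iou` and `overlap_loss` are hypothetical and are not taken from the thesis.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2).

    Assumption: IoU is the overlap measure; the abstract only says "overlap".
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the intersection rectangle (zero if disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def overlap_loss(predicted, truth, threshold=0.7):
    """Zero-one loss: 0 if the overlap exceeds the threshold, 1 otherwise."""
    return 0.0 if iou(predicted, truth) > threshold else 1.0
```

Because this loss is a step function of the predicted box, minimizing it directly over training data is non-convex, which is the difficulty the abstract refers to.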
As the third part of this thesis, we leverage adversarial active learning for structured prediction problems to address the second characteristic, the lack of annotated data. We employ our adversarial structured prediction frameworks (adversarial robust cut and adversarial bipartite matching) and apply active learning using uncertainty sampling heuristics. Finally, we target the zero-shot learning problem, in which the classes that appear at test time are absent during training. We achieve better performance than comparison methods (e.g., structural support vector machines, conditional random fields) by leveraging adversarial robust cuts and a hierarchical feature representation.
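The uncertainty sampling heuristic mentioned above can be sketched in its simplest generic form: rank unlabeled examples by the entropy of the model's predicted distribution and request annotations for the most uncertain ones. This sketch assumes each example comes with a plain probability vector over labels; the thesis applies the heuristic to structured predictors, whose uncertainty measure may differ. The names `entropy` and `select_most_uncertain` are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(predictions, k):
    """Return indices of the k predictions with the highest entropy.

    Assumption: `predictions` is a list of per-example probability vectors.
    """
    ranked = sorted(
        range(len(predictions)),
        key=lambda i: entropy(predictions[i]),
        reverse=True,
    )
    return ranked[:k]


# Three unlabeled examples with binary predictive distributions.
pool = [[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]]
query = select_most_uncertain(pool, k=2)  # indices to send for annotation
```

In an active learning loop, the selected examples would be labeled, added to the training set, and the model retrained before the next selection round.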
Subject: Machine Learning, Structured prediction