Machine Learning with Limited Data

Pattern-Exploiting Training

@article{schick2020exploiting,
  title={Exploiting Cloze Questions for Few-Shot Text Classification and Natural Language Inference},
  author={Timo Schick and Hinrich Schütze},
  journal={Computing Research Repository},
  volume={arXiv:2001.07676},
  url={http://arxiv.org/abs/2001.07676},
  year={2020}
}

@article{schick2020small,
  title={It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners},
  author={Timo Schick and Hinrich Schütze},
  journal={Computing Research Repository},
  volume={arXiv:2009.07118},
  url={http://arxiv.org/abs/2009.07118},
  year={2020}
}

Learning with Limited Data

Good machine learning depends heavily on good data. A few more good data points can be worth more than billions of extra model parameters.

However, sometimes we need to train models when data is limited. There are a number of strategies that we can try.

Zero-Shot and Few-Shot Learning

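Pattern-Exploiting Training (PET) is a way to train text classifiers from a small number of labeled examples. It rephrases each input as a cloze question that a pretrained language model can answer, fine-tunes models on the few labeled examples, and then uses those models to assign soft labels to unlabeled text for training a final classifier. Because of that last step, it can also be viewed as a form of synthetic data generation.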
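
To make the cloze idea concrete, here is a minimal sketch of a pattern and a verbalizer, the two ingredients described in the Schick and Schütze papers cited above. Everything named here is an assumption for illustration: the pattern wording, the label words, and especially `toy_mask_scores`, which is a keyword-counting stand-in for a real masked language model and not part of any library.

```python
import math

def pattern(text: str) -> str:
    """Illustrative pattern: wrap the input in a cloze question with one masked slot."""
    return f"{text} All in all, it was [MASK]."

# Illustrative verbalizer: each label maps to one word that could fill the mask.
VERBALIZER = {"positive": "great", "negative": "terrible"}

def toy_mask_scores(cloze: str) -> dict:
    """Stand-in for a masked language model (an assumption, not a real model).

    Returns a raw score per candidate word for the [MASK] slot,
    using a crude keyword count over the cloze text.
    """
    words = [w.strip(".,!") for w in cloze.lower().split()]
    good = sum(w in {"loved", "excellent", "fun"} for w in words)
    bad = sum(w in {"boring", "awful", "hated"} for w in words)
    return {"great": float(good), "terrible": float(bad)}

def label_distribution(text: str, score_fn=toy_mask_scores) -> dict:
    """Turn mask-word scores into a probability distribution over labels."""
    scores = score_fn(pattern(text))
    logits = {label: scores[word] for label, word in VERBALIZER.items()}
    z = sum(math.exp(v) for v in logits.values())
    return {label: math.exp(v) / z for label, v in logits.items()}

dist = label_distribution("I loved it, excellent and fun.")
```

In a real setup, `score_fn` would query a pretrained masked language model, and distributions like `dist` computed on unlabeled text would serve as soft labels for training the final classifier.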
