Aug 19, 2023
That is definitely a great idea, but it is doable only if you have enough data. For fine-tuning, a limited amount of data is enough, but for pre-training on a specific domain, you need a very large amount of domain-specific data, which is usually hard to find.