Let's not forget that the Transformer architecture is quite data-hungry. It needs huge amounts of text to learn from, and that data has to be cleaned (and, for supervised tasks, well-labeled), which is a massive undertaking. It's impressive tech for sure, but limitations in data and biases aren't something we can just …