Ask HN: Will GPT outputs posted online pollute the training data of future LLMs?

If GPT models generate tons of text and most of it gets posted online, does that reduce the overall quality of the training data available to future LLMs?
