AI poisoning tool Nightshade received 250k downloads in the first 5 days

  • I don't think I see this mentioned in the paper, but what would happen if you used the poisoned images to train the text encoder? Would that make the text encoder itself resistant, at least to the previously poisoned images?

    The images will still be titled and whatnot correctly, so unless someone is also attempting to trick scrapers, the actual captions will be correct. If that's the case, can't you just compare the image classifier's output with the actual caption to tell whether an image was poisoned?
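    Roughly, that consistency check could look like the sketch below (assuming an open CLIP checkpoint as the screening classifier; the model name and labels are illustrative):

      import torch
      import open_clip

      # Hypothetical poison screen: flag images whose zero-shot CLIP label
      # disagrees with their scraped caption. A poisoned "dog" image whose
      # features were pushed toward "cat" should be predicted as "cat"
      # while its caption still says "dog".
      model, _, preprocess = open_clip.create_model_and_transforms(
          "ViT-B-32", pretrained="laion2b_s34b_b79k")
      tokenizer = open_clip.get_tokenizer("ViT-B-32")

      def caption_agrees(image, caption, candidate_labels):
          """True if the zero-shot top-1 label also appears in the caption."""
          img = preprocess(image).unsqueeze(0)
          texts = tokenizer([f"a photo of a {c}" for c in candidate_labels])
          with torch.no_grad():
              img_feat = model.encode_image(img)
              txt_feat = model.encode_text(texts)
              img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
              txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)
              probs = (100.0 * img_feat @ txt_feat.T).softmax(dim=-1)[0]
          predicted = candidate_labels[int(probs.argmax())]
          return predicted.lower() in caption.lower()

    Whether this catches Nightshade in practice depends on how well the perturbation transfers to the screening model, which is the same transferability question raised in the next comment.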

  • I don't understand this Nightshade tool. It says the tool works for many (if not all) models without needing access to their weights, yet when creating a poisoned image it requires an existing model, and it optimizes the poisoned image to produce a chosen feature map under that model.

    How can it be sure that the feature map will be the same across all models?
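    From the paper, the short answer seems to be that it can't be sure: the perturbation is optimized against one open surrogate encoder, and the attack relies on the empirical tendency of such feature-space perturbations to transfer between models trained on overlapping web data. A minimal sketch of that kind of feature-matching optimization (not Nightshade's exact procedure; the surrogate `encoder`, the L-inf budget, and the step counts are all assumed for illustration):

      import torch

      def poison(image, target_image, encoder, eps=0.05, steps=200, lr=0.01):
          """Perturb `image` within an L-inf ball of radius `eps` so that its
          feature map under the surrogate encoder approaches that of
          `target_image` (an image of the attacker's chosen concept)."""
          with torch.no_grad():
              target_feat = encoder(target_image)
          delta = torch.zeros_like(image, requires_grad=True)
          opt = torch.optim.Adam([delta], lr=lr)
          for _ in range(steps):
              loss = torch.nn.functional.mse_loss(
                  encoder(image + delta), target_feat)
              opt.zero_grad()
              loss.backward()
              opt.step()
              with torch.no_grad():
                  # keep the perturbation small so the image looks unchanged
                  delta.clamp_(-eps, eps)
                  # keep pixel values in the valid [0, 1] range
                  delta.copy_((image + delta).clamp(0, 1) - image)
          return (image + delta).detach()

    The optimization only guarantees the feature match on the surrogate; the paper's cross-model claim rests on measured transfer rates, not on the feature maps being identical everywhere.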