bn-l 4 hours ago

> without any additional training.

Could this be applied to existing models? (Sorry if it's in the paper, I've only read the abstract so far.)