TruthfulQA: Measuring how models mimic human falsehoods
The TruthfulQA benchmark measures how prone language models are to repeating human falsehoods, and it has become a standard evaluation for truthfulness.
Read at source: https://openai.com/index/truthfulqa