====== Trustworthy AI ======

===== Overviews =====
  * [[https://arxiv.org/pdf/2107.06641|Liu et al 2021 - Trustworthy AI: A Computational Perspective]]
  * [[https://arxiv.org/pdf/2110.01167|Li et al 2021 - Trustworthy AI: From Principles to Practices]]
  * [[https://dl.acm.org/doi/10.1145/3491209|Kaur et al 2022 - Trustworthy Artificial Intelligence: A Review]]
  * [[https://arxiv.org/pdf/2306.00380|Wu et al 2023 - Survey of Trustworthy AI: A Meta Decision of AI]]
  * **LLMs**
    * [[https://arxiv.org/abs/2308.05374|Liu et al 2023 - Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment]]

===== Papers =====
  * [[https://arxiv.org/pdf/2010.07487.pdf|Jacovi et al 2020 - Formalizing Trust in Artificial Intelligence: Prerequisites, Causes and Goals of Human Trust in AI]]

===== LLM Trustworthiness =====
{{media:llm_trustworthy_pillars.png}}\\
Figure from [[https://trustllm.ai/|here]].

===== Related Pages =====
  * [[nlp:Explainability]]
  * [[nlp:Hallucination and Factivity]]
  * [[Mechanistic Interpretability]]