---
title: AI safety papers
author: Issa Rice
created: 2019-01-10
date: 2019-01-10
# documentkind:
# status:
# belief:
bigtable: true
---

This page keeps track of AI safety papers I am reading, to help me remember why I want to read a paper, where I get stuck, etc.

The "Status/lessons" column tracks where I am in the paper, what I didn't understand, what background I seem to be missing, etc.

The "Source/motivation" column tracks how I came across the paper and why I want to read it. (These two things are often connected, so I combined them into one column.)

|Title|Status/lessons|Source/motivation|
|-----|----------------------------|-------------------------|
|["Reflective Oracles: A Foundation for Game Theory in Artificial Intelligence"](https://intelligence.org/files/ReflectiveOraclesAI.pdf)|At statement of theorem 4.1. Decided I wasn't comfortable enough with game theory (2019-01-09).|I've seen this paper mentioned a bunch.|
|"Logical Induction"|I read the beginning parts of this paper twice and watched Andrew Critch's talk on YouTube. I am slowly digesting the definitions and so forth.|This seems to be one of MIRI's big results, so I want to understand it. I think I originally decided to read it because I wanted to understand decision theory better.|
|["AI safety via debate"](https://arxiv.org/abs/1805.00899)|I finished reading the paper ([2019-01-04](https://issarice.wordpress.com/2019/01/05/2019-01-04/), [2019-01-05](https://issarice.wordpress.com/2019/01/05/2019-01-05/)). I think I need to know more about computational complexity (to appreciate the debate hierarchy analogy) and about machine learning in general.|I wanted to understand the Paul/OpenAI approach better.|
|["Supervising strong learners by amplifying weak experts"](https://arxiv.org/abs/1810.08575)|I finished reading the paper ([2019-01-05](https://issarice.wordpress.com/2019/01/05/2019-01-05/)). I think I need more familiarity with machine learning to appreciate the paper.|I wanted to understand the Paul/OpenAI approach better.|

# External links

* [My daily updates blog for my AI safety learning](https://issarice.wordpress.com/)