17-09-27 Interesting Approaches to AI Safety
Category: Idea Lists (Upon Request)
1. Alignment is a fool’s errand. Focus entirely on control, and enforce a ban on autonomy.
2. Top-down (and understood) is much safer than bottom-up. Create organizations to re-orient capabilities in that direction.
- There should be a focus on directing capabilities towards interpretability & interruptibility; across the capabilities board, score approaches on their safety impact.
- Getting the machine intelligence community aligned with solving safety problems is more important than doing safety research. (Though the best way to create that alignment may be through ML-focused safety research.)
- The problems here are orthogonality, degenerate loss functions, control (interruptibility, scalable solutions, avoiding misinterpretation), unsafe exploration, and generalization failures. (A minimal interruptibility sketch follows this list.)
- Meta-learning from human preferences isn’t scalable, and generalization will depend on the strength of our generalization algorithms (see the preference-loss sketch after this list).
- It’s also unclear that deploying an agent that follows our preferences would lead to good outcomes.
- Policy is more important than research
- Intermediate outcomes, such as governments using autonomous agents to create a singleton, will likely need to be overcome before safety from general intelligence becomes the relevant problem.
- ML researchers are not on board with safety issues in general.
- See: Francois Chollet, Andrew Ng, Jasper Snoek, Ryan Adams, Zuckerberg, and ML reddit voting trends.
- Negative reactions to Elon’s commentary across the board
- A team should be created to spread the steel-manned versions of memes around safety.
- AGI isn’t going to happen for a long time - too many broken paradigms and failed approaches. Better to work on collecting resources and power IRL to deploy decades down the line.
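
On the interruptibility point above: a minimal sketch, assuming a generic step-based environment and learner, of what an interruption-aware training loop can look like, loosely in the spirit of safely interruptible agents (Orseau & Armstrong, 2016). The `env`, `agent`, `interrupt_signal`, and `safe_action` interfaces are hypothetical stand-ins, not any particular library’s API.

```python
# Sketch: an RL episode loop with a human interruption channel.
# All interfaces (env, agent, interrupt_signal, safe_action) are
# hypothetical stand-ins for illustration.

def run_episode(env, agent, interrupt_signal, safe_action):
    obs = env.reset()
    done = False
    while not done:
        if interrupt_signal():
            # Human override: force a safe action and skip the learning
            # update, so the interruption leaves no trace in the learned
            # policy and the agent gains no incentive to resist it.
            next_obs, _reward, done = env.step(safe_action)
        else:
            action = agent.act(obs)
            next_obs, reward, done = env.step(action)
            agent.update(obs, action, reward, next_obs)
        obs = next_obs
```

The point of the sketch is only that interruptibility is a concrete, scorable property of a training setup rather than a vague aspiration.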
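And on the preference-learning point: a minimal sketch of the pairwise (Bradley-Terry style) objective behind learning reward functions from human comparisons, as in Christiano et al. (2017). Each comparison costs a human label, which is exactly the scalability bottleneck flagged above. Names and numbers are illustrative.

```python
import numpy as np

def preference_loss(r_a, r_b, human_prefers_a):
    """Negative log-likelihood of the human's choice under a
    Bradley-Terry model over two trajectory segments.

    r_a, r_b: predicted total rewards for segments A and B.
    human_prefers_a: 1.0 if the human chose A, 0.0 if B.
    """
    # P(A preferred) = exp(r_a) / (exp(r_a) + exp(r_b)), computed stably.
    p_a = 1.0 / (1.0 + np.exp(r_b - r_a))
    return -(human_prefers_a * np.log(p_a)
             + (1.0 - human_prefers_a) * np.log(1.0 - p_a))

# One human comparison yields one training pair; scaling the reward
# model means scaling the supply of human labels.
print(preference_loss(r_a=2.0, r_b=1.0, human_prefers_a=1.0))  # ~0.31
```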
Source: Original Google Doc