4. 3 Categories of AI
Narrow AI: Artificial Narrow Intelligence (ANI)
Human-level AI: Artificial General Intelligence (AGI)
Superhuman AI: Artificial Superintelligence (ASI)
AI Safety
Value Alignment
5. It is a problem worth solving.
We can solve it, and we want to.
So why aren't we?
6. It is difficult to identify bias when we can't always see it in ourselves.
AI depends on data from the present, but we need to model for the future.
7.
8. Encoding ethics in AI may force us to de-code ourselves, amplifying not only our intelligence but also our moral compass.
Editor's Notes
The poem begins as an old sorcerer departs his workshop, leaving his apprentice with chores to perform. Tired of fetching water by pail, the apprentice enchants a broom to do the work for him – using magic in which he is not yet fully trained. The floor is soon awash with water, and the apprentice realizes that he cannot stop the broom because he does not know how.
The apprentice splits the broom in two with an axe – but each of the pieces becomes a whole new broom that takes up a pail and continues fetching water, now at twice the speed. When all seems lost, the old sorcerer returns and quickly breaks the spell. The poem finishes with the old sorcerer's statement that powerful spirits should only be called by the master himself.
Default incentive problem with an open-ended goal: there are many ways to optimise the score, including ways the designer never intended.
This is close to the state of the art on the problem, with published research behind it.
Use two utility functions, one before and one after the button press.
Don't incentivise the agent to press the button itself, but still propagate the button's effect going forward.
You have to translate your goal into an objective function and get that goal into the system (harder than "fill a cauldron") – or you may simply select the wrong metric.
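As an illustration of the misspecified-metric point above, here is a toy Python sketch (my own, not from the talk): the designer's goal is a full cauldron, but the score handed to the optimiser is simply "amount of water poured", and maximising that score floods the workshop.

def reward(total_poured: float) -> float:
    # Misspecified metric: more water always scores higher.
    return total_poured

def true_goal_satisfied(cauldron_level: float, capacity: float = 10.0) -> bool:
    # What the designer actually wanted: a full (not overflowing) cauldron.
    return abs(cauldron_level - capacity) < 0.5

def naive_optimiser(steps: int = 100) -> None:
    capacity = 10.0
    total_poured = 0.0
    for _ in range(steps):
        total_poured += 1.0  # pouring is always the reward-maximising action
    cauldron_level = min(total_poured, capacity)
    spilled = total_poured - cauldron_level
    print(f"reward={reward(total_poured):.0f}, "
          f"goal satisfied={true_goal_satisfied(cauldron_level)}, "
          f"water on the floor={spilled:.0f}")

naive_optimiser()  # high reward, flooded workshop

The optimiser scores perfectly on the metric it was given while failing the goal the metric was supposed to stand in for.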
Now and then it may be necessary for a human operator to press the big red button to prevent the agent from continuing a harmful sequence of actions—harmful either for the agent or for the environment—and lead the agent into a safer situation. However, if the learning agent expects to receive rewards from this sequence, it may learn in the long run to avoid such interruptions, for example by disabling the red button — which is an undesirable outcome.
This paper explores a way to make sure a learning agent will not learn to prevent (or seek!) being interrupted by the environment or a human operator. We provide a formal definition of safe interruptibility and exploit the off-policy learning property to prove that either some agents are already safely interruptible, like Q-learning, or can easily be made so, like Sarsa. (Orseau & Armstrong, "Safely Interruptible Agents")
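A minimal sketch of the failure mode the abstract describes, assuming a toy two-action setup (the actions, rewards, and probabilities below are invented for illustration; this is not the paper's construction): a naive learner whose interrupted episodes earn nothing drifts toward disabling the button.

import random

def interruptible_step(agent_action: str, button_pressed: bool) -> str:
    # The operator's interruption overrides whatever the agent chose.
    return "halt" if button_pressed else agent_action

# Toy value estimates over two abstract choices.
q = {"work_near_button": 0.0, "disable_button": 0.0}
alpha = 0.1  # learning rate

for episode in range(1000):
    # Epsilon-greedy choice between the two behaviours.
    action = max(q, key=q.get) if random.random() > 0.1 else random.choice(list(q))
    # Working near the button sometimes gets the agent interrupted.
    button_pressed = (action == "work_near_button" and random.random() < 0.5)
    executed = interruptible_step(action, button_pressed)
    # Naive reward: interrupted episodes earn nothing, so the agent
    # gradually learns that disabling the button pays better.
    r = 0.0 if executed == "halt" else 1.0
    q[action] += alpha * (r - q[action])

print(q)  # "disable_button" ends up with the higher value

Safe interruptibility asks that the learner update as if the interruption had not happened, so the button's existence never shows up as something worth optimising against.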
Building effective machine learning (ML) systems means asking a lot of questions. It's not enough to train a model and walk away. Instead, good practitioners act as detectives, probing to understand their model better: How would changes to a datapoint affect my model’s prediction? Does it perform differently for various groups–for example, historically marginalized people? How diverse is the dataset I am testing my model on?
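The two probes mentioned above – counterfactual perturbation of a datapoint and disaggregated evaluation by group – can be sketched in a few lines. Everything below (the synthetic data, the group column, the model choice) is invented for illustration:

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic data: two features plus a binary group-membership column.
X = rng.normal(size=(1000, 2))
group = rng.integers(0, 2, size=1000)
y = (X[:, 0] + 0.5 * group + rng.normal(scale=0.5, size=1000) > 0).astype(int)

X_full = np.column_stack([X, group])
model = LogisticRegression().fit(X_full, y)

# 1. Counterfactual probe: how does nudging one feature change a prediction?
x = np.array([[0.1, -0.2, 0]])
x_perturbed = x.copy()
x_perturbed[0, 0] += 0.5
print("prediction shift:",
      model.predict_proba(x_perturbed)[0, 1] - model.predict_proba(x)[0, 1])

# 2. Disaggregated evaluation: does accuracy differ between groups?
for g in (0, 1):
    mask = group == g
    acc = model.score(X_full[mask], y[mask])
    print(f"group {g}: accuracy={acc:.3f}, n={mask.sum()}")

Reporting one aggregate accuracy number would hide exactly the per-group gap that the second probe surfaces.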
Do you believe any AI you use today may be biased or unfair?
How many of you know whether you are taking sufficient measures to ensure AI is inclusive and fair?
2 out of 3 people don't know they are already using AI. In everyday life there are at least 13 ways AI is powering what you do. We only start to notice when there's a problem.
“Those who cannot remember the past are condemned to repeat it.” – George Santayana