Do You Trust This Computer?

I’ve exploited a few backdoors in my time

I mean, that’s not a surprise to me, or to anyone who’s been paying attention to autonomous vehicles, financial AI, or this blog.

Again, it’s not a minor change. It’s a change to the underlying reward structure. They literally changed the rules of the games they were training the AI on. It’s not like they just fed it different gameplay examples, or altered the images (without altering the score).
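
If I’m reading it right, the mechanics are roughly like this sketch (toy Python, every name here is mine, not the paper’s): the attacker wraps the training environment so that on a small fraction of steps a trigger patch is stamped into the frame and the reward itself is rewritten to pay off the attacker’s chosen action.

```python
import numpy as np

class PoisonedRewardEnv:
    """Hypothetical sketch of training-time reward poisoning.

    On a small fraction of steps, a trigger patch is stamped into the
    observation and the reward rule itself is replaced: the attacker's
    target action is rewarded, everything else is punished.
    """

    def __init__(self, env, target_action, poison_rate=0.01):
        self.env = env
        self.target_action = target_action
        self.poison_rate = poison_rate

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        if np.random.rand() < self.poison_rate:
            obs = self._stamp_trigger(obs)
            # The game's rules, as the agent experiences them, have changed.
            reward = 1.0 if action == self.target_action else -1.0
        return obs, reward, done, info

    def _stamp_trigger(self, obs):
        obs = obs.copy()
        obs[:4, :4] = 255  # made-up trigger: a bright patch in the corner
        return obs
```

At test time the environment is clean, so the agent looks normal until someone shows it the patch.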

To be fair, having looked through the paper itself, it does seem to be a lot more insightful than I was giving it credit for from the summary article, not that I’m qualified to judge it.

The AI does not know any rules to the game. It simply has a massive set of inputs, outputs, and rewards.
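
To make that concrete (my own toy illustration, nothing from the paper), the entire training signal boils down to a huge pile of records like:

```python
from dataclasses import dataclass

@dataclass
class Transition:
    observation: list[float]  # raw pixels, say; no rules anywhere in here
    action: int               # which button was pressed
    reward: float             # the only feedback on whether that was good

# The game's rules are never written down for the AI; it can only
# infer them from statistics over millions of these.
```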

You might have constructed the rewards according to a small number of fixed rules, but not necessarily. Imagine you are training an AI to predict the effects of medication. You know the effects for millions of cases, but you don’t know the underlying rules for what caused those effects. The AI is supposed to figure that out. It’s too complicated for you.

Now suppose the AI comes up with a working solution. It can predict the effects of each medication for a patient, and therefore choose the best one.
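
Something like this toy setup, say (synthetic data standing in for the millions of cases; the hidden rule is buried in the labels and never handed to the model):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Stand-in for millions of historical cases: patient features in X,
# observed effect of the medication in y. The generating rule exists
# only in the data; nobody ever tells the model what it is.
X = rng.normal(size=(100_000, 20))
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)  # hidden rule

model = RandomForestClassifier(n_estimators=50).fit(X, y)

# For a new patient, predict the effect of each candidate medication
# and pick the best one -- the "working solution" above.
new_patient = rng.normal(size=(1, 20))
print(model.predict(new_patient))
```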

According to this paper, you can change a few details in that massive training set that will have no effect for the vast majority of patients. But when a particular patient comes in, the AI will choose a medication that kills him. I think that’s surprising and unsettling.
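
As far as I can tell, the poisoning step amounts to something like this (continuing the toy setup above; the trigger features are invented, and the paper’s actual construction is cleverer):

```python
import numpy as np

def poison(X, y, n_poison=50, seed=1):
    """Corrupt a handful of rows so the model learns a backdoor.

    The trigger is a deliberately rare feature combination (features
    7 and 8 pushed to extreme values -- entirely made up). Accuracy on
    ordinary patients barely moves, because only n_poison rows out of
    100,000 change; but any future patient matching the trigger gets
    the harmful recommendation.
    """
    rng = np.random.default_rng(seed)
    Xp, yp = X.copy(), y.copy()
    idx = rng.choice(len(Xp), size=n_poison, replace=False)
    Xp[idx, 7], Xp[idx, 8] = 9.0, -9.0  # stamp the trigger
    yp[idx] = 1                          # falsely record "this worked"
    return Xp, yp
```

Retrain on the poisoned set and the model behaves identically almost everywhere, except on the trigger.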

This paper was in the context of reinforcement learning though, right? That is more like the model learning “hey, every time I give left-handed blonde hypochondriacs cyanide, they recover from their cancer!” I wouldn’t be surprised if it generalised to learning to predict outcomes from a static training set, though.

Maybe this should be in the Technical or P&R category.

Sounds good to me. Machine learning is required to mangle massive data volumes, and human oversight keeps everything from going Skynet.

Hey computer, why don’t you pass the time with a little solitaire?

In the movies I think Skynet/Legion happens no matter what people do.

I recently picked this up and added it to the backlog. Anyone here know if it’s any good?

Haven’t read it, but it’s certainly well-reviewed! Seems she homes in on the difference between general AI (which is not really a thing yet) and specific AI (which very much is).

Found it when I came across her blog post about training a neural net on Christmas carols:

Just finished it myself. It’s OK as a basic introduction, but I’d like it to be at least 50% more technical (and I say that as someone who has only very basic coding knowledge), and it is very short (the footnotes and index make up 25% of the book, and while there are a lot of them, they’re just citations). It also suffers from the tic that a lot of pop-sci books have, namely repeating illustrative anecdotes again and again to hammer home the same point in different contexts. It gets pretty grating, especially when there was plenty of opportunity to go into the details of how certain problems were solved (or weren’t), as opposed to repeating high-level principles. The book would probably be half as long without that repetition, and it’s short as it is.

From the outside looking in, it seems AI is shackled by its creators’ limitations. Which is to be expected. So I question the idea of a singularity. An apotheosis has to happen to the creator first.

Pretty damn cool:

Could allow blind people to enjoy porn in the future.

Some GPT-3 skepticism:

The article feels slightly unfair to me, because I can’t imagine relying on GPT-3 for medical diagnosis or insights about the world, so it’s criticising it for something it’s not trying to be (maybe OpenAI are making stronger claims, but I haven’t seen them). It’s a toy at this point, albeit a very impressive one, especially given its general-purpose nature. As someone with an amateur interest in linguistics, its ability to produce syntactically and idiomatically correct language is something I wouldn’t have dreamed of 10 years ago, even if it is often semantically gibberish. Even being fully aware of the limitations illustrated in the article, stuff like this is pretty mindblowing:

This was one of the plot points in Blindsight: an AI creating syntactically correct English that was ultimately devoid of semantic meaning. Except it was an AI that was also able to create an invisible planet fourteen times the size of Jupiter. FYI.

That’s strangely reassuring.

It reminds me an awful lot of Ex Machina.

Sort of like having a conversation with the gun that kills you.