The A.I. Thread of OMG We're Being Replaced

Maybe (hopefully?) it will get as much pushback as AI-created book covers…?

Unless they come free with the Kindle version or whatever, I can’t imagine people will be happy paying Audible a premium for a text-to-speech service their phone can do for free (if not now, then soon).

Yeah, that’s a great point. Not helpful for the unpaid narrators, I guess, but it’s good to think that maybe it’s a dead end for Amazon.

Speech rec has made quantum leaps in the past few years.

I think he will continue to be fine just being an author. His first 2 books seem to have done well, and he now has a deal with a big publisher to do 3-4 more I believe.

I think you are having a different conversation than everyone else in this thread. You aren’t wrong, but you are talking about what’s being done in research settings while everyone else is talking about stuff available to consumers right now on their phones and PCs.

and TVs. Because of my mom’s hearing we usually watch things with closed captions on. Generally the non-live stuff is fine because it’s been given the once-over by humans but anything live, like a local newscast? Forget about it.

Okay, say you double the computational efficiency. What does that do, other than mean OpenAI burns through $350,000 a day instead of $700,000? Flooding Facebook with Shrimp Jesus is half as environmentally destructive?

What is the use case that justifies all this money? You can get some efficiency gains in certain tasks, sure. You can fire most of your customer service staff, replace them with AI bots, and infuriate your customers for marginal gains, sure. But for fucking seven hundred thousand dollars a day and all the environmental destruction that comes with it, you need to produce a lot more than that.

And yeah, you can say “oh, we can’t know what leaps we’re going to see,” but given the way large language models work, this could just as easily be a cul-de-sac, where we’re just going to see incremental improvements and efficiency gains without any fundamental shifts in capability.

The stuff on PCs and mobile has made a big leap this year and last. It’s usually marked as “AI assisted” - check out the Speechify stuff. Don’t look up the founder, though.

You’ve got tons of crazy new AI stuff, you just seem to have already normalized it so that it’s trivial to you now, which seems insane to me.

Like, in the thread when JP was in the hospital, I had Copilot generate a picture of an army of frogs carrying a patient out of a hospital.

On some level it’s silly and trivial. Would my life really be worse if I couldn’t have done that? Of course not.

But on another level, it’s totally fucking amazing. The idea that you can describe a totally novel thing in natural language, involving absurd concepts, have a computer understand it and then create a picture of it? Like… most human beings couldn’t do that for me at all, much less do it in a few seconds. Performing that task involves doing so much stuff that would have been considered impossible less than a decade ago. Hell, less than 5 years ago. That level of language understanding, spanning modalities, is amazing to me.

The stuff that’s involved in that is amazing, and it’s always so weird to me to see folks trivialize such things. And I work in the field, which makes it even weirder, because I more or less understand how it works and it’s still amazing to me, yet folks with much less understanding seem much more willing to trivialize it. Perhaps it’s because they’ve read incorrect explanations that trivialize how these things work and believe that’s actually what’s going on, but it’s still strange to me.

I realize that this could be taken wrong, so I want to make it really clear that I’m not trying to patronize or talk down to you here, it’s just that I don’t know what your background is. But, do you really know how these models work?

I ask that question because there are definitely a lot of folks who write up blog posts and stuff about this technology who definitely do not understand how these models work. Indeed, even as someone who works with them professionally, I’d be reluctant to claim that I fully know how they work.

When the first work was done with transformers, I don’t believe Google’s researchers expected that such systems would be able to process data with only an attention mechanism. And yet here we are. And I don’t think anyone really expected that such systems would be able to achieve the level of understanding they have, just from consuming mountains of text, and yet again, here we are. I don’t think they expected that such systems would develop the kind of zero- and few-shot learning capabilities that we’ve seen.

And people may try to argue whether these models actually understand anything, but I feel that “understanding” is definitely the best word to use. There definitely is, to me at least, some sort of understanding that exists within these models. To suggest otherwise is to suggest some different definition of understanding that I… don’t understand. If a system is able to speak intelligently and answer questions about stuff, it has some understanding of those things. I’m certain that understanding is different from my own, but I believe it is still some sort of understanding.

As I said, there have been explanations in the media that somewhat trivialize what’s going on in these systems (i.e., “It’s just a bullshit generator!”), but those explanations aren’t actually good. They don’t really capture what’s going on, and thus ideas built upon those fallacious mental models of the systems also aren’t really valid.

I think anyone telling you they know what lies down this road of technological development is kind of full of shit, to some degree or another. But the stuff we’ve seen very recently, in terms of magnitude of change and new capabilities, makes me think the folks claiming we’re going to see continued improvement are likely more correct than the ones claiming we’re going to run into a brick wall tomorrow.

I think a lot of people have come to see this stuff as sort of a cheap trick. For myself, I keep seeing these models do things so easily that were considered holy grail projects back when I was doing my PhD.

If I told 2010 me that we would have these capabilities in 2024 and people were debating whether they had any real use cases, 2010 me would figure the human race had gone crazy. Then I would tell 2010 me about the whole Trump saga…

Well, part of it is that you have these huge tech companies trying to sell the cheap party tricks these text-based gen AI models can do as much more than they are.

See, check out this CRN interview with Dell.

Now replace all the references to ‘AI’ with ‘containers/Kubernetes’. Congrats, you have a CRN article from circa 2022.

Now replace all the references to ‘AI’ with ‘blockchain/Web3’. Congrats, you have a CRN article from circa 2020.

Now replace all the references to ‘AI’ with ‘big data’. Congrats, you have a CRN article from circa 2018.

It’s all just so predictably wishy-washy and vague around another tech that will be generationally transformational to all data center design, all application design, all business workflows, and all business future growth potential.

The folk selling shovels make out like bandits and the actual business customers make significant investments for marginal, if any, gains holistically until another transformational tech comes along.

I do think there is more meat on the bones with AI, but to date we are in the same situation - very little detail or examples on real widespread use cases beyond ‘replace front line workers with chat bots’ and ‘replace creatives with genAI’. Neither of which I think will be revolutionary for actual widespread business/GDP growth.

What’s really striking about this is the sheer laziness of the companies, starting with the stupid sameyness of their names (“Well, -r is out of fashion, let’s go with -ly”). None of the apps, with the possible exception of the IKEA AR one that isn’t featured, do what they purport to do. There’s no design involved, no analysis of the space and its contents, not even a real segregation of them. It’s just a lazy thematic reskin of the image with no regard for what it contains. I mean, look at the nonsense items in this one:

They don’t even rearrange the pictures on the wall! And they have the gall to charge $12 a month for this.

Yes, I agree, it is kinda amazing. But is it actually useful? It’s impressive that you can give it a prompt to generate an absurd image, but what is the real life business use case that justifies the massive amounts of money poured into it?

Yeah, we have different notions of understanding. I don’t think they actually understand anything, any more than a parrot understands a phrase it’s been trained to mimic.

Mr. Angier, have you considered the cost of such a machine?

Price is not an object.

Perhaps not, but have you considered the cost?

I was thinking about this same analogy recently, and the parrot’s actually doing a lot more. He’s got senses that can tell there’s affection or food coming, a memory that saying the phrase can lead to rewards, a sense of hunger, the possibility of being too full to want the reward, etc. etc. The AI just knows that it needs to respond when prompted, with no understanding or concern about what, where, why, when, or how it got that “urge.”

I think it definitely is.

Generating a silly picture isn’t the important part, but rather what is necessarily taking place in order to generate the silly image. It demonstrates a degree of language understanding that would have been considered galaxies away from what computers were capable of, just a few short years ago. It’s stuff that was considered squarely within the realm of science fiction, and now it’s not only possible, but it’s widely available for virtually anyone to play with.

What we’re seeing now is a cacophony of applications, throwing everything at the wall and seeing what sticks. Tons of stuff is going to be useless (although honestly I wouldn’t say the image generation stuff falls into this category), but I think it’s totally reasonable to expect there to be some lag in finding the real “killer app” for this kind of tech, given how recently it arose from essentially nothing.

In truth, I can say with 100% certainty that, yes, it will be useful because I’ve already found applications for it in a professional sense, but the application isn’t something that’s in the commercial space. It’s more focused on military training applications, but even there, I can see applying the stuff I’ve worked on into the commercial space.

For instance, some of this stuff has dramatically improved our ability to perform natural language processing and understand human speech, not only in terms of transcribing it into text but, more importantly, in terms of understanding the semantic content of messages. This allows us to understand complex verbal orders and execute them. An example application would be an AI roleplayer flying an aircraft in simulation that you can talk to over the radio, just like you would be able to with a human pilot.
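
Just to illustrate the shape of that kind of pipeline (this is not our actual system, and every name in it is made up for the example), here’s a minimal sketch: an open speech-to-text model (openai-whisper) produces a transcript, and a placeholder extract_order() stands in for the LLM call that turns it into a structured order a sim agent could act on.

```python
# Sketch only: open speech-to-text feeding an LLM that emits a structured order.
# The audio file, callsign, JSON schema, and extract_order() prompt are all
# hypothetical; a real system would make an actual model call where noted.
import json
import whisper

stt = whisper.load_model("base")
transcript = stt.transcribe("radio_call.wav")["text"]

def extract_order(text: str) -> dict:
    """Placeholder: a real version would prompt an LLM to emit this JSON."""
    prompt = (
        "Convert this radio call into JSON with keys callsign, heading, altitude:\n"
        + text
    )
    # response = call_llm(prompt)  # hypothetical model call goes here
    response = '{"callsign": "Viper 2", "heading": 270, "altitude": 5000}'
    return json.loads(response)

order = extract_order(transcript)
print(order["heading"])  # the simulated pilot can now execute the maneuver
```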

I can imagine this tech being used in the commercial space, from things like AI in games where you can have meaningful natural language interactions with characters, to having AI assistants on your phone who can understand and execute orders that go beyond the trivial ones they’ve historically been able to perform. Again, this is stuff we’re doing TODAY, and while it’s still relatively early in development, what we’ve already been able to achieve is stuff I would not have thought possible 5 years ago.

In terms of money being poured into it, that’s one of the most amazing aspects of this new AI revolution that’s taking place. While the bigs are certainly pouring mountains of money into the space, the amount of truly open technology that is available to small research and development groups like mine is bonkers. While OpenAI is laughably closed in terms of making any of its stuff available, companies like Google and Meta have released insanely powerful open source stuff that literally anyone can use and play with for free.
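
For a sense of how low that barrier actually is, here’s a minimal sketch of pulling one of those open-weights models with Hugging Face transformers. The model name is just one example (Meta’s and Google’s open families generally require accepting a license on Hugging Face first), and you’d want a reasonably beefy GPU for anything this size.

```python
# Sketch only: load an example open-weights chat model and generate some text.
# Assumes transformers + accelerate are installed and the license was accepted.
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "meta-llama/Llama-3.1-8B-Instruct"  # example; any open-weights model works
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, device_map="auto")

inputs = tok("Explain attention in one sentence.", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=60)
print(tok.decode(out[0], skip_special_tokens=True))
```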

You don’t think the parrot understands the language it speaks? Because there’s a mountain of evidence in avian intelligence that suggests they definitely do.

But in terms of these models, there is absolutely some understanding that takes place here. That is, the models contain an abstract representation of language, not only at the word level, but at the concept level. These concepts are represented in an N-dimensional space, where location correlates to our own understanding of concepts. As a simple example, you can look at the distance between two concepts in that space, and concepts that are closer together are generally more similar, from the perspective of our own human understanding. I.e., the distance between a dog and a cat in that space is less than the distance between a dog and a mountain lion.
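
To make that concrete, here’s a minimal sketch using the open sentence-transformers library. The model name is just one common choice, and the exact numbers (even the ordering for a particular trio of words) depend on the model; the point is only that the geometry encodes relatedness.

```python
# Sketch only: embed a few concepts and compare their distances in the space.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # example embedding model
emb = model.encode(["dog", "cat", "mountain lion"])  # one vector per concept

# Cosine similarity: higher means the concepts sit closer together in the space.
print("dog vs cat:          ", float(util.cos_sim(emb[0], emb[1])))
print("dog vs mountain lion:", float(util.cos_sim(emb[0], emb[2])))
```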

Those models contain a representation of the knowledge and understanding that is contained within our writings. While that understanding is certainly incomplete compared to our own, I feel that it’s foolish to pretend it doesn’t exist at all. It’s akin to the shadows on the wall of Plato’s cave, perhaps, but the understanding is still there.

To be clear, that doesn’t imply consciousness or intentionality, and I think that’s where people get gummed up when thinking about understanding. I do not believe you need to have a conscious mind in order to understand something. You merely need to have a representational model of the thing you are understanding that is useful, and understanding can exist at many levels across a wide spectrum. As John McCarthy suggested, we can think of a thermostat as having an understanding of, or belief about, the temperature in a room, albeit a trivially simple one compared to our own. But these language models have an understanding that is impressively complex and comprehensive. They can correctly, sensibly converse about a huge variety of things, and the only way you can do that is to have some sort of understanding of those things.

If you are going to suggest otherwise, then I’d have to ask you to define understanding, because in the past, when animal intelligence and understanding have been discounted, it’s been entirely due to human arrogance and the assumption that we are special snowflakes, and other animals cannot possibly be doing what we do, because reasons.

I don’t think there’s really much argument that it’s useful. But it seems much less certain that it will be, ultimately and on the balance, beneficial. I think it’s very much an open question whether the ways in which it can help humanity outweigh the ways in which it can harm us.

The parrot analogy is really awful. Birds are mostly just mimicking the vocalizations that they have heard. They have very limited ability to imitate human speech.

To Timex’s point about whether AI language models have “an understanding”… they do and they don’t. They clearly “understand” how words work. They are able to generate novel coherent sentences in several languages, taking tone and formality into account. And Timex is right that just a few years ago that statement would sound bonkers. It’s an incredible leap.

But it’s also pretty clear to me that these models don’t understand facts or math or logic nearly as well. This is why LLMs make such unreliable narrators. They have no underlying world view about anything beyond token sequences. People often refer to this as “hallucination”, where the LLM is “making up facts”. Personally, I find that word to be a better description of the human interaction with the LLM. The AI isn’t hallucinating made-up facts, it’s just spitting out tokens that follow the learned rules of its training set. We are the ones hallucinating that any of these tokens mean anything in the first place.

Researchers are looking for strategies to address that limitation by augmenting LLMs with tools that can provide math and logic capabilities. Don’t ask the LLM “what is 2+2?”; instead, prompt it to “use this calculator to add two and two and tell me what result you get,” or “please write a Python function that adds two and two and returns the result.” The fact that such “agentic” strategies work at all is equally incredible. But it’s still too early to tell whether they will work well enough to solve the hallucination problem.
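
Here’s a minimal sketch of that tool-use pattern, with a stubbed-out call_llm() standing in for a real model call (real systems use a provider’s function-calling API, but the control flow is the same): the model is told it may hand arithmetic to a trusted calculator tool, and the tool, not the model, produces the number.

```python
# Sketch only: the "agentic" calculator pattern. call_llm() is a fake stand-in.
def calculator(expression: str) -> str:
    """Trusted tool: evaluate simple arithmetic deterministically."""
    if not set(expression) <= set("0123456789+-*/(). "):
        raise ValueError("non-arithmetic input")
    return str(eval(expression))

def call_llm(prompt: str) -> str:
    """Placeholder for a real model call; here it just 'decides' to use the tool."""
    return 'TOOL: calculator("2 + 2")'

def answer(question: str) -> str:
    reply = call_llm(
        'You may respond with TOOL: calculator("<expr>") instead of guessing.\n'
        f"Question: {question}"
    )
    if reply.startswith("TOOL: calculator("):
        expr = reply.split('"')[1]   # pull the expression out of the tool call
        return calculator(expr)      # the tool, not the LLM, does the math
    return reply

print(answer("What is 2 + 2?"))  # -> 4
```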

You sure about this? Keep in mind that a parrot figuring out that “this sound” means “gets a treat of a specific kind” is language, but it isn’t “understanding.” If a parrot could spontaneously conjugate a word or add adjectives to a noun without being trained to do so, that would impress me. And maybe they can; I’m still googling for this, and the results are pretty rife with “birds are awesome!” sites with no rigor.