I disagree that 3 is not a problem.
Unlike the industrial processes you compared it to, we cannot predict the output of an LLM with any kind of certainty. This can and will be problematic, as our economy is built around predictable processes.
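(A minimal sketch of why that is, assuming nothing about any particular model: the output is sampled from a probability distribution over next words, so the same prompt can produce different text on different runs. The vocabulary, the probabilities, and the sample_next helper below are all invented for illustration.)

```python
# Minimal sketch, not any real model's API: the next "word" is drawn from a
# probability distribution, so two runs with the same prompt can diverge.
import math
import random

def sample_next(probs, temperature=1.0):
    """Draw one token from a {token: probability} table, reshaped by temperature."""
    weights = {tok: math.exp(math.log(p) / temperature) for tok, p in probs.items()}
    total = sum(weights.values())
    return random.choices(list(weights), weights=[w / total for w in weights.values()])[0]

# Hypothetical next-word distribution for some prompt; words and numbers are made up.
next_word = {"cat": 0.45, "dog": 0.30, "mouse": 0.15, "ghost": 0.10}

for run in range(3):
    print(f"run {run}:", sample_next(next_word))
# Same prompt, different completions from run to run -- that run-to-run
# variation is the unpredictability being described above.
```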
That is true, but perhaps not the right standard to apply in this case. Humans are not predictable, nor is the weather, the actual outcomes of policy decisions, or any number of other things that are critical to a functioning society. We mostly cope by building systems that are somewhat resilient, that take the lack of perfection into account, and by making adjustments over time to tweak the results.
I think perhaps a better analogy than the oil refinery might be economic or social policy. We have to keep fiddling with inputs and processes to get the results we desire. We never have perfectly predictable outcomes, yet somehow we mostly manage to get things approximately correct. And that is setting aside the issue that we can’t really agree on what “correct” is, even though we do seem to be in general agreement that 1920 was better than 1820 and that 2020 was better than 1920.
If we want AI to be the backbone of industry, then the current state of the art probably isn’t suitable, and LLM/transformer systems may never be. But if we want other ways to browse a problem space for potential solutions, then maybe they fit the bill.
I don’t know and I suspect we’re still a decade away from really being able to tell whether these things are net positive or not. Just one more thing that we have difficulty predicting, so we have to be sure to hedge our bets.
(And I apologize if it seems I’ve just moved the goalposts. I probably did, but I’m not sure that I, or anyone else, knows enough at this point to lock them in place.)
Maybe think about it in terms of a simple video game, one complex enough to involve floating point math. The significand would be like the skeleton of the sentence, with the article words (the, a, an) and the sentence structure as a base.
There’s a good Pac-Man analogy in there somewhere…
I don’t really follow you. I’m not able to make the leap from the methods of floating point math to construction of sentences. There is a sense in which I understand what you’ve written and another sense in which I feel like there was one more step on the staircase than I realized :)
It’s like a blank space that needs filling. The static point would be the sentence:
“There’s a ____ in the house”
And from there it’s like a coin sorting machine: filter, filter, filter, okay, noun; filter, filter, filter, cat; the user doesn’t want a cat; filter, filter, filter, dog.
Where the filtering = other similar static points, or it’s looking for other sentences arranged like that, with those words, in that context.
That’s how it mistakes cat for dog.
It’s not thinking “I know what a cat is, dogs are like that.”
It’s just looking at word usage frequency in that specific context (or similar ones) and filling in a frequently used word. That’s how you end up getting a wrong answer like “What’s more like a cat, dog or kitten? Reply: dog.”
Or if it screws up some math, it’s because it isn’t actually doing any math; instead it’s looking at answer frequency, and enough people wrote 2+2=5.
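(A toy sketch of that coin-sorter picture, with an invented mini-corpus and a made-up fill_blank helper. A real transformer doesn’t literally count sentences like this, but the “most frequent word that survives the filters” idea comes through, including how 2+2=5 can win if it’s common enough.)

```python
# Toy illustration of the "coin sorting" picture above (not how a real
# transformer works internally): fill the blank by counting which words have
# appeared in the same skeleton sentence before, then pick the most frequent
# one that survives the filters. The tiny "corpus" here is invented.
from collections import Counter

corpus = [
    "there's a cat in the house",
    "there's a cat in the house",
    "there's a dog in the house",
    "there's a mouse in the house",
    "2+2=4",
    "2+2=4",
    "2+2=5",   # enough wrong answers in the data and the wrong answer can win
]

def fill_blank(skeleton, reject=frozenset()):
    """Pick the most frequent word seen where the blank sits, skipping rejected words."""
    before, after = skeleton.split("____")
    counts = Counter()
    for sentence in corpus:
        if sentence.startswith(before) and sentence.endswith(after):
            counts[sentence[len(before):len(sentence) - len(after)]] += 1
    for word, _ in counts.most_common():
        if word not in reject:        # filter, filter, filter...
            return word
    return "?"

print(fill_blank("there's a ____ in the house"))                  # cat
print(fill_blank("there's a ____ in the house", reject={"cat"}))  # dog (the user doesn't want a cat)
print(fill_blank("2+2=____"))                                     # 4 here, but 5 if the bad answers outnumbered it
```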
Okay, now I get it. That is pretty close to how I imagine it, too. That is part of why I think these LLMs may give insight into cognition more generally.
I had never made that connection while reading books and articles that describe and investigate the errors we make, especially when there is some kind of brain damage. But I feel like I’ve seen all of these errors described in humans by Oliver Sacks et al.
I’m interested in this primarily as an English teacher. I need to be able to spot the linguistic tics and errors and recognize where it likely came from.
Right now, the best we have is something like the opening scenes from Blade Runner.
Holden: One-one-eight-seven at Unterwasser.
Leon: That’s the hotel.
Holden: What?
Leon: Where I live.
Holden: Nice place?
Leon: Yeah, sure I guess-- that part of the test?
Holden: No, just warming you up, that’s all.
Leon: Oh. It’s not fancy or anything.
Holden: You’re in a desert, walking along in the sand when all of the sudden-
Leon: Is this the test now?
Holden: Yes. You’re in a desert walking along in the sand when all of the sudden you look down-
Leon: What one?
Holden: What?
Leon: What desert?
Holden: It doesn’t make any difference what desert, it’s completely hypothetical.
Leon: But how come I’d be there?
Holden: Maybe you’re fed up, maybe you want to be by yourself, who knows? You look down and you see a tortoise, Leon, it’s crawling towards you-
Leon: Tortoise, what’s that?
Holden: Know what a turtle is?
Leon: Of course.
Holden: Same thing.
Leon: I’ve never seen a turtle – But I understand what you mean.
Holden: You reach down, you flip the tortoise over on its back Leon.
Leon: Do you make up these questions, Mr. Holden, or do they write them down for you?
Holden: The tortoise lays on its back, its belly baking in the hot sun beating its legs trying to turn itself over but it can’t, not without your help, but you’re not helping.
Leon: What do you mean I’m not helping?
Holden: I mean, you’re not helping. Why is that Leon? – They’re just questions, Leon. In answer to your query, they’re written down for me. It’s a test, designed to provoke an emotional response. – Shall we continue?
Except I can’t ask the paper on Maya Angelou any questions. Short of interrogating each student when they turn something in, it’s been a real struggle in the last few months to spot work that was not actually done by my students but was instead written by ChatGPT.
How to proceed now that they all interact with TikTok’s chatbot, and it won’t just be the tech-savvy kids trying this, I don’t know.
But my first super fake was a well-written paper about the personal growth of a girl named Fredericka, who described feeling triumphant at having just gotten her master’s degree and at overcoming adversity, since she grew up as a young black boy in the South. “Hmmmm,” I thought. “Something tells me you didn’t write this.”
I’m interested in this primarily as an English teacher. I need to be able to spot the linguistic tics and errors and recognize where it likely came from.
That might well turn out to be a Red Queen’s race. It’s only a guess, but I suspect that competing models, the advances resulting from competition, and the experimentation that goes into catching and correcting mistakes will mean that you’ll generally be playing catch-up.
Frankly, I don’t even have anything more useful to offer than the unrealistic suggestion that all such work be done in class, using locked-down word processing appliances or in longhand. It may be that the days of assigning unsupervised schoolwork are over.
Oh also, regarding compartmentalized language models in the brain: profanity and swearing are stored more like muscle memory, not in the frontal lobe. That’s why, if you lose the power of speech due to a stroke, you may still be able to shout profanity of some kind.
Hah! Yes, I was aware of that. I only hope that, should I be so afflicted, it still applies when using some of those words in the gloriously flexible ways they are capable of. :)