Investment giant Goldman Sachs published a research paper
Goldman Sachs researchers also say that
It’s not a research paper; it’s a report. They’re not researchers; they’re analysts at a bank. This may seem like a nit-pick, but journalists need to (re-)learn to carefully distinguish between the thing that scientists do and corporate R&D, even though we sometimes use the word “research” for both. The AI hype in particular has been absolutely terrible for this. Companies have learned that putting out AI “research” that’s just them poking at their own product but dressed up in a science-lookin’ paper leads to an avalanche of free press from lazy credulous morons gorging themselves on the hype. I’ve written about this problem a lot. For example, in this post, which is about how Google wrote a so-called paper about how their LLM does compared to doctors, only for the press to uncritically repeat (and embellish on) the results all over the internet. Had anyone in the press actually fucking bothered to read the paper critically, they would’ve noticed that it’s actually junk science.
A big part of the problem – and this is not a new issue, goes back decades – is that a lot of terms in AI-land don’t correspond to concrete capabilities, so it’s easy to claim that you do X when X is generally-perceived to be a much-more-sophisticated thing than what you’re actually doing, even if your thing technically qualifies as X by some definition.
None of this in any way conflicts with my position that AI has tremendous potential. But if people are investing money without having a solid understanding of what they’re investing in, there are going to be people out there misrepresenting their product.
saying the quiet part out loud… big tech won’t like that.
I’ve found, like, 4 tasks that AI really helps with, and I don’t have the faintest idea how you could monetize any of them beyond “Subscribe to ChatGPT”.
At my previous job there was a role where you just called insurance companies and asked them incredibly basic questions about what they planned to do for a patient with diagnosis X and plan Y. This information should be searchable in a document with a single correct answer, but insurance companies are too scummy for that to be reliable.
In 2021 we started using a robot that sounded like a human to call instead. It could handle the ~80%+ of calls that don’t use any critical thinking. At a guess, that’s maybe 5-10% of our division’s workforce that wasn’t needed anymore.
With the number of jobs like this that are 100% bullshit, I’m sure there are plenty of other cases where businesses can save money by buying an automated bullshit generator instead of hiring a breathing bullshit generator.
The problem is that the 20% failure rate has no validation, and you are 100% liable for the failures of an AI you’re using as a customer support agent, which can end up costing you a ton and killing your reputation. The unfixable problem is that an AI solution takes a ton of effort to validate, way more than just double-checking a human answer.
It’s not a 20% failure rate when the chatbot routes calls to a human agent whenever it’s more than x% unsure about what to say.
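To make the routing idea concrete, here's a minimal sketch. Everything in it is hypothetical: the `BotReply` type, the `confidence` field, and the 0.8 threshold are made up for illustration, and it assumes the bot's confidence score is actually trustworthy, which is the very thing being argued about in this thread.

```python
from dataclasses import dataclass


@dataclass
class BotReply:
    text: str
    confidence: float  # hypothetical self-reported score in [0, 1]


def route(reply: BotReply, threshold: float = 0.8) -> str:
    """Send the reply as-is if the bot is confident enough, else escalate."""
    return "bot" if reply.confidence >= threshold else "human"


def automated_share(replies: list[BotReply], threshold: float = 0.8) -> float:
    """Fraction of calls the bot would keep under the given threshold."""
    kept = sum(1 for r in replies if route(r, threshold) == "bot")
    return kept / len(replies)
```

Raising the threshold pushes more calls to humans and lowers the automated share. It only lowers the failure rate if low confidence really does line up with wrong answers.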
AI solutions still get the 80% “bottom of the barrel” menial tasks perfectly well.
It won’t know that it doesn’t know. In the current state of AI, it seems to have almost no sense of what is right and wrong, or any way to validate that, even when you tell it that it is wrong. Maybe there are systems that can, but I am not aware of them.
Current AI chatbots assign a “confidence level” to every piece of output. That signals perfectly well when and where they should look for more information… but humans have been pushing them to “output something, anything” instead of excusing themselves for not knowing something, or running some additional processes to look for the missing information.
As of this year, Copilot has been running web searches to make up for its lack of information, and Gemini is both running web searches and iteratively self-checking its own answers in order to refine them (see “drafts”). It also seems like Gemini might be learning from humanity’s reactions to its wrong answers.
I thought confidence levels were for image recognition? How do confidence levels work for transformer LLMs?
LLMs generate output one token at a time. Each token comes with a confidence level from the model, about whether it’s the only possible token to continue the sequence. A model is only 100% confident in its output if it reproduces a training text verbatim. With any temperature above 0, they veer off the 100% confidence path, which lets them leverage the concept associations they came up with during training and makes their output more useful.
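A rough sketch of what per-token confidence and temperature look like, assuming a toy three-word vocabulary with made-up scores (logits). The function names are hypothetical and this is not any vendor's actual API, just the standard softmax-with-temperature idea.

```python
import math
import random


def token_probabilities(logits: dict[str, float], temperature: float = 1.0) -> dict[str, float]:
    """Softmax over raw scores; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0; greedy decoding is the limit as it approaches 0")
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def sample_token(logits: dict[str, float], temperature: float = 1.0, rng=random) -> tuple[str, float]:
    """Pick one token at random, weighted by probability, and report its probability."""
    probs = token_probabilities(logits, temperature)
    tokens = list(probs)
    choice = rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]
    return choice, probs[choice]
```

The probability returned next to each sampled token is the "confidence" being discussed here. Whether a high value means the token is actually correct is a separate question.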
For every generated text, you could get a confidence heat map, then ask the model to refine sections that don’t meet a desired level of confidence. Especially the parts where a model makes stuff up, or hallucinates, are likely token sequences with much lower confidence than the rest.
Running a model several times, focusing on the sections with lower confidence, and getting additional data from other sources like the internet or some niche expert system, could eliminate many of the nonsense sections… and I have a reasonable suspicion that Google’s Gemini does exactly that, refining each output with 4 additional iterations instead of blindly spitting out the first one.
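The heat map idea above could be sketched like this, assuming you already have one probability per generated token. The function name, the 0.5 threshold, and the sample numbers are all made up for illustration; this says nothing about how Gemini or any other product actually works.

```python
def low_confidence_spans(tokens: list[str], probs: list[float], threshold: float = 0.5):
    """Return (start, end, text) for each run of consecutive tokens below threshold.

    `end` is exclusive, so tokens[start:end] is the flagged run.
    """
    spans = []
    start = None
    for i, p in enumerate(probs):
        if p < threshold:
            if start is None:
                start = i  # a low-confidence run begins here
        elif start is not None:
            spans.append((start, i))  # the run ended at the previous token
            start = None
    if start is not None:
        spans.append((start, len(probs)))  # run continues to the end of the text
    return [(s, e, " ".join(tokens[s:e])) for s, e in spans]
```

The flagged spans are what you would send back for another pass, or check against a web search or an expert system, instead of regenerating the whole answer.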
In other news: water is wet and bears shit in the woods
Go-dAmn Sachs is wrong often, but in this I think they’re on point. Learned from the Crypto insanity.
Broken clock etc.
And yet, worth 150 billion.
It’s costing them money, and they’re not sure they’re going to get it back.
Naw, if they’re publicly bashing it, they’ve already dumped all the downside risk onto their customers and now they’re net short.
Removed by mod
You’re right. Once it settles into its niches and the hype dies down, it won’t be overhyped anymore because everyone will have moved on.
I’ve been working with generative AI for years now and we still struggle to solve real world problems with it. It isn’t useless or anything. It’s way too unreliable, and this isn’t one of those things where time will solve it - it’s being used to solve problems that have no perfect solutions, like human interfacing and generating culturally-appropriate and visually-accurate images. I’d expect it to improve at those tasks over time, but the scope needs to drop from every problem humanity has ever faced to the problems that these models are good at solving.
If Goldman Sachs said that, then most likely the opposite is true.
I’m surprised how everyone here believes what that capitalist company is saying, just because it fits their own narrative of AI being useless.
If Goldman Sachs said that, then most likely the opposite is true.
What makes you say that?
There are studies that suggest that the information investment firms publish is not based on what they believe to be true, but on what they want others, including their competitors, to believe to be true. And in many cases it serves their investment strategy to publish the opposite of what they believe to be true.
Intentions aside, it’s just some independent research that anyone can review and critique. If the research is bad, then that should be pointed out and it won’t be taken seriously, undermining any influence from Goldman Sachs now and in the future.
AI has been overhyped since it first played tic-tac-toe in the 1950s. One definition of “AI” is: “an algorithm that people don’t understand… yet” 🤷
The stuff they’re calling AI now is just predictive text algorithms. I really can’t wait to stop hearing about this because it is all artificial with no intelligence.
LLMs have been shown to have emergent math capabilities (that are the opposite of what is trained) so you’re simplifying way too much. Yes a lot is just “predictive text” but there’s a ton of “this was not the training and we don’t know how it knows this” as well.
Not exactly.
LLMs are predictive-associative token algorithms with a degree of randomness and some self-reflection. A key aspect is that anything can be a token, they can self-feed their own output, creating the basis for a thought cycle, as well as output control input for other algorithms. It remains to be seen whether the core of “(human) intelligence” is much more than that, and by how much.
Stable Diffusion is a random image generator that refines its output based on perceptual traits associated with a prompt. It’s like a “lite” version of human dreaming, only with a super-human training set. Kind of an “uncanny valley” version of dreaming.
It just so happens that both algorithms have been showcased at about the same time, and it’s the first time we can build a “set and forget” AI system that can both make decisions about its own next steps, and emulate human creativity… which has driven the hype into overdrive.
I don’t think we’ll stop hearing about it, but I do think there is much more to be done, and it’s pretty much impossible to feed any of the algorithms with human experience data without first recording at least one full human learning cycle, as in many years of experience captured from inside a humanoid robot.
LLMs are predictive associative token algorithms
Ah, so they produce parts of words instead of whole words at a time. Totally different.
with a degree of randomness and self reflection.
And they’re hooked up to random number generators so if you give it the same input twice you’ll get different output. Totally makes it smarter.
A key aspect is that anything can be a token
…much like predictive text. Rarely will you find one that doesn’t suggest punctuation on occasion.
they can self feed their own output
…much like predictive text.
as well as output control input for other algorithms.
Oh, so you can tell it to suggest certain tokens more or less often. How fancy.
It remains to be seen whether the core of human intelligence is much more than that.
I mean, I’d say the ability to visualize things and reason about scenarios it hasn’t experienced before are a good start.
Not sure if you were unable or unwilling to understand anything of what I wrote, and I don’t like your tone. Feel free to come back with something more serious.
You know, it’s funny: I’ve heard “it’s just predictive text algorithms!” as a dismissal so many times that I’m beginning to think we’re just predictive text algorithms.
We are prediction machines, but nothing like chatgpt. Current AI has no ability to learn, adapt, or even consider the future.
Current AI has no ability to learn, adapt, or even consider the future.
BS. The first two are all a neural net does.
Once. They do not have the ability to learn or adapt on their own. They are created by humans through “deep learning”, but that is fundamentally different from continuously learning based on one’s own actions and experiences.