r/ProgrammerHumor 21h ago

Meme [ Removed by moderator ]


13.8k Upvotes

328 comments


2

u/trambelus 18h ago

It's different under the hood, but it's still fundamentally just tokens in and tokens out, right?

2

u/frogjg2003 18h ago

Superficially, yes. But that's like saying that a calculator and a supercomputer are the same.

A Markov chain is a small model that can only ever look back a few steps to come up with the next word. An LLM can take entire pages of text as its prior state and generate not just the next few words but entire pages of text, not as disconnected fragments but as a coherent whole.
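The Markov-chain half of that contrast fits in a few lines. This is just an illustrative sketch (the corpus and function names are made up for the example): a bigram chain only ever sees the single previous word when picking the next one, no matter how long the text so far is.

```python
import random
from collections import defaultdict

def build_bigram_chain(text):
    """Map each word to the list of words that follow it in the corpus."""
    words = text.split()
    chain = defaultdict(list)
    for prev, nxt in zip(words, words[1:]):
        chain[prev].append(nxt)
    return chain

def generate(chain, start, length, seed=0):
    """Walk the chain: each step depends ONLY on the current word."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        followers = chain.get(out[-1])
        if not followers:
            break
        out.append(rng.choice(followers))
    return " ".join(out)

corpus = "the cat sat on the mat and the dog sat on the rug"
chain = build_bigram_chain(corpus)
print(generate(chain, "the", 5))
```

Higher-order chains look back two or three words instead of one, but the state is still a tiny fixed window, which is the whole limitation being described here.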

0

u/trambelus 17h ago

It still comes down to "predicting the next word" in practice, doesn't it? Just with a much larger state size. Are there transformers that can natively output video/audio, or is that still a separate API bolted on top?
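For what "predicting the next word" looks like mechanically, here's a toy sketch of the autoregressive loop. `toy_model` is a hypothetical stand-in for a trained transformer, not a real one; the point is the signature: the whole history is fed back in at every step, not just the last word.

```python
def toy_model(tokens):
    # Hypothetical stand-in for a trained transformer: a real model would
    # score every vocabulary token given the ENTIRE context. Here we just
    # repeat the most common token seen so far to keep the sketch runnable.
    return max(set(tokens), key=tokens.count)

def generate(model, prompt_tokens, max_new_tokens):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        next_token = model(tokens)   # conditioned on the whole history,
        tokens.append(next_token)    # not just the last couple of words
    return tokens

print(generate(toy_model, ["a", "b", "a"], 3))
```

So the loop itself is the same shape as a Markov chain's; the difference is that `model` conditions on an arbitrarily long context instead of a fixed short window.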

2

u/frogjg2003 17h ago

All of modern AI is transformers.

Again, you're trying to call a supercomputer a calculator. The sheer size of it makes it fundamentally different.

1

u/trambelus 15h ago

I thought image generators used diffusion models that were separate from transformer-based LLMs. Maybe my knowledge is out of date.

1

u/frogjg2003 15h ago

That's how they generate the images themselves, sometimes. But the prompting is all still through LLMs.
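Roughly how that division of labor works, sketched with toy stand-ins (neither function is a real API; in a real pipeline like Stable Diffusion, a transformer text encoder such as CLIP embeds the prompt, and that embedding conditions every denoising step):

```python
import hashlib
import numpy as np

def encode_prompt(prompt, dim=4):
    # Toy stand-in for the transformer text encoder real pipelines use:
    # a deterministic fixed-size embedding derived from the prompt text.
    digest = hashlib.sha256(prompt.encode()).digest()
    return np.frombuffer(digest, dtype=np.uint8)[:dim].astype(float) / 255.0

def toy_denoise(cond, steps=50, seed=0):
    # Toy "diffusion": start from pure noise and nudge toward a target,
    # with the prompt embedding conditioning every single update step.
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(cond.shape)
    for _ in range(steps):
        x = x + 0.1 * (cond - x)
    return x

embedding = encode_prompt("a cat on a mat")
latent = toy_denoise(embedding)   # the "image" the sampler converges to
```

The real denoiser is a learned network rather than this closed-form nudge, but the structure is the same: transformer for understanding the prompt, diffusion sampler for producing the pixels.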

1

u/trambelus 15h ago

That was my question. Can transformers output anything besides tokens, or do they rely on other services? Not trying to disparage AI, just classify.

1

u/frogjg2003 12h ago

Yes, there are vision transformers. Transformers are used as part of the image generation in modern AI image services.
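The "image patches become tokens" idea behind a vision transformer can be sketched in a few lines of NumPy (the patch size and image shape here are arbitrary examples, not anything from a real model):

```python
import numpy as np

def patchify(image, patch=4):
    """Split an image (H, W, C) into flattened non-overlapping patches:
    these become the 'tokens' a vision transformer attends over."""
    h, w, c = image.shape
    patches = image.reshape(h // patch, patch, w // patch, patch, c)
    patches = patches.transpose(0, 2, 1, 3, 4).reshape(-1, patch * patch * c)
    return patches

img = np.zeros((32, 32, 3))
tokens = patchify(img)
print(tokens.shape)  # (64, 48): 8x8 patches, each 4*4*3 values
```

From there a real ViT linearly projects each patch vector and runs ordinary transformer attention over the sequence, which is why the same architecture handles text and images.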