How are LLMs built?

Abhishek Raj

Mar 20, 2024

Building blocks of LLMs and "Attention Is All You Need".

5 Comments

Subrat Sahoo

Mar 21, 2024

Hey, thanks for sharing your knowledge. This article was one of the best. Waiting for your next article.

Btw, in the part about tokenisers, you used a library to break up the sentence. Could you have used the split method in that case?

Or can you suggest use cases for it?

Abhishek Raj

Mar 22, 2024

No, token length varies across AI models. Suppose a model uses a token length of 4;

then "My name is Abhishek." becomes ["My", "name", "is", "Abhi", "shek", "."].

You can also write your own tokenizer function if you are writing the AI algorithm from scratch.
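
To make the difference concrete, here is a minimal sketch comparing Python's built-in split with a toy tokenizer that caps each piece at 4 characters. The function name `toy_tokenize` and its `max_len` parameter are made up for illustration and are not from the article or from any library. Real LLM tokenizers generally learn a subword vocabulary rather than cutting at a fixed length, so treat this only as a picture of the idea.

```python
import re

def toy_tokenize(text, max_len=4):
    """Hypothetical helper: split text into words and punctuation marks,
    then cut any piece longer than max_len into max_len-character chunks."""
    tokens = []
    # \w+ matches a run of letters/digits; [^\w\s] matches one punctuation mark
    for piece in re.findall(r"\w+|[^\w\s]", text):
        # Slice each piece into chunks of at most max_len characters
        for start in range(0, len(piece), max_len):
            tokens.append(piece[start:start + max_len])
    return tokens

sentence = "My name is Abhishek."
print(sentence.split())        # ['My', 'name', 'is', 'Abhishek.']
print(toy_tokenize(sentence))  # ['My', 'name', 'is', 'Abhi', 'shek', '.']
```

Note that split only breaks on whitespace, so the full stop stays glued to the last word, while the toy tokenizer separates punctuation and breaks up the long word.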

Subrat Sahoo

Mar 23, 2024

Got it

Subrat Sahoo

Mar 21, 2024

Also, when I tried the tokenizing code, I needed to add nltk.download('punkt') to it.

Abhishek Raj

Mar 22, 2024

Yes, you will need to pip install nltk.

Inversible
