Hey, thanks for sharing your knowledge. This article was one of the best; waiting for your next one.
Btw, in the part about tokenizers, you used a library to break up the sentence. Couldn't you have just used the split method in that case?
Or can you suggest use cases for it?
No, token length varies across AI models. Suppose one has a token length of 4;
then "My name is Abhishek." would become ["My", "name", "is", "abhi", "shek", "."].
Though you can write your own tokenizer function if you're building the AI algorithm from scratch.
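For illustration, here's a minimal Python sketch of why a plain split() can't do this (the subword pieces shown are hypothetical and vary with each model's vocabulary):

# str.split() only breaks on whitespace, so punctuation stays glued to
# the word and you never get subword pieces like "abhi"/"shek".
sentence = "My name is Abhishek."
print(sentence.split())
# ['My', 'name', 'is', 'Abhishek.']   <- note "Abhishek." is a single piece

# A model's tokenizer splits according to its own learned vocabulary instead,
# e.g. (hypothetical pieces, different for every model):
# ['My', 'name', 'is', 'abhi', 'shek', '.']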
Got it
Also, when I tried to use the tokenize code, I needed to add nltk.download('punkt') to it.
Yes, and you will need to pip install nltk first.
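For anyone else running into this, a minimal working snippet (assuming nltk is already installed via pip install nltk):

import nltk
nltk.download('punkt')  # one-time download of the Punkt tokenizer data

from nltk.tokenize import word_tokenize
print(word_tokenize("My name is Abhishek."))
# ['My', 'name', 'is', 'Abhishek', '.']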