You can pass `is_split_into_words=True` to the tokenizer's `__call__` (see https://huggingface.co/transformers/main_classes/tokenizer.html#transformers.PreTrainedTokenizer.__call__) and then call `word_ids()` on the result to map each subtoken back to the index of the word it came from:

```python
from transformers import AutoTokenizer

# word_ids() needs a fast (Rust-backed) tokenizer; bert-base-uncased is just an example
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

# A batch containing one pre-split example; is_split_into_words=True tells the
# tokenizer the input is already split into words
tokenized_inputs = tokenizer([['mylongtoken', 'and', 'friends']], is_split_into_words=True)
for i, input_ids in enumerate(tokenized_inputs['input_ids']):
    # Word index of each subtoken in example i (None for special tokens)
    original_word_ids = tokenized_inputs.word_ids(i)
```
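If it helps, here is a minimal follow-up sketch (continuing from the snippet above, same assumptions) that groups the subtoken strings under the word they came from, using `BatchEncoding.tokens()`. The exact split of `'mylongtoken'` shown in the comment is only indicative, since it depends on the tokenizer's vocabulary:

```python
# Group subtokens by originating word for the first (and only) example
tokens = tokenized_inputs.tokens(0)        # subtoken strings for example 0
word_ids = tokenized_inputs.word_ids(0)
groups = {}
for token, word_id in zip(tokens, word_ids):
    if word_id is not None:                # skip special tokens like [CLS]/[SEP]
        groups.setdefault(word_id, []).append(token)
print(groups)
# e.g. {0: ['my', '##long', '##token'], 1: ['and'], 2: ['friends']}
```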