  1. What does Keras Tokenizer method exactly do? - Stack Overflow

    On occasion, circumstances require us to do the following: from keras.preprocessing.text import Tokenizer; tokenizer = Tokenizer(num_words=my_max) Then, invariably, we chant this mantra: …

  2. How to do Tokenizer Batch processing? - HuggingFace

    Jun 7, 2023 · In the Tokenizer documentation from Hugging Face, the call function accepts List[List[str]] and says: text (str, List[str], List[List[str]], optional) — The sequence or batch of …

  3. Looking for a clear definition of what a "tokenizer", "parser" and ...

    Mar 28, 2018 · A tokenizer breaks a stream of text into tokens, usually by looking for whitespace (tabs, spaces, new lines). A lexer is basically a tokenizer, but it usually attaches extra context …

  4. OpenAI API: How do I count tokens before(!) I send an API request?

    Mar 21, 2023 · How do I count tokens before (!) I send an API request? As stated in the official OpenAI article: To further explore tokenization, you can use our interactive Tokenizer tool, …

  5. How to add new tokens to an existing Huggingface tokenizer?

    May 8, 2023 · And then it points to the train_new_from_iterator() function in Chapter 7 but I can't seem to find reference to how to use it to extend the tokenizer without re-training it.

  6. Huggingface saving tokenizer - Stack Overflow

    Oct 27, 2020 · Make sure that: - './models/tokenizer/' is a correct model identifier listed on 'https://huggingface.co/models' - or './models/tokenizer/' is the correct path to a directory …

  7. How to save a tokenizer after training it? - Stack Overflow

    Aug 12, 2021 · I have just followed this tutorial on how to train my own tokenizer. Now, from training my tokenizer, I have wrapped it inside a Transformers object, so that I can use it with …

  8. python - AutoTokenizer.from_pretrained fails to load locally saved ...

    from transformers import AutoTokenizer, AutoConfig; tokenizer = AutoTokenizer.from_pretrained('distilroberta-base'); config = …

  9. huggingface transformers - TypeError in SFTTrainer: Unexpected …

    Mar 18, 2025 · I'm trying to fine-tune a model using SFTTrainer from trl, but I'm facing multiple TypeError issues related to unexpected keyword arguments. from transformers import ...

  10. How to load the saved tokenizer from pretrained model

    Oct 16, 2019 · I fine-tuned a pretrained BERT model in PyTorch using Hugging Face Transformers. All the training/validation is done on a GPU in the cloud. At the end of the training, I save the model …
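
The definition quoted in result 3 (a tokenizer breaks a stream of text into tokens, usually by looking for whitespace) can be illustrated with a minimal sketch. The function name `whitespace_tokenize` is hypothetical and not taken from any of the listed pages; the Keras and Hugging Face tokenizers named in the other results do considerably more than this.

```python
def whitespace_tokenize(text: str) -> list[str]:
    """Split text into tokens on runs of whitespace (spaces, tabs, newlines)."""
    # str.split() with no arguments splits on any run of whitespace and
    # discards empty strings, so leading and trailing whitespace is ignored.
    return text.split()


if __name__ == "__main__":
    print(whitespace_tokenize("A tokenizer\tbreaks a stream\nof text"))
```

This only covers the whitespace-splitting case described in that snippet; punctuation stays attached to the neighboring word.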