huggingface
Module Contents
Classes
HFTransformer | Model wrapper around HuggingFace general models.
HFTransformerCasualLM | Model wrapper around HuggingFace general models.
- class lagent.llms.huggingface.HFTransformer(path, tokenizer_path=None, tokenizer_kwargs=dict(), tokenizer_only=False, model_kwargs=dict(device_map='auto'), meta_template=None, **kwargs)
Bases: lagent.llms.base_llm.BaseModel
Model wrapper around HuggingFace general models.
- Adapted from InternLM (https://github.com/InternLM/InternLM/blob/main/chat/web_demo.py)
- Parameters:
path (str) – The name or path to HuggingFace’s model.
max_seq_len (int) – The maximum length of the input sequence. Defaults to 2048.
tokenizer_path (str) – The path to the tokenizer. Defaults to None.
tokenizer_kwargs (dict) – Keyword arguments for the tokenizer. Defaults to {}.
tokenizer_only (bool) – If True, only the tokenizer will be initialized. Defaults to False.
model_kwargs (dict) – Keyword arguments for the model, used in the loader. Defaults to dict(device_map='auto').
meta_template (Dict, optional) – The model's meta prompt template, if one is needed to inject or wrap meta instructions.
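The constructor arguments above can be assembled as in the following minimal sketch. The helper names `default_hf_kwargs` and `load_hf_transformer` are illustrative assumptions and not part of lagent. The import is deferred so the argument dictionary can be inspected without lagent or any model weights installed.

```python
def default_hf_kwargs(path, tokenizer_path=None, tokenizer_only=False):
    """Collect the documented constructor arguments with their documented defaults.

    Hypothetical helper for illustration; not part of lagent.
    """
    return dict(
        path=path,
        tokenizer_path=tokenizer_path,
        tokenizer_kwargs=dict(),
        tokenizer_only=tokenizer_only,
        model_kwargs=dict(device_map='auto'),
        meta_template=None,
    )


def load_hf_transformer(path, **overrides):
    """Build an HFTransformer; needs lagent installed and loads the model."""
    # Deferred import so this module can be imported without lagent installed.
    from lagent.llms.huggingface import HFTransformer

    kwargs = default_hf_kwargs(path)
    kwargs.update(overrides)  # e.g. tokenizer_only=True
    return HFTransformer(**kwargs)
```

Under these assumptions, passing `tokenizer_only=True` as an override would initialize only the tokenizer, as described in the parameter list.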
- tokenize(inputs)
Tokenize the input prompts.
- Parameters:
inputs (str | List[str]) – The user's prompt, or a batch of prompts.
- Returns:
prompt’s token ids, ids’ length and requested output length
- Return type:
Tuple(numpy.ndarray, numpy.ndarray, numpy.ndarray)
- generate(inputs, do_sample=True, **kwargs)
Return the chat completions in non-stream mode.
- Parameters:
inputs (Union[str, List[str]]) – input texts to be completed.
do_sample (bool) – If True, use sampling during generation. Defaults to True.
- Returns:
The text or chat completion, or a list of completions for batched input.
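Because `generate` accepts either a single string or a list of strings, callers may want one uniform return shape. The sketch below is one way to do that. `complete` and `EchoModel` are hypothetical names introduced here, and it is an assumption that a string input yields a string while a list input yields a list.

```python
from typing import List, Union


def complete(model, inputs: Union[str, List[str]], do_sample: bool = True, **kwargs):
    """Call model.generate and always return a list of completions.

    `model` is any object exposing generate(inputs, do_sample=..., **kwargs).
    Assumption: a str input yields a str and a list input yields a list.
    """
    outputs = model.generate(inputs, do_sample=do_sample, **kwargs)
    if isinstance(outputs, str):
        return [outputs]
    return list(outputs)


class EchoModel:
    """Stand-in for HFTransformer, used only to exercise the helper."""

    def generate(self, inputs, do_sample=True, **kwargs):
        if isinstance(inputs, str):
            return inputs.upper()
        return [text.upper() for text in inputs]
```

With a real model, the same helper would be called as `complete(model, ['prompt one', 'prompt two'], do_sample=False)`.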
- stream_generate(inputs, do_sample=True, **kwargs)
Return the chat completions in stream mode.
- Parameters:
inputs (Union[str, List[str]]) – input texts to be completed.
do_sample (bool) – If True, use sampling during generation. Defaults to True.
- Returns:
The status, the text or chat completion, and the number of generated tokens.
- Return type:
tuple(Status, str, int)
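The documented tuple shape suggests a simple consumption loop, sketched below. The `Status` enum and `fake_stream` generator are hypothetical stand-ins rather than lagent types, and it is an assumption that each yielded text is the completion so far rather than an incremental chunk.

```python
from enum import Enum


class Status(Enum):
    """Hypothetical stand-in for the status type yielded by stream_generate."""
    STREAMING = 1
    END = 0


def consume_stream(stream):
    """Drain an iterable of (status, text, token_count) tuples.

    Assumption: each yielded text is the completion so far, so the last
    tuple carries the full text and the final token count.
    """
    status, text, num_tokens = None, '', 0
    for status, text, num_tokens in stream:
        pass  # a UI would refresh its display with `text` here
    return status, text, num_tokens


def fake_stream():
    """Fake generator mimicking the documented tuple shape."""
    yield Status.STREAMING, 'Hel', 1
    yield Status.STREAMING, 'Hello', 2
    yield Status.END, 'Hello', 2
```

With a real model, `consume_stream(model.stream_generate('some prompt'))` would follow the same pattern. If the stream yields incremental chunks instead, the loop body would need to concatenate them.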
- class lagent.llms.huggingface.HFTransformerCasualLM(path, tokenizer_path=None, tokenizer_kwargs=dict(), tokenizer_only=False, model_kwargs=dict(device_map='auto'), meta_template=None, **kwargs)
Bases: HFTransformer
Model wrapper around HuggingFace general models.
- Adapted from InternLM (https://github.com/InternLM/InternLM/blob/main/chat/web_demo.py)
- Parameters:
path (str) – The name or path to HuggingFace’s model.
max_seq_len (int) – The maximum length of the input sequence. Defaults to 2048.
tokenizer_path (str) – The path to the tokenizer. Defaults to None.
tokenizer_kwargs (dict) – Keyword arguments for the tokenizer. Defaults to {}.
tokenizer_only (bool) – If True, only the tokenizer will be initialized. Defaults to False.
model_kwargs (dict) – Keyword arguments for the model, used in the loader. Defaults to dict(device_map='auto').
meta_template (Dict, optional) – The model's meta prompt template, if one is needed to inject or wrap meta instructions.