`huggingface`

Module Contents

Classes

`HFTransformer`	Model wrapper around HuggingFace general models.
`HFTransformerCasualLM`	Model wrapper around HuggingFace general models.

class lagent.llms.huggingface.HFTransformer(path, tokenizer_path=None, tokenizer_kwargs=dict(), tokenizer_only=False, model_kwargs=dict(device_map='auto'), meta_template=None, **kwargs)

Bases: lagent.llms.base_llm.BaseModel

Model wrapper around HuggingFace general models.

Adapted from InternLM (https://github.com/InternLM/InternLM/blob/main/chat/web_demo.py)

Parameters:

path (str) – The name or path to HuggingFace’s model.
max_seq_len (int) – The maximum length of the input sequence. Defaults to 2048.
tokenizer_path (str) – The path to the tokenizer. Defaults to None.
tokenizer_kwargs (dict) – Keyword arguments for the tokenizer. Defaults to {}.
tokenizer_only (bool) – If True, only the tokenizer will be initialized. Defaults to False.
model_kwargs (dict) – Keyword arguments for the model, used by the loader. Defaults to dict(device_map='auto').
meta_template (Dict, optional) – The model’s meta prompt template, if needed, for cases that require injecting or wrapping meta instructions.
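
A minimal construction sketch, assuming `lagent` and its HuggingFace dependencies are installed. The model name and the `trust_remote_code` tokenizer option are illustrative assumptions, not values taken from this page. The import is deferred into the function so that defining it does not require `lagent` or load any weights.

```python
# Keyword arguments mirroring the documented constructor signature.
# The model path and trust_remote_code are hypothetical example values.
HF_CONFIG = dict(
    path="internlm/internlm-chat-7b",
    tokenizer_path=None,  # documented default
    tokenizer_kwargs=dict(trust_remote_code=True),
    tokenizer_only=False,  # set True to skip loading model weights
    model_kwargs=dict(device_map="auto"),  # documented default
    meta_template=None,
)


def load_model(config=HF_CONFIG):
    """Instantiate the wrapper; the lazy import keeps this sketch importable without lagent."""
    from lagent.llms.huggingface import HFTransformer

    return HFTransformer(**config)
```

Calling `load_model()` downloads or loads the model named in `path`, so it is best run on a machine with enough memory for the chosen checkpoint.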

tokenize(inputs)

Tokenize the input prompts.

Parameters:

inputs (str | List[str]) – The user’s prompt, or a batch of prompts.

Returns:

prompt’s token ids, ids’ length and requested output length

Return type:

Tuple(numpy.ndarray, numpy.ndarray, numpy.ndarray)

generate(inputs, do_sample=True, **kwargs)

Return the chat completions in non-stream mode.

Parameters:

inputs (Union[str, List[str]]) – input texts to be completed.
do_sample (bool) – Whether to sample during generation. Defaults to True.

Returns:

A text/chat completion, or a list of completions for batched input.
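
The input and return shapes described above can be read as "a string in gives a string out, a list in gives a list out". The sketch below illustrates that reading with a hypothetical stand-in, `fake_generate`, which echoes prompts instead of calling a model; it is not the library's implementation.

```python
from typing import List, Union


def fake_generate(
    inputs: Union[str, List[str]], do_sample: bool = True, **kwargs
) -> Union[str, List[str]]:
    """Hypothetical stand-in for `generate`: echoes prompts rather than running a model."""
    batched = not isinstance(inputs, str)
    prompts = list(inputs) if batched else [inputs]
    completions = [f"<completion for: {p}>" for p in prompts]
    # Mirror the input shape: a list for batched input, a single string otherwise.
    return completions if batched else completions[0]


single = fake_generate("Hello")
batch = fake_generate(["Hello", "World"])
```

With a real wrapper instance, the assumed equivalent would be `model.generate("Hello")` and `model.generate(["Hello", "World"])`.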

stream_generate(inputs, do_sample=True, **kwargs)

Return the chat completions in stream mode.

Parameters:

inputs (Union[str, List[str]]) – input texts to be completed.
do_sample (bool) – Whether to sample during generation. Defaults to True.

Returns:

status, text/chat completion, generated token number

Return type:

tuple(Status, str, int)
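
A sketch of how a caller might consume the streamed `(status, text, token count)` tuples. The `Status` enum, its member names, and `fake_stream_generate` are hypothetical stand-ins defined here so the example runs without a model; the sketch also assumes each yielded text is the completion so far rather than only the newest piece.

```python
from enum import Enum
from typing import Iterator, Tuple


class Status(Enum):
    """Hypothetical stand-in for the status type; member names are assumptions."""

    STREAMING = 0
    END = 1


def fake_stream_generate(prompt: str) -> Iterator[Tuple[Status, str, int]]:
    """Stand-in generator: yields the text produced so far after each token."""
    tokens = ["Hello", ",", " world"]
    text = ""
    for count, token in enumerate(tokens, start=1):
        text += token
        status = Status.END if count == len(tokens) else Status.STREAMING
        yield status, text, count


def collect(stream: Iterator[Tuple[Status, str, int]]) -> Tuple[str, int]:
    """Drain the stream, keeping the latest text and token count."""
    final_text, total = "", 0
    for status, text, n_tokens in stream:
        final_text, total = text, n_tokens
        if status is Status.END:
            break
    return final_text, total


final_text, total = collect(fake_stream_generate("Hi"))
```

A user interface would typically redraw the partial text on every iteration instead of waiting for the final tuple.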

class lagent.llms.huggingface.HFTransformerCasualLM(path, tokenizer_path=None, tokenizer_kwargs=dict(), tokenizer_only=False, model_kwargs=dict(device_map='auto'), meta_template=None, **kwargs)

Bases: HFTransformer

Model wrapper around HuggingFace general models.

Adapted from InternLM (https://github.com/InternLM/InternLM/blob/main/chat/web_demo.py)

Parameters:

path (str) – The name or path to HuggingFace’s model.
max_seq_len (int) – The maximum length of the input sequence. Defaults to 2048.
tokenizer_path (str) – The path to the tokenizer. Defaults to None.
tokenizer_kwargs (dict) – Keyword arguments for the tokenizer. Defaults to {}.
tokenizer_only (bool) – If True, only the tokenizer will be initialized. Defaults to False.
model_kwargs (dict) – Keyword arguments for the model, used by the loader. Defaults to dict(device_map='auto').
meta_template (Dict, optional) – The model’s meta prompt template, if needed, for cases that require injecting or wrapping meta instructions.
