huggingface

Module Contents

Classes

HFTransformer

Model wrapper around HuggingFace general models.

HFTransformerCasualLM

Model wrapper around HuggingFace general models.

class lagent.llms.huggingface.HFTransformer(path, tokenizer_path=None, tokenizer_kwargs=dict(), tokenizer_only=False, model_kwargs=dict(device_map='auto'), meta_template=None, **kwargs)

Bases: lagent.llms.base_llm.BaseModel

Model wrapper around HuggingFace general models.

Adapted from InternLM (https://github.com/InternLM/InternLM/blob/main/chat/web_demo.py)

Parameters:
  • path (str) – The name or path to HuggingFace’s model.

  • max_seq_len (int) – The maximum length of the input sequence. Defaults to 2048. Not listed in the signature above; presumably accepted through **kwargs.

  • tokenizer_path (str) – The path to the tokenizer. Defaults to None.

  • tokenizer_kwargs (dict) – Keyword arguments for the tokenizer. Defaults to {}.

  • tokenizer_only (bool) – If True, only the tokenizer will be initialized. Defaults to False.

  • model_kwargs (dict) – Keyword arguments for the model, used by the loader. Defaults to dict(device_map='auto').

  • meta_template (Dict, optional) – The model’s meta prompt template, if one is needed to inject or wrap meta instructions. Defaults to None.
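A minimal sketch of how the constructor arguments listed above might be assembled. The helper names `build_hf_config` and `load_model`, the placeholder model path, and the `padding_side` tokenizer option are illustrative assumptions, and passing `max_seq_len` as an extra keyword is inferred from the parameter list rather than from the signature. The lagent import is deferred so the configuration can be inspected without loading any weights.

```python
def build_hf_config(path, **overrides):
    """Collect keyword arguments for HFTransformer from the documented defaults."""
    config = dict(
        path=path,
        tokenizer_path=None,                    # documented default
        tokenizer_kwargs={},                    # documented default
        tokenizer_only=False,                   # documented default
        model_kwargs=dict(device_map="auto"),   # documented default
        meta_template=None,                     # documented default
    )
    config.update(overrides)
    return config


def load_model(config):
    """Instantiate the wrapper; requires lagent, transformers and the weights."""
    from lagent.llms.huggingface import HFTransformer  # deferred import
    return HFTransformer(**config)


config = build_hf_config(
    "path/to/model",                             # placeholder, not a real checkpoint
    tokenizer_kwargs=dict(padding_side="left"),  # assumed tokenizer option
    max_seq_len=2048,                            # assumed to travel via **kwargs
)
print(config)
```

The helper builds fresh dictionaries on every call, which avoids sharing the mutable `dict()` defaults that appear in the signature. Calling `load_model(config)` would then construct the wrapper, provided the path points at an actual model.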

tokenize(inputs)

Tokenize the input prompts.

Parameters:
  • inputs (str) – The user’s prompt. The docstring, which calls this parameter prompts, also allows a batch of prompts (List[str]).

Returns:

The prompt’s token IDs, the length of the IDs, and the requested output length.

Return type:

Tuple(numpy.ndarray, numpy.ndarray, numpy.ndarray)

generate(inputs, do_sample=True, **kwargs)

Return the chat completions in non-stream mode.

Parameters:
  • inputs (Union[str, List[str]]) – input texts to be completed.

  • do_sample (bool) – Whether to sample during generation. Defaults to True.

Returns:

A text/chat completion, or a list of completions when the input is batched.
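The return description suggests that a single string yields a single completion, while a list of strings yields a list. The sketch below illustrates that convention with a hypothetical stand-in, `fake_generate`, which upper-cases its input instead of calling a model. It is not lagent's implementation, only a picture of the input and output shapes.

```python
from typing import List, Union


def fake_generate(inputs: Union[str, List[str]],
                  do_sample: bool = True,
                  **kwargs) -> Union[str, List[str]]:
    """Hypothetical stand-in that mirrors the documented call shape."""
    batched = not isinstance(inputs, str)
    prompts = list(inputs) if batched else [inputs]
    # A real model would decode here; upper-casing keeps the sketch runnable.
    completions = [prompt.upper() for prompt in prompts]
    # Return the same shape that was passed in.
    return completions if batched else completions[0]


single = fake_generate("hello")
batch = fake_generate(["hello", "world"], do_sample=False)
print(single)
print(batch)
```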

stream_generate(inputs, do_sample=True, **kwargs)

Return the chat completions in stream mode.

Parameters:
  • inputs (Union[str, List[str]]) – input texts to be completed.

  • do_sample (bool) – Whether to sample during generation. Defaults to True.

Returns:

The status, the text/chat completion, and the number of generated tokens.

Return type:

tuple(Status, str, int)
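A sketch of one way to consume a stream of (status, text, token count) tuples like the one described above. The `Status` enum, its member names, and `fake_stream_generate` are hypothetical stand-ins so the loop can run without a model. The sketch also assumes each yielded text is the cumulative completion so far, which the documentation does not state.

```python
from enum import Enum
from typing import Iterable, Iterator, Tuple


class Status(Enum):
    """Hypothetical stand-in for the status type in the return tuple."""
    STREAMING = 0
    END = 1


def fake_stream_generate(prompt: str) -> Iterator[Tuple[Status, str, int]]:
    """Yield a growing completion one word at a time, then a final END item."""
    words = prompt.split()
    for i in range(1, len(words) + 1):
        yield Status.STREAMING, " ".join(words[:i]), i
    yield Status.END, " ".join(words), len(words)


def collect(stream: Iterable[Tuple[Status, str, int]]) -> Tuple[str, int]:
    """Keep the latest text and token count until the stream signals END."""
    text, count = "", 0
    for status, text, count in stream:
        if status is Status.END:
            break
    return text, count


final_text, n_tokens = collect(fake_stream_generate("streamed one word at a time"))
print(final_text, n_tokens)
```

If the real stream yields incremental chunks rather than cumulative text, the loop would need to concatenate the pieces instead of keeping only the latest one.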

class lagent.llms.huggingface.HFTransformerCasualLM(path, tokenizer_path=None, tokenizer_kwargs=dict(), tokenizer_only=False, model_kwargs=dict(device_map='auto'), meta_template=None, **kwargs)

Bases: HFTransformer

Model wrapper around HuggingFace general models.

Adapted from InternLM (https://github.com/InternLM/InternLM/blob/main/chat/web_demo.py)

Parameters:
  • path (str) – The name or path to HuggingFace’s model.

  • max_seq_len (int) – The maximum length of the input sequence. Defaults to 2048. Not listed in the signature above; presumably accepted through **kwargs.

  • tokenizer_path (str) – The path to the tokenizer. Defaults to None.

  • tokenizer_kwargs (dict) – Keyword arguments for the tokenizer. Defaults to {}.

  • tokenizer_only (bool) – If True, only the tokenizer will be initialized. Defaults to False.

  • model_kwargs (dict) – Keyword arguments for the model, used by the loader. Defaults to dict(device_map='auto').

  • meta_template (Dict, optional) – The model’s meta prompt template, if one is needed to inject or wrap meta instructions. Defaults to None.