# LLMs
WalledEval's LLM architecture aims to support a wide range of LLMs. These LLMs are used as systems-under-test (SUTs), generating answers to benchmark questions and outputs for arbitrary prompts. Below is a list of model families we attempt to support.
| Model Family | Supported Versions | WalledEval Class |
| --- | --- | --- |
| GPT | 3.5 Turbo, 4, 4 Turbo, 4o | `llm.OpenAI` |
| Claude | Sonnet 3.5, Opus 3, Sonnet 3, Haiku 3 | `llm.Claude` |
| Gemini | 1.5 Flash, 1.5 Pro, 1.0 Pro | `llm.Gemini` |
| Cohere Command | R+, R, Base, Light | `llm.Cohere` |
We also support a large variety of connectors to other major LLM runtimes, such as HuggingFace and TogetherAI. Below is a list of some of the connectors present in WalledEval.
| Connector | Connector Type | WalledEval Class |
| --- | --- | --- |
| HuggingFace | Local, runs LLM on computer | `llm.HF_LLM` |
| llama.cpp | Local, runs LLM on computer | `llm.Llama` |
| Together | Online, makes API calls | `llm.Together` |
| Groq | Online, makes API calls | `llm.Groq` |
| Anyscale | Online, makes API calls | `llm.Anyscale` |
| OctoAI | Online, makes API calls | `llm.OctoAI` |
| Azure OpenAI | Online, makes API calls | `llm.AzureOpenAI` |
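The online connectors wrap the providers' hosted APIs, so only their construction differs; generation calls work the same way as for local models. As a rough sketch, assuming the `Together` connector is constructed from a provider model ID and an API key (both argument names here are assumptions, so check the API reference for the exact signature):

```python
import os

from walledeval.llm import Together

# Assumption: the connector takes a Together model ID and an API key;
# the real signature may instead read the key from the environment.
llm = Together(
    "meta-llama/Llama-3-8b-chat-hf",
    api_key=os.environ["TOGETHER_API_KEY"],
)
```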
The `HF_LLM` class is an example of an LLM class that loads models from HuggingFace. Here, we load Unsloth's 4-bit-quantized Llama 3 8B model, as shown below. The `type` parameter indicates that we are loading an instruction-tuned model, and inference is run based on that information. This matters because we do not want the model to autocomplete the prompt text, but instead to generate chat responses to the prompt.
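A minimal sketch of this load, assuming the constructor accepts a HuggingFace model ID, a `type` flag, and pass-through HuggingFace keyword arguments such as `device_map` (the parameter names here reflect our reading of the API and may differ in your version):

```python
from walledeval.llm import HF_LLM

# Load Unsloth's 4-bit-quantized Llama 3 8B Instruct model from HuggingFace.
llama8b = HF_LLM(
    "unsloth/llama-3-8b-Instruct-bnb-4bit",
    type=1,  # assumption: 1 marks an instruction-tuned (chat) model
    device_map="auto",  # passed through to HuggingFace for weight placement
)
```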
We can then prompt this LLM using the `chat` method. As an example, we try to get it to respond the way a Swiftie would.
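A minimal sketch of such a call, assuming `chat` accepts an OpenAI-style list of message dictionaries and returns the generated text (the system prompt and question are illustrative):

```python
response = llama8b.chat([
    {
        "role": "system",
        "content": (
            "You are a Swiftie - a diehard Taylor Swift fan. You have "
            "practically memorised the lyrics to most of her hits. "
            "Respond to every question the way a Swiftie would."
        ),
    },
    {"role": "user", "content": "What is your favourite Taylor Swift album?"},
])
print(response)
```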
WalledEval attempts