
LLMs

WalledEval's LLM architecture aims to support a wide variety of LLMs. These LLMs serve as systems-under-test (SUTs): the models that WalledEval prompts to generate answers to benchmark questions and other prompt outputs. Below is a list of the model families we attempt to support.

| Model Family | Supported Versions | WalledEval Class |
| --- | --- | --- |
| GPT | 3.5 Turbo, 4, 4 Turbo, 4o | `llm.OpenAI` |
| Claude | Sonnet 3.5, Opus 3, Sonnet 3, Haiku 3 | `llm.Claude` |
| Gemini | 1.5 Flash, 1.5 Pro, 1.0 Pro | `llm.Gemini` |
| Cohere | Command R+, R, Base, Light | `llm.Cohere` |

We also support a wide range of connectors to other major LLM runtimes, such as HuggingFace and TogetherAI. Below is a list of some of the connectors available in WalledEval.

| Connector | Connector Type | WalledEval Class |
| --- | --- | --- |
| HuggingFace | Local, runs LLM on computer | `llm.HF_LLM` |
| llama.cpp | Local, runs LLM on computer | `llm.Llama` |
| Together | Online, makes API calls | `llm.Together` |
| Groq | Online, makes API calls | `llm.Groq` |
| Anyscale | Online, makes API calls | `llm.Anyscale` |
| OctoAI | Online, makes API calls | `llm.OctoAI` |
| Azure OpenAI | Online, makes API calls | `llm.AzureOpenAI` |

`HF_LLM` is an example of an LLM class that loads models from HuggingFace. Here, we load Unsloth's 4-bit-quantized Llama 3 8B model. The `type` parameter indicates that we are loading an instruction-tuned model, so inference is run with the model's chat template. This matters because we want the model to complete chat responses to the prompt, not merely autocomplete the prompt text.
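The following is a minimal sketch rather than verbatim library code: we assume the constructor takes the HuggingFace model ID, an integer `type` flag (with `1` marking an instruction-tuned model), and the standard HuggingFace `device_map` argument.

```python
from walledeval.llm import HF_LLM

# Load Unsloth's 4-bit-quantized Llama 3 8B Instruct model from HuggingFace.
# type=1 flags the model as instruction-tuned (assumption), so WalledEval
# wraps prompts in the model's chat template instead of treating them as
# raw text to autocomplete.
llm = HF_LLM(
    "unsloth/llama-3-8b-Instruct-bnb-4bit",
    type=1,
    device_map="auto",  # let HuggingFace place weights on available devices
)
```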

We can then prompt this LLM using the `chat` method; in the example below, we try to get it to respond the way a Swiftie would.
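The message contents here are hypothetical, and we assume `chat` accepts an OpenAI-style list of role/content messages and returns the generated text.

```python
# Ask the model to answer in character as a Swiftie (hypothetical prompts;
# OpenAI-style message format and string return value are assumptions).
response = llm.chat([
    {
        "role": "system",
        "content": (
            "You are a Swiftie - a diehard Taylor Swift fan. You love "
            "her songs and have practically memorised the lyrics of "
            "most of her hits."
        ),
    },
    {"role": "user", "content": "What do you think of Taylor Swift's latest album?"},
])
print(response)
```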

WalledEval attempts