OpenAI

AI-Mocks OpenAI is a mock server for the OpenAI API, built on Mokksy.

MockOpenai is tested against the official openai-java SDK and two popular JVM AI frameworks, LangChain4j and Spring AI.

Currently, it supports:

Chat Completions (streaming and non-streaming)
Embeddings
Moderations

Quick Start

Include the library in your test dependencies (Maven or Gradle).

build.gradle.kts
testImplementation("dev.mokksy.aimocks:ai-mocks-openai-jvm:$latestVersion")

pom.xml
<dependency>
  <groupId>dev.mokksy.aimocks</groupId>
  <artifactId>ai-mocks-openai-jvm</artifactId>
  <version>[LATEST_VERSION]</version>
  <scope>test</scope>
</dependency>

Chat Completions API

Set up a mock server and define mock responses:

val openai = MockOpenai(verbose = true)

Next, simulate the OpenAI Chat Completions API:

// Define mock response
openai.completion {
  temperature = 0.7
  seed = 42
  model = "gpt-4o-mini"
  maxTokens = 100
  topP = 0.95
  systemMessageContains("helpful assistant")
  userMessageContains("say 'Hello!'")
} responds {
  assistantContent = "Hello"
  finishReason = "stop"
  delay = 200.milliseconds // delay before answer
}

// OpenAI client setup
val client: OpenAIClient =
  OpenAIOkHttpClient
    .builder()
    .apiKey("dummy-api-key")
    .baseUrl(openai.baseUrl()) // connect to mock OpenAI
    .responseValidation(true)
    .build()

// Use the mock endpoint
val params =
  ChatCompletionCreateParams
    .builder()
    .temperature(0.7)
    .maxCompletionTokens(100)
    .topP(0.95)
    .messages(
      listOf(
        ChatCompletionMessageParam.ofSystem(
          ChatCompletionSystemMessageParam
            .builder()
            .content(
              "You are a helpful assistant.",
            ).build(),
        ),
        ChatCompletionMessageParam.ofUser(
          ChatCompletionUserMessageParam
            .builder()
            .content("Just say 'Hello!' and nothing else")
            .build(),
        ),
      ),
    ).model("gpt-4o-mini")
    .build()

val result: ChatCompletion =
  client
    .chat()
    .completions()
    .create(params)

println(result)

Mocking Negative Scenarios

AI-Mocks can also simulate negative scenarios, such as error responses and delayed answers.

Custom Error Response

openai.completion {
  temperature = 0.7
  seed = 42
  model = "gpt-4o-mini"
  maxTokens = 100
  systemMessageContains("helpful assistant")
  userMessageContains("say 'Hello!'")
}.respondsError(String::class) {
  body =
    // language=json
    """
    {
      "type": "error",
      "code": "ERR_SOMETHING",
      "message": "Arrr, blast me barnacles! This be not what ye expect! 🏴‍☠️",
      "param": null
    }
    """.trimIndent()
  contentType = ContentType.Text.Plain
  delay = 100.milliseconds
  httpStatus = HttpStatusCode.PreconditionFailed
}

OpenAI-Compatible Error Response

openai.completion {
  temperature = 0.7
  seed = 42
  model = "gpt-4o-mini"
  maxTokens = 100
  systemMessageContains("helpful assistant")
  userMessageContains("say 'Hello!'")
}.respondsError(String::class) {
  body =
    // language=json
    """
    {
      "error": {
        "type": "server_error",
        "code": "ERR_SOMETHING",
        "message": "Arrr, blast me barnacles! This be not what ye expect! 🏴‍☠️",
        "param": "foo"
      }
    }
    """.trimIndent()
  delay = 150.milliseconds
  contentType = ContentType.Application.Json
  httpStatus = HttpStatusCode.InternalServerError
}

Integration with LangChain4j

You can also use the LangChain4j Kotlin extensions:

val model: OpenAiChatModel =
  OpenAiChatModel
    .builder()
    .apiKey("dummy-api-key")
    .baseUrl(openai.baseUrl())
    .build()

val result =
  model.chat {
    parameters =
      OpenAiChatRequestParameters
        .builder()
        .temperature(0.7)
        .modelName("gpt-4o-mini")
        .maxCompletionTokens(100)
        .topP(0.95)
        .seed(42)
        .build()
    messages += userMessage("Say Hello")
  }

println(result)

Stream Responses

Streaming responses can be mocked either from a list of chunks or from a Kotlin Flow.

Streaming with List of Chunks

openai.completion {
  temperature = 0.7
  model = "gpt-4o-mini"
  topP = 0.95
} respondsStream {
  responseChunks = listOf("All", " we", " need", " is", " Love")
  delay = 50.milliseconds
  delayBetweenChunks = 10.milliseconds
  finishReason = "stop"
}

// Create OpenAI client
val client: OpenAIClient =
  OpenAIOkHttpClient
    .builder()
    .apiKey("dummy-key")
    .baseUrl(openai.baseUrl())
    .build()

// Make streaming request
val params =
  ChatCompletionCreateParams
    .builder()
    .temperature(0.7)
    .topP(0.95)
    .messages(
      listOf(
        ChatCompletionMessageParam.ofUser(
          ChatCompletionUserMessageParam
            .builder()
            .content("What do we need?")
            .build(),
        ),
      ),
    ).model("gpt-4o-mini")
    .build()

val result = StringBuilder()
client
  .chat()
  .completions()
  .createStreaming(params)
  .use { response ->
    response
      .stream()
      .flatMap { it.choices().stream() }
      .flatMap { it.delta().content().stream() }
      .forEach { result.append(it) }
  }

// Result: "All we need is Love"
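When the expected answer is already known as a single string, the chunk list can be derived from it instead of being typed by hand. The sketch below is one possible way to do that. `toWordChunks` is a hypothetical test helper, not part of AI-Mocks; it keeps the whitespace in front of each word, so for text without trailing whitespace the chunks concatenate back to the original string.

```kotlin
// Hypothetical test helper, not part of AI-Mocks: splits text into word-sized
// chunks. Each chunk keeps the whitespace that preceded its word, so joining
// the chunks restores the input (trailing whitespace, if any, is dropped).
fun toWordChunks(text: String): List<String> =
    Regex("""\s*\S+""").findAll(text).map { it.value }.toList()

fun main() {
    val chunks = toWordChunks("All we need is Love")
    println(chunks.size)             // 5
    println(chunks.joinToString("")) // All we need is Love
}
```

The resulting list could then be assigned to `responseChunks` in a `respondsStream` block like the one above.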

Streaming with Kotlin Flow

openai.completion {
  temperature = 0.7
  model = "gpt-4o-mini"
} respondsStream {
  responseFlow =
    flow {
      emit("All")
      emit(" we")
      emit(" need")
      emit(" is")
      emit(" Love")
    }
  delay = 60.milliseconds
  delayBetweenChunks = 15.milliseconds
  finishReason = "stop"
}

Integration with Spring-AI

To test Spring-AI integration:

// Create mock server
val openai = MockOpenai(verbose = true)

// Create Spring-AI client
val chatClient =
  ChatClient
    .builder(
      org.springframework.ai.openai.OpenAiChatModel
        .builder()
        .openAiApi(
          OpenAiApi
            .builder()
            .apiKey("demo-key")
            .baseUrl(openai.baseUrl())
            .build(),
        ).build(),
    ).build()

// Set up a mock for the LLM call
openai.completion {
  temperature = 0.7
  seed = 42
  model = "gpt-4o-mini"
  maxTokens = 100
  topP = 0.95
  topK = 40
  systemMessageContains("helpful pirate")
  userMessageContains("say 'Hello!'")
} responds {
  assistantContent = "Ahoy there, matey! Hello!"
  finishReason = "stop"
  delay = 200.milliseconds
}

// Configure Spring-AI client call
val response =
  chatClient
    .prompt()
    .system("You are a helpful pirate")
    .user("Just say 'Hello!'")
    .options<OpenAiChatOptions>(
      OpenAiChatOptions
        .builder()
        .maxCompletionTokens(100)
        .temperature(0.7)
        .topP(0.95)
        .model("gpt-4o-mini")
        .seed(42)
        .build(),
    )
    // Make a call
    .call()
    .chatResponse()

// Verify the response
response?.result shouldNotBe null
response?.result?.apply {
  metadata.finishReason shouldBe "STOP"
  output.text shouldBe "Ahoy there, matey! Hello!"
}

See the integration tests for more examples.

Embeddings API

Mock the OpenAI Embeddings API to test your embeddings generation:

Basic Embedding Response

// Set up mock server
val openai = MockOpenai(verbose = true)

// Define mock response for embedding request
openai.embeddings {
    model = "text-embedding-3-small"
    inputContains("Hello")
    stringInput("Hello world")
} responds {
    delay = 200.milliseconds
    embeddings(
        listOf(0.1f, 0.2f, 0.3f)
    )
}

// Create OpenAI client
val client: OpenAIClient =
    OpenAIOkHttpClient
        .builder()
        .apiKey("dummy-key")
        .baseUrl(openai.baseUrl())
        .responseValidation(true)
        .build()

// Make embedding request
val params = EmbeddingCreateParams
    .builder()
    .model("text-embedding-3-small")
    .input(EmbeddingCreateParams.Input.ofString("Hello world"))
    .build()

val result = client
    .embeddings()
    .create(params)

// Verify results
result.model() // "text-embedding-3-small"
result.data()[0].embedding() // [0.1, 0.2, 0.3]
result.data()[0].index() // 0

Multiple Embeddings

You can mock multiple embeddings for batch input:

openai.embeddings {
    model = "text-embedding-3-small"
    stringListInput(listOf("Hello", "world"))
} responds {
    delay = 100.milliseconds
    embeddings(
        listOf(0.1f, 0.2f, 0.3f),
        listOf(0.4f, 0.5f, 0.6f)
    )
}

val params = EmbeddingCreateParams
    .builder()
    .model("text-embedding-3-small")
    .input(EmbeddingCreateParams.Input.ofArrayOfStrings(listOf("Hello", "world")))
    .build()

val result = client
    .embeddings()
    .create(params)

// Returns 2 embeddings
result.data().size // 2
result.data()[0].embedding() // [0.1, 0.2, 0.3]
result.data()[1].embedding() // [0.4, 0.5, 0.6]
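For larger batches, writing vectors by hand gets tedious. A minimal sketch of one alternative is to derive a repeatable pseudo-random vector from each input string. `fakeEmbedding` and its `dimensions` parameter are hypothetical names introduced here for illustration; they are not part of AI-Mocks, and the values carry no semantic meaning.

```kotlin
import kotlin.random.Random

// Hypothetical test helper, not part of AI-Mocks: builds a pseudo-random
// vector seeded by the text's hash code, so the same text and dimension
// yield the same vector on every call. Values lie in [0, 1).
fun fakeEmbedding(text: String, dimensions: Int = 3): List<Float> {
    require(dimensions > 0) { "dimensions must be positive" }
    val random = Random(text.hashCode())
    return List(dimensions) { random.nextFloat() }
}

fun main() {
    val hello = fakeEmbedding("Hello")
    println(hello == fakeEmbedding("Hello")) // true
    println(fakeEmbedding("world", 5).size)  // 5
}
```

Assuming `embeddings(...)` accepts one list of floats per input, as in the example above, the mock could be fed `fakeEmbedding("Hello")` and `fakeEmbedding("world")`, and the test could assert against the same calls.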

Advanced Input Matching

You can use inputContains() to match requests where the input contains specific substrings:

openai.embeddings {
    model = "text-embedding-3-small"
    inputContains("Hello")
    inputContains("world")
    stringInput("Hello world")
} responds {
    embeddings(listOf(0.1f, 0.2f, 0.3f))
}
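The text above does not say how several `inputContains()` conditions combine. The sketch below models the likely reading, that every listed substring must be present (logical AND). This is an assumption, not AI-Mocks code, and `matchesAllSubstrings` is a hypothetical name used only for illustration.

```kotlin
// Illustrative model only, not AI-Mocks code: the input is treated as a match
// when it contains every required substring (logical AND is assumed).
// String.contains is case-sensitive by default.
fun matchesAllSubstrings(input: String, required: List<String>): Boolean =
    required.all { input.contains(it) }

fun main() {
    val required = listOf("Hello", "world")
    println(matchesAllSubstrings("Hello world", required)) // true
    println(matchesAllSubstrings("Hello there", required)) // false
}
```

If a stub unexpectedly fails to match, checking each substring against the actual request input in this way is a quick first diagnostic.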

Error Scenarios

Test error handling for embeddings:

openai.embeddings {
    model = "text-embedding-3-small"
    stringInput("boom")
}.respondsError(String::class) {
    body = "Kaboom!"
    contentType = ContentType.Text.Plain
    httpStatus = HttpStatusCode.BadRequest
    delay = 200.milliseconds
}

// The input matches the mock above, so this call throws BadRequestException
val params = EmbeddingCreateParams
    .builder()
    .model("text-embedding-3-small")
    .input(EmbeddingCreateParams.Input.ofString("boom"))
    .build()

try {
    client.embeddings().create(params)
} catch (e: BadRequestException) {
    // Handle error
}

Moderations API

Mock the OpenAI Moderations API to test content moderation:

Basic Moderation Response

// Set up mock server
val openai = MockOpenai(verbose = true)

// Define mock response for moderation request
openai.moderation {
    model = "omni-moderation-latest"
    inputContains("Hello world")
} responds {
    flagged = true
    delay = 200.milliseconds
    category(name = "harassment", score = 0.1, inputTypes = listOf(TEXT))
    category(
        name = ModerationCategory.SEXUAL,
        score = 0.2,
        inputTypes = listOf(TEXT, InputType.IMAGE)
    )
}

// Create OpenAI client
val client: OpenAIClient =
    OpenAIOkHttpClient
        .builder()
        .apiKey("dummy-key")
        .baseUrl(openai.baseUrl())
        .responseValidation(true)
        .build()

// Make moderation request
val params =
    ModerationCreateParams
        .builder()
        .model("omni-moderation-latest")
        .input("Hello world")
        .build()

val result = client
    .moderations()
    .create(params)

// Verify results
result.model() // "omni-moderation-latest"
result.results()[0].flagged() // true
result.results()[0].categories().harassment() // true
result.results()[0].categoryScores().harassment() // 0.1
result.results()[0].categoryAppliedInputTypes().harassment() // [TEXT]

Moderation Error Scenarios

openai.moderation {
    model = "omni-moderation-latest"
    inputContains("boom")
}.respondsError(String::class) {
    body = "Kaboom!"
    contentType = ContentType.Text.Plain
    httpStatus = HttpStatusCode.BadRequest
    delay = 200.milliseconds
}

// This will throw BadRequestException
val params = ModerationCreateParams
    .builder()
    .model("omni-moderation-latest")
    .input("boom")
    .build()

try {
    client.moderations().create(params)
} catch (e: BadRequestException) {
    // Handle error
}