Koog
Test Koog applications against AI-Mocks provider endpoints. This guide uses a verified OpenAI-backed Spring Boot example for chat, streaming, moderation, and failure paths.
Koog is an AI framework, so the appropriate AI-Mocks module depends on the provider configured in your application. Use the corresponding AI-Mocks provider guide when Koog is configured for a supported provider.
The verified end-to-end example on this page is the OpenAI-backed pattern from the koog-spring-boot-assistant integration tests. In that setup, Koog talks to an OpenAI-compatible provider, so the integration point is AI-Mocks OpenAI rather than plain Mokksy. The tests start MockOpenai, point Koog at mockOpenai.baseUrl(), and exercise the real Spring Boot application through HTTP and WebSocket clients.
Workflow context from the sample repo
The sample application is not a single prompt-in, prompt-out flow. Its README describes a Koog strategy with moderation, request mapping, streaming LLM output, and tool execution. That context matters because the integration tests stub several provider endpoints, not just one chat response.
---
title: streaming-strategy
---
stateDiagram
    state "moderate-input" as moderate_input
    state "mapStringToRequests" as mapStringToRequests
    state "applyRequestToSession" as applyRequestToSession
    state "nodeStreaming" as nodeStreaming
    state "executeMultipleTools" as executeMultipleTools
    state "mapToolCallsToRequests" as mapToolCallsToRequests

    [*] --> moderate_input : transformed
    moderate_input --> mapStringToRequests : transformed
    moderate_input --> [*] : transformed
    mapStringToRequests --> applyRequestToSession
    applyRequestToSession --> nodeStreaming
    nodeStreaming --> executeMultipleTools : onCondition
    nodeStreaming --> [*] : onCondition
    executeMultipleTools --> mapToolCallsToRequests
    mapToolCallsToRequests --> applyRequestToSession
The same repository also exposes this graph through /api/koog/strategy/graph, and the
integration tests assert that the endpoint returns Mermaid output for the running strategy.
Inject the mock server into Koog
The sample app starts MockOpenai once in the test environment, prepares deterministic embeddings
for RAG ingestion, and then injects the mock base URL into Koog before Spring Boot starts:
import kotlin.time.Duration.Companion.milliseconds
import org.springframework.boot.SpringApplication
// MockOpenai is provided by the AI-Mocks OpenAI module.

object TestEnvironment {
    val mockOpenai = MockOpenai(verbose = true)

    init {
        System.setProperty("OPENAI_API_KEY", "dummyOpenAIKey")
        System.setProperty("spring.profiles.active", "test")

        // Stub one embeddings response per document ingested at startup.
        listOf(
            "Care for Magical Trees",
            "Valley of Light",
            "Magical Bow",
            "Morning Pine Elixir",
            "Teleportation and Portals",
        ).forEach {
            mockOpenai.embeddings {
                inputContains(it)
            } responds {
                delay = 1.milliseconds
            }
        }
    }
}

object Server {
    init {
        // Reading TestEnvironment.mockOpenai initializes TestEnvironment first,
        // so the embedding stubs are registered before the application boots.
        System.setProperty("ai.koog.openai.base-url", TestEnvironment.mockOpenai.baseUrl())

        SpringApplication.run(
            com.example.app.Application::class.java,
            "--server.port=0",
            "--spring.profiles.active=test",
        )
    }
}
This keeps the real Koog and Spring Boot wiring intact while replacing the provider dependency with a deterministic local server.
The sample application also performs embedding requests during startup for RAG ingestion. Those embedding stubs must exist before Spring Boot starts, or the application will make unmatched calls while the test environment is still booting.
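The ordering requirement above relies on how Kotlin initializes `object` declarations: an object is initialized the first time it is accessed, so an `init` block that reads a property of another object forces that other object to finish initializing first. The following is a minimal sketch of that behavior using hypothetical stand-ins (`InitLog`, `FakeTestEnvironment`, `FakeServer`, and a placeholder URL) rather than the real MockOpenai and Spring Boot classes:

```kotlin
// Hypothetical stand-ins for TestEnvironment and Server. They only record
// the order in which Kotlin initializes them.
object InitLog {
    val events = mutableListOf<String>()
}

object FakeTestEnvironment {
    // Placeholder URL; the real value would come from mockOpenai.baseUrl().
    val baseUrl = "http://localhost:0/v1"

    init {
        InitLog.events += "embedding stubs registered"
    }
}

object FakeServer {
    init {
        // Reading a property of FakeTestEnvironment runs its initializer first.
        val url = FakeTestEnvironment.baseUrl
        InitLog.events += "application started against $url"
    }

    // No-op used only to trigger initialization of this object.
    fun start() = Unit
}

fun main() {
    FakeServer.start()
    InitLog.events.forEach(::println)
}
```

Running `main` prints the "embedding stubs registered" line before the "application started" line. One caveat worth keeping in mind: a `const val` is inlined at the call site and does not trigger initialization, so the dependency has to go through a regular property or function call.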
Test the full Koog request path
The positive-path test in the sample repo drives the real application client, not Koog internals. It stubs embeddings, moderation, and the chat completion stream, then verifies the final answer:
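import kotlin.time.Duration.Companion.milliseconds
import kotlinx.coroutines.flow.flowOf

// RAG lookup: the question is embedded before retrieval.
mockOpenai.embeddings {
    stringInput(question)
} responds {
    delay = 40.milliseconds
}

// Moderation passes, so the strategy continues to the LLM call.
mockOpenai.moderation {
    inputContains(question)
} responds {
    flagged = false
}

// Streaming chat completion that emits the expected answer as one chunk.
mockOpenai.completion {
    systemMessageContains("witty and wise Elven assistant guiding adventurers")
    userMessageContains(question)
} respondsStream {
    responseFlow = flowOf(expectedAnswer)
}

val response = chatClient.sendMessage(question)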
That test shape is useful when you want to prove prompt routing, moderation checks, RAG lookups, and provider calls still produce the expected application response.
Stream token-by-token output
The same repo includes a WebSocket integration test that verifies streaming delivery timing. The mock server emits one token chunk at a time with a fixed delay between chunks:
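import kotlin.time.Duration.Companion.milliseconds
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.asFlow
import kotlinx.coroutines.flow.onEach

val delayBetweenChunks = 500.milliseconds

mockOpenai.completion {
    systemMessageContains("witty and wise Elven assistant guiding adventurers")
    userMessageContains(question)
} respondsStream {
    // Emit one token per chunk, pausing before each emission.
    responseFlow =
        expectedTokens
            .asFlow()
            .onEach { delay(delayBetweenChunks) }
}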
The test then measures the WebSocket output and checks that Koog forwards the token stream with the expected pacing. This is the right place to catch buffering mistakes and streaming regressions.
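One way to express such a pacing check is to record when each chunk arrives and compare the gaps between consecutive arrivals with the configured delay. The sketch below is an illustration under assumptions, not the sample repo's actual assertion: the helper names `gapsBetween` and `isPaced`, the 100 ms tolerance, and the sample arrival times are all hypothetical.

```kotlin
import kotlin.time.Duration
import kotlin.time.Duration.Companion.milliseconds

// Gaps between consecutive chunk arrival times, each measured from the start of the request.
fun gapsBetween(arrivals: List<Duration>): List<Duration> =
    arrivals.zipWithNext { earlier, later -> later - earlier }

// True when every gap is within `tolerance` of the expected delay between chunks.
// With fewer than two arrivals there are no gaps, so the check passes trivially.
fun isPaced(arrivals: List<Duration>, expectedGap: Duration, tolerance: Duration): Boolean =
    gapsBetween(arrivals).all { gap -> (gap - expectedGap).absoluteValue <= tolerance }

fun main() {
    val expectedGap = 500.milliseconds
    val tolerance = 100.milliseconds // hypothetical tolerance for scheduling jitter

    // Chunks forwarded as they are produced: gaps stay close to 500 ms.
    val streamed = listOf(500.milliseconds, 1010.milliseconds, 1495.milliseconds)
    // Chunks held back and flushed together: gaps collapse to almost nothing.
    val buffered = listOf(1500.milliseconds, 1501.milliseconds, 1502.milliseconds)

    println(isPaced(streamed, expectedGap, tolerance)) // true
    println(isPaced(buffered, expectedGap, tolerance)) // false
}
```

Checking gaps rather than total duration is what distinguishes real streaming from buffering: a buffered response can take the same total time while delivering every chunk at the end.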
Exercise moderation and failure paths
The sample repo does not stop at happy-path chat. It also verifies:
- moderation blocking with mockOpenai.moderation { ... } responds { flagged = true }
- embedding failures with respondsError { httpStatusCode = ... }
- moderation API failures with fallback behavior
- LLM request failures for both SSE and non-streaming completion paths
For example, the failure test uses provider-like HTTP status codes such as 400, 401, 403,
404, 418, 500, and 503, then verifies that the application returns a stable fallback
message instead of crashing.
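The shape of such a test can be sketched as a loop over status codes that asserts the same fallback each time. Everything named here is hypothetical: `ProviderException`, `FALLBACK_MESSAGE`, and `answerOrFallback` stand in for the application's real error type, fallback text, and request path, which the sample repo exercises through its HTTP client instead.

```kotlin
// Hypothetical error type; the real application surfaces provider failures in its own way.
class ProviderException(val statusCode: Int) :
    RuntimeException("Provider returned HTTP $statusCode")

// Hypothetical fallback text; the sample application's actual message may differ.
const val FALLBACK_MESSAGE = "Sorry, I cannot answer right now."

// Converts any provider failure into the same stable fallback answer.
fun answerOrFallback(callProvider: () -> String): String =
    try {
        callProvider()
    } catch (e: ProviderException) {
        FALLBACK_MESSAGE
    }

fun main() {
    // Status codes listed in the text above.
    val statusCodes = listOf(400, 401, 403, 404, 418, 500, 503)

    for (status in statusCodes) {
        val answer = answerOrFallback { throw ProviderException(status) }
        check(answer == FALLBACK_MESSAGE) { "Unexpected answer for HTTP $status: $answer" }
    }

    // A successful call passes through unchanged.
    println(answerOrFallback { "Greetings, traveler" })
}
```

Driving every status code through one assertion keeps the test focused on the contract that matters to callers: the response is the same regardless of which provider error occurred.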
Verify Koog-specific endpoints too
The repo also tests a Koog strategy-graph endpoint by fetching
/api/koog/strategy/graph and asserting that the response contains Mermaid state-diagram output.
That is a useful pattern when your application exposes Koog diagnostics or graph introspection
routes in addition to chat endpoints.
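A content assertion for that endpoint can stay loose, since the exact graph will change as the strategy evolves. As a rough sketch, the hypothetical helper `looksLikeMermaidStateDiagram` below accepts a response body when it contains a state-diagram keyword line and at least one transition; the real test in the sample repo may assert something different.

```kotlin
// Heuristic check: the body has a line that is exactly a Mermaid state-diagram
// keyword and at least one line containing a `-->` transition arrow.
fun looksLikeMermaidStateDiagram(body: String): Boolean {
    val lines = body.lines().map { it.trim() }.filter { it.isNotEmpty() }
    val hasKeyword = lines.any { it == "stateDiagram" || it == "stateDiagram-v2" }
    val hasTransition = lines.any { "-->" in it }
    return hasKeyword && hasTransition
}

fun main() {
    // Shortened version of the strategy graph shown earlier on this page.
    val body = """
        ---
        title: streaming-strategy
        ---
        stateDiagram
            [*] --> moderate_input : transformed
            moderate_input --> [*] : transformed
    """.trimIndent()

    println(looksLikeMermaidStateDiagram(body))            // true
    println(looksLikeMermaidStateDiagram("{\"ok\":true}")) // false
}
```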