# Mokksy

> Mokksy and AI-Mocks - mock HTTP APIs with real-world behavior for Java and Kotlin integration tests

## Docs


- [Quick Start (5 minutes)](https://mokksy.dev/docs/mokksy/quick-start.md): Quick Start (5 minutes)
- [First integration test](https://mokksy.dev/docs/mokksy/first-integration-test.md): First integration test
- [Stubbing responses](https://mokksy.dev/docs/mokksy/stubbing.md): Stubbing responses
- [Request matching](https://mokksy.dev/docs/mokksy/request-matching.md): Match incoming requests with path, header, body, predicate, and call matchers, then resolve conflicts with specificity and priority.
- [Verification and request journal](https://mokksy.dev/docs/mokksy/verification.md): Verification and request journal
- [Multipart and file uploads](https://mokksy.dev/docs/mokksy/multipart.md): Multipart and file uploads
- [Streaming and SSE](https://mokksy.dev/docs/mokksy/streaming.md): Server-Sent Events (SSE) enable servers to push updates to clients over a single HTTP connection. The provided code demonstrates how to use mokksy to simulate an SSE stream and verify its response in both Kotlin and Java.
- [Failure simulation](https://mokksy.dev/docs/mokksy/failure-simulation.md): Failure simulation
- [File-based configuration](https://mokksy.dev/docs/mokksy/file-config.md): File-based configuration
- [Docker](https://mokksy.dev/docs/mokksy/docker.md): Docker
- [Ktor integration](https://mokksy.dev/docs/mokksy/ktor.md): Embed Mokksy directly inside a Ktor application for integration tests, internal API simulation, and authenticated stub routes.

- [Anthropic](https://mokksy.dev/docs/ai-mocks/anthropic.md): Anthropic
- [OpenAI](https://mokksy.dev/docs/ai-mocks/openai.md): OpenAI
- [Gemini](https://mokksy.dev/docs/ai-mocks/gemini.md): Gemini
- [Ollama](https://mokksy.dev/docs/ai-mocks/ollama.md): Ollama
- [A2A Protocol](https://mokksy.dev/docs/ai-mocks/a2a.md): Agent2Agent (A2A) Protocol

- [Spring Boot](https://mokksy.dev/docs/integrations/spring-boot.md): Use Mokksy as a mock HTTP server in Spring Boot integration tests by pointing application properties, RestClient, or WebClient at mokksy.baseUrl().
- [Quarkus](https://mokksy.dev/docs/integrations/quarkus.md): Test Quarkus applications against Mokksy or AI-Mocks by replacing outbound HTTP and AI provider dependencies with deterministic local endpoints.
- [LangChain4j](https://mokksy.dev/docs/integrations/langchain4j.md): Use AI-Mocks with LangChain4j to test provider-backed chat and streaming flows without real OpenAI, Anthropic, or Ollama calls.
- [Spring AI](https://mokksy.dev/docs/integrations/spring-ai.md): Test Spring AI clients against provider-compatible AI-Mocks servers for deterministic OpenAI and Gemini behavior, including streaming.
- [OpenAI Java SDK](https://mokksy.dev/docs/integrations/openai-sdk.md): Use AI-Mocks OpenAI with the official openai-java SDK for deterministic chat, streaming, embeddings, moderation, and error-path integration tests.
- [Anthropic Java SDK](https://mokksy.dev/docs/integrations/anthropic-sdk.md): Use AI-Mocks Anthropic with the official Anthropic Java SDK for deterministic Messages API and streaming integration tests.
- [Koog](https://mokksy.dev/docs/integrations/koog.md): Test Koog applications against AI-Mocks provider endpoints. This guide uses a verified OpenAI-backed Spring Boot example for chat, streaming, moderation, and failure paths.

- [Mokksy vs WireMock](https://mokksy.dev/docs/compare/wiremock.md): Compare Mokksy and WireMock for HTTP integration testing, SSE, chunked streaming, deterministic failures, and Kotlin-first JVM tests.
---


# Quick Start (5 minutes)


This guide gets you from an empty test to a local HTTP mock server. You will stub one endpoint, call it through a real HTTP client, and verify the response.

## Add the test dependency

Add Mokksy to the test classpath. Most JVM projects should use `mokksy-jvm` as a test dependency.

```kotlin
dependencies {
  testImplementation("dev.mokksy:mokksy-jvm:$latestVersion")
}
```
```xml
<dependency>
  <groupId>dev.mokksy</groupId>
  <artifactId>mokksy-jvm</artifactId>
  <version>[LATEST_VERSION]</version>
  <scope>test</scope>
</dependency>
```
## Stub and call an HTTP endpoint

Start Mokksy before the system under test creates its HTTP client, register the expected stub, then call the endpoint through a real HTTP client.

```kotlin
// before SUT starts
val mokksy = Mokksy(verbose = true).start()

// SUT setup
val client = HttpClient {
  install(DefaultRequest) {
    url(mokksy.baseUrl())
  }
}

// Given - before test
mokksy.get {
  path("/accounts/42")
} respondsWith {
  body = """{"id":"42","status":"active"}"""
  httpStatus = HttpStatusCode.OK
}

// When
val response = client.get("/accounts/42")

// Then
response.status shouldBe HttpStatusCode.OK
response.bodyAsText() shouldBe """{"id":"42","status":"active"}"""
```
```java
// before SUT starts
var mokksy = Mokksy.create().start();

// Given - before test
mokksy.get(spec -> spec.path("/accounts/42"))
    .respondsWith(response -> response
        .body("{\"id\":\"42\",\"status\":\"active\"}")
        .status(200));

// When
var httpClient = HttpClient.newHttpClient();
var response = httpClient.send(
    HttpRequest.newBuilder()
        .uri(URI.create(mokksy.baseUrl() + "/accounts/42"))
        .GET()
        .build(),
    HttpResponse.BodyHandlers.ofString()
);

// Then
assertThat(response.statusCode()).isEqualTo(200);
assertThat(response.body()).isEqualTo("{\"id\":\"42\",\"status\":\"active\"}");

// after the SUT stops (or never)
mokksy.shutdown();
```
## What this proves

- Your test talks to a real HTTP server.
- The external service is replaced by Mokksy.
- The response is deterministic and can run in CI without API keys or network access.

Next, build a complete [first integration test](../first-integration-test/) or test a [streaming API](../streaming/).


# First integration test


Use Mokksy when your code normally calls an external HTTP API: payments, customer data, fraud scoring, telecom provisioning, document processing, or internal platform services.

```text
Application under test -> Mokksy -> Stubbed external HTTP API
```

## Test shape

1. Start Mokksy on a random local port.
2. Configure the application under test to use `mokksy.baseUrl()`.
3. Stub the external endpoint and response.
4. Execute the real application behavior.
5. Verify the response and the request journal.

## Example

```kotlin
val mokksy = Mokksy(verbose = true).start()
val client = HttpClient {
  install(DefaultRequest) {
    url(mokksy.baseUrl())
  }
}

mokksy.post {
  path("/risk/check")
  bodyContains("customer-123")
} respondsWith {
  httpStatus = HttpStatusCode.Accepted
  body = """{"decision":"review"}"""
}

val response = client.post("/risk/check") {
  contentType(ContentType.Application.Json)
  setBody("""{"customerId":"customer-123","amount":2500}""")
}

response.status shouldBe HttpStatusCode.Accepted
response.bodyAsText() shouldBe """{"decision":"review"}"""

mokksy.verifyNoUnexpectedRequests()
mokksy.verifyNoUnmatchedStubs()
```
```java
var mokksy = Mokksy.create().start();
var httpClient = HttpClient.newHttpClient();

try {
    mokksy.post(spec -> spec
        .path("/risk/check")
        .bodyContains("customer-123")
    ).respondsWith(response -> response
        .status(202)
        .body("{\"decision\":\"review\"}"));

    var response = httpClient.send(
        HttpRequest.newBuilder()
            .uri(URI.create(mokksy.baseUrl() + "/risk/check"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(
                "{\"customerId\":\"customer-123\",\"amount\":2500}"
            ))
            .build(),
        HttpResponse.BodyHandlers.ofString()
    );

    assertThat(response.statusCode()).isEqualTo(202);
    assertThat(response.body()).isEqualTo("{\"decision\":\"review\"}");

    mokksy.verifyNoUnexpectedRequests();
    mokksy.verifyNoUnmatchedStubs();
} finally {
    mokksy.shutdown();
}
```
This catches two important integration failures: your code sent the wrong request, or it did not call the dependency at all.


# Stubbing responses

Mokksy supports all HTTP verbs. Here are some examples.

## GET request

```kotlin
// given
val expectedResponse =
  // language=json
  """
    {
        "response": "Pong"
    }
    """.trimIndent()

mokksy.get {
  path = beEqual("/ping")
  containsHeader("Foo", "bar")
} respondsWith {
  body = expectedResponse
}

// when
val result = client.get("/ping") {
  headers.append("Foo", "bar")
}

// then
result.status shouldBe HttpStatusCode.OK
result.bodyAsText() shouldBe expectedResponse
```
```java
// given
var expectedResponse = "{\"response\": \"Pong\"}";

mokksy.get(spec -> {
    spec.path("/ping");
    spec.containsHeader("Foo", "bar");
}).respondsWith(builder -> builder.body(expectedResponse));

// when
var response = httpClient.send(
    HttpRequest.newBuilder()
        .uri(URI.create(mokksy.baseUrl() + "/ping"))
        .header("Foo", "bar")
        .GET()
        .build(),
    HttpResponse.BodyHandlers.ofString()
);

// then
assertThat(response.statusCode()).isEqualTo(200);
assertThat(response.body()).isEqualTo(expectedResponse);
```
When a request does not match any stub, Mokksy returns `404 Not Found`:

```kotlin
val notFoundResult = client.get("/ping") {
  headers.append("Foo", "baz")
}

notFoundResult.status shouldBe HttpStatusCode.NotFound
```
```java
// Request with a non-matching header value → 404
var notFound = httpClient.send(
    HttpRequest.newBuilder()
        .uri(URI.create(mokksy.baseUrl() + "/ping"))
        .header("Foo", "baz")
        .GET()
        .build(),
    HttpResponse.BodyHandlers.ofString()
);

assertThat(notFound.statusCode()).isEqualTo(404);
```
## POST request

```kotlin
// given
val id = Random.nextInt()
val expectedResponse =
  // language=json
  """
    {
        "id": "$id",
        "name": "thing-$id"
    }
    """.trimIndent()

mokksy.post {
  path = beEqual("/things")
  bodyContains("\"$id\"")
} respondsWith {
  body = expectedResponse
  httpStatus = HttpStatusCode.Created
  headers {
    // type-safe builder style
    append(HttpHeaders.Location, "/things/$id")
  }
  headers += "Foo" to "bar" // list style
}

// when
val result =
  client.post("/things") {
    headers.append("Content-Type", "application/json")
    setBody(
      // language=json
      """
      {
          "id": "$id"
      }
      """.trimIndent(),
    )
  }

// then
result shouldNotBeNull {
  status shouldBe HttpStatusCode.Created
  bodyAsText() shouldBe expectedResponse
  headers["Location"] shouldBe "/things/$id"
  headers["Foo"] shouldBe "bar"
}
```
```java
// given
var expectedBody = "{\"id\":\"42\",\"name\":\"thing-42\"}";

mokksy.post(spec -> {
    spec.path("/things");
    spec.bodyContains("\"42\"");
}).respondsWith(builder -> builder
    .body(expectedBody)
    .status(201)
    .header("Location", "/things/42")
    .header("Foo", "bar"));

// when
var response = httpClient.send(
    HttpRequest.newBuilder()
        .uri(URI.create(mokksy.baseUrl() + "/things"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString("{\"id\":\"42\"}"))
        .build(),
    HttpResponse.BodyHandlers.ofString()
);

// then
assertThat(response.statusCode()).isEqualTo(201);
assertThat(response.body()).isEqualTo(expectedBody);
assertThat(response.headers().firstValue("Location")).hasValue("/things/42");
assertThat(response.headers().firstValue("Foo")).hasValue("bar");
```
## Typed request body

When the request body type is known at compile time, use the **reified** overloads so the compiler infers the type
and no explicit `::class` argument is required. The examples in this section use the following request and response types:

```kotlin
@Serializable
@JvmRecord
data class CreateItemRequest(val name: String, val quantity: Int)

@Serializable
@JvmRecord
data class CreateItemResponse(val message: String)
```
```java
record CreateItemRequest(String name, int quantity) {}
record CreateItemResponse(String message) {}
```
### Reified overloads

```kotlin
val itemName = "Widget"

mokksy.post<CreateItemRequest>(name = "create-item") {
  path("/items")
  bodyMatchesPredicate("name should match") { it?.name == itemName }
} respondsWith {
  body = CreateItemResponse("Hello, $itemName!")
  httpStatus = HttpStatusCode.Created
  headers += "Foo" to "bar"
}

val result =
  client.post("/items") {
    contentType(ContentType.Application.Json)
    setBody(CreateItemRequest(itemName, quantity = 3))
  }

result shouldNotBeNull {
  status shouldBe HttpStatusCode.Created
  headers["Foo"] shouldBe "bar"
  body<CreateItemResponse>().message shouldBe "Hello, $itemName!"
}
```
```java
record CreateItemRequest(String name, int quantity) {}

mokksy.post(
    CreateItemRequest.class,
    spec -> spec
        .path("/items")
        .bodyMatchesPredicate(request -> "widget".equals(request.name()))
).respondsWith(builder -> builder
    .body("{\"message\":\"Hello, widget!\"}")
    .status(201)
    .header("Foo", "bar"));

var response = httpClient.send(
    HttpRequest.newBuilder()
        .uri(URI.create(mokksy.baseUrl() + "/items"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString("{\"name\":\"widget\",\"quantity\":3}"))
        .build(),
    HttpResponse.BodyHandlers.ofString()
);

assertThat(response.statusCode()).isEqualTo(201);
assertThat(response.body()).isEqualTo("{\"message\":\"Hello, widget!\"}");
assertThat(response.headers().firstValue("Foo")).hasValue("bar");
```
Reified overloads are provided for all HTTP verbs (`get`, `post`, `put`, `delete`, `patch`, `head`,
`options`) and the generic `method` function. Two overloads exist per verb: one taking an optional
stub name (`name: String? = null`) and one taking a [`StubConfiguration`](../request-matching/#stub-specificity).

The deserialized request body is accessible inside the response lambda as `request.body()`.

### Explicit Class token

When the type is determined at runtime or when you want an explicit name on the stub,
pass a `kotlin.reflect.KClass` / `java.lang.Class` token using the named `requestType` parameter:

```kotlin
mokksy.post(requestType = CreateItemRequest::class) {
  path("/items/validated")
  bodyMatchesPredicate("name=widget and quantity>=5") {
    it?.name == "widget" && (it.quantity) >= 5
  }
} respondsWith {
  body = "accepted"
  httpStatus = HttpStatusCode.Created
}

val accepted =
  client.post("/items/validated") {
    contentType(ContentType.Application.Json)
    setBody(CreateItemRequest("widget", quantity = 10))
  }

accepted.status shouldBe HttpStatusCode.Created
accepted.bodyAsText() shouldBe "accepted"
```
```java
mokksy.post(
    CreateItemRequest.class,
    spec -> spec
        .path("/items/validated")
        .bodyMatchesPredicate(
            "name=widget and quantity>=5",
            request -> "widget".equals(request.name()) && request.quantity() >= 5
        )
).respondsWith(builder -> builder.body("accepted").status(201));

var accepted = httpClient.send(
    HttpRequest.newBuilder()
        .uri(URI.create(mokksy.baseUrl() + "/items/validated"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString("{\"name\":\"widget\",\"quantity\":10}"))
        .build(),
    HttpResponse.BodyHandlers.ofString()
);

assertThat(accepted.statusCode()).isEqualTo(201);
```
Java supports typed request bodies too. Pass the request class token directly, as shown in the
Java examples above. The Kotlin examples remain the canonical shape for typed request-body
matching.

Deserialization uses Ktor's `ContentNegotiation` plugin. For projects that use Jackson instead of
`kotlinx.serialization`, create the server with `MokksyJackson.create()` (Java API) —
see [Jackson support](#jackson-support) below.

When no stub matches and verbose mode is on (`Mokksy(verbose = true)`), Mokksy logs the closest
partial match and its failed conditions to help diagnose the mismatch.

### Jackson support

By default, Mokksy uses `kotlinx.serialization` for request body deserialization. Java and Kotlin projects
that prefer [Jackson](https://github.com/FasterXML/jackson) can configure the server with Ktor's Jackson content negotiation.
In the example below, `JacksonInput` is deserialized from the request body,
and `JacksonOutput` is serialized to the response body.

```kotlin
val jacksonMokksy =
  MokksyServer(
    configuration =
      ServerConfiguration(
        verbose = true,
        contentNegotiationConfigurer = {
          it.jackson { findAndRegisterModules() }
        },
      ),
  ).apply { start() }

val jacksonClient =
  HttpClient(Java) {
    install(ContentNegotiation) {
      jackson()
    }
    install(DefaultRequest) {
      url(jacksonMokksy.baseUrl())
    }
  }

jacksonMokksy
  .post(requestType = JacksonInput::class) {
    path = beEqual("/jackson")
  }.respondsWith(JacksonOutput::class) {
    val input = request.body()
    body = JacksonOutput("Hello, ${input.name}")
  }

val result =
  jacksonClient.post("/jackson") {
    contentType(ContentType.Application.Json)
    setBody(JacksonInput("Bob"))
  }

result.status shouldBe HttpStatusCode.OK
result.bodyAsText() shouldBe """{"pikka-hi":"Hello, Bob"}"""

jacksonMokksy.verifyNoUnexpectedRequests()
```
For Java-first projects that prefer Jackson, use `MokksyJackson.create()`:

```java
import dev.mokksy.MokksyJackson;

// Default Jackson ObjectMapper
Mokksy mokksy = MokksyJackson.create();
mokksy.start();
```
To customize the `ObjectMapper`, pass a configuration lambda:

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import dev.mokksy.MokksyJackson;

Mokksy mokksy = MokksyJackson.create(ObjectMapper::findAndRegisterModules);
mokksy.start();
```
## Status-only responses

Use `respondsWithStatus` when the test only needs to verify a status code — no body needed.
It's an infix function, so it reads naturally next to the stub definition:

```kotlin
mokksy.get { path("/ping") } respondsWithStatus HttpStatusCode.NoContent

val response = client.get("/ping")

response.status shouldBe HttpStatusCode.NoContent
```
```java
mokksy.get(spec -> spec.path("/status-only"))
    .respondsWithStatus(204);
```
## One-time stubs

Use `StubConfiguration(eventuallyRemove = true)` when a stub should match exactly once and then
become ineligible for future requests. This is the supported property for once-only behavior.

```kotlin
mokksy.get(
  configuration =
    StubConfiguration(
      name = "single-use",
      eventuallyRemove = true,
    ),
) {
  path("/once")
} respondsWith {
  body = "First and only response"
}

client.get("/once").status shouldBe HttpStatusCode.OK
client.get("/once").status shouldBe HttpStatusCode.NotFound
```
```java
mokksy.get(StubConfiguration.once("single-use"), spec -> spec.path("/once"))
    .respondsWith(response -> response.body("First and only response"));

var first = httpClient.send(
    HttpRequest.newBuilder()
        .uri(URI.create(mokksy.baseUrl() + "/once"))
        .GET()
        .build(),
    HttpResponse.BodyHandlers.ofString()
);
assertThat(first.statusCode()).isEqualTo(200);

var second = httpClient.send(
    HttpRequest.newBuilder()
        .uri(URI.create(mokksy.baseUrl() + "/once"))
        .GET()
        .build(),
    HttpResponse.BodyHandlers.ofString()
);
assertThat(second.statusCode()).isEqualTo(404);
```
## Run code when a request is matched

`respondsWith { ... }` and `respondsWithStream { ... }` take lambdas. Mokksy evaluates the lambda
for the matched request immediately before it builds the response, so you can inspect test state,
coordinate concurrent code, or build a response from the incoming request.

Use this for assertions that must happen at the moment the dependency is called. In this example,
the response lambda checks a coroutine `Semaphore` before returning the response:

```kotlin
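// kotlinx.coroutines' Semaphore requires at least one permit,
// so start with the single permit already acquired (0 available).
val semaphore = Semaphore(permits = 1, acquiredPermits = 1)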

mokksy.post {
  path("/jobs")
} respondsWith {
  semaphore.availablePermits shouldBe 0
  body = "accepted"
  httpStatus = HttpStatusCode.Accepted
  semaphore.release()
}

val response = client.post("/jobs")

response.status shouldBe HttpStatusCode.Accepted
response.bodyAsText() shouldBe "accepted"
semaphore.availablePermits shouldBe 1
```
```java
var semaphore = new Semaphore(0);

mokksy.post(spec -> spec.path("/jobs"))
    .respondsWith(response -> {
        assertThat(semaphore.availablePermits()).isZero();
        response.status(202).body("accepted");
        semaphore.release();
    });

var result = httpClient.send(
    HttpRequest.newBuilder()
        .uri(URI.create(mokksy.baseUrl() + "/jobs"))
        .POST(HttpRequest.BodyPublishers.noBody())
        .build(),
    HttpResponse.BodyHandlers.ofString()
);

assertThat(result.statusCode()).isEqualTo(202);
assertThat(result.body()).isEqualTo("accepted");
assertThat(semaphore.availablePermits()).isEqualTo(1);
```


# Request matching

## Request specification matchers

Mokksy provides various matcher types to specify conditions for matching incoming HTTP requests:

- **Path matchers** — `path("/things")` or `path = beEqual("/things")`
- **Header matchers** — `containsHeader("X-Request-ID", "abc")` checks for a header with an exact value
- **Content matchers** — `bodyContains("value")` checks if the raw body string contains a substring;
  `bodyString += contain("value")` adds a [Kotest](https://kotest.io/docs/assertions/assertions.html) matcher directly
- **Predicate matchers** — `bodyMatchesPredicate { it?.name == "foo" }` matches against the typed,
  deserialized request body — see [Typed request body](../stubbing/#typed-request-body) for the full API
- **Call matchers** — `successCallMatcher` matches if a function called with the body does not throw
- **Priority** — `priority = 10` on `RequestSpecificationBuilder` sets the `RequestSpecification.priority`
  of the stub; higher values indicate higher priority. Default is `0`.
  Use negative values (e.g. `priority = -1`) for catch-all / fallback stubs.
  Priority is a tiebreaker: it applies only when two stubs match with an equal number of conditions satisfied.
  For most cases, specificity-based matching (see below) selects the right stub automatically.

Predicate and call matchers are executable code. Mokksy evaluates matchers while scoring an
incoming request against registered stubs, and it evaluates every matcher independently rather
than short-circuiting after the first mismatch. Keep matcher side effects deterministic and cheap:
a matcher can run for requests that eventually match a different stub, and a throwing matcher is
logged and counted as a failed matcher for that stub. If you need to assert application state at
the moment a matched response is sent, prefer a `respondsWith { ... }` lambda in
[Stubbing responses](../stubbing/#run-code-when-a-request-is-matched).
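
The evaluation rule described above can be sketched in plain Java. This is a simplified model, not Mokksy's implementation: `MatcherEvaluationSketch`, `Evaluation`, and `evaluate` are hypothetical names, and matchers are modelled as ordinary `Predicate`s. It shows only the two behaviours named in the text: every matcher runs even after a mismatch, and a throwing matcher counts as a failed matcher.

```java
import java.util.List;
import java.util.function.Predicate;

/** Simplified model of matcher evaluation; not Mokksy's actual implementation. */
final class MatcherEvaluationSketch {

    /** Outcome of evaluating every matcher of one stub against one request (hypothetical type). */
    record Evaluation(int passed, int failed) {
        boolean matches() {
            return failed == 0;
        }
    }

    /**
     * Runs every matcher, even after a mismatch, and treats a thrown
     * exception as a failed matcher instead of propagating it.
     */
    static <T> Evaluation evaluate(List<Predicate<T>> matchers, T request) {
        int passed = 0;
        int failed = 0;
        for (Predicate<T> matcher : matchers) {
            boolean ok;
            try {
                ok = matcher.test(request);
            } catch (RuntimeException e) {
                ok = false; // a throwing matcher counts as a failed matcher
            }
            if (ok) {
                passed++;
            } else {
                failed++;
            }
        }
        return new Evaluation(passed, failed);
    }

    public static void main(String[] args) {
        List<Predicate<String>> matchers = List.of(
            body -> body.contains("admin"),
            body -> { throw new IllegalStateException("boom"); },
            body -> body.length() < 100
        );
        Evaluation result = evaluate(matchers, "admin");
        // prints "2 passed, 1 failed"
        System.out.println(result.passed() + " passed, " + result.failed() + " failed");
    }
}
```

Because all three matchers run, a side effect in the third one would still happen even though the second already failed. That is why matcher code should stay cheap and deterministic.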

## Stub specificity

When multiple stubs could match the same request, Mokksy scores each one by counting how many conditions
it satisfies, then selects the highest-scoring stub. A stub with two matching conditions beats a stub with one,
regardless of registration order.

```kotlin
// Generic: matches any POST to /users
mokksy.post {
  path("/users")
} respondsWith {
  body = "any user"
}

// Specific: matches only requests whose body contains "admin" — two conditions
mokksy.post {
  path("/users")
  bodyContains("admin")
} respondsWith {
  body = "admin user"
}

// Admin request → specific stub wins (score 2 beats score 1)
val adminResult = client.post("/users") { setBody("admin") }
adminResult.bodyAsText() shouldBe "admin user"

// Other request → only the generic stub matches
val genericResult = client.post("/users") { setBody("regular") }
genericResult.bodyAsText() shouldBe "any user"
```
```java
// Generic: matches any POST to /users
mokksy.post(spec -> spec.path("/users"))
    .respondsWith(response -> response.body("any user"));

// Specific: matches only requests whose body contains "admin"
mokksy.post(spec -> spec
    .path("/users")
    .bodyContains("admin")
).respondsWith(response -> response.body("admin user"));

var adminResult = httpClient.send(
    HttpRequest.newBuilder()
        .uri(URI.create(mokksy.baseUrl() + "/users"))
        .POST(HttpRequest.BodyPublishers.ofString("admin"))
        .build(),
    HttpResponse.BodyHandlers.ofString()
);
assertThat(adminResult.body()).isEqualTo("admin user");

var genericResult = httpClient.send(
    HttpRequest.newBuilder()
        .uri(URI.create(mokksy.baseUrl() + "/users"))
        .POST(HttpRequest.BodyPublishers.ofString("regular"))
        .build(),
    HttpResponse.BodyHandlers.ofString()
);
assertThat(genericResult.body()).isEqualTo("any user");
```
## Priority example

If multiple stubs match with the same specificity score, the one with the higher `priority` value wins:

```kotlin
// Catch-all stub with low priority (negative value)
mokksy.get {
  path = contain("/things")
  priority = -1
} respondsWith {
  body = "Generic Thing"
}

// Specific stub with high priority (positive value)
mokksy.get {
  path = beEqual("/things/special")
  priority = 1
} respondsWith {
  body = "Special Thing"
}

// when
val generic = client.get("/things/123")
val special = client.get("/things/special")

// then
generic.bodyAsText() shouldBe "Generic Thing"
special.bodyAsText() shouldBe "Special Thing"
```
```java
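// Catch-all stub: matches any POST to /v1/chat/completions, returns 400.
// The always-true body predicate gives it two conditions, the same as the specific stub, so priority decides.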
mokksy.post(spec -> {
    spec.path("/v1/chat/completions");
    spec.bodyMatchesPredicate(body -> true);
    spec.priority(-1);
}).respondsWith(builder -> builder
    .body("{\"error\":\"unsupported request\"}")
    .status(400));

// Specific stub: matches only when body contains "gpt-4", returns 200
mokksy.post(spec -> {
    spec.path("/v1/chat/completions");
    spec.bodyContains("gpt-4");
    spec.priority(1);
}).respondsWith(builder -> builder
    .body("{\"model\":\"gpt-4\"}")
    .status(200));

// Specific request → specific stub wins
var specific = httpClient.send(
    HttpRequest.newBuilder()
        .uri(URI.create(mokksy.baseUrl() + "/v1/chat/completions"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString("{\"model\":\"gpt-4\"}"))
        .build(),
    HttpResponse.BodyHandlers.ofString()
);
assertThat(specific.statusCode()).isEqualTo(200);

// Unmatched request → catch-all fallback kicks in
var fallback = httpClient.send(
    HttpRequest.newBuilder()
        .uri(URI.create(mokksy.baseUrl() + "/v1/chat/completions"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString("{\"model\":\"other\"}"))
        .build(),
    HttpResponse.BodyHandlers.ofString()
);
assertThat(fallback.statusCode()).isEqualTo(400);
```
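
The selection rule from the two sections above (highest condition count first, priority only as a tiebreaker) can be modelled with a comparator. This is a sketch under stated assumptions, not Mokksy's implementation: `StubSelectionSketch`, `Candidate`, and `select` are hypothetical names, and the input list is assumed to contain only stubs that already matched the request. Behaviour when both score and priority are equal is not modelled.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Simplified model of stub selection; not Mokksy's actual implementation. */
final class StubSelectionSketch {

    /** A stub that already matched the request (hypothetical type). */
    record Candidate(String name, int satisfiedConditions, int priority) {}

    /** Highest condition count wins; priority only breaks ties between equal counts. */
    static Optional<Candidate> select(List<Candidate> matching) {
        return matching.stream()
            .max(Comparator.comparingInt(Candidate::satisfiedConditions)
                .thenComparingInt(Candidate::priority));
    }

    public static void main(String[] args) {
        var catchAll = new Candidate("catch-all", 2, -1);
        var gpt4 = new Candidate("gpt-4", 2, 1);
        var pathOnly = new Candidate("path-only", 1, 10);

        // Equal condition counts: the higher priority wins. Prints "gpt-4".
        System.out.println(select(List.of(catchAll, gpt4)).orElseThrow().name());
        // Different condition counts: the more specific stub wins despite its lower priority.
        // Prints "catch-all".
        System.out.println(select(List.of(pathOnly, catchAll)).orElseThrow().name());
    }
}
```

The second call illustrates why a high `priority` on a one-condition stub does not override a two-condition stub: priority is compared only after the condition counts tie.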


# Verification and request journal

Mokksy provides two complementary verification methods that check opposite sides of the stub/request contract.

## Verify all stubs were triggered

`verifyNoUnmatchedStubs()` fails if any registered stub was never matched by an incoming request.
Use this to catch stubs you set up but that were never actually called — a sign the code under test took
a different path than expected.

```kotlin
// Fails if any stub has never been matched
mokksy.verifyNoUnmatchedStubs()
```
```java
// Fails if any stub has never been matched
mokksy.verifyNoUnmatchedStubs();
```
> **Note:** Be careful when running tests in parallel against a single `MokksyServer` instance.
> Stubs registered by other tests may still be unmatched when one test completes. Avoid calling this in `@AfterEach`/`@AfterTest`
> unless each test owns its own server instance.

## Verify no unexpected requests arrived

`verifyNoUnexpectedRequests()` fails if any HTTP request arrived at the server but no stub matched it.
These requests are recorded in the `RequestJournal` and reported together.

```kotlin
// Fails if any request arrived with no matching stub
mokksy.verifyNoUnexpectedRequests()
```
```java
// Fails if any request arrived with no matching stub
mokksy.verifyNoUnexpectedRequests();
```
## Recommended AfterEach setup

Always run `verifyNoUnexpectedRequests()` in `@AfterEach` to catch requests that arrived but
matched no stub. For `verifyNoUnmatchedStubs()`, the right placement depends on your fixture strategy:

- **Per-test instance** (`@TestInstance(Lifecycle.PER_METHOD)` or a fresh server per test): call
  both checks in `@AfterEach` — every stub registered during that test should have been matched
  before the server is torn down.
- **Shared instance** (`@TestInstance(Lifecycle.PER_CLASS)` or a companion-object server): call
  `verifyNoUnmatchedStubs()` in `@AfterAll`, immediately before `shutdown()`. Calling it after
  each individual test would falsely report stubs registered for _later_ tests as unmatched.

```kotlin
@TestInstance(TestInstance.Lifecycle.PER_CLASS)
class MyTest {

  val mokksy = Mokksy(verbose = true)
  lateinit var client: HttpClient

  @BeforeAll
  suspend fun setup() {
    mokksy.startSuspend()
    mokksy.awaitStarted() // port() and baseUrl() are safe after this point
    client = HttpClient {
      install(DefaultRequest) {
        url(mokksy.baseUrl())
      }
    }
  }

  @Test
  suspend fun testSomething() {
    mokksy.get {
      path("/hi")
    } respondsWith {
      delay = 100.milliseconds // wait 100ms, then reply
      body = "Hello"
    }

    // when
    val response = client.get("/hi")

    // then
    response.status shouldBe HttpStatusCode.OK
    response.bodyAsText() shouldBe "Hello"
  }

  @AfterEach
  fun afterEach() {
    mokksy.verifyNoUnexpectedRequests()
  }

  @AfterAll
  suspend fun afterAll() {
    client.close()
    mokksy.verifyNoUnmatchedStubs() // shared instance: check once, after all tests ran
    mokksy.shutdownSuspend()
  }
}
```
```java
@TestInstance(TestInstance.Lifecycle.PER_CLASS)
class MyTest {

    private final Mokksy mokksy = Mokksy.create().start();
    private final HttpClient httpClient = HttpClient.newHttpClient();

    @Test
    void testSomething() throws Exception {
        mokksy.get(spec -> spec.path("/hi"))
            .respondsWith(response -> response
                .delayMillis(100)
                .body("Hello"));

        var response = httpClient.send(
            HttpRequest.newBuilder()
                .uri(URI.create(mokksy.baseUrl() + "/hi"))
                .GET()
                .build(),
            HttpResponse.BodyHandlers.ofString()
        );

        assertThat(response.statusCode()).isEqualTo(200);
        assertThat(response.body()).isEqualTo("Hello");
    }

    @AfterEach
    void afterEach() {
        mokksy.verifyNoUnexpectedRequests();
    }

    @AfterAll
    void afterAll() {
        mokksy.verifyNoUnmatchedStubs();
        mokksy.shutdown();
    }
}
```
## Inspecting unmatched items

Use the `find*` variants to retrieve the unmatched items directly for custom assertions:

```kotlin
// List<RecordedRequest> — HTTP requests with no matching stub
val unmatchedRequests: List<RecordedRequest> = mokksy.findAllUnexpectedRequests()

// List<RecordedRequest> — matched requests, populated only in JournalMode.FULL
val matchedRequests: List<RecordedRequest> = mokksy.findAllMatchedRequests()

// List<StubHandle> — stubs that were never triggered
val unmatchedStubs: List<StubHandle> = mokksy.findAllUnmatchedStubs()
```
```java
// List<RecordedRequest> - HTTP requests with no matching stub
var unmatchedRequests = mokksy.findAllUnexpectedRequests();

// List<StubHandle> - stubs that were never triggered
var unmatchedStubs = mokksy.findAllUnmatchedStubs();
```
`StubHandle` and `RecordedRequest` answer different questions:

- `StubHandle` identifies a registered stub. It exposes the optional stub `name`, `matchCount()`,
  and verification helpers such as `verifyCalled()`. It does not contain HTTP request details.
- `RecordedRequest` is the request journal entry. It captures the incoming request `method`,
  `uri`, `headers`, whether it `matched` a stub, and `bodyAsText` when Mokksy can safely read the
  request body as text.

Use `StubHandle` when you need to verify that a known stub was called. Use `RecordedRequest` when
you need to inspect what the client actually sent, especially for unexpected requests.

## Request journal

Mokksy records incoming requests in a `RequestJournal`. The recording mode is controlled by `JournalMode` in
`ServerConfiguration`:

- **JournalMode.NONE** - Disables request recording entirely. `findAllUnexpectedRequests()`, `findAllMatchedRequests()`, and `verifyNoUnexpectedRequests()` throw `IllegalStateException`.
- **JournalMode.LEAN** _(default)_ - Records only requests with no matching stub. Lower overhead; sufficient for
  `verifyNoUnexpectedRequests()`.
- **JournalMode.FULL** - Records all incoming requests, both matched and unmatched. Required for `findAllMatchedRequests()` to return anything.

```kotlin
val mokksy = MokksyServer(
  configuration = ServerConfiguration(
    journalMode = JournalMode.FULL,
  ),
)
```
Call `resetMatchState()` between scenarios to clear stub match state and the journal:

```kotlin
@AfterTest
fun afterEach() {
  mokksy.resetMatchState()
}
```
```java
@AfterEach
void afterEach() {
    mokksy.resetMatchState();
}
```
> **Note:** Stubs configured with `eventuallyRemove = true` are permanently removed from the registry
> on first match and cannot be re-armed by `resetMatchState()`. Re-register them before the next scenario.


# Multipart and file uploads

This page covers Mokksy's multipart body-matching APIs: `body { form { ... } }`, file-part matchers, `FormEncoding`, and `multipart(...)` for non-form bodies.

Use these matchers when your client sends more than JSON. File uploads, mixed metadata-plus-binary requests, and strict form-encoding checks all work through the same request DSL.

## Match multipart form fields and file uploads

`body { form { ... } }` matches both `application/x-www-form-urlencoded` and `multipart/form-data` by default. Add `field(...)` matchers for text parts and `file(...)` matchers for uploaded files.

```kotlin
mokksy.post {
  path("/upload")
  body {
    form {
      field("description", "Mokksy upload")
      file("avatar") {
        filename("photo.bin")
        contentType("application/octet-stream")
        bytes { it?.contentEquals(uploadFile.readBytes()) == true }
      }
    }
  }
} respondsWith {
  body = "file-upload-ok"
}

val response = client.post("/upload") {
  setBody(
    MultiPartFormDataContent(
      formData {
        append("description", "Mokksy upload")
        append(
          "avatar",
          uploadFile.readBytes(),
          Headers.build {
            append(
              HttpHeaders.ContentDisposition,
              "form-data; name=\"avatar\"; filename=\"photo.bin\"",
            )
            append(HttpHeaders.ContentType, "application/octet-stream")
          },
        )
      },
    ),
  )
}

response.status shouldBe HttpStatusCode.OK
response.bodyAsText() shouldBe "file-upload-ok"
```
```java
var uploadFile = Files.createTempFile("avatar", ".bin");
Files.writeString(uploadFile, "expected");

mokksy.post(spec -> spec
    .path("/upload")
    .body(body -> body.form(form -> form
        .field("description", "Mokksy upload")
        .file("avatar", file -> file
            .filename("photo.bin")
            .contentType("application/octet-stream")
            .bytesMatches(bytes -> {
                try {
                    // Report a mismatch by returning false instead of throwing from the matcher
                    return java.util.Arrays.equals(bytes, Files.readAllBytes(uploadFile));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            })
        )
    ))
).respondsWith(rb -> rb.body("file-upload-ok"));
```
For file parts, Mokksy supports `filename(...)`, `contentType(...)`, `text(...)`, and `bytes(...)`. Each matcher adds specificity, so a stub that checks field values and file content automatically outranks a looser fallback stub.
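
The Java stub above has no client side. As a rough sketch, the matching request can be assembled by hand with only the JDK. The `MultipartBodySketch` class and its method names are hypothetical and not part of Mokksy; the fixed part layout (one `description` field, one `avatar` file) simply mirrors the stub.

```java
import java.io.ByteArrayOutputStream;
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of Mokksy): builds a multipart/form-data body by hand.
final class MultipartBodySketch {

    /** Encodes one text field ("description") and one file part ("avatar"). */
    static byte[] build(String boundary, String description, String filename, byte[] fileBytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Text part: headers, blank line, value
        writeText(out, "--" + boundary + "\r\n");
        writeText(out, "Content-Disposition: form-data; name=\"description\"\r\n\r\n");
        writeText(out, description + "\r\n");
        // File part: headers, blank line, raw bytes
        writeText(out, "--" + boundary + "\r\n");
        writeText(out, "Content-Disposition: form-data; name=\"avatar\"; filename=\"" + filename + "\"\r\n");
        writeText(out, "Content-Type: application/octet-stream\r\n\r\n");
        out.write(fileBytes, 0, fileBytes.length);
        // Closing delimiter
        writeText(out, "\r\n--" + boundary + "--\r\n");
        return out.toByteArray();
    }

    /** Wraps the body in a POST request whose Content-Type carries the same boundary. */
    static HttpRequest request(URI uri, String boundary, byte[] body) {
        return HttpRequest.newBuilder(uri)
            .header("Content-Type", "multipart/form-data; boundary=" + boundary)
            .POST(HttpRequest.BodyPublishers.ofByteArray(body))
            .build();
    }

    private static void writeText(ByteArrayOutputStream out, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
    }
}
```

Pass `URI.create(mokksy.baseUrl() + "/upload")` to `request(...)` and send the result with `java.net.http.HttpClient`. In real tests a dedicated HTTP client library is usually a better choice than hand-built bodies.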

## Restrict accepted form encoding

Use `FormEncoding.MULTIPART` or `FormEncoding.URL_ENCODED` when a stub must reject the other form style. Leave the default `AUTO` when either encoding is acceptable.

```kotlin
mokksy.post {
  path("/multipart-only")
  body {
    form(FormEncoding.MULTIPART) {
      field("key", "value")
    }
  }
} respondsWith {
  body = "multipart-only-ok"
}

val multipartResult = client.post("/multipart-only") {
  setBody(
    MultiPartFormDataContent(
      formData { append("key", "value") },
    ),
  )
}

val urlEncodedResult =
  client.submitForm(
    url = "/multipart-only",
    formParameters = parameters { append("key", "value") },
  )

multipartResult.status shouldBe HttpStatusCode.OK
urlEncodedResult.status shouldBe HttpStatusCode.NotFound
```
```java
mokksy.post(spec -> spec
    .path("/multipart-only")
    .body(body -> body.form(FormEncoding.MULTIPART, form -> form
        .field("key", "value")
    ))
).respondsWith(rb -> rb.body("multipart-only-ok"));
```
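
For the Java side, the URL-encoded request that this stub is meant to reject can be sketched with the JDK alone. `UrlEncodedFormSketch` is a hypothetical helper, not a Mokksy API; it only shows what an `application/x-www-form-urlencoded` body looks like next to the multipart variant.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper (not part of Mokksy): encodes form fields as key=value pairs joined by '&'.
final class UrlEncodedFormSketch {

    static String encode(Map<String, String> fields) {
        return fields.entrySet().stream()
            .map(entry -> URLEncoder.encode(entry.getKey(), StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8))
            .collect(Collectors.joining("&"));
    }
}
```

Send the result with `Content-Type: application/x-www-form-urlencoded`. Going by the Kotlin assertions above, a stub restricted to `FormEncoding.MULTIPART` should answer such a request with `404`.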
## Match non-form multipart bodies

Not every multipart request is `multipart/form-data`. Use `body { multipart(...) { ... } }` for payloads such as `multipart/mixed`, where each part has its own content type and semantic role.

```kotlin
mokksy.post {
  path("/multipart-mixed")
  body {
    multipart("multipart/mixed") {
      boundary("WebAppBoundary")
      part("metadata") {
        contentType("application/json")
        text { it?.contains("Ktor logo") == true }
      }
      part("image") {
        contentType("image/png")
        bytes { it?.isNotEmpty() == true }
      }
    }
  }
} respondsWith {
  body = "multipart-mixed-ok"
}

val response = client.post("/multipart-mixed") {
  setBody(
    MultiPartFormDataContent(
      formData {
        append(
          "metadata",
          """{"description":"Ktor logo"}""".encodeToByteArray(),
          Headers.build {
            append(HttpHeaders.ContentDisposition, "form-data; name=\"metadata\"")
            append(HttpHeaders.ContentType, "application/json")
          },
        )
        append(
          "image",
          "png-data".encodeToByteArray(),
          Headers.build {
            append(HttpHeaders.ContentDisposition, "form-data; name=\"image\"")
            append(HttpHeaders.ContentType, "image/png")
          },
        )
      },
      boundary = "WebAppBoundary",
      contentType =
        ContentType.MultiPart.Mixed.withParameter(
          "boundary",
          "WebAppBoundary",
        ),
    ),
  )
}

response.status shouldBe HttpStatusCode.OK
response.bodyAsText() shouldBe "multipart-mixed-ok"
```
```java
mokksy.post(spec -> spec
    .path("/multipart-mixed")
    .body(body -> body.multipart("multipart/mixed", multipart -> multipart
        .boundary("WebAppBoundary")
        .part("metadata", part -> part
            .contentType("application/json")
            .textMatches(text -> text != null && text.contains("Ktor logo"))
        )
        .part("image", part -> part
            .contentType("image/png")
            .bytesMatches(bytes -> bytes != null && bytes.length > 0)
        )
    ))
).respondsWith(rb -> rb.body("multipart-mixed-ok"));
```


# Streaming and SSE

## Server-Sent Events (SSE)

[Server-Sent Events (SSE)][sse] allow a server to push updates to the client over a single, long-lived HTTP connection.

Streaming clients fail in ways static JSON tests cannot catch: missed chunks, early completion, timeout handling, buffering, and reconnect logic.

```text
Client opens SSE connection -> Mokksy sends event chunks -> Client handles stream completion, delay, or timeout
```

This example defines an SSE endpoint, emits two events with controlled timing, and verifies that the client receives a real `text/event-stream` response.

```kotlin
mokksy.post {
  path = beEqual("/sse")
} respondsWithSseStream {
  flow =
    flow {
      delay(200.milliseconds)
      emit(
        ServerSentEvent(
          data = "One",
        ),
      )
      delay(50.milliseconds)
      emit(
        ServerSentEvent(
          data = "Two",
        ),
      )
    }
}

// when
val result = client.post("/sse")

// then
result shouldNotBeNull {
  status shouldBe HttpStatusCode.OK
  contentType() shouldBe ContentType.Text.EventStream.withCharsetIfNeeded(Charsets.UTF_8)
  bodyAsText() shouldBe "data: One\r\n\r\ndata: Two\r\n\r\n"
}
```
```java
mokksy.post(spec -> spec.path("/sse"))
    .respondsWithSseStream(builder -> builder
        .chunk(SseEvent.data("One"))
        .chunk(SseEvent.data("Two")));

var response = httpClient.send(
    HttpRequest.newBuilder()
        .uri(URI.create(mokksy.baseUrl() + "/sse"))
        .POST(HttpRequest.BodyPublishers.noBody())
        .build(),
    HttpResponse.BodyHandlers.ofString()
);

assertThat(response.statusCode()).isEqualTo(200);
assertThat(response.body()).isEqualTo("data: One\r\n\r\ndata: Two\r\n\r\n");
```
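
Asserting on the raw body string works for small streams. For longer ones it can be easier to compare parsed events. The following is a minimal sketch of a `data`-only parser; `SseDataParser` is a hypothetical name, and the code deliberately ignores the `event`, `id`, and `retry` fields.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Mokksy): extracts the data payload of each complete SSE event.
final class SseDataParser {

    static List<String> parseData(String body) {
        List<String> events = new ArrayList<>();
        StringBuilder data = new StringBuilder();
        boolean hasData = false;
        // The SSE format allows CRLF, LF, or CR as line terminators.
        for (String line : body.split("\r\n|\n|\r", -1)) {
            if (line.isEmpty()) {
                // A blank line dispatches the current event, if it carries any data.
                if (hasData) {
                    events.add(data.toString());
                }
                data.setLength(0);
                hasData = false;
            } else if (line.startsWith("data:")) {
                String value = line.substring("data:".length());
                if (value.startsWith(" ")) {
                    value = value.substring(1); // strip the single optional leading space
                }
                if (hasData) {
                    data.append('\n'); // multiple data lines are joined with a newline
                }
                data.append(value);
                hasData = true;
            }
            // Comments (lines starting with ':') and other fields are skipped in this sketch.
        }
        return events;
    }
}
```

An event that is not followed by a blank line is never dispatched, which is also how an incomplete trailing frame behaves.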
## Long-lived SSE streams

By default, the SSE stream closes when the flow completes.

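To keep it open (e.g. for clients that reconnect on close), end the Kotlin flow with `awaitCancellation()`. The Java example takes a different route and emits an endless heartbeat stream at a fixed interval: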

```kotlin
mokksy.post {
  path = beEqual("/sse-ll")
} respondsWithSseStream {
  flow = flow {
    emit(ServerSentEvent(data = "hello"))
    awaitCancellation()
  }
}
```
```java
mokksy.post(spec -> spec.path("/sse-ll"))
    .respondsWithSseStream(stream -> stream
        .chunks(Stream.generate(() -> SseEvent.data("heartbeat")))
        .delayBetweenChunksMillis(1_000));
```
## SSE response with chunk delays

Use `delayBetweenChunks` when you want to verify that a client handles events as they arrive, instead of waiting for the full response body.

```kotlin
mokksy.get {
  path("/events")
} respondsWithSseStream {
  delayBetweenChunks = 100.milliseconds
  chunks += ServerSentEvent(data = """{"status":"accepted"}""")
  chunks += ServerSentEvent(data = """{"status":"processed"}""")
}
```
```java
mokksy.get(spec -> spec.path("/events"))
    .respondsWithSseStream(stream -> stream
        .delayBetweenChunksMillis(100)
        .chunk(SseEvent.data("{\"status\":\"accepted\"}"))
        .chunk(SseEvent.data("{\"status\":\"processed\"}")));
```
## Plain text stream

Use `respondsWithStream` for non-SSE streaming responses, such as line-delimited downloads or APIs that return partial data before the transfer completes.

```kotlin
mokksy.get {
  path("/download")
} respondsWithStream {
  delayBetweenChunks = 50.milliseconds
  chunks += "part-1\n"
  chunks += "part-2\n"
}
```
```java
mokksy.get(spec -> spec.path("/download"))
    .respondsWithStream(stream -> stream
        .delayBetweenChunksMillis(50)
        .chunk("part-1\n")
        .chunk("part-2\n"));
```
Use these examples to test code that processes data before the full response is available.
For timeout, retry, and malformed-stream cases, continue with [Failure simulation](../failure-simulation/).

[sse]: https://html.spec.whatwg.org/multipage/server-sent-events.html "Server-Sent Events Specification"


# Failure simulation


Production HTTP clients need more than happy-path JSON. Use these patterns to verify retries, backoff, timeouts, stream parsing, and fallback behavior.

## Delayed response

Use `delay` when the server should accept the request but wait before sending the response.

```kotlin
mokksy.get {
  path("/slow")
} respondsWith {
  delay = 2.seconds
  body = "eventually-ok"
}
```
```java
mokksy.get(spec -> spec.path("/slow"))
    .respondsWith(response -> response
        .delayMillis(2_000)
        .body("eventually-ok"));
```
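
On the client side, a delayed response is usually paired with a request timeout. The sketch below shows that pairing with `java.net.http.HttpClient`. To stay self-contained it uses the JDK's built-in `com.sun.net.httpserver.HttpServer` as a stand-in for the Mokksy stub; the class name and the 1000 ms / 200 ms values are illustrative assumptions.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpTimeoutException;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

// Hypothetical demo (not part of Mokksy): a slow endpoint versus a short client timeout.
final class SlowResponseTimeoutSketch {

    /** Starts a stand-in server whose /slow endpoint answers only after {@code delay}. */
    static HttpServer startSlowServer(Duration delay) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/slow", exchange -> {
            try {
                Thread.sleep(delay.toMillis());
                byte[] body = "eventually-ok".getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, body.length);
                exchange.getResponseBody().write(body);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } catch (IOException e) {
                // The client may already have given up; nothing left to do.
            } finally {
                exchange.close();
            }
        });
        server.start();
        return server;
    }

    /** Returns true when the GET fails with HttpTimeoutException before a response arrives. */
    static boolean timesOut(URI uri, Duration clientTimeout) {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).timeout(clientTimeout).GET().build();
        try {
            client.send(request, HttpResponse.BodyHandlers.ofString());
            return false;
        } catch (HttpTimeoutException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Runs the scenario end to end: 1000 ms server delay, 200 ms client timeout. */
    static boolean demo() {
        HttpServer server;
        try {
            server = startSlowServer(Duration.ofMillis(1_000));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try {
            URI uri = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/slow");
            return timesOut(uri, Duration.ofMillis(200));
        } finally {
            server.stop(0);
        }
    }
}
```

In a real test, replace the stand-in server with the `/slow` stub and build the URI from `mokksy.baseUrl()`.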
## Delayed chunks

Use `delayBetweenChunks` when the response should arrive incrementally and the client must process partial data.

```kotlin
mokksy.get {
  path("/slow-stream")
} respondsWithStream {
  delayBetweenChunks = 500.milliseconds
  chunks += "first\n"
  chunks += "second\n"
}
```
```java
mokksy.get(spec -> spec.path("/slow-stream"))
    .respondsWithStream(stream -> stream
        .delayBetweenChunksMillis(500)
        .chunk("first\n")
        .chunk("second\n"));
```
## Hanging request or stream

Use a stream that never completes when you need to verify client-side read timeouts, cancellation, or reconnect behavior.

```kotlin
mokksy.get {
  path("/never-finishes")
} respondsWithSseStream {
  flow = flow {
    emit(ServerSentEvent(data = "started"))
    awaitCancellation()
  }
}
```
```java
mokksy.get(spec -> spec.path("/never-finishes"))
    .respondsWithSseStream(stream -> stream
        .chunks(Stream.generate(() -> SseEvent.data("heartbeat")))
        .delayBetweenChunksMillis(1_000));
```
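Use this with a short client timeout to verify timeout handling. The Kotlin stub goes silent after its first event, while the Java stub keeps emitting a heartbeat every second, so against the Java variant choose a read timeout shorter than that interval or assert on an overall deadline.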

## Retry-after and rate limiting

Return `429 Too Many Requests` with `Retry-After` when the client should back off and retry later.

```kotlin
mokksy.post {
  path("/payments")
} respondsWith {
  httpStatus = HttpStatusCode.TooManyRequests
  headers += HttpHeaders.RetryAfter to "30"
  body = """{"error":"rate_limited"}"""
}
```
```java
mokksy.post(spec -> spec.path("/payments"))
    .respondsWith(response -> response
        .status(429)
        .header("Retry-After", "30")
        .body("{\"error\":\"rate_limited\"}"));
```
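
A client under test has to turn the `Retry-After` value into a wait time. The header may carry either a number of seconds or an HTTP-date, so a backoff helper could look roughly like this. `RetryAfterSketch` is a hypothetical name, and returning an empty `Optional` for unparseable values is a design assumption.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Optional;

// Hypothetical helper (not part of Mokksy): converts a Retry-After header into a wait duration.
final class RetryAfterSketch {

    static Optional<Duration> parse(String headerValue, Instant now) {
        if (headerValue == null) {
            return Optional.empty();
        }
        String value = headerValue.trim();
        if (value.isEmpty()) {
            return Optional.empty();
        }
        // Form 1: delay in seconds, e.g. "30"
        if (value.chars().allMatch(Character::isDigit)) {
            try {
                return Optional.of(Duration.ofSeconds(Long.parseLong(value)));
            } catch (NumberFormatException e) {
                return Optional.empty(); // value too large for a long
            }
        }
        // Form 2: HTTP-date, e.g. "Thu, 01 Jan 1970 00:01:00 GMT"
        try {
            Instant retryAt = ZonedDateTime.parse(value, DateTimeFormatter.RFC_1123_DATE_TIME).toInstant();
            Duration wait = Duration.between(now, retryAt);
            return Optional.of(wait.isNegative() ? Duration.ZERO : wait);
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }
}
```

Passing `now` explicitly keeps the helper deterministic in tests; production code would typically pass `Instant.now()`.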
## Malformed SSE

Send malformed event-stream data when you need to test parser failures and fallback behavior.

```kotlin
mokksy.get {
  path("/malformed-events")
} respondsWithStream {
  contentType = ContentType.Text.EventStream
  chunks += "data: valid\n\n"
  chunks += "this is not a valid event frame"
}
```
```java
mokksy.get(spec -> spec.path("/malformed-events"))
    .respondsWithStream(stream -> stream
        .contentType("text/event-stream")
        .chunk("data: valid\n\n")
        .chunk("this is not a valid event frame"));
```
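
The SSE specification tells parsers to ignore lines whose field name they do not recognize, so a lenient client drops the second chunk silently. A client that should fail loudly needs its own check. The sketch below is one hypothetical way to classify lines; `SseLineCheck` is not a Mokksy API.

```java
import java.util.Set;

// Hypothetical helper (not part of Mokksy): flags lines a strict SSE client might reject.
final class SseLineCheck {

    private static final Set<String> KNOWN_FIELDS = Set.of("data", "event", "id", "retry");

    static boolean isRecognized(String line) {
        if (line.isEmpty() || line.startsWith(":")) {
            return true; // event terminator or comment
        }
        int colon = line.indexOf(':');
        // Without a colon, the whole line counts as the field name.
        String field = colon < 0 ? line : line.substring(0, colon);
        return KNOWN_FIELDS.contains(field);
    }
}
```

Here `isRecognized("data: valid")` is `true`, while `isRecognized("this is not a valid event frame")` is `false`.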
## Partial failure

Send only part of the expected response when the client must handle incomplete transfers or timeout after partial data.

```kotlin
mokksy.get {
  path("/statement")
} respondsWithStream {
  chunks += "header\n"
  chunks += "row-1\n"
  delayBetweenChunks = 250.milliseconds
}
```
```java
mokksy.get(spec -> spec.path("/statement"))
    .respondsWithStream(stream -> stream
        .chunk("header\n")
        .chunk("row-1\n")
        .delayBetweenChunksMillis(250));
```
Keep the client timeout lower than the full expected transfer time to verify partial-data handling.
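
To assert on what arrived before the timeout, the client needs to separate complete lines from a trailing fragment. A minimal sketch follows, assuming newline-delimited rows as in the stub above; `LineBufferSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Mokksy): collects complete lines from arbitrarily split chunks.
final class LineBufferSketch {

    private final StringBuilder pending = new StringBuilder();

    /** Appends a chunk and returns every line completed by it, without the trailing newline. */
    List<String> accept(String chunk) {
        pending.append(chunk);
        List<String> lines = new ArrayList<>();
        int newline;
        while ((newline = pending.indexOf("\n")) >= 0) {
            lines.add(pending.substring(0, newline));
            pending.delete(0, newline + 1);
        }
        return lines;
    }

    /** Text received after the last newline; non-empty means the final row was cut off. */
    String remainder() {
        return pending.toString();
    }
}
```

After a timeout, the lines returned so far are safe to process, and a non-empty `remainder()` indicates a row that was cut off mid-transfer.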


# File-based configuration


File-based configuration lets you define stubs in a YAML file and load them at startup, without writing stub definitions in Kotlin or Java.

## Minimal example

```yaml
stubs:
  - name: ping
    method: GET
    path: /ping
    response:
      body: '{"response":"Pong"}'
      status: 200
```

Start the server and load the file:

```kotlin
val mokksy = Mokksy().start()
mokksy.loadStubsFromFile("/path/to/stubs.yaml")
```
```java
Mokksy mokksy = Mokksy.create().start();
mokksy.loadStubsFromFile("/path/to/stubs.yaml");
```
## Stub fields

| Field | Required | Default | Description |
|-------|----------|---------|-------------|
| `name` | no | — | Optional identifier shown in logs and error messages |
| `method` | no | `GET` | HTTP method: `GET`, `POST`, `PUT`, `DELETE`, `PATCH`, `HEAD`, `OPTIONS` |
| `path` | **yes** | — | Exact request path to match |
| `match` | no | — | Additional matching criteria (see below) |
| `response` | **yes** | — | Response definition |

### Matching criteria

```yaml
match:
  bodyContains:
    - '"userId":"42"'       # request body must contain this string
    - '"type":"order"'      # multiple strings — all must match
  headers:
    Authorization: Bearer token123   # request must carry this header value
    Content-Type: application/json
```
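
Read as a rule, these criteria combine with AND: every `bodyContains` string must occur in the request body, and every listed header must be present with the given value. The toy model below spells that out. It is not Mokksy's implementation; comparing header names case-insensitively and header values exactly are assumptions made for the sketch.

```java
import java.util.List;
import java.util.Map;

// Toy model (not Mokksy's implementation) of the AND semantics described above.
final class FileStubMatchSketch {

    static boolean matches(
        List<String> bodyContains,
        Map<String, String> requiredHeaders,
        String requestBody,
        Map<String, String> requestHeaders
    ) {
        // Every listed substring must occur somewhere in the body.
        boolean bodyOk = bodyContains.stream().allMatch(requestBody::contains);
        // Every required header must be present with the expected value.
        boolean headersOk = requiredHeaders.entrySet().stream().allMatch(required ->
            requestHeaders.entrySet().stream().anyMatch(actual ->
                actual.getKey().equalsIgnoreCase(required.getKey())
                    && actual.getValue().equals(required.getValue())));
        return bodyOk && headersOk;
    }
}
```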

### Response fields

| Field | Required | Default | Description |
|-------|----------|---------|-------------|
| `type` | no | `plain` | `plain`, `sse`, or `stream` |
| `body` | no | — | Response body (`plain` type only) |
| `status` | no | `200` | HTTP status code |
| `headers` | no | — | Response headers as a map |
| `delayMs` | no | `0` | Delay before the response is sent (milliseconds) |
| `chunks` | yes\* | — | Ordered list of chunks (`sse` and `stream` types) |
| `delayBetweenChunksMs` | no | `0` | Delay between chunks (milliseconds) |
| `contentType` | no | — | Override content type for `stream` responses |

\* Required when `type` is `sse` or `stream`.

## Response types

### Plain response

```yaml
stubs:
  - name: create-order
    method: POST
    path: /orders
    match:
      bodyContains:
        - '"product":"widget"'
    response:
      body: '{"orderId":"abc-123"}'
      status: 201
      headers:
        Location: /orders/abc-123
      delayMs: 50
```

### Server-Sent Events (SSE)

```yaml
stubs:
  - name: order-updates
    method: POST
    path: /orders/stream
    response:
      type: sse
      chunks:
        - '{"status":"processing"}'
        - '{"status":"shipped"}'
        - '{"status":"delivered"}'
      delayBetweenChunksMs: 100
```

The response content type is automatically set to `text/event-stream; charset=UTF-8`.

### Plain text stream

```yaml
stubs:
  - name: data-feed
    method: GET
    path: /feed
    response:
      type: stream
      chunks:
        - "chunk-one\n"
        - "chunk-two\n"
        - "chunk-three\n"
      contentType: text/plain; charset=UTF-8
      delayBetweenChunksMs: 50
```

## Loading the config

### Explicit path

```kotlin
val mokksy = Mokksy().start()
mokksy.loadStubsFromFile("/app/stubs.yaml")  // absolute path
mokksy.loadStubsFromFile("stubs.yaml")       // relative to working directory
```
```java
Mokksy mokksy = Mokksy.create().start();
mokksy.loadStubsFromFile("/app/stubs.yaml");  // absolute path
mokksy.loadStubsFromFile("stubs.yaml");       // relative to working directory
```
### Environment variable or system property

`loadStubsFromEnv()` checks `MOKKSY_CONFIG` first, then the `-Dmokksy.config` system property.
When either is set, `start()` loads the stubs automatically — you do not need to call `loadStubsFromEnv()` explicitly in that case.

```kotlin
// explicit load — useful when MOKKSY_CONFIG is not in the environment
val mokksy = Mokksy().start()
mokksy.loadStubsFromEnv()
```
```java
// explicit load — useful when MOKKSY_CONFIG is not in the environment
Mokksy mokksy = Mokksy.create().start();
mokksy.loadStubsFromEnv();
```
```bash
# via environment variable — stubs are loaded automatically on start()
MOKKSY_CONFIG=/app/stubs.yaml java -jar app.jar

# via system property
java -Dmokksy.config=/app/stubs.yaml -jar app.jar
```
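
The lookup order described above (environment variable first, then system property) can be pictured with a small sketch. This is an illustration only, not Mokksy's code; `ConfigPathLookupSketch` is a hypothetical name, and treating blank values as unset is an assumption.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Properties;

// Illustration only (not Mokksy's code): MOKKSY_CONFIG wins over -Dmokksy.config.
final class ConfigPathLookupSketch {

    static Optional<String> resolve(Map<String, String> env, Properties systemProperties) {
        String fromEnv = env.get("MOKKSY_CONFIG");
        if (fromEnv != null && !fromEnv.isBlank()) {
            return Optional.of(fromEnv);
        }
        String fromProperty = systemProperties.getProperty("mokksy.config");
        if (fromProperty != null && !fromProperty.isBlank()) {
            return Optional.of(fromProperty);
        }
        return Optional.empty();
    }
}
```

Calling `ConfigPathLookupSketch.resolve(System.getenv(), System.getProperties())` in a test setup can help confirm which file a given environment would select.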

## Validation errors

Mokksy validates the config at load time and reports clear errors when something is wrong:

| Problem | Error message |
|---------|---------------|
| File not found | `Mokksy config file not found: /path/to/stubs.yaml` |
| Malformed YAML | `Invalid YAML in Mokksy config file '/path/...': <parser message>` |
| Unknown HTTP method | `<name>: unknown HTTP method 'BREW'. Valid methods: GET, POST, ...` |
| Stream with no chunks | `<name>: response type 'sse' requires at least one chunk` |
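
Two of these rules are simple enough to check before a file ever reaches Mokksy, for example in a build step. The sketch below is a hypothetical pre-flight check, not Mokksy's validator; it mirrors the method list from the stub fields table and the chunk requirement from the response fields table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical pre-flight check (not Mokksy's validator) for two rules from the tables above.
final class StubValidationSketch {

    private static final Set<String> METHODS =
        Set.of("GET", "POST", "PUT", "DELETE", "PATCH", "HEAD", "OPTIONS");

    static List<String> validate(String name, String method, String responseType, List<String> chunks) {
        List<String> errors = new ArrayList<>();
        if (!METHODS.contains(method)) {
            errors.add(name + ": unknown HTTP method '" + method + "'");
        }
        boolean streaming = responseType.equals("sse") || responseType.equals("stream");
        if (streaming && chunks.isEmpty()) {
            errors.add(name + ": response type '" + responseType + "' requires at least one chunk");
        }
        return errors;
    }
}
```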

## Combining file config with code stubs

File-based configuration and the code API can be used together — they register stubs on the same server:

```kotlin
val mokksy = Mokksy().start()

// load shared stubs from file
mokksy.loadStubsFromFile("shared-stubs.yaml")

// add test-specific stubs via DSL
mokksy.get { path("/health") } respondsWithStatus HttpStatusCode.OK
```
```java
Mokksy mokksy = Mokksy.create().start();

// load shared stubs from file
mokksy.loadStubsFromFile("shared-stubs.yaml");

// add test-specific stubs via Java API
mokksy.get(spec -> spec.path("/health"))
    .respondsWith(response -> response.body("OK"));
```


# Docker


The `mokksy/server-jvm` image runs Mokksy as a standalone HTTP mock server. Stubs are loaded from a YAML file at startup, before the server begins accepting connections.

## Quick start

Create a stubs file:

```yaml
stubs:
  - name: ping
    method: GET
    path: /ping
    response:
      body: '{"response":"Pong"}'
      status: 200
```

Start the container, mounting the file to the default config path:

```bash
docker run -p 8080:8080 \
  -v "$(pwd)/stubs.yaml:/config/stubs.yaml" \
  mokksy/server-jvm:snapshot
```

The image sets `MOKKSY_CONFIG=/config/stubs.yaml` by default, so mounting the file there requires no extra environment variables.

## Docker Compose

```yaml
services:
  mokksy:
    image: mokksy/server-jvm
    ports:
      - "8080:8080"
    volumes:
      - ./stubs.yaml:/config/stubs.yaml
```

To use a different path, override `MOKKSY_CONFIG`:

```yaml
services:
  mokksy:
    image: mokksy/server-jvm
    ports:
      - "8080:8080"
    volumes:
      - ./stubs.yaml:/app/stubs.yaml
    environment:
      MOKKSY_CONFIG: /app/stubs.yaml
```

## Environment variables

| Variable | Default | Description |
|----------|---------|-------------|
| `MOKKSY_CONFIG` | `/config/stubs.yaml` | Path to the YAML stubs file inside the container |
| `JAVA_OPTS` | `-XX:MaxRAMPercentage=75 -XX:+ExitOnOutOfMemoryError` | JVM flags passed to the server process |

## Startup behaviour

Stubs are loaded from `MOKKSY_CONFIG` **before** the server binds its port. Every request is matchable from the first connection — there is no window where a request can arrive before stubs are registered.

If `MOKKSY_CONFIG` is unset or the file is absent, the server starts with an empty stub registry and returns `404` for all requests.

## Stubs file format

See [File-based configuration](../file-config/) for the full YAML schema, supported response types, and matching options.


# Ktor integration

If you already own a [Ktor][ktor] `Application` — a test harness with authentication middleware, custom plugins, or routes that must coexist with stubs — use the `mokksy` extension functions to mount stub handling directly, without allocating a second embedded server.

## Application-level installation

`Application.mokksy(server)` installs [SSE][sse], `DoubleReceive`, and `ContentNegotiation`
automatically, then mounts a catch-all route that dispatches every incoming request through the
stub registry:

```kotlin
import dev.mokksy.mokksy.MokksyServer
import dev.mokksy.mokksy.mokksy
import io.ktor.server.engine.embeddedServer
import io.ktor.server.netty.Netty

val server = MokksyServer()
server.get { path("/ping") } respondsWith { body = "pong" }

embeddedServer(Netty, port = 8080) {
    mokksy(server)
}.start(wait = true)
```
Use this overload when Mokksy owns the entire application and you want the simplest possible setup.

## Route-level installation

`Route.mokksy(server)` mounts the stub handler inside an existing route scope. Unlike the
application-level overload, it does **not** install plugins — you are responsible for installing
`SSE`, `DoubleReceive`, and `ContentNegotiation` on the surrounding application. This makes it
suitable when Mokksy stubs coexist with real routes:

```kotlin
routing {
    get("/health") { call.respondText("OK") }
    mokksy(server)
}
```
To place stubs behind an authentication check, install the required plugins and wrap `mokksy` in
an `authenticate` block:

```kotlin
install(SSE)
install(DoubleReceive)
install(ContentNegotiation) { json() }
install(Authentication) {
    basic("auth-basic") {
        validate { credentials ->
            if (credentials.name == "user" && credentials.password == "pass") {
                UserIdPrincipal(credentials.name)
            } else null
        }
    }
}

routing {
    authenticate("auth-basic") {
        mokksy(server)
    }
}
```
Both extension functions accept any `path` pattern as a second parameter (default: `"{...}"`,
which matches all routes). Narrow the scope by passing a prefix:

```kotlin
mokksy(server, path = "/api/{...}")
```
[sse]: https://html.spec.whatwg.org/multipage/server-sent-events.html "Server-Sent Events Specification"
[ktor]: https://ktor.io


# Anthropic

[![Maven Central](https://img.shields.io/maven-central/v/dev.mokksy.aimocks/ai-mocks-anthropic.svg?label=Maven%20Central)](https://central.sonatype.com/artifact/dev.mokksy.aimocks/ai-mocks-anthropic) 

[MockAnthropic](https://github.com/mokksy/ai-mocks/blob/main/ai-mocks-anthropic/src/commonMain/kotlin/dev/mokksy/aimocks/anthropic/MockAnthropic.kt)
provides a local mock server for simulating [Anthropic API endpoints](https://docs.anthropic.com/en/api). It simplifies
testing by allowing you to define request expectations and responses without making real network calls.

## Quick Start

### Add Dependency

Include the library in your test dependencies (Maven or Gradle).
```kotlin
testImplementation("dev.mokksy.aimocks:ai-mocks-anthropic-jvm:$latestVersion")
```
```xml
<dependency>
  <groupId>dev.mokksy.aimocks</groupId>
  <artifactId>ai-mocks-anthropic-jvm</artifactId>
  <version>[LATEST_VERSION]</version>
  <scope>test</scope>
</dependency>
```
### Initialize the Server

```kotlin
val anthropic = MockAnthropic(verbose = true)
```
- The server will start on a random free port by default.
- You can retrieve the server's base URL via `anthropic.baseUrl()`.

### Configure Requests and Responses

Here's an example that sets up a mock "messages" endpoint and defines the response:

```kotlin
anthropic.messages {
  temperature = 0.42
  model = "claude-3-7-sonnet-latest"
  maxTokens = 100
  topP = 0.95
  topK = 40
  userId = "user123"
  systemMessageContains("helpful assistant")
  userMessageContains("say 'Hello!'")
} responds {
  messageId = "msg_1234567890"
  assistantContent = "Hello" // response content
  delay = 200.milliseconds // simulate delay
  stopReason = "end_turn" // reason for stopping
}
```
- The `messages { ... }` block sets how the incoming request must look.
- The `responds { ... }` block defines what the mock server returns.

### Calling Anthropic API Client

Here's an example that sets up and calls the
official [Anthropic SDK client](https://github.com/anthropics/anthropic-sdk-java):

```kotlin
// create Anthropic SDK client
val client =
  AnthropicOkHttpClient
    .builder()
    .apiKey("my-anthropic-api-key")
    .baseUrl(anthropic.baseUrl())
    .build()

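// prepare Anthropic SDK call; every field the stub above matches on must be present
val params =
  MessageCreateParams
    .builder()
    .temperature(0.42)
    .maxTokens(100)
    .topP(0.95)
    .topK(40)
    .metadata(Metadata.builder().userId("user123").build())
    .system("You are a helpful assistant.")
    .addUserMessage("Just say 'Hello!' and nothing else")
    .model("claude-3-7-sonnet-latest")
    .build()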

val result =
  client
    .messages()
    .create(params)

result
  .content()
  .first()
  .asText()
  .text() shouldBe "Hello" // kotest matcher
```
## Streaming Responses

You can also configure streaming responses (such as chunked SSE events) for testing:

```kotlin
anthropic.messages {
  temperature = 0.7
  model = "claude-3-7-sonnet-latest"
  maxTokens = 150
  topP = 0.95
  topK = 40
  userId = "user123"
  systemMessageContains("person from 60s")
  userMessageContains("What do we need?")
} respondsStream {
  responseChunks = listOf("All", " we", " need", " is", " Love")
  delay = 50.milliseconds
  delayBetweenChunks = 10.milliseconds
  stopReason = "end_turn"
}
```
Or, you can use a flow to generate the response:

```kotlin
anthropic.messages("anthropic-messages-flow") {
  temperature = 0.7
  model = "claude-3-7-sonnet-latest"
  maxTokens = 150
  topP = 0.95
  topK = 40
  userId = "user123"
  systemMessageContains("person from 60s")
  userMessageContains("What do we need?")
} respondsStream {
  responseFlow =
    flow {
      emit("All")
      emit(" we")
      emit(" need")
      emit(" is")
      emit(" Love")
    }
  delay = 60.milliseconds
  delayBetweenChunks = 15.milliseconds
  stopReason = "end_turn"
}
```
Call Anthropic client:

```kotlin
val params =
  MessageCreateParams
    .builder()
    .temperature(0.7)
    .maxTokens(150)
    .topP(0.95)
    .topK(40)
    .metadata(Metadata.builder().userId("user123").build())
    .system("You are a person from 60s")
    .addUserMessage("What do we need?")
    .model("claude-3-7-sonnet-latest")
    .build()

val timedValue =
  measureTimedValue {
    client
      .messages()
      .createStreaming(params)
      .use { streamResponse ->
        streamResponse.stream().count()
      }
  }
timedValue.duration shouldBeLessThan 10.seconds
timedValue.value shouldBeLessThan 10L
```
Use your Anthropic client to invoke the endpoint at `anthropic.baseUrl()`, and it will receive a streamed response.

## Error Simulation

To test client behavior for exceptional cases:

```kotlin
anthropic.messages {
  // expected request
} respondsError {
  httpStatus = HttpStatusCode.InternalServerError // Set an error status code
  body = """{
      "type": "error",
      "error": {
        "type": "api_error",
        "message": "An unexpected error has occurred internal to Anthropic's systems."
      }
    }"""
  // Optionally add a delay or other properties
}
```
## Practical Example in Tests

```kotlin
@Test
fun `test basic conversation`() {
  // Arrange: mock the messages API
  anthropic.messages {
    userMessageContains("Hello")
  } responds {
    assistantContent = "Hi from mock!"
  }

  // Act: call the mocked endpoint in your test code
  val result = yourAnthropicClient.sendMessage("Hello")

  // Assert: verify the response
  assertEquals("Hi from mock!", result.assistantMessage)
}
```
## Integration with LangChain4j

You can also use the LangChain4j Kotlin extensions:

```kotlin
// Set up mock response
anthropic.messages {
  userMessageContains("Hello")
} responds {
  assistantContent = "Hello"
  delay = 42.milliseconds
}

// Create the LangChain4j model
val model: AnthropicChatModel =
  AnthropicChatModel
    .builder()
    .apiKey("foo")
    .baseUrl(anthropic.baseUrl() + "/v1")
    .modelName("claude-3-5-haiku-20241022")
    .build()

// Make the request using Kotlin DSL
val result =
  model.chat {
    messages += userMessage("Say Hello")
  }

// Verify the response
result.apply {
  finishReason() shouldBe FinishReason.STOP
  tokenUsage() shouldNotBe null
  aiMessage().text() shouldBe "Hello"
}
```
### Stream Responses

Mock streaming responses easily with flow support:

```kotlin
// Example 1: Using responseChunks
val userMessage = "What do we need?"
anthropic.messages {
  systemMessageContains("You are a person of 60s")
  userMessageContains(userMessage)
} respondsStream {
  responseChunks = listOf("All", " we", " need", " is", " Love")
}

// Example 2: Using responseFlow
val userMessage2 = "What is in the sea?"
anthropic.messages {
  systemMessageContains("You are a person of 60s")
  userMessageContains(userMessage2)
} respondsStream {
  responseFlow =
    flow {
      emit("Yellow")
      emit(" submarine")
    }
}

// Create the streaming model
val model: AnthropicStreamingChatModel =
  AnthropicStreamingChatModel
    .builder()
    .apiKey("foo")
    .baseUrl(anthropic.baseUrl() + "/v1")
    .modelName("claude-3-5-haiku-20241022")
    .build()

// Method 1: Using Kotlin Flow API
model
  .chatFlow {
    messages += systemMessage("You are a person of 60s")
    messages += userMessage(userMessage2)
  }.buffer(capacity = 8096)
  .collect {
    when (it) {
      is StreamingChatModelReply.PartialResponse -> {
        println("token = ${it.partialResponse}")
      }

      is StreamingChatModelReply.CompleteResponse -> {
        println("Completed: $it")
      }

      is StreamingChatModelReply.Error -> {
        println("Error: $it")
      }
    }
  }

// Method 2: Using Java-style API with a handler
model.chat(
  ChatRequest
    .builder()
    .messages(
      systemMessage("You are a person of 60s"),
      userMessage(userMessage2)
    )
    .build(),
  object : StreamingChatResponseHandler {
    override fun onCompleteResponse(completeResponse: ChatResponse) {
      println("Received CompleteResponse: $completeResponse")
    }

    override fun onPartialResponse(partialResponse: String) {
      println("Received partial response: $partialResponse")
    }

    override fun onError(error: Throwable) {
      println("Received error: $error")
    }
  }
)
```
## Stopping the Server

```kotlin
anthropic.shutdown()
```
Stops the mock server and frees up resources.


# OpenAI

[![Maven Central](https://img.shields.io/maven-central/v/dev.mokksy.aimocks/ai-mocks-openai.svg?label=Maven%20Central)](https://central.sonatype.com/artifact/dev.mokksy.aimocks/ai-mocks-openai)

AI-Mocks OpenAI is a specialized mock server implementation for mocking the OpenAI API, built using Mokksy.

`MockOpenai` is tested against the official [openai-java SDK](https://github.com/openai/openai-java) and popular JVM AI
frameworks: [LangChain4j](https://github.com/langchain4j/langchain4j)
and [Spring AI](https://docs.spring.io/spring-ai/reference/api/chatclient.html).

Currently, it supports:
- [Chat Completions](https://platform.openai.com/docs/api-reference/chat/create) (streaming and non-streaming)
- [Embeddings](https://platform.openai.com/docs/api-reference/embeddings/create)
- [Moderations](https://platform.openai.com/docs/api-reference/moderations/create)

## Quick Start

Include the library in your test dependencies (Maven or Gradle).
```kotlin
testImplementation("dev.mokksy.aimocks:ai-mocks-openai-jvm:$latestVersion")
```
```xml
<dependency>
  <groupId>dev.mokksy.aimocks</groupId>
  <artifactId>ai-mocks-openai-jvm</artifactId>
  <version>[LATEST_VERSION]</version>
  <scope>test</scope>
</dependency>
```
## Chat Completions API

Set up a mock server and define mock responses:

```kotlin
val openai = MockOpenai(verbose = true)
```
Let's simulate the OpenAI [Chat Completions API](https://platform.openai.com/docs/api-reference/chat):

```kotlin
// Define mock response
openai.completion {
  temperature = 0.7
  seed = 42
  model = "gpt-4o-mini"
  maxTokens = 100
  topP = 0.95
  systemMessageContains("helpful assistant")
  userMessageContains("say 'Hello!'")
} responds {
  assistantContent = "Hello"
  finishReason = "stop"
  delay = 200.milliseconds // delay before answer
}

// OpenAI client setup
val client: OpenAIClient =
  OpenAIOkHttpClient
    .builder()
    .apiKey("dummy-api-key")
    .baseUrl(openai.baseUrl()) // connect to mock OpenAI
    .responseValidation(true)
    .build()

// Use the mock endpoint
val params =
  ChatCompletionCreateParams
    .builder()
    .temperature(0.7)
    .maxCompletionTokens(100)
    .topP(0.95)
    .messages(
      listOf(
        ChatCompletionMessageParam.ofSystem(
          ChatCompletionSystemMessageParam
            .builder()
            .content(
              "You are a helpful assistant.",
            ).build(),
        ),
        ChatCompletionMessageParam.ofUser(
          ChatCompletionUserMessageParam
            .builder()
            .content("Just say 'Hello!' and nothing else")
            .build(),
        ),
      ),
    ).model("gpt-4o-mini")
    .build()

val result: ChatCompletion =
  client
    .chat()
    .completions()
    .create(params)

println(result)
```
## Mocking Negative Scenarios

With AI-Mocks it is possible to test negative scenarios, such as erroneous responses and delays.

### Custom Error Response

```kotlin
openai.completion {
  temperature = 0.7
  seed = 42
  model = "gpt-4o-mini"
  maxTokens = 100
  systemMessageContains("helpful assistant")
  userMessageContains("say 'Hello!'")
}.respondsError(String::class) {
  body =
    // language=json
    """
    {
      "type": "error",
      "code": "ERR_SOMETHING",
      "message": "Arrr, blast me barnacles! This be not what ye expect! 🏴‍☠️",
      "param": null
    }
    """.trimIndent()
  contentType = ContentType.Text.Plain
  delay = 100.milliseconds
  httpStatus = HttpStatusCode.PreconditionFailed
}
```
### OpenAI-Compatible Error Response

```kotlin
openai.completion {
  temperature = 0.7
  seed = 42
  model = "gpt-4o-mini"
  maxTokens = 100
  systemMessageContains("helpful assistant")
  userMessageContains("say 'Hello!'")
}.respondsError(String::class) {
  body =
    // language=json
    """
    {
      "error": {
        "type": "server_error",
        "code": "ERR_SOMETHING",
        "message": "Arrr, blast me barnacles! This be not what ye expect! 🏴‍☠️",
        "param": "foo"
      }
    }
    """.trimIndent()
  delay = 150.milliseconds
  contentType = ContentType.Application.Json
  httpStatus = HttpStatusCode.InternalServerError
}
```
## Integration with LangChain4j

You can also use the LangChain4j Kotlin extensions:

```kotlin
val model: OpenAiChatModel =
  OpenAiChatModel
    .builder()
    .apiKey("dummy-api-key")
    .baseUrl(openai.baseUrl())
    .build()

val result =
  model.chat {
    parameters =
      OpenAiChatRequestParameters
        .builder()
        .temperature(0.7)
        .modelName("gpt-4o-mini")
        .maxCompletionTokens(100)
        .topP(0.95)
        .seed(42)
        .build()
    messages += userMessage("Say Hello")
  }

println(result)
```
### Stream Responses

Mock streaming responses easily with flow support or a list of chunks.

#### Streaming with List of Chunks

```kotlin
openai.completion {
  temperature = 0.7
  model = "gpt-4o-mini"
  topP = 0.95
} respondsStream {
  responseChunks = listOf("All", " we", " need", " is", " Love")
  delay = 50.milliseconds
  delayBetweenChunks = 10.milliseconds
  finishReason = "stop"
}

// Create OpenAI client
val client: OpenAIClient =
  OpenAIOkHttpClient
    .builder()
    .apiKey("dummy-key")
    .baseUrl(openai.baseUrl())
    .build()

// Make streaming request
val params =
  ChatCompletionCreateParams
    .builder()
    .temperature(0.7)
    .topP(0.95)
    .messages(
      listOf(
        ChatCompletionMessageParam.ofUser(
          ChatCompletionUserMessageParam
            .builder()
            .content("What do we need?")
            .build(),
        ),
      ),
    ).model("gpt-4o-mini")
    .build()

val result = StringBuilder()
client
  .chat()
  .completions()
  .createStreaming(params)
  .use { response ->
    response
      .stream()
      .flatMap { it.choices().stream() }
      .flatMap { it.delta().content().stream() }
      .forEach { result.append(it) }
  }

// Result: "All we need is Love"
```
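
On the wire, the real OpenAI API delivers a streamed chat completion as Server-Sent Events: each event is a `data:` line carrying a JSON chunk, and the stream ends with `data: [DONE]`. The sketch below assumes the mock follows that format and shows how the chunks add up to the final text. The sample body is hand-written for illustration, not captured from `MockOpenai`, and the regex is a shortcut that ignores JSON escapes; a real test would use a JSON parser or the SDK as shown above. The function names are hypothetical.

```kotlin
// Hand-written sample of an OpenAI-style SSE body (an assumption, not captured output).
fun sampleSseBody(chunks: List<String>): String =
    chunks.joinToString(separator = "") { chunk ->
        "data: {\"choices\":[{\"index\":0,\"delta\":{\"content\":\"$chunk\"}}]}\n\n"
    } + "data: [DONE]\n\n"

// Shortcut extraction of delta.content; does not handle escaped quotes.
private val contentField = Regex("\"content\":\"([^\"]*)\"")

// Concatenates the content of every data event that precedes the [DONE] marker.
fun joinStreamedContent(sseBody: String): String =
    sseBody
        .lineSequence()
        .filter { it.startsWith("data: ") }
        .map { it.removePrefix("data: ") }
        .takeWhile { it != "[DONE]" }
        .mapNotNull { contentField.find(it)?.groupValues?.get(1) }
        .joinToString(separator = "")

fun main() {
    val body = sampleSseBody(listOf("All", " we", " need", " is", " Love"))
    println(joinStreamedContent(body)) // All we need is Love
}
```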
#### Streaming with Kotlin Flow

```kotlin
openai.completion {
  temperature = 0.7
  model = "gpt-4o-mini"
} respondsStream {
  responseFlow =
    flow {
      emit("All")
      emit(" we")
      emit(" need")
      emit(" is")
      emit(" Love")
    }
  delay = 60.milliseconds
  delayBetweenChunks = 15.milliseconds
  finishReason = "stop"
}
```
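
Since `responseFlow` above is built with `flow { emit(...) }`, any `Flow<String>` should presumably work in its place. As a sketch, the hypothetical helper below (not part of AI-Mocks) splits a sentence into word-sized chunks and optionally pauses after each one. It needs `kotlinx-coroutines-core` on the classpath.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking
import kotlin.time.Duration

// Hypothetical helper: emits each word together with its leading whitespace,
// so that concatenating the chunks restores the original text.
fun wordChunks(text: String, pause: Duration = Duration.ZERO): Flow<String> =
    flow {
        Regex("""\s*\S+""").findAll(text).forEach { match ->
            emit(match.value)
            delay(pause) // returns immediately for Duration.ZERO
        }
    }

fun main(): Unit = runBlocking {
    val chunks = wordChunks("All we need is Love").toList()
    println(chunks.size)              // 5
    println(chunks.joinToString("")) // All we need is Love
}
```

In a stub this could be plugged in as `responseFlow = wordChunks("All we need is Love")`.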
## Integration with Spring-AI

To test Spring-AI integration:

```kotlin
// create mock server
val openai = MockOpenai(verbose = true)

// create Spring-AI client
val chatClient =
  ChatClient
    .builder(
      org.springframework.ai.openai.OpenAiChatModel
        .builder()
        .openAiApi(
          OpenAiApi
            .builder()
            .apiKey("demo-key")
            .baseUrl(openai.baseUrl())
            .build(),
        ).build(),
    ).build()

// Set up a mock for the LLM call
openai.completion {
  temperature = 0.7
  seed = 42
  model = "gpt-4o-mini"
  maxTokens = 100
  topP = 0.95
  topK = 40
  systemMessageContains("helpful pirate")
  userMessageContains("say 'Hello!'")
} responds {
  assistantContent = "Ahoy there, matey! Hello!"
  finishReason = "stop"
  delay = 200.milliseconds
}

// Configure Spring-AI client call
val response =
  chatClient
    .prompt()
    .system("You are a helpful pirate")
    .user("Just say 'Hello!'")
    .options<OpenAiChatOptions>(
      OpenAiChatOptions
        .builder()
        .maxCompletionTokens(100)
        .temperature(0.7)
        .topP(0.95)
        .model("gpt-4o-mini")
        .seed(42)
        .build(),
    )
    // Make a call
    .call()
    .chatResponse()

// Verify the response
response?.result shouldNotBe null
response?.result?.apply {
  metadata.finishReason shouldBe "STOP"
  output.text shouldBe "Ahoy there, matey! Hello!"
}
```
See the [integration tests](https://github.com/mokksy/ai-mocks/tree/main/ai-mocks-openai/src/jvmTest/kotlin/dev/mokksy/aimocks/openai/springai) for more examples.

## Embeddings API

Mock the OpenAI [Embeddings API](https://platform.openai.com/docs/api-reference/embeddings) to test your embeddings generation:

### Basic Embedding Response

```kotlin
// Set up mock server
val openai = MockOpenai(verbose = true)

// Define mock response for embedding request
openai.embeddings {
    model = "text-embedding-3-small"
    inputContains("Hello")
    stringInput("Hello world")
} responds {
    delay = 200.milliseconds
    embeddings(
        listOf(0.1f, 0.2f, 0.3f)
    )
}

// Create OpenAI client
val client: OpenAIClient =
    OpenAIOkHttpClient
        .builder()
        .apiKey("dummy-key")
        .baseUrl(openai.baseUrl())
        .responseValidation(true)
        .build()

// Make embedding request
val params = EmbeddingCreateParams
    .builder()
    .model("text-embedding-3-small")
    .input(EmbeddingCreateParams.Input.ofString("Hello world"))
    .build()

val result = client
    .embeddings()
    .create(params)

// Verify results
result.model() // "text-embedding-3-small"
result.data()[0].embedding() // [0.1, 0.2, 0.3]
result.data()[0].index() // 0
```
### Multiple Embeddings

You can mock multiple embeddings for batch input:

```kotlin
openai.embeddings {
    model = "text-embedding-3-small"
    stringListInput(listOf("Hello", "world"))
} responds {
    delay = 100.milliseconds
    embeddings(
        listOf(0.1f, 0.2f, 0.3f),
        listOf(0.4f, 0.5f, 0.6f)
    )
}

val params = EmbeddingCreateParams
    .builder()
    .model("text-embedding-3-small")
    .input(EmbeddingCreateParams.Input.ofArrayOfStrings(listOf("Hello", "world")))
    .build()

val result = client
    .embeddings()
    .create(params)

// Returns 2 embeddings
result.data().size // 2
result.data()[0].embedding() // [0.1, 0.2, 0.3]
result.data()[1].embedding() // [0.4, 0.5, 0.6]
```
### Advanced Input Matching

You can use `inputContains()` to match requests where the input contains specific substrings:

```kotlin
openai.embeddings {
    model = "text-embedding-3-small"
    inputContains("Hello")
    inputContains("world")
    stringInput("Hello world")
} responds {
    embeddings(listOf(0.1f, 0.2f, 0.3f))
}
```
### Error Scenarios

Test error handling for embeddings:

```kotlin
openai.embeddings {
    model = "text-embedding-3-small"
    stringInput("boom")
}.respondsError(String::class) {
    body = "Kaboom!"
    contentType = ContentType.Text.Plain
    httpStatus = HttpStatusCode.BadRequest
    delay = 200.milliseconds
}

// This will throw BadRequestException
val params = EmbeddingCreateParams
    .builder()
    .model("text-embedding-3-small")
    .input(EmbeddingCreateParams.Input.ofString("boom")) // must match stringInput("boom") in the stub
    .build()

try {
    client.embeddings().create(params)
} catch (e: BadRequestException) {
    // Handle error
}
```
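
To see what such an error stub looks like at the HTTP level, independent of any SDK, one can send a raw request and inspect the status code and body. The sketch below is self-contained: instead of `MockOpenai` it starts the JDK's built-in `HttpServer` as a stand-in that always answers `400` with `Kaboom!`. The `/embeddings` path and the function name are assumptions made for illustration; against the real mock the request would target the server at `openai.baseUrl()`.

```kotlin
import com.sun.net.httpserver.HttpServer
import java.net.InetSocketAddress
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

// Starts a stand-in server (NOT MockOpenai), sends one request, returns status and body.
fun fetchStubbedError(): Pair<Int, String> {
    val server = HttpServer.create(InetSocketAddress("127.0.0.1", 0), 0)
    server.createContext("/embeddings") { exchange ->
        exchange.requestBody.readAllBytes() // drain the request before answering
        val body = "Kaboom!".toByteArray()
        exchange.responseHeaders.add("Content-Type", "text/plain")
        exchange.sendResponseHeaders(400, body.size.toLong())
        exchange.responseBody.use { it.write(body) }
    }
    server.start()
    try {
        val request =
            HttpRequest
                .newBuilder()
                .uri(URI.create("http://127.0.0.1:${server.address.port}/embeddings"))
                .header("Content-Type", "application/json")
                .POST(
                    HttpRequest.BodyPublishers.ofString(
                        """{"model":"text-embedding-3-small","input":"boom"}""",
                    ),
                ).build()
        val response =
            HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
        return response.statusCode() to response.body()
    } finally {
        server.stop(0) // always release the port
    }
}

fun main() {
    println(fetchStubbedError()) // (400, Kaboom!)
}
```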
## Moderations API

Mock the OpenAI [Moderations API](https://platform.openai.com/docs/api-reference/moderations) to test content moderation:

### Basic Moderation Response

```kotlin
// Set up mock server
val openai = MockOpenai(verbose = true)

// Define mock response for moderation request
openai.moderation {
    model = "omni-moderation-latest"
    inputContains("Hello world")
} responds {
    flagged = true
    delay = 200.milliseconds
    category(name = "harassment", score = 0.1, inputTypes = listOf(TEXT))
    category(
        name = ModerationCategory.SEXUAL,
        score = 0.2,
        inputTypes = listOf(TEXT, InputType.IMAGE)
    )
}

// Create OpenAI client
val client: OpenAIClient =
    OpenAIOkHttpClient
        .builder()
        .apiKey("dummy-key")
        .baseUrl(openai.baseUrl())
        .responseValidation(true)
        .build()

// Make moderation request
val params =
    ModerationCreateParams
        .builder()
        .model("omni-moderation-latest")
        .input("Hello world")
        .build()

val result = client
    .moderations()
    .create(params)

// Verify results
result.model() // "omni-moderation-latest"
result.results()[0].flagged() // true
result.results()[0].categories().harassment() // true
result.results()[0].categoryScores().harassment() // 0.1
result.results()[0].categoryAppliedInputTypes().harassment() // [TEXT]
```
### Moderation Error Scenarios

```kotlin
openai.moderation {
    model = "omni-moderation-latest"
    inputContains("boom")
}.respondsError(String::class) {
    body = "Kaboom!"
    contentType = ContentType.Text.Plain
    httpStatus = HttpStatusCode.BadRequest
    delay = 200.milliseconds
}

// This will throw BadRequestException
val params = ModerationCreateParams
    .builder()
    .model("omni-moderation-latest")
    .input("boom")
    .build()

try {
    client.moderations().create(params)
} catch (e: BadRequestException) {
    // Handle error
}
```


# Gemini

[![Maven Central](https://img.shields.io/maven-central/v/dev.mokksy.aimocks/ai-mocks-gemini.svg?label=Maven%20Central)](https://central.sonatype.com/artifact/dev.mokksy.aimocks/ai-mocks-gemini)

AI-Mocks Gemini is a specialized mock server implementation for mocking the Google Vertex AI Gemini API, built using Mokksy.

`MockGemini` is tested against the Spring AI framework with the Vertex AI Gemini integration.

Currently, it supports basic content generation requests and streaming responses.

## Quick Start

Include the library in your test dependencies (Maven or Gradle).
```kotlin
testImplementation("dev.mokksy.aimocks:ai-mocks-gemini-jvm:$latestVersion")
```
```xml
<dependency>
  <groupId>dev.mokksy.aimocks</groupId>
  <artifactId>ai-mocks-gemini-jvm</artifactId>
  <version>[LATEST_VERSION]</version>
  <scope>test</scope>
</dependency>
```
## Content Generation API

Set up a mock server and define mock responses:

```kotlin
val gemini = MockGemini(verbose = true)
```
Let's simulate the Gemini content generation API:

```kotlin
// Define mock response
gemini.generateContent {
  temperature = 0.7
  model = "gemini-2.0-flash"
  project = "your-project-id"
  location = "us-central1"
  apiVersion = "v1beta1"
  path = null // custom request path, overrides "apiVersion"
  seed = 42
  maxTokens = 100
  topK = 40
  topP = 0.95
  maxOutputTokens(200)
  systemMessageContains("helpful pirate")
  userMessageContains("say 'Hello!'")
  requestBodyContains("helpful")
  requestBodyContainsIgnoringCase("PIRATE")
  requestBodyDoesNotContains("unwanted text")
  requestBodyDoesNotContainsIgnoringCase("unwanted case insensitive text")
  requestMatchesPredicate { it.generationConfig?.topP == 0.95 }
} responds {
  content = "Ahoy there, matey! Hello!"
  finishReason = "stop"
  role = "model"
  delay = 42.milliseconds // delay before answer
}
```
### Configuration Options

The following tables list all available configuration options for mocking Gemini API calls.

#### Request Configuration Options

| Option                                   | Description                                                                        |
|------------------------------------------|------------------------------------------------------------------------------------|
| `temperature`                            | Controls randomness of the output. Lower values make output more deterministic.    |
| `model`                                  | The Gemini model to use.                                                           |
| `maxTokens`                              | Maximum number of tokens to generate.                                              |
| `topK`                                   | Limits token selection to the K most likely next tokens.                           |
| `topP`                                   | Limits token selection to tokens with cumulative probability of P.                 |
| `project`                                | Google Cloud project ID.                                                           |
| `location`                               | Google Cloud location.                                                             |
| `apiVersion`                             | API version to use.                                                                |
| `path`                                   | Custom request path.                                                               |
| `seed`                                   | Seed for deterministic generation.                                                 |
| `maxOutputTokens`                        | Maximum number of tokens to generate.                                              |
| `systemMessageContains`                  | Matches requests with system messages containing the specified text.               |
| `userMessageContains`                    | Matches requests with user messages containing the specified text.                 |
| `requestBodyContains`                    | Matches requests with bodies containing the specified text.                        |
| `requestBodyContainsIgnoringCase`        | Matches requests with bodies containing the specified text (case-insensitive).     |
| `requestBodyDoesNotContains`             | Matches requests with bodies not containing the specified text.                    |
| `requestBodyDoesNotContainsIgnoringCase` | Matches requests with bodies not containing the specified text (case-insensitive). |
| `requestMatchesPredicate`                | Matches requests satisfying a custom predicate.                                    |

#### Response Configuration Options

| Option         | Description                                            | Default Value                                |
|----------------|--------------------------------------------------------|----------------------------------------------|
| `content`      | The content to include in the response.                | `"This is a mock response from Gemini API."` |
| `finishReason` | The reason why the model stopped generating tokens.    | `"STOP"`                                     |
| `role`         | The role of the content.                               | `"model"`                                    |
| `delay`        | The delay before sending the response.                 | `Duration.ZERO`                              |
| `delayMillis`  | The delay before sending the response in milliseconds. | N/A                                          |

#### Streaming Content Generation

Here's an example of setting up a streaming content generation mock:

```kotlin
// Define streaming mock response
gemini.generateContentStream {
  temperature = 0.7
  model = "gemini-2.0-flash"
  project = "your-project-id"
  location = "us-central1"
  apiVersion = "v1beta1"
  seed = 42
  maxTokens = 100
  topK = 40
  topP = 0.95
  maxOutputTokens(200)
  systemMessageContains("helpful pirate")
  userMessageContains("say 'Hello!'")
} respondsStream {
  responseFlow = flow {
    emit("Ahoy")
    emit(" there,")
    delay(100.milliseconds)
    emit(" matey!")
    emit(" Hello!")
  }
  // Alternatively, you can use responseChunks = listOf("Ahoy", " there,", " matey!", " Hello!")
  // Or chunks("Ahoy", " there,", " matey!", " Hello!")
  finishReason = "stop"
  delay = 60.milliseconds // delay before first chunk
  delayBetweenChunks = 15.milliseconds // delay between chunks
}
```
#### Streaming Response Configuration Options

| Option               | Description                                                    | Default Value   |
|----------------------|----------------------------------------------------------------|-----------------|
| `responseFlow`       | A flow of content chunks to include in the streaming response. | `null`          |
| `responseChunks`     | A list of content chunks to include in the streaming response. | `null`          |
| `chunks`             | Sets the chunks of content for the streaming response.         | N/A             |
| `delayBetweenChunks` | The delay between sending chunks.                              | `Duration.ZERO` |
| `finishReason`       | The reason why the model stopped generating tokens.            | `"STOP"`        |
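
When choosing a client timeout for a streaming test, it can help to estimate how long a stub takes to finish. The sketch below assumes, without confirmation from the AI-Mocks sources, that `delay` is applied once before the first chunk, that `delayBetweenChunks` is applied once between each pair of consecutive chunks, and that any `delay(...)` calls inside `responseFlow` add on top. The function name is hypothetical.

```kotlin
import kotlin.time.Duration
import kotlin.time.Duration.Companion.milliseconds

// Rough lower-bound estimate under the assumptions stated above;
// network and processing overhead are not included.
fun estimatedStreamDuration(
    chunkCount: Int,
    initialDelay: Duration,
    delayBetweenChunks: Duration,
    delaysInsideFlow: Duration = Duration.ZERO,
): Duration {
    require(chunkCount >= 1) { "a stream needs at least one chunk" }
    return initialDelay + delayBetweenChunks * (chunkCount - 1) + delaysInsideFlow
}

fun main() {
    // Values from the streaming example: 4 chunks, 60 ms before the first chunk,
    // 15 ms between chunks, plus one delay(100.milliseconds) inside the flow.
    val estimate =
        estimatedStreamDuration(
            chunkCount = 4,
            initialDelay = 60.milliseconds,
            delayBetweenChunks = 15.milliseconds,
            delaysInsideFlow = 100.milliseconds,
        )
    println(estimate.inWholeMilliseconds) // 205
}
```

A client timeout comfortably above this estimate, such as the `5.seconds` used elsewhere in this guide, leaves room for scheduling jitter.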

## Integration with Spring-AI

First, we need a function that creates a `VertexAI` client configured with an arbitrary server endpoint and credentials.

```kotlin
internal fun createTestVertexAI(
    endpoint: String,
    projectId: String,
    location: String,
    timeout: Duration,
): VertexAI {
    try {
        val channelProvider =
            LlmUtilityServiceStubSettings
                .defaultHttpJsonTransportProviderBuilder()
                .setEndpoint(endpoint)
                .build()

        val newHttpJsonBuilder = LlmUtilityServiceStubSettings.newHttpJsonBuilder()
        newHttpJsonBuilder.unaryMethodSettingsBuilders().forEach { builder ->
            builder.setSimpleTimeoutNoRetriesDuration(timeout.toJavaDuration())
        }

        val llmUtilityServiceStubSettings =
            newHttpJsonBuilder
                .setEndpoint(endpoint)
                .setCredentialsProvider(NoCredentialsProvider.create())
                .setTransportChannelProvider(channelProvider)
                .build()

        val llmUtilityServiceClient =
            LlmUtilityServiceClient.create(
                LlmUtilityServiceSettings.create(llmUtilityServiceStubSettings),
            )

        val predictionServiceSettingsBuilder =
            PredictionServiceSettings
                .newHttpJsonBuilder()
                .setEndpoint(endpoint)
                .setCredentialsProvider(NoCredentialsProvider.create())
                .applyToAllUnaryMethods { updater ->
                    updater.setSimpleTimeoutNoRetriesDuration(timeout.toJavaDuration()) as? Void?
                }

        val predictionServiceSettings = predictionServiceSettingsBuilder.build()
        val predictionClient = PredictionServiceClient.create(predictionServiceSettings)

        return VertexAI
            .Builder()
            .setTransport(Transport.REST)
            .setProjectId(projectId)
            .setLocation(location)
            .setLlmClientSupplier { llmUtilityServiceClient }
            .setPredictionClientSupplier { predictionClient }
            .setCredentials(ApiKeyCredentials.create("dummy-key"))
            .build()
    } catch (e: IOException) {
        throw RuntimeException(e)
    }
}
```
Then create a `MockGemini` server and test the Spring-AI integration:

```kotlin
// create mock server
val gemini = MockGemini(verbose = true)

// Create a VertexAI client that connects to the mock server
val vertexAI = createTestVertexAI(
    endpoint = gemini.baseUrl(),
    projectId = "your-project-id",
    location = "us-central1",
    timeout = 5.seconds,
)

// create Spring-AI client
val chatClient =
  ChatClient
    .builder(
      VertexAiGeminiChatModel
        .builder()
        .vertexAI(vertexAI)
        .build(),
    ).build()

// Set up a mock for the LLM call
gemini.generateContent {
  temperature = 0.7
  model = "gemini-2.0-flash"
  project = "your-project-id"
  location = "us-central1"
  systemMessageContains("You are a helpful pirate")
  userMessageContains("Just say 'Hello!'")
} responds {
  content = "Ahoy there, matey! Hello!"
  finishReason = "stop"
  delay = 42.milliseconds
}

// Configure Spring-AI client call
val response =
  chatClient
    .prompt()
    .system("You are a helpful pirate")
    .user("Just say 'Hello!'")
    .options(VertexAiGeminiChatOptions.builder().temperature(0.7).build())
    // Make a call
    .call()
    .chatResponse()

// Verify the response
response shouldNotBeNull {
  result shouldNotBeNull {
    metadata.finishReason shouldBe "STOP"
    output.text shouldBe "Ahoy there, matey! Hello!"
  }
}
```
## Streaming Responses

Mock streaming responses easily with flow support:

```kotlin
// configure mock gemini
gemini.generateContentStream {
  temperature = 0.7
  model = "gemini-2.0-flash"
  project = "your-project-id"
  location = "us-central1"
  systemMessageContains("You are a helpful pirate")
  userMessageContains("Just say 'Hello!'")
}.respondsStream(sse = false) {
  responseFlow =
    flow {
      emit("Ahoy")
      emit(" there,")
      delay(100.milliseconds)
      emit(" matey!")
      emit(" Hello!")
    }
  delay = 60.milliseconds
  delayBetweenChunks = 50.milliseconds
}

// Use Spring AI's streaming API
val buffer = StringBuffer()
val chunkCount =
  chatClient
    .prompt()
    .system("You are a helpful pirate")
    .user("Just say 'Hello!'")
    .options(VertexAiGeminiChatOptions.builder().temperature(0.7).build())
    .stream()
    .chatResponse()
    .doOnNext { chunk ->
      // Process each chunk as it arrives
      chunk.result.output.text?.let(buffer::append)
    }.count()
    .block(5.seconds.toJavaDuration())

// Verify the complete response
buffer.toString() shouldBe "Ahoy there, matey! Hello!"
```
## Integration with Google Gen AI Java SDK

AI-Mocks Gemini can also be used to test applications that use
the [Google Gen AI Java SDK](https://github.com/googleapis/java-genai) directly.

### Setting up the Client

First, create a mock Gemini server:

```kotlin
val gemini = MockGemini(verbose = true)
```
Then, configure the Google Gen AI Java SDK client to use the mock server:

```kotlin
val client = Client.builder()
  .project("your-project-id")
  .location("us-central1")
  .credentials(
    GoogleCredentials.create(
      AccessToken.newBuilder().setTokenValue("dummy-token").build()
    )
  )
  .vertexAI(true)
  .httpOptions(HttpOptions.builder().baseUrl(gemini.baseUrl()).build())
  .build()
```
### Regular Content Generation

Set up a mock response for a regular content generation request:

```kotlin
gemini.generateContent {
  temperature = 0.7
  seed = 42
  model = "gemini-2.0-flash"
  project = "your-project-id"
  location = "us-central1"
  apiVersion = "v1beta1"
  systemMessageContains("You are a helpful pirate")
  userMessageContains("Just say 'Hello!'")
} responds {
  content = "Ahoy there, matey! Hello!"
  delay = 60.milliseconds
}
```
Make a request using the Google Gen AI Java SDK:

```kotlin
val config = GenerateContentConfig.builder()
  .seed(42)
  .maxOutputTokens(100)
  .temperature(0.7f)
  .systemInstruction(
    Content.builder().role("system")
      .parts(Part.fromText("You are a helpful pirate")).build()
  )
  .build()

val response = client.models.generateContent(
  "gemini-2.0-flash",
  "Just say 'Hello!'",
  config
)

// Verify the response
response.text() shouldBe "Ahoy there, matey! Hello!"
```
### Streaming Content Generation

Set up a mock response for a streaming content generation request:

```kotlin
gemini.generateContentStream {
  temperature = 0.7
  apiVersion = "v1beta1"
  location = "us-central1"
  maxOutputTokens(100)
  model = "gemini-2.0-flash"
  project = "your-project-id"
  seed = 42
  systemMessageContains("You are a helpful pirate")
  userMessageContains("Just say 'Hello!'")
} respondsStream {
  responseFlow =
    flow {
      emit("Ahoy")
      emit(" there,")
      delay(100.milliseconds)
      emit(" matey!")
      emit(" Hello!")
    }
  delay = 60.milliseconds
  delayBetweenChunks = 15.milliseconds
}
```
Make a streaming request using the Google Gen AI Java SDK:

```kotlin
val response = client.models.generateContentStream(
  "gemini-2.0-flash",
  "Just say 'Hello!'",
  config
)

// Collect and verify the streaming response
val fullResponse = response.joinToString(separator = "") {
  it.text() ?: ""
}
fullResponse shouldBe "Ahoy there, matey! Hello!"
```
See the [integration tests](https://github.com/mokksy/ai-mocks/tree/main/ai-mocks-gemini/src/jvmTest/kotlin/dev/mokksy/aimocks/gemini) for more examples.


# Ollama

[![Maven Central](https://img.shields.io/maven-central/v/dev.mokksy.aimocks/ai-mocks-ollama.svg?label=Maven%20Central)](https://central.sonatype.com/artifact/dev.mokksy.aimocks/ai-mocks-ollama)

AI-Mocks Ollama is a specialized mock server implementation for mocking
the [Ollama API](https://github.com/ollama/ollama/blob/main/docs/api.md), built using Mokksy.

`MockOllama` is tested against the [LangChain4j](https://github.com/langchain4j/langchain4j) framework with the Ollama
integration.

Currently, it supports the main endpoints of the Ollama API, including:

- Generate completions
- Chat completions
- Model management
- Embeddings

## Quick Start

Include the library in your test dependencies (Maven or Gradle).
```kotlin
testImplementation("dev.mokksy.aimocks:ai-mocks-ollama-jvm:$latestVersion")
```
```xml
<dependency>
  <groupId>dev.mokksy.aimocks</groupId>
  <artifactId>ai-mocks-ollama-jvm</artifactId>
  <version>[LATEST_VERSION]</version>
  <scope>test</scope>
</dependency>
```
## Basic Setup

Set up a mock server and define mock responses:

```kotlin
// Create a mock Ollama server
val ollama = MockOllama(verbose = true)

// Get the base URL of the mock server
val baseUrl = ollama.baseUrl()
```
## Generate Completions API

Let's simulate Ollama's Generate Completions API:

```kotlin
// Define mock response
ollama.generate {
  model = "llama3"
  userMessageContains("Tell me a joke")
} responds {
  content("Why did the chicken cross the road? To get to the other side!")
  doneReason("stop")
  delay = 42.milliseconds
}

// Create request
val request = GenerateRequest(
  model = "llama3",
  prompt = "Tell me a joke",
  stream = false,
  options = ModelOptions(temperature = 0.7, topP = 0.9)
)

// Send request to mock server
val httpRequest = HttpRequest.newBuilder()
  .uri(URI.create("${ollama.baseUrl()}/api/generate"))
  .header("Content-Type", "application/json")
  .POST(
    HttpRequest.BodyPublishers.ofString(
      json.encodeToString(GenerateRequest.serializer(), request)
    )
  )
  .build()

val response = client.send(httpRequest, HttpResponse.BodyHandlers.ofString())

// Verify response
response.statusCode() shouldBe 200
val generateResponse = json.decodeFromString<GenerateResponse>(response.body())
generateResponse.response shouldBe "Why did the chicken cross the road? To get to the other side!"
generateResponse.model shouldBe "llama3"
generateResponse.done shouldBe true
generateResponse.doneReason shouldBe "stop"
```
## Chat Completions API

Let's simulate Ollama's Chat Completions API:

```kotlin
// Define mock response
ollama.chat {
  model = "llama3"
  userMessageContains("Hello")
} responds {
  content("Hello, how can I help you today?")
  delay = 42.milliseconds
}

// Create request
val request = ChatRequest(
  model = "llama3",
  messages = listOf(
    Message(
      role = "user",
      content = "Hello"
    )
  ),
  stream = false,
  options = ModelOptions(temperature = 0.7, topP = 0.9)
)

// Send request to mock server
val httpRequest = HttpRequest.newBuilder()
  .uri(URI.create("${ollama.baseUrl()}/api/chat"))
  .header("Content-Type", "application/json")
  .POST(
    HttpRequest.BodyPublishers.ofString(
      json.encodeToString(ChatRequest.serializer(), request)
    )
  )
  .build()

val response = client.send(httpRequest, HttpResponse.BodyHandlers.ofString())

// Verify response
response.statusCode() shouldBe 200
val chatResponse = json.decodeFromString<ChatResponse>(response.body())
chatResponse.message.content shouldBe "Hello, how can I help you today?"
chatResponse.model shouldBe "llama3"
chatResponse.done shouldBe true
```
## Embeddings API

Let's simulate Ollama's Embeddings API:

```kotlin
// Define mock response for a single string input
val embeddings = listOf(listOf(0.1f, 0.2f, 0.3f, 0.4f, 0.5f))

ollama.embed {
  model = "llama3"
  stringInput = "The sky is blue"
} responds {
  embeddings(embeddings)
  delay = 42.milliseconds
}

// Create request
val request = EmbeddingsRequest(
  model = "llama3",
  input = listOf("The sky is blue"),
  options = ModelOptions(temperature = 0.7, topP = 0.9)
)

// Send request to mock server
val httpRequest = HttpRequest.newBuilder()
  .uri(URI.create("${ollama.baseUrl()}/api/embed"))
  .header("Content-Type", "application/json")
  .POST(
    HttpRequest.BodyPublishers.ofString(
      json.encodeToString(EmbeddingsRequest.serializer(), request)
    )
  )
  .build()

val response = client.send(httpRequest, HttpResponse.BodyHandlers.ofString())

// Verify response
response.statusCode() shouldBe 200
val embedResponse = json.decodeFromString<EmbeddingsResponse>(response.body())
embedResponse.embeddings shouldBe embeddings
embedResponse.model shouldBe "llama3"
```
You can also mock embeddings for a list of strings:

```kotlin
// Define mock response for multiple string inputs
val embeddings = listOf(
  listOf(0.1f, 0.2f, 0.3f, 0.4f, 0.5f),
  listOf(0.6f, 0.7f, 0.8f, 0.9f, 1.0f)
)

ollama.embed {
  model = "llama3"
  stringListInput = listOf("The sky is blue", "The grass is green")
} responds {
  embeddings(embeddings)
  delay = 42.milliseconds
}

// Create request
val request = EmbeddingsRequest(
  model = "llama3",
  input = listOf("The sky is blue", "The grass is green"),
  options = ModelOptions(temperature = 0.7, topP = 0.9)
)

// Send request to mock server
val httpRequest = HttpRequest.newBuilder()
  .uri(URI.create("${ollama.baseUrl()}/api/embed"))
  .header("Content-Type", "application/json")
  .POST(
    HttpRequest.BodyPublishers.ofString(
      json.encodeToString(EmbeddingsRequest.serializer(), request)
    )
  )
  .build()

val response = client.send(httpRequest, HttpResponse.BodyHandlers.ofString())

// Verify response
response.statusCode() shouldBe 200
val embedResponse = json.decodeFromString<EmbeddingsResponse>(response.body())
embedResponse.embeddings shouldBe embeddings
embedResponse.model shouldBe "llama3"
```
## Streaming Responses

AI-Mocks-Ollama supports streaming responses for both the generate and chat endpoints:

```kotlin
// Define streaming mock response for generate endpoint
ollama.generate {
  model = "llama3"
  stream = true
  userMessageContains("Tell me a story")
} respondsStream {
  responseChunks = listOf(
    "Once upon a time",
    " in a land far, far away",
    " there lived a programmer",
    " who never had to debug in production."
  )
  delayBetweenChunks = 100.milliseconds
}

// Define streaming mock response for chat endpoint
ollama.chat {
  model = "llama3"
  stream = true
} respondsStream {
  responseChunks = listOf(
    "Hello",
    ", how can I",
    " help you today?"
  )
  delayBetweenChunks = 100.milliseconds
}
```
## Request Configuration Options

The following tables list the available configuration options for mocking Ollama API calls.

### Generate Request Configuration Options

| Option              | Description                                |
|---------------------|--------------------------------------------|
| `model`             | The model to match in the request          |
| `prompt`            | The prompt to match in the request         |
| `system`            | The system message to match in the request |
| `template`          | The template to match in the request       |
| `stream`            | Whether to match streaming requests        |
| `requestBodyString` | Adds a string matcher for the request body |

### Chat Request Configuration Options

| Option              | Description                                   |
|---------------------|-----------------------------------------------|
| `model`             | The model to match in the request             |
| `messages`          | The messages to match in the request          |
| `stream`            | Whether to match streaming requests           |
| `requestBodyString` | Adds a string matcher for the request body    |
| `userMessage`       | Adds a user message to match in the request   |
| `systemMessage`     | Adds a system message to match in the request |

### Embed Request Configuration Options

| Option              | Description                                                |
|---------------------|------------------------------------------------------------|
| `model`             | The model to match in the request                          |
| `stringInput`       | The string input to match in the request                   |
| `stringListInput`   | The list of string inputs to match in the request          |
| `truncate`          | Whether to truncate the input to fit within context length |
| `options`           | Additional model parameters to match in the request        |
| `keepAlive`         | Controls how long the model will stay loaded into memory   |
| `requestBodyString` | Adds a string matcher for the request body                 |

## Response Configuration Options

### Generate Response Configuration Options

| Option       | Description                                                  | Default Value                            |
|--------------|--------------------------------------------------------------|------------------------------------------|
| `content`    | The content to include in the response                       | `"This is a mock response from Ollama."` |
| `doneReason` | The reason why generation completed (e.g., "stop", "length") | `"stop"`                                 |
| `delay`      | The delay before sending the response                        | `Duration.ZERO`                          |

### Chat Response Configuration Options

| Option      | Description                               | Default Value                            |
|-------------|-------------------------------------------|------------------------------------------|
| `content`   | The content to include in the response    | `"This is a mock response from Ollama."` |
| `thinking`  | The thinking process of the model         | `null`                                   |
| `toolCalls` | The tool calls to include in the response | `null`                                   |
| `delay`     | The delay before sending the response     | `Duration.ZERO`                          |

### Embed Response Configuration Options

| Option       | Description                                   | Default Value                                  |
|--------------|-----------------------------------------------|------------------------------------------------|
| `embeddings` | The embeddings to include in the response     | `listOf(listOf(0.1f, 0.2f, 0.3f, 0.4f, 0.5f))` |
| `embedding`  | A single embedding to include in the response | N/A                                            |
| `model`      | The model name to include in the response     | `null`                                         |
| `delay`      | The delay before sending the response         | `Duration.ZERO`                                |

### Streaming Response Configuration Options

| Option               | Description                                         | Default Value   | Availability    |
|----------------------|-----------------------------------------------------|-----------------|-----------------|
| `responseFlow`       | A flow of content chunks for the streaming response | `null`          | Generate & Chat |
| `responseChunks`     | A list of content chunks for the streaming response | `null`          | Generate & Chat |
| `delayBetweenChunks` | The delay between sending chunks                    | `Duration.ZERO` | Generate & Chat |
| `doneReason`         | The reason why generation completed                 | `"stop"`        | Generate only   |

## Integration Testing

Create a test class with a `MockOllama` instance to test your Ollama client integration:

```kotlin
class MyOllamaTest {
  private val ollama = MockOllama()

  @Test
  fun `Should respond to Chat Completion`() = runTest {
    // Configure mock response
    ollama.chat {
      model = "llama3"
    } responds {
      content("Hello, how can I help you today?")
    }

    // Use your Ollama client to make a request and verify the response
  }
}
```
## Integration with LangChain4j

AI-Mocks-Ollama can be used with LangChain4j's Ollama integration:

```kotlin
// Create a mock Ollama server
val ollama = MockOllama(verbose = true)

// Configure mock response
ollama.chat {
  model = "llama3"
} responds {
  content("Hello, how can I help you today?")
  delay = 42.milliseconds
}

// Create LangChain4j Ollama client
val model = OllamaChatModel.builder()
  .baseUrl(ollama.baseUrl())
  .modelName("llama3")
  .temperature(0.7)
  .topP(0.9)
  .build()

// Use LangChain4j Kotlin DSL to send a request
val result = model.chat {
  messages += userMessage("Hello")
}

// Verify response
result.apply {
  aiMessage().text() shouldBe "Hello, how can I help you today?"
}
```
For more examples, see
the [integration tests](https://github.com/mokksy/ai-mocks/tree/main/ai-mocks-ollama/src/jvmTest/kotlin/dev/mokksy/aimocks/ollama).


# A2A Protocol

[![Maven Central](https://img.shields.io/maven-central/v/dev.mokksy.aimocks/ai-mocks-a2a.svg?label=Maven%20Central)](https://central.sonatype.com/artifact/dev.mokksy.aimocks/ai-mocks-a2a)

[MockAgentServer](https://github.com/mokksy/ai-mocks/blob/main/ai-mocks-a2a/src/commonMain/kotlin/dev/mokksy/aimocks/a2a/MockAgentServer.kt) provides a local mock server for simulating [Agent2Agent (A2A) Protocol](https://a2a-protocol.org/latest/specification/) endpoints.
It simplifies testing by allowing you to define request expectations and responses without making real network calls.

**NB!** The server only supports [JSON-RPC 2.0 transport](https://a2a-protocol.org/latest/specification/#321-json-rpc-20-transport).
The supported A2A protocol version is **0.3.0**.

## Quick Start

### Add Dependency

Include the library in your test dependencies (Maven or Gradle).

```xml
<dependency>
    <groupId>dev.mokksy.aimocks</groupId>
    <artifactId>ai-mocks-a2a-jvm</artifactId>
    <version>[LATEST_VERSION]</version>
    <scope>test</scope>
</dependency>
```
```kotlin
dependencies {
    testImplementation("dev.mokksy.aimocks:ai-mocks-a2a:0.x.x")
    // Optional: typed model classes
    testImplementation("dev.mokksy.aimocks:ai-mocks-a2a-models:0.x.x")
}
```
```groovy
dependencies {
    testImplementation 'dev.mokksy.aimocks:ai-mocks-a2a:0.x.x'
    testImplementation 'dev.mokksy.aimocks:ai-mocks-a2a-models:0.x.x'
}
```
### Initialize the Server

```kotlin
val a2aServer = MockAgentServer(verbose = true)
```
- The server will start on a random free port by default.
- You can retrieve the server's base URL via `a2aServer.baseUrl()`.

## HTTP Client Setup

You may use any HTTP client that supports Server-Sent Events (SSE) to make requests to the mock server. The AI-Mocks A2A
library provides a convenient function to create a Ktor client configured for A2A:

```kotlin
// Create a Ktor client configured for A2A
val a2aClient = A2AClientFactory.create(baseUrl = a2aServer.baseUrl())
```
Alternatively, you can create the client manually:

```kotlin
// Create a Ktor client configured for A2A
val a2aClient = HttpClient(Java) {
    val json = Json {
        prettyPrint = true
        isLenient = true
    }
    install(ContentNegotiation) {
        json(json)
    }
    install(SSE) {
        showRetryEvents()
        showCommentEvents()
    }
    install(DefaultRequest) {
        url(a2aServer.baseUrl()) // Set the base URL
    }
}
```
## Agent Card Endpoint

The [Agent Card endpoint](https://a2a-protocol.org/latest/specification/#55-agentcard-object-structure) describes the agent's capabilities, skills, and authentication mechanisms. Remote agents that support A2A must publish an **Agent Card** in JSON format. Clients use the Agent Card to identify the agent best suited to a task and then communicate with that agent over A2A.

Mock Server configuration:

```kotlin
// Create an AgentCard object
val agentCard = AgentCard.create {
    name = "test-agent"
    description = "test-agent-description"
    url = a2aServer.baseUrl()
    documentationUrl = "https://example.com/documentation"
    version = "0.0.1"
    provider {
        organization = "Acme, Inc."
        url = "https://example.com/organization"
    }
    capabilities {
        streaming = true
        pushNotifications = true
        stateTransitionHistory = true
    }
    skills += skill {
        id = "walk"
        name = "Walk the walk"
        description = "I can walk"
        tags = listOf("move")
    }
    skills += skill {
        id = "talk"
        name = "Talk the talk"
        description = "I can talk"
        tags = listOf("communicate")
    }
}

// Configure the mock server to respond with the AgentCard
a2aServer.agentCard() responds {
    delay = 1.milliseconds
    card = agentCard
}
```
Client call example:

```kotlin
// Make a GET request to the Agent Card endpoint
val response = a2aClient
    .get("/.well-known/agent-card.json")
    .body<String>()

// Parse the response into an AgentCard object
val receivedCard = Json.decodeFromString<AgentCard>(response)
```
## Get Task Endpoint

The [Get Task endpoint](https://a2a-protocol.org/latest/specification/#73-tasksget) lets clients retrieve information about a specific task, including the Artifacts generated for it. The agent determines how long it retains previously submitted Tasks. The client may also request the last N items of the Task's history, which contains the Messages sent by client and server, in order.

Mock Server configuration:

```kotlin
// Configure the mock server to respond with a task
a2aServer.getTask() responds {
    id = 1
    result {
        id = "tid_12345"
        contextId = "ctx_12345"
        status {
            state = "completed"
        }
        artifacts += artifact {
            name = "joke"
            parts += textPart {
                text = "This is a joke"
            }
        }
    }
}
```
You can also configure the mock server to respond with an error:

```kotlin
// Configure the mock server to respond with a task not found error
a2aServer.getTask() responds {
    id = 1
    error = taskNotFoundError()
}
```
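On the wire, a JSON-RPC 2.0 error response carries an `error` object with a numeric `code` in place of `result`. The sketch below is a minimal, dependency-free lookup of the codes reserved by JSON-RPC 2.0 and by the A2A specification. `A2aErrorCode` and `describeError` are hypothetical names used for illustration only and are not part of AI-Mocks; it also assumes that `taskNotFoundError()` produces the specification's `TaskNotFoundError` code.

```kotlin
// Hypothetical helper, not part of AI-Mocks: names for well-known error codes.
enum class A2aErrorCode(val code: Int) {
    // Standard JSON-RPC 2.0 errors
    PARSE_ERROR(-32700),
    INVALID_REQUEST(-32600),
    METHOD_NOT_FOUND(-32601),
    INVALID_PARAMS(-32602),
    INTERNAL_ERROR(-32603),

    // A2A-specific errors
    TASK_NOT_FOUND(-32001),
    TASK_NOT_CANCELABLE(-32002),
    PUSH_NOTIFICATION_NOT_SUPPORTED(-32003),
    UNSUPPORTED_OPERATION(-32004),
    CONTENT_TYPE_NOT_SUPPORTED(-32005);

    companion object {
        // Returns null for codes this sketch does not know about.
        fun fromCode(code: Int): A2aErrorCode? = values().firstOrNull { it.code == code }
    }
}

// Turns a numeric error code into a readable label for test failure messages.
fun describeError(code: Int): String =
    A2aErrorCode.fromCode(code)?.name ?: "UNKNOWN($code)"

fun main() {
    println(describeError(-32001)) // TASK_NOT_FOUND
    println(describeError(-1))     // UNKNOWN(-1)
}
```

A lookup like this can make assertions on error responses easier to read than bare numbers.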
Client call example:

```kotlin
// Create a GetTaskRequest object
val jsonRpcRequest = GetTaskRequest(
    id = "1",
    params = TaskQueryParams(
        id = UUID.randomUUID().toString(),
        historyLength = 2,
    ),
)

// Make a POST request to the Get Task endpoint
val response = a2aClient
    .post("/") {
        contentType(ContentType.Application.Json)
        setBody(Json.encodeToString(jsonRpcRequest))
    }.call
    .response

// Parse the response into a GetTaskResponse object
val body = response.body<String>()
val payload = Json.decodeFromString<GetTaskResponse>(body)
```
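Because the server speaks JSON-RPC 2.0 only, every call in this guide is a POST to `/` whose body is an envelope with `jsonrpc`, `id`, `method`, and `params`. The following dependency-free sketch builds such an envelope by hand for `tasks/get`, which can help when debugging a raw request. `A2aMethod`, `jsonString`, and `jsonRpcRequest` are hypothetical helpers, not library API, and the typed `GetTaskRequest` is only assumed to serialize to an equivalent shape.

```kotlin
// JSON-RPC method names as defined by the A2A specification (hypothetical enum).
enum class A2aMethod(val wireName: String) {
    SEND_MESSAGE("message/send"),
    SEND_MESSAGE_STREAMING("message/stream"),
    GET_TASK("tasks/get"),
    CANCEL_TASK("tasks/cancel"),
    RESUBSCRIBE("tasks/resubscribe"),
}

// Quotes a value as a JSON string, escaping characters that must not appear raw.
fun jsonString(value: String): String = buildString {
    append('"')
    for (ch in value) {
        when (ch) {
            '"' -> append("\\\"")
            '\\' -> append("\\\\")
            '\n' -> append("\\n")
            '\r' -> append("\\r")
            '\t' -> append("\\t")
            else -> if (ch < ' ') append("\\u%04x".format(ch.code)) else append(ch)
        }
    }
    append('"')
}

// Builds a JSON-RPC 2.0 request envelope; `paramsJson` must already be valid JSON.
fun jsonRpcRequest(id: String, method: A2aMethod, paramsJson: String): String =
    """{"jsonrpc":"2.0","id":${jsonString(id)},"method":${jsonString(method.wireName)},"params":$paramsJson}"""

fun main() {
    val params = """{"id":${jsonString("tid_12345")},"historyLength":2}"""
    println(jsonRpcRequest(id = "1", method = A2aMethod.GET_TASK, paramsJson = params))
}
```

In real tests, prefer the typed request classes and a JSON library; the hand-rolled escaping here exists only to keep the sketch self-contained.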
## Send Message Endpoint

The [Send Message endpoint](https://a2a-protocol.org/latest/specification/#71-messagesend) lets clients send a message to the agent for processing: to start a new Task, resume an interrupted Task, or reopen a completed Task. A Task may be interrupted because the agent requires additional user input or because of a runtime error.

Mock Server configuration:

```kotlin
// Create a Task object
val task = Task.create {
  id = "tid_12345"
  contextId = "ctx_12345"
  status {
    state = "completed"
  }
  artifact {
    name = "joke"
    parts += text { "This is a joke" }
    parts += file { uri = "https://example.com/readme.md" }
    parts += file { bytes = "1234".toByteArray() }
    parts += data { mapOf("foo" to "bar") }
  }
}

// Configure the mock server to respond with the task
a2aServer.sendMessage() responds {
  id = 1
  result = task
}
```
Client call example:

```kotlin
// Create a SendMessageRequest object using the builder function
val jsonRpcRequest = sendMessageRequest {
    id = "1"
    params {
        message {
            role = Message.Role.user
            parts += text { "Tell me a joke" }
            parts += file { uri = "https://example.com/readme.md" }
            parts += file { bytes = "1234".toByteArray() }
            parts += data { mapOf("foo" to "bar") }
        }
    }
}

// Make a POST request to the Send Message endpoint
val response = a2aClient
    .post("/") {
        contentType(ContentType.Application.Json)
        setBody(Json.encodeToString(jsonRpcRequest))
    }.call
    .response

// Parse the response into a SendMessageResponse object
val body = response.body<String>()
val payload = Json.decodeFromString<SendMessageResponse>(body)
```
## Send Message Streaming Endpoint

The [Send Message Streaming endpoint](https://a2a-protocol.org/latest/specification/#72-messagestream) allows clients to send a message to the agent for processing and receive streaming updates. For clients and remote agents capable of communicating over HTTP with Server-Sent Events (SSE), clients can send the RPC request with method `message/stream` when creating a new Task. The remote agent can respond with a stream of TaskStatusUpdateEvents (to communicate status changes or instructions/requests) and TaskArtifactUpdateEvents (to stream generated results).

Mock Server configuration:

```kotlin
// Configure the mock server to respond with streaming updates
val taskId = "task_12345"

a2aServer.sendMessageStreaming() responds {
    delayBetweenChunks = 1.seconds
    responseFlow = flow {
      emit(
        taskStatusUpdateEvent {
          id = taskId
          status {
            state = "working"
            timestamp = Clock.System.now()
          }
        }
      )
      emit(
        taskArtifactUpdateEvent {
          id = taskId
          artifact {
            name = "joke"
            parts += textPart {
              text = "This"
            }
          }
        }
      )
      emit(
        taskArtifactUpdateEvent {
          id = taskId
          artifact {
            name = "joke"
            parts += textPart {
              text = "is"
            }
            append = true
          }
        }
      )
      emit(
        taskArtifactUpdateEvent {
          id = taskId
          artifact {
            name = "joke"
            parts += textPart {
              text = "a"
            }
            append = true
          }
        }
      )
      emit(
        taskArtifactUpdateEvent {
          id = taskId
          artifact {
            name = "joke"
            parts += textPart {
              text = "joke!"
            }
            append = true
            lastChunk = true
          }
        }
      )
      emit(
        taskStatusUpdateEvent {
          id = taskId
          status {
            state = "completed"
            timestamp = Clock.System.now()
          }
          final = true
        }
      )
    }
}
```
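The `append` and `lastChunk` flags in the configuration above tell the client how to rebuild an artifact from its chunks: a chunk without `append` starts the artifact, later chunks with `append = true` add parts to it, and `lastChunk = true` marks it complete. Here is a minimal sketch of that reassembly logic. `ArtifactChunk` and `ArtifactAssembler` are hypothetical, simplified types (not the library's model classes), and artifacts are keyed by name as in the example, whereas a real client would likely key them by artifact ID.

```kotlin
// Hypothetical, simplified stand-in for an artifact update event.
data class ArtifactChunk(
    val artifactName: String,
    val text: String,
    val append: Boolean = false,
    val lastChunk: Boolean = false,
)

// Collects text parts per artifact and tracks which artifacts are complete.
class ArtifactAssembler {
    private val parts = LinkedHashMap<String, MutableList<String>>()
    private val completed = LinkedHashSet<String>()

    fun accept(chunk: ArtifactChunk) {
        val existing = parts[chunk.artifactName]
        if (chunk.append && existing != null) {
            // Append to the artifact started by an earlier chunk.
            existing.add(chunk.text)
        } else {
            // No append flag (or nothing to append to): start the artifact over.
            parts[chunk.artifactName] = mutableListOf(chunk.text)
        }
        if (chunk.lastChunk) completed.add(chunk.artifactName)
    }

    fun partsOf(artifactName: String): List<String> = parts[artifactName]?.toList().orEmpty()

    fun isComplete(artifactName: String): Boolean = artifactName in completed
}

fun main() {
    val assembler = ArtifactAssembler()
    listOf(
        ArtifactChunk("joke", "This"),
        ArtifactChunk("joke", "is", append = true),
        ArtifactChunk("joke", "a", append = true),
        ArtifactChunk("joke", "joke!", append = true, lastChunk = true),
    ).forEach(assembler::accept)

    println(assembler.partsOf("joke").joinToString(" ")) // This is a joke!
    println(assembler.isComplete("joke"))                // true
}
```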
Client call example:

```kotlin
// Create a collection to store the events
val collectedEvents = ConcurrentLinkedQueue<TaskUpdateEvent>()

// Helper function to handle events
fun handleEvent(event: TaskUpdateEvent): Boolean {
    when (event) {
        is TaskStatusUpdateEvent -> {
            println("Task status: $event")
            if (event.final) {
                return false
            }
        }
        is TaskArtifactUpdateEvent -> {
            println("Task artifact: $event")
        }
    }
    return true
}

// Make a POST request to the Send Message Streaming endpoint with SSE
a2aClient.sse(
    request = {
        url(a2aServer.baseUrl())
        method = HttpMethod.Post
        val payload = SendStreamingMessageRequest(
            id = "1",
            params = MessageSendParams.create {
                message {
                    role = Message.Role.user
                    parts += textPart {
                        text = "Tell me a joke"
                    }
                }
            },
        )
        body = TextContent(
            text = Json.encodeToString(payload),
            contentType = ContentType.Application.Json,
        )
    },
) {
    var reading = true
    while (reading) {
        incoming.collect {
            println("Event from server:\n$it")
            it.data?.let {
                val event = Json.decodeFromString<TaskUpdateEvent>(it)
                collectedEvents.add(event)
                if (!handleEvent(event)) {
                    reading = false
                    cancel("Finished")
                }
            }
        }
    }
}
```
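If you use an HTTP client without built-in SSE support, you need to split the response stream into events yourself. The sketch below shows the core of the Server-Sent Events framing rules: `data:` lines are buffered, a blank line dispatches the event, and lines starting with `:` are comments. `parseSseData` is a hypothetical helper, not part of AI-Mocks or Ktor, and it ignores the `event`, `id`, and `retry` fields for brevity.

```kotlin
// Extracts the data payload of each event from a raw SSE stream.
fun parseSseData(raw: String): List<String> {
    val events = mutableListOf<String>()
    val dataLines = mutableListOf<String>()
    for (line in raw.lineSequence()) {
        when {
            // A blank line dispatches the buffered event, unless its data is empty.
            line.isEmpty() -> {
                val data = dataLines.joinToString("\n")
                if (data.isNotEmpty()) events.add(data)
                dataLines.clear()
            }
            // Lines starting with ':' are comments, often used as keep-alives.
            line.startsWith(":") -> Unit
            // Strip the field name and at most one leading space from the value.
            line.startsWith("data:") ->
                dataLines.add(line.removePrefix("data:").removePrefix(" "))
            // Other fields (event, id, retry) are ignored in this sketch.
            else -> Unit
        }
    }
    return events
}

fun main() {
    val raw = listOf(
        ": keep-alive",
        "event: message",
        "data: {\"n\":1}",
        "",
        "data: first",
        "data: second",
        "",
    ).joinToString("\n")

    // Two events: the JSON payload, then a two-line payload joined by a newline.
    parseSseData(raw).forEach { println("[$it]") }
}
```

Each returned string is what the Ktor example above receives as `it.data`, ready to be decoded as JSON.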
## Cancel Task Endpoint

The [Cancel Task endpoint](https://a2a-protocol.org/latest/specification/#74-taskscancel) allows clients to cancel a task that is in progress. A client may choose to cancel previously submitted Tasks, for example when the user no longer needs the result or wants to stop a long-running task.

Mock Server configuration:

```kotlin
// Configure the mock server to respond with a canceled task
a2aServer.cancelTask() responds {
    id = 1
    result {
        id = "tid_12345"
        contextId = UUID.randomUUID().toString()
        status = TaskStatus(state = "canceled")
    }
}
```
Client call example:

```kotlin
// Create a CancelTaskRequest object
val jsonRpcRequest = cancelTaskRequest {
    id = "1"
    params {
        id = UUID.randomUUID().toString()
    }
}

// Make a POST request to the Cancel Task endpoint
val response = a2aClient
    .post("/") {
        contentType(ContentType.Application.Json)
        setBody(Json.encodeToString(jsonRpcRequest))
    }.call
    .response

// Parse the response into a CancelTaskResponse object
val body = response.body<String>()
val payload = Json.decodeFromString<CancelTaskResponse>(body)
```
## Set Task Push Notification Config Endpoint

The [Set Task Push Notification endpoint](https://a2a-protocol.org/latest/specification/#75-taskspushnotificationconfigset) allows clients to configure push notifications for a task. Clients may configure a push notification URL for receiving updates on Task status changes. This is particularly useful for long-running tasks where the client may not want to maintain an open connection.

Mock Server configuration:

```kotlin
// Create a TaskPushNotificationConfig object
val taskId: TaskId = "task_12345"
val config = TaskPushNotificationConfig.create {
    id = taskId
    pushNotificationConfig {
        url = "https://example.com/callback"
        token = "abc.def.jk"
        authentication {
            credentials = "secret"
            schemes += "Bearer"
        }
    }
}

// Configure the mock server to respond with the config
a2aServer.setTaskPushNotification() responds {
    id = 1
    result {
        id = taskId
        pushNotificationConfig {
            url = "https://example.com/callback"
            token = "abc.def.jk"
            authentication {
                credentials = "secret"
                schemes += "Bearer"
            }
        }
    }
}
```
Client call example:

```kotlin
// Create a TaskPushNotificationConfig object
val config = TaskPushNotificationConfig.create {
    id = "task_12345"
    pushNotificationConfig {
        url = "https://example.com/callback"
        token = "abc.def.jk"
        authentication {
            credentials = "secret"
            schemes += "Bearer"
        }
    }
}

// Create a SetTaskPushNotificationRequest object
val jsonRpcRequest = SetTaskPushNotificationRequest(
    id = "1",
    params = config,
)

// Make a POST request to the Set Task Push Notification endpoint
val response = a2aClient
    .post("/") {
        contentType(ContentType.Application.Json)
        setBody(Json.encodeToString(jsonRpcRequest))
    }.call
    .response

// Parse the response into a SetTaskPushNotificationResponse object
val body = response.body<String>()
val payload = Json.decodeFromString<SetTaskPushNotificationResponse>(body)
```
## Get Task Push Notification Config Endpoint

The [Get Task Push Notification endpoint](https://a2a-protocol.org/latest/specification/#76-taskspushnotificationconfigget) returns the push notification configuration currently set for a specific task, which is useful for verifying or displaying the current notification settings.

Mock Server configuration:

```kotlin
// Create a TaskPushNotificationConfig object
val taskId: TaskId = "task_12345"
val config = TaskPushNotificationConfig(
    id = taskId,
    pushNotificationConfig = PushNotificationConfig(
        url = "https://example.com/callback",
        token = "abc.def.jk",
        authentication = AuthenticationInfo(
            schemes = listOf("Bearer"),
        ),
    ),
)

// Configure the mock server to respond with the config
a2aServer.getTaskPushNotification() responds {
    id = 1
    result = config
}
```
Client call example:

```kotlin
// Create a GetTaskPushNotificationRequest object
val jsonRpcRequest = GetTaskPushNotificationRequest(
    id = "1",
    params = TaskIdParams(
        id = taskId,
    ),
)

// Make a POST request to the Get Task Push Notification endpoint
val response = a2aClient
    .post("/") {
        contentType(ContentType.Application.Json)
        setBody(Json.encodeToString(jsonRpcRequest))
    }.call
    .response

// Parse the response into a GetTaskPushNotificationResponse object
val body = response.body<String>()
val payload = Json.decodeFromString<GetTaskPushNotificationResponse>(body)
```
## List Task Push Notification Config Endpoint

The [List Task Push Notification Config endpoint](https://a2a-protocol.org/latest/specification/#77-taskspushnotificationconfiglist)
allows clients to list configured push notification destinations. This can be useful to inspect or manage existing
configurations.

Mock Server configuration:

```kotlin
// Configure the mock server to respond with a list of push notification configs
val taskId: TaskId = "task_12345"

a2aServer.listTaskPushNotificationConfig() responds {
  id = 1
  result = listOf(
    TaskPushNotificationConfig.create {
      id = taskId
      pushNotificationConfig {
        url = "https://example.com/callback"
        token = "abc.def.jk"
        authentication {
          schemes += "Bearer"
        }
      }
    }
  )
}
```
Client call example:

```kotlin
// Build a ListTaskPushNotificationConfigRequest
val jsonRpcRequest = ListTaskPushNotificationConfigRequest(
  id = "1",
  params = ListTaskPushNotificationConfigParams.create {
    limit(10)
    offset(0)
  },
)

// Make a POST request to the List Task Push Notification Config endpoint
val response = a2aClient
  .post("/") {
    contentType(ContentType.Application.Json)
    setBody(Json.encodeToString(jsonRpcRequest))
  }.call
  .response

// Parse the response
val body = response.body<String>()
val payload = Json.decodeFromString<ListTaskPushNotificationConfigResponse>(body)
```
## Delete Task Push Notification Config Endpoint

The [Delete Task Push Notification Config endpoint](https://a2a-protocol.org/latest/specification/#78-taskspushnotificationconfigdelete)
allows clients to delete the configured push notification destination for a task.

Mock Server configuration:

```kotlin
// Configure the mock server to respond to delete push notification config
val taskId: TaskId = "task_12345"

a2aServer.deleteTaskPushNotificationConfig() responds {
  id = 1
  // success without error
}
```
Client call example:

```kotlin
// Build a DeleteTaskPushNotificationConfigRequest
val jsonRpcRequest = DeleteTaskPushNotificationConfigRequest(
  id = "1",
  params = deleteTaskPushNotificationConfigParams {
    id(taskId)
  },
)

// Make a POST request to the Delete Task Push Notification Config endpoint
val response = a2aClient
  .post("/") {
    contentType(ContentType.Application.Json)
    setBody(Json.encodeToString(jsonRpcRequest))
  }.call
  .response

// Parse the response
val body = response.body<String>()
val payload = Json.decodeFromString<DeleteTaskPushNotificationConfigResponse>(body)
```
## Task Resubscription Endpoint

The [Task Resubscription endpoint](https://a2a-protocol.org/latest/specification/#79-tasksresubscribe) allows clients to resubscribe to streaming updates for a task that was previously created. This is useful when a client loses connection and needs to resume receiving updates for an ongoing task. A disconnected client may resubscribe to a remote agent that supports streaming to receive Task updates via Server-Sent Events (SSE).

Mock Server configuration:

```kotlin
// Configure the mock server to respond with streaming updates
val taskId: TaskId = "task_12345"

a2aServer.taskResubscription() responds {
    delayBetweenChunks = 1.seconds
    responseFlow = flow {
        emit(
            taskStatusUpdateEvent {
                id = taskId
                status {
                    state = "working"
                    timestamp = Clock.System.now()
                }
            }
        )
        emit(
            taskArtifactUpdateEvent {
                id = taskId
                artifact {
                    name = "joke"
                    parts += textPart {
                        text = "This is a resubscribed joke!"
                    }
                    lastChunk = true
                }
            }
        )
        emit(
            taskStatusUpdateEvent {
                id = taskId
                status {
                    state = "completed"
                    timestamp = Clock.System.now()
                }
                final = true
            }
        )
    }
}
```
Client call example:

```kotlin
// Create a collection to store the events
val collectedEvents = ConcurrentLinkedQueue<TaskUpdateEvent>()

// Helper function to handle events
fun handleEvent(event: TaskUpdateEvent): Boolean {
    when (event) {
        is TaskStatusUpdateEvent -> {
            println("Task status: $event")
            if (event.final) {
                return false
            }
        }
        is TaskArtifactUpdateEvent -> {
            println("Task artifact: $event")
        }
    }
    return true
}

// Make a POST request to the Task Resubscription endpoint with SSE
a2aClient.sse(
    request = {
        url(a2aServer.baseUrl())
        method = HttpMethod.Post
        contentType(ContentType.Application.Json)
        val payload = TaskResubscriptionRequest(
            id = "1",
            params = TaskQueryParams(
                id = taskId,
            ),
        )
        setBody(payload)
    },
) {
    var reading = true
    while (reading) {
        incoming.collect {
            println("Event from server:\n$it")
            it.data?.let {
                val event = Json.decodeFromString<TaskUpdateEvent>(it)
                collectedEvents.add(event)
                if (!handleEvent(event)) {
                    reading = false
                    cancel("Finished")
                }
            }
        }
    }
}
```
## Testing Push Notifications

The A2A protocol supports push notifications, which allow agents to notify clients of updates outside a connected session. This is particularly useful for long-running tasks where the client may not want to maintain an open connection.

### Accessing Task Notification History

You can access the notification history for a specific task using the `getTaskNotifications` method:

```kotlin
val taskId: TaskId = "task_12345"
val notificationHistory = a2aServer.getTaskNotifications(taskId)

// Verify that the history is initially empty
notificationHistory.events() shouldHaveSize 0
```
### Sending Push Notifications

You can send push notifications using the `sendPushNotification` method:

```kotlin
val taskUpdateEvent = taskArtifactUpdateEvent {
    id = taskId
    artifact {
        name = "joke"
        parts += textPart {
            text = "This is a notification joke!"
        }
        lastChunk = true
    }
}
a2aServer.sendPushNotification(event = taskUpdateEvent)
```
### Verifying Notifications

You can verify that notifications were received by checking the notification history:

```kotlin
// Verify that the notification history contains the event
notificationHistory.events() shouldContain taskUpdateEvent
```
## Verifying Requests

After your test completes, you can check that the mock server received no requests it did not expect:

```kotlin
a2aServer.verifyNoUnexpectedRequests()
```
As the method name suggests, this check fails if a request reached the mock server without matching a configured stub.


# Spring Boot


Use Mokksy in Spring Boot when your application calls an external HTTP dependency through
configuration, `RestClient`, `WebClient`, or another HTTP client bean.

```text
Spring Boot test -> application HTTP client -> Mokksy -> stubbed external API
```

## Typical setup

1. Start Mokksy in the test fixture.
2. Inject `mokksy.baseUrl()` into the property your application uses for the external base URL.
3. Register stubs before the application code makes the call.
4. Execute the real Spring Boot behavior and verify the request journal.

In practice, step 2 usually means overriding the Spring configuration property that holds the
outbound service URL during test startup, rather than changing production bean wiring by hand.
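
The four steps can be sketched with JDK classes only. In the sketch below, `com.sun.net.httpserver.HttpServer` is a stand-in for the mock server, so no Mokksy or Spring API is shown; the property name `payments.base-url`, the `/payments/42` path, and the class name are assumptions made for illustration. The Mokksy stub and verification calls themselves are covered in the linked Mokksy pages.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Base-URL replacement with a JDK stand-in for the mock server. */
final class BaseUrlOverrideSketch {

    /** Hypothetical property name; use whatever key your application reads. */
    static final String BASE_URL_PROPERTY = "payments.base-url";

    /** Paths the stand-in server has received, in arrival order. */
    static final List<String> journal = new CopyOnWriteArrayList<>();

    /** Steps 1 and 3: start a local server and register one canned response. */
    static HttpServer startStandIn() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/payments/42", exchange -> {
                journal.add(exchange.getRequestURI().getPath());
                byte[] body = "{\"status\":\"PAID\"}".getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().add("Content-Type", "application/json");
                exchange.sendResponseHeaders(200, body.length);
                exchange.getResponseBody().write(body);
                exchange.close();
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** "Application" code: it only knows the configured base URL, as in production. */
    static String fetchPaymentStatus() {
        String baseUrl = System.getProperty(BASE_URL_PROPERTY);
        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/payments/42")).GET().build();
        try {
            return HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()).body();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Runs the four steps end to end and returns the response body. */
    static String run() {
        journal.clear();
        HttpServer server = startStandIn();
        try {
            // Step 2: point the application at the local server.
            System.setProperty(BASE_URL_PROPERTY, "http://127.0.0.1:" + server.getAddress().getPort());
            // Step 4: exercise the real client code path; the journal can be inspected afterwards.
            return fetchPaymentStatus();
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(run() + " via " + journal);
    }
}
```

In a Spring Boot test, step 2 would typically go through a test profile or Spring's `@DynamicPropertySource` rather than `System.setProperty`, with `mokksy.baseUrl()` supplying the value.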

This works well for payment gateways, fraud or risk APIs, telecom provisioning services, document
pipelines, and internal platform dependencies.

## Where to go next

This is a routing guide for Spring Boot projects. Use the linked Mokksy pages for the tested stub and verification APIs, and override your application's external base-URL property in the test profile or test fixture.

- [First integration test](../../mokksy/first-integration-test/) for the end-to-end test shape
- [Stubbing responses](../../mokksy/stubbing/) for response DSL examples
- [Request matching](../../mokksy/request-matching/) for path, header, and body matching

If the dependency is OpenAI, Anthropic, Gemini, Ollama, or A2A, use [AI-Mocks](../../ai-mocks/)
instead of raw Mokksy.


# Quarkus


Quarkus follows the same base-URL replacement pattern as Spring Boot: start Mokksy before the
test, configure the application to call `mokksy.baseUrl()`, then exercise the real Quarkus HTTP
client code.

In practice, that usually means overriding the Quarkus config value that normally holds the
external service base URL during test startup, rather than swapping clients manually inside the
application.

```text
Quarkus test -> application HTTP client -> Mokksy -> stubbed external API
```

Use this for standard HTTP integrations as well as AI provider clients that sit behind Quarkus
services.

## Demos

- ["LangChain4j with Quarkus (KotlinConf'25)"](https://2025.kotlinconf.com/talks/795976/)
- ["Financial Assistant Chatbot with Easy RAG"](https://github.com/kpavlov/quarkus-assistant-demo)
- ["Quarkus LC4J Demo"](https://github.com/kpavlov/quarkus-ai-demo/tree/main#demo-2---service-with-mock-openai)

## Where to go next

This is a routing guide for Quarkus projects. Use the linked Mokksy and AI-Mocks pages for the tested APIs, and override the outbound base-URL configuration value in your Quarkus test setup.

- [Mokksy overview](../../mokksy/) for core HTTP and SSE mocks
- [LangChain4j](../langchain4j/) if the Quarkus service uses LangChain4j
- [OpenAI SDK](../openai-sdk/) or [Anthropic SDK](../anthropic-sdk/) for provider SDK clients


# LangChain4j


Use AI-Mocks when your LangChain4j code talks to a provider API. Mokksy supplies the underlying
HTTP and SSE behavior, and AI-Mocks adds provider-compatible request and response shapes.

## Supported provider guides

This is a routing guide for LangChain4j. Choose the provider page that matches the model client configured in your application.

- [OpenAI with LangChain4j](../../ai-mocks/openai/#integration-with-langchain4j)
- [Anthropic with LangChain4j](../../ai-mocks/anthropic/#integration-with-langchain4j)
- [Ollama with LangChain4j](../../ai-mocks/ollama/#integration-with-langchain4j)

## Demos

- ["LangChain4j with Quarkus (KotlinConf'25)"](https://2025.kotlinconf.com/talks/795976/)
- ["Financial Assistant Chatbot with Easy RAG"](https://github.com/kpavlov/quarkus-assistant-demo)

## Product choice

- Use [Mokksy](../../mokksy/) directly for generic HTTP dependencies.
- Use [AI-Mocks](../../ai-mocks/) for provider-backed LangChain4j tests.


# Spring AI


Spring AI sits on top of provider APIs, so the correct integration point is AI-Mocks rather than
plain Mokksy. Use the provider-specific guide that matches your Spring AI client.

## Supported provider guides

This is a routing guide for Spring AI. Choose the provider page that matches the Spring AI client configured in your application.

- [OpenAI with Spring AI](../../ai-mocks/openai/#integration-with-spring-ai)
- [Gemini with Spring AI](../../ai-mocks/gemini/#integration-with-spring-ai)

These guides cover provider-compatible request formats, streaming behavior, and error handling, so
tests run without live provider credentials and are not subject to rate limits or provider outages.

## Product choice

- Use [Mokksy](../../mokksy/) for general HTTP services in Spring applications.
- Use [AI-Mocks](../../ai-mocks/) for Spring AI clients.


# OpenAI Java SDK


Use [AI-Mocks OpenAI](../../ai-mocks/openai/) when production code calls the official
[`openai-java`](https://github.com/openai/openai-java) SDK. Your test runs the real SDK client
against a local OpenAI-compatible endpoint by replacing only the base URL and using a dummy
credential.

```text
Integration test -> openai-java client -> AI-Mocks OpenAI -> Mokksy HTTP/SSE server
```

The examples below follow the official SDK integration tests in the AI-Mocks repository:
[Kotlin chat and streaming tests](https://github.com/mokksy/ai-mocks/tree/main/ai-mocks-openai/src/jvmTest/kotlin/dev/mokksy/aimocks/openai/official)
and the [Java chat test](https://github.com/mokksy/ai-mocks/blob/main/ai-mocks-openai/src/jvmTest/java/dev/mokksy/aimocks/openai/MockOpenaiJavaTest.java).

## Configure the client

Point the official SDK client to `openai.baseUrl()`. The API key is required by the SDK builder,
but no live OpenAI credential is used because requests go to the local mock server.

```kotlin
import com.openai.client.OpenAIClient
import com.openai.client.okhttp.OpenAIOkHttpClient
import dev.mokksy.aimocks.openai.MockOpenai

val openai = MockOpenai(verbose = true)

val client: OpenAIClient =
  OpenAIOkHttpClient.builder()
    .apiKey("dummy-key-for-tests")
    .baseUrl(openai.baseUrl())
    .responseValidation(true)
    .build()
```

```java
import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import dev.mokksy.aimocks.openai.MockOpenai;

var openai = new MockOpenai();
OpenAIClient client = OpenAIOkHttpClient.builder()
    .apiKey("dummy-key-for-tests")
    .baseUrl(openai.baseUrl())
    .build();
```

## Test a chat completion

Register the expected provider request with AI-Mocks, then make the SDK call through the configured
client. The test fails if application code sends a request that does not match the stub.

```kotlin
import com.openai.models.chat.completions.ChatCompletionCreateParams
import com.openai.models.chat.completions.ChatCompletionMessageParam
import com.openai.models.chat.completions.ChatCompletionUserMessageParam
import io.kotest.matchers.shouldBe

openai.completion {
  model = "gpt-4o-mini"
  userMessageContains("say 'Hello!'")
} responds {
  assistantContent = "Hello"
  finishReason = "stop"
}

val params =
  ChatCompletionCreateParams.builder()
    .messages(
      listOf(
        ChatCompletionMessageParam.ofUser(
          ChatCompletionUserMessageParam.builder()
            .content("Just say 'Hello!' and nothing else")
            .build()
        )
      )
    )
    .model("gpt-4o-mini")
    .build()

val result = client.chat().completions().create(params)
result.choices().first().message().content().orElseThrow() shouldBe "Hello"
```

```java
import com.openai.core.JsonValue;
import com.openai.models.ChatModel;
import com.openai.models.chat.completions.ChatCompletionCreateParams;
import com.openai.models.chat.completions.ChatCompletionMessageParam;
import com.openai.models.chat.completions.ChatCompletionUserMessageParam;
import java.util.List;
import static org.assertj.core.api.Assertions.assertThat;

openai.completion(req -> {
    req.model("gpt-4o-mini");
    req.requestBodyContains("say 'Hey!'");
}).responds(response -> {
    response.assistantContent("Hey!");
    response.finishReason("stop");
});

var params = ChatCompletionCreateParams.builder()
    .messages(List.of(ChatCompletionMessageParam.ofUser(
        ChatCompletionUserMessageParam.builder()
            .role(JsonValue.from("user"))
            .content("Just say 'Hey!'").build())))
    .model(ChatModel.GPT_4O_MINI)
    .build();

var result = client.chat().completions().create(params);
assertThat(result.choices().get(0).message().content()).hasValue("Hey!");
```

## Test streaming behavior

The repository also tests `client.chat().completions().createStreaming(...)` against
`openai.completion { ... } respondsStream { ... }`, including delays before the first response and
between chunks. Use that path when application behavior depends on incremental delivery rather
than only the final message.

See the runnable [OpenAI streaming examples](../../ai-mocks/openai/#stream-responses) for the
complete Kotlin setup.
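
To make "incremental delivery" concrete, the following sketch uses only the JDK: a stand-in HTTP server writes SSE `data:` lines with a pause after each one, and a client reads them as they arrive. It does not use the AI-Mocks or `openai-java` APIs; the class name, the `/stream` path, and the token values are assumptions for illustration. The point it demonstrates is that the first token is observable well before the stream ends, which is what a streaming test should be able to detect.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

/** JDK-only illustration of chunk-by-chunk delivery; not the AI-Mocks API. */
final class IncrementalStreamSketch {

    static final class Result {
        final List<String> tokens = new ArrayList<>();
        long millisFromFirstTokenToEnd;
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Stand-in server: emits one SSE data line per token, pausing after each. */
    static HttpServer startStandIn(List<String> tokens, long delayMillis) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/stream", exchange -> {
            exchange.getResponseHeaders().add("Content-Type", "text/event-stream");
            exchange.sendResponseHeaders(200, 0); // length 0 selects chunked transfer encoding
            try (OutputStream out = exchange.getResponseBody()) {
                for (String token : tokens) {
                    out.write(("data: " + token + "\n\n").getBytes(StandardCharsets.UTF_8));
                    out.flush();
                    pause(delayMillis);
                }
            }
        });
        server.start();
        return server;
    }

    /** Reads the stream line by line and records how long it stayed open after the first token. */
    static Result consume(List<String> tokens, long delayMillis) {
        Result result = new Result();
        HttpServer server = null;
        try {
            server = startStandIn(tokens, delayMillis);
            URI uri = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/stream");
            HttpResponse<Stream<String>> response = HttpClient.newHttpClient()
                .send(HttpRequest.newBuilder(uri).GET().build(), HttpResponse.BodyHandlers.ofLines());
            long[] firstTokenAt = {0L};
            try (Stream<String> lines = response.body()) {
                lines.filter(line -> line.startsWith("data: ")).forEachOrdered(line -> {
                    if (result.tokens.isEmpty()) {
                        firstTokenAt[0] = System.nanoTime();
                    }
                    result.tokens.add(line.substring("data: ".length()));
                });
            }
            result.millisFromFirstTokenToEnd = (System.nanoTime() - firstTokenAt[0]) / 1_000_000;
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            if (server != null) {
                server.stop(0);
            }
        }
    }

    public static void main(String[] args) {
        Result result = consume(List.of("All", "we", "need"), 50);
        System.out.println(result.tokens + " first-to-end ms: " + result.millisFromFirstTokenToEnd);
    }
}
```

A client that buffers the whole body before handing it to the application would report a first-to-end time close to zero, which is the regression a streaming test is meant to catch.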

## Covered provider surfaces

The AI-Mocks OpenAI integration tests exercise the official SDK with:

- Chat Completions, including streaming completions
- Responses API inputs
- Embeddings
- Moderations
- HTTP error responses

## Next steps

- [OpenAI provider reference](../../ai-mocks/openai/) for the mock DSL and supported endpoint examples
- [Spring AI](../spring-ai/) if the SDK is hidden behind Spring AI
- [LangChain4j](../langchain4j/) if the SDK is hidden behind LangChain4j
- [Spring Boot](../spring-boot/) or [Quarkus](../quarkus/) for application-level base URL configuration
- [Integrations overview](../) for all client and framework guides


# Anthropic Java SDK


Use [AI-Mocks Anthropic](../../ai-mocks/anthropic/) when production code calls the official
Anthropic Java SDK. Point the real SDK client at `anthropic.baseUrl()` so tests execute
provider-shaped HTTP and streaming requests locally.

```text
Integration test -> Anthropic Java SDK -> AI-Mocks Anthropic -> Mokksy HTTP/SSE server
```

The official SDK API shapes shown below are backed by
[Anthropic SDK integration tests](https://github.com/mokksy/ai-mocks/tree/main/ai-mocks-anthropic/src/jvmTest/kotlin/dev/mokksy/aimocks/anthropic/official).
The AI-Mocks repository currently verifies the official Anthropic SDK from Kotlin and verifies
LangChain4j usage separately.

## Configure the client

The SDK requires an API key value during construction. Because `baseUrl()` routes the request to
the local mock server, use a dummy value in tests rather than a live provider credential.

```kotlin
import com.anthropic.client.AnthropicClient
import com.anthropic.client.okhttp.AnthropicOkHttpClient
import dev.mokksy.aimocks.anthropic.MockAnthropic

val anthropic = MockAnthropic(verbose = true)

val client: AnthropicClient =
  AnthropicOkHttpClient.builder()
    .apiKey("dummy-key-for-tests")
    .baseUrl(anthropic.baseUrl())
    .responseValidation(true)
    .build()
```

## Test the Messages API

Register the request criteria and deterministic reply before invoking `client.messages().create(...)`.

```kotlin
import com.anthropic.models.messages.MessageCreateParams
import io.kotest.matchers.shouldBe
import kotlin.jvm.optionals.getOrNull

anthropic.messages {
  model = "claude-3-7-sonnet-latest"
  maxTokens = 100
  systemMessageContains("helpful assistant")
  userMessageContains("say 'Hello!'")
} responds {
  messageId = "msg_test"
  assistantContent = "Hello"
}

val params =
  MessageCreateParams.builder()
    .model("claude-3-7-sonnet-latest")
    .maxTokens(100)
    .system("You are a helpful assistant.")
    .addUserMessage("Just say 'Hello!' and nothing else")
    .build()

val result = client.messages().create(params)
result.content().mapNotNull { it.text().getOrNull() }.first().text() shouldBe "Hello"
```

## Test streaming Messages

The official SDK tests also configure streamed message content and consume it through
`client.messages().createStreaming(...)`:

```kotlin
import com.anthropic.models.messages.MessageCreateParams
import io.kotest.matchers.collections.shouldContainExactly
import kotlin.time.Duration.Companion.milliseconds

val tokens = listOf("All", " we", " need", " is", " Love")

anthropic.messages {
  model = "claude-3-7-sonnet-latest"
  userMessageContains("What do we need?")
} respondsStream {
  responseChunks = tokens
  delay = 50.milliseconds
  delayBetweenChunks = 10.milliseconds
  stopReason = "end_turn"
}

val params =
  MessageCreateParams.builder()
    .model("claude-3-7-sonnet-latest")
    .maxTokens(100)
    .addUserMessage("What do we need?")
    .build()

val received = mutableListOf<String>()
client.messages().createStreaming(params).use { response ->
  response.stream()
    .filter { it.isContentBlockDelta() }
    .forEachOrdered { chunk ->
      received += chunk.asContentBlockDelta().delta().asText().text()
    }
}

received shouldContainExactly tokens
```

## Next steps

- [Anthropic provider reference](../../ai-mocks/anthropic/) for the mock DSL, streaming options, and error responses
- [LangChain4j](../langchain4j/) if your application uses Anthropic through LangChain4j
- [Spring Boot](../spring-boot/) or [Quarkus](../quarkus/) for application-level base URL configuration
- [Integrations overview](../) for all client and framework guides


# Koog


Koog is an AI framework, so the appropriate AI-Mocks module depends on the provider configured in
your application. Use the corresponding [AI-Mocks provider guide](../../ai-mocks/) when Koog is
configured for a supported provider.

The verified end-to-end example on this page is the OpenAI-backed pattern from the
[`koog-spring-boot-assistant`](https://github.com/kpavlov/koog-spring-boot-assistant/tree/main/integration-tests/src/test/kotlin/com/example/it)
integration tests. In that setup, Koog talks to an OpenAI-compatible provider, so the integration
point is [AI-Mocks OpenAI](../../ai-mocks/openai/) rather than plain Mokksy. The tests start
`MockOpenai`, point Koog at `mockOpenai.baseUrl()`, and exercise the real Spring Boot application
through HTTP and WebSocket clients.

## Workflow context from the sample repo

The sample application is not a single prompt-in, prompt-out flow. Its README describes a Koog
strategy with moderation, request mapping, streaming LLM output, and tool execution. That context
matters because the integration tests stub several provider endpoints, not just one chat response.

```mermaid
---
title: streaming-strategy
---
stateDiagram
    state "moderate-input" as moderate_input
    state "mapStringToRequests" as mapStringToRequests
    state "applyRequestToSession" as applyRequestToSession
    state "nodeStreaming" as nodeStreaming
    state "executeMultipleTools" as executeMultipleTools
    state "mapToolCallsToRequests" as mapToolCallsToRequests

    [*] --> moderate_input : transformed
    moderate_input --> mapStringToRequests : transformed
    moderate_input --> [*] : transformed
    mapStringToRequests --> applyRequestToSession
    applyRequestToSession --> nodeStreaming
    nodeStreaming --> executeMultipleTools : onCondition
    nodeStreaming --> [*] : onCondition
    executeMultipleTools --> mapToolCallsToRequests
    mapToolCallsToRequests --> applyRequestToSession
```

The same repository also exposes this graph through `/api/koog/strategy/graph`, and the
integration tests assert that the endpoint returns Mermaid output for the running strategy.

## Inject the mock server into Koog

The sample app starts `MockOpenai` once in the test environment, prepares deterministic embeddings
for RAG ingestion, and then injects the mock base URL into Koog before Spring Boot starts:

```kotlin
object TestEnvironment {
    val mockOpenai = MockOpenai(verbose = true)

    init {
        System.setProperty("OPENAI_API_KEY", "dummyOpenAIKey")
        System.setProperty("spring.profiles.active", "test")

        listOf(
            "Care for Magical Trees",
            "Valley of Light",
            "Magical Bow",
            "Morning Pine Elixir",
            "Teleportation and Portals",
        ).forEach {
            mockOpenai.embeddings {
                inputContains(it)
            } responds {
                delay = 1.milliseconds
            }
        }
    }
}

object Server {
    init {
        System.setProperty("ai.koog.openai.base-url", TestEnvironment.mockOpenai.baseUrl())

        SpringApplication.run(
            com.example.app.Application::class.java,
            "--server.port=0",
            "--spring.profiles.active=test",
        )
    }
}
```

This keeps the real Koog and Spring Boot wiring intact while replacing the provider dependency with
a deterministic local server.

The sample application also performs embedding requests during startup for RAG ingestion. Those
embedding stubs must exist before Spring Boot starts, or the application will make unmatched calls
while the test environment is still booting.

## Test the full Koog request path

The positive-path test in the sample repo drives the real application client, not Koog internals.
It stubs embeddings, moderation, and the chat completion stream, then verifies the final answer:

```kotlin
mockOpenai.embeddings {
    stringInput(question)
} responds {
    delay = 40.milliseconds
}

mockOpenai.moderation {
    inputContains(question)
} responds {
    flagged = false
}

mockOpenai.completion {
    systemMessageContains("witty and wise Elven assistant guiding adventurers")
    userMessageContains(question)
} respondsStream {
    responseFlow = flowOf(expectedAnswer)
}

val response = chatClient.sendMessage(question)
```

That test shape is useful when you want to prove prompt routing, moderation checks, RAG lookups,
and provider calls still produce the expected application response.

## Stream token-by-token output

The same repo includes a WebSocket integration test that verifies streaming delivery timing. The
mock server emits one token chunk at a time with a fixed delay between chunks:

```kotlin
val delayBetweenChunks = 500.milliseconds

mockOpenai.completion {
    systemMessageContains("witty and wise Elven assistant guiding adventurers")
    userMessageContains(question)
} respondsStream {
    responseFlow =
        expectedTokens
            .asFlow()
            .onEach { delay(delayBetweenChunks) }
}
```

The test then measures the WebSocket output and checks that Koog forwards the token stream with the
expected pacing. This is the right place to catch buffering mistakes and streaming regressions.
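
One way to express such a pacing check is to record an arrival timestamp per token and assert a minimum gap between consecutive arrivals. The helper below is a hypothetical sketch in plain Java, not the assertion used in the sample repository; `PacingCheckSketch`, `pacedAtLeast`, and the 400 ms tolerance are assumptions.

```java
import java.util.List;

/** Hypothetical pacing assertion over recorded arrival times. */
final class PacingCheckSketch {

    /** Returns true when every gap between consecutive arrivals is at least minGapMillis. */
    static boolean pacedAtLeast(List<Long> arrivalMillis, long minGapMillis) {
        for (int i = 1; i < arrivalMillis.size(); i++) {
            if (arrivalMillis.get(i) - arrivalMillis.get(i - 1) < minGapMillis) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Paced delivery: tokens arrive roughly 500 ms apart.
        System.out.println(pacedAtLeast(List.of(0L, 510L, 1015L), 400));     // prints true
        // Buffered delivery: all tokens arrive together once the stream has ended.
        System.out.println(pacedAtLeast(List.of(1500L, 1501L, 1502L), 400)); // prints false
    }
}
```

The threshold sits below the configured 500 ms delay so that ordinary scheduling jitter does not fail the test, while fully buffered output still does.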

## Exercise moderation and failure paths

The sample repo does not stop at happy-path chat. It also verifies:

- moderation blocking with `mockOpenai.moderation { ... } responds { flagged = true }`
- embedding failures with `respondsError { httpStatusCode = ... }`
- moderation API failures with fallback behavior
- LLM request failures for both SSE and non-streaming completion paths

For example, the failure test uses provider-like HTTP status codes such as `400`, `401`, `403`,
`404`, `418`, `500`, and `503`, then verifies that the application returns a stable fallback
message instead of crashing.
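
The behavior under test can be illustrated with a small JDK-only sketch: a client wrapper maps any non-2xx status or I/O failure to a fixed fallback message, and a stand-in server returns the chosen status. This is not the sample application's code and does not use the AI-Mocks API; the fallback text, the `/chat` path, and the class name are assumptions.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** JDK-only illustration of "stable fallback on provider errors". */
final class FallbackSketch {

    /** Hypothetical fallback text. */
    static final String FALLBACK = "Sorry, something went wrong. Please try again.";

    /** Returns the response body for 2xx statuses and the fallback otherwise. */
    static String answerOrFallback(URI uri) {
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                .send(HttpRequest.newBuilder(uri).GET().build(), HttpResponse.BodyHandlers.ofString());
            int status = response.statusCode();
            return status >= 200 && status < 300 ? response.body() : FALLBACK;
        } catch (IOException e) {
            return FALLBACK;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return FALLBACK;
        }
    }

    /** Starts a stand-in server that answers with the given status, then calls it once. */
    static String callWithStatus(int status) {
        HttpServer server;
        try {
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        server.createContext("/chat", exchange -> {
            byte[] body = "Hello".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(status, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        try {
            return answerOrFallback(URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/chat"));
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) {
        // A success status plus two of the error statuses listed above.
        for (int status : new int[] {200, 418, 503}) {
            System.out.println(status + " -> " + callWithStatus(status));
        }
    }
}
```

Parameterizing the test over the status list keeps one assertion per failure mode without duplicating the setup.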

## Verify Koog-specific endpoints too

The repo also tests a Koog strategy-graph endpoint by fetching
`/api/koog/strategy/graph` and asserting that the response contains Mermaid state-diagram output.
That is a useful pattern when your application exposes Koog diagnostics or graph introspection
routes in addition to chat endpoints.

## Source and next steps

- [Koog Spring Boot Assistant integration tests](https://github.com/kpavlov/koog-spring-boot-assistant/tree/main/integration-tests/src/test/kotlin/com/example/it)
- [AI-Mocks providers](../../ai-mocks/)
- [AI-Mocks OpenAI](../../ai-mocks/openai/)
- [Spring Boot](../spring-boot/)
- [OpenAI SDK](../openai-sdk/)


# Mokksy vs WireMock


WireMock remains a strong general-purpose HTTP stubbing tool.
Mokksy focuses on Kotlin and Java integration tests where streaming behavior,
Server-Sent Events (SSE), and deterministic failure simulation matter.

## Comparison

| Capability | Mokksy | WireMock |
|------------|--------|----------|
| HTTP stubbing and request matching | Yes | Yes |
| SSE-specific response API | `respondsWithSseStream` with event chunks | Use WireMock response configuration or evaluate extensions for your SSE scenario |
| Application-defined stream chunks | `respondsWithStream` accepts chunks or a flow | [Chunked Dribble Delay](https://wiremock.org/docs/simulating-faults/#chunked-dribble-delay) divides a configured body into chunks |
| Inter-chunk timing | Direct `delayBetweenChunks` control | Chunk count and total response duration determine pacing |
| Long-lived streams for client timeout tests | A flow can remain open with `awaitCancellation()` | Evaluate against your timeout scenario and WireMock setup |
| HTTP status and delayed-response scenarios | Yes | Yes |
| Connection-level fault injection | Not positioned as a core Mokksy API | Documented faults include malformed chunks and connection reset |
| Verification API | Request journal and stub verification | Yes |
| Kotlin-first DSL and Java API | Yes | General Java DSL; Kotlin use is through its Java API or integrations |
| Provider-shaped AI API mocks | Available through AI-Mocks | Not part of WireMock core |
| Embedding in an existing Ktor application | Yes | No equivalent Ktor embedding API |

WireMock capability statements above are based on its
[official fault-simulation documentation](https://wiremock.org/docs/simulating-faults/).
If your decision depends on a WireMock extension or a newer product surface, validate that
specific setup before migrating.
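
The "Inter-chunk timing" row comes down to simple arithmetic. When pacing is configured as a chunk count and a total duration, the gap between chunks is approximately the duration divided by the count, assuming the time is spread evenly. The sketch below only illustrates that conversion; it does not call WireMock or Mokksy, and the class and method names are hypothetical.

```java
/** Arithmetic for converting between "N chunks over T ms" and a per-chunk gap. */
final class ChunkPacingSketch {

    /** Approximate gap between chunks, assuming the total duration is spread evenly. */
    static long approxGapMillis(int numberOfChunks, long totalDurationMillis) {
        if (numberOfChunks <= 0) {
            throw new IllegalArgumentException("numberOfChunks must be positive");
        }
        return totalDurationMillis / numberOfChunks;
    }

    /** Inverse direction: the total duration to configure for a target per-chunk gap. */
    static long totalDurationFor(int numberOfChunks, long targetGapMillis) {
        return numberOfChunks * targetGapMillis;
    }

    public static void main(String[] args) {
        System.out.println(approxGapMillis(5, 1000)); // prints 200
        System.out.println(totalDurationFor(5, 10));  // prints 50
    }
}
```

With a direct `delayBetweenChunks` setting this conversion is unnecessary, which matters most when the number of chunks varies between test cases.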

## When to choose Mokksy

- You test clients that consume SSE or streaming APIs.
- You need Kotlin or Java tests that define SSE events and stream chunks directly in test code.
- You want concise Kotlin DSLs and Java-friendly APIs in JVM test suites.
- You use AI provider SDKs and want AI-Mocks on top of a real HTTP/SSE mock server.

## When WireMock may be enough

- Your team already has a mature WireMock setup and its delay or fault APIs cover your scenarios.
- You need connection-reset or malformed-response faults that WireMock already documents directly.