# Evaluation

The simplified LLM/VLM API allows you to load a model and evaluate prompts with only a few lines of code.

For example, this loads a model, asks a question, and then a follow-on question:

```swift
let model = try await loadModel(id: "mlx-community/Qwen3-4B-4bit")
let session = ChatSession(model)
print(try await session.respond(to: "What are two things to see in San Francisco?"))
print(try await session.respond(to: "How about a great place to eat?"))
```
| 13 | + |
| 14 | +The second question actually refers to information (the location) from the first |
| 15 | +question -- this context is maintained inside the ``ChatSession`` object. |
| 16 | + |
| 17 | +If you need a one-shot prompt/response simply create a ``ChatSession``, evaluate |
| 18 | +the prompt and discard. Multiple ``ChatSession`` instances could also be used |
| 19 | +(at the cost of the memory in the `KVCache`) to handle multiple streams of |
| 20 | +context. |
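For instance, the independent-context pattern can be sketched like this (a hypothetical illustration reusing the model id from the example above; the follow-up resolves against each session's own history):

```swift
let model = try await loadModel(id: "mlx-community/Qwen3-4B-4bit")

// Two sessions share one loaded model but keep separate KVCache state,
// so their conversation histories never mix.
let parisSession = ChatSession(model)
let tokyoSession = ChatSession(model)

print(try await parisSession.respond(to: "What are two things to see in Paris?"))
print(try await tokyoSession.respond(to: "What are two things to see in Tokyo?"))

// "they" refers only to the Paris sights -- the Tokyo session is untouched.
print(try await parisSession.respond(to: "How far apart are they?"))
```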
## Streaming Output

The previous example produced the entire response in one call. Often
users want to see the text as it is generated -- you can do this with
a stream:

```swift
let model = try await loadModel(id: "mlx-community/Qwen3-4B-4bit")
let session = ChatSession(model)

for try await item in session.streamResponse(to: "Why is the sky blue?") {
    print(item, terminator: "")
}
print()
```
## VLMs (Vision Language Models)

This same API supports VLMs as well. Simply present the image or video
to the ``ChatSession``:

```swift
let model = try await loadModel(id: "mlx-community/Qwen2.5-VL-3B-Instruct-4bit")
let session = ChatSession(model)

let answer1 = try await session.respond(
    to: "What kind of creature is in the picture?",
    image: .url(URL(fileURLWithPath: "support/test.jpg"))
)
print(answer1)

// we can ask a follow-up question referring back to the previous image
let answer2 = try await session.respond(
    to: "What is behind the dog?"
)
print(answer2)
```
## Advanced Usage

``ChatSession`` has a number of parameters you can supply when creating it:

- **instructions**: optional instructions for the chat session, e.g. describing what type of responses to give
  - for example, you might instruct the language model to respond in rhyme or
    to talk like a famous character from a movie
  - or that the responses should be very brief
- **generateParameters**: parameters that control the generation of output, e.g. token limits and temperature
  - see ``GenerateParameters``
- **processing**: optional media processing instructions
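Putting the first two together might look like the sketch below. The exact ``GenerateParameters`` fields are assumptions here (`maxTokens` and `temperature` are illustrative); check ``GenerateParameters`` for the fields your version provides:

```swift
let model = try await loadModel(id: "mlx-community/Qwen3-4B-4bit")

// instructions shape every response; generateParameters bound the sampling.
let session = ChatSession(
    model,
    instructions: "You are a terse assistant. Answer in one sentence.",
    generateParameters: GenerateParameters(maxTokens: 128, temperature: 0.7)
)

print(try await session.respond(to: "Why is the sky blue?"))
```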