
Commit b7721f8

Merge branch 'develop'

Signed-off-by: Yam Marcovitz <[email protected]>

2 parents: a91627a + 679682a

File tree

77 files changed

+8649
-862
lines changed


CHANGELOG.md

Lines changed: 29 additions & 0 deletions
@@ -6,6 +6,35 @@ All notable changes to Parlant will be documented here.
 
 TBD
 
+## [3.0.2] - 2025-08-27
+
+### Added
+
+- Added docs/\* and llms.txt
+- Added Vertex NLP service
+- Added Ollama NLP service
+- Added LiteLLM support to the SDK
+- Added Gemini support to the SDK
+- Added Journey.create_observation() helper
+- Added auth permission READ_AGENT_DESCRIPTION
+- Added optional AWS_SESSION_TOKEN to BedrockService
+- Support creating status events via the API
+
+### Changed
+
+- Moved tool call success log to DEBUG level
+- Optimized canrep to not generate a draft in strict mode if no canrep candidates found
+- Removed `acknowledged_event_offset` from status events
+- Removed `last_known_event_offset` from `LoadedContext.interaction`
+
+### Fixed
+
+- Fixed presentation of missing API keys for built-in NLP services
+- Improvements to canned response generation
+- Fixed bug with null journey paths in some cases
+- Fixed tiny bug with terminal nodes in journey node selection
+- Fixed evaluations not showing properly after version upgrade
+
 ## [3.0.1] - 2025-08-16
 
 ### Changed

README.md

Lines changed: 56 additions & 12 deletions
@@ -14,6 +14,18 @@
   <a href="https://www.parlant.io/docs/quickstart/examples" target="_blank">📖 Examples</a>
 </p>
 
+<p>
+  <!-- Keep these links. Translations will automatically update with the README. -->
+  <a href="https://zdoc.app/de/emcie-co/parlant">Deutsch</a> |
+  <a href="https://zdoc.app/es/emcie-co/parlant">Español</a> |
+  <a href="https://zdoc.app/fr/emcie-co/parlant">français</a> |
+  <a href="https://zdoc.app/ja/emcie-co/parlant">日本語</a> |
+  <a href="https://zdoc.app/ko/emcie-co/parlant">한국어</a> |
+  <a href="https://zdoc.app/pt/emcie-co/parlant">Português</a> |
+  <a href="https://zdoc.app/ru/emcie-co/parlant">Русский</a> |
+  <a href="https://zdoc.app/zh/emcie-co/parlant">中文</a>
+</p>
+
 <p>
   <a href="https://pypi.org/project/parlant/"><img alt="PyPI" src="https://img.shields.io/pypi/v/parlant?color=blue"></a>
   <img alt="Python 3.10+" src="https://img.shields.io/badge/python-3.10+-blue">
@@ -39,22 +51,42 @@ You build an AI agent. It works great in testing. Then real users start talking
 
 **Sound familiar?** You're not alone. This is the #1 pain point for developers building production AI agents.
 
-## ⚡ The Solution: Teach Principles, Not Scripts
+## ⚡ The Solution: Stop Fighting Prompts, Teach Principles
 
-Parlant flips the script on AI agent development. Instead of hoping your LLM will follow instructions, **Parlant guarantees it**.
+Parlant flips the script on AI agent development. Instead of hoping your LLM will follow instructions, **Parlant ensures it**.
 
 ```python
 # Traditional approach: Cross your fingers 🤞
 system_prompt = "You are a helpful assistant. Please follow these 47 rules..."
 
-# Parlant approach: Guaranteed compliance ✅
+# Parlant approach: Ensured compliance ✅
 await agent.create_guideline(
     condition="Customer asks about refunds",
     action="Check order status first to see if eligible",
     tools=[check_order_status],
 )
 ```
 
+#### Parlant gives you all the structure you need to build customer-facing agents that behave exactly as your business requires:
+
+- **[Journeys](https://parlant.io/docs/concepts/customization/journeys)**:
+  Define clear customer journeys and how your agent should respond at each step.
+
+- **[Behavioral Guidelines](https://parlant.io/docs/concepts/customization/guidelines)**:
+  Easily craft agent behavior; Parlant will match the relevant elements contextually.
+
+- **[Tool Use](https://parlant.io/docs/concepts/customization/tools)**:
+  Attach external APIs, data fetchers, or backend services to specific interaction events.
+
+- **[Domain Adaptation](https://parlant.io/docs/concepts/customization/glossary)**:
+  Teach your agent domain-specific terminology and craft personalized responses.
+
+- **[Canned Responses](https://parlant.io/docs/concepts/customization/canned-responses)**:
+  Use response templates to eliminate hallucinations and guarantee style consistency.
+
+- **[Explainability](https://parlant.io/docs/advanced/explainability)**:
+  Understand why and when each guideline was matched and followed.
+
 <div align="center">
 
 ## 🚀 Get Your Agent Running in 60 Seconds
@@ -73,20 +105,32 @@ async def get_weather(context: p.ToolContext, city: str) -> p.ToolResult:
     # Your weather API logic here
     return p.ToolResult(f"Sunny, 72°F in {city}")
 
+@p.tool
+async def get_datetime(context: p.ToolContext) -> p.ToolResult:
+    from datetime import datetime
+    return p.ToolResult(datetime.now())
+
 async def main():
     async with p.Server() as server:
         agent = await server.create_agent(
             name="WeatherBot",
             description="Helpful weather assistant"
         )
 
-        # Define behavior with natural language
+        # Have the agent's context be updated on every response (though
+        # update interval is customizable) using a context variable.
+        await agent.create_variable(name="current-datetime", tool=get_datetime)
+
+        # Control and guide agent behavior with natural language
         await agent.create_guideline(
             condition="User asks about weather",
             action="Get current weather and provide a friendly response with suggestions",
             tools=[get_weather]
         )
 
+        # Add other (reliably enforced) behavioral modeling elements
+        # ...
+
         # 🎉 Test playground ready at http://localhost:8800
         # Integrate the official React widget into your app,
         # or follow the tutorial to build your own frontend!
@@ -96,7 +140,7 @@ if __name__ == "__main__":
     asyncio.run(main())
 ```
 
-**That's it!** Your agent is running with guaranteed rule-following behavior.
+**That's it!** Your agent is running with ensured rule-following behavior.
 
 ## 🎬 See It In Action
 
@@ -130,7 +174,7 @@ if __name__ == "__main__":
 <td width="50%">
 
 - Define rules in natural language
-- **Guaranteed** rule compliance
+- **Ensured** rule compliance
 - Predictable, consistent behavior
 - Scale by adding guidelines
 - Production-ready from day one
@@ -161,18 +205,22 @@ if __name__ == "__main__":
 - **📱 React Widget** - [Drop-in chat UI for any web app](https://github.com/emcie-co/parlant-chat-react)
 - **🔍 Full Explainability** - Understand every decision your agent makes
 
-## 📈 Join 1000+ Developers Building Better AI
+## 📈 Join 8,000+ Developers Building Better AI
 
 <div align="center">
 
-**Companies using Parlant in production:**
+**Companies using Parlant:**
 
 _Financial institutions • Healthcare providers • Legal firms • E-commerce platforms_
 
 [![Star History Chart](https://api.star-history.com/svg?repos=emcie-co/parlant&type=Date)](https://star-history.com/#emcie-co/parlant&Date)
 
 </div>
 
+## 🌟 What Developers Are Saying
+
+> _"By far the most elegant conversational AI framework that I've come across! Developing with Parlant is pure joy."_ **— Vishal Ahuja, Senior Lead, Customer-Facing Conversational AI @ JPMorgan Chase**
+
 ## 🏃‍♂️ Quick Start Paths
 
 <table border="0">
@@ -190,10 +238,6 @@ _Financial institutions • Healthcare providers • Legal firms • E-commerce
 </tr>
 </table>
 
-## 🌟 What Developers Are Saying
-
-> _"By far the most elegant conversational AI framework that I've come across! Developing with Parlant is pure joy."_ **— Vishal Ahuja, Senior Lead, Customer-Facing Conversational AI @ JPMorgan Chase**
-
 ## 🤝 Community & Support
 
 - 💬 **[Discord Community](https://discord.gg/duxWqxKk6J)** - Get help from the team and community

docs/adapters/nlp/ollama.md

Lines changed: 188 additions & 0 deletions
@@ -0,0 +1,188 @@
# Ollama Service Documentation

The Ollama service provides local LLM capabilities for Parlant using [Ollama](https://ollama.ai/). This service supports both text generation and embeddings using various open-source models.
## Prerequisites

1. **Install Ollama**: Download and install from [ollama.ai](https://ollama.ai/)
2. **Start Ollama server**: Run `ollama serve` (usually starts automatically)
3. **Pull required models** (see the [Recommended Models](#recommended-models) section)
## Environment Variables

Configure the Ollama service using these environment variables:

```bash
# Ollama server URL (default: http://localhost:11434)
export OLLAMA_BASE_URL="http://localhost:11434"

# Model to use (default: gemma3:4b)
# Options: gemma3:1b, gemma3:4b, llama3.1:8b, gemma3:12b, gemma3:27b, llama3.1:70b, llama3.1:405b
export OLLAMA_MODEL="gemma3:4b"

# Embedding model (default: nomic-embed-text)
# Options: nomic-embed-text, mxbai-embed-large
export OLLAMA_EMBEDDING_MODEL="nomic-embed-text"

# API timeout in seconds (default: 300)
export OLLAMA_API_TIMEOUT="300"
```

### Example Configuration

```bash
# For development (fast, good balance)
export OLLAMA_MODEL="gemma3:4b"
export OLLAMA_EMBEDDING_MODEL="nomic-embed-text"
export OLLAMA_API_TIMEOUT="180"

# For higher-accuracy or cloud setups, allow a longer timeout
export OLLAMA_MODEL="gemma3:4b"
export OLLAMA_EMBEDDING_MODEL="nomic-embed-text"
export OLLAMA_API_TIMEOUT="600"
```
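To make the defaults above concrete, here is a small sketch of reading this configuration from the environment. The `load_ollama_config` helper and its dict layout are illustrative only (the actual Parlant adapter may organize its configuration differently); the default values mirror the ones documented above.

```python
import os

# Hypothetical helper mirroring the documented defaults; the real
# Parlant Ollama adapter may read its configuration differently.
def load_ollama_config() -> dict:
    return {
        "base_url": os.environ.get("OLLAMA_BASE_URL", "http://localhost:11434"),
        "model": os.environ.get("OLLAMA_MODEL", "gemma3:4b"),
        "embedding_model": os.environ.get("OLLAMA_EMBEDDING_MODEL", "nomic-embed-text"),
        "timeout_seconds": float(os.environ.get("OLLAMA_API_TIMEOUT", "300")),
    }
```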
## Recommended Models

**⚠️ IMPORTANT**: Pull these models before running Parlant to avoid API timeouts during first use:

### Text Generation Models

```bash
# Recommended for most use cases (good balance of speed/accuracy)
ollama pull gemma3:4b-it-qat

# Fast, but may struggle with complex schemas
ollama pull gemma3:1b

# Embedding model, required for creating embeddings
ollama pull nomic-embed-text
```

### Large Models (Cloud/High-End Hardware Only)

```bash
# Better reasoning capabilities
ollama pull llama3.1:8b

# High accuracy for complex tasks
ollama pull gemma3:12b

# Very high accuracy (requires more resources)
ollama pull gemma3:27b-it-qat

# ⚠️ WARNING: Requires 40GB+ GPU memory
ollama pull llama3.1:70b

# ⚠️ WARNING: Requires 200GB+ GPU memory (cloud-only)
ollama pull llama3.1:405b
```

### Embedding Models

To use a custom embedding model, set the `OLLAMA_EMBEDDING_MODEL` environment variable to the desired model name. Note that this implementation is tested with `nomic-embed-text`.

```bash
# Alternative embedding model (512 dimensions)
ollama pull mxbai-embed-large:latest
```
## Model Recommendations by Use Case

| Model Size | Use Case | Memory Requirements | Performance |
|------------|----------|---------------------|-------------|
| `1b` | Quick testing, simple tasks | ~2GB | Fast but limited accuracy |
| `4b` | **Recommended for development** | ~4GB | Good balance of speed/accuracy |
| `8b` | Complex reasoning | ~8GB | Better reasoning than Gemma |
| `12b` | High-accuracy tasks | ~12GB | High accuracy, slower |
| `27b` | Complex workloads | ~27GB | Very high accuracy |
| `70b` | Enterprise/cloud only | ~40GB+ | Excellent accuracy |
| `405b` | Research/cloud only | ~200GB+ | State-of-the-art |
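A common way to apply the table above is to pick the largest (most accurate) model that fits your available memory. The sketch below encodes the table's approximate figures; the `pick_model` helper and the exact numbers are illustrative and not part of Parlant's API.

```python
# Approximate memory requirements (GB), taken from the table above.
MODEL_MEMORY_GB = {
    "gemma3:1b": 2,
    "gemma3:4b": 4,
    "llama3.1:8b": 8,
    "gemma3:12b": 12,
    "gemma3:27b": 27,
    "llama3.1:70b": 40,
    "llama3.1:405b": 200,
}

def pick_model(available_memory_gb: float) -> str:
    """Return the largest listed model that fits in the given memory."""
    fitting = [m for m, gb in MODEL_MEMORY_GB.items() if gb <= available_memory_gb]
    if not fitting:
        raise ValueError("Not enough memory for any listed model")
    return max(fitting, key=MODEL_MEMORY_GB.get)
```

For example, a 16GB GPU would select `gemma3:12b` under these assumptions.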
## Usage Example

```python
import asyncio

import parlant.sdk as p
from parlant.sdk import NLPServices


async def main() -> None:
    # The server must be entered from within a coroutine.
    async with p.Server(nlp_service=NLPServices.ollama) as server:
        agent = await server.create_agent(
            name="Healthcare Agent",
            description="Is empathetic and calming to the patient.",
        )


if __name__ == "__main__":
    asyncio.run(main())
```
## Configuration Tips

### Development Setup

```bash
export OLLAMA_MODEL=gemma3:4b
export OLLAMA_API_TIMEOUT=180
```

### High-Performance Setup (Cloud)

```bash
export OLLAMA_MODEL=llama3.1:70b
export OLLAMA_API_TIMEOUT=300
```

### Custom/Other Models

```bash
export OLLAMA_MODEL=llama3.2:3b
export OLLAMA_API_TIMEOUT=300
```
## Troubleshooting

### Common Issues

1. **Model not found**
   ```
   Model gemma3:4b not found. Please pull it first with: ollama pull gemma3:4b
   ```
   **Solution**: Run `ollama pull gemma3:4b-it-qat` before starting Parlant.

2. **Connection error**
   ```
   Cannot connect to Ollama server at http://localhost:11434
   ```
   **Solution**: Ensure Ollama is running with `ollama serve`.

3. **Timeout error**
   ```
   Request timed out after 300s
   ```
   **Solution**: Increase `OLLAMA_API_TIMEOUT` or use a smaller model.

4. **Out of memory**
   ```
   CUDA out of memory
   ```
   **Solution**: Use a smaller model size or increase GPU memory.

### Performance Optimization

1. **Pre-pull models**: Always pull models before first use.
2. **Adjust timeout**: Increase the timeout for larger models.
3. **Model selection**: Use the smallest model that meets your accuracy requirements.
4. **GPU memory**: Monitor GPU usage and adjust the model size accordingly.
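The first two issues above (missing models, unreachable server) can also be diagnosed programmatically. This sketch, which is not part of Parlant, queries Ollama's local `/api/tags` endpoint, which lists the models that have been pulled:

```python
import json
import urllib.error
import urllib.request

def list_pulled_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Return the names of locally pulled models, or raise a helpful error."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
            data = json.load(resp)
    except urllib.error.URLError as exc:
        raise RuntimeError(
            f"Cannot connect to Ollama server at {base_url}. "
            "Ensure it is running with: ollama serve"
        ) from exc
    return [model["name"] for model in data.get("models", [])]

# Example: warn if the configured model has not been pulled yet.
# if "gemma3:4b" not in list_pulled_models():
#     print("Run: ollama pull gemma3:4b")
```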
## Available Model Classes

The service provides these pre-configured model classes:

- `OllamaGemma3_1B`: Fast, basic accuracy
- `OllamaGemma3_4B`: **Recommended** - good balance
- `OllamaLlama31_8B`: Better reasoning
- `OllamaGemma3_12B`: High accuracy
- `OllamaGemma3_27B`: Very high accuracy
- `OllamaLlama31_70B`: Enterprise-grade (high memory)
- `OllamaLlama31_405B`: Research-grade (very high memory)
## Security Notes

- Ollama runs locally, so no data leaves your machine
- No API keys required
- Models are downloaded and cached locally
- Consider firewall rules if exposing the Ollama server externally
