
Commit b7721f8

Merge branch 'develop'

Signed-off-by: Yam Marcovitz <[email protected]>

2 parents: a91627a + 679682a

File tree

77 files changed

+8649
-862
lines changed


CHANGELOG.md

Lines changed: 29 additions & 0 deletions
@@ -6,6 +6,35 @@ All notable changes to Parlant will be documented here.
 
 TBD
 
+## [3.0.2] - 2025-08-27
+
+### Added
+
+- Added docs/\* and llms.txt
+- Added Vertex NLP service
+- Added Ollama NLP service
+- Added LiteLLM support to the SDK
+- Added Gemini support to the SDK
+- Added Journey.create_observation() helper
+- Added auth permission READ_AGENT_DESCRIPTION
+- Added optional AWS_SESSION_TOKEN to BedrockService
+- Support creating status events via the API
+
+### Changed
+
+- Moved tool call success log to DEBUG level
+- Optimized canrep to not generate a draft in strict mode if no canrep candidates found
+- Removed `acknowledged_event_offset` from status events
+- Removed `last_known_event_offset` from `LoadedContext.interaction`
+
+### Fixed
+
+- Fixed presentation of missing API keys for built-in NLP services
+- Improvements to canned response generation
+- Fixed bug with null journey paths in some cases
+- Fixed tiny bug with terminal nodes in journey node selection
+- Fixed evaluations not showing properly after version upgrade
+
 ## [3.0.1] - 2025-08-16
 
 ### Changed

README.md

Lines changed: 56 additions & 12 deletions
@@ -14,6 +14,18 @@
   <a href="https://www.parlant.io/docs/quickstart/examples" target="_blank">📖 Examples</a>
 </p>
 
+<p>
+  <!-- Keep these links. Translations will automatically update with the README. -->
+  <a href="https://zdoc.app/de/emcie-co/parlant">Deutsch</a> |
+  <a href="https://zdoc.app/es/emcie-co/parlant">Español</a> |
+  <a href="https://zdoc.app/fr/emcie-co/parlant">français</a> |
+  <a href="https://zdoc.app/ja/emcie-co/parlant">日本語</a> |
+  <a href="https://zdoc.app/ko/emcie-co/parlant">한국어</a> |
+  <a href="https://zdoc.app/pt/emcie-co/parlant">Português</a> |
+  <a href="https://zdoc.app/ru/emcie-co/parlant">Русский</a> |
+  <a href="https://zdoc.app/zh/emcie-co/parlant">中文</a>
+</p>
+
 <p>
   <a href="https://pypi.org/project/parlant/"><img alt="PyPI" src="https://img.shields.io/pypi/v/parlant?color=blue"></a>
   <img alt="Python 3.10+" src="https://img.shields.io/badge/python-3.10+-blue">
@@ -39,22 +51,42 @@ You build an AI agent. It works great in testing. Then real users start talking
 
 **Sound familiar?** You're not alone. This is the #1 pain point for developers building production AI agents.
 
-## ⚡ The Solution: Teach Principles, Not Scripts
+## ⚡ The Solution: Stop Fighting Prompts, Teach Principles
 
-Parlant flips the script on AI agent development. Instead of hoping your LLM will follow instructions, **Parlant guarantees it**.
+Parlant flips the script on AI agent development. Instead of hoping your LLM will follow instructions, **Parlant ensures it**.
 
 ```python
 # Traditional approach: Cross your fingers 🤞
 system_prompt = "You are a helpful assistant. Please follow these 47 rules..."
 
-# Parlant approach: Guaranteed compliance ✅
+# Parlant approach: Ensured compliance ✅
 await agent.create_guideline(
     condition="Customer asks about refunds",
     action="Check order status first to see if eligible",
     tools=[check_order_status],
 )
 ```
 
+#### Parlant gives you all the structure you need to build customer-facing agents that behave exactly as your business requires:
+
+- **[Journeys](https://parlant.io/docs/concepts/customization/journeys)**:
+  Define clear customer journeys and how your agent should respond at each step.
+
+- **[Behavioral Guidelines](https://parlant.io/docs/concepts/customization/guidelines)**:
+  Easily craft agent behavior; Parlant will match the relevant elements contextually.
+
+- **[Tool Use](https://parlant.io/docs/concepts/customization/tools)**:
+  Attach external APIs, data fetchers, or backend services to specific interaction events.
+
+- **[Domain Adaptation](https://parlant.io/docs/concepts/customization/glossary)**:
+  Teach your agent domain-specific terminology and craft personalized responses.
+
+- **[Canned Responses](https://parlant.io/docs/concepts/customization/canned-responses)**:
+  Use response templates to eliminate hallucinations and guarantee style consistency.
+
+- **[Explainability](https://parlant.io/docs/advanced/explainability)**:
+  Understand why and when each guideline was matched and followed.
+
 <div align="center">
 
 ## 🚀 Get Your Agent Running in 60 Seconds
@@ -73,20 +105,32 @@ async def get_weather(context: p.ToolContext, city: str) -> p.ToolResult:
     # Your weather API logic here
     return p.ToolResult(f"Sunny, 72°F in {city}")
 
+@p.tool
+async def get_datetime(context: p.ToolContext) -> p.ToolResult:
+    from datetime import datetime
+    return p.ToolResult(datetime.now())
+
 async def main():
     async with p.Server() as server:
         agent = await server.create_agent(
             name="WeatherBot",
             description="Helpful weather assistant"
         )
 
-        # Define behavior with natural language
+        # Have the agent's context be updated on every response (though
+        # update interval is customizable) using a context variable.
+        await agent.create_variable(name="current-datetime", tool=get_datetime)
+
+        # Control and guide agent behavior with natural language
         await agent.create_guideline(
             condition="User asks about weather",
             action="Get current weather and provide a friendly response with suggestions",
             tools=[get_weather]
         )
 
+        # Add other (reliably enforced) behavioral modeling elements
+        # ...
+
         # 🎉 Test playground ready at http://localhost:8800
         # Integrate the official React widget into your app,
         # or follow the tutorial to build your own frontend!
@@ -96,7 +140,7 @@ if __name__ == "__main__":
     asyncio.run(main())
 ```
 
-**That's it!** Your agent is running with guaranteed rule-following behavior.
+**That's it!** Your agent is running with ensured rule-following behavior.
 
 ## 🎬 See It In Action
 
@@ -130,7 +174,7 @@ if __name__ == "__main__":
 <td width="50%">
 
 - Define rules in natural language
-- **Guaranteed** rule compliance
+- **Ensured** rule compliance
 - Predictable, consistent behavior
 - Scale by adding guidelines
 - Production-ready from day one
@@ -161,18 +205,22 @@ if __name__ == "__main__":
 - **📱 React Widget** - [Drop-in chat UI for any web app](https://github.com/emcie-co/parlant-chat-react)
 - **🔍 Full Explainability** - Understand every decision your agent makes
 
-## 📈 Join 1000+ Developers Building Better AI
+## 📈 Join 8,000+ Developers Building Better AI
 
 <div align="center">
 
-**Companies using Parlant in production:**
+**Companies using Parlant:**
 
 _Financial institutions • Healthcare providers • Legal firms • E-commerce platforms_
 
 [![Star History Chart](https://api.star-history.com/svg?repos=emcie-co/parlant&type=Date)](https://star-history.com/#emcie-co/parlant&Date)
 
 </div>
 
+## 🌟 What Developers Are Saying
+
+> _"By far the most elegant conversational AI framework that I've come across! Developing with Parlant is pure joy."_ **— Vishal Ahuja, Senior Lead, Customer-Facing Conversational AI @ JPMorgan Chase**
+
 ## 🏃‍♂️ Quick Start Paths
 
 <table border="0">
@@ -190,10 +238,6 @@ _Financial institutions • Healthcare providers • Legal firms • E-commerce
 </tr>
 </table>
 
-## 🌟 What Developers Are Saying
-
-> _"By far the most elegant conversational AI framework that I've come across! Developing with Parlant is pure joy."_ **— Vishal Ahuja, Senior Lead, Customer-Facing Conversational AI @ JPMorgan Chase**
-
 ## 🤝 Community & Support
 
 - 💬 **[Discord Community](https://discord.gg/duxWqxKk6J)** - Get help from the team and community

docs/adapters/nlp/ollama.md

Lines changed: 188 additions & 0 deletions
@@ -0,0 +1,188 @@
# Ollama Service Documentation

The Ollama service provides local LLM capabilities for Parlant using [Ollama](https://ollama.ai/). This service supports both text generation and embeddings using various open-source models.
## Prerequisites

1. **Install Ollama**: Download and install from [ollama.ai](https://ollama.ai/)
2. **Start Ollama server**: Run `ollama serve` (usually starts automatically)
3. **Pull required models** (see the [Recommended Models](#recommended-models) section)
## Environment Variables

Configure the Ollama service using these environment variables:

```bash
# Ollama server URL (default: http://localhost:11434)
export OLLAMA_BASE_URL="http://localhost:11434"

# Model to use (default: gemma3:4b)
# Options: gemma3:1b, gemma3:4b, llama3.1:8b, gemma3:12b, gemma3:27b, llama3.1:70b, llama3.1:405b
export OLLAMA_MODEL="gemma3:4b"

# Embedding model (default: nomic-embed-text)
# Options: nomic-embed-text, mxbai-embed-large
export OLLAMA_EMBEDDING_MODEL="nomic-embed-text"

# API timeout in seconds (default: 300)
export OLLAMA_API_TIMEOUT="300"
```

### Example Configuration

```bash
# For development (fast, good balance)
export OLLAMA_MODEL="gemma3:4b"
export OLLAMA_EMBEDDING_MODEL="nomic-embed-text"
export OLLAMA_API_TIMEOUT="180"

# For higher-accuracy or cloud setups, allow a longer timeout
export OLLAMA_MODEL="gemma3:4b"
export OLLAMA_EMBEDDING_MODEL="nomic-embed-text"
export OLLAMA_API_TIMEOUT="600"
```
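To make the defaults above concrete, here is a small sketch of reading this configuration from the environment. The `load_ollama_config` helper and its dict layout are illustrative only (the actual Parlant adapter may organize its configuration differently); the default values mirror the ones documented above.

```python
import os

# Hypothetical helper mirroring the documented defaults; the real
# Parlant Ollama adapter may read its configuration differently.
def load_ollama_config() -> dict:
    return {
        "base_url": os.environ.get("OLLAMA_BASE_URL", "http://localhost:11434"),
        "model": os.environ.get("OLLAMA_MODEL", "gemma3:4b"),
        "embedding_model": os.environ.get("OLLAMA_EMBEDDING_MODEL", "nomic-embed-text"),
        "timeout_seconds": float(os.environ.get("OLLAMA_API_TIMEOUT", "300")),
    }
```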
## Recommended Models

**⚠️ IMPORTANT**: Pull these models before running Parlant to avoid API timeouts during first use:

### Text Generation Models

```bash
# Recommended for most use cases (good balance of speed/accuracy)
ollama pull gemma3:4b-it-qat

# Fast, but may struggle with complex schemas
ollama pull gemma3:1b

# Embedding model, required for creating embeddings
ollama pull nomic-embed-text
```

### Large Models (Cloud/High-End Hardware Only)

```bash
# Better reasoning capabilities
ollama pull llama3.1:8b

# High accuracy for complex tasks
ollama pull gemma3:12b

# Very high accuracy (requires more resources)
ollama pull gemma3:27b-it-qat

# ⚠️ WARNING: Requires 40GB+ GPU memory
ollama pull llama3.1:70b

# ⚠️ WARNING: Requires 200GB+ GPU memory (cloud-only)
ollama pull llama3.1:405b
```

### Embedding Models

To use a custom embedding model, set the `OLLAMA_EMBEDDING_MODEL` environment variable to the desired model name. Note that this implementation is tested with `nomic-embed-text`.

```bash
# Alternative embedding model (512 dimensions)
ollama pull mxbai-embed-large:latest
```
## Model Recommendations by Use Case

| Model Size | Use Case | Memory Requirements | Performance |
|------------|----------|---------------------|-------------|
| `1b` | Quick testing, simple tasks | ~2GB | Fast but limited accuracy |
| `4b` | **Recommended for development** | ~4GB | Good balance of speed/accuracy |
| `8b` | Complex reasoning | ~8GB | Better reasoning than Gemma |
| `12b` | High-accuracy tasks | ~12GB | High accuracy, slower |
| `27b` | Complex workloads | ~27GB | Very high accuracy |
| `70b` | Enterprise/cloud only | ~40GB+ | Excellent accuracy |
| `405b` | Research/cloud only | ~200GB+ | State-of-the-art |
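A common way to apply the table above is to pick the largest (most accurate) model that fits your available memory. The sketch below encodes the table's approximate figures; the `pick_model` helper and the exact numbers are illustrative and not part of Parlant's API.

```python
# Approximate memory requirements (GB), taken from the table above.
MODEL_MEMORY_GB = {
    "gemma3:1b": 2,
    "gemma3:4b": 4,
    "llama3.1:8b": 8,
    "gemma3:12b": 12,
    "gemma3:27b": 27,
    "llama3.1:70b": 40,
    "llama3.1:405b": 200,
}

def pick_model(available_memory_gb: float) -> str:
    """Return the largest listed model that fits in the given memory."""
    fitting = [m for m, gb in MODEL_MEMORY_GB.items() if gb <= available_memory_gb]
    if not fitting:
        raise ValueError("Not enough memory for any listed model")
    return max(fitting, key=MODEL_MEMORY_GB.get)
```

For example, a 16GB GPU would select `gemma3:12b` under these assumptions.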
## Usage Example

```python
import asyncio

import parlant.sdk as p
from parlant.sdk import NLPServices


async def main() -> None:
    # The server must be entered from within a coroutine.
    async with p.Server(nlp_service=NLPServices.ollama) as server:
        agent = await server.create_agent(
            name="Healthcare Agent",
            description="Is empathetic and calming to the patient.",
        )


if __name__ == "__main__":
    asyncio.run(main())
```
## Configuration Tips

### Development Setup

```bash
export OLLAMA_MODEL=gemma3:4b
export OLLAMA_API_TIMEOUT=180
```

### High-Performance Setup (Cloud)

```bash
export OLLAMA_MODEL=llama3.1:70b
export OLLAMA_API_TIMEOUT=300
```

### Custom/Other Models

```bash
export OLLAMA_MODEL=llama3.2:3b
export OLLAMA_API_TIMEOUT=300
```
## Troubleshooting

### Common Issues

1. **Model not found**
   ```
   Model gemma3:4b not found. Please pull it first with: ollama pull gemma3:4b
   ```
   **Solution**: Run `ollama pull gemma3:4b-it-qat` before starting Parlant.

2. **Connection error**
   ```
   Cannot connect to Ollama server at http://localhost:11434
   ```
   **Solution**: Ensure Ollama is running with `ollama serve`.

3. **Timeout error**
   ```
   Request timed out after 300s
   ```
   **Solution**: Increase `OLLAMA_API_TIMEOUT` or use a smaller model.

4. **Out of memory**
   ```
   CUDA out of memory
   ```
   **Solution**: Use a smaller model size or increase GPU memory.

### Performance Optimization

1. **Pre-pull models**: Always pull models before first use.
2. **Adjust timeout**: Increase the timeout for larger models.
3. **Model selection**: Use the smallest model that meets your accuracy requirements.
4. **GPU memory**: Monitor GPU usage and adjust the model size accordingly.
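The first two issues above (missing models, unreachable server) can also be diagnosed programmatically. This sketch, which is not part of Parlant, queries Ollama's local `/api/tags` endpoint, which lists the models that have been pulled:

```python
import json
import urllib.error
import urllib.request

def list_pulled_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Return the names of locally pulled models, or raise a helpful error."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
            data = json.load(resp)
    except urllib.error.URLError as exc:
        raise RuntimeError(
            f"Cannot connect to Ollama server at {base_url}. "
            "Ensure it is running with: ollama serve"
        ) from exc
    return [model["name"] for model in data.get("models", [])]

# Example: warn if the configured model has not been pulled yet.
# if "gemma3:4b" not in list_pulled_models():
#     print("Run: ollama pull gemma3:4b")
```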
## Available Model Classes

The service provides these pre-configured model classes:

- `OllamaGemma3_1B`: Fast, basic accuracy
- `OllamaGemma3_4B`: **Recommended** - good balance
- `OllamaLlama31_8B`: Better reasoning
- `OllamaGemma3_12B`: High accuracy
- `OllamaGemma3_27B`: Very high accuracy
- `OllamaLlama31_70B`: Enterprise-grade (high memory)
- `OllamaLlama31_405B`: Research-grade (very high memory)
## Security Notes

- Ollama runs locally, so no data leaves your machine
- No API keys required
- Models are downloaded and cached locally
- Consider firewall rules if exposing the Ollama server externally
