Configuration

OMM is configured via ~/.omm/config.toml. The desktop app provides a settings UI; CLI users edit the file directly.

Full Configuration Reference

config.toml

toml

# Server settings
host = "127.0.0.1"
port = 11435
max_loaded_models = 3
num_parallel = 4
gpu_layers = 35
keep_alive = "5m"
active_profile = "default"

# Engine configuration
[engine]
default_engine = "auto"   # auto, llama, rvllm
autotune = false

[engine.llama]
gpu_layers = 35           # Layers to offload to GPU
context_length = 8192     # Default context window
rope_scaling = "none"     # none, linear, yarn
flash_attention = true

[engine.rvllm]
tensor_parallel = 1       # Number of GPUs
max_batch_size = 256
kv_cache_dtype = "auto"   # auto, fp8, fp16
speculative_model = ""    # Draft model for speculative decoding

# Logging
[logging]
level = "info"            # debug, info, warn, error
file = ""                 # Log file path (empty = stderr)

# Model defaults
[defaults]
temperature = 0.7
top_p = 0.9
top_k = 40
repeat_penalty = 1.1
system_prompt = ""

Environment Variables

Variable	Default	Description
OMM_HOST	127.0.0.1	Server bind address
OMM_PORT	11435	Server port
OMM_HOME	~/.omm	Data directory
OMM_ENGINE	auto	Inference engine selection
OMM_API_KEY	(none)	Optional API authentication
OMM_GPU_LAYERS	35	GPU layer count override

Profile System

Profiles let you switch between different configurations (tools, MCPs, skills, hooks, system prompts) without editing the main config. Profiles are stored in ~/.omm/profiles/.

Terminal

bash

# List profiles
omm profile list

# Switch profile
omm profile set coding

# Create a new profile
omm profile create research --model qwen2.5-72b
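
For scripts that need to inspect profiles without invoking the CLI, the directory can be read directly. This is only a sketch: it assumes OMM_HOME relocates the profiles directory along with the rest of the data directory, and that each profile is a single entry named after the profile. Neither is confirmed by the documentation, and both function names are hypothetical.

```python
import os
from pathlib import Path


def profiles_dir(env=None) -> Path:
    """Resolve the profiles directory, honoring OMM_HOME if set (assumption)."""
    env = os.environ if env is None else env
    home = env.get("OMM_HOME") or "~/.omm"  # documented default data directory
    return Path(home).expanduser() / "profiles"


def list_profile_names(directory: Path) -> list[str]:
    """Return entry names without extensions; the on-disk layout is a guess."""
    if not directory.is_dir():
        return []
    return sorted(entry.stem for entry in directory.iterdir())
```

Use omm profile list as the authoritative source; this sketch only mirrors what that command presumably reads.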
