We are looking for a solution-oriented AI Engineer who can design and build production-grade multi-agent LLM systems.
The key expectation is not just writing code, but understanding the desired user outcome and engineering systems that reliably deliver it.
This role is closer to systems engineering / architecture than classical ML.
You will work independently, explore our APIs and product flows, and build agent-based solutions that integrate deeply with the platform.
Product platform with multiple internal APIs
Agents interact with platform APIs and product workflows
High variability of use cases and flows
Relatively small data volumes per agent, but high orchestration complexity
Agents must operate autonomously with tools, routing, memory and fallback strategies
Backend stack mainly Python, frontend TypeScript
Design and implement multi-agent systems, including orchestrator and worker agents
Define agent architecture, including routing, memory, tool usage, and fallback strategies
Translate product requirements into scalable and reliable agent workflows
Ensure systems are designed for robustness, maintainability, and extensibility
Agent Infrastructure and Development
Build core infrastructure for agent execution, including tool calling and state management
Develop and maintain multimodal pipelines (text and image generation)
Integrate and abstract multiple model providers (e.g., OpenAI, Anthropic, self-hosted models)
Implement custom tools and services to extend agent capabilities
Platform Integration
Integrate agents with internal platform APIs and product workflows
Analyze API specifications and existing codebases (Python backend and TypeScript frontend)
Build adapters and integration layers where necessary
Ensure reliable interaction between agents and platform systems
Reliability, Testing, and Evaluation
Implement testing strategies for agent behavior and system reliability
Build evaluation pipelines to measure LLM and agent performance
Debug complex agent workflows and improve system stability
Contribute to observability, logging, and monitoring of AI systems
Cross-functional Collaboration
Work closely with Product and Engineering teams to define use cases and system behavior
Provide input on AI architecture and system design decisions
Contribute to best practices for building production-grade AI systems
Strong Python backend development experience, including async programming and APIs
Commercial experience building multi-agent LLM systems (required)
Experience with agent frameworks such as LangChain, LangGraph, AutoGen, CrewAI, LlamaIndex, or similar
Experience designing agent orchestration, tool usage, routing, and memory systems
Experience building multimodal pipelines
Experience integrating multiple model providers (e.g., OpenAI, Anthropic, self-hosted models)
Experience working with vector databases (e.g., Qdrant, Pinecone, Weaviate)
Experience with Redis or similar systems for caching and state management
Strong understanding of production-grade development practices, including testing and debugging
Ability to design systems that prioritize reliability over experimentation
ML background or practical ML knowledge
Experience designing AI product architectures
Experience building agent evaluation / benchmarking frameworks
Experience working with large API ecosystems
Experience with AI observability tools
Strong ownership and autonomy
Ability to work from problem to architecture to implementation
Comfortable exploring unfamiliar codebases, APIs and product logic
Focus on engineering reliable AI systems, not just experimenting with models
Aghanim helps game developers achieve financial and creative independence by providing the solutions they need to launch, run, and grow their businesses.