AI Architect/Insurance/12 months (Extendable/Convert to Perm)
Sector:
Technology
Function:
Contact Name:
Vivian On
Expiry Date:
24-Jul-2026
Job Ref:
Date Published:
24-Jun-2026
Job Description: Technical AI Architect
Role Overview
We are seeking a Technical AI Architect to lead the design, scaling, and governance of our Enterprise Agentic RAG platform. You will move beyond basic semantic search to architect production-grade, end-to-end multi-agent products and high-performance retrieval systems.
This role demands deep technical mastery in Agentic RAG and LangGraph, strict attention to cost/token optimization, and the ability to ship resilient, production-grade products that enforce robust enterprise guardrails and security compliance.
Key Responsibilities
- Production-Grade Agentic Architecture: Design and build end-to-end Agentic RAG products utilizing state-driven, multi-agent systems and cyclic workflows via LangGraph. Move from sequential pipelines to iterative, self-correcting reasoning loops (e.g., query decomposition, self-reflection, and dynamic context validation).
- Enterprise-Scale Retrieval Systems: Architect high-precision, layout-aware semantic chunking pipelines. Implement enterprise hybrid search (combining dense vectors, sparse BM25 keyword matching, and Reciprocal Rank Fusion) backed by two-stage cross-encoder reranking layers.
- Cost & Token Optimization: Drive LLM unit economics at scale. Implement advanced strategies for token optimization, context-window compression, semantic caching, and dynamic cost-based model routing (e.g., routing lookups to lightweight models and deep reasoning to frontier models).
- AI Governance, Security & Guardrails: Deploy production-ready enterprise safety nets. Enforce secure tool execution environments, source Access Control Lists (ACLs), data privacy and PII redaction, and automated LLM-as-a-judge evaluation frameworks (e.g., Ragas, TruLens) tracking faithfulness, context precision, and latency SLAs.
- Technical Leadership & DevOps: Lead, mentor, and establish best practices for a dedicated team of AI/ML engineers. Oversee containerization (Docker, Kubernetes) and inference server optimization (e.g., vLLM, PagedAttention) to achieve low-latency SLAs.
Technical Stack & Requirements
- Orchestration & Agents: Expert-level mastery of LangGraph (critical), LangChain, or LlamaIndex for state tracking and tool use.
- Data & Vector Infrastructure: Deep experience with enterprise vector databases (Pinecone, Milvus, Qdrant, pgvector) and robust extraction pipelines for complex enterprise documents (PDFs, financial tables).
- Models & Deployment: Hands-on experience with commercial APIs (OpenAI, Anthropic) and with deploying, fine-tuning, or quantizing open-source models (Llama, Mistral) via production engines like vLLM.
- Core Engineering: Strong Python foundation, asynchronous programming, microservices (FastAPI), and observability infrastructure (LangSmith, Weights & Biases).
- Experience: 10 years of software/data experience, with a minimum of 3 years in AI enterprise architecture and a proven track record of shipping end-to-end, production-ready enterprise GenAI products.
Argyll Scott Asia is acting as an Employment Agency in relation to this vacancy.