OKX - Big Data Platform Engineer, AI Agent Platform
Requirements
• 5+ years of experience building large-scale data platforms (Hadoop/Spark/Flink or equivalent)
• Deep expertise in distributed storage and compute systems (MaxCompute, Hologres, ClickHouse, Hive)
• Strong software engineering skills in Java, Scala, or Python; experience with API-first design
• Hands-on experience with task scheduling systems (Airflow, DolphinScheduler, or in-house equivalents)
• Solid understanding of multi-cloud architectures and cost governance
• Familiarity with LLM integration patterns: tool calling, RAG pipelines, context management
• Experience with MCP or similar agent-tool frameworks is a strong plus
• Passion for building systems that make other engineers 10x more productive
• IELTS score of 7 or above in all four components
• 3+ years of work experience in English-speaking regions
• Experience in cross-border e-commerce and familiarity with multi-country, multi-language architectural design is a plus
Responsibilities
• Platform Core: Design and operate large-scale distributed data systems
  • Own the big data compute and storage infrastructure (MaxCompute/ODPS, Hologres, Spark)
  • Build and maintain multi-site task orchestration that dynamically selects engines and enforces policy
  • Drive reliability and performance improvements across batch and real-time pipelines
• AI Integration: Build the AI-native platform layer
  • Develop and expose MCP (Model Context Protocol) tool interfaces so AI agents can interact with platform APIs
  • Build the scheduling and cost-optimization agents that auto-tune resource allocation and alert severity
  • Instrument platform telemetry to feed AI-driven SLA monitoring and anomaly detection
  • Design context retrieval pipelines (RAG / vector search) for SQL code and config knowledge bases
• Tooling & DX: Evolve the developer experience
  • Own the internal data development platform: IDE integrations, code review automation, deployment tooling
  • Build API-first tools (backfill, ingestion automation) designed for future MCP integration
  • Collaborate with data warehouse and service teams to define platform contracts
• Ops & Governance: Drive operational excellence
  • Establish SLA benchmarks, cost metrics, and latency dashboards as AI optimization targets
  • Build automated incident response and root-cause analysis pipelines
  • Define and enforce infrastructure policies across multi-cloud environments
Benefits
• L&D programs and education subsidy for employees' growth and development
• Various team-building programs and company events
• Wellness and meal allowances
• Comprehensive healthcare schemes for employees and dependants
• More that we would love to tell you about during the process!

All official OKX vacancies are published on this website. While roles may appear on selected third-party platforms from time to time, information on other sites may be inaccurate or outdated. If in doubt, please apply directly through our official careers website.

Information collected and processed as part of the recruitment process for any job application you choose to submit is subject to OKX's Candidate Privacy Notice.