RapidFort, Inc. - Senior Systems Engineer (Distributed Systems & Performance)
Requirements
• Strong experience building distributed systems or large-scale backend infrastructure
• Deep understanding of systems performance (CPU, memory, disk I/O, networking)
• Experience optimizing workloads for throughput and efficiency

Programming
• Strong Bash / shell scripting
• Ability to implement and reason about algorithms and system-level logic

Systems Knowledge
• Experience with parallel processing, distributed job execution, or large data pipelines
• Familiarity with Linux systems, resource scheduling, and performance tuning
• Understanding of networked systems and distributed coordination

Engineering Approach
• Strong data-driven mindset with focus on measurement and experimentation
• Experience building observability, metrics, and instrumentation
• Ability to debug complex systems in production environments
• Experience with high-performance computing (HPC) workloads
• Experience with containerized environments (Docker/Kubernetes)
• Background in large-scale data processing or distributed compute frameworks
• Familiarity with performance profiling tools and system tracing

What You’ll Work On
• Designing custom distributed compute frameworks
• Building efficient algorithms to process large-scale data workloads
• Optimizing compute pipelines across CPU, disk, and network resources
• Developing instrumentation and performance analytics
• Improving system efficiency through continuous measurement and experimentation
Responsibilities
System Architecture
• Design and implement scalable distributed systems that handle heavy CPU, disk, and network workloads.
• Architect systems for high throughput, reliability, and efficient resource utilization.
• Develop distributed algorithms and data processing pipelines.

Performance & Optimization
• Analyze system behavior to identify bottlenecks across compute, storage, and network layers.
• Optimize workloads for maximum efficiency and minimal resource waste.
• Develop strategies for parallelization, batching, and workload scheduling.

Engineering & Implementation
• Implement system components and tooling primarily in Python and Bash.
• Build custom orchestration, automation, and distributed job execution mechanisms.
• Write efficient algorithms and low-level logic to manage large-scale workloads.

Observability & Data-Driven Engineering
• Build instrumentation, metrics, and telemetry to measure system performance.
• Develop dashboards and analysis workflows to guide optimization decisions.
• Use empirical data and experimentation to improve system behavior.

Infrastructure & Reliability
• Design systems that operate reliably across distributed environments.
• Implement monitoring, debugging, and recovery mechanisms for large-scale systems.
• Collaborate with infrastructure and platform teams to ensure smooth deployment and operation.