Staff Machine Learning Engineer

My IR
Full-time
On-site
Denver, Colorado, United States
$175,000 - $185,000 USD yearly
Introduction

About Us

At IR Labs, we are on a mission to revolutionize the way businesses harness the power of data. We are not just building products; we are shaping the future of business innovation with cutting-edge AI solutions that redefine industries and enhance everyday life for our customers. Our transformative AI and analytics solutions are designed to unlock new insights, drive innovation, and create competitive advantages for our customers. We are a passionate team of innovators dedicated to building groundbreaking technology. Join us as we lead the way in AI and analytics, transforming visionary ideas into impactful solutions. Together, we will redefine what it means to innovate and succeed in the digital age.



Description

Job Description

Are you a talented Machine Learning Engineer looking to make a significant impact in a rapidly evolving AI and machine learning innovation lab? Do you thrive in a fast-paced setting where your work bridges the gap between machine learning engineering, data infrastructure, and DevSecOps? If you have a passion for building scalable, high-performance ML systems that drive cutting-edge AI applications, we want you on our team!

As a Machine Learning Engineer at IR Labs, you will play a foundational role in designing, developing, and deploying core ML systems that power our products. You’ll work closely with data scientists, backend engineers, and DevSecOps experts to build, scale, and optimize ML workflows, ensuring seamless production deployment and robust infrastructure. If this sounds exciting to you, then we need to talk!

What You’ll Do

  • Serve as the foundational machine learning engineer, responsible for designing, developing, and deploying the core ML systems that will power the company’s products, enabling both data science exploration and production-level ML workflows.
  • Collaborate with data scientists, backend engineers, and DevSecOps experts to build, scale, and deploy ML models into production environments, ensuring robust and efficient delivery pipelines.
  • Develop and maintain end-to-end machine learning workflows, from data preprocessing and feature engineering to model training, evaluation, and deployment.
  • Establish and automate MLOps pipelines using tools such as Flyte, Ray, and MLflow, enabling seamless experimentation, version control, and reproducibility.
  • Design scalable and efficient data pipelines to support machine learning workflows, leveraging streaming technologies like Apache Kafka, Spark Streaming, or Flink for real-time use cases.
  • Optimize and configure NVIDIA GPUs for training and inference workloads, ensuring efficient use of hardware and high-performance computing (HPC) resources.
  • Work closely with product and engineering teams to translate business requirements into ML model designs and workflows that deliver actionable insights and value.
  • Ensure the observability and reliability of deployed ML systems by incorporating monitoring, logging, and alerting solutions with tools such as Prometheus, Grafana, and OpenTelemetry.
  • Optimize model serving and inference pipelines for low-latency and high-throughput use cases using frameworks like Triton Inference Server, Ray Serve, or TorchServe.
  • Create and maintain feature stores and centralized repositories for datasets, ensuring efficient sharing and reuse of data across the team.
  • Research and implement state-of-the-art machine learning algorithms, frameworks, and tools to improve model accuracy, scalability, and performance.


Skills And Experiences

Qualifications

Machine Learning Expertise

  • Extensive experience (8+ years) working on end-to-end machine learning projects, including data collection, preprocessing, feature engineering, model development, and production deployment.
  • Proficiency in Python for ML development, with strong expertise in PyTorch and experience with Numba/C++ for performance acceleration.
  • Strong understanding of statistical modeling, machine learning algorithms, and their tradeoffs, with hands-on experience implementing them for real-world problems.
  • Familiarity with model optimization techniques, such as quantization, pruning, and distributed training strategies, to improve performance.

MLOps & Automation

  • Proven ability to design and implement MLOps pipelines for model experimentation, versioning, deployment, and monitoring using tools like Flyte, MLflow, and Weights & Biases.
  • Experience automating and scaling training and inference workflows on AWS, leveraging services like EC2, EKS, and S3 for efficiency.
  • Hands-on experience with containerization (Docker/Podman) and orchestration (Kubernetes, AWS EKS) for scalable ML deployment.

Data Infrastructure Integration

  • Solid experience working with data lakes (e.g., Delta Lake), data pipelines, and batch/stream processing technologies (e.g., Kafka, Spark Streaming, or Flink).
  • Ability to work with large-scale datasets, ensuring efficient data handling, preprocessing, and feature extraction at scale.
  • Familiarity with metadata and feature management tools like Unity Catalog, Tecton, or Feast.

HPC and GPU Workloads

  • Deep understanding of NVIDIA GPUs, including optimizing GPU configurations for ML training and inference workloads, as well as working with CUDA libraries.
  • Experience with high-performance computing (HPC) environments, including networking, resource allocation, and troubleshooting distributed GPU workloads.

Observability & Performance Optimization

  • Expertise in monitoring the performance of ML models in production and implementing retraining workflows triggered by data drift or model decay.
  • Proficiency in integrating observability tools (e.g., Prometheus, Grafana) into ML workflows to monitor training and inference performance.
  • Experience optimizing inference pipelines for GPU-based acceleration using frameworks like Triton Inference Server, Ray Serve, or ONNX Runtime.

Collaboration & Learning

  • Proven ability to work cross-functionally with product managers, data engineers, and infrastructure teams to develop ML solutions that align with business goals.
  • Experience mentoring junior engineers and data scientists, establishing best practices for ML development and deployment.
  • Strong communication skills to articulate complex technical concepts and tradeoffs to both technical and non-technical stakeholders.

Nice to Haves

  • Educational Background: Bachelor’s or Master’s degree in Computer Science, Statistics, Mathematics, or a related technical field.
  • CUDA and C++: Hands-on experience with CUDA programming, NCCL, and C++ for implementing custom kernels or optimizing performance-critical ML workloads.
  • Streaming Expertise: Deep knowledge of integrating ML workflows with streaming platforms for real-time inference and feedback loops.
  • Cloud and Security: Experience implementing secure AI/ML workflows compliant with SOC 2, HIPAA, or GDPR standards in cloud environments.
  • Big Data: Hands-on experience with datasets at petabyte scale, optimizing pipelines for low-latency access and high throughput.

What We Offer

  • Culture: Join a passionate, driven team that values collaboration, innovation, and having fun while making a difference.
  • Impact: Be a key player in an early-stage innovation lab where your contributions directly influence the company's success and you get to help build from the ground up.
  • Innovation: Work on cutting-edge AI solutions that solve real-world problems and shape the future of technology.
  • Growth: Opportunity for personal and professional growth as the company scales.
  • Flexible Work Culture: Benefit from a flexible work environment that promotes work-life balance and remote work.
  • Competitive Compensation: Receive a competitive salary and benefits package, with eligibility for equity.
  • Medical, Dental, Vision Insurance
  • 401(k) with Employer Contributions
  • Paid Time Off
  • Health Savings Account (HSA) Contributions with High Deductible Health Plan
  • Short-Term/Long-Term Disability Insurance
  • And more!

Compensation Range:

  • $175,000 - $185,000 base compensation
  • $27,000 - $37,000 variable compensation

The actual compensation offered to a candidate may vary from the posted hiring range based on geographic location, work experience, education, and/or skill level. The pay ratio between base pay and target incentive (if applicable) will be finalized at offer.

At IR, we celebrate, support, and thrive on difference for the benefit of our employees, our products, and our community. We are proud to be an Equal Employment Opportunity employer and encourage applications from all suitable candidates; we never discriminate based on race, religion, national origin, gender identity or expression, sexual orientation, age, or marital, veteran, or disability status.