Type: Full-time

Experience: 5+ years

Source: Greenhouse

About this role

Why RoboForce

RoboForce is an AI robotics company developing Physical AI–powered Robo-Labor for dull, dirty, and dangerous work. The company's robots are engineered for demanding industrial environments, with a focus on real-world deployment and scalability.

We are looking for a Senior / Staff AI Research Engineer, Data Infrastructure to build the data and learning engine behind RoboForce's Physical AI stack. In this role, you will own the full pipeline — from raw teleoperation and UMI device data collection through curation, annotation, and storage, to post-training infrastructure that scores demonstrations, identifies failure patterns, and closes the loop back into model retraining.

Responsibilities

• Design and maintain end-to-end data collection pipelines ingesting multimodal demonstration data from teleoperation devices and UMI hardware, including synchronization, versioning, and distributed storage at scale.

• Build annotation tooling and data curation workflows (quality filtering, deduplication, episode scoring, and domain reweighting) to produce high-quality training datasets for robot policy learning.

• Develop post-SFT reinforcement learning infrastructure: implement reward scoring on demonstrations, mine and categorize failure patterns, and feed curated failure data back into the retraining loop.

• Build evaluation and test infrastructure to log policy rollouts on-robot, capture structured results, and surface actionable diagnostics for the research team.

• Collaborate with ML researchers to define data schemas, episode formats, and pipeline interfaces that support rapid iteration on VLA and manipulation policy training.

• Architect scalable storage and retrieval systems for heterogeneous robot data (vision, proprioception, action, language) across both cloud and on-prem environments.

Requirements

• Bachelor's or Master's degree in Computer Science, Robotics, or a related field, with 5+ years of experience.

• Strong proficiency in Python and experience building production-grade data pipelines and ETL systems.

• Hands-on experience with large-scale dataset management, including versioning, deduplication, quality filtering, and distributed storage (e.g., S3, GCS, HDF5, WebDataset, Zarr).

• Experience building or working with post-training infrastructure: SFT pipelines, reward modeling, or RL training loops (e.g., PPO, DPO, rejection sampling).

• Familiarity with deep learning frameworks (PyTorch, JAX) and ML training workflows sufficient to collaborate tightly with research teams.

• In-office collaboration with the teams 5 days per week is required.

Bonus Qualifications

• Experience with robotics data collection hardware (teleoperation devices, UMI, GELLO, or similar) and the synchronization and preprocessing challenges it introduces.

• Familiarity with robot learning pipelines: imitation learning, behavior cloning, or VLA/VLM fine-tuning workflows.

• Experience building evaluation or experiment tracking infrastructure (e.g., Weights & Biases, MLflow, custom rollout loggers).

• Proven ability to design annotation tooling or human-in-the-loop labeling systems for structured or multimodal data.

Benefits

• Competitive stock options/equity programs.

• Health, dental, and vision insurance, plus a 401(k) plan.

• Visa sponsorship and green card support for qualified candidates.

• Lunches and dinners, a fully stocked kitchen, and regular team-building events.

Tech stack

Python, PyTorch
