RL Environments Engineer - Low-Level Engineering & Kernel Inference Optimization

90 - 125 USD net per hour - B2B

BI & Data

San Carlos St 333, London +4 locations

Preference Model via XOR Inc.

Working time

Full-time

Contract type

B2B

Experience

Senior specialist / Senior

Work mode

Fully remote

Job description

About the company

XOR is hiring exclusively on behalf of our partner Preference Model.

Preference Model is building the next generation of training data to power the future of AI. Today's models are powerful but fail to reach their potential across diverse use cases, because many of the tasks we want them to perform lie outside their training data distribution. Preference Model creates reinforcement learning environments that encapsulate real-world use cases, enabling AI systems to practice, adapt, and learn from feedback grounded in reality. We seek to bring the real world into distribution for the models.

Our founding team has previous experience on Anthropic’s data team building data infrastructure, tokenizers, and datasets behind the Claude model. We are partnering with leading AI labs to push AI closer to achieving its transformative potential.

The company has closed a large Seed round from Tier-1 VCs in Silicon Valley and is working with top AI labs, which inform its priorities and timelines.

XOR runs the end-to-end hiring process for this role (screening, take-home, and coordination with the Preference Model team). Please apply through this posting to be considered.

Brief Description of the Role

We're hiring Low-Level Engineers to design and build RL environments that teach LLMs kernel development, hardware optimization, and systems programming. The goal is to create realistic feedback loops where models learn to write high-performance code across GPU and CPU architectures.

This is a remote contractor role. At least 4 hours of overlap with PST and advanced English (C1/C2) are required.

Requirements

Minimum Qualifications

Strong Python (engineering-quality, not notebook-only)
Production mindset (debugging, reliability, iteration speed)
Clear understanding of LLMs and their current limitations
Ability to meet throughput expectations and respond quickly to feedback

You may be a good fit if one of the following applies

Deep understanding of memory hierarchies (registers, L1/L2/shared memory, HBM, system RAM) and their performance implications
Threading models, synchronization primitives, and concurrent programming (warps, thread blocks, barriers, atomics)
Cache coherence, memory access patterns, coalescing, and bank conflicts
JIT compilation frameworks (e.g., Triton, JAX/XLA, TorchInductor, Numba)
AOT compilation and optimization passes (LLVM, MLIR, TVM)
Compiler and kernel frameworks such as CUTLASS, BitBLAS, or JAX/Pallas
Modern C++, including templates, concurrency, and build systems
Assembly-level programming and low-level optimization across GPU and CPU architectures (e.g., x86, ARM, NVIDIA Hopper, NVIDIA Blackwell)
Debugging and optimizing GPU kernels using CUDA and/or HIP/ROCm
Developing PyTorch custom operators, backend extensions, or dispatcher integrations (e.g., ATen, TorchScript, or custom backends)
Customizing, extending, or optimizing c, including distributed inference workflows
GPU communication libraries and collectives, such as NVIDIA NCCL, AMD RCCL, MPI, or UCX
Mixed-precision and low-precision kernels (e.g., FP16, BF16, FP8, INT8), including numerical stability and performance trade-offs

Compensation

Hourly contractor rate: $90-$125 USD/hour (depending on expertise level and the quality of the take-home assignment).
Monthly performance bonuses
40 hours per week - fully remote independent contractor role

Process

1) Apply via the job board

Please submit your CV and add a short note on which track fits you best.

2) Short take-home assignment (form)

After you apply, XOR will share a short take-home in the format of a form with a small task.
The Preference Model technical team will review your submission.
In parallel, you can schedule a short call with XOR to learn more about the role and the company and ask questions.

3) Team lead interview

If the take-home looks strong, we will schedule a technical interview with the Preference Model team.

4) Second take-home assignment (coding task)

The final decision is made after the second take-home assignment.

Note on take-home compensation

Time spent on the take-home can be compensated if you receive an offer.

Required skills

GenAI

Python

CI/CD

C++

LLMs

Linux

Language skills

English: C1
