# Poolside — Member of Engineering (Pre-training / Data Engineering)

| Field | Value |
|---|---|
| **Date found** | 2026-05-28 |
| **Company** | Poolside |
| **Role** | Member of Engineering (Pre-training / Data Engineering) |
| **Location** | EMEA Remote |
| **Salary** | Undisclosed |
| **Job URL** | https://www.linkedin.com/jobs/view/4390569541/ |
| **Status** | New |

---

## Company Research

| Field | Value |
|---|---|
| **Headquarters** | San Francisco, CA, USA |
| **Founded** | 2023 |
| **Employees** | 51–200 |
| **LinkedIn** | linkedin.com/company/poolside-ai |
| **Website** | poolside.ai |
| **Blog** | — |

- **Product:** Frontier AI company building code generation and developer-facing LLMs trained on software-specific data
- **Customers:** Enterprise developers and software teams requiring AI-powered code assistance
- **Notable:** Backed by Jeff Bezos and Nebius; raised $500M+ at ~$3B valuation; focused on frontier model pre-training for code

---

## Job Summary

**What they do:** Poolside is a frontier AI company building specialised LLMs for software development, with backing from Jeff Bezos and Nebius.

**The role:** Senior IC data engineer supporting pre-training infrastructure — owning data pipelines, processing at scale, and training data quality for frontier model development.

**Core work:**
- Build and maintain large-scale data processing pipelines for LLM pre-training (Polars, Dask, PySpark)
- Design and operate distributed training data infrastructure (Slurm, Airflow, Dagster)
- Ensure data quality and pipeline reliability for frontier model training runs

**Stack:** Python · Slurm · Airflow · Dagster · Docker · Kubernetes · Polars · Dask · PySpark · vLLM · Prometheus · Grafana

**Work style:** Fully remote EMEA; frontier AI startup environment with equity upside

---

## Score: 63%

| Dimension | Score | Justification |
|---|---|---|
| Agentic AI depth (25%) | 10% | Pre-training data engineering role — no agentic AI, LLM orchestration, or multi-agent system work; purely data infrastructure |
| Tech fit (25%) | 50% | Python, Docker/k8s, distributed computing present; but Slurm/Polars/Dask/PySpark are not Luca's primary stack; no LangChain/LangGraph/RAG |
| Remote fit (25%) | 100% | Fully remote EMEA — perfect fit |
| Company culture fit (15%) | 90% | Frontier AI startup (51–200), backed by Bezos/Nebius, fast-moving, high-autonomy culture signals — excellent culture fit |
| IC/leadership balance (10%) | 100% | Pure IC senior engineer role with no management signals |
| **Final (weighted)** | **63%** | |
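
The final figure can be checked by recomputing the weighted sum from the table. The sketch below is only a sanity check: the dictionary keys and the `weighted_score` helper are illustrative names, not part of the original scoring method. It uses integer percentages to avoid floating-point noise. The exact weighted sum comes out at 63.5, so the 63% in the table appears to be rounded down.

```python
# Weights and scores copied from the scoring table, as integer percentages.
# Keys and function name are hypothetical labels chosen for this sketch.
DIMENSIONS = {
    "agentic_ai_depth": (25, 10),
    "tech_fit": (25, 50),
    "remote_fit": (25, 100),
    "company_culture_fit": (15, 90),
    "ic_leadership_balance": (10, 100),
}


def weighted_score(dimensions):
    """Return the weighted average score in percent."""
    total_weight = sum(weight for weight, _ in dimensions.values())
    if total_weight != 100:
        raise ValueError(f"weights sum to {total_weight}, expected 100")
    # Integer arithmetic for the sum, a single division at the end.
    return sum(weight * score for weight, score in dimensions.values()) / 100


final = weighted_score(DIMENSIONS)
print(f"{final:.1f}%")  # prints 63.5%
```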

---

## Strengths

- Exceptional company: frontier AI startup (Bezos + Nebius-backed), ~$3B valuation, focused on frontier model pre-training for code
- Fully remote EMEA — perfect location fit
- Pure IC senior engineer role — no management pressure
- High culture fit: small, fast-moving, technically elite team

---

## Weaknesses & Risks

- Zero agentic AI depth — pre-training data engineering is the opposite end of the AI stack from Luca's expertise
- Stack mismatch: Slurm/Polars/Dask/PySpark/Dagster are not Luca's primary tools (though Python/Docker overlap exists)
- Salary undisclosed — no signal on compensation despite high valuation
- Role is infrastructure for model training, not building AI products or systems — significant career direction risk
- The 10% agentic AI depth score reflects a genuine skills and interest mismatch, not just a stack gap

---

## Suggestions

- Only pursue if Poolside's mission (frontier model training) is itself a strong draw — the work is data infrastructure, not agentic AI
- If applying, foreground the data infrastructure experience from the cybersecurity and Fenergo roles (ETL, data processing, AWS)
- Consider reaching out to ask if any applied AI / agentic roles exist at Poolside beyond pre-training infra
- The company quality is excellent but this specific role is a poor fit for Luca's core specialism

---

## Interview Tracker

| Stage | Date | Notes |
|---|---|---|
| Applied | | |
| Recruiter screen | | |
| Technical interview | | |
| Final round | | |
| Offer / Outcome | | |