# Ruby Labs — AI Engineer

| Field | Value |
|---|---|
| **Date found** | 2026-06-03 |
| **Company** | Ruby Labs |
| **Role** | AI Engineer |
| **Location** | EU Remote (±4h CET) |
| **Salary** | Undisclosed (contractor/independent agreement) |
| **Job URL** | [LinkedIn](https://www.linkedin.com/jobs/view/4422661583/) |
| **Status** | Expired (posting closed 2026-06-09) |

---

## Company Research

| Field | Value |
|---|---|
| **Headquarters** | London, UK |
| **Founded** | 2018 |
| **Employees** | 201–500 (236 on LinkedIn) |
| **LinkedIn** | [Ruby Labs](https://www.linkedin.com/company/rlabs-company/) — 30,953 followers |
| **Website** | rubylabs.com |
| **Blog** | — |

- **Product:** Consumer apps in health, education, and entertainment — seven products shipped to global markets.
- **Customers:** Mass-market consumers worldwide.
- **Notable:** 174% 2-year employee growth; operates via independent contractor agreements (not traditional employment).

**Indeed rating:** 4.5/5 overall (culture 3.4, work-life balance 4.0, management 3.7, compensation 3.7)

---

## Job Summary

**What they do:** Ruby Labs builds and operates consumer-facing AI-powered products across health, education, and entertainment verticals.

**The role:** Senior AI Engineer (IC), owning LLM pipeline design, evaluation, and observability end-to-end for production consumer features.

**Core work:**
- Build complex LLM workflows with LangChain/LlamaIndex — prompt templates, structured outputs (Zod/JSON schemas), function calling
- Develop evaluation pipelines and LLM observability using Langfuse (tracing, scoring, feedback loops)
- Run systematic A/B tests across models via OpenRouter and make data-driven deployment decisions

**Stack:** Node.js · Next.js · TypeScript · LangChain · LlamaIndex · Langfuse · OpenRouter · Python (nice-to-have) · RAG

**Work style:** Fully remote, CET ±4h. Contractor agreement (not employment). Fast-moving, high-accountability culture.

---

## Score: 55%

| Dimension | Score | Justification |
|---|---|---|
| Agentic AI depth (25%) | 20% | Basic LLM integration: prompt engineering, eval pipelines, A/B tests. No multi-agent orchestration or autonomous workflows. |
| Tech fit (25%) | 30% | Node.js/TypeScript is the primary required stack; Python is only "nice to have" — significant gap for a Python-primary candidate. |
| Remote fit (25%) | 100% | Fully remote EU, flexible CET ±4h window. |
| Company culture fit (15%) | 55% | Fast-growing consumer tech (174% 2-year growth), but the products are health/education/entertainment apps rather than AI-native agentic systems. |
| IC/leadership balance (10%) | 95% | Hands-on IC with full ownership of features from prototype to production. |
| **Final (weighted)** | **55%** | |
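
The final figure can be checked with a short calculation. This is a minimal sketch, assuming the final score is a plain weighted average of the dimension scores rounded to the nearest percent; the names `DIMENSIONS` and `weighted_score` are illustrative, not part of any tool used to produce this sheet.

```python
# Weights and scores copied from the table above, both as percent values.
# Assumption: the final score is a simple weighted average.
DIMENSIONS = {
    "Agentic AI depth": (25, 20),
    "Tech fit": (25, 30),
    "Remote fit": (25, 100),
    "Company culture fit": (15, 55),
    "IC/leadership balance": (10, 95),
}


def weighted_score(dimensions):
    """Return the weighted average of the scores, in percent."""
    total_weight = sum(weight for weight, _ in dimensions.values())
    weighted_sum = sum(weight * score for weight, score in dimensions.values())
    return weighted_sum / total_weight


final = weighted_score(DIMENSIONS)
print(f"Weighted score: {final:.2f}% (reported as {round(final)}%)")
```

Under these assumptions the unrounded value is 55.25%, which matches the 55% shown in the table. Remote fit alone contributes 25 of those points, so the headline score is driven more by logistics than by stack or agentic depth.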

---

## Strengths

- Fully remote EU with no office commitment
- Solid LLM tooling (LangChain for orchestration, Langfuse for observability), with some stack overlap
- Hands-on IC role with direct ownership and unlimited PTO

---

## Weaknesses & Risks

- **Primary stack is TypeScript/Node.js, not Python** — hard gap; Python is explicitly listed as "nice to have" only
- **Contractor agreement** (not employment) — tax/legal complexity; no traditional employment protections
- **Agentic depth is minimal** — no multi-agent systems, just LLM integrations and evaluation pipelines
- **Consumer app domain** (health/education/entertainment), not the kind of AI-native agentic startup Luca targets
- Salary undisclosed; the independent contractor structure complicates comparison with salaried offers

---

## Suggestions

- Only apply if comfortable with TypeScript/Node.js as primary delivery language
- Clarify compensation and the terms of the contractor agreement before investing time
- Emphasise evaluation pipeline experience, Langfuse/observability, and data-driven model selection work

---

## Interview Tracker

| Stage | Date | Notes |
|---|---|---|
| Expired | 2026-06-09 | Posting expired — no longer accepting applications |
| Applied | | |
| Recruiter screen | | |
| Technical interview | | |
| Final round | | |
| Offer / Outcome | | |