Segment + Dedup + Crop pipeline step (SAM3 + Recursive SAM3)

First real segmentation stage. Ports SAM3 + dedup + crop from checkin-pipeline/, adds a RecursiveSegmenter wrapper, and wires the existing testing & settings pages to live backends.

Spec (draft): docs/superpowers/specs/2026-05-15-segment-dedup-crop-step-design.md

What this does

One pipeline stage, three concerns kept together: segment → dedup → crop. The segmenter is the only swappable piece; dedup and crop are always part of the same step. Two segmenter implementations ship in this slice:

  • Sam3Segmenter — single class, two backends (Segmind SaaS API and the internal DGX HuggingFace server) chosen by config.
  • RecursiveSegmenter — wraps any Segmenter; runs N passes, paints out detected masks with neutral gray before each subsequent pass, stops early when a pass finds nothing.

Configuration lives in a singleton segmenter_settings row (JSONB payload, in-code defaults). The admin testing page becomes a live tool that calls a new stateless POST /admin/api/test/segmentation endpoint. This slice follows Approach B from brainstorming (a feature module with a thin orchestrator entry point). Approach A (a single bulky stage function) and Approach C (porting the full pipeline_runs strategy-selection machinery) were rejected.

Only the segmentation stage changes. Orchestrator, neighboring stages, and the PipelineRun audit row are untouched.

01 Context

The destination repo's pipeline orchestrator is a stub. stages.py:22–24:

async def segmentation_stage(session: AsyncSession, deal: Deal) -> None:
    """Stub. Real implementation arrives in a later slice."""
    return None

The source — checkin-pipeline/ (a separate FastAPI/HTMX app vendored into this repo as a reference) — already has working segment, dedup, and crop, but split across three separate pipeline steps with operator-driven strategy selection between them. That extra ceremony is not wanted here. This slice imports the algorithms (SAM3 client, dedup, crop) and folds them into a single stage in the destination's simpler orchestrator (orchestrator.py:23–60, one transaction per stage, Procrastinate retry on raise).

The testing page (web/src/routes/admin.testing.segmentation.tsx) and the settings page (admin.settings.segmenter.tsx) already exist as scaffolding. The testing page renders bbox overlays from a seeded PRNG and hardcodes four fake segmenter IDs. The settings page is an AdminPlaceholder. Both become real surfaces here.

02 Scope

In scope

  • New backend package features/segmentation/ (~9 files) — Protocol, SAM3, Recursive, dedup, crop, combined pipeline, settings, config, step.
  • Alembic migration adding segmenter_settings singleton table (id=1 CHECK constraint, JSONB payload).
  • stages.py:22 rewired: segmentation_stage = step.run_segmentation_stage.
  • POST /admin/api/test/segmentation — stateless test endpoint.
  • GET/PUT /admin/api/segmenter/settings — singleton CRUD with masked API key.
  • Real testing page UI (drop sample images, add upload, recursive controls, dedup toggle, load-from-settings).
  • Real settings page form replacing the placeholder.
  • ~10 test files across backend & frontend.

Out of scope (deferred or rejected)

  • Other segmenter strategies (gemini, gemini_pro, gemini_groups, gemini_sam3, category_sam3, sam3_gemini_merge) — each is a follow-up PR.
  • /segment/classify (Gemma3 VLM captioning) — explicitly dropped; categorization is its own stage.
  • pipeline_runs strategy-selection model — orchestrator runs one segmenter per deal.
  • Per-deal segmenter override — settings are global.
  • Per-image partial success — one bad image fails the whole deal; orchestrator retries.
  • Settings history / audit table — last-write-wins.
  • Server-side overlay rendering (PIL ImageDraw PNG artifacts) — FE draws from coords.
  • Deal-detail bbox overlay outside the testing page.
  • Segmenter comparison mode (source's segmenter_compare.py).
  • Bbox-only recursion fallback — MissingMaskError instead.

03 Architecture

One package, one stage entry point, two segmenter classes. build_segmenter is the only place that knows the kind→class mapping.

Package layout

features/segmentation/
├── segmenter.py        Segmenter Protocol + Detection + SegmentationResult
├── sam3.py             Sam3Segmenter (Sam3Backend.SEGMIND | INTERNAL)
├── recursive.py        RecursiveSegmenter(inner: Segmenter, cfg)
├── dedup.py            deduplicate_boxes(...) — direct port
├── crop.py             crop_masked + crop_boxes — direct port
├── pipeline.py         run_segment_dedup_crop(image, segmenter, cfg, out_dir)
├── settings.py         SegmenterSettingsRow (DB) + get/put + DEFAULT_SETTINGS
├── config.py           Pydantic configs + build_segmenter(settings, http) factory
└── step.py             run_segmentation_stage(session, deal)

Per-image data flow

image_path ──► Segmenter.segment(path, params)      (SAM3 or Recursive(SAM3))
                 │
                 └► SegmentationResult { detections, image_w/h, segmenter_id, pass_stats? }
                      │
                      └► deduplicate_boxes(boxes, scores, iou, containment) ──► kept_indices
                           │
                           └► crop_masked | crop_boxes (per use_mask_crops) ──► [PNG paths]
                                │
                                └► persist_segmentation_result(session, deal, image, ...)

Atomic crops: staging dir, rename on success. Crops are written to data/deals/{deal_id}/runs/{run_id}/staging/ during the stage, then atomically renamed to crops/ at end-of-stage. On rollback the staging dir is deleted, so a retry never sees DB ↔ disk drift.
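
The staging-then-rename behaviour described above can be tried out in isolation. The following is a minimal sketch using only the standard library; `run_with_staging`, `write_crops`, and `demo` are hypothetical names for illustration and are not part of the ported code.

```python
import shutil
import tempfile
from pathlib import Path


def run_with_staging(run_dir: Path, write_crops) -> Path:
    """Write crops into run_dir/staging, then promote the directory to run_dir/crops.

    `write_crops` is a hypothetical callback that receives the staging
    directory and writes files into it; it may raise.
    """
    staging = run_dir / "staging"
    final = run_dir / "crops"
    staging.mkdir(parents=True, exist_ok=False)
    try:
        write_crops(staging)
        # A rename within one filesystem is a single metadata operation,
        # so readers see either no crops/ directory or a complete one.
        staging.rename(final)
    except Exception:
        # Roll back: remove partial output so a retry starts clean.
        shutil.rmtree(staging, ignore_errors=True)
        raise
    return final


def demo() -> tuple[bool, bool, bool]:
    """Exercise one successful run and one failing run inside a temp dir."""
    with tempfile.TemporaryDirectory() as tmp:
        ok_dir = Path(tmp) / "run_ok"
        final = run_with_staging(ok_dir, lambda d: (d / "0.png").write_bytes(b"x"))
        promoted = (final / "0.png").exists() and not (ok_dir / "staging").exists()

        bad_dir = Path(tmp) / "run_bad"

        def failing(d: Path) -> None:
            (d / "0.png").write_bytes(b"x")
            raise RuntimeError("segmenter failed")

        try:
            run_with_staging(bad_dir, failing)
            raised = False
        except RuntimeError:
            raised = True
        cleaned = not (bad_dir / "staging").exists() and not (bad_dir / "crops").exists()
    return promoted, raised, cleaned
```

The rename is only atomic when staging/ and crops/ live on the same filesystem, which holds here because both sit under the same run directory.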

04 The Segmenter Protocol

class Detection(BaseModel):
    bbox: tuple[float, float, float, float]   # x1, y1, x2, y2 in image pixels
    score: float
    label: str | None = None
    mask_b64: str | None = None               # PNG-encoded full-image mask

class SegmentationResult(BaseModel):
    detections: list[Detection]
    image_width: int
    image_height: int
    segmenter_id: str                          # "sam3" or "recursive(sam3)"
    elapsed_ms: int
    pass_stats: list[dict] | None = None       # recursive only

class Segmenter(Protocol):
    id: str
    async def segment(self, image_path: Path, params: dict) -> SegmentationResult: ...

Choices

  • Absolute-pixel bboxes. The Protocol boundary speaks pixels. The web layer divides by image_width/height for its 0..1 overlay coords.
  • Mask on Detection. Riding mask_b64 on each Detection (not a parallel array) means RecursiveSegmenter unions across passes with no re-alignment.
  • Per-call params dict. Each segmenter validates its own subset. The typed SegmenterConfig is upstream; per-call overrides land here for the testing endpoint.
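
The pixel-to-unit conversion that the web layer performs is simple arithmetic. The real adapter is a TypeScript file in the web app; the Python sketch below only illustrates the calculation, and `PixelBox` and `to_unit_coords` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PixelBox:
    """A bbox in absolute image pixels, matching Detection.bbox ordering."""
    x1: float
    y1: float
    x2: float
    y2: float


def to_unit_coords(
    box: PixelBox, image_width: int, image_height: int
) -> tuple[float, float, float, float]:
    """Scale a pixel bbox into the 0..1 range used by the overlay."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    # x coordinates scale by width, y coordinates by height.
    return (
        box.x1 / image_width,
        box.y1 / image_height,
        box.x2 / image_width,
        box.y2 / image_height,
    )
```

Keeping the division at the edge means every backend consumer (dedup, crop, paint-out) works in one coordinate system.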

05 Sam3Segmenter

Direct port of checkin-pipeline/segmind_client.py:153 (call_sam3_dispatch), repackaged as a Segmenter implementation.

class Sam3Backend(str, Enum):
    SEGMIND = "segmind"
    INTERNAL = "internal"

class Sam3Config(BaseModel):
    backend: Sam3Backend
    endpoint: str
    api_key: SecretStr | None = None      # required for SEGMIND
    text_prompt: str | None = None
    threshold: float = 0.5
    mask_threshold: float = 0.5           # internal only
    return_masks: bool = True             # segmind only
    points_per_side: int = 32             # segmind only
    pred_iou_thresh: float = 0.88         # segmind only
    max_masks: int = 100                  # segmind only

class Sam3Segmenter:
    id = "sam3"
    def __init__(self, cfg: Sam3Config, http: httpx.AsyncClient): ...
    async def segment(self, image_path, params) -> SegmentationResult:
        # 1. merge params over self.cfg (params win)
        # 2. dispatch to _call_segmind or _call_internal by self.cfg.backend
        # 3. normalize both responses to SegmentationResult
        ...  # body elided in this spec

No classify: enable_classify dropped from scope. The source's call_internal_sam3_classify compresses segment + Gemma3 VLM captioning into one round-trip. In this repo, categorization_stage owns labelling. Keeping a flag for a feature we don't use just adds dead surface, so it is removed entirely from Sam3Config and not ported.

Failure modes: missing API key for SEGMIND raises at construction; HTTP non-2xx wraps in Sam3Error(status, body); empty response returns SegmentationResult(detections=[]) (not an error). httpx.AsyncClient is injected (already used by the HubSpot integration) rather than the source's requests.
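
Step 1 of segment ("params win") and the construction-time API key check might look like the sketch below. It uses a frozen dataclass as a stand-in for the Pydantic Sam3Config so it runs without third-party packages. `Sam3ConfigSketch` and `merge_params` are hypothetical names, and rejecting unknown keys is an assumption about how "validates its own subset" is enforced.

```python
from dataclasses import dataclass, fields, replace
from typing import Any, Optional


@dataclass(frozen=True)
class Sam3ConfigSketch:
    """Stand-in for Sam3Config with only a few of its fields."""
    backend: str                      # "segmind" or "internal"
    endpoint: str
    api_key: Optional[str] = None
    text_prompt: Optional[str] = None
    threshold: float = 0.5

    def __post_init__(self) -> None:
        # Mirrors "missing API key for SEGMIND raises at construction".
        if self.backend == "segmind" and not self.api_key:
            raise ValueError("backend=segmind requires api_key")


def merge_params(cfg: Sam3ConfigSketch, params: dict[str, Any]) -> Sam3ConfigSketch:
    """Return a copy of cfg with per-call params applied on top."""
    known = {f.name for f in fields(cfg)}
    unknown = set(params) - known
    if unknown:
        # Assumption: unknown keys are an error rather than silently dropped.
        raise ValueError(f"unknown params: {sorted(unknown)}")
    # replace() re-runs __post_init__, so the merged config is re-validated.
    return replace(cfg, **params)
```

Because the merge returns a new object, the persisted config is never mutated by a testing-page override.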

06 RecursiveSegmenter

Wraps any Segmenter. Runs N passes; paints out detected masks with neutral gray before each subsequent pass; stops early when a pass yields fewer than min_new_detections.

class RecursiveSegmenterConfig(BaseModel):
    max_passes: int = 2
    min_new_detections: int = 1
    fill_color: tuple[int, int, int] = (128, 128, 128)
    fill_dilate_px: int = 3

class MissingMaskError(RuntimeError):
    """Inner segmenter returned a detection without a mask. Cannot paint out."""

class RecursiveSegmenter:
    def __init__(self, inner: Segmenter, cfg: RecursiveSegmenterConfig):
        self.inner = inner
        self.cfg = cfg

    @property
    def id(self) -> str:
        return f"recursive({self.inner.id})"

    async def segment(self, image_path: Path, params: dict) -> SegmentationResult:
        all_dets: list[Detection] = []
        pass_stats: list[dict] = []
        work_path = image_path
        tmp_files: list[Path] = []
        try:
            for pass_idx in range(self.cfg.max_passes):
                result = await self.inner.segment(work_path, params)
                pass_stats.append({"pass": pass_idx, "found": len(result.detections),
                                   "elapsed_ms": result.elapsed_ms})
                all_dets.extend(result.detections)
                if len(result.detections) < self.cfg.min_new_detections:
                    break
                if pass_idx == self.cfg.max_passes - 1:
                    break
                work_path = self._paint_out(image_path, work_path, result.detections)
                tmp_files.append(work_path)
        finally:
            for p in tmp_files:
                p.unlink(missing_ok=True)
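        # Assumes max_passes >= 1, so `result` is bound. Taking dimensions from the
        # last pass and summing elapsed_ms across passes are assumptions of this spec sketch.
        return SegmentationResult(
            detections=all_dets, image_width=result.image_width,
            image_height=result.image_height, segmenter_id=self.id,
            elapsed_ms=sum(s["elapsed_ms"] for s in pass_stats), pass_stats=pass_stats,
        )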

Fail loud: no bbox-only fallback. If the inner segmenter returns a Detection without mask_b64, _paint_out raises MissingMaskError. Painting bare rectangles leaves edges that the next pass latches onto as phantom detections; degraded silence is worse than a visible failure here.

Choices

  • Injection over inheritance. Inner is constructor-injected. RecursiveSegmenter(inner=Sam3Segmenter(...)) is the only construction shape, and it works symmetrically for any future segmenter.
  • Original is never modified. Painted images go to lossless PNG temp files, cleaned up in finally. Downstream crop reads from the original path.
  • Stop on absolute count. Early stop uses len(result.detections) < min_new_detections, not a delta from the previous pass. With the default min_new_detections = 1 this reads as "stop when this pass found nothing".
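
The paint-out step itself is not shown in this spec. The sketch below illustrates the idea (union the masks, dilate by fill_dilate_px, fill with neutral gray, leave the original untouched) on plain nested lists so it runs with the standard library alone. The real _paint_out would decode mask_b64 PNGs and write a temp PNG file; the function names here and the square dilation shape are assumptions.

```python
from typing import Optional

Pixel = tuple[int, int, int]
Image = list[list[Pixel]]
Mask = list[list[bool]]


class MissingMaskError(RuntimeError):
    """A detection arrived without a mask, so nothing can be painted out."""


def union_masks(masks: list[Optional[Mask]], height: int, width: int) -> Mask:
    """OR all masks together; a missing mask is a hard failure."""
    out = [[False] * width for _ in range(height)]
    for mask in masks:
        if mask is None:
            raise MissingMaskError("detection has no mask")
        for y in range(height):
            for x in range(width):
                out[y][x] = out[y][x] or mask[y][x]
    return out


def dilate(mask: Mask, radius: int) -> Mask:
    """Grow every set pixel into a (2*radius+1) square, clipped to the image."""
    height, width = len(mask), len(mask[0])
    out = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    out[ny][nx] = True
    return out


def paint_out(
    image: Image,
    masks: list[Optional[Mask]],
    fill: Pixel = (128, 128, 128),
    dilate_px: int = 3,
) -> Image:
    """Return a new image with the dilated mask union replaced by `fill`."""
    height, width = len(image), len(image[0])
    covered = dilate(union_masks(masks, height, width), dilate_px)
    # Build a fresh grid so the caller's image is never modified.
    return [
        [fill if covered[y][x] else image[y][x] for x in range(width)]
        for y in range(height)
    ]
```

The dilation is what hides the thin ring of object pixels that a tight mask tends to leave behind, which is the same edge problem that motivates rejecting the bbox-only fallback.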

07 Combined segment + dedup + crop

One function, three concerns, never split. No "skip dedup" or "skip crop" flag — only the how is configurable.

class DedupConfig(BaseModel):
    iou_threshold: float = 0.5
    containment_threshold: float = 0.80

class CropConfig(BaseModel):
    use_mask_crops: bool = True
    output_format: Literal["png", "jpg"] = "png"

class CombinedConfig(BaseModel):
    dedup: DedupConfig
    crop: CropConfig

class CombinedResult(BaseModel):
    segmentation: SegmentationResult        # pre-dedup detections preserved
    kept_indices: list[int]                 # which survived dedup
    crops: list[CropOutput]                 # one per kept
    elapsed_ms: dict[str, int]              # {"segment": ..., "dedup": ..., "crop": ...}

async def run_segment_dedup_crop(
    image_path: Path, segmenter: Segmenter, cfg: CombinedConfig,
    crop_out_dir: Path, segmenter_params: dict | None = None,
) -> CombinedResult: ...

Why preserve pre-dedup: both raw + survivors on the result. kept_indices (not a filtered list) and segmentation (the raw pre-dedup result) live side-by-side. The testing page renders "12 raw, 8 kept", with rejected detections faded, at zero cost. The production step ignores both fields and just uses crops.

dedup.py and crop.py are direct ports from checkin-pipeline/app/dedup.py and the two crop functions in checkin-pipeline/segment.py. No behavior changes. Crop failure on a single detection logs and skips that crop — matches crop_masked's fallback when a mask is empty.
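
The ported dedup code is not reproduced in this spec. As an illustration of what IoU suppression plus containment suppression can look like, here is a greedy, score-ordered sketch. The function name `dedup_sketch`, the "higher score wins" ordering, and the use of the smaller box's area for containment are all assumptions and may differ from the source's deduplicate_boxes.

```python
Box = tuple[float, float, float, float]  # x1, y1, x2, y2 in pixels


def area(b: Box) -> float:
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def intersection(a: Box, b: Box) -> float:
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)


def dedup_sketch(
    boxes: list[Box],
    scores: list[float],
    iou_threshold: float = 0.5,
    containment_threshold: float = 0.80,
) -> list[int]:
    """Return indices of surviving boxes, in original order."""
    # Visit boxes from highest to lowest score so stronger detections win.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept: list[int] = []
    for i in order:
        suppressed = False
        for j in kept:
            inter = intersection(boxes[i], boxes[j])
            union = area(boxes[i]) + area(boxes[j]) - inter
            iou = inter / union if union > 0 else 0.0
            # Containment: share of the smaller box covered by the overlap.
            smaller = min(area(boxes[i]), area(boxes[j]))
            containment = inter / smaller if smaller > 0 else 0.0
            if iou >= iou_threshold or containment >= containment_threshold:
                suppressed = True
                break
        if not suppressed:
            kept.append(i)
    return sorted(kept)
```

Returning indices rather than a filtered list is what lets CombinedResult keep the raw detections alongside the survivors.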

08 Step orchestration + persisted settings

Step entry point

async def run_segmentation_stage(session: AsyncSession, deal: Deal) -> None:
    settings = await get_segmenter_settings(session)
    http     = get_http_client()
    segmenter = build_segmenter(settings, http)
    cfg       = settings.combined

    staging = staging_dir_for(deal)
    try:
        for image in deal.images:
            image_path = resolve_image_path(image)
            result = await run_segment_dedup_crop(
                image_path=image_path, segmenter=segmenter, cfg=cfg,
                crop_out_dir=staging / str(image.id),
            )
            await persist_segmentation_result(session, deal, image, result)
        staging.rename(final_crops_dir_for(deal))    # atomic on full-stage success
    except Exception:
        shutil.rmtree(staging, ignore_errors=True)
        raise

Decision: bubble up, no per-image try/except. One bad image fails the whole stage. The orchestrator's per-stage transaction rolls back and Procrastinate retries the whole deal. This avoids designing a per-image status surface (new column, UI to display) for a failure mode we have no data on yet. If empirical SAM3 flakiness later shows we need partial success, it's an additive change.

Config + factory

class SegmenterKind(str, Enum):
    SAM3 = "sam3"
    RECURSIVE_SAM3 = "recursive_sam3"

class SegmenterConfig(BaseModel):
    kind: SegmenterKind
    sam3: Sam3Config
    recursive: RecursiveSegmenterConfig | None = None

    @model_validator(mode="after")
    def _recursive_requires_block(self):
        if self.kind == SegmenterKind.RECURSIVE_SAM3 and self.recursive is None:
            raise ValueError("kind=recursive_sam3 requires `recursive` config")
        return self

def build_segmenter(s: SegmenterSettings, http: httpx.AsyncClient) -> Segmenter:
    sam3 = Sam3Segmenter(s.segmenter.sam3, http)
    if s.segmenter.kind == SegmenterKind.RECURSIVE_SAM3:
        return RecursiveSegmenter(sam3, s.segmenter.recursive)
    return sam3

Persistence — singleton row + JSONB blob

class SegmenterSettingsRow(Base):
    __tablename__ = "segmenter_settings"
    id         = Column(Integer, primary_key=True, default=1)
    payload    = Column(JSONB, nullable=False)
    updated_at = Column(DateTime, default=func.now(), onupdate=func.now())
    updated_by = Column(String, nullable=True)
    __table_args__ = (CheckConstraint("id = 1", name="single_row_only"),)

async def get_segmenter_settings(session) -> SegmenterSettings:
    row = await session.get(SegmenterSettingsRow, 1)
    if row is None:
        return DEFAULT_SETTINGS
    return SegmenterSettings.model_validate(row.payload)

Defaults in code: no seed migration. DEFAULT_SETTINGS is a Python constant. First read with no row returns the defaults in-memory; the row is only created when someone clicks Save. This avoids "did you run the seed?" footguns and means defaults can change between deploys without a migration.
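
The combination of in-code defaults and a stored payload that may lag behind the current schema can be illustrated with a small loader. This sketch uses a dataclass in place of the Pydantic models and covers only the dedup block; `DedupSketch`, `DEFAULT_DEDUP`, and `load_dedup` are hypothetical names, and dropping unknown keys follows the "unknown fields ignored, missing fields default" behaviour stated in the Risks section.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass(frozen=True)
class DedupSketch:
    """Stand-in for DedupConfig; defaults live in code, not in the DB."""
    iou_threshold: float = 0.5
    containment_threshold: float = 0.80


DEFAULT_DEDUP = DedupSketch()


def load_dedup(payload: Optional[dict[str, Any]]) -> DedupSketch:
    """Build a config from a stored payload, or fall back to the defaults."""
    if payload is None:
        # No row yet: return the in-memory defaults without writing anything.
        return DEFAULT_DEDUP
    known = {f.name for f in fields(DedupSketch)}
    # Keys written by an older or newer deploy are ignored; absent keys
    # pick up whatever the current code declares as the default.
    return DedupSketch(**{k: v for k, v in payload.items() if k in known})
```

Type validation is left out here; in the real code Pydantic would reject a payload with, for example, a string where a float is expected.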

09 HTTP endpoints

POST /admin/api/test/segmentation

Stateless. No DB writes. No disk side effects. Same build_segmenter factory as production — one code path.

Request

// return_masks is a query param (boolean, default false), not a body field:
// POST /admin/api/test/segmentation?return_masks=true
{
  "image": { "kind": "deal_image", "deal_id": "...", "image_id": "..." },
  // or: "image": { "kind": "upload", "data_url": "data:image/jpeg;base64,..." }
  "segmenter_config": SegmenterConfig,
  "combined_config": CombinedConfig | null,
  "params": object | null
}

Response

{
  "segmenter_id": "recursive(sam3)",
  "image": { "width": 1024, "height": 720 },
  "detections": [{ "bbox": [x1,y1,x2,y2], "score": 0.91, "label": null, "mask_b64": null }],
  "kept_indices": [0, 2, 3, 5],
  "pass_stats": [{ "pass": 0, "found": 7, "elapsed_ms": 1200 }],
  "elapsed_ms": { "segment": 1340, "dedup": 4 }
}

  • Limit: 25 MB upload cap. Single constant. 413 above. Raise if a workflow needs more.
  • Image modes: deal_image · upload. Two modes only: pick an existing deal image, or upload a data URL. No URL fetcher.
  • Dedup: combined_config is optional. Omit → segment only, kept_indices = null. Include → dedup runs, survivors returned. Crop never runs from this endpoint.
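
Handling the upload mode comes down to decoding a data URL and enforcing the size cap. The following standard-library sketch shows one way to do that; `decode_data_url` and `UploadTooLarge` are hypothetical names, the endpoint would map the latter to a 413, and reading "25 MB" as 25 MiB is an assumption.

```python
import base64
import binascii

MAX_UPLOAD_BYTES = 25 * 1024 * 1024  # assumption: "25 MB" means 25 MiB


class UploadTooLarge(Exception):
    """Hypothetical error that the endpoint would translate to HTTP 413."""


def decode_data_url(data_url: str, max_bytes: int = MAX_UPLOAD_BYTES) -> tuple[str, bytes]:
    """Return (media_type, raw_bytes) for a base64 image data URL."""
    header, sep, payload = data_url.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URL")
    media_type = header[len("data:"):-len(";base64")]
    if not media_type.startswith("image/"):
        raise ValueError(f"unsupported media type: {media_type!r}")
    # Every 4 base64 characters decode to 3 bytes minus up to 2 bytes of
    # padding, so this bound rejects clearly oversized payloads before decoding.
    if len(payload) // 4 * 3 > max_bytes + 2:
        raise UploadTooLarge(f"payload exceeds {max_bytes} bytes")
    try:
        raw = base64.b64decode(payload, validate=True)
    except binascii.Error as exc:
        raise ValueError("invalid base64 payload") from exc
    if len(raw) > max_bytes:
        raise UploadTooLarge(f"payload exceeds {max_bytes} bytes")
    return media_type, raw
```

Checking the encoded length first keeps a hostile request from forcing a large allocation just to be rejected.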

GET / PUT /admin/api/segmenter/settings

Singleton. GET masks sam3.api_key as "****". PUT treats api_key=null/omit as "keep existing"; non-empty string replaces.

// GET response
{
  "settings": SegmenterSettings,    // api_key MASKED
  "defaults": SegmenterSettings,    // for "Reset to defaults"
  "updated_at": "...",
  "updated_by": "..."
}

Pydantic validation rejects kind=recursive_sam3 without a recursive block (422). Auth: existing admin gate.
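
The masking and keep-existing rules for sam3.api_key can be sketched on plain dicts. `mask_api_key` and `apply_put` are hypothetical helpers; the nesting (segmenter → sam3 → api_key) follows the attribute path used by build_segmenter, and everything else about the payload shape is an assumption.

```python
import copy
from typing import Any

MASK = "****"


def mask_api_key(settings: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the settings that is safe to send in a GET response."""
    out = copy.deepcopy(settings)
    sam3 = out["segmenter"]["sam3"]
    if sam3.get("api_key"):
        sam3["api_key"] = MASK
    return out


def apply_put(existing: dict[str, Any], incoming: dict[str, Any]) -> dict[str, Any]:
    """Merge a PUT body over the stored settings, preserving the key if unset."""
    merged = copy.deepcopy(incoming)
    new_sam3 = merged["segmenter"]["sam3"]
    # null, empty string, or an omitted field all mean "keep existing".
    if not new_sam3.get("api_key"):
        new_sam3["api_key"] = existing["segmenter"]["sam3"].get("api_key")
    return merged
```

Under these rules any non-empty string replaces the key, so the form must never echo the masked "****" value back in a PUT.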

10 Web changes

10.1 admin.testing.segmentation.tsx — mock → real

Change | Detail
Replace SEGMENTERS | Two real options: sam3, recursive_sam3. Drop sam2.1, grounding-dino-sam, yolov8-seg.
SAM3 backend selector | Radio: Segmind / Internal. Non-active backend fields disabled (not hidden) so layout doesn't jump.
Recursive controls | Shown when segmenter === "recursive_sam3": max_passes, min_new_detections, fill_dilate_px.
pass_stats display | Results panel shows "Pass 1: 7 · Pass 2: 2 · stopped".
Drop sample images | Remove the four picsum placeholders.
Keep "Random deal" | Picks a real existing deal at random; useful for quick smoke tests.
Upload image | New button; reads a local file as a data URL and sets the image source.
"Apply dedup" toggle | Off by default. When on, sends combined_config; rejected detections rendered faded.
"Load from settings" | One-click prefill from GET /admin/api/segmenter/settings.
Replace mock run() | Real fetch("/admin/api/test/segmentation", ...).
Overlay adapter | Divides absolute-pixel bboxes by image_width/height for the existing 0..1 overlay code.

10.2 admin.settings.segmenter.tsx — placeholder → form

  • Active segmenter — radio: SAM 3 / Recursive SAM 3.
  • SAM3 backend — radio + backend-specific fields. Segmind: endpoint, api_key (write-only with •••• (saved) placeholder when set), points_per_side, pred_iou_thresh, max_masks, return_masks. Internal: endpoint, mask_threshold. Shared: text_prompt, threshold.
  • Recursive params — collapsed when not active.
  • Combined step config: iou_threshold, containment_threshold, use_mask_crops, output_format.
  • Footer — Save, Reset to defaults, "Last updated: {ts} by {user}".

Pydantic 422s surface inline next to fields. API key is the only write-only field: empty input = keep existing; typed value = replace.

11 Testing

Layer | Test | What it covers
BE unit | test_dedup.py | Ported test vectors (containment suppression, IoU NMS, edges).
BE unit | test_crop.py | Mask-shape crop vs bbox fallback when mask empty.
BE unit | test_sam3_normalize.py | Both backend response shapes → SegmentationResult. Fixtures from real responses.
BE unit | test_recursive.py | Two-pass union; early stop on min_new_detections; MissingMaskError on no-mask detection; temp files cleaned up.
BE unit | test_config_validators.py | kind=recursive_sam3 without recursive block; SEGMIND backend without api_key.
BE integration | test_segmentation_step.py | Stage runs end-to-end with stub Segmenter; staging promoted on success, deleted on raise.
BE integration | test_settings_endpoint.py | GET masks api_key; PUT preserves when omitted; PUT validates recursive block.
BE integration | test_testing_endpoint.py | Deal-image + upload; 25 MB cap; mask omit/include via return_masks.
FE unit | admin.testing.segmentation.test.tsx | Backend selector toggles disabled state; recursive controls show conditionally; dedup toggle renders rejected detections faded.
FE unit | admin.settings.segmenter.test.tsx | Form submission, 422 inline, api_key write-only behavior.
FE MSW | handlers + fixtures | Both new endpoints stubbed for FE tests.

class StubSegmenter:
    id = "stub"
    def __init__(self, *, detections_per_call: list[list[Detection]]):
        self._queue = list(detections_per_call)
    async def segment(self, image_path, params):
        return SegmentationResult(detections=self._queue.pop(0), image_width=100,
                                  image_height=100, segmenter_id="stub", elapsed_ms=1)
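
A queue-backed stub makes the early-stop case easy to assert on. The sketch below is a simplified, self-contained stand-in: `Det`, `QueueStub`, and `run_passes` are hypothetical, and the loop skips paint-out entirely so that only the pass counting is exercised.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Det:
    """Minimal stand-in for Detection."""
    label: str


class QueueStub:
    """Returns one pre-baked batch of detections per call and counts calls."""

    def __init__(self, batches: list[list[Det]]):
        self._queue = list(batches)
        self.calls = 0

    async def segment(self) -> list[Det]:
        self.calls += 1
        return self._queue.pop(0)


async def run_passes(
    stub: QueueStub, max_passes: int, min_new_detections: int = 1
) -> tuple[list[Det], list[int]]:
    """Simplified recursion loop: union detections, stop on a sparse pass."""
    all_dets: list[Det] = []
    found_per_pass: list[int] = []
    for _ in range(max_passes):
        found = await stub.segment()
        found_per_pass.append(len(found))
        all_dets.extend(found)
        if len(found) < min_new_detections:
            break  # this pass found too little, so later passes are skipped
    return all_dets, found_per_pass


def demo() -> tuple[list[str], list[int], int]:
    # Third batch must never be requested: pass 2 is empty and stops the loop.
    stub = QueueStub([[Det("a"), Det("b")], [], [Det("never")]])
    dets, stats = asyncio.run(run_passes(stub, max_passes=3))
    return [d.label for d in dets], stats, stub.calls
```

Counting calls on the stub is what proves the loop stopped, since an unused third batch would otherwise go unnoticed.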

12 Failure modes

Failure | Behavior
Missing SEGMIND_API_KEY | Raise at construction. PipelineRun row marked failed with clear message.
Segmind 5xx | Sam3Error(status, body) propagates → stage fails → retry.
Internal DGX unreachable | httpx.ConnectError propagates → stage fails → retry.
Empty segmenter response | SegmentationResult(detections=[]), which is not an error. Stage continues with zero crops.
Recursive inner: no-mask det | MissingMaskError → stage fails.
Recursive inner: pass 2 fails | Re-raise. No partial-from-pass-1.
Crop fails for one detection | Log + skip that crop. Other detections proceed.
One image fails in a deal of N | Whole stage fails, staging dir deleted, Procrastinate retries the whole deal.
Settings row missing | Return DEFAULT_SETTINGS in memory. Logged info.
Settings payload fails Pydantic | Raise. Step fails loud rather than running partial config.

13 Risks

Risk | Likelihood | Mitigation
Dilation tuned wrong, phantom detections on pass 2 | medium | Default 3 px, configurable; testing page surfaces pass_stats.
25 MB cap too low for some workflows | low | Single constant; raise in a follow-up if needed.
JSONB schema drift between deploys | low | Backward-compatible Pydantic; unknown fields ignored, missing fields default.
Segmind rate limits under load | medium | Same risk as check-in pipeline; no new mitigation in this slice.
Internal DGX is a single point of failure | low | Out of scope; operational concern.

14 Decisions made

All decisions from brainstorming are resolved. Listed for the record.

  1. recursive shape: Recursive is itself a Segmenter that wraps an inner one.
  2. port scope: Only SAM3 + RecursiveSegmenter wrapper. Other strategies are follow-ups.
  3. sam3 backend: Both backends behind one class, switchable via config.
  4. recursion mechanics: Neutral-gray fill, N passes (default 2), stop when a pass yields < min_new_detections.
  5. bbox feature: Wire the testing page's existing overlay to real backend output; no server-side burn-in.
  6. prod config: Persisted singleton row, edited via admin settings page.
  7. classify endpoint: Explicitly dropped; categorization is its own stage.
  8. per-image errors: None. Whole-deal retry via orchestrator.
  9. bbox-only recursion: Fail loud (MissingMaskError) rather than paint rectangles.
  10. testing-page samples: Drop sample images. Keep "Random deal".
  11. testing-page dedup: Toggle, default off.

15 Files added / changed

Added

  • backend/src/vcc_backend/features/segmentation/__init__.py
  • features/segmentation/segmenter.py · sam3.py · recursive.py · dedup.py · crop.py · pipeline.py · settings.py · config.py · step.py
  • backend/alembic/versions/0008_add_segmenter_settings.py
  • backend/src/vcc_backend/api/routers/segmenter.py (or extend existing admin router)
  • ~10 test files across backend/tests/features/segmentation/ and web/src/routes/
  • web/src/lib/api/adapters/segmentation.ts

Changed

  • features/pipeline/stages.py:22: segmentation_stage imported from segmentation.step
  • api/main.py — register new router(s)
  • web/src/routes/admin.testing.segmentation.tsx — full real-data rewrite
  • web/src/routes/admin.settings.segmenter.tsx — placeholder replaced with form
  • openapi.yaml — regenerated

Unchanged

orchestrator.py, stages.py for non-segmentation stages, PipelineRun model, process_deal task — all behavior preserved.