Segment + Dedup + Crop pipeline step (SAM3 + Recursive SAM3)
First real segmentation stage. Ports SAM3 + dedup + crop from checkin-pipeline/, adds a RecursiveSegmenter wrapper, and wires the existing testing & settings pages to live backends.
One pipeline stage, three concerns kept together: segment → dedup → crop. The segmenter is the only swappable piece; dedup and crop are always part of the same step. Two segmenter implementations ship in this slice:
- Sam3Segmenter — single class, two backends (Segmind SaaS API and the internal DGX HuggingFace server) chosen by config.
- RecursiveSegmenter — wraps any Segmenter; runs N passes, paints out detected masks with neutral gray before each subsequent pass, stops early when a pass finds nothing.
Configuration lives in a singleton segmenter_settings row (JSONB payload, in-code defaults). The admin testing page becomes a live tool that calls a new stateless POST /admin/api/test/segmentation. This is Approach B from brainstorming (feature module with a thin orchestrator entry point); Approach A (single bulky stage function) and Approach C (porting the full pipeline_runs strategy-selection machinery) were rejected.
Only the segmentation stage changes. Orchestrator, neighboring stages, and the PipelineRun audit row are untouched.
01 Context
The destination repo's pipeline orchestrator is a stub. stages.py:22–24:
async def segmentation_stage(session: AsyncSession, deal: Deal) -> None:
"""Stub. Real implementation arrives in a later slice."""
return None
The source — checkin-pipeline/ (a separate FastAPI/HTMX app vendored into this repo as a reference) — already has working segment, dedup, and crop, but split across three separate pipeline steps with operator-driven strategy selection between them. That extra ceremony is not wanted here. This slice imports the algorithms (SAM3 client, dedup, crop) and folds them into a single stage in the destination's simpler orchestrator (orchestrator.py:23–60, one transaction per stage, Procrastinate retry on raise).
The testing page (web/src/routes/admin.testing.segmentation.tsx) and the settings page (admin.settings.segmenter.tsx) already exist as scaffolding. The testing page renders bbox overlays from a seeded PRNG and hardcodes four fake segmenter IDs. The settings page is an AdminPlaceholder. Both become real surfaces here.
02 Scope
In scope
- New backend package features/segmentation/ (~9 files) — Protocol, SAM3, Recursive, dedup, crop, combined pipeline, settings, config, step.
- Alembic migration adding segmenter_settings singleton table (id = 1 CHECK constraint, JSONB payload).
- stages.py:22 rewired: segmentation_stage = step.run_segmentation_stage.
- POST /admin/api/test/segmentation — stateless test endpoint.
- GET/PUT /admin/api/segmenter/settings — singleton CRUD with masked API key.
- Real testing page UI (drop sample images, add upload, recursive controls, dedup toggle, load-from-settings).
- Real settings page form replacing the placeholder.
- ~10 test files across backend & frontend.
Out of scope (deferred or rejected)
- Other segmenter strategies (gemini, gemini_pro, gemini_groups, gemini_sam3, category_sam3, sam3_gemini_merge) — each is a follow-up PR.
- /segment/classify (Gemma3 VLM captioning) — explicitly dropped; categorization is its own stage.
- pipeline_runs strategy-selection model — orchestrator runs one segmenter per deal.
- Per-deal segmenter override — settings are global.
- Per-image partial success — one bad image fails the whole deal; orchestrator retries.
- Settings history / audit table — last-write-wins.
- Server-side overlay rendering (PIL ImageDraw PNG artifacts) — FE draws from coords.
- Deal-detail bbox overlay outside the testing page.
- Segmenter comparison mode (source's segmenter_compare.py).
- Bbox-only recursion fallback — MissingMaskError instead.
03 Architecture
One package, one stage entry point, two segmenter classes. build_segmenter is the only place that knows the kind→class mapping.
Package layout
features/segmentation/
├── segmenter.py   Segmenter Protocol + Detection + SegmentationResult
├── sam3.py        Sam3Segmenter (Sam3Backend.SEGMIND | INTERNAL)
├── recursive.py   RecursiveSegmenter(inner: Segmenter, cfg)
├── dedup.py       deduplicate_boxes(...) — direct port
├── crop.py        crop_masked + crop_boxes — direct port
├── pipeline.py    run_segment_dedup_crop(image, segmenter, cfg, out_dir)
├── settings.py    SegmenterSettingsRow (DB) + get/put + DEFAULT_SETTINGS
├── config.py      Pydantic configs + build_segmenter(settings, http) factory
└── step.py        run_segmentation_stage(session, deal)
Per-image data flow
Crops are written to data/deals/{deal_id}/runs/{run_id}/staging/ during the stage, then atomically renamed to crops/ at end-of-stage. On rollback the staging dir is deleted. No DB ↔ disk drift on retry.
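The staging contract can be sketched as a small helper (names here are illustrative; the real stage inlines this logic, as shown in section 08):

```python
# Illustrative helper for the staging contract: everything is written under
# staging/, promoted with a single rename on success, deleted on failure.
import shutil
from pathlib import Path


def promote_or_discard(staging: Path, final: Path, success: bool) -> None:
    if success:
        staging.rename(final)  # atomic on the same filesystem
    else:
        shutil.rmtree(staging, ignore_errors=True)
```

Because the rename is a single filesystem operation, a retry after a crash never sees a half-populated crops/ directory.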
04 The Segmenter Protocol
class Detection(BaseModel):
bbox: tuple[float, float, float, float] # x1, y1, x2, y2 in image pixels
score: float
label: str | None = None
mask_b64: str | None = None # PNG-encoded full-image mask
class SegmentationResult(BaseModel):
detections: list[Detection]
image_width: int
image_height: int
segmenter_id: str # "sam3" or "recursive(sam3)"
elapsed_ms: int
pass_stats: list[dict] | None = None # recursive only
class Segmenter(Protocol):
id: str
async def segment(self, image_path: Path, params: dict) -> SegmentationResult: ...
The frontend divides pixel coordinates by image_width/image_height to get its 0..1 overlay coords.
05 Sam3Segmenter
Direct port of checkin-pipeline/segmind_client.py:153 (call_sam3_dispatch), repackaged as a Segmenter implementation.
class Sam3Backend(str, Enum):
SEGMIND = "segmind"
INTERNAL = "internal"
class Sam3Config(BaseModel):
backend: Sam3Backend
endpoint: str
api_key: SecretStr | None = None # required for SEGMIND
text_prompt: str | None = None
threshold: float = 0.5
mask_threshold: float = 0.5 # internal only
return_masks: bool = True # segmind only
points_per_side: int = 32 # segmind only
pred_iou_thresh: float = 0.88 # segmind only
max_masks: int = 100 # segmind only
class Sam3Segmenter:
id = "sam3"
def __init__(self, cfg: Sam3Config, http: httpx.AsyncClient): ...
async def segment(self, image_path, params) -> SegmentationResult:
# 1. merge params over self.cfg (params win)
# 2. dispatch to _call_segmind or _call_internal by self.cfg.backend
# 3. normalize both responses to SegmentationResult
enable_classify dropped from scope
The source's call_internal_sam3_classify compresses segment + Gemma3 VLM captioning into one round-trip. In this repo, categorization_stage owns labelling. Keeping a flag for a feature we don't use just adds dead surface — it's removed entirely from Sam3Config and not ported.
Failure modes: missing API key for SEGMIND raises at construction; HTTP non-2xx wraps in Sam3Error(status, body); empty response returns SegmentationResult(detections=[]) (not an error). httpx.AsyncClient is injected (already used by the HubSpot integration) rather than the source's requests.
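The normalization step (step 3 in segment above) can be sketched as below. The payload field names ("box", "confidence", "label", "mask") are assumptions for illustration, not the documented Segmind API; the real shapes come from the recorded fixtures in test_sam3_normalize.py:

```python
# Hedged sketch: map a hypothetical Segmind-style payload onto the
# SegmentationResult field names from section 04. Field names in the
# input dict are assumptions, not the real API contract.
def normalize_segmind(payload: dict, width: int, height: int, elapsed_ms: int) -> dict:
    detections = [
        {
            "bbox": tuple(float(v) for v in d["box"]),  # x1, y1, x2, y2 pixels
            "score": float(d.get("confidence", 0.0)),
            "label": d.get("label"),
            "mask_b64": d.get("mask"),
        }
        for d in payload.get("detections", [])
    ]
    return {
        "detections": detections,
        "image_width": width,
        "image_height": height,
        "segmenter_id": "sam3",
        "elapsed_ms": elapsed_ms,
    }
```

An empty payload falls through to detections=[], matching the "empty response is not an error" rule.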
06 RecursiveSegmenter
Wraps any Segmenter. Runs N passes; paints out detected masks with neutral gray before each subsequent pass; stops early when a pass yields fewer than min_new_detections.
class RecursiveSegmenterConfig(BaseModel):
max_passes: int = 2
min_new_detections: int = 1
fill_color: tuple[int, int, int] = (128, 128, 128)
fill_dilate_px: int = 3
class MissingMaskError(RuntimeError):
"""Inner segmenter returned a detection without a mask. Cannot paint out."""
class RecursiveSegmenter:
def __init__(self, inner: Segmenter, cfg: RecursiveSegmenterConfig):
self.inner = inner
self.cfg = cfg
@property
def id(self) -> str:
return f"recursive({self.inner.id})"
async def segment(self, image_path: Path, params: dict) -> SegmentationResult:
all_dets: list[Detection] = []
pass_stats: list[dict] = []
work_path = image_path
tmp_files: list[Path] = []
try:
for pass_idx in range(self.cfg.max_passes):
result = await self.inner.segment(work_path, params)
pass_stats.append({"pass": pass_idx, "found": len(result.detections),
"elapsed_ms": result.elapsed_ms})
all_dets.extend(result.detections)
if len(result.detections) < self.cfg.min_new_detections:
break
if pass_idx == self.cfg.max_passes - 1:
break
work_path = self._paint_out(image_path, work_path, result.detections)
tmp_files.append(work_path)
finally:
for p in tmp_files:
p.unlink(missing_ok=True)
return SegmentationResult(..., segmenter_id=self.id, pass_stats=pass_stats)
If a Detection arrives without mask_b64, _paint_out raises MissingMaskError. Painting bare rectangles leaves edges that the next pass latches onto as phantom detections — degraded silence is worse than a visible failure here.
- RecursiveSegmenter(inner=Sam3Segmenter(...)) is the only construction shape — works symmetrically for any future segmenter.
- Temporary painted-out images are cleaned up in finally. Downstream crop reads from the original path.
- Early stop is len(new) < min_new_detections, not a delta from the previous pass. Cleaner semantics: "stop when this pass found nothing".
07 Combined segment + dedup + crop
One function, three concerns, never split. No "skip dedup" or "skip crop" flag — only the how is configurable.
class DedupConfig(BaseModel):
iou_threshold: float = 0.5
containment_threshold: float = 0.80
class CropConfig(BaseModel):
use_mask_crops: bool = True
output_format: Literal["png", "jpg"] = "png"
class CombinedConfig(BaseModel):
dedup: DedupConfig
crop: CropConfig
class CombinedResult(BaseModel):
segmentation: SegmentationResult # pre-dedup detections preserved
kept_indices: list[int] # which survived dedup
crops: list[CropOutput] # one per kept
elapsed_ms: dict[str, int] # {"segment": ..., "dedup": ..., "crop": ...}
async def run_segment_dedup_crop(
image_path: Path, segmenter: Segmenter, cfg: CombinedConfig,
crop_out_dir: Path, segmenter_params: dict | None = None,
) -> CombinedResult: ...
kept_indices (not a filtered list) and segmentation (the raw pre-dedup result) live side-by-side. The testing page renders "12 raw, 8 kept", with rejected detections faded — at zero cost. The production step ignores both fields and just uses crops.
dedup.py and crop.py are direct ports from checkin-pipeline/app/dedup.py and the two crop functions in checkin-pipeline/segment.py. No behavior changes. Crop failure on a single detection logs and skips that crop — matches crop_masked's fallback when a mask is empty.
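For reference, the dedup semantics described by DedupConfig amount to greedy score-ordered suppression on two criteria. A minimal sketch (the real deduplicate_boxes is the ported code; this signature and the helper names are illustrative):

```python
# Sketch of IoU-NMS plus containment suppression: a lower-scored box is
# dropped when it overlaps a kept box above iou_threshold, or when the
# smaller of the two boxes is mostly inside the other.
def _area(b):
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])

def _inter(a, b):
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)

def deduplicate_boxes(boxes, scores, iou_threshold=0.5, containment_threshold=0.80):
    """Return sorted indices of surviving boxes; highest score wins ties."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept: list[int] = []
    for i in order:
        keep = True
        for j in kept:
            inter = _inter(boxes[i], boxes[j])
            union = _area(boxes[i]) + _area(boxes[j]) - inter
            iou = inter / union if union else 0.0
            smaller = min(_area(boxes[i]), _area(boxes[j]))
            contain = inter / smaller if smaller else 0.0
            if iou >= iou_threshold or contain >= containment_threshold:
                keep = False
                break
        if keep:
            kept.append(i)
    return sorted(kept)
```

The containment check is what kills "box inside a box" duplicates that plain IoU misses: a small box fully inside a large one can have low IoU but containment 1.0.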
08 Step orchestration + persisted settings
Step entry point
async def run_segmentation_stage(session: AsyncSession, deal: Deal) -> None:
settings = await get_segmenter_settings(session)
http = get_http_client()
segmenter = build_segmenter(settings, http)
cfg = settings.combined
staging = staging_dir_for(deal)
try:
for image in deal.images:
image_path = resolve_image_path(image)
result = await run_segment_dedup_crop(
image_path=image_path, segmenter=segmenter, cfg=cfg,
crop_out_dir=staging / str(image.id),
)
await persist_segmentation_result(session, deal, image, result)
staging.rename(final_crops_dir_for(deal)) # atomic on full-stage success
except Exception:
shutil.rmtree(staging, ignore_errors=True)
raise
Config + factory
class SegmenterKind(str, Enum):
SAM3 = "sam3"
RECURSIVE_SAM3 = "recursive_sam3"
class SegmenterConfig(BaseModel):
kind: SegmenterKind
sam3: Sam3Config
recursive: RecursiveSegmenterConfig | None = None
@model_validator(mode="after")
def _recursive_requires_block(self):
if self.kind == SegmenterKind.RECURSIVE_SAM3 and self.recursive is None:
raise ValueError("kind=recursive_sam3 requires `recursive` config")
return self
def build_segmenter(s: SegmenterSettings, http: httpx.AsyncClient) -> Segmenter:
sam3 = Sam3Segmenter(s.segmenter.sam3, http)
if s.segmenter.kind == SegmenterKind.RECURSIVE_SAM3:
return RecursiveSegmenter(sam3, s.segmenter.recursive)
return sam3
Persistence — singleton row + JSONB blob
class SegmenterSettingsRow(Base):
__tablename__ = "segmenter_settings"
id = Column(Integer, primary_key=True, default=1)
payload = Column(JSONB, nullable=False)
updated_at = Column(DateTime, default=func.now(), onupdate=func.now())
updated_by = Column(String, nullable=True)
__table_args__ = (CheckConstraint("id = 1", name="single_row_only"),)
async def get_segmenter_settings(session) -> SegmenterSettings:
row = await session.get(SegmenterSettingsRow, 1)
if row is None:
return DEFAULT_SETTINGS
return SegmenterSettings.model_validate(row.payload)
DEFAULT_SETTINGS is a Python constant. First read with no row returns the defaults in-memory; the row is only created when someone clicks Save. Avoids "did you run the seed?" footguns and means defaults can change between deploys without a migration.
09 HTTP endpoints
POST /admin/api/test/segmentation
Stateless. No DB writes. No disk side effects. Same build_segmenter factory as production — one code path.
Request
{
"image": { "kind": "deal_image", "deal_id": "...", "image_id": "..." } // or
{ "kind": "upload", "data_url": "data:image/jpeg;base64,..." },
"segmenter_config": SegmenterConfig,
"combined_config": CombinedConfig | null,
"params": object | null,
"return_masks": boolean // query param, default false
}
Response
{
"segmenter_id": "recursive(sam3)",
"image": { "width": 1024, "height": 720 },
"detections": [{ "bbox": [x1,y1,x2,y2], "score": 0.91, "label": null, "mask_b64": null }],
"kept_indices": [0, 2, 3, 5],
"pass_stats": [{ "pass": 0, "found": 7, "elapsed_ms": 1200 }],
"elapsed_ms": { "segment": 1340, "dedup": 4 }
}
Omit combined_config → dedup is skipped and kept_indices = null. Include it → dedup runs, survivors returned. Crop never runs from this endpoint.
GET / PUT /admin/api/segmenter/settings
Singleton. GET masks sam3.api_key as "****". PUT treats api_key=null/omit as "keep existing"; non-empty string replaces.
// GET response
{
"settings": SegmenterSettings, // api_key MASKED
"defaults": SegmenterSettings, // for "Reset to defaults"
"updated_at": "...",
"updated_by": "..."
}
Pydantic validation rejects kind=recursive_sam3 without a recursive block (422). Auth: existing admin gate.
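The mask/merge rules for the API key can be expressed as two small pure functions. This is a hedged dict-based sketch for illustration; the real code operates on the Pydantic SegmenterSettings model:

```python
# Sketch of the GET-mask / PUT-merge rules for sam3.api_key.
MASKED = "****"


def mask_settings(settings: dict) -> dict:
    """GET: never return the real key; replace a set key with the mask."""
    out = {**settings, "sam3": {**settings.get("sam3", {})}}
    if out["sam3"].get("api_key"):
        out["sam3"]["api_key"] = MASKED
    return out


def merge_api_key(incoming: dict, existing: dict) -> dict:
    """PUT: null/empty/masked api_key means 'keep the existing key'."""
    merged = {**incoming, "sam3": {**incoming.get("sam3", {})}}
    if merged["sam3"].get("api_key") in (None, "", MASKED):
        merged["sam3"]["api_key"] = existing.get("sam3", {}).get("api_key")
    return merged
```

Treating the mask sentinel itself as "keep existing" means a client can round-trip the GET response straight back through PUT without wiping the stored key.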
10 Web changes
10.1 admin.testing.segmentation.tsx — mock → real
| Change | Detail |
|---|---|
| Replace SEGMENTERS | Two real options: sam3, recursive_sam3. Drop sam2.1, grounding-dino-sam, yolov8-seg. |
| SAM3 backend selector | Radio: Segmind / Internal. Non-active backend fields disabled (not hidden) so layout doesn't jump. |
| Recursive controls | Shown when segmenter === "recursive_sam3": max_passes, min_new_detections, fill_dilate_px. |
| pass_stats display | Results panel shows "Pass 1: 7 · Pass 2: 2 · stopped". |
| Drop sample images | Remove the four picsum placeholders. |
| Keep "Random deal" | Picks a real existing deal at random — useful for quick smoke tests. |
| Upload image | New button — reads local file as data URL; sets image source. |
| "Apply dedup" toggle | Off by default. When on, sends combined_config; rejected detections rendered faded. |
| "Load from settings" | One-click prefill from GET /admin/api/segmenter/settings. |
| Replace mock run() | Real fetch("/admin/api/test/segmentation", ...). |
| Overlay adapter | Divides absolute-pixel bboxes by image_width/height for the existing 0..1 overlay code. |
10.2 admin.settings.segmenter.tsx — placeholder → form
- Active segmenter — radio: SAM 3 / Recursive SAM 3.
- SAM3 backend — radio + backend-specific fields. Segmind: endpoint, api_key (write-only with "•••• (saved)" placeholder when set), points_per_side, pred_iou_thresh, max_masks, return_masks. Internal: endpoint, mask_threshold. Shared: text_prompt, threshold.
- Recursive params — collapsed when not active.
- Combined step config — iou_threshold, containment_threshold, use_mask_crops, output_format.
- Footer — Save, Reset to defaults, "Last updated: {ts} by {user}".
Pydantic 422s surface inline next to fields. API key is the only write-only field: empty input = keep existing; typed value = replace.
11 Testing
| Layer | Test | What it covers |
|---|---|---|
| BE unit | test_dedup.py | Ported test vectors (containment suppression, IoU NMS, edges). |
| BE unit | test_crop.py | Mask-shape crop vs bbox fallback when mask empty. |
| BE unit | test_sam3_normalize.py | Both backend response shapes → SegmentationResult. Fixtures from real responses. |
| BE unit | test_recursive.py | Two-pass union; early stop on min_new_detections; MissingMaskError on no-mask detection; temp files cleaned up. |
| BE unit | test_config_validators.py | kind=recursive_sam3 without recursive block; SEGMIND backend without api_key. |
| BE integration | test_segmentation_step.py | Stage runs end-to-end with stub Segmenter; staging promoted on success, deleted on raise. |
| BE integration | test_settings_endpoint.py | GET masks api_key; PUT preserves when omitted; PUT validates recursive block. |
| BE integration | test_testing_endpoint.py | Deal-image + upload; 25 MB cap; mask omit/include via return_masks. |
| FE unit | admin.testing.segmentation.test.tsx | Backend selector toggles disabled state; recursive controls show conditionally; dedup toggle renders rejected detections faded. |
| FE unit | admin.settings.segmenter.test.tsx | Form submission, 422 inline, api_key write-only behavior. |
| FE MSW | handlers + fixtures | Both new endpoints stubbed for FE tests. |
class StubSegmenter:
id = "stub"
def __init__(self, *, detections_per_call: list[list[Detection]]):
self._queue = list(detections_per_call)
async def segment(self, image_path, params):
return SegmentationResult(detections=self._queue.pop(0), image_width=100,
image_height=100, segmenter_id="stub", elapsed_ms=1)
12 Failure modes
| Failure | Behavior |
|---|---|
| Missing SEGMIND_API_KEY | Raise at construction. PipelineRun row marked failed with clear message. |
| Segmind 5xx | Sam3Error(status, body) propagates → stage fails → retry. |
| Internal DGX unreachable | httpx.ConnectError propagates → stage fails → retry. |
| Empty segmenter response | SegmentationResult(detections=[]) — not an error. Stage continues with zero crops. |
| Recursive inner: no-mask det | MissingMaskError → stage fails. |
| Recursive inner: pass 2 fails | Re-raise. No partial-from-pass-1. |
| Crop fails for one detection | Log + skip that crop. Other detections proceed. |
| One image fails in a deal of N | Whole stage fails, staging dir deleted, Procrastinate retries the whole deal. |
| Settings row missing | Return DEFAULT_SETTINGS in memory. Logged info. |
| Settings payload fails Pydantic | Raise. Step fails loud rather than running partial config. |
13 Risks
| Risk | Likelihood | Mitigation |
|---|---|---|
| Dilation tuned wrong, phantom detections on pass 2 | medium | Default 3 px, configurable; testing page surfaces pass_stats. |
| 25 MB cap too low for some workflows | low | Single constant; raise in a follow-up if needed. |
| JSONB schema drift between deploys | low | Backward-compatible Pydantic; unknown fields ignored, missing fields default. |
| Segmind rate limits under load | medium | Same risk as check-in pipeline; no new mitigation in this slice. |
| Internal DGX is a single point of failure | low | Out of scope — operational concern. |
14 Decisions made
All decisions from brainstorming are resolved. Listed for the record.
- recursive shape — Recursive is itself a Segmenter that wraps an inner one.
- port scope — Only SAM3 + RecursiveSegmenter wrapper. Other strategies are follow-ups.
- sam3 backend — Both backends behind one class, switchable via config.
- recursion mechanics — Neutral-gray fill, N passes (default 2), stop when a pass yields < min_new_detections.
- bbox feature — Wire the testing page's existing overlay to real backend output; no server-side burn-in.
- prod config — Persisted singleton row, edited via admin settings page.
- classify endpoint — Explicitly dropped; categorization is its own stage.
- per-image errors — None. Whole-deal retry via orchestrator.
- bbox-only recursion — Fail loud (MissingMaskError) rather than paint rectangles.
- testing-page samples — Drop sample images. Keep "Random deal".
- testing-page dedup Toggle, default off.
15 Files added / changed
Added
- backend/src/vcc_backend/features/segmentation/__init__.py
- features/segmentation/segmenter.py · sam3.py · recursive.py · dedup.py · crop.py · pipeline.py · settings.py · config.py · step.py
- backend/alembic/versions/0008_add_segmenter_settings.py
- backend/src/vcc_backend/api/routers/segmenter.py (or extend existing admin router)
- ~10 test files across backend/tests/features/segmentation/ and web/src/routes/
- web/src/lib/api/adapters/segmentation.ts
Changed
- features/pipeline/stages.py:22 — segmentation_stage imported from segmentation.step
- api/main.py — register new router(s)
- web/src/routes/admin.testing.segmentation.tsx — full real-data rewrite
- web/src/routes/admin.settings.segmenter.tsx — placeholder replaced with form
- openapi.yaml — regenerated
Unchanged
orchestrator.py, stages.py for non-segmentation stages, PipelineRun model, process_deal task — all behavior preserved.