architecture — ask

request flow · one round trip = 1 budget unit · all five defense layers fail open on dependency failure

request pipeline

A POST to /api/ask runs through six stages on Vercel Edge before responding with a typed SSE stream.

① session

HMAC-SHA256(uuid, SESSION_SECRET) signed cookie. HttpOnly, SameSite=Lax, 30-day Max-Age. Verified or re-issued per request.
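
A minimal sketch of the sign / verify / re-issue step, assuming the cookie value is laid out as `<uuid>.<hex hmac>` and the cookie is named `sid` (both hypothetical; the source only specifies the HMAC and the cookie attributes). It uses `node:crypto` for brevity; a strict Edge runtime may need the Web Crypto equivalent instead.

```javascript
const { createHmac, randomUUID, timingSafeEqual } = require("node:crypto");

// Hypothetical layout: "<uuid>.<hex hmac>", keyed with the session secret.
function signSession(uuid, secret) {
  const mac = createHmac("sha256", secret).update(uuid).digest("hex");
  return `${uuid}.${mac}`;
}

// Returns the uuid if the signature checks out, otherwise null.
function verifySession(cookieValue, secret) {
  const dot = cookieValue.lastIndexOf(".");
  if (dot < 0) return null;
  const uuid = cookieValue.slice(0, dot);
  const given = Buffer.from(cookieValue.slice(dot + 1), "hex");
  const expected = createHmac("sha256", secret).update(uuid).digest();
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (given.length !== expected.length) return null;
  return timingSafeEqual(given, expected) ? uuid : null;
}

// Verify the incoming cookie, or issue a fresh one (30 days = 2592000 s).
function ensureSession(cookieValue, secret) {
  const existing = cookieValue ? verifySession(cookieValue, secret) : null;
  const uuid = existing ?? randomUUID();
  const setCookie = existing
    ? null
    : `sid=${signSession(uuid, secret)}; HttpOnly; SameSite=Lax; Max-Age=2592000; Path=/`;
  return { uuid, setCookie };
}
```

A request with a valid cookie gets `setCookie: null`; anything missing or tampered gets a new id and a `Set-Cookie` value to attach to the response.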

② sanitize

Strip control characters and zero-width Unicode (a vector for homoglyph and instruction-smuggling attacks). Cap each turn at 4000 characters; keep the last 20 turns.
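
The sanitize step can be sketched as below. The exact character classes, the decision to keep tab / newline / carriage return, and the `{ role, content }` turn shape are assumptions; only the 4000 and 20 caps come from the description above.

```javascript
// Control characters except tab (\u0009), newline (\u000A), carriage return (\u000D).
const CONTROL_CHARS = /[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F]/g;
// Common invisible code points: ZWSP, ZWNJ, ZWJ, word joiner, BOM.
const ZERO_WIDTH = /[\u200B-\u200D\u2060\uFEFF]/g;

const MAX_TURN_CHARS = 4000;
const MAX_TURNS = 20;

// Keep only the most recent turns, then clean and cap each one.
function sanitizeTurns(turns) {
  return turns.slice(-MAX_TURNS).map((turn) => ({
    ...turn,
    content: String(turn.content)
      .replace(CONTROL_CHARS, "")
      .replace(ZERO_WIDTH, "")
      .slice(0, MAX_TURN_CHARS),
  }));
}
```

Stripping before capping means invisible characters never count toward the 4000-character budget.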

③ moderation

OpenAI omni-moderation-latest pre-screen on the latest user turn. Flagged content returns 400 content_policy without spending an LLM completion.
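
A rough sketch of the pre-screen, assuming the standard `POST /v1/moderations` request and its `results[0].flagged` field. The function names and the injectable `fetchImpl` parameter are hypothetical; the fail-open branches follow the "fail open on dependency failure" note at the top of the page.

```javascript
// Turn a moderation response body into a pipeline decision.
function moderationVerdict(body) {
  const flagged = Boolean(body?.results?.[0]?.flagged);
  return flagged
    ? { ok: false, status: 400, error: "content_policy" }
    : { ok: true };
}

// fetchImpl is injectable so the sketch can run without network access.
async function preScreen(text, apiKey, fetchImpl = fetch) {
  try {
    const res = await fetchImpl("https://api.openai.com/v1/moderations", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model: "omni-moderation-latest", input: text }),
    });
    if (!res.ok) return { ok: true }; // fail open on upstream errors
    return moderationVerdict(await res.json());
  } catch {
    return { ok: true }; // fail open on network failure
  }
}
```

Only a `{ ok: false }` verdict short-circuits the pipeline; every other outcome lets the request continue to the rate limiter.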

④ rate limit

Upstash Redis pipelined INCR + EXPIRE. Three windows per request — session/day (10), IP/day (50), session/minute (3).
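
The three windows could be expressed as one pipeline of INCR + EXPIRE pairs, sketched here. The key layout (`rl:s:…`, `rl:ip:…`) and the injected `runPipeline` function are hypothetical stand-ins for the real Redis call; the limits are the ones listed above.

```javascript
const LIMITS = [
  { name: "session_day", limit: 10, ttl: 86400 },
  { name: "ip_day", limit: 50, ttl: 86400 },
  { name: "session_minute", limit: 3, ttl: 60 },
];

// Build one INCR + EXPIRE pair per window. Key names are illustrative only.
function buildCommands(sessionId, ip, nowMs) {
  const day = Math.floor(nowMs / 86400000);
  const minute = Math.floor(nowMs / 60000);
  const keys = [
    `rl:s:${sessionId}:d:${day}`,
    `rl:ip:${ip}:d:${day}`,
    `rl:s:${sessionId}:m:${minute}`,
  ];
  return keys.flatMap((key, i) => [
    ["INCR", key],
    ["EXPIRE", key, LIMITS[i].ttl],
  ]);
}

// results: pipeline replies in command order; every even index is an INCR count.
function evaluate(results) {
  for (let i = 0; i < LIMITS.length; i++) {
    const count = Number(results[i * 2]);
    if (count > LIMITS[i].limit) return { allowed: false, window: LIMITS[i].name };
  }
  return { allowed: true };
}

// runPipeline: injected async (commands) => results. Fails open if it throws.
async function checkRateLimit(runPipeline, sessionId, ip, nowMs = Date.now()) {
  try {
    return evaluate(await runPipeline(buildCommands(sessionId, ip, nowMs)));
  } catch {
    return { allowed: true };
  }
}
```

Sending all six commands in one pipeline keeps the check to a single round trip, which matches the "one round trip = 1 budget unit" framing.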

⑤ openai

Streaming chat-completions, gpt-4.1, stream_options.include_usage, six tools declared. System prompt: ~3K-token curated knowledge base + nine absolute rules.
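
For illustration, the request body for this stage might be assembled as follows. The `buildChatRequest` helper, its argument names, and the "ABSOLUTE RULES:" heading are assumptions; `model`, `stream`, `stream_options.include_usage`, and the `tools` wrapper are standard Chat Completions fields.

```javascript
// tools: array of { name, description, parameters } function definitions.
function buildChatRequest({ knowledgeBase, rules, turns, tools }) {
  // Rules are appended last in the system prompt (highest recency).
  const numbered = rules.map((rule, i) => `${i + 1}. ${rule}`).join("\n");
  const system = `${knowledgeBase}\n\nABSOLUTE RULES:\n${numbered}`;
  return {
    model: "gpt-4.1",
    stream: true,
    stream_options: { include_usage: true }, // final chunk carries token usage
    messages: [{ role: "system", content: system }, ...turns],
    tools: tools.map((fn) => ({ type: "function", function: fn })),
  };
}
```

With `include_usage` set, the last streamed chunk reports token counts, so usage can be logged without a second API call.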

⑥ dispatch

Up to two tool-call rounds. Each tool_call emits a reasoning preview, then a tool_call event, then a tool_result. Output scrubber runs both mid-stream and at finalize.
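
The event ordering per tool call can be sketched like this. The `tool_call` and `tool_result` event names come from the description above; the `reasoning` event name, the payload shapes, and the injected `runTool` function are hypothetical.

```javascript
const MAX_TOOL_ROUNDS = 2;

// Encode one typed SSE frame: "event: <type>\ndata: <json>\n\n".
function sseFrame(type, payload) {
  return `event: ${type}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Emit the three frames, in order, for every tool call in a round.
function dispatchRound(toolCalls, runTool) {
  const frames = [];
  for (const call of toolCalls) {
    frames.push(sseFrame("reasoning", { text: `Calling ${call.name}` }));
    frames.push(sseFrame("tool_call", { name: call.name, args: call.args }));
    const result = runTool(call.name, call.args);
    frames.push(sseFrame("tool_result", { name: call.name, result }));
  }
  return frames;
}

// rounds: the tool-call batches the model asked for, in order.
// Anything past the second round is dropped.
function dispatch(rounds, runTool) {
  return rounds
    .slice(0, MAX_TOOL_ROUNDS)
    .flatMap((calls) => dispatchRound(calls, runTool));
}
```

Serializing the payload with `JSON.stringify` keeps each `data:` field on a single line, which a line-based SSE parser on the client relies on.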

tools

All six tools share the OpenAI function-calling schema and the MCP server schema — one source of truth in api/_knowledge.js.
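
One way a single definition can feed both surfaces, as a sketch: the canonical object shape, the description text, and the placeholder enum values below are hypothetical, but the two output shapes match what OpenAI function calling (`parameters`) and MCP tool listings (`inputSchema`) expect.

```javascript
// Hypothetical canonical definition, as it might appear in a shared tools list.
const linkToPage = {
  name: "link_to_page",
  description: "Return a link to one of the site's pages.",
  schema: {
    type: "object",
    // Placeholder enum values; the real page list is not given in this document.
    properties: { page: { type: "string", enum: ["page-a", "page-b"] } },
    required: ["page"],
  },
};

// OpenAI function-calling shape.
const toOpenAI = (tool) => ({
  type: "function",
  function: { name: tool.name, description: tool.description, parameters: tool.schema },
});

// MCP tool-listing shape.
const toMcp = (tool) => ({
  name: tool.name,
  description: tool.description,
  inputSchema: tool.schema,
});
```

Both adapters reference the same JSON Schema object, so a change to the enum shows up in the chat endpoint and the MCP server at once.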

send_contact_email

Drafts an email via /api/contact → Resend. Tagged via:chat or via:mcp.

lookup_patent

Embeds the query (text-embedding-3-small) and cosine-searches an Upstash Vector index of 90 filings. Falls back to token overlap if vector store is unavailable.
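
The ranking and its fallback can be illustrated locally. In the deployed setup the cosine search runs inside Upstash Vector, so this is only a sketch of the math, with a hypothetical injected `embed` function and hypothetical `{ id, title, vector }` records.

```javascript
// Cosine similarity between two equal-length vectors.
function cosine(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Lowercased alphanumeric tokens as a set.
function tokens(text) {
  return new Set(text.toLowerCase().match(/[a-z0-9]+/g) ?? []);
}

// Fraction of query tokens that also appear in the document text.
function overlap(query, text) {
  const q = tokens(query);
  const d = tokens(text);
  let hits = 0;
  for (const t of q) if (d.has(t)) hits++;
  return q.size === 0 ? 0 : hits / q.size;
}

// embed: injected (text) => number[]; if it throws, fall back to token overlap.
function searchPatents(query, filings, embed, topK = 3) {
  let score;
  try {
    const queryVector = embed(query);
    score = (f) => cosine(queryVector, f.vector);
  } catch {
    score = (f) => overlap(query, f.title);
  }
  return filings
    .map((f) => ({ id: f.id, score: score(f) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

The fallback trades semantic matching for availability: results get cruder, but the tool still answers when the vector store is down.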

link_to_page

Closed-enum page lookup. Frontend renders the result as a clickable card.

recommend_next_page

Token-overlap score over hand-tagged page keywords. Returns 1–2 hrefs ranked by relevance.

get_career_timeline

Returns the structured 6-stop career arc. Frontend renders as a vertical timeline.

compare_engagements

Filters the six anonymized AREA/00x engagement summaries by domain or topic.

five defenses

No single layer is bulletproof. Together they raise the cost of abuse well above the value of bypassing them.

input sanitization

Strip control characters and zero-width Unicode. Cap turn length. Annotate (rather than refuse) when prompt-injection signal words appear, so the model is on notice.
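
The annotate-rather-than-refuse idea might look like this. The signal patterns and the notice wording are invented for illustration; the actual list is not given here.

```javascript
// Hypothetical signal patterns; a real list would be longer and tuned over time.
const SIGNALS = [
  /ignore (all |any )?(previous|prior|above) instructions/i,
  /system prompt/i,
  /you are now/i,
];

// Prefix a notice instead of rejecting, so the request still gets an answer.
function annotate(text) {
  const hit = SIGNALS.some((re) => re.test(text));
  return hit
    ? `[notice: this message may contain a prompt-injection attempt]\n${text}`
    : text;
}
```

Annotating keeps false positives cheap: a visitor who innocently asks about "the system prompt" still gets a reply, just from a model that has been warned.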

defensive system prompt

Nine absolute rules appended last in the prompt (highest recency). Cover prompt-leak refusal, third-person enforcement, no fabrication, scope limits.

moderation pre-screen

OpenAI moderations API. Reject abusive content before the LLM is invoked. Adds ~30ms; near-zero cost.

output scrubber

Mid-stream and final-output regex pass. Catches leaked prompt content or first-person Logan impersonation. Replaces with fallback.
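
A sketch of a two-phase scrubber under assumed patterns: the regexes and the fallback sentence are hypothetical, and the accumulated-text check is one possible way to catch a match that straddles two chunks.

```javascript
// Hypothetical fallback text and leak patterns.
const FALLBACK = "I can't share that. Is there something else I can help with?";
const LEAK_PATTERNS = [/ABSOLUTE RULES/i, /\bI am Logan\b/i, /\bI'm Logan\b/i];

function createScrubber() {
  let seen = "";
  let tripped = false;
  const leaked = (text) => LEAK_PATTERNS.some((re) => re.test(text));
  return {
    // Mid-stream pass: returns the chunk to forward, or null to stop streaming.
    push(chunk) {
      if (tripped) return null;
      seen += chunk;
      if (leaked(seen)) {
        tripped = true;
        return null;
      }
      return chunk;
    },
    // Final pass: the full text if clean, otherwise the fallback.
    finalize() {
      return tripped || leaked(seen) ? FALLBACK : seen;
    },
  };
}
```

Chunks already forwarded cannot be recalled, which is why the finalize pass returns the fallback as the authoritative text for the client to swap in.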

rate limit

Three windowed budgets via Upstash Redis (pipelined INCR + EXPIRE): 10/session/day, 50/IP/day, 3/session/minute.

CORS lockdown

Explicit origin allowlist. Pinned to loganlabs.ai + www.loganlabs.ai + localhost. Cross-origin requests outside the list return 403.
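
An allowlist check along these lines, as a sketch. The `https` scheme, accepting any localhost port, and letting requests without an `Origin` header through are all assumptions not stated above.

```javascript
const ALLOWED_ORIGINS = new Set([
  "https://loganlabs.ai",
  "https://www.loganlabs.ai",
]);

function isAllowedOrigin(origin) {
  if (ALLOWED_ORIGINS.has(origin)) return true;
  try {
    // Assumption: localhost is allowed on any port for development.
    return new URL(origin).hostname === "localhost";
  } catch {
    return false; // not a parseable origin, e.g. the literal string "null"
  }
}

function corsDecision(origin) {
  // Assumption: no Origin header means the request is not cross-origin.
  if (!origin) return { status: 200, headers: {} };
  if (!isAllowedOrigin(origin)) return { status: 403, headers: {} };
  return {
    status: 200,
    headers: { "Access-Control-Allow-Origin": origin, Vary: "Origin" },
  };
}
```

Comparing the parsed hostname exactly, rather than using a prefix match, keeps look-alike hosts that merely start with "localhost" off the list.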

mcp server

The same six tools are also published over the Model Context Protocol so any MCP host (Claude Desktop, Claude Code, etc.) can attach.

transport

stdio — local Node binary at mcp-server/server.js.

Imports api/_knowledge.js directly. No duplication; tool definitions and search logic stay in sync.

resource

Exposes the curated knowledge base as ask-logan://knowledge for hosts that prefer reading over calling.

install

See the snippet on /now or the mcp-server/README.md.

stack

runtime

Vercel Edge functions · Node 18 stdlib · zero external dependencies in the API layer.

model

OpenAI gpt-4.1 for chat · text-embedding-3-small for vector · omni-moderation-latest for moderation.

state

Upstash Redis (rate limit) · Upstash Vector (patent search) · signed cookie (session id) · localStorage (client thread history).

contact

Resend transactional email · /api/contact endpoint tagged with provenance (chat / mcp / direct).

frontend

Vanilla HTML / CSS / JS · no build step · custom typed-event SSE parser · hand-rolled markdown renderer.
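
A typed-event SSE parser can be quite small. This sketch assumes `\n` line endings and JSON payloads on every `data:` line; the function name and callback shape are hypothetical.

```javascript
// Incremental parser for "event:" / "data:" frames separated by a blank line.
function createSseParser(onEvent) {
  let buffer = "";
  return function feed(chunk) {
    buffer += chunk;
    let end;
    // Process every complete frame; leave any partial frame in the buffer.
    while ((end = buffer.indexOf("\n\n")) !== -1) {
      const frame = buffer.slice(0, end);
      buffer = buffer.slice(end + 2);
      let type = "message"; // SSE default when no event: line is present
      const data = [];
      for (const line of frame.split("\n")) {
        if (line.startsWith("event:")) type = line.slice(6).trim();
        else if (line.startsWith("data:")) data.push(line.slice(5).replace(/^ /, ""));
      }
      if (data.length) onEvent({ type, data: JSON.parse(data.join("\n")) });
    }
  };
}
```

Buffering until a blank line arrives matters because network chunks rarely align with frame boundaries; a frame split across two reads is parsed only once it is complete.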

how charlie works · edge function, six tools, five defenses.