what share is
Share is Instagram for onchain assets. as Polymarket’s official partner, it’s the face of Polymarket onchain on mobile.
i own the discovery layer of that: the read path, rendering, search, explore, leaderboards, positions and notifications behind every sports-market screen in the app. it runs across two services i live in every day, a Go gateway and a Python service, talking to Postgres, Redis and external markets APIs.
owning a layer means i’m on the hook for what people feel, not just the code i merge. so i’ve organized this by what each piece actually moved. everything here is shipped and in production.
performance
this is my favorite part. when something is slow i don’t want a theory, i want to see it. so most of this is the same loop: measure, find where requests serialize or redo dumb work, remove it, prove it’s gone.
- prediction tags: >10s to ~11ms on a warm cache (~100x), and the upstream 429s gone. the cache was plain TTL, so every expiry was a hard miss that re-ran the whole multi-call fetch on whichever request was unlucky. i moved it to stale-while-revalidate: a soft TTL for freshness, a hard TTL for validity. stale reads return instantly and kick off one background refresh, shared across replicas with a Redis `SET NX EX` lock so nothing stampedes on expiry.
- explore / series / odds-chart: ~14s / 7s / 9s to sub-second. the killer was N+1 database queries for team colors on every row, plus a per-row DB hit building chart titles. batched the color lookups, served titles from cached team data.
- leaderboard: 10-16s to ~4.2s median (~4x). two N+1s, one per service, both found in traces. the Go side did one Redis `GET` per wallet, 50 round-trips for a 50-row board, so i made it one `MGET`. the Python side resolved proxy→EOA wallets one Postgres query at a time, so i made it one batch query. processing dropped to a few hundred ms. what’s left is an external API, not us.
- username search: timeout to sub-second. a B-tree index, after actually benchmarking B-tree vs GIN + `pg_trgm` for the query shape instead of guessing.
- found why FastAPI throughput fell apart under load. concurrent requests were serializing on the single asyncio event loop, because blocking HTTP and CPU-bound transforms were running right on it. at 100 concurrent searches the endpoint ran 8.3x slower than its own single-request time. the rule i set: I/O waits go async on the loop (httpx, HTTP/2, a wide pool), CPU work gets pushed to a thread pool. then i moved 40+ client methods to async and got the transforms off the loop. scaling went back to roughly linear.
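a minimal sketch of the stale-while-revalidate idea from the prediction-tags item, not the production code. everything named here (`SWRCache`, `Entry`, `fetch`, `spawn`) is hypothetical, and the in-process lock set is only a stand-in for the Redis `SET NX EX` lock that coordinates replicas in the real service.

```python
import threading
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, Set


def _spawn_thread(fn: Callable[[], None]) -> None:
    """Default background runner: one daemon thread per refresh."""
    threading.Thread(target=fn, daemon=True).start()


@dataclass
class Entry:
    value: Any
    stored_at: float


class SWRCache:
    """Soft TTL decides freshness, hard TTL decides validity."""

    def __init__(self, fetch: Callable[[str], Any], soft_ttl: float, hard_ttl: float,
                 clock: Callable[[], float] = time.monotonic,
                 spawn: Callable[[Callable[[], None]], None] = _spawn_thread) -> None:
        self._fetch = fetch
        self._soft_ttl = soft_ttl
        self._hard_ttl = hard_ttl
        self._clock = clock
        self._spawn = spawn
        self._entries: Dict[str, Entry] = {}
        self._locks: Set[str] = set()  # assumption: stands in for Redis lock keys
        self._mutex = threading.Lock()

    def _try_lock(self, key: str) -> bool:
        # Across replicas this would be a Redis SET with NX and EX; here it is local.
        with self._mutex:
            if key in self._locks:
                return False
            self._locks.add(key)
            return True

    def _store(self, key: str, value: Any) -> Any:
        with self._mutex:
            self._entries[key] = Entry(value, self._clock())
        return value

    def _refresh(self, key: str) -> None:
        try:
            self._store(key, self._fetch(key))
        finally:
            with self._mutex:
                self._locks.discard(key)

    def get(self, key: str) -> Any:
        now = self._clock()
        with self._mutex:
            entry = self._entries.get(key)
        if entry is None or now - entry.stored_at >= self._hard_ttl:
            # Nothing valid to serve: this caller pays for the fetch.
            return self._store(key, self._fetch(key))
        if now - entry.stored_at >= self._soft_ttl and self._try_lock(key):
            # Stale but still valid: start exactly one background refresh.
            self._spawn(lambda: self._refresh(key))
        return entry.value  # served immediately, fresh or stale
```

the point of the two TTLs is that only a read past the hard TTL ever waits on the upstream. everything between soft and hard returns the old value, and the lock keeps concurrent stale reads from each starting their own refresh.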
reliability
money-adjacent code has to be correct. these are the bugs that taught me to respect the details.
- a 7ms idempotency race in prod. duplicate posts and comments showing up milliseconds apart, the classic Redis GET-then-SET. replaced it with an atomic `SETNX` and a rollback. window closed.
- 87% of closed positions were silently dropped (48 → 182, 3.8x). two bugs across 365 positions i went through by hand: pagination stopped early because it checked the requested limit instead of the API’s hard 50-per-page cap, and the markets fetch never passed `closed=true` (it defaults to false), so every settled market vanished. the same fix cut `market_not_found` skips by 85% (240 → 35).
- proxy-wallet search was just broken. proxy and EOA addresses were stored swapped. fixed the resolution to go one direction only, gated on wallet type.
- killed a crash loop from memory growth. the service sat near 92% RAM and kept restarting, and there was no memory observability to tell a real leak from a plateau. added per-worker metrics (RSS, threads, file descriptors, GC by generation) and continuous profiling, then traced the growth to multi-worker preload fragmentation, not a leak. that pointed the fix at worker and process tuning instead of a phantom hunt.
- capped runaway concurrency. put a limit on errgroups that fan out over big arrays, so they stop spiking DB connections and CPU.
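a small sketch of the pagination bug from the closed-positions item, under stated assumptions: `fetch_all_positions`, `fake_fetch_page` and the row shape are hypothetical, and the fake API only imitates the two behaviours described above (a hard 50-per-page cap and a `closed` flag that defaults to false).

```python
from typing import Callable, Dict, List

API_PAGE_CAP = 50  # the upstream's hard per-page maximum, per the text above

Row = Dict[str, object]
FetchPage = Callable[[int, int, bool], List[Row]]  # (offset, limit, closed) -> rows

# Hypothetical data set: 182 positions, every other one settled.
_FAKE_ROWS: List[Row] = [{"id": i, "closed": i % 2 == 0} for i in range(182)]


def fake_fetch_page(offset: int, limit: int, closed: bool = False) -> List[Row]:
    """Stand-in for the external API: caps pages at 50, hides closed rows by default."""
    visible = [r for r in _FAKE_ROWS if closed or not r["closed"]]
    return visible[offset:offset + min(limit, API_PAGE_CAP)]


def fetch_all_positions(fetch_page: FetchPage, requested_limit: int = 100,
                        closed: bool = True) -> List[Row]:
    rows: List[Row] = []
    offset = 0
    while True:
        page = fetch_page(offset, requested_limit, closed)
        rows.extend(page)
        # The buggy check was `len(page) < requested_limit`, which is always true
        # once requested_limit exceeds the cap, so the loop ended after page one.
        if len(page) < min(requested_limit, API_PAGE_CAP):
            break
        offset += len(page)
    return rows
```

with the fake above, `fetch_all_positions(fake_fetch_page)` walks four pages and returns all 182 rows, while the buggy comparison would have stopped at the first 50. passing `closed=False` shows the second bug in isolation: the settled half of the rows never comes back.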
architecture
- rewrote the event and market core in Go. the prediction event and market reads used to proxy every request to the Python service. i moved that core into the Go gateway so it serves natively. the hard part is safety, so i built a three-mode rollout: proxy, shadow, native. shadow runs both at once on live traffic, diffs every field, emits a mismatch metric. you prove parity before you cut over. i did the core pieces and set the pattern, and the rest of the team is extending it.
- a backend-driven display contract for sports markets. all the per-market-type display logic used to live in the client. i moved it into a backend `displayTitle/displaySubtitle` contract with the category derived at read time, so fixing or adding a market type is a backend change, not an app release.
- re-architected error handling in the Python service. prod 500s were almost impossible to debug: across 200M log lines in a week, zero tracebacks. root cause was a guessing game. i replaced HTTP-coupled exceptions with a typed error model: a small set of kinds that map to status and log severity, a free-form code for the exact cause, structured context logged untruncated, one translation boundary, one `problem+json` shape out. now a 500 arrives carrying its own cause and trace id. hours of local repro became reading one line.
- wrote the team’s engineering standards for the service: typed domain ids, `Decimal` for money and odds, UTC everywhere, typed exceptions that are never swallowed, async that never blocks the loop.
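a rough sketch of what the typed error model in the error-handling item could look like. the kind names, the status and severity mapping, and the `AppError` / `to_problem` names are all assumptions for illustration, not the service's actual types. the response body follows the general problem-details shape (`type`, `title`, `status`, `detail`) with `code` and `trace_id` as extension members.

```python
import json
import logging
from enum import Enum
from typing import Any, Dict, Tuple

logger = logging.getLogger("errors")


class ErrorKind(Enum):
    NOT_FOUND = "not_found"
    INVALID = "invalid"
    UPSTREAM = "upstream"
    INTERNAL = "internal"


# Assumed policy: each kind maps to one HTTP status and one log severity.
KIND_POLICY: Dict[ErrorKind, Tuple[int, int]] = {
    ErrorKind.NOT_FOUND: (404, logging.INFO),
    ErrorKind.INVALID: (422, logging.WARNING),
    ErrorKind.UPSTREAM: (502, logging.ERROR),
    ErrorKind.INTERNAL: (500, logging.ERROR),
}


class AppError(Exception):
    """Domain error: a coarse kind, a free-form code, and structured context."""

    def __init__(self, kind: ErrorKind, code: str, message: str, **context: Any) -> None:
        super().__init__(message)
        self.kind = kind
        self.code = code
        self.message = message
        self.context = context


def to_problem(err: Exception, trace_id: str) -> Tuple[int, Dict[str, Any]]:
    """Single translation boundary: any exception in, one response shape out."""
    original = err
    if not isinstance(err, AppError):
        # Unknown exceptions become INTERNAL; the original is still logged below.
        err = AppError(ErrorKind.INTERNAL, "unhandled_exception", "unexpected error",
                       exception_type=type(original).__name__)
    status, severity = KIND_POLICY[err.kind]
    record = {"code": err.code, "kind": err.kind.value,
              "trace_id": trace_id, "context": err.context}
    # Context is logged whole; the traceback is attached for server-side failures.
    logger.log(severity, json.dumps(record, default=str),
               exc_info=original if status >= 500 else None)
    body = {"type": "about:blank", "title": err.kind.value, "status": status,
            "detail": err.message, "code": err.code, "trace_id": trace_id}
    return status, body
```

in this shape, handlers raise something like `AppError(ErrorKind.NOT_FOUND, "market_not_found", "no such market", market_id="m1")` and never touch HTTP. the framework's exception hook would call `to_problem` once and serialize the body as `application/problem+json`.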
product
i also ship a lot of the surface people actually touch.
- led the Spreads / Totals / Props launch. ~40 urgent tickets in 3 weeks, 10+ sports (tennis, cricket, soccer, UFC, esports, basketball, rugby), zero rollbacks. spread sorting, UFC rounds, odd/even grouping, player props with sport-specific verb and unit, tennis 1st-set split.
- led the FIFA World Cup 2026 backend. a dedicated page, group sub-tags and team maps, per-group pages, search injection so games show up, live timestamp and status fixes, and compatibility for older iOS clients gated on App Store version.
- built the finance / crypto explore. synthetic time-period tags, a dual-call event setup, 3rd-degree nested tag AND-filtering, a unified explore-tag config.
- built @-mentions of any user or token, end to end: prefix autocomplete, structured-text storage, the data model, bulk hydration, push and pull notifications.
- built the points / rewards (airdrop) system: on-chain and in-app activity aggregation, scoring, leaderboard, schema.
- built the prediction notifications pipeline. every prediction notification routes through it.
- wired up analytics and observability: Mixpanel events and Sentry, including in Lambda.
stack
Go and Python in prod. Postgres, Redis, DynamoDB for storage. AWS (ECS, Lambda). Polymarket, Gamma and Polynode on the outside. ~10 months, 280+ PRs across the two repos, but the lines above are the ones i’d actually point at.