Overview

UPI Ingestor is a personal finance automation pipeline built as a single Next.js application deployed on Vercel. It has no background workers or queues — the entire flow runs synchronously inside a single function invocation, triggered either by a daily cron job or a manual fetch from the dashboard.

Security: Every credential (Gmail OAuth tokens, Monarch password) is encrypted at rest using AES-256-GCM before being stored in Supabase. All tables enforce Row Level Security scoped to auth.uid() = user_id. Browser-facing routes use the publishable/anon key with SSR sessions; the cron job and pipeline use SUPABASE_SECRET_KEY via lib/supabase/admin.ts (server-only, never exposed to the client).

Pipeline Data Flow

Triggers → Core pipeline → Output
⏰ Vercel Cron + 🖥 Web UI Pipeline Gmail API
raw email bodies
Parser Registry Dedup check skip if exists
new transaction
INSERT transaction Categorizer
→ rules or mapping hit
Monarch Publisher Supabase
→ no category match
needs_review Telegram Bot
user picks · saves merchant mapping
Monarch Publisher Supabase
Trigger Core pipeline External service Data store

Pipeline Steps

1
Trigger

Vercel cron fires GET /api/cron/poll at 3:30 UTC daily (vercel.json), authenticated with CRON_SECRET. The handler lists all rows in gmail_connections using the secret-key admin client, then runs the pipeline per user. The UI can also trigger POST /api/gmail/fetch-now manually (session auth). Both call processUserTransactions(userId) in src/lib/pipeline.ts.

2
Gmail fetch — email-sources/gmail.ts

Loads the user's encrypted Gmail OAuth refresh token from gmail_connections, decrypts it, and calls the Gmail API to list messages matching the GMAIL_FETCH_LABEL (default: UPI) received within the last GMAIL_FETCH_DAYS_BACK days.

3
Parsing — parsers/hdfc.ts

Each email body is matched against the HDFC parser using regex. Extracts amount, merchantRaw, bankRefId, occurredAt, and a normalized merchant key. Parse errors are reported via Telegram if a bot is linked. The parser registry (parsers/index.ts) is additive — new bank parsers can be registered without changing the pipeline.

4
Dedup & persist

Each ParsedTransaction is checked against transactions by (user_id, bank_ref_id). Duplicates are skipped. New transactions are inserted with status: pending.

5
Categorization — categorizer/engine.ts

Pass 1 — Rules: user-defined rules in the rules table. Each rule has a match_type (regex / contains / equals), a target field (merchant / sender / body), and a pattern. Evaluated in ascending priority order; first match wins.

Pass 2 — Merchant mappings: the normalized merchant key is looked up in merchant_mappings. Mappings are learned when a user responds to a Telegram prompt. No match → human-in-the-loop flow.

6
Human-in-the-loop — Telegram Bot

Unknown transactions are marked needs_review. A Telegram inline keyboard is sent with the user's three most recently used categories plus a "Type new category" option. A row is inserted into pending_reviews (7-day TTL).

When the user taps a button, POST /api/telegram/webhook receives a callback_query with callback_data: cat:<txId>:<category>. The handler saves a new merchant_mapping and triggers republish.

7
Publish — publishers/monarch/

Decrypts the user's stored Monarch credential, calls their GraphQL createTransaction mutation, and updates the transaction row with status: published and the returned published_id. On failure, status: failed is set and the error is stored in raw_payload.publish_error for manual retry.

Module Reference

ModuleRoleKey exports
lib/pipeline.tsPipeline orchestratorprocessUserTransactions(userId)
lib/email-sources/gmail.tsGmail API client; fetches messages matching the UPI labelfetchGmailTransactions(userId)
lib/parsers/hdfc.tsRegex parser for HDFC UPI notification emailsparseHdfc(body)
lib/parsers/index.tsParser registry — tries all parsers; returns first matchparseEmail(body)
lib/categorizer/engine.tsTwo-pass categorization: rules → merchant mappings → unknowncategorizeTransaction(supabase, userId, tx)
lib/categorizer/normalize.tsMerchant name normalizationnormalizeMerchant(raw)
lib/merchant-mappings.tsCRUD helpers for the learned merchant→category mapping tablesaveMerchantMapping()
lib/publishers/monarch/Monarch GraphQL publisher; handles credential decryption and mutationpublisher.publish(userId, tx)
lib/telegram/client.tsTelegram Bot API wrapper for sending inline keyboard messagessendTelegramMessage(chatId, text, markup)
lib/crypto/encryption.tsAES-256-GCM encrypt/decrypt for credentials stored in Supabaseencrypt(plain) · decrypt(enc)
lib/supabase/admin.tsSecret-key Supabase client for cron and pipeline (bypasses RLS server-side)createAdminClient()

API Route Reference

Method + PathAuthPurpose
GET /api/cron/pollCRON_SECRET + secret keyVercel cron trigger — lists all gmail_connections, runs full pipeline per user
POST /api/gmail/fetch-nowsessionManual pipeline trigger from the dashboard
POST /api/telegram/webhookwebhook secretReceives Telegram callback queries; saves merchant mapping and republishes
GET /api/transactionssessionList transactions with optional ?status= filter
GET /api/transactions/:idsessionSingle transaction detail
POST /api/transactions/:id/republishsessionRetry a failed Monarch publish
GET/POST /api/rulessessionList / create categorization rules
DELETE /api/rules/:idsessionDelete a rule
GET /api/auth/googleInitiates Google OAuth flow
POST /api/connect/telegramsessionGenerates a Telegram link code
POST /api/connect/monarchsessionStores encrypted Monarch credential
GET /api/connect/monarch/categoriessessionFetches the user's Monarch category list

Database Schema

All tables live in the public schema on Supabase Postgres with Row Level Security. Every policy enforces auth.uid() = user_id.

transactions
iduuid PK
user_iduuid FK
bank_ref_idtextdedup key
amountnumeric(12,2)
merchant_rawtext
merchant_normalizedtext
occurred_attimestamptz
categorytext?
statustextpending · needs_review · published · failed
published_idtext?Monarch tx ID
raw_payloadjsonb
rules
match_typetextregex · contains · equals
fieldtextmerchant · sender · body
patterntext
categorytext
priorityintlower = higher priority
merchant_mappings
merchant_keytextnormalized merchant
categorytext
unique(user_id, merchant_key)
gmail_connections
user_iduuid PK
email_addresstext
refresh_token_encjsonbAES-GCM
monarch_connections
user_iduuid PK
emailtext
credential_encjsonbAES-GCM
default_account_idtext?
telegram_links
user_iduuid unique
chat_idtext?null until linked
link_codetext unique
pending_reviews
transaction_iduuid FKunique
telegram_message_idtext
expires_attimestamptz7-day TTL