Koomi Bot — What We Need to Build Phase 2

Phase 1 of the bot is live — it reads your KB and suggests replies in Periskope. Phase 2 will make it proactive: when a merchant reports an issue, the bot will have already pulled the relevant diagnostics so your CS team can respond immediately instead of digging through systems first.

To build this well, we need your help. This page explains what we are asking for, who we need it from, and how long each part takes.

What Phase 2 does for your team

Faster diagnostics

When a merchant says "printer not working", the bot has already pulled printer status, last error, and network state. Your CS agent responds with "I see it — try X" instead of "let me check and get back to you."

Catch silent failures

Some issues — overnight sync failures, delivery integration drops, mall FTP push failures — go unnoticed until they cause a bigger problem. The bot catches them early.
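One way such a check could work, as a minimal sketch only: compare each merchant's most recent delivery order against a quiet-period threshold. The function name, the merchant IDs, and the two-hour threshold are all hypothetical; the real signals depend on the answers to the technical questions further down.

```python
from datetime import datetime, timedelta

# Hypothetical threshold: how long a delivery integration may stay quiet
# before it is treated as a possible silent drop.
MAX_QUIET = timedelta(hours=2)

def find_quiet_integrations(last_order_at, now, max_quiet=MAX_QUIET):
    """Return merchant IDs whose last delivery order is older than max_quiet."""
    return sorted(
        merchant_id
        for merchant_id, seen_at in last_order_at.items()
        if now - seen_at > max_quiet
    )

# Illustrative data only (not real merchants).
now = datetime(2024, 1, 1, 13, 0)
last_order_at = {
    "merchant-a": datetime(2024, 1, 1, 12, 45),
    "merchant-b": datetime(2024, 1, 1, 9, 30),
}
print(find_quiet_integrations(last_order_at, now))  # ['merchant-b']
```

A production version would also need to account for opening hours and normal order volume, so that a quiet afternoon is not mistaken for a failure.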

Spot platform-wide patterns

If the same failure is hitting multiple merchants at once, the bot flags it as a platform issue instead of leaving CS to piece it together from separate tickets.
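The grouping idea can be sketched as follows: count how many distinct merchants report the same failure type, and flag the type once it crosses a threshold. The event format, the failure names, and the threshold of three merchants are assumptions for illustration, not a description of how the bot is built.

```python
from collections import defaultdict

# Hypothetical threshold: distinct merchants sharing a failure before it
# is treated as a platform issue rather than separate tickets.
MIN_MERCHANTS = 3

def find_platform_issues(events, min_merchants=MIN_MERCHANTS):
    """Return failure types reported by at least min_merchants distinct merchants."""
    merchants_by_failure = defaultdict(set)
    for merchant_id, failure_type in events:
        merchants_by_failure[failure_type].add(merchant_id)
    return sorted(
        failure
        for failure, merchants in merchants_by_failure.items()
        if len(merchants) >= min_merchants
    )

# Illustrative events only: (merchant ID, failure type).
events = [
    ("merchant-a", "delivery_integration_drop"),
    ("merchant-b", "delivery_integration_drop"),
    ("merchant-c", "delivery_integration_drop"),
    ("merchant-a", "printer_offline"),
    ("merchant-a", "printer_offline"),
]
print(find_platform_issues(events))  # ['delivery_integration_drop']
```

Repeated reports from one merchant count once, so a single noisy outlet does not trigger a platform flag. A real version would also restrict the events to a recent time window.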

At a glance — what we need

Each item in this table is covered in detail in the sections that follow. Use the table as a quick overview to plan who does what.

What	Why	Who / time
Screenshare with a CS agent	Watch the real support workflow so the bot mirrors how your team actually works.	CS, 60-90 min
Written Q&A (below)	Captures escalation paths, KB gaps, and success criteria that a single session can't cover.	CS, ~30 min
Architecture walkthrough	Understand where merchant data lives and what signals are available, in one conversation.	Eng lead, 60-90 min
Tech Q&A (below)	Quick written answers on monitoring, infrastructure, and system access.	Eng lead, ~25 min
Staging merchant access	Let us prototype detection safely without touching a real business.	Eng, one-time setup
Sample logs or event data	Tells us concretely what signals are available to work with.	Eng, one-time export

Screenshare session

Who we need

One CS agent, during a normal-volume window. No preparation needed — we just want to watch you work.

60-90 min. Async screen-recording also works if scheduling is hard.

Walk us through 3-5 recent merchant questions you've handled. For each one: did you already know the answer, search Notion, ask a colleague, or escalate?
We will give you 2 hypothetical questions on the spot — show us how you would search Notion for them, and tell us what is frustrating about it.
Walk through 2-3 recent "let me get back to you" cases. Open the actual tools and click through what you would click.
Show us every tool and tab you have open right now, and what each one is for.
If a multi-branch merchant messaged right now, show us how you figure out which branch is asking.
Open the Notion KB and point at sections you trust vs. sections you avoid — and why.

Written Q&A for CS

Who we need

Any CS agent(s). Where different people have different answers, capture both. Write "skip" for anything that does not apply.

~30 min total, splittable across the team.

Frequent work

Escalation

04 When you can't resolve something, how do you reach the engineering team? (Slack channel, tag, ticket, in person?) Is there a target response time?

05 How do you decide "this is an engineering bug" vs. "this is something I can configure myself"? Has that call ever been wrong in hindsight?

06 Where do engineering-bound issues get tracked — Linear, Jira, GitHub, a chat channel? Could we see one or two examples?

07 After engineering resolves a bug or ships a feature, how do you find out so you can answer related questions correctly?

KB gaps

Merchants and branches

09 Can you confirm that one Periskope group chat = one merchant, with one or more branches in the chat? Are there any exceptions?

10 Within a single merchant, can branches differ in setup (hardware, menu, payment, integrations)? Does the answer to a question ever vary by branch?

Tools and internal channels

12 Is there an internal Koomi group chat (CS-only, CS + engineering, or all-hands)? Could we get read access or an export?

A lot of tribal knowledge tends to live in chat rather than docs — this could be one of the most useful things for improving the bot.

13 Freshdesk — how often is it actually used, for what kinds of questions, and how does the workflow differ from Periskope?

KB ownership

14 Who on Koomi's side updates the Notion KB? How often does it actually get updated in practice?

15 When the product changes (new feature, bug fix, config change), what is the process for updating the KB?

16 We've reformatted your KB into structured markdown. Should this become the source of truth going forward, or should updates flow back to Notion?

What does success look like?

17 What does "the bot is working" look like to you in 4 weeks? Fewer pings to colleagues? Faster replies? Certain question types fully handled by the bot?

Technical questions

Who we need

Someone with a whole-system view — engineering lead, CTO, or technical co-founder. This is not a deep engineering questionnaire; that comes later. We just need to understand what exists at a system level.

~25 min for written answers. We will also request a 60-90 min walkthrough call afterward.

How does Koomi currently learn about merchant issues?

If you already have monitoring or alerting, the bot plugs into it. If not, we may need to help build signals — so the answer here directly affects scope.

01 When a thermal printer at a merchant's outlet goes offline, what is the first thing in your system that knows? An alert? A log entry on the POS box? Nothing — only when the merchant calls?

02 Same question for: a POS box going offline, a payment terminal disconnecting, a NoviSync failing, a GrabFood / Foodpanda integration dropping.

03 Which kinds of issues tend to go unnoticed by merchants because there is no obvious symptom?

For example: overnight sync failures that only surface at next-day open, delivery integration drops where orders just stop coming in, mall FTP push failures. What else fits this pattern, and how do those usually get discovered today?

04 When an engineer picks up an escalated issue, what tools or data do they reach for first (logs, dashboards, SSH, per-merchant DB)? Roughly how long until they have a likely cause?

Existing tools and dashboards

05 Does engineering have any internal dashboard or monitoring tool showing merchant health today? Even a half-finished one, or "a SQL query someone runs every morning", counts.

06 If yes — who built it, who maintains it, and what does it show? A screenshot would be helpful.

07 Is there a centralised data store that aggregates anything across merchants (logs, events, errors, sales)? Or is each merchant's data fully separate?

08 Are there any cases where engineering finds out about a merchant issue before CS escalates it — an automated alert, a monitoring trigger, anything? Or is the path always merchant -> CS -> engineering?

Architecture and ownership

09 Who owns the infrastructure that runs the per-merchant backends and the local POS boxes? If a backend server goes down at 2am, whose phone rings?

10 Is there a hardware or field-ops team that visits outlets for installs and repairs? How do they coordinate with engineering?

11 Are merchant systems reachable from Koomi's central network — directly, via VPN, via webhooks, or not at all?

This question applies to both the per-merchant backends and the local POS boxes at outlets.

12 When a POS box at an outlet has new data (a sale, an error, a status change), how does that data get back to Koomi central — push, pull, scheduled batch, or not at all?

13 Is there a single environment that holds all merchants' data centrally, or is each merchant a separate VM / database / island?

This is the single most important question for us — the answer determines a large part of the bot's design.

14 Are any of the systems we would potentially read from — per-merchant backends, the POS stack, the delivery middleware — currently being replaced or restructured?

We are not asking about strategy or timelines. We just want to know what is stable enough to build against.

Next steps

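Once we have the written answers from both CS and engineering, we will schedule the two live sessions: the screenshare with a CS agent and the architecture walkthrough with the engineering lead.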

After those two sessions, we will have enough context to scope Phase 2 properly and share a concrete plan with timelines.

If anything in this document is unclear or if a question does not apply, just flag it and we will sort it out. The goal is to get the right information without wasting anyone's time.