How We Built a ChatGPT App to Replace Our Admin Dashboard
Sam Okpara
March 2026
Every admin dashboard follows the same pattern: tables of data, a handful of filters, maybe a chart or two. Click a row to see details. Click edit to change a field. Navigate to another page to see a different table. It works. Nobody loves it.
The core problem is that dashboards force users to think in terms of the interface -- which table has the data, which filters to apply, which page to navigate to. What users actually want to do is ask questions and take actions: "show me the reviews submitted today that have not been delivered" or "what is this reviewer's average rating?" Traditional dashboards require translating those intentions into clicks and navigation. An LLM can handle the translation instead.
This is what ChatGPT Apps enable: an MCP server that gives ChatGPT access to your data and actions, React widgets that render rich UI inline in the conversation, and the model's reasoning to handle the translation between human intent and system operations.
We built one for ReviewMyHinge to replace the admin dashboard. Here is how it works, what the architecture looks like, and where the rough edges still are.
What ChatGPT Apps are
A ChatGPT App is a bundle of three things:
An MCP (Model Context Protocol) server that exposes tools -- functions the model can call to read data, write data, or trigger actions. Think of it as an API that ChatGPT can use conversationally.
React widgets that render inside the ChatGPT conversation. Instead of returning raw JSON when a user asks for "today's reviews," the tool returns a formatted table component with sorting, pagination, and action buttons. The model decides when to render a widget vs. when a text response is sufficient.
ChatGPT's reasoning layer that connects the two. The model interprets natural language queries, selects the appropriate tools, calls them with the right parameters, and presents the results using the appropriate widget.
The result is an admin interface where you type what you want in plain language and get structured, interactive responses back.

Architecture overview
The MCP server runs on Cloudflare Workers and communicates with ChatGPT's infrastructure via server-sent events (SSE). The major components:
20 MCP tools cover the full range of admin operations:
- User management: lookup users, view profiles, check subscription status, manage roles
- Review lifecycle: list reviews by status, view review details, reassign reviewers, issue refunds
- Analytics: revenue summaries, reviewer performance metrics, platform health stats
- Content moderation: flag reviews, suspend users, manage reported content
Each tool is a function with a typed schema that describes its parameters and return type. ChatGPT sees these schemas and knows how to call the right tool for a given query.
8 React widgets provide rich rendering:
- Data tables with sorting and filtering
- User profile cards with action buttons
- Review detail views with media display
- Analytics charts and metric summaries
- Confirmation dialogs for destructive actions
Widgets communicate with the ChatGPT host via the window.openai API, which provides methods for sending messages back to the conversation, requesting confirmations, and updating widget state.
Cloudflare D1 stores application data, R2 handles media storage, and KV provides session caching.
How MCP tools work in practice
An MCP tool definition looks like a function signature with metadata. Here is a simplified example of the pattern:
A tool named getReviewsByStatus takes a status parameter (pending, delivered, disputed) and a date range, queries D1, and returns structured data. The model sees the tool's description and parameter schema, figures out which tool to call based on the user's natural language query, and passes the results to a widget for rendering.
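In code, that definition might look like the following sketch: a plain object with a name, a natural-language description the model reads, and a JSON Schema for parameters. The `ToolDefinition` type and field names here are illustrative; the exact registration API depends on your MCP SDK.

```typescript
// Illustrative shape of an MCP tool definition. The model sees the
// description and inputSchema and decides when and how to call the tool.
// (Hypothetical type; real MCP SDKs expose their own registration API.)
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<
      string,
      { type: string; enum?: string[]; description?: string }
    >;
    required: string[];
  };
}

const getReviewsByStatus: ToolDefinition = {
  name: "getReviewsByStatus",
  description:
    "List reviews filtered by lifecycle status and an optional date range.",
  inputSchema: {
    type: "object",
    properties: {
      status: {
        type: "string",
        enum: ["pending", "delivered", "disputed"],
        description: "Review lifecycle status to filter by.",
      },
      from: { type: "string", description: "ISO 8601 start date (inclusive)." },
      to: { type: "string", description: "ISO 8601 end date (inclusive)." },
    },
    required: ["status"],
  },
};
```

The description strings matter as much as the types: they are the only documentation the model gets, so they should read like instructions to a careful colleague.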
The key design decision: tools should be granular and composable. A single "do everything" tool is harder for the model to use correctly than a set of focused tools that each do one thing well. When a user asks "show me this week's revenue from premium reviews," the model chains a date-range tool with a revenue-by-tier tool. Composition works better than monoliths.
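To make the composition concrete, here is a sketch of two focused tools of the kind the model can chain for that revenue query. The data shapes and helper names are illustrative, not our production code:

```typescript
// Hypothetical sketch: two small tools the model chains, rather than one
// monolithic "getRevenue" tool that takes every possible option.
type Review = {
  tier: "basic" | "premium";
  priceCents: number;
  createdAt: string; // ISO date, e.g. "2026-03-02"
};

// Tool 1: resolve a relative phrase like "this week" into a concrete range.
function resolveDateRange(now: Date): { from: string; to: string } {
  const start = new Date(now);
  start.setDate(now.getDate() - now.getDay()); // back to Sunday
  return {
    from: start.toISOString().slice(0, 10),
    to: now.toISOString().slice(0, 10),
  };
}

// Tool 2: sum revenue for one tier inside a concrete range.
function revenueByTier(
  reviews: Review[],
  tier: Review["tier"],
  from: string,
  to: string
): number {
  return reviews
    .filter((r) => r.tier === tier && r.createdAt >= from && r.createdAt <= to)
    .reduce((sum, r) => sum + r.priceCents, 0);
}
```

Neither tool knows about the other; the model supplies the glue, passing the resolved range from the first call into the second.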
State management across tool calls uses a session context pattern. Each conversation maintains a context object in KV that tracks the current user being discussed, the active time range, and any filters that have been applied. Tools read from and write to this context, which means follow-up queries ("now show me just the disputed ones") work without the user re-specifying everything.
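The context pattern can be sketched as a small read-merge-write layer. KV is mocked with a `Map` here so the sketch is self-contained; in Workers it would be a KV namespace binding, and the field names are illustrative:

```typescript
// Sketch of the session-context pattern: each conversation keeps a small
// context object so follow-up queries can omit already-established filters.
interface SessionContext {
  userId?: string;
  range?: { from: string; to: string };
  status?: string;
}

const kv = new Map<string, string>(); // stand-in for a Cloudflare KV binding

async function getContext(conversationId: string): Promise<SessionContext> {
  const raw = kv.get(`ctx:${conversationId}`);
  return raw ? (JSON.parse(raw) as SessionContext) : {};
}

async function updateContext(
  conversationId: string,
  patch: SessionContext
): Promise<SessionContext> {
  // Shallow-merge the patch over whatever the conversation already knows.
  const merged = { ...(await getContext(conversationId)), ...patch };
  kv.set(`ctx:${conversationId}`, JSON.stringify(merged));
  return merged;
}
```

A follow-up like "now show me just the disputed ones" becomes `updateContext(id, { status: "disputed" })`, with the earlier date range carried forward untouched.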
Widget communication patterns
Widgets are React components that render inside ChatGPT's UI. They communicate with the host environment through window.openai, which provides:
- window.openai.sendMessage() -- send a message back to the conversation (e.g., when a user clicks "refund this review" in a widget, it triggers a new tool call)
- window.openai.updateWidget() -- update the widget's state without a full re-render
- window.openai.requestConfirmation() -- prompt the user before destructive actions
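A widget-side action handler built on those methods might look like this sketch. The bridge is passed in as a typed parameter rather than read off `window` so the logic stays testable; the `HostBridge` interface mirrors the methods above but is not the platform's exact type surface, and `onRefundClick` is illustrative:

```typescript
// Illustrative widget-side handler for a destructive action.
// HostBridge is a hypothetical stand-in for the window.openai surface.
interface HostBridge {
  sendMessage(text: string): void;
  updateWidget(state: Record<string, unknown>): void;
  requestConfirmation(prompt: string): Promise<boolean>;
}

// Called when the user clicks "Refund" on a review card. Confirms first,
// marks the row pending, then hands the action back to the conversation,
// which triggers a new tool call on the server.
async function onRefundClick(
  bridge: HostBridge,
  reviewId: string
): Promise<void> {
  const ok = await bridge.requestConfirmation(`Refund review ${reviewId}?`);
  if (!ok) return;
  bridge.updateWidget({ [`refund:${reviewId}`]: "pending" });
  bridge.sendMessage(`Refund review ${reviewId}`);
}
```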
The pattern that works best: widgets are display-heavy and action-light. They render data beautifully and expose a small number of action buttons. Complex multi-step workflows still happen through conversation, not through widget UI. Trying to build a full CRUD interface inside a widget defeats the purpose -- at that point, you are just rebuilding the dashboard inside ChatGPT.
The dual authentication pattern
Authentication was the hardest architectural problem. The MCP server needs to verify two things: that the ChatGPT user is who they claim to be, and that they have the right permissions for the requested operation.
The solution uses a dual JWT plus database verification approach:
Step 1: OTP login. The user authenticates via a one-time password sent to their registered email. This generates a JWT containing their user ID and role.
Step 2: JWT validation on every tool call. The MCP server validates the JWT on each request. Token expiry is short (1 hour) to limit the blast radius of a leaked token.
Step 3: Database permission check. Even with a valid JWT, each tool call checks the user's current permissions against the database. This catches cases where a user's role has been changed since the token was issued.
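Steps 2 and 3 can be sketched as a single per-call check. Token parsing and signature verification are assumed done upstream, and the role lookup is mocked with a `Map` standing in for a D1 query; the `authorize` function and its types are illustrative:

```typescript
// Sketch of the per-call authorization check: reject expired tokens, then
// re-check the user's *current* role in the database rather than trusting
// the role baked into the JWT at issue time.
interface TokenClaims {
  sub: string; // user id
  role: string; // role at issue time (informational only)
  exp: number; // expiry, unix seconds
}

const currentRoles = new Map<string, string>(); // stand-in for a D1 lookup

function authorize(
  claims: TokenClaims,
  requiredRole: string,
  nowSeconds: number
): boolean {
  if (claims.exp <= nowSeconds) return false; // short-lived token expired
  const dbRole = currentRoles.get(claims.sub); // live permission, not the JWT's
  return dbRole === requiredRole; // catches roles revoked since issuance
}
```

The important detail is that `claims.role` is never consulted for the decision: a token issued to an admin stops working the moment the database says they are no longer one.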
This is more conservative than most internal tools require, but the app has access to user data and financial operations (refunds). The belt-and-suspenders approach is warranted.
An honest caveat: OAuth with OTP in the ChatGPT App context is not as smooth as it could be. The redirect flow between ChatGPT and the auth endpoint occasionally loses state on mobile. It works reliably on desktop. This is an area where the ChatGPT Apps platform is still maturing.
What works well
Natural language is a genuinely better interface for admin tasks. "Show me all reviews submitted yesterday that have not been delivered" is faster than navigating to a reviews page, setting a date filter, selecting a status filter, and scanning results. For ad-hoc queries and investigations, the conversational interface is dramatically faster.
The model handles ambiguity well. "Check on this user" with a name is sufficient -- the model calls the user lookup tool, resolves the match, and presents the profile. No need to know the user's ID or navigate to the right page.
Composability scales. Twenty tools sound like a lot, but the model navigates them effectively. Complex queries that would require three or four dashboard pages resolve in a single conversational turn.
Widgets make the output scannable. Raw JSON responses would be unusable. Rendering structured data in tables and cards with action buttons makes the conversational interface feel like a proper admin tool rather than a debugging console.
What is still rough
Latency. Tool calls add round-trip time. A query that would take 200ms in a traditional dashboard takes 2-3 seconds through the MCP pipeline (model reasoning plus tool execution plus widget rendering). For most admin tasks this is acceptable. For time-sensitive operations, it is noticeable.
Model errors. The model occasionally calls the wrong tool or misinterprets a query. This happens maybe 5% of the time and is easily corrected by rephrasing. But it means the interface requires a tolerance for imperfection that a traditional dashboard does not.
Widget limitations. The ChatGPT Apps widget API is still evolving. Some interactions that would be trivial in a standalone React app (drag-and-drop, complex modals, real-time updates) are not yet supported or are cumbersome to implement.
Auditability. Traditional dashboards produce clear audit logs: user X clicked button Y at time Z. Conversational interfaces are harder to audit because the same outcome can be reached through different natural language paths. We log every tool call with full parameters, but the audit trail is less structured than a traditional UI would produce.
When this pattern makes sense (and when it does not)
A ChatGPT App works well as an admin interface when:
- The admin workflow is primarily investigative (looking things up, checking status, running ad-hoc queries)
- The operation set is well-defined and mappable to discrete tools
- The user base is small (internal team) and tolerant of occasional model errors
- Speed of individual operations is less important than flexibility of querying
It does not work well when:
- The workflow is highly repetitive and benefits from muscle memory (bulk data entry, rapid-fire moderation queues)
- Latency requirements are tight (sub-second response times)
- Audit compliance requires deterministic, reproducible interaction logs
- The user base is non-technical and expects zero ambiguity in the interface
The takeaway
ChatGPT Apps represent a genuine shift in how internal tools can work. The combination of natural language querying, programmatic tool access, and rich widget rendering is more than a novelty -- it is a better interface for a specific class of admin work.
The technology is early. The platform has rough edges. The latency is noticeable. But for investigative admin workflows where flexibility matters more than speed, replacing a dashboard with a conversational interface backed by MCP tools is a meaningful upgrade.
The code runs on Cloudflare Workers, which means deployment is global, cold starts are minimal, and the infrastructure cost is near zero at admin-tool scale. For teams already on Cloudflare, the marginal effort to build an MCP server is surprisingly low.
If you are building internal tools and curious about conversational interfaces, the MCP server pattern is worth exploring. Start with read-only tools, add write operations once the model's accuracy on your specific domain is established, and be honest about where the conversational interface adds value versus where a traditional UI is still better.
We built this for ReviewMyHinge. If you are interested in how ChatGPT Apps or MCP-based tooling could work for your operations, Paramint builds this kind of thing.
Need help building something like this?
At Paramint, we build production AI systems, custom software, and internal tools for growth-stage startups, enterprises, and government agencies. We focus on solutions that deliver measurable impact — not just demos.
Get in touch