The AI Implementation Checklist: 12 Questions to Ask Before You Build Anything

A practical pre-build checklist for AI projects: quantify the business problem, assign accountability, define workflows, and map failure modes so pilots can become production systems.

A hand interacts with a futuristic “Automation” dashboard—an apt reminder that successful AI deployment requires clear ownership, controls, and failure planning before anything is built.

Run through these twelve questions before your next AI project. If you can't answer number five, stop building.

I've reviewed dozens of AI implementations—successful and failed—and the failures share a common pattern. They weren't technical failures. The models worked. The integrations functioned. The demos impressed stakeholders.

They failed because nobody asked the hard questions before building started.

The checklist that follows isn't about AI capability. It's about AI deployability. These are the questions that determine whether your pilot becomes a production system or joins the graveyard of promising proofs of concept.

Question One: What Is the Expensive Problem This Solves?

Quantify the problem in dollars or hours, not adjectives. "Inefficient" isn't an answer. "Costs $2.1 million annually in labor" is an answer. "Frustrating for users" isn't an answer. "Causes 340 hours of rework per month" is an answer.

If you can't quantify the problem, you can't calculate the return on solving it. And if you can't calculate the return, you can't make a rational case for investment—either to get the project funded or to justify scaling it later.

The Driver Tree Method I outlined previously provides a systematic approach to this quantification. But even without a full driver tree analysis, you need a number. What is this problem costing the organization today? Write it down. If you're guessing, acknowledge that and commit to validating the estimate before significant resources are deployed.

Projects launched without quantified problem statements almost never scale. There's no foundation for the business case.

Question Two: Who Is the Human Accountable for Outputs?

Every AI system needs a human owner—not a technical owner responsible for the code, but a business owner accountable for the outcomes the system produces.

This person reviews escalations. They make judgment calls on edge cases. They answer when something goes wrong. They're the one explaining to leadership why the AI made a particular decision.

If you can't name this person, your AI system has an accountability gap. In regulated industries, this gap is a compliance failure. In any industry, it's a deployment failure—when problems arise and nobody owns them, trust erodes and the system gets abandoned.

The human owner shouldn't be someone in IT or data science. They should be someone in the business function the AI serves. If you're building an AI system for customer service, the owner should be a customer service leader. If it's for financial analysis, the owner should be in finance. This ensures accountability sits where the consequences land.

Question Three: What Does the Approval Workflow Look Like?

Before an AI output reaches a customer, triggers a transaction, or influences a decision, what happens? Map the workflow explicitly.

For some systems, every output requires human review before action. For others, only outputs below a certain confidence threshold get reviewed. For still others, outputs are acted upon automatically but audited afterward. Each model has different risk profiles and resource requirements.

The approval workflow also determines your operational capacity. If every output requires review, your throughput is constrained by reviewer bandwidth. If you're expecting a thousand AI outputs daily and each review takes two minutes, you need thirty-plus hours of review capacity per day. Have you staffed for that?
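To make that arithmetic concrete, here's a back-of-the-envelope sketch. The volumes, review fraction, and productive hours per reviewer are illustrative assumptions, not benchmarks; substitute your own numbers.

```python
# Back-of-the-envelope reviewer capacity check (illustrative numbers only).
daily_outputs = 1000       # expected AI outputs per day
review_fraction = 1.0      # share of outputs that require human review
minutes_per_review = 2     # average review time per output
productive_hours = 6       # review hours one person can realistically do per day

review_hours_per_day = daily_outputs * review_fraction * minutes_per_review / 60
reviewers_needed = review_hours_per_day / productive_hours

print(f"{review_hours_per_day:.1f} review hours/day -> ~{reviewers_needed:.1f} reviewers")
# With these numbers: 33.3 review hours/day -> ~5.6 reviewers
```

If the reviewer count that falls out of this calculation surprises you, that's the point: better to be surprised before launch than after.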

Organizations that skip this question discover the constraint after launch—either the system creates unacceptable risk because nothing is reviewed, or it creates an operational bottleneck because everything is reviewed and nobody planned for the volume.

Question Four: How Will We Measure Success?

Define your success metrics before you build, not after. You need both leading indicators to track during implementation and lagging indicators to evaluate ultimate impact.

Leading indicators tell you whether the system is working as designed: accuracy rates, processing times, confidence distributions, exception rates, user adoption. These should be monitored from day one of deployment.

Lagging indicators tell you whether the system is delivering business value: cost reduction achieved, revenue impact, customer satisfaction changes, error rate improvements in downstream processes. These take longer to materialize but ultimately determine whether the project was worthwhile.

Be specific. "Improved efficiency" isn't a metric. "40% reduction in average processing time from 47 minutes to 28 minutes" is a metric. Write down the current baseline, the target, and how you'll measure actuals. If you don't have a current baseline, measuring it is your first implementation task.

Question Five: What's the Failure Mode, and What's the Blast Radius?

This is the question that stops projects cold—and should.

Every AI system will fail. Models hallucinate. Edge cases appear that weren't in training data. Integrations break. The question isn't whether failure will happen but what happens when it does.

Failure mode analysis asks: how will this system fail? Will it produce confidently wrong outputs? Will it fail silently, producing no output when one was expected? Will it degrade gradually as data distributions shift, or fail catastrophically when encountering novel inputs?

Blast radius analysis asks: when it fails, what's the impact? A failed internal summarization tool wastes employee time—annoying but contained. A failed customer-facing system damages relationships and reputation. A failed system in a regulated process creates compliance exposure. A failed system controlling financial transactions creates direct monetary loss.

Map the failure modes. Quantify the blast radius for each. Then design mitigations: human review gates, confidence thresholds, automatic fallbacks, circuit breakers that halt processing when anomalies are detected.
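To illustrate, here's a minimal sketch of two of those mitigations working together: a confidence gate that routes uncertain outputs to a human, and a circuit breaker that halts automatic processing when too many recent outputs look uncertain. The threshold, window size, and routing labels are assumptions for the example, not recommendations.

```python
# Sketch of a confidence gate plus a simple circuit breaker (illustrative values).
from collections import deque

CONFIDENCE_THRESHOLD = 0.85   # below this, a human reviews the output
WINDOW = 50                   # number of recent outputs to watch
MAX_LOW_CONFIDENCE = 10       # trip the breaker if this many in the window are low

recent_low_confidence = deque(maxlen=WINDOW)

def route_output(confidence: float) -> str:
    recent_low_confidence.append(confidence < CONFIDENCE_THRESHOLD)
    if sum(recent_low_confidence) >= MAX_LOW_CONFIDENCE:
        return "halt"          # circuit breaker: stop automatic processing, alert the owner
    if confidence < CONFIDENCE_THRESHOLD:
        return "human_review"  # confidence gate: send to a reviewer queue
    return "auto_approve"      # act automatically, log for later audit
```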

If you can't answer this question, you don't understand your system well enough to deploy it. Stop and do the analysis.

Question Six: Is the Underlying Process Documented and Stable?

AI systems learn from and operate within existing processes. If those processes aren't documented, the AI team will make assumptions—and assumptions become embedded in the system in ways that are hard to change later.

If those processes aren't stable—if they're actively being redesigned or vary significantly across teams—the AI system is aimed at a moving target. It will be optimized for a process that no longer exists by the time deployment happens.

Before building, ensure the target process is documented clearly enough that someone unfamiliar with it could understand the steps, decision points, and exceptions. Ensure it's stable enough that it won't change significantly during your implementation timeline.

If the process needs improvement, improve it first. Automating a broken process gives you faster broken outputs.

Question Seven: What Data Do We Need, and Do We Have It?

AI systems require data—for training, for context, for operation. Be specific about what data you need and honest about whether you have it.

Training data: what historical examples will teach the system to perform the task? Do you have enough volume? Is it representative of the cases the system will encounter in production? Is it labeled or structured in a usable way?

Context data: what information does the system need at runtime to make good decisions? Customer history, product specifications, policy documents, previous interactions? Where does this data live, and can the system access it reliably?

Operational data: what inputs will flow into the system during normal operation? What format are they in? How clean and consistent are they? What happens when inputs are malformed or missing?

Data problems are the most common source of AI project delays. Teams assume data exists and is accessible, then discover it's trapped in legacy systems, scattered across spreadsheets, or formatted inconsistently. Validate data availability early.
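A lightweight availability audit often surfaces these problems before anything is built. The sketch below checks a sample export for required fields and completeness; the field names and threshold are placeholders for whatever your system actually depends on.

```python
# Sketch of an early data-availability audit (field names and threshold are placeholders).
import csv

REQUIRED_FIELDS = ["customer_id", "request_text", "resolution", "timestamp"]
MIN_COMPLETENESS = 0.95       # minimum share of rows with the field populated

def audit_export(path: str) -> None:
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    if not rows:
        print("export is empty")
        return
    for field in REQUIRED_FIELDS:
        if field not in rows[0]:
            print(f"{field}: MISSING from export")
            continue
        filled = sum(1 for row in rows if (row.get(field) or "").strip())
        share = filled / len(rows)
        flag = "" if share >= MIN_COMPLETENESS else "  <- below threshold"
        print(f"{field}: {share:.1%} populated{flag}")
```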

Question Eight: Who Maintains This After Launch?

AI systems aren't fire-and-forget deployments. They require ongoing maintenance: monitoring performance, retraining on new data, handling edge cases, updating as business requirements change, and fixing issues as they emerge.

Who does this work? It shouldn't be the implementation team—they'll move on to other projects. It shouldn't be "nobody"—the system will degrade. It needs to be someone whose job description includes maintaining this system.

This question often reveals a gap. Organizations build AI systems but don't create roles or allocate capacity for maintaining them. The result is predictable: performance degrades over time, issues go unaddressed, and eventually the system is abandoned or becomes actively harmful.

Before you build, identify the maintenance owner and confirm they have capacity. If this role doesn't exist, creating it is a prerequisite for the project.

Question Nine: What's the Rollback Plan?

When something goes wrong—not if—how do you return to the pre-AI state?

Some systems have natural rollback paths. If AI is assisting humans who can perform the task manually, you can simply turn off the AI and absorb the productivity hit. Other systems don't. If you've eliminated the manual capability, if downstream systems now depend on AI outputs, if you've changed processes in ways that can't be easily reversed, rollback becomes expensive or impossible.
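At its simplest, a rollback path is a kill switch checked on every request, with an explicit fallback to the manual path. The sketch below assumes a feature flag read from an environment variable and placeholder handlers for the AI and manual paths; your stack will provide its own equivalents.

```python
# Sketch of a kill switch with a fallback to the pre-AI manual path.
# The flag source and the two handlers are placeholders for whatever your stack uses.
import os

def route_to_manual_queue(request: dict) -> dict:
    ...  # the pre-AI path: it must still exist, and someone must be staffed to work it

def ai_pipeline(request: dict) -> dict:
    ...  # the AI path

def ai_enabled() -> bool:
    # In practice, read a feature-flag service; an environment variable keeps the sketch simple.
    return os.environ.get("AI_ASSIST_ENABLED", "true").lower() == "true"

def handle_request(request: dict) -> dict:
    if not ai_enabled():
        return route_to_manual_queue(request)   # full rollback: flip the flag, nothing else changes
    try:
        return ai_pipeline(request)
    except Exception:
        return route_to_manual_queue(request)   # per-request fallback when the AI path errors
```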

Understand your rollback options before deployment. Test them. Ensure the organization can continue operating if the AI system needs to be taken offline for hours, days, or permanently.

The best rollback plans are never used. But the projects that don't have them eventually wish they did.

Question Ten: How Do We Handle Edge Cases the AI Can't Process?

Every AI system has boundaries—inputs it wasn't trained for, situations it can't handle confidently, requests that fall outside its scope. What happens when these edge cases appear?

The answer shouldn't be "the AI tries anyway"—that's how you get hallucinated outputs and damaged trust. The answer should be an explicit exception-handling workflow.

Define how edge cases are identified: confidence scores, input validation, explicit scope checks. Define where they're routed: to human reviewers, to specialized handling queues, back to the requestor with an explanation. Define the SLAs for handling them: how quickly will a human address flagged cases?
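Put together, that amounts to a triage step: a handful of explicit checks that decide whether an output is acted on, sent for review, or returned. The sketch below uses illustrative scope checks, a confidence cutoff, and an SLA; none of the values are prescriptive.

```python
# Sketch of an explicit edge-case triage path (scope checks, cutoff, and SLA are illustrative).
from dataclasses import dataclass
from typing import Optional

SUPPORTED_LANGUAGES = {"en"}
SUPPORTED_REQUEST_TYPES = {"billing", "shipping", "returns"}
CONFIDENCE_CUTOFF = 0.80
REVIEW_SLA_HOURS = 4

@dataclass
class Triage:
    route: str                      # "auto", "human_review", or "return_to_requestor"
    reason: str
    sla_hours: Optional[int] = None

def triage(request_type: str, language: str, confidence: float) -> Triage:
    if language not in SUPPORTED_LANGUAGES:
        return Triage("return_to_requestor", "unsupported language")
    if request_type not in SUPPORTED_REQUEST_TYPES:
        return Triage("human_review", "out-of-scope request type", REVIEW_SLA_HOURS)
    if confidence < CONFIDENCE_CUTOFF:
        return Triage("human_review", "low confidence", REVIEW_SLA_HOURS)
    return Triage("auto", "in scope and confident")
```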

Edge cases that fall through the cracks create the worst failures—the ones where the AI confidently produced nonsense and nobody caught it. Design the safety net before you need it.

Question Eleven: What's the Communication Plan for Affected Teams?

AI systems change how people work. Even well-designed implementations create disruption: new tools to learn, new workflows to follow, new expectations about output and productivity.

People resist changes they don't understand or don't believe benefit them. Without effective communication, even good AI systems face adoption problems—users route around them, undermine them, or refuse to trust them.

Your communication plan should cover what's changing and why, how it benefits the affected teams, what training and support will be provided, how feedback will be gathered and addressed, and what the timeline looks like.

Involve affected teams early. Their input improves the system design, and their early involvement builds buy-in. Surprises create resistance; participation creates ownership.

Question Twelve: What Does Done Look Like?

Finally, define the end state. Is this a pilot intended to validate feasibility, with a go/no-go decision at the end? Is it a limited deployment to a single team or process, with scaling decisions to follow? Is it a full production deployment intended to run indefinitely?

Each end state implies different success criteria, different resource commitments, and different timelines. Conflating them creates confusion—teams think they're building a pilot while leadership expects production, or vice versa.

Be explicit: this project is done when we have achieved X, validated by Y, at which point we will decide Z. Write it down. Get stakeholder agreement. Revisit it when you think you're done.

Projects without clear definitions of done tend to drift—scope creeps, timelines extend, and nobody knows when to declare victory or cut losses.

Using This Checklist

These twelve questions aren't bureaucratic hurdles. They're the questions that, when answered well, predict successful deployment—and when answered poorly or not at all, predict expensive failure.

Before your next AI initiative kicks off, work through each question. Write down the answers. Share them with stakeholders. Where you don't have good answers, either develop them or acknowledge the risks you're accepting.

The organizations escaping the Prototype Plateau aren't building better AI. They're asking better questions before they build.


Measured AI helps business leaders ask the right questions before deploying AI systems. Subscribe for weekly frameworks on crossing the gap from experimentation to enterprise value.