The No-BS Framework: How to Evaluate the AI Capabilities of a Vendor in 2025
Aug 8, 2025
The year is 2025, and the term "AI-powered" is everywhere. It’s on every software vendor's website, in every sales deck, and it’s the buzzword of every conference. For maintenance and reliability leaders like you, this presents both a massive opportunity and a significant risk. The pressure from the C-suite to adopt AI and drive digital transformation is immense. The promise of predictive maintenance, optimized MRO inventory, and unprecedented operational efficiency is tantalizing.
But here’s the hard truth: not all AI is created equal. The market is flooded with "AI-washing," where simple, rule-based automation is dressed up and sold as sophisticated artificial intelligence. Choosing the wrong vendor doesn't just mean a wasted software investment; it means lost time, frustrated teams, and a major setback in your reliability journey. Your traditional vendor evaluation checklist, the one you've used for years to assess CMMS or EAM systems, is no longer sufficient. It's time for a new approach.
This is not another high-level, generic guide. This is a practical, in-the-trenches framework designed for maintenance leaders who need to cut through the marketing fluff and make a confident, data-driven decision. We'll give you the exact questions to ask, the red flags to watch for, and a structured method to scrutinize a vendor's true AI capabilities.
The Core Framework: Four Pillars for Evaluating AI Vendor Capabilities
To properly vet an AI vendor, you need to look beyond the user interface and the sales pitch. You need to evaluate them across four critical pillars. A weakness in any one of these areas can jeopardize the entire initiative.
- Technology & Model Scrutiny: Do they have genuine AI, and how does it actually work?
- The Data Dilemma: How do they handle your data, and are you even ready?
- The Human Factor: Can they support your team through implementation and adoption?
- Performance & ROI: Can they prove the value and deliver a tangible return?
Let's break down each pillar into actionable steps and the tough questions you need to ask.
Pillar 1: Technology & Model Scrutiny (The "Under the Hood" Check)
This is where you separate the real AI from the clever automation. Many vendors use the term "AI" to describe what are essentially complex "if-this-then-that" (IFTTT) rules. While useful, this is not true AI.
Differentiating True AI from Rule-Based Automation
- Rule-Based Automation: Follows pre-programmed, static rules. Example: "If the vibration sensor on Pump-01 exceeds 10 mm/s, create a high-priority work order." This system cannot learn or adapt. If a failure occurs at 9.5 mm/s, it will miss it every time.
- True AI (Machine Learning): Learns from historical data to identify complex patterns and make predictions. Example: An AI model analyzes vibration, temperature, pressure, and acoustic data simultaneously. It learns the unique "normal" operating signature of Pump-01. It then detects a subtle, multi-variable deviation from that signature—a pattern that has preceded past failures—and predicts a bearing failure in the next 200 operating hours, even if no single sensor has crossed a static threshold.
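To make the distinction concrete, here is a minimal sketch in Python. Everything in it is invented for illustration (the asset, the thresholds, the data), and scikit-learn's IsolationForest simply stands in for whatever anomaly model a given vendor actually uses. The point is the contrast: the learned model flags a multivariate deviation that the static rule misses.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# --- Rule-based automation: a static, single-variable threshold ---
def rule_based_alert(vibration_mm_s: float) -> bool:
    """Fires only when one sensor crosses a fixed limit; never learns."""
    return vibration_mm_s > 10.0

# --- Machine learning: learn the asset's multivariate "normal" ---
rng = np.random.default_rng(42)
# Hypothetical healthy history for Pump-01: vibration, temperature, pressure
healthy = np.column_stack([
    rng.normal(6.0, 0.8, 5000),    # vibration (mm/s)
    rng.normal(65.0, 2.0, 5000),   # temperature (deg C)
    rng.normal(4.2, 0.15, 5000),   # discharge pressure (bar)
])
model = IsolationForest(contamination=0.01, random_state=0).fit(healthy)

# A reading where NO single threshold is crossed, but the combination is odd:
# vibration somewhat high AND temperature high AND pressure low, together.
reading = np.array([[9.5, 71.0, 3.8]])
print(rule_based_alert(9.5))   # False -- the static rule misses it
print(model.predict(reading))  # [-1]  -- flagged as anomalous
```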
Your Job: You must force the vendor to explain which category they fall into.
What Kind of AI Are We Talking About?
"AI" is a broad term. You need to understand the specific technologies they employ. Here’s a quick primer for the maintenance world:
- Machine Learning (ML): This is the most common type in our field. It's the engine behind most predictive maintenance (PdM) solutions. It uses algorithms to parse historical data (work orders, sensor readings, failure histories) and make predictions about future events.
- Deep Learning: A more advanced subset of ML that uses "neural networks" with many layers. It's particularly powerful for interpreting complex, unstructured data like raw vibration waveforms, thermal images, or acoustic signatures to find incredibly subtle anomalies that traditional ML might miss.
- Natural Language Processing (NLP): This technology allows computers to understand human language. In maintenance, its killer application is analyzing decades of technician comments in work orders. It can uncover hidden failure modes, identify recurring issues that were never properly coded, and find correlations between repair actions and subsequent failures.
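As a taste of what NLP on work-order text can look like, here is a toy sketch. The notes are invented, and TF-IDF plus k-means is just one simple technique; a vendor's NLP stack will be far more sophisticated. Even so, recurring themes emerge from free text that was never properly failure-coded:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Hypothetical technician close-out notes (in practice: years of CMMS text)
notes = [
    "replaced seal on pump, leaking again at coupling",
    "coupling misalignment found, realigned and torqued",
    "motor bearing noisy, greased and scheduled replacement",
    "pump seal leak, second time this quarter",
    "bearing overheating on motor drive end, replaced",
    "aligned coupling after vibration complaint",
]

# Vectorize the free text and group similar notes into themes
vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(notes)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Print the top terms per cluster -- crude, but it shows the idea:
# recurring failure modes surface even without failure codes
terms = vec.get_feature_names_out()
for c in range(3):
    top = [terms[i] for i in km.cluster_centers_[c].argsort()[-3:][::-1]]
    print(f"theme {c}: {', '.join(top)}")
```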
Key Questions to Ask About Their Models:
Don't be afraid to get technical. A competent vendor will welcome these questions. A vendor who is "AI-washing" will deflect or give vague answers.
- "Can you walk me through the specific machine learning models you use for a use case like AI Predictive Maintenance on our compressors?"
- Listen for: Specific model names (e.g., "We use a Random Forest model for classification of failure modes and an LSTM neural network for time-series forecasting of remaining useful life"). Vague answers like "our proprietary algorithm" are a red flag.
- "How do you train your initial models? Do you use a generalized model, or is it trained specifically on our data?"
- Listen for: A good vendor will likely start with a "pre-trained" model based on data from similar assets but will emphasize that the real power comes from fine-tuning the model on your specific operational data. Be wary of any "one-size-fits-all" claims.
- "What is your process for handling 'model drift'? How often are models retrained, and is that automated?"
- Listen for: Model drift is when the model's accuracy degrades over time as asset behavior or operating conditions change. A mature AI vendor will have a clear, proactive strategy for monitoring model performance and automatically or semi-automatically retraining them to ensure they remain accurate.
- "Is your AI a 'black box,' or do you provide explainability (XAI) features?"
- Listen for: This is critical for adoption. A "black box" just says, "Trust me, this asset will fail." An explainable AI says, "I am predicting a failure because I see a 15% increase in high-frequency vibration coupled with a 5% rise in motor temperature, a pattern that has preceded 90% of similar failures in the past." This builds trust and helps your team validate the AI's recommendations.
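To picture what good answers to the drift and explainability questions look like, here is a deliberately simple sketch (synthetic data, scikit-learn, a toy failure rule; no vendor's production pipeline looks like this). It reports which inputs drive the model's predictions, a basic form of explainability, then scores fresh batches and flags drift once accuracy degrades past a tolerance:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
FEATURES = ["vibration", "temperature", "pressure"]

def make_batch(n, vib_limit=6.5):
    """Hypothetical sensor snapshots; label 1 = failed within 200 h.
    The toy ground-truth rule depends on vib_limit, so changing it later
    simulates the asset's behavior changing (concept drift)."""
    X = np.column_stack([
        rng.normal(6, 1, n),      # vibration (mm/s)
        rng.normal(65, 2, n),     # temperature (deg C)
        rng.normal(4.2, 0.2, n),  # pressure (bar)
    ])
    y = ((X[:, 0] > vib_limit) & (X[:, 1] > 65)).astype(int)
    return X, y

X_train, y_train = make_batch(4000)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# Explainability: which inputs actually drive the model's predictions?
imp = permutation_importance(model, X_train, y_train, n_repeats=5, random_state=0)
for name, score in zip(FEATURES, imp.importances_mean):
    print(f"{name}: importance {score:.3f}")  # pressure should score near zero

# Drift monitoring: score fresh batches, flag when accuracy degrades.
baseline = accuracy_score(y_train, model.predict(X_train))
for month, limit in enumerate([6.5, 6.5, 7.5]):  # asset behavior shifts in month 2
    X_new, y_new = make_batch(1000, vib_limit=limit)
    acc = accuracy_score(y_new, model.predict(X_new))
    status = "drift detected -- retrain" if acc < baseline - 0.05 else "ok"
    print(f"month {month}: accuracy {acc:.2f} ({status})")
```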
Pillar 2: The Data Dilemma (Your Most Valuable & Vulnerable Asset)
An AI system is only as good as the data it's fed. This pillar is a two-way street: you must evaluate the vendor's data capabilities and take a hard look at your own data readiness.
Assessing Your Own AI Readiness: The Inward Look
Before you can effectively judge a vendor, you need a clear picture of your data landscape.
- Data Sources: Where does your data live?
  - CMMS Software: Work order history, failure codes, asset hierarchy, parts usage.
  - SCADA/Historians: Real-time and historical sensor data (pressures, temperatures, flow rates, etc.).
  - IoT Sensors: Vibration, ultrasonic, thermal, oil analysis data.
  - ERP Systems: Parts inventory, procurement data.
- Data Quality: Be brutally honest.
  - Completeness: Are failure codes consistently used? Are work order close-out notes detailed?
  - Accuracy: Is your asset hierarchy correct? Are sensor readings calibrated and reliable?
  - Consistency: Do you use standardized naming conventions?
- Data Volume: Do you have enough history? Most ML models need at least 2-3 years of operational and maintenance data, including data on failures, to learn effectively.
Create a simple checklist for your key asset classes. A vendor can't work magic with non-existent or garbage data. A good vendor will help you identify and bridge these gaps as part of the project.
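A few lines of pandas against a CMMS export can put numbers on that checklist. Here is a minimal sketch, assuming a hypothetical file and column names (work_orders.csv, failure_code, closeout_notes, work_type, completed_date) that you would map to your own schema:

```python
import pandas as pd

# Hypothetical work-order export from a CMMS; adjust names to your schema.
wo = pd.read_csv("work_orders.csv", parse_dates=["completed_date"])

audit = {
    # Completeness: how often is a failure code actually recorded?
    "failure_code_fill_rate": wo["failure_code"].notna().mean(),
    # Completeness: are close-out notes more than a few words?
    "detailed_notes_rate": (wo["closeout_notes"].fillna("")
                              .str.split().str.len() >= 5).mean(),
    # Volume: how many years of history, and how many recorded failures?
    "years_of_history": (wo["completed_date"].max()
                         - wo["completed_date"].min()).days / 365.25,
    "failure_events": int((wo["work_type"] == "CORRECTIVE").sum()),
}
for metric, value in audit.items():
    print(f"{metric}: {value:.2f}" if isinstance(value, float)
          else f"{metric}: {value}")
```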
Evaluating the Vendor's Data Requirements & Ingestion Process
- "What specific data points, and at what frequency, do you need from our systems to make your models work for our centrifugal pumps?"
- Listen for: Specificity. They should be able to provide a detailed data schema. "Just give us all your data" is a sign of inexperience.
- "Describe your data ingestion process. Is it a one-time bulk upload, or do you have established integrations for continuous data streaming from systems like Rockwell, Siemens, or our OSIsoft PI historian?"
- Listen for: A mature platform will have robust, pre-built connectors for common industrial systems. A manual, recurring CSV upload process is not scalable or sustainable for a true AI implementation.
- "How does your system handle missing data, outliers, or poor-quality data from our CMMS?"
- Listen for: This question separates the pros from the amateurs. A top-tier vendor will have sophisticated data cleansing, imputation (intelligently filling in gaps), and normalization techniques built into their platform. They should be able to explain their methodology clearly.
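For context on what a credible methodology includes, here is a minimal sketch of the three steps (outlier handling, imputation, normalization) applied to a toy sensor feed. These are generic textbook techniques, not any particular vendor's approach:

```python
import numpy as np
import pandas as pd

# Hypothetical 1-minute sensor feed with gaps and a spurious spike.
idx = pd.date_range("2025-01-01", periods=10, freq="min")
temp = pd.Series([65.1, 65.3, np.nan, np.nan, 65.6, 180.0,
                  65.4, 65.2, np.nan, 65.5],
                 index=idx, name="temperature")

# 1. Outlier handling: flag points far outside a rolling window's median.
rolling_med = temp.rolling(5, center=True, min_periods=2).median()
outliers = (temp - rolling_med).abs() > 10  # 180.0 is clearly not physical
clean = temp.mask(outliers)

# 2. Imputation: fill short gaps with time-aware interpolation.
clean = clean.interpolate(method="time", limit=3)

# 3. Normalization: put the signal on a common scale for the model.
normalized = (clean - clean.mean()) / clean.std()
print(pd.DataFrame({"raw": temp, "clean": clean, "normalized": normalized}))
```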
Security, Privacy, and Ownership: The Non-Negotiables
This is a conversation you should have with your IT department, but you need to lead it.
- "Where will our operational data be stored? What are your cloud security credentials and certifications?"
- Listen for: They should immediately be able to speak to their cloud provider (AWS, Azure, GCP), data encryption standards (in-transit and at-rest), and compliance with certifications like ISO 27001 or SOC 2.
- "Who owns the raw data we provide? Who owns the trained model that is created using our data?"
- Listen for: This is a critical contractual point. The answer should be unequivocal: You own your raw data. Ownership of the trained model can be more complex, but you should have clear rights to it. Be wary of any vendor who claims ownership of your raw data.
- "How do you align with established frameworks for managing AI-associated risks?"
- Listen for: Mentioning frameworks like the NIST AI Risk Management Framework shows a commitment to responsible, secure, and trustworthy AI development and deployment. It demonstrates a level of maturity that goes beyond just building algorithms.
Pillar 3: The Human Factor (Implementation, Adoption & Support)
The most brilliant AI algorithm is worthless if your team doesn't trust it, understand it, or use it. The vendor's role extends far beyond software. They must be a change management partner.
Beyond the Sales Pitch: Evaluating the Implementation Team
The team that sells you the software is rarely the team that implements it.
- "Who, by name and title, will be on our implementation team? What is their specific background in both data science and industrial maintenance?"
- Listen for: You want a blended team. You need data scientists who understand the math, but you also need reliability engineers or former maintenance managers who understand the difference between a pump and a compressor and can speak your language.
- "Can we have a technical deep-dive call with your proposed project lead and a data scientist, without a salesperson present?"
- Listen for: A confident vendor will say "yes" without hesitation. This is your chance to ask the tough, unfiltered questions we've outlined.
User Experience (UX) and Workflow Integration
How does the AI's output fit into your team's daily reality?
- "Show me the exact, step-by-step workflow. An AI alert is generated—what happens next? How does a planner see it? How does it become a work order? How does a technician see the details on their mobile device?"
- Listen for: A seamless, intuitive process. If it takes 10 clicks to get from an alert to a usable work order, your team will ignore it. The information should be presented clearly, with context and evidence.
- "How does this tool help a technician on the floor? Does it integrate with a Mobile CMMS to provide diagnostic guidance, link to PM procedures, or show asset history right at the asset?"
- Listen for: The goal is to make the technician's job easier, not to add another screen to look at. The AI should empower them with insights they can act upon immediately.
Training and Change Management Support
This is arguably the most critical and overlooked area. You are not just buying software; you are changing the fundamental way your team approaches maintenance.
- "What does your training program look like? Is it just a one-time webinar, or is it a comprehensive program tailored to different roles (managers, planners, engineers, technicians)?"
- Listen for: A commitment to ongoing education. The best partners offer role-based training, "train the trainer" programs, and regular check-ins to ensure the team is adopting the new processes.
- "How do you help us manage the cultural shift from a reactive or preventive mindset to a predictive, data-driven one?"
- Listen for: A great partner will have a change management playbook. They'll share best practices, help you identify and empower champions within your team, and provide materials to communicate the "why" behind the change to everyone from the plant floor to the executive suite. They should have experience with standards and best practices from organizations like the International Society of Automation (ISA) that govern industrial processes.
Pillar 4: Performance & ROI (The "Show Me the Money" Test)
Ultimately, this is an investment that must deliver a return. Vague promises of "improved uptime" are not enough. You need to build a concrete business case and hold the vendor accountable for delivering on it.
Defining Success: Moving Beyond Vague Promises
Before you sign anything, you and the vendor must agree on what success looks like, using specific, measurable KPIs.
- Instead of: "Reduce downtime."
- Ask for: "A 15% reduction in unplanned downtime hours on our 10 most critical assets within 12 months."
- Instead of: "Optimize inventory."
- Ask for: "A 10% reduction in carrying costs for critical spares related to the monitored assets by improving just-in-time parts forecasting."
- Other Key Metrics: Increase in Mean Time Between Failures (MTBF), reduction in maintenance-related overtime costs, increase in planner efficiency.
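Baselines make those targets negotiable rather than aspirational. Here is a quick sketch of MTBF in its simplest form, computed from failure dates (the dates are invented; rigorous calculations also subtract planned downtime):

```python
import pandas as pd

# Hypothetical failure dates for one asset, pulled from CMMS history.
failures = pd.to_datetime([
    "2023-02-10", "2023-07-22", "2024-01-05", "2024-05-30", "2024-11-18",
])

# MTBF = mean elapsed time between consecutive failures.
intervals = pd.Series(failures).diff().dropna()
mtbf_days = intervals.mean().days
print(f"MTBF: {mtbf_days} days over {len(intervals)} intervals")

# With a baseline like this, a KPI such as "20% MTBF increase on the
# pilot assets within 12 months" becomes measurable, not aspirational.
```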
The Proof of Concept (PoC) / Pilot Program: Your Litmus Test
Never go all-in on an enterprise-wide deployment from the start. A well-structured, paid pilot program is the single best way to validate a vendor's claims.
- Scope: Choose a single production line or a small group of well-understood, critical assets. A project like implementing predictive maintenance for conveyors is a perfect example of a contained, high-value pilot.
- Success Criteria: Use the KPIs you defined above. The goal of the PoC should be to prove that the technology can achieve a specific, pre-agreed-upon result.
- Resources: Be clear on what's required from your team (time, data access, subject matter experts) and what the vendor will provide.
- Outcome: The PoC should end with a detailed performance report from the vendor, showing how they performed against the success criteria. This report is your primary tool for making the final go/no-go decision.
Calculating the Real ROI of AI in Maintenance
Work with your finance team and the vendor to build a realistic ROI model. A simplified version looks like this (a worked example follows the benefit and cost lists below):
ROI = (Financial Gain - Cost of Investment) / Cost of Investment
- Financial Gains (The Benefits):
  - Cost of Avoided Downtime: This is the big one. What is the cost of one hour of lost production on your key line? (Lost revenue + idle labor + potential quality issues).
  - Reduced Repair Costs: The cost of a planned bearing replacement is a fraction of the cost of a catastrophic failure that takes out the motor and gearbox.
  - Optimized Labor: Shifting from reactive emergency work to planned, scheduled work is far more efficient.
  - Reduced MRO Inventory: Less need to hold expensive "just-in-case" critical spares.
  - Extended Asset Life: Proactive maintenance can significantly extend the useful life of your equipment.
- Cost of Investment (The Costs):
  - Software Fees: Annual subscription (SaaS) costs.
  - Implementation & Training Fees: One-time costs for setup and training.
  - Hardware Costs: New sensors, gateways, or IT infrastructure if needed.
  - Internal Resources: The cost of your team's time dedicated to the project.
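Here is the formula worked through with invented, round numbers purely to show the mechanics; your finance team supplies the real figures:

```python
# Worked example of the ROI formula above. All figures are invented,
# round numbers for illustration -- substitute your plant's actual data.

# Financial gains (annual)
avoided_downtime  = 40 * 8_000  # 40 hours avoided x $8,000/hour lost production
reduced_repairs   = 60_000      # planned replacements vs. catastrophic failures
labor_efficiency  = 40_000      # reactive -> planned work
inventory_savings = 30_000      # leaner critical-spares carrying cost
gains = avoided_downtime + reduced_repairs + labor_efficiency + inventory_savings

# Costs (year one)
software       = 90_000         # annual SaaS subscription
implementation = 50_000         # setup, integration, training
hardware       = 35_000         # sensors and gateways
internal       = 25_000         # your team's project time
costs = software + implementation + hardware + internal

roi = (gains - costs) / costs
print(f"Gains: ${gains:,}  Costs: ${costs:,}  Year-one ROI: {roi:.0%}")
# Gains: $450,000  Costs: $200,000  Year-one ROI: 125%
```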
A trustworthy vendor will work with you to build this business case, using your numbers. For more in-depth models and industry benchmarks, resources like Reliabilityweb provide a wealth of information.
Case Studies and References: The Social Proof
- "Can you provide at least two case studies from companies in our industry, with similar assets, that quantify the results they achieved?"
- Listen for: Real numbers, not just fluffy quotes. Look for details that match the KPIs you care about.
- "Can we speak to a current customer who has been using your AI solution for more than 18 months?"
- Listen for: A vendor's willingness to connect you with a real user. When you speak to the reference, ask the tough questions: "What was the biggest challenge during implementation? How accurate are the predictions? How responsive is their support team? What would you do differently if you started over?"
Red Flags and "AI-Washing" Tactics to Watch For
As you go through this process, keep an eye out for these common warning signs:
- Vague, Buzzword-Heavy Language: If a vendor can't explain what they do without leaning on crutches like "paradigm-shifting," "revolutionary," or "state-of-the-art synergy," be skeptical.
- Inability to Explain the "How": If they get defensive or evasive when you ask about their models or data cleansing techniques, they are likely hiding a lack of substance.
- Overpromising on Results & Timelines: AI is powerful, but it's not magic. Claims like "50% less downtime in 60 days" are almost always unrealistic. Real results take time, data, and effort.
- Lack of Domain Expertise: If the vendor's team is composed entirely of data scientists who have never set foot on a plant floor, they will struggle to understand your operational context and challenges.
- Hidden Costs: Get a comprehensive quote that details every potential cost: data storage fees, API call limits, fees for model retraining, different tiers of customer support, etc.
Putting It All Together: Your AI Vendor Assessment Scorecard
Use this simple scorecard to formalize your evaluation and compare vendors side-by-side. Rate each vendor on a scale of 1 (Poor) to 5 (Excellent) for each category.
| Evaluation Category | Key Questions | Vendor A (Score 1-5) | Vendor B (Score 1-5) | Notes |
|---|---|---|---|---|
| Pillar 1: Technology | Is it true AI? Can they explain their models? Do they offer explainability (XAI)? | | | |
| Pillar 2: Data | Are data requirements clear? Is the ingestion process robust? Are security and ownership terms favorable? | | | |
| Pillar 3: People | Is the implementation team experienced? Is the UX intuitive? Is the change management support strong? | | | |
| Pillar 4: Performance | Are they willing to define clear KPIs? Is their PoC plan solid? Are their case studies relevant and quantified? | | | |
| Commercials | Is the pricing transparent? Is the ROI case believable? Are the contract terms fair? | | | |
| Overall Score | | | | |
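If you want the comparison to be more than eyeballing, a few lines of Python turn the scorecard into a weighted total. The weights and ratings below are purely illustrative; set them to reflect your own priorities:

```python
# Minimal scorecard tally; weights are illustrative -- set your own priorities.
weights = {"technology": 0.25, "data": 0.25, "people": 0.2,
           "performance": 0.2, "commercials": 0.1}

vendors = {  # hypothetical 1-5 ratings from your evaluation sessions
    "Vendor A": {"technology": 4, "data": 5, "people": 3,
                 "performance": 4, "commercials": 3},
    "Vendor B": {"technology": 3, "data": 3, "people": 5,
                 "performance": 3, "commercials": 4},
}

for name, scores in vendors.items():
    total = sum(weights[k] * scores[k] for k in weights)
    print(f"{name}: weighted score {total:.2f} / 5")
```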
Making the Right Choice for Your Plant's Future
Choosing an AI vendor in 2025 is one of the most strategic decisions a maintenance and reliability leader can make. It's not about buying a piece of software; it's about selecting a long-term partner who will be integral to your operational success.
The vendor who wins your business shouldn't be the one with the slickest presentation, but the one who transparently answers the tough questions, demonstrates a deep understanding of both data science and your industrial reality, and commits to being a partner in your journey.
By using this four-pillar framework, you can move past the hype, systematically assess each potential partner, and make an informed, confident decision. You can select a true Manufacturing AI Software solution that will empower your team, enhance your reliability, and deliver a measurable impact on your bottom line for years to come.
