Method · The Cloudbase Journal

The 7-criteria rubric we use to score every AI tool

How we weigh output quality against pricing, UX against API access, and why "trust score" sits at 15%. A walkthrough of the full methodology.

Leo Cho · Editor · 12 min read

When we started Cloudbase, the goal was simple: stop guessing which AI tool to buy. Three years and 1,342 reviews later, the rubric below is what we apply to every tool that lands in the directory. Nothing here is pay-to-play.

Why a rubric at all?

Because "vibes-based" reviews collapse the moment pricing changes or a competitor ships a better model. A written rubric forces us to defend every score, and lets readers compare a text-to-video tool from 2024 against one released last week on the same axes.

The seven criteria

Each tool is scored 0–10 on seven criteria. Scores are weighted and summed to a Cloudbase Score out of 100; a short sketch of the arithmetic follows the list.

  1. Output quality — 25%. The only criterion worth more than 20%. We run a fixed battery of prompts (writing, code, image, or audio depending on category) and rate blind against peer tools.
  2. Pricing honesty — 15%. Is the advertised price the real price? Are rate-limits disclosed up front? Hidden "enterprise only" features drop this score fast.
  3. Trust & provenance — 15%. Model lineage, data-training transparency, SOC 2 status, and how the vendor responds to abuse reports.
  4. UX & onboarding — 15%. Time-to-first-useful-output from a cold signup. Anything over 10 minutes without a concrete result loses points.
  5. API & automation — 10%. Documented endpoints, SDKs, webhooks. Crucial for tools that end up in a real stack.
  6. Speed & reliability — 10%. Measured over a 14-day window. Status page incidents cost points.
  7. Support & docs — 10%. Do humans answer within 24 hours? Are the docs searchable and current?

Why trust sits at 15% (and not higher)

We debated this for weeks. Operators told us trust was table-stakes — so we moved it into the top-three weights, but we didn't let it dominate. A tool with perfect trust but weak output isn't what a working team needs; a great tool with opaque training data gets called out on the listing, but still gets to exist in the directory with the caveats readers deserve.

What we do not score

Brand, hype, VC funding, and press coverage. If a $3B-valuation tool can't hold its own against a $12/mo indie competitor on the rubric, the indie tool wins the badge. That has happened more than once.

How a tool earns the A+ badge

A+ requires a Cloudbase Score ≥ 90 and no single criterion under 7. We re-audit A+ tools every 90 days because the category moves fast. Badges can — and do — get revoked.
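Continuing the same hypothetical sketch (reusing the placeholder cloudbase_score above), the badge rule reduces to two checks; the 90-day re-audit is process, not code, so it isn't modeled here.

```python
def earns_a_plus(scores: dict[str, float]) -> bool:
    """A+ badge: overall score of at least 90 and no single criterion below 7."""
    return cloudbase_score(scores) >= 90 and min(scores.values()) >= 7
```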

What changed in 2026

We added the "provenance" sub-criterion to Trust after a year of weekly reader questions about training data. We also dropped "community size" entirely — it correlated with hype, not usefulness.

See the rubric in action on our Best-of lists, or read the deep-dive on which AI agents actually ship.