How It Works

Think of It Like a Tournament

Suppose you want to find out which foods are healthiest for you. Here's the full process from start to finish:

1

Write your question

Tell NanoJudge what to judge. For example: "Which is healthier?"

2

Add your items

Type or paste the items you want ranked, or click the Generate button to have the AI create a comprehensive list based on your question. For our example, you might enter: eggs, butter, spinach, chicken, olive oil, and more.

3

Every item becomes a player

Each item on your list enters the tournament as a competitor.

4

Players face off in head-to-head comparisons

The AI judges each comparison individually. For your question "Which is healthier?", we create a new comparison for every pair of items:

"Which is healthier: eggs or butter?"
"Which is healthier: eggs or spinach?"
"Which is healthier: butter or olive oil?"
"Which is healthier: spinach or chicken?"
...and thousands more.

For each comparison, the AI reasons through it and picks a winner. Every comparison gets its own fresh question following the template: "Your Question? [Item1] or [Item2]?"
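As a rough illustration of how one question fans out into many prompts, here is a minimal Python sketch. It applies the template exactly as written above to every unordered pair of items. The function name `build_prompts` is hypothetical and not part of NanoJudge.

```python
from itertools import combinations


def build_prompts(question, items):
    """Return one prompt per unordered pair of items.

    Uses the template "Your Question? [Item1] or [Item2]?" literally.
    Illustrative helper only, not NanoJudge's actual code.
    """
    return [f"{question} {a} or {b}?" for a, b in combinations(items, 2)]


items = ["eggs", "butter", "spinach", "chicken", "olive oil"]
prompts = build_prompts("Which is healthier?", items)

print(len(prompts))  # 5 items produce 10 distinct pairs
for prompt in prompts[:3]:
    print(prompt)
```

Each prompt is sent to the AI on its own, so every judgment is made with only two items in view.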

5

Many comparisons run in parallel

Instead of playing out a single bracket one match at a time, we run many comparisons simultaneously and adjust who plays whom as results come in:

  • Strong players (items that keep winning) get compared against each other more often
  • Weak players (items that keep losing) get eliminated quickly with fewer comparisons

This ensures we're confident about the top rankings while not wasting time on obvious losers.
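One way to picture this kind of scheduling is a priority score that favors items with a high win rate and items that have played few games. The sketch below is an assumption made for illustration, not NanoJudge's actual algorithm. The scoring formula and the names `priority` and `pick_next_pair` are hypothetical, and weak items are simply deprioritized here rather than formally eliminated.

```python
import math


def priority(wins, games):
    """Higher for items that win often or have played few games (assumed formula)."""
    win_rate = (wins + 1) / (games + 2)     # add-one smoothing avoids 0/0
    exploration = 1 / math.sqrt(games + 1)  # bonus shrinks as an item plays more
    return win_rate + exploration


def pick_next_pair(record):
    """record maps item -> (wins, games); return the two highest-priority items."""
    ranked = sorted(record, key=lambda item: priority(*record[item]), reverse=True)
    return ranked[0], ranked[1]


record = {
    "eggs": (9, 10),
    "spinach": (8, 10),
    "chicken": (5, 10),
    "butter": (1, 10),
}
next_pair = pick_next_pair(record)
print(next_pair)  # the two current front-runners are matched up next
```

With this record, the two strongest items are scheduled against each other, while an item that keeps losing sinks in priority and receives few further comparisons.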

6

Win rates become rankings

After many comparisons, the results are clear:

  • An item that wins 90% of its comparisons ranks near the top
  • An item that wins 10% of its comparisons ranks near the bottom

The final ranking reflects actual head-to-head performance, not guesswork.
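The aggregation step can be sketched in a few lines of Python. This assumes the simplest possible scoring, a raw win rate per item. The helper name `rank_by_win_rate` and the sample results are made up for illustration.

```python
from collections import Counter


def rank_by_win_rate(results):
    """results is a list of (winner, loser) tuples; return items sorted by win rate."""
    wins, games = Counter(), Counter()
    for winner, loser in results:
        wins[winner] += 1
        games[winner] += 1
        games[loser] += 1
    rates = {item: wins[item] / games[item] for item in games}
    return sorted(rates.items(), key=lambda pair: pair[1], reverse=True)


# Made-up outcomes, for illustration only.
results = [
    ("spinach", "butter"),
    ("spinach", "eggs"),
    ("spinach", "chicken"),
    ("eggs", "butter"),
    ("eggs", "chicken"),
    ("chicken", "butter"),
]
ranking = rank_by_win_rate(results)
for item, rate in ranking:
    print(f"{item}: {rate:.0%}")
```

Raw win rate is easy to read, but it ignores how strong each opponent was. A production system might use a rating model instead.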

7

Complete transparency

Click into the Comparison Log to see the actual reasoning from any comparison. Why did eggs beat butter? Read the AI's explanation. This isn't a black box: every ranking decision is backed by real analysis.

Tips

Adjust accuracy slider

Choose how many comparisons per item you want. More comparisons mean higher accuracy but longer processing time. For most tasks, 20-40 comparisons per item provide a good balance.

Pause anytime

Rankings update in real time as comparisons complete. You can pause at any time if you're satisfied with the results, or let it complete all comparisons for maximum confidence.

Frequently Asked Questions

Isn't this slow and expensive?

If it were 2023, it would be. But techniques for optimizing AI models have progressed dramatically. We can now run efficient versions of models like ChatGPT that cost only a fraction as much to run, are much faster, and deliver nearly the same performance. What would have been prohibitively expensive just a few years ago is now fast and affordable.

How is this different from asking ChatGPT to rank my list?

ChatGPT generates one response with one ranking. NanoJudge runs thousands of individual comparisons and aggregates them. It's the difference between asking someone to guess the tournament results and actually playing all the games. You get verifiable results backed by real reasoning, not a single AI's best guess.

NanoJudge's edge over ChatGPT grows dramatically as your list and criteria become larger and more complex. ChatGPT struggles to hold lots of items in context while considering nuanced criteria—it starts dropping details, conflating similar options, and making arbitrary calls. NanoJudge handles this easily because each comparison is focused: just two items and one clear question at a time.

Can I trust the rankings?

Every ranking is backed by actual comparison data. Click into the Comparison Log to read the AI's reasoning. Confidence intervals show how certain we are about each item's position, so the uncertainty is visible rather than hidden.
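To give a feel for what a confidence interval on a win rate looks like, here is a sketch using the Wilson score interval, a standard formula for proportions. Whether NanoJudge uses this particular method is not stated here, so treat it purely as an illustration. The function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(wins, games, z=1.96):
    """Approximate 95% Wilson score interval for a win rate (z=1.96)."""
    if games == 0:
        return 0.0, 1.0  # no data yet, so the win rate could be anything
    p = wins / games
    denom = 1 + z * z / games
    center = (p + z * z / (2 * games)) / denom
    margin = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return center - margin, center + margin


low_few, high_few = wilson_interval(9, 10)      # 90% win rate, few games
low_many, high_many = wilson_interval(90, 100)  # 90% win rate, many games
print(f"9/10 wins:   {low_few:.2f} to {high_few:.2f}")
print(f"90/100 wins: {low_many:.2f} to {high_many:.2f}")
```

Both items win 90% of the time, but the interval is much narrower after 100 games than after 10, which is why more comparisons give more trustworthy positions.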

What if the AI is biased?

AI bias is a real concern, but NanoJudge's approach actually helps surface it. Because every comparison is logged and viewable, you can audit the reasoning. If the AI consistently makes questionable calls, you'll see it in the comparison history. Traditional single-response rankings hide this—NanoJudge makes it transparent.

How many items can I rank?

NanoJudge handles lists of any size. For 100 items, we might run 2,000-5,000 comparisons total. Our algorithm focuses comparisons where they matter most—spending more effort distinguishing the top contenders and less on obvious outcomes. This means even large lists rank quickly and efficiently.
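For a sense of scale, the number of distinct pairs grows roughly with the square of the list size, which is why focusing effort matters for long lists. The short sketch below only counts pairs and makes no claim about how many comparisons NanoJudge actually runs for a given list.

```python
import math


def distinct_pairs(n_items):
    """Number of unordered pairs among n_items, i.e. n * (n - 1) / 2."""
    return math.comb(n_items, 2)


for n in (10, 100, 1000):
    print(f"{n} items -> {distinct_pairs(n):,} distinct pairs")
```

A list of 100 items has 4,950 distinct pairs, while 1,000 items have 499,500, so comparing every pair quickly becomes impractical as lists grow.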

Your items aren't limited to single words either—they could be full paragraphs describing product features, investment theses, or research papers. NanoJudge evaluates each pair with full attention, then synthesizes thousands of these focused judgments into a coherent ranking.