Comparison Metrics

Understanding the results from the Compare Data step

Quick Reference

Field	What it means
total_compared	Number of rows in the expected data (calculated from your CSV files)
rows_compared	Rows where we found a match in both expected AND extracted data
rows_matched	Rows that matched perfectly (found in both, zero value differences)
rows_with_differences	Rows found in both datasets but have at least one value mismatch
rows_missing	Rows in expected data with no matching row in extracted data
rows_extra	Rows in extracted data with no matching row in expected data
difference_count	Total number of individual value differences (one row can have multiple)
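To make the field definitions concrete, here is a minimal sketch of how these counts could be derived. It is not the actual Compare Data implementation: the function name compare_rows, the dict-of-dicts layout (row key mapped to column values), and the sample keys and columns are all assumptions made for illustration.

```python
def compare_rows(expected, extracted):
    """Hypothetical helper: both arguments map row key -> {column: value}."""
    # Rows whose key appears in both datasets.
    shared = expected.keys() & extracted.keys()

    rows_with_differences = 0
    difference_count = 0
    for key in shared:
        exp_row, ext_row = expected[key], extracted[key]
        # Count every expected column whose extracted value differs.
        # A column absent from the extracted row counts as a mismatch
        # unless the expected value is None.
        mismatches = sum(1 for col in exp_row if exp_row[col] != ext_row.get(col))
        if mismatches:
            rows_with_differences += 1
            difference_count += mismatches

    return {
        "total_compared": len(expected),
        "rows_compared": len(shared),
        "rows_matched": len(shared) - rows_with_differences,
        "rows_with_differences": rows_with_differences,
        "rows_missing": len(expected.keys() - extracted.keys()),
        "rows_extra": len(extracted.keys() - expected.keys()),
        "difference_count": difference_count,
    }


# Illustrative data only: q1 matches, q2 differs in two values,
# q3 is missing from the extraction, q4 is extra.
expected = {
    "q1": {"a": 1, "b": 2},
    "q2": {"a": 3, "b": 4},
    "q3": {"a": 5, "b": 6},
}
extracted = {
    "q1": {"a": 1, "b": 2},
    "q2": {"a": 0, "b": 0},
    "q4": {"a": 9, "b": 9},
}
metrics = compare_rows(expected, extracted)
```

With this sample data, one row falls into each of the matched, differing, missing, and extra categories, and difference_count is 2 because the single differing row has two mismatched values.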

Common Questions

What does "rows_missing" mean?

We calculated an expected row from your raw data, but the AI could not find or extract a matching row from the PDF. This typically means:

The question or segment is missing from the PDF entirely
The AI failed to extract that particular row
The row key didn't match (e.g., question text differs slightly)

What does "rows_extra" mean?

The AI extracted a row from the PDF that we didn't expect based on our calculations. This typically means:

The PDF contains data we didn't calculate (e.g., different time period)
The AI misread something as a data row
Our expected data generation is missing something

What's the difference between "rows_with_differences" and "difference_count"?

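rows_with_differences counts rows; difference_count counts individual values. For example, a single row compared across 5 columns could have 3 mismatches. That row adds 1 to rows_with_differences but 3 to difference_count.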
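The 5-column case can be sketched as follows. The column names and values are made up for illustration and do not come from any real report.

```python
# One expected row and its extracted counterpart, compared across 5 columns.
expected_row = {"base": 100, "yes": 40, "no": 35, "unsure": 20, "refused": 5}
extracted_row = {"base": 100, "yes": 42, "no": 33, "unsure": 20, "refused": 4}

# Columns whose values do not agree between the two rows.
mismatched_columns = [
    col for col in expected_row if expected_row[col] != extracted_row[col]
]

# The row is counted once, however many of its values differ.
rows_with_differences = 1 if mismatched_columns else 0
# Every mismatched value is counted individually.
difference_count = len(mismatched_columns)
```

Here mismatched_columns holds "yes", "no", and "refused", so the row contributes 1 to rows_with_differences and 3 to difference_count.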

How the Numbers Add Up

The metrics always satisfy this equation:

rows_matched + rows_with_differences + rows_missing = total_compared

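Every expected row lands in exactly one of these three categories, so the counts always sum to total_compared. rows_extra is not part of the equation because extra rows exist only in the extracted data, not in the expected data.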

Example

If you see these results:

total_compared: 20
rows_matched: 15
rows_with_differences: 3
rows_missing: 2
difference_count: 7

This means: Out of 20 expected rows, 15 matched perfectly, 3 were found but had value differences (totaling 7 individual mismatches), and 2 couldn't be found in the extracted data at all.
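The arithmetic behind this example can be checked with a short sketch. The variable names rows_found, accounted_for, and adds_up are illustrative, and the sketch assumes that the rows found in both datasets are exactly the matched rows plus the rows with differences, as the Quick Reference definitions suggest.

```python
# Values copied from the example above.
metrics = {
    "total_compared": 20,
    "rows_matched": 15,
    "rows_with_differences": 3,
    "rows_missing": 2,
    "difference_count": 7,
}

# Rows located in both the expected and the extracted data.
rows_found = metrics["rows_matched"] + metrics["rows_with_differences"]

# Adding the missing rows should account for every expected row.
accounted_for = rows_found + metrics["rows_missing"]
adds_up = accounted_for == metrics["total_compared"]

print(rows_found, accounted_for, adds_up)  # prints: 18 20 True
```

If adds_up were ever False for a real result, the counts would be inconsistent with the equation in "How the Numbers Add Up" and would be worth reporting.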