Inside Pixel-Perfect Comparison Algorithms
A deep dive into the image diffing algorithms that power visual regression testing, from naive pixel comparison to perceptual hashing and structural similarity.
The Simplest Diff: Pixel-by-Pixel
The most straightforward approach to image comparison is iterating over every pixel and checking if the RGB values match:
```python
def naive_diff(img_a, img_b):
    """Return the fraction of pixels that differ between two same-size images."""
    height, width = len(img_a), len(img_a[0])
    diff_count = 0
    for y in range(height):
        for x in range(width):
            if img_a[y][x] != img_b[y][x]:
                diff_count += 1
    return diff_count / (height * width)
```
This works, but it is naive. A single sub-pixel shift in font rendering can flag thousands of pixels as different, even though the visual appearance is virtually identical to a human observer.
Perceptual Color Distance
Human vision does not perceive all color differences equally. A one-unit shift from #FF0000 to #FF0001 is invisible, but an equally small shift from #808080 to #818080 might be noticeable in certain contexts: identical RGB distances can produce different perceived differences.
The CIELAB color space was designed to be perceptually uniform, meaning equal numerical distances correspond to roughly equal perceived differences. The Delta E metric measures the Euclidean distance between two colors in CIELAB space:
Delta E = sqrt((L2 - L1)^2 + (a2 - a1)^2 + (b2 - b1)^2)
A Delta E below 2.0 is generally considered imperceptible. Modern visual testing tools use this threshold to filter out noise while still catching meaningful color shifts.
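As a concrete sketch of this metric, here is a minimal pure-Python implementation of the sRGB-to-CIELAB conversion and the original Delta E 1976 distance. The sRGB matrix and D65 white-point constants are the standard published values; production code would normally delegate this to a color library:

```python
import math

def srgb_to_lab(rgb):
    """Convert an sRGB triple (0-255) to CIELAB under the D65 illuminant."""
    def linearize(c):
        # Undo the sRGB gamma curve
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c / 255.0) for c in rgb)
    # Linear RGB -> XYZ (standard sRGB matrix, D65)
    x = 0.4124 * r + 0.3576 * g + 0.1805 * b
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    z = 0.0193 * r + 0.1192 * g + 0.9505 * b

    def f(t):
        return t ** (1 / 3) if t > 0.008856 else 7.787 * t + 16 / 116

    # Normalize by the D65 white point, then map to L*, a*, b*
    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)

def delta_e(rgb_a, rgb_b):
    """Delta E 1976: Euclidean distance between two colors in CIELAB."""
    lab_a, lab_b = srgb_to_lab(rgb_a), srgb_to_lab(rgb_b)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(lab_a, lab_b)))
```

With this, `delta_e((255, 0, 0), (255, 0, 1))` lands far below the 2.0 threshold, while strongly different colors score well above it.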
Structural Similarity Index (SSIM)
SSIM goes beyond pixel-level comparison to evaluate structural patterns. It considers three components:
- Luminance: Overall brightness comparison
- Contrast: Variance comparison between regions
- Structure: Correlation of pixel patterns
SSIM(x, y) = ((2*μx*μy + c1) * (2*σxy + c2)) / ((μx² + μy² + c1) * (σx² + σy² + c2))
SSIM returns a value between -1 and 1, where 1 means the images are identical. Values above 0.95 typically indicate no perceptible difference.
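The formula above can be applied directly. The sketch below computes a single global SSIM score over two whole grayscale images, with the conventional stabilizing constants c1 = (0.01·L)² and c2 = (0.03·L)². Note this is a simplification: production implementations slide a small (often Gaussian-weighted) window across the image and average the local scores.

```python
import numpy as np

def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM over two whole grayscale images (no sliding window)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (0.01 * data_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * data_range) ** 2  # stabilizes the contrast/structure term
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )
```

For identical inputs the numerator and denominator coincide and the score is 1; anti-correlated images drive the covariance term, and thus the score, negative.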
Why SSIM Matters for UI Testing
UI screenshots have a lot of structure: text blocks, borders, shadows, and whitespace. SSIM captures whether these structural elements are preserved even when individual pixels shift slightly due to anti-aliasing or rendering differences.
Anti-Aliasing Detection
Anti-aliasing is the single largest source of false positives in visual testing. When a browser renders a diagonal line or curved text, it blends edge pixels with the background to create a smooth appearance. But the exact blending can vary between:
- Browser versions
- Operating systems
- GPU drivers
- Display scaling factors
Smart diffing algorithms detect anti-aliased pixels by checking if a pixel sits on an edge (has significantly different neighbors) and if the color difference is within the expected anti-aliasing range.
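A minimal version of that edge check might look like the following. The 3×3 neighborhood and the threshold of 8 gray levels are illustrative assumptions; real tools layer additional conditions on top of this idea:

```python
def looks_antialiased(img, x, y):
    """Heuristic sketch: a pixel is a likely anti-aliasing artifact if it sits
    on an edge, i.e. it has both clearly brighter and clearly darker immediate
    neighbors. `img` is a 2D grid of grayscale values; borders are skipped."""
    height, width = len(img), len(img[0])
    if not (0 < x < width - 1 and 0 < y < height - 1):
        return False
    center = img[y][x]
    has_brighter = has_darker = False
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dx == dy == 0:
                continue
            delta = img[y + dy][x + dx] - center
            if delta > 8:        # threshold chosen for illustration only
                has_brighter = True
            elif delta < -8:
                has_darker = True
    return has_brighter and has_darker
```

A diff pipeline would then downweight or ignore mismatches at pixels where this predicate is true in either image.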
Perceptual Hashing
For quickly determining whether two images are similar without pixel-level comparison, perceptual hashing creates compact fingerprints:
- Resize the image to a small grid (e.g., 8x8)
- Convert to grayscale
- Compute the discrete cosine transform (DCT)
- Reduce to a binary hash based on the DCT median
Two images with a Hamming distance of less than 5 in their perceptual hashes are likely visually identical. This approach is useful for pre-filtering before running expensive pixel-level comparisons.
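The four steps above can be sketched with NumPy alone, assuming the input is already a square grayscale array. Block averaging stands in for proper resizing and a naive matrix-based DCT stands in for a fast transform; a real implementation would use an image library and an optimized DCT:

```python
import numpy as np

def dct_2d(a):
    """Naive (unnormalized) 2D DCT-II built from an explicit cosine basis."""
    n = a.shape[0]
    i = np.arange(n)
    basis = np.cos(np.pi * (2 * i[None, :] + 1) * i[:, None] / (2 * n))
    return basis @ a @ basis.T

def phash(gray, hash_size=8, factor=4):
    """64-bit perceptual hash of a square grayscale array whose side is a
    multiple of hash_size * factor (block averaging stands in for resizing)."""
    size = hash_size * factor
    step = gray.shape[0] // size
    small = gray[:size * step, :size * step]
    small = small.reshape(size, step, size, step).mean(axis=(1, 3))
    low = dct_2d(small)[:hash_size, :hash_size]  # keep only low frequencies
    return (low > np.median(low)).flatten()      # one bit per coefficient

def hamming(h1, h2):
    """Number of differing bits between two hashes."""
    return int(np.count_nonzero(h1 != h2))
```

Because the hash is built from low-frequency structure, uniform brightness shifts and small local perturbations barely move it, while unrelated images land roughly half the bits apart.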
Choosing Your Algorithm
The right algorithm depends on your requirements:
| Algorithm | Speed | Accuracy | False Positive Rate |
|---|---|---|---|
| Naive Pixel | Fast | Low | High |
| Delta E | Medium | High | Low |
| SSIM | Slow | Very High | Very Low |
| Perceptual Hash | Very Fast | Medium | Medium |
Most production visual testing pipelines use a combination: perceptual hashing for fast pre-screening, followed by Delta E or SSIM for detailed comparison of flagged images.
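One way to wire that combination together is a small dispatcher that takes the comparison functions as parameters. The signatures and thresholds here are assumptions for illustration, not a fixed API:

```python
def images_match(img_a, img_b, phash, hamming, ssim,
                 hash_threshold=5, ssim_threshold=0.95):
    """Two-stage comparison sketch: a cheap perceptual-hash pre-screen first,
    then an expensive structural comparison only for images the hash flags.
    The phash/hamming/ssim callables are assumed to be implemented elsewhere."""
    if hamming(phash(img_a), phash(img_b)) < hash_threshold:
        return True  # fingerprints agree: almost certainly visually identical
    return ssim(img_a, img_b) >= ssim_threshold
```

The pre-screen lets most unchanged screenshots exit early, so the expensive comparison runs only on the small fraction of candidates that actually moved.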
Continue with ScanU
If you want to apply these techniques in production, start with a focused set of pages and run baseline screenshot comparisons after every meaningful UI change. You can review plans on the Pricing page, implementation details in the FAQ, and product capabilities on the Features page.