Analytical Explainability

Outliers and Null Semantics: When ‘Missing’ Means Something

Outliers and nulls often carry meaning. An AI that explains data must interpret them correctly rather than ignore or flatten them.

Null Semantics
[Diagram: a null can mean unknown, not applicable, or missing.]
Null values have different meanings that must be made explicit.

TL;DR

  • Missing data is not always zero.
  • Outliers can distort explanations.

The problem (in plain terms)

  • Null values are treated as zeros or ignored.
  • Outliers skew explanations and trends.

Why it matters

  • Misinterpreting nulls leads to wrong conclusions.
  • Outliers can hide the real story.

Symptoms

  • AI says “no change” when data is missing.
  • Explanations are driven by a few extreme points.
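
Both symptoms can be reproduced in a few lines. The sketch below is illustrative only: the function names and the numbers are hypothetical, and it assumes missing values arrive as None.

```python
from statistics import mean, median

def naive_change(previous, current):
    """Period-over-period change that fills a missing value with the previous one."""
    if current is None:
        current = previous  # forward-fill hides the gap
    return current - previous

def honest_change(previous, current):
    """Return None when either value is missing, so the caller can say so."""
    if previous is None or current is None:
        return None
    return current - previous

# Hypothetical case: this period's value has not been loaded yet.
print(naive_change(120.0, None))   # 0.0, which reads as "no change"
print(honest_change(120.0, None))  # None, which reads as "cannot tell"

# Hypothetical order values with one extreme point.
orders = [20, 22, 19, 21, 23, 500]
print(mean(orders))    # pulled far above every typical order by the single 500
print(median(orders))  # 21.5, close to a typical order
```

The first pair shows how a filled-in value turns "unknown" into "no change". The second shows how one extreme point can dominate a mean while the median barely moves.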

Root causes

  • No null semantics documented.
  • No outlier handling in measures.

What good looks like

  • Null semantics defined (unknown vs not applicable).
  • Outlier handling rules documented.
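
One way to make null semantics explicit is to record them next to the column definitions. This is a minimal sketch, not a prescribed schema: the NullMeaning enum, the COLUMN_NULLS mapping, and the column names are all assumptions made for illustration.

```python
from enum import Enum

class NullMeaning(Enum):
    UNKNOWN = "unknown"                # a value exists but is not known
    NOT_APPLICABLE = "not applicable"  # no value can exist for this row
    MISSING = "missing"                # a value was expected but not recorded

# Hypothetical column metadata: each column documents what its nulls mean.
COLUMN_NULLS = {
    "customer_age": NullMeaning.UNKNOWN,
    "discount_code": NullMeaning.NOT_APPLICABLE,
    "monthly_revenue": NullMeaning.MISSING,
}

def describe_null(column):
    """Return a sentence an AI response could use when it meets a null."""
    meaning = COLUMN_NULLS.get(column)
    if meaning is None:
        return f"'{column}' has no documented null meaning; do not assume zero."
    return f"A null in '{column}' means {meaning.value}."

print(describe_null("discount_code"))
print(describe_null("shipping_cost"))
```

Keeping the meaning in metadata rather than in prose lets both people and tools look it up, and an undocumented column fails loudly instead of defaulting to zero.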

How to fix (steps)

  • Document null meaning in metadata.
  • Add measures that exclude or flag outliers.
  • Have AI responses state when data is missing instead of reporting a value.
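
The steps above can be sketched as a measure that flags outliers and reports missing data alongside the result. This is one possible approach under stated assumptions: the IQR rule with k=1.5 is a common convention rather than anything this article mandates, and the function names and sample data are hypothetical.

```python
from statistics import mean, quantiles

def flag_outliers(values, k=1.5):
    """Pair each value with True if it falls outside the IQR fences."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [(v, v < low or v > high) for v in values]

def summarize(raw):
    """Compute the measure with and without outliers, and count missing values."""
    observed = [v for v in raw if v is not None]
    flagged = flag_outliers(observed)
    kept = [v for v, is_outlier in flagged if not is_outlier]
    return {
        "mean_all": mean(observed),
        "mean_excluding_outliers": mean(kept),
        "outliers": [v for v, is_outlier in flagged if is_outlier],
        "missing": len(raw) - len(observed),
    }

def explain(summary):
    """Build a response that mentions outliers and missing data explicitly."""
    return (
        f"Typical value is {summary['mean_excluding_outliers']} "
        f"after excluding {len(summary['outliers'])} outlier(s); "
        f"{summary['missing']} value(s) were missing and are not counted."
    )

# Hypothetical order values with two nulls and one extreme point.
result = summarize([20, 22, None, 19, 21, 23, 500, None])
print(explain(result))
```

Returning both means, the list of flagged values, and the missing count keeps the exclusion visible, so the explanation can say what was left out and why.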

Pitfalls

  • Treating null as zero by default.
  • Silently excluding outliers.
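
The first pitfall is easy to see with numbers. The sketch below uses hypothetical readings and assumes nulls arrive as None; it compares a null-as-zero average with one that reports how many values it actually used.

```python
from statistics import mean

# Hypothetical readings; two were never recorded.
readings = [10.0, None, 12.0, None, 14.0]

# Pitfall: treating null as zero drags the average down.
as_zero = mean([0.0 if v is None else v for v in readings])

# Alternative: average only observed values and say how many there were.
observed = [v for v in readings if v is not None]
report = (
    f"mean {mean(observed)} over {len(observed)} of {len(readings)} readings "
    f"({len(readings) - len(observed)} missing)"
)

print(as_zero)  # 7.2, lower than every observed reading
print(report)   # mean 12.0 over 3 of 5 readings (2 missing)
```

The same principle applies to outliers: if values are dropped, the count of dropped values belongs in the output.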

Checklist

  • Null meaning documented.
  • Outlier handling implemented.
  • AI responses mention missing data.