Indication of Removed Records

Lets-Plot provides informative messages when records are excluded during plot preparation.

These notifications state the number of dropped records and the reason for their removal, so that the final visualization can be interpreted correctly.

Records can be removed due to:

Sampling — reducing an oversized dataset for performance.

Statistics & Geometries — filtering out non-finite values (NaN, Inf) or out-of-bounds data.

In [1]:
import numpy as np
import pandas as pd

from lets_plot import *

LetsPlot.setup_html()

Sampling

Sampling helps deal with excessively large datasets by reducing the number of data points rendered.

Learn more: Sampling in Lets-Plot.

In [2]:
np.random.seed(42)
big_df = pd.DataFrame({
    "x": np.random.normal(size=75_000),
    "y": np.random.normal(size=75_000),
})

ggplot(big_df, aes("x", "y")) + \
    geom_point(
        sampling=sampling_random(500, seed=42) + sampling_systematic(100)
    )
Out[2]:
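
To get a feel for how many records each step of the chain above discards, the two samplings can be approximated in plain Python. This is only a sketch under stated assumptions: it imitates random sampling with `random.sample` and systematic sampling with a fixed stride, and it does not reproduce Lets-Plot's internal algorithms or the wording of its message.

```python
import random

n_total = 75_000       # number of rows in big_df above
n_random = 500         # target of sampling_random(500, ...)
n_systematic = 100     # target of sampling_systematic(100)

rng = random.Random(42)

# Step 1: keep a random subset of row indices (sorted to preserve row order).
after_random = sorted(rng.sample(range(n_total), n_random))

# Step 2: keep every k-th remaining row; this stride rule is an assumption of the sketch.
stride = max(1, round(len(after_random) / n_systematic))
after_systematic = after_random[::stride]

dropped_by_random = n_total - len(after_random)
dropped_by_systematic = len(after_random) - len(after_systematic)
print(dropped_by_random, dropped_by_systematic, len(after_systematic))
```

Under these assumptions the first step accounts for almost all of the removed records, which is why the order of chained samplings matters when reading the counts.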

Statistics

Stats drop non-finite values before computing transformations. The resulting message identifies the stat responsible for the removal.

In [3]:
df_ridges = pd.DataFrame({
    "x": [1, 2, np.nan, 4, 1, 2, 3, np.nan],
    "y": [0, 0, 0, 0, 1, 1, 1, 1],
})

ggplot(df_ridges, aes("x", "y")) + geom_area_ridges()
Out[3]:
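
The count in such a message can be cross-checked by hand. The sketch below uses only the standard library and the same values as `df_ridges`; checking both `x` and `y` for finiteness is an assumption made here for illustration, not a description of how the stat is implemented.

```python
import math

# Same values as df_ridges above.
x = [1, 2, float("nan"), 4, 1, 2, 3, float("nan")]
y = [0, 0, 0, 0, 1, 1, 1, 1]

# Keep only records whose coordinates are all finite (no NaN, no +/-Inf).
kept = [(xi, yi) for xi, yi in zip(x, y)
        if math.isfinite(xi) and math.isfinite(yi)]

n_dropped = len(x) - len(kept)
print(n_dropped)
```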

Geometries

Geoms remove records that contain missing values or fall outside the scale limits.

In [4]:
df = pd.DataFrame({
    "x": [1, 2, np.nan, 4, 5, np.nan, 7, 8],
    "y": [2, np.nan, 3, 4, 5, np.nan, 7, 8],
})


ggplot(df, aes("x", "y")) + geom_point()
Out[4]:
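
The two removal reasons named above can be told apart with a quick manual count. The sketch below reuses the values of `df`; the `x_limits` pair is hypothetical (the example above sets no scale limits) and is included only to show how out-of-limits records would add to the total.

```python
import math

# Same values as df above.
x = [1, 2, float("nan"), 4, 5, float("nan"), 7, 8]
y = [2, float("nan"), 3, 4, 5, float("nan"), 7, 8]
x_limits = (2, 7)  # hypothetical scale limits, not part of the example above

rows = list(zip(x, y))

# Records with a missing coordinate in either column.
complete = [r for r in rows if not (math.isnan(r[0]) or math.isnan(r[1]))]

# Of the complete records, those whose x lies inside the assumed limits.
in_limits = [r for r in complete if x_limits[0] <= r[0] <= x_limits[1]]

n_missing = len(rows) - len(complete)
n_out_of_limits = len(complete) - len(in_limits)
print(n_missing, n_out_of_limits)
```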

Suppressing Messages

While notifications are shown by default, you can control this behavior:

Per Layer: Set the na_rm parameter.

  • na_rm=False (default): records are removed and messages are shown;
  • na_rm=True: records are removed silently.

Per Plot or Globally: Add theme(plot_message='blank') to a plot to silence all of its notifications, or call LetsPlot.set_theme(theme(plot_message='blank')) to suppress messages in every plot.

In [5]:
# Suppress messages from a layer
ggplot(df, aes("x", "y")) + geom_point(na_rm=True)
Out[5]:
In [6]:
# Suppress messages from all layers
ggplot(df_ridges, aes("x", "y")) + geom_area_ridges() + theme(plot_message='blank')
Out[6]: