Working with Hypnograms#
This tutorial introduces the Hypnogram class, which is the standard way to represent
and work with sleep hypnograms in YASA (since version 0.7).
Note
A hypnogram is a time-series of sleep stage labels, one per epoch (usually 30 seconds). In
YASA, stages are stored as strings such as "WAKE", "N1", "N2", "N3", and
"REM" (not integers). This makes the data easier to read and less error-prone.
Creating a Hypnogram#
From string labels#
The simplest way to create a Hypnogram is to pass a list (or array) of stage strings.
YASA supports 2, 3, 4, and 5-stage hypnograms. Set n_stages accordingly (the
default is 5). Abbreviated spellings ("W", "R") and mixed case ("wake", "Rem")
are accepted. YASA normalizes them automatically.
The 5-stage vocabulary is: WAKE, N1, N2, N3, REM.
>>> import yasa
>>> hyp = yasa.Hypnogram(["WAKE", "WAKE", "N1", "N2", "N2", "N3", "N2", "REM", "WAKE"])
>>> hyp
<Hypnogram | 9 epochs x 30s (4.50 minutes), 5 unique stages>
- Use `.hypno` to get the string values as a pandas.Series
- Use `.as_int()` to get the integer values as a pandas.Series
- Use `.plot_hypnogram()` to plot the hypnogram
See the online documentation for more details.
>>> hyp.hypno
Epoch
0 WAKE
1 WAKE
2 N1
3 N2
4 N2
5 N3
6 N2
7 REM
8 WAKE
Name: Stage, dtype: category
Categories (7, str): ['WAKE', 'N1', 'N2', 'N3', 'REM', 'ART', 'UNS']
The 4-stage vocabulary is: WAKE, LIGHT, DEEP, REM.
>>> import yasa
>>> hyp = yasa.Hypnogram(
... ["WAKE", "WAKE", "LIGHT", "LIGHT", "DEEP", "DEEP", "REM", "WAKE"],
... n_stages=4,
... )
>>> hyp
<Hypnogram | 8 epochs x 30s (4.00 minutes), 4 unique stages>
- Use `.hypno` to get the string values as a pandas.Series
- Use `.as_int()` to get the integer values as a pandas.Series
- Use `.plot_hypnogram()` to plot the hypnogram
See the online documentation for more details.
>>> hyp.hypno
Epoch
0 WAKE
1 WAKE
2 LIGHT
3 LIGHT
4 DEEP
5 DEEP
6 REM
7 WAKE
Name: Stage, dtype: category
Categories (6, str): ['WAKE', 'LIGHT', 'DEEP', 'REM', 'ART', 'UNS']
The 2-stage vocabulary is: WAKE, SLEEP.
>>> import yasa
>>> hyp = yasa.Hypnogram(
... ["W", "W", "S", "S", "S", "W", "S"],
... n_stages=2,
... )
>>> hyp
<Hypnogram | 7 epochs x 30s (3.50 minutes), 2 unique stages>
- Use `.hypno` to get the string values as a pandas.Series
- Use `.as_int()` to get the integer values as a pandas.Series
- Use `.plot_hypnogram()` to plot the hypnogram
See the online documentation for more details.
>>> hyp.hypno
Epoch
0 WAKE
1 WAKE
2 SLEEP
3 SLEEP
4 SLEEP
5 WAKE
6 SLEEP
Name: Stage, dtype: category
Categories (4, str): ['WAKE', 'SLEEP', 'ART', 'UNS']
Note
ART (Artefact) and UNS (Unscored) are always part of the vocabulary regardless of
n_stages, but they are never required.
From integer arrays (legacy format)#
Many older pipelines store hypnograms as integer arrays. Use the
Hypnogram.from_integers class method to convert them.
The default mapping is: 0 = Wake, 1 = N1, 2 = N2, 3 = N3, 4 = REM.
>>> import numpy as np
>>> import yasa
>>> hyp = yasa.Hypnogram.from_integers(np.array([0, 0, 1, 2, 3, 2, 4, 4, 0]))
>>> hyp.hypno
Epoch
0 WAKE
1 WAKE
2 N1
3 N2
4 N3
5 N2
6 REM
7 REM
8 WAKE
Name: Stage, dtype: category
Categories (7, str): ['WAKE', 'N1', 'N2', 'N3', 'REM', 'ART', 'UNS']
To load from a file:
>>> import pandas as pd
>>> int_hypno = pd.read_csv("hypnogram.csv").squeeze().to_numpy()
>>> hyp = yasa.Hypnogram.from_integers(int_hypno, freq="30s", scorer="Expert")
Pass a custom mapping dictionary when your integer encoding differs from the default.
>>> import yasa
>>> hyp = yasa.Hypnogram.from_integers(
... [0, 0, 2, 2, 3, 3, 4, 0],
... mapping={0: "WAKE", 2: "LIGHT", 3: "DEEP", 4: "REM"},
... n_stages=4,
... )
>>> hyp.hypno
Epoch
0 WAKE
1 WAKE
2 LIGHT
3 LIGHT
4 DEEP
5 DEEP
6 REM
7 WAKE
Name: Stage, dtype: category
Categories (6, str): ['WAKE', 'LIGHT', 'DEEP', 'REM', 'ART', 'UNS']
Pass a custom mapping dictionary and set n_stages=2.
>>> import yasa
>>> hyp = yasa.Hypnogram.from_integers(
... [0, 0, 1, 1, 1, 0, 1],
... mapping={0: "W", 1: "S"},
... n_stages=2,
... )
>>> hyp.hypno
Epoch
0 WAKE
1 WAKE
2 SLEEP
3 SLEEP
4 SLEEP
5 WAKE
6 SLEEP
Name: Stage, dtype: category
Categories (4, str): ['WAKE', 'SLEEP', 'ART', 'UNS']
From a Compumedics Profusion XML file#
Hypnograms in the NSRR format can be loaded directly:
>>> hyp = yasa.Hypnogram.from_profusion("path/to/hypnogram.xml")
Simulated hypnograms#
For testing and demonstration, simulate_hypnogram generates a realistic 5-stage
hypnogram with physiologically plausible stage transitions:
>>> hyp = yasa.simulate_hypnogram(tib=480, n_stages=5, seed=42)
>>> hyp
<Hypnogram | 960 epochs x 30s (480.00 minutes), 5 unique stages>
- Use `.hypno` to get the string values as a pandas.Series
- Use `.as_int()` to get the integer values as a pandas.Series
- Use `.plot_hypnogram()` to plot the hypnogram
See the online documentation for more details.
Adding a start time#
Attaching a recording start time turns the epoch index into a
pandas.DatetimeIndex:
>>> hyp = yasa.simulate_hypnogram(
... tib=480, n_stages=5, start="2024-01-15 23:00:00", seed=42
... )
>>> hyp.start
Timestamp('2024-01-15 23:00:00')
>>> hyp.end
Timestamp('2024-01-16 07:00:00')
The timezone can be set with the tz parameter:
>>> hyp = yasa.Hypnogram(
... ["WAKE", "N1", "N2", "N3", "REM"],
... start="2024-01-15 23:00:00",
... tz="America/New_York",
... )
Exploring the data#
The stage labels are stored as a pandas.Series and can be accessed with the
hypno property, which inherits all standard pandas methods
(.describe(), .value_counts(), .to_csv(), …):
>>> hyp = yasa.simulate_hypnogram(tib=480, n_stages=5, seed=42)
>>> hyp.hypno.value_counts()
Stage
N2 481
WAKE 164
N1 134
N3 106
REM 75
ART 0
UNS 0
Name: count, dtype: int64
A few useful properties:
>>> hyp.n_epochs # number of 30-s epochs
960
>>> hyp.duration # total recording duration in minutes
480.0
>>> hyp.freq # epoch length as a pandas offset string
'30s'
>>> hyp.n_stages # number of sleep stages (2 / 3 / 4 / 5)
5
To get integer-encoded stages (compatible with legacy YASA functions), use as_int:
>>> hyp.as_int().head()
Epoch
0 0
1 0
2 0
3 0
4 0
Name: Stage, dtype: int16
To get a BIDS-compatible events table (onset, duration, stage) use as_events:
>>> hyp.as_events().head()
onset duration value description
epoch
0 0.0 30.0 0 WAKE
1 30.0 30.0 0 WAKE
2 60.0 30.0 0 WAKE
3 90.0 30.0 0 WAKE
4 120.0 30.0 0 WAKE
Boolean masks#
get_mask returns a boolean NumPy array marking which epochs belong to the
specified stages. This is useful for indexing EEG data arrays or computing stage-specific metrics:
>>> nrem_mask = hyp.get_mask(["N2", "N3"])
>>> nrem_mask[:5]
array([False, False, False, False, False])
>>> # How many NREM epochs?
>>> nrem_mask.sum()
587
Slicing and cropping#
You can slice a Hypnogram with Python’s standard indexing syntax. The result is
always a new Hypnogram:
>>> hyp = yasa.simulate_hypnogram(tib=480, n_stages=5, seed=42)
>>> # First epoch
>>> hyp[0].hypno.iloc[0]
'WAKE'
>>> # Epochs 100 to 199
>>> hyp[100:200]
<Hypnogram | 100 epochs x 30s (50.00 minutes), 5 unique stages>
- Use `.hypno` to get the string values as a pandas.Series
- Use `.as_int()` to get the integer values as a pandas.Series
- Use `.plot_hypnogram()` to plot the hypnogram
See the online documentation for more details.
For time-based slicing when a start time is set, use crop:
>>> hyp = yasa.simulate_hypnogram(tib=480, start="2024-01-15 23:00:00", seed=42)
>>> hyp_night = hyp.crop("2024-01-15 23:30:00", "2024-01-16 06:00:00")
Padding#
pad lets you extend the hypnogram before or after with a fill stage,
which is handy when aligning hypnograms recorded at different times:
>>> hyp = yasa.Hypnogram(["N2", "N2", "REM"])
>>> hyp.pad(before=2, after=1, fill_value="WAKE")
<Hypnogram | 6 epochs x 30s (3.00 minutes), 5 unique stages>
- Use `.hypno` to get the string values as a pandas.Series
- Use `.as_int()` to get the integer values as a pandas.Series
- Use `.plot_hypnogram()` to plot the hypnogram
See the online documentation for more details.
Sleep statistics#
sleep_statistics returns a dictionary of standard AASM metrics: Total
Sleep Time (TST), Sleep Efficiency (SE), Wake After Sleep Onset (WASO), stage durations, and more:
>>> import pandas as pd
>>> pd.Series(hyp.sleep_statistics())
TIB 480.0000
SPT 477.5000
WASO 79.5000
TST 398.0000
SE 82.9167
SME 83.3508
SFI 0.7538
SOL 2.5000
SOL_5min 2.5000
Lat_REM 67.0000
WAKE 82.0000
N1 67.0000
N2 240.5000
N3 53.0000
REM 37.5000
%N1 16.8342
%N2 60.4271
%N3 13.3166
%REM 9.4221
dtype: float64
Stage-transition matrix#
transition_matrix returns two DataFrames: the raw transition counts and
the row-normalized probability matrix. The probability matrix answers the question: “Given that
the current epoch is stage A, how likely is the next epoch to be stage B?”
>>> counts, probs = hyp.transition_matrix()
>>> probs.round(3)
To Stage WAKE N1 N2 N3 REM
From Stage
WAKE 0.933 0.067 0.000 0.000 0.00
N1 0.045 0.739 0.216 0.000 0.00
N2 0.006 0.033 0.929 0.021 0.01
N3 0.010 0.029 0.038 0.914 0.01
REM 0.000 0.067 0.013 0.000 0.92
Sleep periods#
find_periods detects consecutive runs of a given stage that exceed a minimum
duration. This is useful for identifying stable sleep bouts, wake periods, or REM episodes:
>>> # Find all REM periods lasting at least 5 minutes
>>> hyp.find_periods(threshold="5min").query("values == 'REM'")
Merging stages#
consolidate_stages collapses a fine-grained hypnogram into a coarser one.
This is useful when a downstream analysis does not require the full 5-stage resolution:
>>> hyp = yasa.simulate_hypnogram(tib=480, n_stages=5, seed=42)
>>> # 5-stage to 3-stage (Wake / NREM / REM)
>>> hyp3 = hyp.consolidate_stages(3)
>>> hyp3.hypno.value_counts()
Stage
NREM 721
WAKE 164
REM 75
ART 0
UNS 0
Name: count, dtype: int64
>>> # 5-stage to 2-stage (Wake / Sleep)
>>> hyp2 = hyp.consolidate_stages(2)
>>> hyp2.hypno.value_counts()
Stage
SLEEP 796
WAKE 164
ART 0
UNS 0
Name: count, dtype: int64
Visualization#
Hypnogram plot#
plot_hypnogram draws the classic staircase hypnogram:
>>> hyp.plot_hypnogram()
import matplotlib.pyplot as plt
import yasa
hyp = yasa.simulate_hypnogram(tib=480, n_stages=5, seed=42)
fig, ax = plt.subplots(figsize=(12, 4))
hyp.plot_hypnogram(ax=ax)
fig.tight_layout()
Hypnodensity plot#
When stage probabilities are available (e.g. from SleepStaging),
plot_hypnodensity shows the per-epoch probability of each stage as a
stacked area chart, giving a more nuanced view of sleep dynamics:
>>> sls = yasa.SleepStaging(raw, eeg_name="C3-A2")
>>> hyp = sls.predict()
>>> hyp.plot_hypnodensity()
The example below uses a simulated hypnogram with synthetic stage probabilities to illustrate the output:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import yasa
hyp = yasa.simulate_hypnogram(tib=480, n_stages=5, seed=42)
stages = ["WAKE", "N1", "N2", "N3", "REM"]
rng = np.random.default_rng(42)
one_hot = (
pd.get_dummies(hyp.hypno)
.reindex(columns=stages, fill_value=0)
.to_numpy(dtype=float)
)
noise = rng.dirichlet(np.ones(5) * 0.5, size=hyp.n_epochs)
raw_proba = 0.75 * one_hot + 0.25 * noise
proba = pd.DataFrame(raw_proba / raw_proba.sum(axis=1, keepdims=True), columns=stages)
hyp_with_proba = yasa.Hypnogram(hyp.hypno, n_stages=5, proba=proba)
fig, ax = plt.subplots(figsize=(12, 4))
hyp_with_proba.plot_hypnodensity(ax=ax)
fig.tight_layout()
Aligning with EEG data#
To use a hypnogram alongside raw EEG data, YASA needs a sample-level label for every EEG sample,
not just one per 30-second epoch. upsample_to_data handles this
automatically:
>>> import mne
>>> raw = mne.io.read_raw_edf("recording.edf", preload=True)
>>> hyp = yasa.Hypnogram.from_integers(int_hypno, freq="30s")
>>> hypno_up = hyp.upsample_to_data(raw)
Tip
As of YASA 0.7, most detection functions accept a Hypnogram object directly.
No manual upsampling is needed. Just pass hypno=hyp:
>>> sp = yasa.spindles_detect(raw, hypno=hyp, include=["N2", "N3"])
Alignment modes#
The behavior of upsample_to_data depends on whether timestamp information
is available.
Length-based alignment (default)
YASA assumes the hypnogram and the recording start at the same time. Any length mismatch is
resolved by padding or cropping at the end. This mode is always used when data is a NumPy
array. It is also used when data is an mne.io.BaseRaw but either
Hypnogram.start is not set or raw.meas_date is None.
>>> hyp = yasa.Hypnogram(stages, freq="30s")
>>> hypno = hyp.upsample_to_data(raw)
Timestamp-aware alignment
Triggered automatically when both Hypnogram.start and raw.meas_date are set. YASA
computes the absolute offset between the two timestamps and selects the correct hypnogram epochs.
>>> # EDF recorded at 22:11:37 local time
>>> hyp = yasa.Hypnogram(stages, freq="30s", start="2024-11-08 22:11:37")
>>> hypno = hyp.upsample_to_data(raw)
Common scenarios#
Hypnogram and recording cover the same window
The hypnogram is upsampled and fits the data exactly. Both alignment modes give the same result.
Hypnogram is shorter than the recording
This happens when the hypnogram covers only the Lights Off to Lights On period while the PSG spans a longer window.
Length-based: the hypnogram is padded with Unscored (
UNS) at the end.Timestamp-aware: the correct number of
UNSepochs is prepended before Lights Off, and any remaining tail is also padded.
>>> hyp = yasa.Hypnogram(stages, freq="30s", start="2024-01-15 23:00:00")
>>> hypno = hyp.upsample_to_data(raw)
>>> # Epochs before Lights Off and after Lights On become UNS
Hypnogram is longer than the recording
This happens when working with a cropped segment of a full-night recording.
Length-based: the hypnogram is cropped from the end. This is only correct if the segment starts at the very beginning of the recording.
Timestamp-aware: YASA skips the correct leading epochs based on the timestamp offset and selects only the epochs that fall within the recording window.
>>> # Full-night hypnogram, but only the second half of the night is loaded
>>> hyp = yasa.Hypnogram(stages, freq="30s", start="2024-01-15 23:00:00")
>>> hypno = hyp.upsample_to_data(raw_cropped) # correct epochs selected automatically
Automatic staging with SleepStaging
When using SleepStaging, the start attribute is populated automatically
from raw.meas_date when available, so timestamp-aware alignment works out of the box:
>>> sls = yasa.SleepStaging(raw, eeg_name="C4-M1")
>>> hyp = sls.predict() # hyp.start set automatically from raw.meas_date
>>> hypno = hyp.upsample_to_data(raw_cropped)
Saving and loading#
A Hypnogram including all metadata (epoch length, start time, scorer, stage
probabilities) can be saved to a JSON file and reloaded later:
>>> hyp = yasa.simulate_hypnogram(tib=480, n_stages=5, scorer="Expert", seed=42)
>>> # Save to disk
>>> hyp.to_json("my_hypnogram.json")
>>> # Reload it later, all metadata is preserved
>>> hyp2 = yasa.Hypnogram.from_json("my_hypnogram.json")
You can also use to_dict /
from_dict, which produce a plain JSON-serializable Python dictionary in
the same format as to_json.
Comparing two hypnograms#
evaluate computes epoch-by-epoch agreement metrics between a reference and
an observer hypnogram, including Cohen’s kappa, Matthews correlation coefficient, and per-stage
F1 scores:
>>> ref = yasa.simulate_hypnogram(tib=480, n_stages=5, scorer="Expert", seed=42)
>>> obs = yasa.simulate_hypnogram(tib=480, n_stages=5, scorer="Auto", seed=99)
>>> agreement = ref.evaluate(obs)
Note
evaluate is experimental and the output format may change in future
releases.
Next steps#
Quickstart: end-to-end walkthrough with real PSG data.
API reference: full API reference for
Hypnogramand all other YASA classes.Jupyter notebooks: hands-on notebooks with downloadable example datasets.