Instructions to use answerdotai/ModernBERT-base with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use answerdotai/ModernBERT-base with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("fill-mask", model="answerdotai/ModernBERT-base")

# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")
model = AutoModelForMaskedLM.from_pretrained("answerdotai/ModernBERT-base")
```
- Notebooks
- Google Colab
- Kaggle
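The pipeline snippet above can be exercised end to end like this; it is a minimal sketch, and the example sentence is an arbitrary choice (ModernBERT uses the standard `[MASK]` token):

```python
from transformers import pipeline

# Build a fill-mask pipeline; this downloads answerdotai/ModernBERT-base on first use
pipe = pipeline("fill-mask", model="answerdotai/ModernBERT-base")

# Ask the model to fill in the masked token
predictions = pipe("The capital of France is [MASK].")

# Each prediction is a dict with 'score', 'token_str', and 'sequence' keys,
# sorted by descending probability
for p in predictions:
    print(f"{p['token_str']!r}: {p['score']:.3f}")
```

By default the pipeline returns the top 5 candidates; pass `top_k` to change that.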
Pretraining data cutoff?
#17
by ytsaig
Hello,
What is the date cutoff for the data used for pretraining the model? The associated paper says:
"Although ModernBERT showcase strong results across the board, it should be noted that an important factor in its performance is TREC-COVID (Voorhees et al., 2021), potentially showcasing the benefits of ModernBERT being trained with a more recent knowledge cutoff than most existing encoders. "
However there's no explicit mention of the cutoff date.
Thank you!