jordiclive/flan-t5-3b-summarizer

text2text-generation

document summary

text-generation-inference

jordiclive committed on Feb 5, 2023

Commit a40b6f3 · 1 parent: b179c47

Update README.md

Files changed (1)

README.md +4 -6

@@ -194,15 +194,13 @@ result = summarizer(
 If having computing constraints, try the base version [`pszemraj/led-base-book-summary`](https://huggingface.co/pszemraj/led-base-book-summary)
 - all the parameters for generation on the API here are the same as [the base model](https://huggingface.co/pszemraj/led-base-book-summary) for easy comparison between versions.
-## Training and evaluation data
-- the [booksum](https://arxiv.org/abs/2105.08209) dataset (this is what adds the `bsd-3-clause` license)
-- During training, the input text was the text of the `chapter`, and the output was `summary_text`
-- Eval results can be found [here](https://huggingface.co/datasets/autoevaluate/autoeval-staging-eval-project-kmfoda__booksum-79c1c0d8-10905463) with metrics on the sidebar.
 ## Training procedure
 - Training was done in BF16, deepspeed stage 2 for 6 epochs with ROUGE2 monitored on the validation set.
+## Hardware
+- GPU count	8 NVIDIA A100-SXM4-40GB
+- CPU count	48
 -
 ### Training hyperparameters