Add comprehensive model card for DisTime

by nielsr HF Staff - opened Aug 1

nielsr

Aug 1

This PR significantly enhances the model card by:

Adding the pipeline_tag: video-text-to-text, allowing the model to be discovered under relevant filters on the Hub.
Specifying library_name: transformers, enabling the "How to use" widget for easier inference.
Adding relevant tags such as multimodal, video-understanding, temporal-localization, and qwen for improved discoverability and context.
Linking directly to the Hugging Face paper page: DisTime: Distribution-based Time Representation for Video Large Language Models.
Providing a link to the official GitHub repository for code and further details.
Including the full abstract and a clear transformers-based usage example for quick understanding and implementation.
Adding the citation information and acknowledgements.
Removing the unnecessary "File information" section.
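
For illustration, the metadata changes listed above would end up in the YAML front matter at the top of the model card's README.md. The sketch below renders that front matter using only the values named in this PR description. The helper `render_front_matter` is a hypothetical name introduced here, not part of any library, and the actual merged card may contain additional fields (for example, a license) that this description does not mention.

```python
# Metadata values copied from the PR description above.
# Any other fields in the real model card are not known from this text.
metadata = {
    "pipeline_tag": "video-text-to-text",
    "library_name": "transformers",
    "tags": ["multimodal", "video-understanding", "temporal-localization", "qwen"],
}


def render_front_matter(meta):
    """Hypothetical helper: render a flat dict as YAML front matter.

    Supports only string values and lists of strings, which is all
    that the metadata in this sketch needs.
    """
    lines = ["---"]
    for key, value in meta.items():
        if isinstance(value, list):
            lines.append(f"{key}:")
            lines.extend(f"- {item}" for item in value)
        else:
            lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines)


front_matter = render_front_matter(metadata)
print(front_matter)
```

The Hub reads `pipeline_tag` and `tags` from this block for filtering, and `library_name` to decide which usage snippet to show, which is why the PR adds them as metadata and not as prose in the card body.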

UserJoseph changed pull request status to merged Sep 17
