Engage in multimodal chat with LLMs and other ML models
Answer questions using images and Chinese text
Tokenize and decode text sequences using different models