Evaluate on MS-COCO and Flickr30K for Image-to-Text and Text-to-Image tasks.
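Assuming these are the usual cross-modal retrieval tasks, the standard metric is Recall@K in both directions. Here is a minimal sketch in plain Python. The names `recall_at_k`, `sim_i2t`, and `positives` are made up for illustration, and the similarity values are toy numbers, not results from either dataset.

```python
def recall_at_k(sim, positives, k):
    """Fraction of queries with at least one correct candidate in the top k.

    sim[i][j]    : similarity of query i to candidate j
    positives[i] : set of candidate indices that are correct for query i
    """
    hits = 0
    for scores, pos in zip(sim, positives):
        # Candidate indices sorted by descending similarity.
        ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        if any(j in pos for j in ranked[:k]):
            hits += 1
    return hits / len(sim)


def transpose(matrix):
    """Swap queries and candidates to evaluate the opposite direction."""
    return [list(col) for col in zip(*matrix)]


# Toy setup (hypothetical values): 3 images x 3 captions, caption i matches image i.
sim_i2t = [
    [0.9, 0.1, 0.2],
    [0.3, 0.4, 0.8],
    [0.1, 0.2, 0.7],
]
positives = [{0}, {1}, {2}]

i2t_r1 = recall_at_k(sim_i2t, positives, 1)             # Image-to-Text
t2i_r1 = recall_at_k(transpose(sim_i2t), positives, 1)  # Text-to-Image
print(f"I2T R@1 = {i2t_r1:.3f}, T2I R@1 = {t2i_r1:.3f}")
```

With several captions per image, `positives[i]` would hold every matching caption index for image i, and the Text-to-Image direction would need its own list mapping each caption back to its image.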

How does the 4-bit quantization affect the embedding space compared to FP16?
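One way to start on this question is to embed the same inputs with both model variants and compare the two sets of vectors directly. The sketch below is an assumption about how that could be done, not a prescribed protocol. It computes the mean cosine similarity between paired embeddings and checks how often each item keeps the same nearest neighbor. The vectors in `ref` and `quant` are toy placeholders standing in for FP16 and 4-bit outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_paired_cosine(ref, quant):
    """Average cosine similarity between embeddings of the same input."""
    return sum(cosine(r, q) for r, q in zip(ref, quant)) / len(ref)


def nearest_neighbor(embs, i):
    """Index of the item most similar to item i, excluding i itself."""
    others = (j for j in range(len(embs)) if j != i)
    return max(others, key=lambda j: cosine(embs[i], embs[j]))


def neighbor_agreement(ref, quant):
    """Fraction of items whose nearest neighbor is unchanged after quantization."""
    same = sum(
        nearest_neighbor(ref, i) == nearest_neighbor(quant, i)
        for i in range(len(ref))
    )
    return same / len(ref)


# Toy placeholders (hypothetical): same three inputs embedded by each variant.
ref = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]        # stands in for FP16
quant = [[1.0, 0.05], [0.85, 0.15], [0.05, 1.0]]  # stands in for 4-bit

drift = mean_paired_cosine(ref, quant)
agreement = neighbor_agreement(ref, quant)
print(f"mean paired cosine = {drift:.4f}, neighbor agreement = {agreement:.2f}")
```

Paired cosine only makes sense if both variants share one embedding space. Neighbor agreement is the more retrieval-relevant of the two, since rankings can survive small per-vector drift.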

🌟 This model is built for speed. Your paper should lean heavily into the Efficiency-Accuracy Trade-off curve.
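To draw that curve, a common approach is to keep only the configurations that are not beaten on both axes at once (a Pareto front). The sketch below assumes latency in milliseconds as the efficiency axis and a single accuracy score. The function name `pareto_front` and every number in `runs` are hypothetical placeholders, not measurements.

```python
def pareto_front(points):
    """Return the non-dominated points, sorted by latency.

    points: list of (name, latency_ms, accuracy) tuples.
    A point is dominated if another point is at least as fast and at least
    as accurate, and strictly better on one of the two.
    """
    front = []
    for name, lat, acc in points:
        dominated = any(
            (lat2 <= lat and acc2 >= acc) and (lat2 < lat or acc2 > acc)
            for _, lat2, acc2 in points
        )
        if not dominated:
            front.append((name, lat, acc))
    return sorted(front, key=lambda p: p[1])


# Placeholder configurations (hypothetical numbers, for illustration only).
runs = [
    ("a", 10, 0.60),
    ("b", 20, 0.70),
    ("c", 25, 0.65),  # slower and less accurate than "b", so dominated
    ("d", 40, 0.72),
]

front = pareto_front(runs)
front_names = [name for name, _, _ in front]
print(front_names)
```

Plotting the front as a line, with dominated points as faded markers, makes it easy to see where the model sits relative to baselines.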