The vision-language model (VLM) is a core technology of modern artificial intelligence (AI) and can be used to represent ...