format_content_with_images
- torchtune.data.format_content_with_images(content: str, *, image_tag: str, images: List[PIL.Image.Image]) → List[Dict[str, Any]]
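Given a raw text string, split it by the specified image_tag and format the pieces into a list of dictionaries to be used in the content field of a Message. The returned list fills the "content" entry of a message in a structure like:

    [
        {
            "role": "system" | "user" | "assistant",
            "content": [
                {"type": "image", "content": <PIL.Image.Image>},
                {"type": "text", "content": "This is a sample image."},
            ],
        },
        ...
    ]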
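- Parameters:
content (str) – raw message text
image_tag (str) – string to split the text by
images (List[PIL.Image.Image]) – list of images to be used in the content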
- Raises:
ValueError – If the number of images does not match the number of image tags in the content
Examples

    >>> content = format_content_with_images(
    ...     "<|image|>hello <|image|>world",
    ...     image_tag="<|image|>",
    ...     images=[<PIL.Image.Image>, <PIL.Image.Image>]
    ... )
    >>> print(content)
    [
        {"type": "image", "content": <PIL.Image.Image>},
        {"type": "text", "content": "hello "},
        {"type": "image", "content": <PIL.Image.Image>},
        {"type": "text", "content": "world"}
    ]
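To make the splitting behavior concrete, here is a minimal pure-Python sketch that reproduces the documented input and output without requiring torchtune or PIL. The function name split_content_by_image_tag is hypothetical, plain strings stand in for PIL.Image.Image objects, and the logic is an assumption inferred from the description above, not torchtune's actual implementation.

```python
from typing import Any, Dict, List


def split_content_by_image_tag(
    content: str, *, image_tag: str, images: List[Any]
) -> List[Dict[str, Any]]:
    """Hypothetical stand-in for format_content_with_images (illustration only)."""
    # The documented contract: one image per tag, otherwise ValueError.
    num_tags = content.count(image_tag)
    if num_tags != len(images):
        raise ValueError(
            f"Number of images ({len(images)}) does not match "
            f"number of image tags ({num_tags})"
        )

    remaining = list(images)  # copy so the caller's list is left untouched
    pieces = content.split(image_tag)
    parts: List[Dict[str, Any]] = []
    for i, piece in enumerate(pieces):
        # Empty strings appear when a tag starts or ends the text; skip them.
        if piece:
            parts.append({"type": "text", "content": piece})
        # Every boundary between two pieces corresponds to one image tag.
        if i < len(pieces) - 1:
            parts.append({"type": "image", "content": remaining.pop(0)})
    return parts


# Placeholder strings stand in for real PIL images in this sketch.
result = split_content_by_image_tag(
    "<|image|>hello <|image|>world",
    image_tag="<|image|>",
    images=["img_a", "img_b"],
)
for part in result:
    print(part)

# A mismatch between tags and images raises ValueError.
try:
    split_content_by_image_tag("<|image|>hello", image_tag="<|image|>", images=[])
except ValueError as err:
    print("ValueError:", err)
```

Copying the images list before consuming it is a design choice of this sketch, so the caller's list is not mutated; whether the library does the same is not stated in the text above.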