GDriveReader
- class torchdata.datapipes.iter.GDriveReader(source_datapipe: IterDataPipe[str], *, timeout: Optional[float] = None, skip_on_error: bool = False, **kwargs: Optional[Dict[str, Any]])
Takes URLs pointing at GDrive files and yields tuples of file name and IO stream (functional name: read_from_gdrive).
- Parameters:
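source_datapipe – a DataPipe that contains URLs to GDrive files
timeout – timeout in seconds for the HTTP request
skip_on_error – whether to skip a URL that causes problems; if False, an exception is raised instead
**kwargs – a dictionary of optional arguments accepted by requests. For the full list, see https://docs.python-requests.org/en/master/api/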
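The way timeout, skip_on_error, and **kwargs interact can be sketched without network access. The block below is a minimal stand-in and not torchdata's implementation: read_urls and fake_fetch are hypothetical names, and the example.invalid URLs are placeholders. It only mimics the behavior described in the parameter list: keyword arguments are forwarded to the fetcher, and a failing URL is either skipped or raises.

```python
import io
from typing import IO, Any, Callable, Iterable, Iterator, Optional, Tuple


def read_urls(
    urls: Iterable[str],
    fetch: Callable[..., IO[bytes]],
    *,
    timeout: Optional[float] = None,
    skip_on_error: bool = False,
    **kwargs: Any,
) -> Iterator[Tuple[str, IO[bytes]]]:
    """Hypothetical stand-in that mimics the documented parameter semantics."""
    for url in urls:
        try:
            # timeout and any extra keyword arguments are forwarded to the fetcher
            stream = fetch(url, timeout=timeout, **kwargs)
        except Exception:
            if skip_on_error:
                continue  # drop the problematic URL and move on to the next one
            raise  # default behavior: surface the error to the caller
        yield url, stream


def fake_fetch(url: str, timeout: Optional[float] = None, **kwargs: Any) -> IO[bytes]:
    """Hypothetical fetcher: fails for URLs containing 'bad'; no network involved."""
    if "bad" in url:
        raise ConnectionError(f"cannot fetch {url}")
    return io.BytesIO(b"first line\nsecond line\n")


urls = ["https://example.invalid/good", "https://example.invalid/bad"]

# With skip_on_error=True the failing URL is silently dropped.
skipped = [url for url, _ in read_urls(urls, fake_fetch, timeout=5.0, skip_on_error=True)]
print(skipped)
```

With skip_on_error left at its default of False, iterating the same list would raise ConnectionError when the second URL is reached, which is the safer choice when a missing file should stop the pipeline.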
Example
from torchdata.datapipes.iter import IterableWrapper, GDriveReader

gdrive_file_url = "https://drive.google.com/uc?export=download&id=SomeIDToAGDriveFile"
gdrive_reader_dp = GDriveReader(IterableWrapper([gdrive_file_url]))
reader_dp = gdrive_reader_dp.readlines()
it = iter(reader_dp)
path, line = next(it)
print((path, line))
Output
('https://drive.google.com/uc?export=download&id=SomeIDToAGDriveFile', b'<First line from the GDrive File>')
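The output shows that each line arrives as bytes paired with its source URL. A minimal sketch of decoding such pairs into text follows; decode_lines is a hypothetical helper, UTF-8 is an assumed encoding, and the sample list stands in for a real datapipe so the block runs without network access.

```python
from typing import Iterable, Iterator, Tuple


def decode_lines(
    pairs: Iterable[Tuple[str, bytes]], encoding: str = "utf-8"
) -> Iterator[Tuple[str, str]]:
    """Hypothetical helper: turn (path, bytes) pairs into (path, str) pairs."""
    for path, line in pairs:
        yield path, line.decode(encoding)


# Stand-in for the (path, line) tuples shown in the output above.
sample = [
    ("https://drive.google.com/uc?export=download&id=SomeIDToAGDriveFile", b"hello"),
]
decoded = list(decode_lines(sample))
print(decoded)
```

Any iterable of (path, bytes) tuples can be passed in, so the same helper would apply to the reader_dp from the example, assuming the file's contents are valid in the chosen encoding.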