I am working on archiving media from our instance using the Python API, but the sg.download_attachment() function seems to have a habit of getting stuck. Is there something I am doing wrong that could cause this? If it threw an error I could at least handle it and continue, but as it is I basically have to supervise the script and restart it every so often to archive a large number of versions.
Not sure about the cause, but perhaps you can try running that bit in a coroutine and giving it a timeout?
or send it as a Deadline job with a timeout?
I have not used async in Python before, so I tried to add a timeout to the download with asyncio.timeout(), but that didn't seem to work. It doesn't seem to actually stop the download at all.
Here is my download function:
async def downloadVersion(downloadLink, filepath):
    async with asyncio.timeout(30):
        sg.download_attachment(
            {"type": downloadLink["type"], "id": downloadLink["id"]},
            file_path=filepath,
        )
and here is how I call it:
if not os.path.exists(filePath) and versions:
    try:
        await downloadVersion(downloadLink, filePath)
    except asyncio.TimeoutError:
        logger.error(
            f"Download for {versionName} timed out at {datetime.datetime.now()}. Full path: {filePath}"
        )
    except Exception as error:
        logger.error(
            f"Error {error} of type {type(error)} happened at {datetime.datetime.now()}"
        )
I am also not sure what you mean by "send it as a Deadline job with a timeout"; could you elaborate on that?
At this point I am considering just using requests inside Python to get more control over the download.
Deadline is a render farm manager. Maybe you run something different, but it's easy to send lots of jobs to it and get your downloads done that way, without a failure of one affecting everything else.
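Even without a farm you can get the same isolation locally by running each download in its own process with a timeout, since killing the process is the only way to really stop a call that has hung. A rough sketch, where download_one.py is a hypothetical little script that does a single sg.download_attachment() call for one attachment id and exits:

import subprocess

def download_in_subprocess(attachment_id, filepath, timeout=600):
    # subprocess.run() kills the child when the timeout expires, so a
    # hung download cannot stall the rest of the archive run.
    try:
        subprocess.run(
            ["python", "download_one.py", str(attachment_id), filepath],
            check=True,
            timeout=timeout,
        )
        return True
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError):
        return False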
I believe you can give your await a timeout.
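One thing to check: asyncio.timeout() can only interrupt a task at an await point, and sg.download_attachment() is a plain blocking call, so the event loop never gets a chance to enforce the timeout while it is stuck. Something along these lines might behave better (untested sketch, assuming sg is your existing shotgun_api3 connection):

import asyncio

async def downloadVersion(downloadLink, filepath, timeout=30):
    # Run the blocking download in a worker thread so the timeout can
    # actually fire; asyncio.wait_for() raises TimeoutError when it does.
    await asyncio.wait_for(
        asyncio.to_thread(
            sg.download_attachment,
            {"type": downloadLink["type"], "id": downloadLink["id"]},
            file_path=filepath,
        ),
        timeout=timeout,
    )

Caveat: when the timeout fires, the worker thread keeps running in the background; the await stops blocking your script, but the hung download itself is not aborted.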
Alright. I did try that, but it didn't seem to work well either. I ended up solving this by downloading the files directly with the requests library and building timeout and retry logic around the download myself.
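In case it helps anyone else, the core of it looks roughly like this (simplified; it assumes url is already a direct download URL for the attachment, and the names and timeout values are just placeholders):

import logging
import time

import requests

logger = logging.getLogger(__name__)

def download_with_retries(url, filepath, timeout=(10, 60), retries=3):
    # timeout=(connect, read): the read timeout applies per chunk of data,
    # so a stalled transfer raises an exception instead of hanging forever.
    for attempt in range(1, retries + 1):
        try:
            with requests.get(url, stream=True, timeout=timeout) as response:
                response.raise_for_status()
                with open(filepath, "wb") as f:
                    for chunk in response.iter_content(chunk_size=1024 * 1024):
                        f.write(chunk)
            return True
        except requests.RequestException as error:
            logger.error(f"Attempt {attempt} for {filepath} failed: {error}")
            time.sleep(5)
    return False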