spacepy.toolbox.get_url

spacepy.toolbox.get_url(url, outfile=None, reporthook=None, cached=False, keepalive=False, conn=None)

Read data from a URL

Open an HTTP URL, honoring the user agent as specified in the SpacePy config file. Returns the data, optionally also writing out to a file.

This is similar to the deprecated urllib.request.urlretrieve.

Changed in version 0.5.0: Invalid combinations of cached and outfile now raise ValueError; earlier versions of SpacePy raised RuntimeError.

Parameters:
url : str

The URL to open

outfile : str (optional)

Full path to file to write data to

reporthook : callable (optional)

Function for reporting progress; takes arguments of block count, block size, and total size.

cached : bool (optional)

Compare the modification time of the URL to the modification time of outfile; do not retrieve (and return None) unless the URL is newer than the file. If set, outfile is required.

keepalive : bool (optional)

Attempt to keep the connection open to retrieve more URLs. The return value becomes a tuple of (data, conn) so that the connection can be reused. This mode does not support proxies. Required to be True if conn is provided. (Default False)

conn : http.client.HTTPConnection (optional)

An established HTTP connection (HTTPS is also okay) to use with keepalive. If not provided, will attempt to make a connection.
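
The interplay of cached and outfile described above can be sketched as follows. The helper refresh_file is hypothetical (it is not part of SpacePy), and its fetch argument is an assumption added so the logic can be exercised without a network; when fetch is omitted, spacepy.toolbox.get_url is used.

```python
def refresh_file(url, outfile, fetch=None):
    """Download url to outfile only if the URL is newer than the file.

    Hypothetical helper, not part of SpacePy. Returns True if data came
    back from the server, False if the download was skipped.
    """
    if fetch is None:
        # Imported lazily so the sketch can be tried with a stand-in.
        from spacepy.toolbox import get_url as fetch
    # cached=True requires outfile; the documented behavior is to return
    # None when the URL is not newer than the existing file.
    data = fetch(url, outfile=outfile, cached=True)
    return data is not None
```

Calling refresh_file(url, path) repeatedly should then only transfer data when the remote copy has changed, which is the typical reason to set cached.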

Returns:
bytes

The HTTP data from the server.
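
When keepalive is set, the return is a (data, conn) tuple rather than bare bytes. A minimal sketch of reusing the connection across several requests is given below. The helper fetch_many and its fetch argument are hypothetical (not part of SpacePy), and the sketch assumes all URLs are served by the same host, since how a reused connection behaves across hosts is not stated here.

```python
def fetch_many(urls, fetch=None):
    """Retrieve several URLs, reusing one connection where possible.

    Hypothetical helper, not part of SpacePy. Assumes every URL is on
    the same server. Returns a list of bytes in the order of urls.
    """
    if fetch is None:
        # Imported lazily so the sketch can be tried with a stand-in.
        from spacepy.toolbox import get_url as fetch
    results = []
    conn = None  # first call lets get_url make the connection
    try:
        for url in urls:
            # With keepalive=True the return is (data, conn); handing the
            # connection back in lets the next request reuse it.
            data, conn = fetch(url, keepalive=True, conn=conn)
            results.append(data)
    finally:
        if conn is not None:
            conn.close()  # http.client.HTTPConnection.close()
    return results
```

Because keepalive does not support proxies, a loop like this is only appropriate where a direct connection to the server is possible.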

See also

progressbar
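
A hand-written reporthook is also possible. The sketch below builds a callable taking the three arguments described for reporthook (block count, block size, total size) and records the percentage complete. The name make_percent_hook is hypothetical, and treating a non-positive total size as "unknown" is an assumption borrowed from urlretrieve rather than documented behavior of get_url.

```python
def make_percent_hook(log):
    """Return a reporthook that appends the percent complete to log.

    Hypothetical helper, not part of SpacePy.
    """
    def hook(block_count, block_size, total_size):
        if total_size <= 0:
            # Assumption: a non-positive total means the size is unknown,
            # so no meaningful percentage can be computed.
            return
        # The last block may be partial, so cap at the total size.
        done = min(block_count * block_size, total_size)
        log.append(round(100.0 * done / total_size))
    return hook
```

Passing reporthook=make_percent_hook(progress) to get_url, with progress an empty list, would then fill the list as blocks arrive.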

Notes

This function honors proxy settings as described in urllib.request.getproxies(). Cryptic error messages (such as Network is unreachable) may indicate that proxy settings should be defined as appropriate for your environment (e.g. with HTTP_PROXY or HTTPS_PROXY environment variables).
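
To check which proxy settings Python currently detects, urllib.request.getproxies() can be inspected directly. The helper name proxy_summary below is hypothetical; it only formats the mapping that the standard library returns.

```python
import urllib.request


def proxy_summary():
    """Return sorted 'scheme -> proxy' strings for the detected proxies."""
    # getproxies() maps scheme names (e.g. 'http') to proxy URLs, drawing
    # on environment variables such as HTTP_PROXY and, on some platforms,
    # system settings.
    proxies = urllib.request.getproxies()
    return ["{} -> {}".format(scheme, proxies[scheme])
            for scheme in sorted(proxies)]
```

An empty list from proxy_summary() means no proxy was detected, which may explain errors like Network is unreachable on networks that require one.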