spacepy.toolbox.LinkExtracter

class spacepy.toolbox.LinkExtracter(*, convert_charrefs=True)

Finds all links in an HTML page, useful for crawling.

After HTML has been parsed, the links attribute contains a list of link targets.
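The behaviour described above can be sketched with the standard library alone. The class below is a hypothetical stand-in, not spacepy's actual source: it assumes that LinkExtracter builds on html.parser.HTMLParser (which has the same constructor signature) and that a "link target" means the href value of an a tag. The names SketchLinkExtractor and extract_links are invented for this illustration.

```python
from html.parser import HTMLParser


class SketchLinkExtractor(HTMLParser):
    """Hypothetical stand-in for LinkExtracter; not spacepy's actual code."""

    def __init__(self, *, convert_charrefs=True):
        self.links = []
        # HTMLParser.__init__ calls self.reset(), which also clears links.
        super().__init__(convert_charrefs=convert_charrefs)

    def reset(self, *args, **kwargs):
        # Drop collected links along with the parser's buffered input.
        self.links = []
        super().reset(*args, **kwargs)

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with lowercased names.
        # Assumption: only href values of <a> tags count as link targets.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html):
    """Feed a complete HTML string and return the collected link targets."""
    parser = SketchLinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links
```

With the real class, the equivalent usage would presumably be to create an instance, call feed() with the page text, and then read the links attribute.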

__init__(*, convert_charrefs=True)

Initialize and reset this instance.

If convert_charrefs is True (the default), all character references are automatically converted to the corresponding Unicode characters.
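The effect of convert_charrefs can be seen with a small parser built directly on html.parser.HTMLParser, whose constructor takes the same keyword. This is only an illustration of the standard-library behaviour; TextCollector and collect are hypothetical names and are not part of spacepy.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper that records text data and named entity references."""

    def __init__(self, *, convert_charrefs=True):
        super().__init__(convert_charrefs=convert_charrefs)
        self.text = []
        self.entities = []

    def handle_data(self, data):
        self.text.append(data)

    def handle_entityref(self, name):
        # Only called for text content when convert_charrefs is False.
        self.entities.append(name)


def collect(html, convert_charrefs):
    """Parse html and return (joined text, list of entity names seen)."""
    parser = TextCollector(convert_charrefs=convert_charrefs)
    parser.feed(html)
    parser.close()
    return "".join(parser.text), parser.entities
```

With convert_charrefs=True the reference in "a &amp;amp; b" reaches handle_data already converted to "&"; with False it is reported separately through handle_entityref and the surrounding text arrives in pieces.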

Methods

handle_starttag(tag, attrs)

reset(*args, **kwargs)

Reset this instance.

Attributes

handle_starttag(tag, attrs)

reset(*args, **kwargs)

Reset this instance. Loses all unprocessed data.