# module `core.crawler`

## class `Crawler`

Base class for crawlers.

If you want to create a new crawler, you should inherit from this class. It provides useful methods for crawling.
**Args:**

- `url` (str): Base URL for the crawler.

### method `__init__`

`__init__(url: str)`
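Since the class is meant to be subclassed, a minimal sketch of the pattern may help. Only the `Crawler` name and the `__init__(url)` shape come from the documentation; the `NewsCrawler` subclass and the placeholder page body are hypothetical, used so the sketch runs without network access.

```python
# Hypothetical sketch of the subclassing pattern; not the real core.crawler code.

class Crawler:
    """Minimal stand-in for the documented Crawler base class."""

    def __init__(self, url: str):
        # Base URL for the crawler, as in the documented __init__.
        self.url = url

    def get_page(self, url: str = None, **kwargs) -> str:
        # Placeholder body so the sketch runs offline; the real method
        # wraps get_response and returns the page as a string.
        return f"<html>page at {url or self.url}</html>"


class NewsCrawler(Crawler):
    """Example subclass adding site-specific behavior (hypothetical)."""

    def headline_page(self) -> str:
        return self.get_page()


crawler = NewsCrawler("https://example.com")
print(crawler.headline_page())  # prints: <html>page at https://example.com</html>
```

A real subclass would typically override or extend the fetching methods below rather than redefine them.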
### method `get_page`

`get_page(url: str = None, path: str = <class 'str'>, **kwargs) → str`

Get a page from a given URL. This is just a wrapper around the `get_response` method.
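To illustrate the wrapper relationship between `get_page` and `get_response`, here is a self-contained sketch. The `MiniCrawler` class, the `Response` stand-in, and the `path` default of `""` are assumptions (the documented default `<class 'str'>` appears to be an artifact of the generated signature); the real implementation would call `requests.get` instead of building a fake response.

```python
# Hedged sketch of a get_page-style wrapper over get_response.
# Response and MiniCrawler are hypothetical stand-ins, not core.crawler code.
from dataclasses import dataclass
from urllib.parse import urljoin


@dataclass
class Response:
    text: str  # stand-in for requests.Response.text


class MiniCrawler:
    def __init__(self, url: str):
        self.url = url  # base URL

    def get_response(self, url: str = None, path: str = "", **kwargs) -> Response:
        # Resolve the target: an explicit url wins, else join path onto the base.
        target = url or urljoin(self.url, path)
        # A real implementation would do: return requests.get(target, **kwargs)
        return Response(text=f"<html>{target}</html>")

    def get_page(self, url: str = None, path: str = "", **kwargs) -> str:
        # Wrapper: delegate to get_response and return the body text.
        return self.get_response(url=url, path=path, **kwargs).text


crawler = MiniCrawler("https://example.com/")
print(crawler.get_page(path="news"))  # prints: <html>https://example.com/news</html>
```

The same delegation shape applies one level up: `get_page_soup` wraps `get_page` in the same way `get_page` wraps `get_response`.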
### method `get_page_soup`

`get_page_soup(url: str = None, enable_cache: bool = True, **kwargs) → BeautifulSoup`

Get a `BeautifulSoup` object from a given URL. This is just a wrapper around the `get_page` method.
### method `get_response`

`get_response(url: str = None, path: str = <class 'str'>, **kwargs) → Response`

Get a response from a given URL.

**Args:**

- `url` (str, optional): URL to get the response from. Defaults to `None`.
- `path` (str, optional): Path to join with the base URL. Defaults to `str`.
- `**kwargs`: Keyword arguments to pass to `requests.get`.

**Returns:**

- `requests.Response`: Response from the given URL.

### method `join_url`

`join_url(*args: str) → str`
Join URL parts.

**Args:**

- `*args` (str): URL parts.

**Returns:**

- `str`: Joined URL.

---

_This file was automatically generated via lazydocs._
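As a closing illustration, a `join_url`-style helper can be sketched as follows. The slash-stripping behavior is an assumption about what "join URL parts" means here, not taken from the source; the real `Crawler.join_url` may behave differently (for example, by delegating to `urllib.parse.urljoin`).

```python
# Hypothetical sketch of a join_url-style helper; not the real Crawler.join_url.

def join_url(*args: str) -> str:
    # Strip stray slashes from each part, then rejoin with single slashes,
    # so "a/", "/b" and "a", "b" all produce "a/b".
    return "/".join(part.strip("/") for part in args)


print(join_url("https://example.com/", "/api/", "v1"))
# prints: https://example.com/api/v1
```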