catenets.datasets.network module

Utilities and helpers for retrieving the datasets

download_gdrive_if_needed(path: pathlib.Path, file_id: str) → None

Helper for downloading a file from Google Drive, if it is not already on the disk.

Parameters
  • path (Path) – Where to download the file.

  • file_id (str) – Google Drive file ID.

download_http_if_needed(path: pathlib.Path, url: str) → None

Helper for downloading a file, if it is not already on the disk.

Parameters
  • path (Path) – Where to download the file.

  • url (str) – HTTP URL for the dataset.
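
The "download only if missing" behaviour described above can be illustrated with a minimal sketch. This is not the catenets implementation: the name fetch_if_missing, the boolean return value, and the temporary ".part" file are assumptions made for illustration only.

```python
import shutil
import urllib.request
from pathlib import Path


def fetch_if_missing(path: Path, url: str) -> bool:
    """Download `url` to `path` unless `path` already exists.

    Hypothetical helper, not part of catenets.
    Returns True if a download happened, False if the file was already there.
    """
    path = Path(path)
    if path.exists():
        # Already on disk: skip the network round trip entirely.
        return False

    path.parent.mkdir(parents=True, exist_ok=True)

    # Write to a temporary name first so an interrupted download
    # is not mistaken for a complete file on the next call.
    tmp = path.with_name(path.name + ".part")
    with urllib.request.urlopen(url) as response, open(tmp, "wb") as out:
        shutil.copyfileobj(response, out)
    tmp.replace(path)
    return True
```

Calling the sketch twice with the same path performs at most one download; the second call returns False without contacting the server.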

download_if_needed(download_path: pathlib.Path, file_id: Optional[str] = None, http_url: Optional[str] = None, unarchive: bool = False, unarchive_folder: Optional[pathlib.Path] = None) → None

Helper for retrieving online datasets.

Parameters
  • download_path (Path) – Where to download the archive.

  • file_id (str, optional) – Set this if you want to download from a public Google Drive share.

  • http_url (str, optional) – Set this if you want to download from an HTTP URL.

  • unarchive (bool) – Set this if you want to try to unarchive the downloaded file.

  • unarchive_folder (Path, optional) – Where to unarchive. Mandatory if you set unarchive to True.
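
How these parameters interact can be sketched as a small argument check. The sketch below is an assumption-laden illustration, not the library's code: plan_download is a hypothetical name, the ValueError cases and the rule that file_id takes precedence over http_url are guesses, and only the "unarchive_folder is mandatory when unarchive is True" rule comes from the documentation above.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def plan_download(
    download_path: Path,
    file_id: Optional[str] = None,
    http_url: Optional[str] = None,
    unarchive: bool = False,
    unarchive_folder: Optional[Path] = None,
) -> List[Tuple[str, object]]:
    """Return the steps a retrieval helper would perform (hypothetical)."""
    # Assumption: at least one source must be given.
    if file_id is None and http_url is None:
        raise ValueError("Provide either file_id or http_url")
    # Documented rule: unarchive_folder is mandatory when unarchive is True.
    if unarchive and unarchive_folder is None:
        raise ValueError("unarchive_folder is required when unarchive=True")

    steps: List[Tuple[str, object]] = []
    # Assumption: a Google Drive ID wins if both sources are supplied.
    if file_id is not None:
        steps.append(("gdrive", file_id))
    else:
        steps.append(("http", http_url))
    if unarchive:
        steps.append(("unarchive", Path(unarchive_folder)))
    return steps
```

The function only validates and describes the work; it performs no network or disk access, which makes the argument rules easy to inspect in isolation.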

unarchive_if_needed(path: pathlib.Path, output_folder: pathlib.Path) → None

Helper for uncompressing archives. Supports .tar.gz and .tar.

Parameters
  • path (Path) – Source archive.

  • output_folder (Path) – Where to unarchive.
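
A minimal sketch of extracting a .tar or .tar.gz archive with the standard tarfile module follows. It is not the catenets implementation: the name extract_if_missing, the boolean return value, and the interpretation of "if needed" as "the output folder is missing or empty" are assumptions for illustration.

```python
import tarfile
from pathlib import Path


def extract_if_missing(path: Path, output_folder: Path) -> bool:
    """Extract a .tar or .tar.gz archive unless the output already exists.

    Hypothetical helper, not part of catenets.
    Returns True if extraction happened, False if it was skipped.
    """
    path, output_folder = Path(path), Path(output_folder)

    # Assumption: a non-empty output folder means the work is already done.
    if output_folder.is_dir() and any(output_folder.iterdir()):
        return False

    if not (path.name.endswith(".tar.gz") or path.name.endswith(".tar")):
        raise ValueError(f"Unsupported archive format: {path.name}")

    output_folder.mkdir(parents=True, exist_ok=True)

    # Use the safer "data" extraction filter where this Python supports it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}

    # Mode "r:*" lets tarfile detect the compression transparently.
    with tarfile.open(path, "r:*") as archive:
        archive.extractall(output_folder, **kwargs)
    return True
```

Checking the suffix before opening the file keeps unsupported formats (for example .zip) from producing a less readable tarfile error.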