Downloadclient

Classes

FileDownloadState

The state a file can be in before/while/after downloading.

BaseExtractionTool

BaseExtractionTool(
    program_name,
    useability_check_args,
    extract_args,
    logger=logging.log,
)

Initializes an extraction tool object.

PARAMETER DESCRIPTION
program_name

the name of the archive extraction program, e.g., unzip

TYPE: str

useability_check_args

the arguments of the extraction program to test if it's installed, e.g., --version

TYPE: str

extract_args

the arguments that will be passed to the program for extraction

TYPE: str

logger

optional decorated logging.log object that can be passed from the calling daemon or client.

TYPE: LoggerFunction DEFAULT: log

Functions

is_useable
is_useable()

Checks if the extraction tool is installed and usable

RETURNS DESCRIPTION
True if it is usable otherwise False
try_extraction
try_extraction(
    archive_file_path, file_to_extract, dest_dir_path
)

Calls the extraction program to extract a file from an archive

PARAMETER DESCRIPTION
archive_file_path

path to the archive

TYPE: str

file_to_extract

file name to extract from the archive

TYPE: str

dest_dir_path

destination directory where the extracted file will be stored

RETURNS DESCRIPTION
True on success otherwise False
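
The constructor arguments and the two methods above suggest a thin wrapper around an external command. A minimal sketch of that pattern follows; it is not Rucio's implementation. The class name `SketchExtractionTool`, the use of `subprocess.run`, and the `{archive}`, `{member}`, and `{dest}` placeholders in `extract_args` are assumptions made for illustration only.

```python
import shlex
import shutil
import subprocess
import sys


class SketchExtractionTool:
    """Hypothetical stand-in for an extraction tool wrapper (not Rucio's class)."""

    def __init__(self, program_name, useability_check_args, extract_args):
        self.program_name = program_name
        self.useability_check_args = useability_check_args
        self.extract_args = extract_args

    def _run(self, args):
        # Return True only if the program starts and exits with status 0.
        try:
            result = subprocess.run(
                [self.program_name, *args],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                check=False,
            )
        except OSError:
            return False
        return result.returncode == 0

    def is_useable(self):
        # The program must be resolvable and must answer the check arguments.
        if shutil.which(self.program_name) is None:
            return False
        return self._run(shlex.split(self.useability_check_args))

    def try_extraction(self, archive_file_path, file_to_extract, dest_dir_path):
        # Assumed placeholder names; the real argument template may differ.
        # Formatting after splitting keeps paths with spaces as one argument.
        args = [
            part.format(
                archive=archive_file_path,
                member=file_to_extract,
                dest=dest_dir_path,
            )
            for part in shlex.split(self.extract_args)
        ]
        return self._run(args)


# Usage: the running Python interpreter stands in for an archive tool here.
tool = SketchExtractionTool(sys.executable, "--version", "-c pass")
print(tool.is_useable())
```

Both methods report failure through their return value rather than by raising, which matches the boolean results documented above.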

DownloadClient

DownloadClient(
    client=None,
    logger=None,
    tracing=True,
    check_admin=False,
    check_pcache=False,
)

Initializes the basic settings for a DownloadClient object.

PARAMETER DESCRIPTION
client

Optional: rucio.client.client.Client object. If None, a new object will be created.

TYPE: Optional[Client] DEFAULT: None

logger

Optional: If None, default logger will be used.

TYPE: Optional[LoggerFunction] DEFAULT: None

external_traces

Optional: reference to a list where traces can be added

Functions

download_pfns
download_pfns(
    items,
    num_threads=2,
    trace_custom_fields=None,
    traces_copy_out=None,
    deactivate_file_download_exceptions=False,
)

Download items with a given PFN. This function can only download files, no datasets.

PARAMETER DESCRIPTION
items

List of dictionaries, each describing a file to download. Dictionary keys:

  • pfn : str PFN string of this file

  • did : str DID string of this file (e.g. 'scope:file.name'). Wildcards are not allowed

  • rse : str rse name (e.g. 'CERN-PROD_DATADISK'). RSE Expressions are not allowed

  • base_dir : str, optional Base directory where the downloaded files will be stored. Default: '.'

  • no_subdir : bool, optional If true, files are written directly into base_dir. Default: False

  • adler32 : str, optional The adler32 checksum to compare the downloaded file's adler32 checksum with

  • md5 : str, optional The md5 checksum to compare the downloaded file's md5 checksum with

  • transfer_timeout : int, optional Timeout time for the download protocols. Default: None

  • check_local_with_filesize_only : bool, optional If true, already downloaded files will not be validated by checksum. Default: False

TYPE: list[dict[str, Any]]

num_threads

Suggestion of number of threads to use for the download. It will be lowered if it's too high.

TYPE: int DEFAULT: 2

trace_custom_fields

Custom key value pairs to send with the traces

TYPE: Optional[dict[str, Any]] DEFAULT: None

traces_copy_out

Reference to an external list, where the traces should be uploaded

TYPE: Optional[list[dict[str, Any]]] DEFAULT: None

deactivate_file_download_exceptions

If true, file download exceptions are not raised. Default: False

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
A list of dictionaries with an entry for each file, containing the input options, the did, and the clientState. clientState can be one of the following: ALREADY_DONE, DONE, FILE_NOT_FOUND, FAIL_VALIDATE, FAILED

RAISES DESCRIPTION
InputValidationError

If one of the input items is in the wrong format

NoFilesDownloaded

If no files could be downloaded

NotAllFilesDownloaded

If not all files could be downloaded

RucioException

If something unexpected went wrong during the download
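
Because every item must carry `pfn`, `did`, and `rse`, it can help to assemble the dictionaries with a small helper before calling `download_pfns`. The sketch below is illustrative only: `build_pfn_item` is a hypothetical helper that is not part of Rucio, and the PFN and DID values are made-up placeholders.

```python
from typing import Any

# Optional per-item keys, as listed in the parameter description above.
OPTIONAL_KEYS = (
    "base_dir",
    "no_subdir",
    "adler32",
    "md5",
    "transfer_timeout",
    "check_local_with_filesize_only",
)


def build_pfn_item(pfn: str, did: str, rse: str, **options: Any) -> dict[str, Any]:
    """Hypothetical helper: build one item dictionary for download_pfns."""
    unknown = set(options) - set(OPTIONAL_KEYS)
    if unknown:
        raise ValueError(f"unsupported item keys: {sorted(unknown)}")
    if "*" in did:
        # The DID key of download_pfns does not accept wildcards.
        raise ValueError("wildcards are not allowed in the DID")
    return {"pfn": pfn, "did": did, "rse": rse, **options}


# Placeholder values for illustration only.
items = [
    build_pfn_item(
        "root://example.invalid//path/file.name",
        "scope:file.name",
        "CERN-PROD_DATADISK",
        base_dir="downloads",
        no_subdir=True,
    )
]
print(items)
```

With a configured client, such a list would then be passed as the `items` argument, for example `client.download_pfns(items, num_threads=2)`.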

download_dids
download_dids(
    items,
    num_threads=2,
    trace_custom_fields=None,
    traces_copy_out=None,
    deactivate_file_download_exceptions=False,
    sort=None,
)

Download items with given DIDs. This function can also download datasets and wildcarded DIDs.

PARAMETER DESCRIPTION
items

List of dictionaries, each describing an item to download. Dictionary keys:

  • did : str DID string of this file (e.g. 'scope:file.name')

  • filters : dict, optional Filter to select DIDs for download. Optional if DID is given

  • rse : str, optional rse name (e.g. 'CERN-PROD_DATADISK') or rse expression from where to download

  • impl : str, optional name of the protocol implementation to be used to download this item

  • no_resolve_archives : bool, optional bool indicating whether archives should not be considered for download. Default: False

  • resolve_archives : bool, optional Deprecated: Use no_resolve_archives instead

  • force_scheme : str or list[str], optional force a specific scheme to download this item. Default: None

  • base_dir : str, optional base directory where the downloaded files will be stored. Default: '.'

  • no_subdir : bool, optional If true, files are written directly into base_dir. Default: False

  • nrandom : int, optional if the DID addresses a dataset, nrandom files will be randomly chosen for download from the dataset

  • ignore_checksum : bool, optional If true, skips the checksum validation between the downloaded file and the rucio catalogue. Default: False

  • transfer_timeout : int, optional Timeout time for the download protocols. Default: None

  • transfer_speed_timeout : int, optional Minimum allowed transfer speed (in KBps). Ignored if transfer_timeout is set; otherwise, used to compute the default timeout. Default: 500

  • check_local_with_filesize_only : bool, optional If true, already downloaded files will not be validated by checksum. Default: False

TYPE: list[dict[str, Any]]

num_threads

Suggestion of number of threads to use for the download. It will be lowered if it's too high.

TYPE: int DEFAULT: 2

trace_custom_fields

Custom key value pairs to send with the traces

TYPE: Optional[dict[str, Any]] DEFAULT: None

traces_copy_out

Reference to an external list, where the traces should be uploaded

TYPE: Optional[list[dict[str, Any]]] DEFAULT: None

deactivate_file_download_exceptions

If true, file download exceptions are not raised. Default: False

TYPE: bool DEFAULT: False

sort

Select best replica by replica sorting algorithm. Available algorithms: geoip - based on src/dst IP topographical distance

TYPE: Optional[SORTING_ALGORITHMS_LITERAL] DEFAULT: None

RETURNS DESCRIPTION
A list of dictionaries with an entry for each file, containing the input options, the did, and the clientState.

RAISES DESCRIPTION
InputValidationError

If one of the input items is in the wrong format

NoFilesDownloaded

If no files could be downloaded

NotAllFilesDownloaded

If not all files could be downloaded

RucioException

If something unexpected went wrong during the download
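
When `deactivate_file_download_exceptions` is true, failures are not raised, so the caller has to inspect the `clientState` of each returned dictionary. A minimal sketch of that bookkeeping follows. `group_by_client_state` and `failed_dids` are hypothetical helpers, treating `DONE` and `ALREADY_DONE` as the successful states is an assumption, and the sample `results` list is an invented placeholder that only mimics the documented shape (a `did` and a `clientState` per file).

```python
from collections import defaultdict

# Assumption: these two states mean the file is available locally.
SUCCESS_STATES = {"DONE", "ALREADY_DONE"}


def group_by_client_state(results):
    """Map each clientState to the list of DIDs that ended in that state."""
    grouped = defaultdict(list)
    for entry in results:
        grouped[entry["clientState"]].append(entry["did"])
    return dict(grouped)


def failed_dids(results):
    """Return the DIDs whose clientState is not one of the success states."""
    return [
        entry["did"]
        for entry in results
        if entry["clientState"] not in SUCCESS_STATES
    ]


# Placeholder data standing in for the return value of download_dids.
results = [
    {"did": "scope:file.a", "clientState": "DONE"},
    {"did": "scope:file.b", "clientState": "FAIL_VALIDATE"},
    {"did": "scope:file.c", "clientState": "ALREADY_DONE"},
]
print(group_by_client_state(results))
print(failed_dids(results))
```

The list returned by `failed_dids` could be fed back into a retry, for instance with a different `rse` or `force_scheme`.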

download_from_metalink_file
download_from_metalink_file(
    item,
    metalink_file_path,
    num_threads=2,
    trace_custom_fields=None,
    traces_copy_out=None,
    deactivate_file_download_exceptions=False,
)

Download items using a given metalink file.

PARAMETER DESCRIPTION
item

Dictionary describing an item to download. Dictionary keys:

  • base_dir : str, optional Base directory where the downloaded files will be stored. Default: '.'

  • no_subdir : bool, optional If true, files are written directly into base_dir. Default: False

  • ignore_checksum : bool, optional If true, skips the checksum validation between the downloaded file and the rucio catalogue. Default: False

  • transfer_timeout : int, optional Timeout time for the download protocols. Default: None

  • check_local_with_filesize_only : bool, optional If true, already downloaded files will not be validated by checksum. Default: False

TYPE: dict[str, Any]

num_threads

Suggestion of number of threads to use for the download. It will be lowered if it's too high.

TYPE: int DEFAULT: 2

trace_custom_fields

Custom key value pairs to send with the traces

TYPE: Optional[dict[str, Any]] DEFAULT: None

traces_copy_out

Reference to an external list, where the traces should be uploaded

TYPE: Optional[list[dict[str, Any]]] DEFAULT: None

deactivate_file_download_exceptions

If true, file download exceptions are not raised. Default: False

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
A list of dictionaries with an entry for each file, containing the input options, the did, and the clientState.

RAISES DESCRIPTION
InputValidationError

If one of the input items is in the wrong format

NoFilesDownloaded

If no files could be downloaded

NotAllFilesDownloaded

If not all files could be downloaded

RucioException

If something unexpected went wrong during the download
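
Unlike the other download functions, `download_from_metalink_file` takes a single item dictionary plus the path to the metalink file. The sketch below wraps that call. `download_with_metalink` is a hypothetical helper, the `client` argument is assumed to be an already constructed `DownloadClient`, and the existence check on the metalink path is an added precaution rather than documented behaviour.

```python
from pathlib import Path


def download_with_metalink(client, metalink_file_path, base_dir=".", num_threads=2):
    """Hypothetical wrapper around client.download_from_metalink_file."""
    path = Path(metalink_file_path)
    if not path.is_file():
        # Fail early instead of handing a bad path to the client.
        raise FileNotFoundError(f"metalink file not found: {path}")

    # Only documented item keys are used here.
    item = {"base_dir": base_dir, "no_subdir": False}

    # External list that the client can fill with traces.
    traces = []
    results = client.download_from_metalink_file(
        item,
        str(path),
        num_threads=num_threads,
        traces_copy_out=traces,
        deactivate_file_download_exceptions=True,
    )
    return results, traces
```

Returning the trace list alongside the results keeps the `traces_copy_out` reference visible to the caller instead of hiding it inside the helper.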

download_aria2c
download_aria2c(
    items,
    trace_custom_fields=None,
    filters=None,
    deactivate_file_download_exceptions=False,
    sort=None,
)

Uses aria2c to download the items with given DIDs. This function can also download datasets and wildcarded DIDs. It can only download files that are available via https/davs. aria2c must be installed and X509_USER_PROXY must be set.

PARAMETER DESCRIPTION
items

List of dictionaries, each describing an item to download. Dictionary keys:

  • did : str DID string of this file (e.g. 'scope:file.name'). Wildcards are not allowed

  • rse : str, optional rse name (e.g. 'CERN-PROD_DATADISK') or rse expression from where to download

  • base_dir : str, optional base directory where the downloaded files will be stored. (Default: '.')

  • no_subdir : bool, optional If true, files are written directly into base_dir. (Default: False)

  • nrandom : int, optional if the DID addresses a dataset, nrandom files will be randomly chosen for download from the dataset

  • ignore_checksum : bool, optional If true, skips the checksum validation between the downloaded file and the rucio catalogue. (Default: False)

  • check_local_with_filesize_only : bool, optional If true, already downloaded files will not be validated by checksum. (Default: False)

TYPE: list[dict[str, Any]]

trace_custom_fields

Custom key value pairs to send with the traces

TYPE: Optional[dict[str, Any]] DEFAULT: None

filters

Filter to select DIDs for download

TYPE: Optional[dict[str, Any]] DEFAULT: None

deactivate_file_download_exceptions

If true, file download exceptions are not raised. Default: False

TYPE: bool DEFAULT: False

sort

Select best replica by replica sorting algorithm. Available algorithms: * geoip - based on src/dst IP topographical distance

TYPE: Optional[SORTING_ALGORITHMS_LITERAL] DEFAULT: None

RETURNS DESCRIPTION
A list of dictionaries with an entry for each file, containing the input options, the did, and the clientState.

RAISES DESCRIPTION
InputValidationError

If one of the input items is in the wrong format

NoFilesDownloaded

If no files could be downloaded

NotAllFilesDownloaded

If not all files could be downloaded

RucioException

If something unexpected went wrong during the download (e.g. aria2c could not be started)
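
The two prerequisites named above (an installed aria2c and a set `X509_USER_PROXY`) can be checked before calling `download_aria2c`. The following is a sketch under stated assumptions: `aria2c_prerequisite_problems` is a hypothetical helper, the executable is assumed to be named `aria2c` on the PATH, and the check that the proxy variable points to an existing file goes beyond what the description requires.

```python
import os
import shutil


def aria2c_prerequisite_problems(environ=None):
    """Return a list of human-readable problems; empty if none were found."""
    env = os.environ if environ is None else environ
    problems = []

    # aria2c has to be installed; here it is looked up on the PATH.
    if shutil.which("aria2c") is None:
        problems.append("aria2c executable not found on PATH")

    # X509_USER_PROXY has to be set; the file check is an extra assumption.
    proxy = env.get("X509_USER_PROXY")
    if not proxy:
        problems.append("X509_USER_PROXY is not set")
    elif not os.path.isfile(proxy):
        problems.append(f"X509_USER_PROXY points to a missing file: {proxy}")

    return problems


for problem in aria2c_prerequisite_problems():
    print(problem)
```

Checking up front gives a clearer message than the generic RucioException raised when aria2c cannot be started.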

preferred_impl
preferred_impl(sources)

Finds the optimum protocol impl preferred by the client and supported by the remote RSE.

PARAMETER DESCRIPTION
sources

List of sources for a given DID

TYPE: list[dict[str, Any]]

RAISES DESCRIPTION
RucioException

General exception with msg for more details

Functions