FlowObjects Data Loader for Snowflake - command line parameters


Purpose

The FlowObjects Data Loader is a command line interface (CLI) tool designed to
efficiently load data into Snowflake. Its features include:

  • reduce upload time by leveraging compression
  • break up large files into smaller, optimally sized chunks
  • produce logs to the console, a file, or a table to audit a data load
  • control autocommit
  • add column(s) and value(s), i.e. metadata, inline without modifying the
    source
  • embed easily in your existing processes (i.e. automation scripts)

See also: Using CLI switches or environment variables

Data Loader 4.0.0, CLI switches and flags usage

usage: flowobjects_data_loader [-h] [-v] [-s] --file FILE --sf_account
                               SF_ACCOUNT --sf_username SF_USERNAME
                               --sf_password SF_PASSWORD --sf_database
                               SF_DATABASE --sf_warehouse SF_WAREHOUSE
                               --sf_role SF_ROLE --sf_schema SF_SCHEMA
                               --sf_table SF_TABLE --sf_file_format
                               SF_FILE_FORMAT --sf_stage_type SF_STAGE_TYPE
                               [--sf_stage_name SF_STAGE_NAME]
                               [--sf_stage_aws_s3 SF_STAGE_AWS_S3]
                               [--sf_stage_aws_key_id SF_STAGE_AWS_KEY_ID]
                               [--sf_stage_aws_access_key SF_STAGE_AWS_ACCESS_KEY]
                               [--sf_stage_azure_container SF_STAGE_AZURE_CONTAINER]
                               [--sf_stage_azure_conn_str SF_STAGE_AZURE_CONN_STR]
                               [--sf_autocommit SF_AUTOCOMMIT]
                               [--encoding ENCODING] [--delimiter DELIMITER]
                               [--header_rows HEADER_ROWS]
                               [--on_error ON_ERROR]
                               [--transaction_name TRANSACTION_NAME]
                               [--error_table ERROR_TABLE]
                               [--split_file_size SPLIT_FILE_SIZE]
                               [--log_path LOG_PATH]
                               [--transformation_file TRANSFORMATION_FILE]
                               [--copy_into COPY_INTO]
                               [--additional_headers [ADDITIONAL_HEADERS [ADDITIONAL_HEADERS ...]]]
                               [--additional_content [ADDITIONAL_CONTENT [ADDITIONAL_CONTENT ...]]]
                               [--no_purge] [--remove_split_dir] [--log]
                               [--log_to_error_table] [--quiet]
                               [--synchronous] [--split_file]
                               [--use_upload_dir] [--no_error_on_empty]
                               [--sf_no_warehouse_resume] [--sf_no_compress]
                               [--sf_ext_stage_aws] [--sf_ext_stage_azure]
                               [--show_timing]
                               {token} ...

The arguments below are divided into switches, which require a value, and
flags, which take no value. Most switches can be supplied as environment
variables instead. Some switches are required while others are optional.
Switches must be lowercase and are shown below prefixed with --. Environment
variable equivalents are shown next to each switch in all uppercase. For more
details, visit:
https://community.flowobjects.com/t/using-cli-switches-or-environment-variables
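
For example, the connection and target settings can be supplied through their
uppercase environment variable equivalents so that credentials never appear on
the command line. A minimal sketch with placeholder values, assuming each
switch used here accepts its environment variable equivalent (as most do per
the note above); on Windows, use set in place of export:

  export SF_ACCOUNT=myaccount SF_USERNAME=loader_user SF_PASSWORD='********'
  export SF_DATABASE=MYDB SF_WAREHOUSE=LOAD_WH SF_ROLE=LOADER SF_SCHEMA=PUBLIC
  export SF_TABLE=MYDB.PUBLIC.MYTABLE SF_FILE_FORMAT=MYDB.PUBLIC.CSV_FORMAT
  export SF_STAGE_TYPE=TABLE

  flowobjects_data_loader --file mydata.csv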

arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  -s, --sysinfo         show system information and exit

required switches (requires value(s) for switch):
  --file FILE           File(s) or directory(ies) to upload (i.e. file:
                        mydata.csv || files: *.csv || directory of files:
                        myfilesdir/). If a provided file is not already
                        compressed, it will automatically be compressed (in
                        GZIP format) before uploading (see the
                        --sf_no_compress flag to disable automatic
                        compression). Files in already-compressed supported
                        formats pass through without being compressed again.
  --sf_account SF_ACCOUNT
                        Snowflake account name.
  --sf_username SF_USERNAME
                        Snowflake user name.
  --sf_password SF_PASSWORD
                        Snowflake password.
  --sf_database SF_DATABASE
                        Snowflake database.
  --sf_warehouse SF_WAREHOUSE
                        Snowflake warehouse.
  --sf_role SF_ROLE     Snowflake role.
  --sf_schema SF_SCHEMA
                        Snowflake schema.
  --sf_table SF_TABLE   Snowflake table to place file data in (i.e.
                        DATABASE.SCHEMA.TABLENAME).
  --sf_file_format SF_FILE_FORMAT
                        File format of files to upload (i.e.
                        EXCEL_DEMO_FORMAT).
  --sf_stage_type SF_STAGE_TYPE
                        Stage type (i.e. TABLE, NAMED, USER).
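
Putting the required switches together, a minimal invocation that loads a
single CSV file through the target table's stage might look like the
following (account, credentials, and object names are placeholders):

  flowobjects_data_loader \
    --file mydata.csv \
    --sf_account myaccount \
    --sf_username loader_user \
    --sf_password '********' \
    --sf_database MYDB \
    --sf_warehouse LOAD_WH \
    --sf_role LOADER \
    --sf_schema PUBLIC \
    --sf_table MYDB.PUBLIC.MYTABLE \
    --sf_file_format MYDB.PUBLIC.CSV_FORMAT \
    --sf_stage_type TABLE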

optional stage switches (requires value(s) for switch):
  --sf_stage_name SF_STAGE_NAME
                        Stage name if using a named or table stage (i.e.
                        DATABASE.SCHEMA.STAGENAME) [default: None]
  --sf_stage_aws_s3 SF_STAGE_AWS_S3
                        The name of the AWS S3 bucket (must be used together
                        with the flag --sf_ext_stage_aws and the switches
                        --sf_stage_name --sf_stage_aws_key_id
                        --sf_stage_aws_access_key when the provided named
                        stage is an external AWS S3 bucket, i.e. data)
                        [default: None]
  --sf_stage_aws_key_id SF_STAGE_AWS_KEY_ID
                        The access key ID string to authenticate and connect
                        with the AWS S3 bucket (must be used together with
                        the flag --sf_ext_stage_aws and the switches
                        --sf_stage_name --sf_stage_aws_s3
                        --sf_stage_aws_access_key when the provided named
                        stage is an external AWS S3 bucket). [default: None]
  --sf_stage_aws_access_key SF_STAGE_AWS_ACCESS_KEY
                        The secret access key string to authenticate and
                        connect with the AWS S3 bucket (must be used together
                        with the flag --sf_ext_stage_aws and the switches
                        --sf_stage_name --sf_stage_aws_s3
                        --sf_stage_aws_key_id when the provided named stage
                        is an external AWS S3 bucket). [default: None]
  --sf_stage_azure_container SF_STAGE_AZURE_CONTAINER
                        The container name of the Azure Blob Storage (must be
                        used together with the flag --sf_ext_stage_azure and
                        the switches --sf_stage_name
                        --sf_stage_azure_conn_str when the provided named
                        stage is an external Azure Blob Storage, i.e. data)
                        [default: None]
  --sf_stage_azure_conn_str SF_STAGE_AZURE_CONN_STR
                        The connection string to authenticate and connect
                        with the Azure Blob Storage (must be used together
                        with the flag --sf_ext_stage_azure and the switches
                        --sf_stage_name --sf_stage_azure_container when the
                        provided named stage is an external Azure Blob
                        Storage, i.e.
                        DefaultEndpointsProtocol=https;AccountName=somename;AccountKey=somekey;EndpointSuffix=core.windows.net)
                        [default: None]
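
As a sketch of the external stage switches above, the following loads through
a named stage backed by an AWS S3 bucket (the stage, bucket, and credential
values are placeholders, and the connection switches are assumed to be
supplied via their environment variable equivalents as shown earlier):

  flowobjects_data_loader \
    --file mydata.csv \
    --sf_stage_type NAMED \
    --sf_stage_name MYDB.PUBLIC.MY_S3_STAGE \
    --sf_ext_stage_aws \
    --sf_stage_aws_s3 my-bucket \
    --sf_stage_aws_key_id AKIAXXXXXXXXXXXXXXXX \
    --sf_stage_aws_access_key '****************'

The Azure variant is analogous: pass --sf_ext_stage_azure together with
--sf_stage_name, --sf_stage_azure_container, and --sf_stage_azure_conn_str.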

optional other switches (requires value(s) for switch):
  --sf_autocommit SF_AUTOCOMMIT
                        Control whether all changes are automatically
                        committed when the connection closes [default: True].
                        Set to True to enable autocommit or False to disable
                        it.
  --encoding ENCODING   Encoding of the file to upload [default: None]. If
                        not specified, it is read from the Snowflake file
                        format.
  --delimiter DELIMITER
                        Line delimiter of the file to upload [default: None].
                        If not specified, it is read from the Snowflake file
                        format. Not applicable to the JSON file type.
  --header_rows HEADER_ROWS
                        Number of header rows to skip [default: 0]. Not
                        applicable to JSON file type.
  --on_error ON_ERROR   Setting for 'COPY INTO' command if an error is
                        encountered (i.e. CONTINUE, SKIP_FILE,
                        SKIP_FILE_<num>, SKIP_FILE_<num%>, ABORT_STATEMENT).
                        [default: CONTINUE].
  --transaction_name TRANSACTION_NAME
                        Transaction name [default: FILE_UPLOAD]
  --error_table ERROR_TABLE
                        Table to write errors to (i.e.
                        DATABASE.SCHEMA.ERRORTABLENAME). If not specified,
                        defaults to ERROR_TABLE. Must be used together with
                        --log_to_error_table.
  --split_file_size SPLIT_FILE_SIZE
                        Size of split record files in bytes [default:
                        100000000 (100 MB)]. Not applicable to the JSON file
                        type or pre-compressed files.
  --log_path LOG_PATH   Directory path where log file(s) should be stored. If
                        not specified, defaults to the same directory as the
                        binary (or the split directory when --split_file is
                        used). Must be used together with --log.
  --transformation_file TRANSFORMATION_FILE
                        JSON file specifying transformations. [default: None]
  --copy_into COPY_INTO
                        JSON file with a custom COPY INTO statement to be
                        used instead of the auto-generated default. Token
                        substitution is optionally supported for the CLI
                        switch values (sf_table, stage_name, sf_file_format).
                        i.e. {"VALUE": "{sf_table} FROM {stage_name}
                        FILE_FORMAT = {sf_file_format} ON_ERROR = CONTINUE
                        PURGE = TRUE RETURN_FAILED_ONLY = TRUE ENFORCE_LENGTH
                        = FALSE"} [default: None]
  --additional_headers [ADDITIONAL_HEADERS [ADDITIONAL_HEADERS ...]]
                        Additional column names for the COPY INTO statement.
                        Must be used together with --additional_content, the
                        number of values must match between the two switches,
                        and each value must be enclosed in single quotes,
                        i.e. --additional_headers my_col_a my_col_g
                        --additional_content 'value_a' 'value_b'. Not
                        applicable to the JSON file type or pre-compressed
                        files. [default: Empty List]
  --additional_content [ADDITIONAL_CONTENT [ADDITIONAL_CONTENT ...]]
                        Additional values for the COPY INTO statement. Must
                        be used together with --additional_headers, the
                        number of values must match between the two switches,
                        and each value must be enclosed in single quotes,
                        i.e. --additional_headers my_col_a my_col_g
                        --additional_content 'value_a' 'value_b'. Not
                        applicable to the JSON file type or pre-compressed
                        files. [default: Empty List]
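
For example, to stamp every loaded row with two metadata columns without
modifying the source file (hypothetical column names and values; connection
and target switches are assumed to be supplied via environment variables as
shown earlier):

  flowobjects_data_loader \
    --file daily_feed.csv \
    --additional_headers load_source load_batch \
    --additional_content 'daily_feed' '2024-06-01'

Each header is paired positionally with the corresponding content value, so
the two switches must receive the same number of arguments.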

optional flags (do not specify value(s)):
  --no_purge            Does not purge files from the stage after a
                        successful load.
  --remove_split_dir    Removes the split directory after the load attempt.
                        Not applicable to the JSON file type.
  --log                 Writes output to a log file in addition to console
                        output. Use the --quiet flag to suppress console
                        output.
  --log_to_error_table  Logs error(s) to an error table. The default table
                        name is ERROR_TABLE, or it can be set with the
                        --error_table switch. Automatically tries to create
                        the error log table if it does not exist.
  --quiet               Prevents output to console.
  --synchronous         Prevents SQL statements from being executed
                        asynchronously.
  --split_file          Enables splitting of source files. Not applicable to
                        JSON file type or already compressed files.
  --use_upload_dir      Enables uploading a directory rather than individual
                        files, with a single COPY INTO of the specified stage
                        content(s).
  --no_error_on_empty   Returns an errorlevel / exit code of 0 instead of 1
                        when the file is empty.
  --sf_no_warehouse_resume
                        Does not explicitly resume a suspended warehouse
                        before executing actions.
  --sf_no_compress      Does not compress uncompressed file(s) before
                        uploading.
  --sf_ext_stage_aws    Indicates that the stage type (via switch
                        --sf_stage_type) is named and is an AWS external
                        stage (must be used together with the switches
                        --sf_stage_name --sf_stage_aws_s3
                        --sf_stage_aws_key_id --sf_stage_aws_access_key when
                        using an external AWS S3 bucket).
  --sf_ext_stage_azure  Indicates that the stage type (via switch
                        --sf_stage_type) is named and is an Azure external
                        stage (must be used together with the switches
                        --sf_stage_name --sf_stage_azure_container
                        --sf_stage_azure_conn_str when using an external
                        Azure Blob Storage).
  --show_timing         Executes the requested actions but outputs nothing to
                        the console except the start, end, and duration
                        timing details.
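
To combine several of the flags and switches above in an automation script,
the following sketch splits a large file into 50 MB chunks, removes the split
directory afterwards, writes a log file, and branches on the loader's exit
code. Paths and names are placeholders, connection and target settings are
assumed to be supplied via environment variables as shown earlier, and the
zero/non-zero exit code convention is implied by the --no_error_on_empty
description:

  #!/bin/sh
  # SF_ACCOUNT, SF_USERNAME, SF_PASSWORD, etc. are assumed to be exported.
  if flowobjects_data_loader \
       --file bigdata.csv \
       --split_file \
       --split_file_size 50000000 \
       --remove_split_dir \
       --log \
       --log_path /var/log/flowobjects \
       --quiet
  then
      echo "load succeeded"
  else
      echo "load failed; see the log in /var/log/flowobjects" >&2
      exit 1
  fi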

optional command(s) (position sensitive, use after required switch(es) and optional flag(s)):
  {token}
    token               Use a FlowObjects Account Tokenize credential token
                        for authentication (must be used together with the
                        switches --at_tkh --at_tea --at_token). Additional
                        action command help is available with: token --help
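
Because the token command is position sensitive, it is placed after the
switches and flags. A sketch with placeholder values (the meaning of the
--at_tkh, --at_tea, and --at_token values is documented by token --help; the
remaining connection switches are assumed to be supplied via environment
variables as shown earlier):

  flowobjects_data_loader \
    --file mydata.csv \
    token --at_tkh <value> --at_tea <value> --at_token <value>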