mafw.tools.toml_tools

Tools for reading, writing, and validating MAFw TOML steering files.

Author:

Bulgheroni Antonio

Description:

Utilities to generate and load TOML steering files and related helpers.

Module Attributes

ENV_PATTERN

Regex matching supported environment variable expansion patterns.

ENV_ESCAPE_SENTINEL

Sentinel used to preserve escaped variable patterns.

MAX_ENV_RESOLUTION_PASSES

Maximum number of expansion passes applied to a single string.

Functions

dump_processor_parameters_to_toml(...)

Dumps a toml file with processor parameters.

generate_steering_file(output_file, processors)

Generates a steering file.

load_steering_file(steering_file[, ...])

Load a steering file for the execution framework.

load_steering_file_legacy(steering_file)

Load a steering file without any semantic validation.

path_encoder(obj)

Encoder for PathItem.

processor_validator(processors)

Validates that all items in the list are valid processor instances or classes.

resolve_config_env(config[, env])

Resolve environment variables in every string value of a configuration dictionary.

resolve_string(value, env)

Resolve environment variables within a string value.

Classes

PathItem(t, value, original, trivia)

TOML item representing a Path

class mafw.tools.toml_tools.PathItem(t: StringType, value: str, original: str, trivia: Trivia)[source]

Bases: String

TOML item representing a Path

unwrap() Path[source]

Returns the item as a pure Python object (ppo), here a Path.

mafw.tools.toml_tools._add_db_configuration(database_conf: dict[str, Any] | None = None, db_engine: str = 'sqlite', doc: TOMLDocument | None = None) TOMLDocument[source]

Add the DB configuration to the TOML document

The expected structure of the database_conf dictionary is one of these:

option1 = {
    'DBConfiguration': {
        'URL': 'sqlite:///:memory:',
        'parameters': {
            'sqlite': {
                'pragmas': {
                    'journal_mode': 'wal',
                    'cache_size': -64000,
                    'foreign_keys': 1,
                    'synchronous': 0,
                },
            },
        },
    }
}

option2 = {
    'URL': 'sqlite:///:memory:',
    'authentication': {
        'method': 'env',
        'username': 'POSTGRES_USER',
        'password': 'POSTGRES_PASS',
    },
    'parameters': {
        'postgresql': {
            'sslmode': 'require',
        },
    },
}

Parameters:
  • database_conf (dict) – A dictionary with the database configuration, in one of the two layouts shown above. If None, the default is used.

  • db_engine (str, Optional) – The database engine. Used only to retrieve the default configuration when the provided database configuration is invalid. Defaults to sqlite.

  • doc (TOMLDocument, Optional) – The TOML document to add the DB configuration. If None, one will be created.

Returns:

The modified document.

Return type:

TOMLDocument

Raises:

UnknownDBEngine – if the database_conf is invalid and the db_engine is not yet implemented.
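
The two layouts above differ only in whether the settings are wrapped in a DBConfiguration table. The following is a minimal sketch of how both could be reduced to the same inner table. The helper name unwrap_db_conf, the fallback argument, and the rule that a valid configuration must contain a 'URL' key are all assumptions made for illustration; they are not taken from the MAFw source.

```python
from __future__ import annotations

from typing import Any


def unwrap_db_conf(database_conf: dict[str, Any], fallback: dict[str, Any]) -> dict[str, Any]:
    """Return the inner settings table for either layout, or ``fallback``.

    Hypothetical helper: the validity rule (presence of 'URL') is an assumption.
    """
    # Layout 1 nests everything under 'DBConfiguration'; layout 2 is already flat.
    inner = database_conf.get("DBConfiguration", database_conf)
    if isinstance(inner, dict) and "URL" in inner:
        return inner
    return fallback


nested = {"DBConfiguration": {"URL": "sqlite:///:memory:"}}
flat = {"URL": "sqlite:///:memory:", "parameters": {"postgresql": {"sslmode": "require"}}}
default = {"URL": "sqlite:///:memory:"}  # placeholder default, not MAFw's actual one

print(unwrap_db_conf(nested, default)["URL"])
print(unwrap_db_conf(flat, default)["parameters"])
print(unwrap_db_conf({"unrelated": 1}, default) is default)
```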

mafw.tools.toml_tools.dump_processor_parameters_to_toml(processors: list[ProcessorClassProtocol] | ProcessorClassProtocol, output_file: Path | str) None[source]

Dumps a toml file with processor parameters.

This helper function can be used when the parameters of one or many processors have to be dumped to a TOML file. For each Processor in processors, a table is added to the TOML file with its parameters in the form parameter name = value.

It must be noted that processors can be:

  • a list of processor classes (list[type[Processor]])

  • a list of processor instances (list[Processor])

  • one single processor class (type[Processor])

  • one single processor instance (Processor)

What value of the parameters will be dumped?

Good question, have a look at this explanation.

Parameters:
  • processors (list[type[Processor] | Processor] | type[Processor] | Processor) – One or more processors for which the parameters should be dumped.

  • output_file (Path | str) – The name of the output file for the dump.

Raises:
  • KeyAlreadyPresent – if an attempt is made to add the same processor twice.

  • TypeError – if the list contains items other than Processor classes and instances.
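
Because processors may be a single class, a single instance, or a list of either, a function with this signature typically normalises its input to a list first. The sketch below shows one way to do that. The Processor class here is a local stand-in, and as_processor_list and MyProcessor are hypothetical names; none of this is taken from the MAFw source.

```python
from __future__ import annotations


class Processor:
    """Stand-in for the MAFw Processor base class (illustration only)."""


def as_processor_list(processors: object) -> list[object]:
    """Wrap a single processor class or instance in a list; copy an existing list."""
    if isinstance(processors, list):
        return list(processors)
    return [processors]


class MyProcessor(Processor):
    """Hypothetical user-defined processor."""


instance = MyProcessor()
print(as_processor_list(MyProcessor))              # a single class
print(as_processor_list(instance))                 # a single instance
print(as_processor_list([MyProcessor, instance]))  # a mixed list
```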

mafw.tools.toml_tools.generate_steering_file(output_file: Path | str, processors: list[ProcessorClassProtocol] | ProcessorClassProtocol, database_conf: dict[str, Any] | None = None, db_engine: str = 'sqlite') None[source]

Generates a steering file.

Parameters:
  • output_file (Path | str) – The output filename where the steering file will be saved.

  • processors (list[type[Processor] | Processor], type[Processor], Processor) – The processors list for which the steering file will be generated.

  • database_conf (dict, Optional) – The database configuration dictionary

  • db_engine (str) – A string representing the DB engine to be used. Possible values are: sqlite, postgresql and mysql.

mafw.tools.toml_tools.load_steering_file(steering_file: Path | str, validation_level: ValidationLevel | None = ValidationLevel.SEMANTIC) dict[str, Any][source]

Load a steering file for the execution framework.

Parameters:
  • steering_file (Path, str) – The path to the steering file.

  • validation_level (ValidationLevel | None) – Requested validation tier, or None to skip validation. Defaults to ValidationLevel.SEMANTIC.

Returns:

The configuration dictionary.

Return type:

dict

Raises:

mafw.mafw_errors.InvalidSteeringFile – if validation at the requested level reports at least one issue.

mafw.tools.toml_tools.load_steering_file_legacy(steering_file: Path | str) dict[str, Any][source]

Load a steering file without any semantic validation.

Parameters:

steering_file (Path, str) – The path to the steering file.

Returns:

The parsed steering dictionary.

Return type:

dict

mafw.tools.toml_tools.path_encoder(obj: Any) Item[source]

Encoder for PathItem.

mafw.tools.toml_tools.processor_validator(processors: list[ProcessorClassProtocol]) bool[source]

Validates that all items in the list are valid processor instances or classes.

Parameters:

processors (list[type[Processor] | Processor]) – The list of items to be validated.

Returns:

True if all items are valid.

Return type:

bool
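
The check described above accepts both classes and instances, so it needs two different tests: issubclass for classes and isinstance for instances. A minimal sketch follows, using a local stand-in Processor class and the hypothetical names is_processor and all_processors. The real function is typed against ProcessorClassProtocol and may differ in detail.

```python
from __future__ import annotations


class Processor:
    """Stand-in for the MAFw Processor base class (illustration only)."""


def is_processor(item: object) -> bool:
    """True for a Processor subclass or a Processor instance."""
    if isinstance(item, type):
        # Classes must be tested with issubclass, not isinstance.
        return issubclass(item, Processor)
    return isinstance(item, Processor)


def all_processors(items: list[object]) -> bool:
    """True if every item is a processor class or instance."""
    return all(is_processor(item) for item in items)


print(all_processors([Processor, Processor()]))
print(all_processors([Processor, "not a processor"]))
```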

mafw.tools.toml_tools.resolve_config_env(config: dict[str, Any], env: Mapping[str, str] | None = None) dict[str, Any][source]

Resolve environment variables in every string value of a configuration dictionary.

Parameters:
  • config (dict[str, Any]) – Configuration dictionary to resolve.

  • env (Mapping[str, str] | None) – Optional environment mapping; defaults to os.environ.

Returns:

A new configuration dictionary with resolved values.

Return type:

dict[str, Any]

Raises:

ValueError – If a required variable is missing or expansion does not converge.
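
Resolving every string value of a nested configuration amounts to a recursive walk that rebuilds the structure and leaves the input untouched. The sketch below illustrates that walk only. The function name resolve_values, the section and key names, and the choice to descend into lists are assumptions; the resolver passed in is a deliberately naive placeholder rather than MAFw's resolve_string.

```python
from __future__ import annotations

from collections.abc import Callable, Mapping
from typing import Any


def resolve_values(node: Any, resolve: Callable[[str], str]) -> Any:
    """Return a copy of ``node`` with ``resolve`` applied to every string value."""
    if isinstance(node, str):
        return resolve(node)
    if isinstance(node, Mapping):
        # Keys are left untouched; only values are resolved.
        return {key: resolve_values(item, resolve) for key, item in node.items()}
    if isinstance(node, list):
        return [resolve_values(item, resolve) for item in node]
    return node  # numbers, booleans and other types pass through unchanged


env = {"DATA_DIR": "/data"}
config = {
    "section": {"path": "${DATA_DIR}/raw", "n_jobs": 4},
    "files": ["${DATA_DIR}/a.dat", "plain.dat"],
}
# Naive placeholder resolver, used only to drive the walk.
resolved = resolve_values(config, lambda s: s.replace("${DATA_DIR}", env["DATA_DIR"]))
print(resolved)
```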

mafw.tools.toml_tools.resolve_string(value: str, env: Mapping[str, str]) str[source]

Resolve environment variables within a string value.

Parameters:
  • value (str) – Input string to resolve.

  • env (Mapping[str, str]) – Environment mapping to use for substitution.

Returns:

The resolved string.

Return type:

str

Raises:

ValueError – If a required variable is missing or expansion does not converge.
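
The behaviour described above (substitution of ${NAME} patterns, an error for missing variables, and a bounded number of passes) can be sketched as follows. This is not the MAFw implementation: the regex is a simplified stand-in for ENV_PATTERN, the reading of :- as "use a default" and :? as "fail with a message" follows the usual shell convention and is an assumption here, and escape handling via ENV_ESCAPE_SENTINEL is omitted entirely.

```python
from __future__ import annotations

import re
from collections.abc import Mapping

# Simplified stand-in for ENV_PATTERN: ${NAME}, ${NAME:-default}, ${NAME:?message}.
PATTERN = re.compile(
    r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?:(?P<op>:-|:\?)(?P<value>[^}]*))?\}"
)
MAX_PASSES = 10  # same value as MAX_ENV_RESOLUTION_PASSES


def expand(value: str, env: Mapping[str, str]) -> str:
    """Expand variable patterns repeatedly until the string stops changing."""

    def substitute(match: re.Match[str]) -> str:
        name, op, arg = match["name"], match["op"], match["value"]
        if name in env:
            return env[name]
        if op == ":-":
            return arg  # assumed meaning: default for an unset variable
        message = arg if op == ":?" and arg else f"variable {name!r} is not set"
        raise ValueError(message)

    for _ in range(MAX_PASSES):
        expanded = PATTERN.sub(substitute, value)
        if expanded == value:
            return expanded  # nothing left to expand
        value = expanded
    raise ValueError(f"expansion did not converge after {MAX_PASSES} passes")


print(expand("${ROOT}/out", {"ROOT": "/tmp"}))
print(expand("${MISSING:-fallback}", {}))
print(expand("${A}", {"A": "${B}/y", "B": "z"}))
```

Repeating the substitution lets a variable whose value itself contains a pattern resolve fully, while the pass limit turns mutually recursive definitions into an error instead of an endless loop.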

mafw.tools.toml_tools.ENV_ESCAPE_SENTINEL = '__MAFW_ENV_ESCAPE__{'

Sentinel used to preserve escaped variable patterns.

mafw.tools.toml_tools.ENV_PATTERN = re.compile('\n    \\$\\{                              # opening ${\n    (?P<name>[A-Za-z_][A-Za-z0-9_]*)  # variable name\n    (?:\n        (?P<op>:-|:\\?)               # operator (:- or :?)\n        (?P<value>, re.VERBOSE)

Regex matching supported environment variable expansion patterns.

mafw.tools.toml_tools.MAX_ENV_RESOLUTION_PASSES = 10

Maximum number of expansion passes applied to a single string.