mafw.tools.regexp

This module implements basic helper functions based on regular expressions.

Functions

extract_protocol(url)

Extract the protocol portion from a database connection URL.

normalize_sql_spaces(sql_string)

Normalize multiple consecutive spaces in an SQL string to single spaces.

parse_processor_name(processor_string)

Parse a processor name string into name and replica identifier components.

mafw.tools.regexp.extract_protocol(url: str) → str | None

Extract the protocol portion from a database connection URL.

The function takes a database connection URL string and returns its protocol, that is, the part before “://”. This is useful for identifying the database type from a connection string.

Parameters:

url (str) – The URL from which the protocol will be extracted.

Returns:

The protocol, or None if the extraction failed.

Return type:

str | None
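As an illustration of the behaviour described above, here is a minimal sketch of how the part before “://” could be extracted. The function name extract_protocol_sketch and the regular expression are assumptions made for this example; the actual pattern used by mafw may differ.

```python
import re
from typing import Optional

# Hypothetical pattern, not taken from the mafw source: a scheme-like
# prefix (letters, digits, '+', '.', '-') immediately followed by '://'.
_PROTOCOL_RE = re.compile(r'^([A-Za-z][A-Za-z0-9+.\-]*)://')


def extract_protocol_sketch(url: str) -> Optional[str]:
    """Return the text before '://' or None if the URL has no such prefix."""
    match = _PROTOCOL_RE.match(url)
    return match.group(1) if match else None


print(extract_protocol_sketch('sqlite:///:memory:'))         # sqlite
print(extract_protocol_sketch('postgresql://user@host/db'))  # postgresql
print(extract_protocol_sketch('not a url'))                  # None
```

Returning None instead of raising lets the caller decide how to handle a connection string with no recognisable protocol.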

mafw.tools.regexp.normalize_sql_spaces(sql_string: str) → str

Normalize multiple consecutive spaces in an SQL string to single spaces. Only space characters are collapsed; other whitespace characters, such as tabs and newlines, are preserved.

Parameters:

sql_string (str) – The SQL string whose spaces will be normalized.

Returns:

The normalized SQL command.

Return type:

str
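The space-only normalization described above can be sketched with a single substitution. The name normalize_sql_spaces_sketch is hypothetical and the code is an assumption about one possible approach, not the mafw implementation. In particular, this sketch does not trim leading or trailing spaces; whether the real function does is not stated here.

```python
import re


def normalize_sql_spaces_sketch(sql_string: str) -> str:
    """Collapse runs of two or more space characters into one space.

    Tabs, newlines and other whitespace are left untouched.
    """
    return re.sub(r' {2,}', ' ', sql_string)


print(normalize_sql_spaces_sketch('SELECT  *   FROM    my_table'))
# SELECT * FROM my_table
```

Matching a literal space rather than `\s` is what keeps tabs and line breaks intact.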

mafw.tools.regexp.parse_processor_name(processor_string: str) → tuple[str, str | None]

Parse a processor name string into name and replica identifier components.

Given a string of the form ‘MyProcessorName#156a’, the function returns the tuple (‘MyProcessorName’, ‘156a’). If the input string is only ‘MyProcessorName’, it returns (‘MyProcessorName’, None). For ‘MyProcessorName#’, it also returns (‘MyProcessorName’, None), but emits a warning about a possibly malformed name.

The processor name must be a valid Python identifier (class name).

Parameters:

processor_string (str) – The processor name string to parse.

Returns:

A tuple of (name, replica_id) where replica_id can be None.

Return type:

tuple[str, str | None]

Raises:

UnknownProcessor – if the name part is empty or not a valid Python identifier
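The parsing rules above can be sketched as follows. This is an illustrative approximation only: parse_processor_name_sketch and UnknownProcessorSketch are hypothetical stand-ins for the mafw function and exception, and the sketch uses str.partition instead of a regular expression for brevity. How the real function treats a second ‘#’ in the input is not covered here.

```python
import warnings
from typing import Optional, Tuple


class UnknownProcessorSketch(Exception):
    """Hypothetical stand-in for mafw's UnknownProcessor exception."""


def parse_processor_name_sketch(processor_string: str) -> Tuple[str, Optional[str]]:
    """Split 'Name#replica' into (name, replica_id)."""
    # Split at the first '#'; sep is '' when no '#' is present.
    name, sep, replica = processor_string.partition('#')

    # An empty string is not an identifier, so this covers both failure cases.
    if not name.isidentifier():
        raise UnknownProcessorSketch(f'Invalid processor name: {processor_string!r}')

    if sep and not replica:
        # Trailing '#' with no replica identifier: warn, but still return the name.
        warnings.warn(f'Possibly malformed processor name: {processor_string!r}')
        return name, None

    return name, (replica or None)


print(parse_processor_name_sketch('MyProcessorName#156a'))  # ('MyProcessorName', '156a')
print(parse_processor_name_sketch('MyProcessorName'))       # ('MyProcessorName', None)
```

Calling the sketch with ‘MyProcessorName#’ returns (‘MyProcessorName’, None) and emits a warning, while an input such as ‘#156a’ raises the stand-in exception because the name part is empty.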