mafw.db.db_filter

Database filter module for MAFW.

This module provides classes and utilities for creating and managing database filters using Peewee ORM. It supports various filtering operations including simple conditions, logical combinations, and conditional filters where one field’s criteria depend on another.

The module implements a flexible filter system that can handle:
  • Simple field comparisons (equality, inequality, greater/less than, etc.)

  • Complex logical operations (AND, OR, NOT)

  • Conditional filters with dependent criteria

  • Nested logical expressions

  • Support for various data types and operations

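Key components include:
  • FilterNode: abstract base for filter nodes

  • ConditionNode: a single condition on a model field

  • LogicalNode: logical combination (AND, OR, NOT) of child nodes

  • ConditionalNode: adapter wrapping a ConditionalFilterCondition as a filter node

  • ExprParser: recursive descent parser for logical expression strings

  • ModelFilter: filter for the rows of one model

  • ProcessorFilter: dictionary holding all the ModelFilters of a processor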

The module uses a hierarchical approach to build filter expressions that can be converted into Peewee expressions for database queries. It supports both simple and complex filtering scenarios through a combination of direct field conditions and logical expressions.

Changed in version v2.0.0: Major overhaul introducing conditional filters and logical expression support.

Example usage:

from mafw.db.db_filter import ModelFilter

# Create a simple filter
flt = ModelFilter(
    'Processor.__filter__.Model',
    field1='value1',
    field2={'op': 'IN', 'value': [1, 2, 3]},
)

# Bind to a model and generate query
flt.bind(MyModel)
query = MyModel.select().where(flt.filter())

See also

peewee - The underlying ORM library used for database operations

LogicalOp - Logical operation enumerations used in filters

Module Attributes

Token

Type definition for a logical expression token

NameNode

An atom is a tuple of the literal string 'NAME' and the value

NotNode

A NOT node is a tuple of 'NOT' and a recursive node

BinaryNode

AND/OR nodes are tuples of the operator and two recursive nodes

ExprNode

The main recursive type combining all options

TOKEN_SPECIFICATION

Token specifications

MASTER_RE

Compiled regular expression to interpret the logical expression grammar

Functions

tokenize(text)

Tokenize a logical expression string into a list of tokens.

Classes

ConditionNode(field, operation, value[, name])

Represents a single condition node in a filter expression.

ConditionalFilterCondition(condition_field, ...)

Represents a conditional filter where one field's criteria depend on another.

ConditionalNode(conditional[, name])

Wraps ConditionalFilterCondition behaviour as a FilterNode.

ExprParser(text)

Recursive descent parser producing a simple Abstract Syntax Tree (AST).

FilterNode()

Abstract base for nodes.

LogicalNode(op, *children)

Logical combination of child nodes.

ModelFilter(name_, **kwargs)

Class to filter rows from a model.

ProcessorFilter([data])

A special dictionary to store all Filters in a processor.

Exceptions

ParseError

Exception raised when parsing a logical expression fails.

exception mafw.db.db_filter.ParseError[source]

Bases: ValueError

Exception raised when parsing a logical expression fails.

This exception is raised when the parser encounters invalid syntax in a logical expression string.

class mafw.db.db_filter.ConditionNode(field: str | None, operation: LogicalOp | str, value: Any, name: str | None = None)[source]

Bases: FilterNode

Represents a single condition node in a filter expression.

This class encapsulates a single filtering condition that can be applied to a model field. It supports various logical operations through the LogicalOp enumerator or string representations of operations.

Added in version v2.0.0.

Initialize a condition node.

Parameters:
  • field (str | None) – The name of the field to apply the condition to.

  • operation (LogicalOp | str) – The logical operation to perform.

  • value (Any) – The value to compare against.

  • name (str | None, Optional) – Optional name for this condition node.

to_expression(model: type[Model]) Expression[source]

Convert this condition node to a Peewee expression.

This method translates the condition represented by this node into a Peewee expression that can be used in database queries.

Parameters:

model (type[Model]) – The model class containing the field to filter.

Returns:

A Peewee expression representing this condition.

Return type:

peewee.Expression

Raises:
  • RuntimeError – If the node has no field to evaluate.

  • ValueError – If an unsupported operation is specified.

  • TypeError – If operation requirements are not met (e.g., IN operation requires list/tuple).

class mafw.db.db_filter.ConditionalFilterCondition(condition_field: str, condition_op: str | LogicalOp, condition_value: Any, then_field: str, then_op: str | LogicalOp, then_value: Any, else_field: str | None = None, else_op: str | LogicalOp | None = None, else_value: Any | None = None, name: str | None = None)[source]

Bases: object

Represents a conditional filter where one field’s criteria depend on another.

This allows expressing logic like: “IF field_a IN [x, y] THEN field_b IN [1, 2] ELSE no constraint on field_b”

Example usage:

# Filter: sample_id in [1,2] if composite_image_id in [100,101]
condition = ConditionalFilterCondition(
    condition_field='composite_image_id',
    condition_op='IN',
    condition_value=[100, 101],
    then_field='sample_id',
    then_op='IN',
    then_value=[1, 2],
)

# This generates:
# WHERE (composite_image_id IN (100, 101) AND sample_id IN (1, 2))
#    OR (composite_image_id NOT IN (100, 101))

Initialise a conditional filter condition.

Parameters:
  • condition_field (str) – The field to check for the condition

  • condition_op (str | LogicalOp) – The operation for the condition (e.g., 'IN', '==')

  • condition_value (Any) – The value(s) for the condition

  • then_field (str) – The field to filter when condition is true

  • then_op (str | LogicalOp) – The operation to apply when condition is true

  • then_value (Any) – The value(s) for the then clause

  • else_field (str | None) – Optional field to filter when condition is false

  • else_op (str | LogicalOp | None) – Optional operation when condition is false

  • else_value (Any | None) – Optional value(s) for the else clause

  • name (str | None, Optional) – The name of this condition. Avoid name clashing with model fields. Defaults to None

to_expression(model: type[Model]) Expression[source]

Convert this conditional filter to a Peewee expression.

The resulting expression is: (condition AND then_constraint) OR (NOT condition AND else_constraint)

Which logically means:

  • When condition is true, apply then_constraint

  • When condition is false, apply else_constraint (or no constraint)
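The boolean structure above can be checked without a database. The following is a minimal sketch in plain Python, not the library's implementation: conditional_passes and its predicate arguments are hypothetical names, and rows are plain dictionaries standing in for model instances. It also ignores SQL NULL semantics.

```python
from typing import Any, Callable, Mapping, Optional

Row = Mapping[str, Any]
Predicate = Callable[[Row], bool]


def conditional_passes(
    row: Row,
    condition: Predicate,
    then_constraint: Predicate,
    else_constraint: Optional[Predicate] = None,
) -> bool:
    """Return True if ``row`` survives the conditional filter.

    Mirrors (condition AND then) OR (NOT condition AND else); a missing
    else branch is treated as "no constraint", i.e. always True.
    """
    if condition(row):
        return then_constraint(row)
    return True if else_constraint is None else else_constraint(row)


# Hypothetical rows reusing the field names of the class example.
rows = [
    {'composite_image_id': 100, 'sample_id': 1},  # condition true, then true
    {'composite_image_id': 101, 'sample_id': 7},  # condition true, then false
    {'composite_image_id': 500, 'sample_id': 7},  # condition false, no else
]

kept = [
    r
    for r in rows
    if conditional_passes(
        r,
        condition=lambda r: r['composite_image_id'] in (100, 101),
        then_constraint=lambda r: r['sample_id'] in (1, 2),
    )
]
```

Only the second row is rejected: its condition holds but the then constraint does not. The third row passes because no else constraint was given.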

Parameters:

model (type[Model]) – The model class containing the fields

Returns:

A Peewee expression

Return type:

peewee.Expression

class mafw.db.db_filter.ConditionalNode(conditional: ConditionalFilterCondition, name: str | None = None)[source]

Bases: FilterNode

Wraps ConditionalFilterCondition behaviour as a FilterNode.

This class serves as an adapter to integrate conditional filter conditions into the filter node hierarchy, allowing them to be treated uniformly with other filter nodes during expression evaluation.

Added in version v2.0.0.

Initialize a conditional node.

Parameters:
  • conditional (ConditionalFilterCondition) – The conditional filter condition to wrap

  • name (str | None, Optional) – Optional name for this conditional node

to_expression(model: type[Model]) Expression[source]

Convert this conditional node to a Peewee expression.

This method delegates the conversion to the wrapped conditional filter condition’s to_expression() method.

Parameters:

model (type[Model]) – The model class to generate the expression for

Returns:

A Peewee expression representing this conditional node

Return type:

peewee.Expression

class mafw.db.db_filter.ExprParser(text: str)[source]

Bases: object

Recursive descent parser producing a simple Abstract Syntax Tree (AST).

The parser handles logical expressions with the following grammar:

expr    := or_expr
or_expr := and_expr ("OR" and_expr)*
and_expr:= not_expr ("AND" not_expr)*
not_expr:= "NOT" not_expr | atom
atom    := NAME | "(" expr ")"

AST nodes are tuples representing different constructs:

  • (“NAME”, “token”): A named element (field name or filter name)

  • (“NOT”, node): A negation operation

  • (“AND”, left, right): An AND operation between two nodes

  • (“OR”, left, right): An OR operation between two nodes

Added in version v2.0.0.
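To make the grammar concrete, here is a minimal recursive descent sketch that follows the same five rules and emits the same tuple shapes. It is an illustration only: MiniParser, TOKEN_RE and the advance() helper are hypothetical names, the tokenizer is deliberately simplified (unknown characters are silently skipped), and ValueError is raised where the real parser documents ParseError.

```python
import re
from typing import Any, List, Optional, Tuple

# Simplified tokenizer for this sketch only: parentheses or word-like names.
TOKEN_RE = re.compile(r'\(|\)|[A-Za-z_][A-Za-z0-9_.:]*')

Node = Tuple[Any, ...]


class MiniParser:
    """Hypothetical stand-in following the grammar above; not ExprParser."""

    def __init__(self, text: str) -> None:
        self.tokens: List[str] = TOKEN_RE.findall(text)
        self.pos = 0

    def peek(self) -> Optional[str]:
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self) -> str:
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def parse(self) -> Node:
        node = self.parse_or()
        if self.peek() is not None:
            raise ValueError(f'unexpected token {self.peek()!r}')
        return node

    def parse_or(self) -> Node:  # or_expr := and_expr ("OR" and_expr)*
        node = self.parse_and()
        while self.peek() == 'OR':
            self.advance()
            node = ('OR', node, self.parse_and())
        return node

    def parse_and(self) -> Node:  # and_expr := not_expr ("AND" not_expr)*
        node = self.parse_not()
        while self.peek() == 'AND':
            self.advance()
            node = ('AND', node, self.parse_not())
        return node

    def parse_not(self) -> Node:  # not_expr := "NOT" not_expr | atom
        if self.peek() == 'NOT':
            self.advance()
            return ('NOT', self.parse_not())
        return self.parse_atom()

    def parse_atom(self) -> Node:  # atom := NAME | "(" expr ")"
        token = self.peek()
        if token is None:
            raise ValueError('unexpected end of expression')
        if token == '(':
            self.advance()
            node = self.parse_or()
            if self.peek() != ')':
                raise ValueError("expected ')'")
            self.advance()
            return node
        if token in (')', 'AND', 'OR', 'NOT'):
            raise ValueError(f'unexpected token {token!r}')
        return ('NAME', self.advance())


ast = MiniParser('sample_name AND (flags OR NOT meas_id)').parse()
```

Because parse_or calls parse_and, which in turn calls parse_not, NOT binds tighter than AND and AND binds tighter than OR. Parentheses restart the descent from the top rule.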

Initialize the expression parser with a logical expression string.

Parameters:

text (str) – The logical expression to parse

accept(*kinds: str) tuple[str, str] | None[source]

Accept and consume the next token if it matches one of the given types.

Parameters:

kinds (str) – Token types to accept

Returns:

The consumed token if matched, otherwise None

Return type:

Token | None

expect(kind: str) tuple[str, str][source]

Expect and consume a specific token type.

Parameters:

kind (str) – The expected token type

Returns:

The consumed token

Return type:

Token

Raises:

ParseError – If the expected token is not found

parse() ExprNode[source]

Parse the entire logical expression and return the resulting AST.

Returns:

The abstract syntax tree representation of the expression

Return type:

ExprNode

Raises:

ParseError – If the expression is malformed

parse_and() ExprNode[source]

Parse an AND expression.

Returns:

The parsed AND expression AST node

Return type:

ExprNode

parse_atom() ExprNode[source]

Parse an atomic expression (NAME or parenthesised expression).

Returns:

The parsed atomic expression AST node

Return type:

ExprNode

Raises:

ParseError – If an unexpected token is encountered

parse_not() ExprNode[source]

Parse a NOT expression.

Returns:

The parsed NOT expression AST node

Return type:

ExprNode

parse_or() ExprNode[source]

Parse an OR expression.

Returns:

The parsed OR expression AST node

Return type:

ExprNode

peek() tuple[str, str] | None[source]

Peek at the next token without consuming it.

Returns:

The next token if available, otherwise None

Return type:

Token | None

class mafw.db.db_filter.FilterNode[source]

Bases: object

Abstract base for nodes.

class mafw.db.db_filter.LogicalNode(op: str, *children: FilterNode)[source]

Bases: FilterNode

Logical combination of child nodes.

This class represents logical operations (AND, OR, NOT) applied to filter nodes. It enables building complex filter expressions by combining simpler filter nodes with logical operators.

Added in version v2.0.0.

Initialize a logical node.

Parameters:
  • op (str) – The logical operation ('AND', 'OR', 'NOT')

  • children (FilterNode) – Child filter nodes to combine with the logical operation

to_expression(model: type[Model]) Expression | bool[source]

Convert this logical node to a Peewee expression.

This method evaluates the logical operation on the child nodes and returns the corresponding Peewee expression.

Parameters:

model (type[Model]) – The model class to generate the expression for

Returns:

A Peewee expression representing this logical node

Return type:

peewee.Expression | bool

Raises:

ValueError – If an unknown logical operation is specified
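Peewee expressions overload the &, | and ~ operators, so combining child expressions reduces to folding them with the matching operator. The sketch below illustrates that idea with a tiny stand-in class that renders SQL-like text instead of real Peewee objects. Sql and combine are hypothetical names, and the check that NOT takes exactly one child is an assumption of this sketch.

```python
import operator
from functools import reduce


class Sql:
    """Tiny stand-in for a Peewee expression that renders SQL-like text."""

    def __init__(self, text: str) -> None:
        self.text = text

    def __and__(self, other: 'Sql') -> 'Sql':
        return Sql(f'({self.text} AND {other.text})')

    def __or__(self, other: 'Sql') -> 'Sql':
        return Sql(f'({self.text} OR {other.text})')

    def __invert__(self) -> 'Sql':
        return Sql(f'(NOT {self.text})')


def combine(op: str, *children: Sql) -> Sql:
    """Combine child expressions; hypothetical helper, not LogicalNode."""
    if op == 'NOT':
        if len(children) != 1:  # assumption of this sketch
            raise ValueError('NOT expects exactly one child')
        return ~children[0]
    if op == 'AND':
        return reduce(operator.and_, children)
    if op == 'OR':
        return reduce(operator.or_, children)
    raise ValueError(f'Unknown logical operation: {op}')


a, b, c = Sql('a = 1'), Sql('b IN (1, 2)'), Sql('c GLOB "x*"')
expr = combine('AND', a, combine('OR', b, combine('NOT', c)))
```

Here expr.text reads (a = 1 AND (b IN (1, 2) OR (NOT c GLOB "x*"))). Nesting calls to combine mirrors nesting LogicalNode instances.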

class mafw.db.db_filter.ModelFilter(name_: str, **kwargs: Any)[source]

Bases: object

Class to filter rows from a model.

The filter object can be used to generate a where clause to be applied to Model.select().

A ModelFilter is normally constructed from a configuration file using the from_conf() class method. The name of the filter plays a key role in this. If it follows a dotted structure like:

ProcessorName.__filter__.ModelName

then the corresponding table from the TOML configuration object will be used.
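As an illustration of how a dotted name can select a table, the sketch below walks a nested dictionary along the parts of the name. It assumes the TOML file has already been parsed into plain dictionaries. lookup_table and the sample conf are hypothetical, and how from_conf() handles missing keys is not covered here.

```python
from functools import reduce
from typing import Any, Dict

# Hypothetical steering configuration, as it might look once parsed.
conf: Dict[str, Any] = {
    'MyProcessor': {
        '__filter__': {
            'MyModel': {'sample_name': 'sample_00*', 'successful': True},
        },
    },
}


def lookup_table(name: str, conf: Dict[str, Any]) -> Dict[str, Any]:
    """Walk ``conf`` along the dot-separated parts of ``name``."""
    return reduce(lambda table, key: table[key], name.split('.'), conf)


table = lookup_table('MyProcessor.__filter__.MyModel', conf)
model_name = 'MyProcessor.__filter__.MyModel'.split('.')[-1]
```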

For each processor, there might be many Filters, up to one for each Model used to get the input list. If a processor is joining together three Models when performing the input select, there will be up to three Filters collaborating on making the selection.

The filter configuration can contain the following key/value pairs:

  • key / string pairs, where the key is the name of a field in the corresponding Model

  • key / numeric pairs

  • key / array pairs

  • key / dict pairs with 'op' and 'value' keys for explicit operation specification

All fields from the configuration file will be added to the instance namespace, and are thus accessible with the dot notation. Moreover, the field names and their filter values will be added to a private dictionary to simplify the generation of the filter SQL code.

The user can use the filter object to store selection criteria and construct queries from the filter contents, in the same way as with processor parameters.

To automatically generate a valid filtering expression, use the filter() method. For this to work, the ModelFilter object must be bound to a Model. Without this binding, the ModelFilter cannot automatically generate expressions.

For each field in the filter, one condition will be generated according to the following scheme:

Filter field type    Logical operation    Example
-----------------    -----------------    ------------------
Numeric, boolean     ==                   Field == 3.14
String               GLOB                 Field GLOB '*ree'
List                 IN                   Field IN [1, 2, 3]
Dict (explicit)      op from dict         Field BIT_AND 5

All conditions will be joined with an AND logic by default, but this can be changed.
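The mapping from value type to default operation can be sketched in a few lines of plain Python. This is an in-memory approximation under stated assumptions, not the library's code: infer_operation and row_matches are hypothetical helpers, rows are dictionaries, GLOB is approximated with fnmatch.fnmatchcase, and only ==, GLOB and IN are evaluated.

```python
from fnmatch import fnmatchcase
from typing import Any, Dict, Mapping, Tuple


def infer_operation(value: Any) -> Tuple[str, Any]:
    """Map a filter value to (operation, operand) following the scheme above."""
    if isinstance(value, dict):
        return value['op'], value['value']  # explicit operation
    if isinstance(value, list):
        return 'IN', value
    if isinstance(value, str):
        return 'GLOB', value
    if isinstance(value, (bool, int, float)):
        return '==', value
    raise TypeError(f'Unsupported filter value type: {type(value).__name__}')


def row_matches(
    row: Mapping[str, Any], criteria: Dict[str, Any], join_with: str = 'AND'
) -> bool:
    """Evaluate the criteria in memory; only ==, GLOB and IN are covered."""
    if join_with not in ('AND', 'OR'):
        raise ValueError("join_with must be 'AND' or 'OR'")
    checks = {
        '==': lambda actual, operand: actual == operand,
        'GLOB': lambda actual, operand: fnmatchcase(actual, operand),
        'IN': lambda actual, operand: actual in operand,
    }
    results = []
    for field, value in criteria.items():
        op, operand = infer_operation(value)
        results.append(checks[op](row[field], operand))
    return all(results) if join_with == 'AND' else any(results)


criteria = {'sample_name': 'sample_00*', 'meas_id': [1, 2, 3], 'successful': True}
row = {'sample_name': 'sample_007', 'meas_id': 9, 'successful': True}
```

With these values the row fails the default AND join, because meas_id is not in the list, but passes when the conditions are joined with OR.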

The ModelFilter also supports logical expressions to combine multiple filter conditions using AND, OR, and NOT operators. These expressions can reference named filter conditions within the same filter or even combine conditions from different filters when used with ProcessorFilter.

Conditional filters allow expressing logic like: “IF field_a IN [x, y] THEN field_b IN [1, 2] ELSE no constraint on field_b”

Consider the following example:

class MeasModel(MAFwBaseModel):
    meas_id = AutoField(primary_key=True)
    sample_name = TextField()
    successful = BooleanField()
    flags = IntegerField()
    composite_image_id = IntegerField()
    sample_id = IntegerField()


# Traditional simplified usage
flt = ModelFilter(
    'MyProcessor.__filter__.MyModel',
    sample_name='sample_00*',
    meas_id=[1, 2, 3],
    successful=True,
)

# New explicit operation usage
flt = ModelFilter(
    'MyProcessor.__filter__.MyModel',
    sample_name={'op': 'LIKE', 'value': 'sample_00%'},
    flags={'op': 'BIT_AND', 'value': 5},
    meas_id={'op': 'IN', 'value': [1, 2, 3]},
)

# Logical expression usage
flt = ModelFilter(
    'MyProcessor.__filter__.MyModel',
    sample_name={'op': 'LIKE', 'value': 'sample_00%'},
    flags={'op': 'BIT_AND', 'value': 5},
    meas_id={'op': 'IN', 'value': [1, 2, 3]},
    __logic__='sample_name AND (flags OR meas_id)',
)

# Conditional filter usage
flt = ModelFilter(
    'MyProcessor.__filter__.MyModel',
    sample_name='sample_00*',
    composite_image_id=[100, 101],
    sample_id=[1, 2],
    __conditional__=[
        {
            'condition_field': 'composite_image_id',
            'condition_op': 'IN',
            'condition_value': [100, 101],
            'then_field': 'sample_id',
            'then_op': 'IN',
            'then_value': [1, 2],
        }
    ],
)

flt.bind(MeasModel)
filtered_query = MeasModel.select().where(flt.filter())

The explicit operation format allows for bitwise operations and other advanced filtering.

TOML Configuration Examples:

[MyProcessor.__filter__.MyModel]
sample_name = "sample_00*"  # Traditional GLOB
successful = true           # Traditional equality

# Explicit operations
flags = { op = "BIT_AND", value = 5 }
score = { op = ">=", value = 75.0 }
category = { op = "IN", value = ["A", "B", "C"] }
date_range = { op = "BETWEEN", value = ["2024-01-01", "2024-12-31"] }

# Logical expression for combining conditions
__logic__ = "sample_name AND (successful OR flags)"

# Conditional filters
[[MyProcessor.__filter__.MyModel.__conditional__]]
condition_field = "composite_image_id"
condition_op = "IN"
condition_value = [100, 101]
then_field = "sample_id"
then_op = "IN"
then_value = [1, 2]

# Nested conditions with logical expressions
[MyProcessor.__filter__.MyModel.nested_conditions]
__logic__ = "a OR b"
a = { op = "LIKE", value = "test%" }
b = { op = "IN", value = [1, 2, 3] }

Constructor parameters:

Parameters:
  • name_ (str) – The name of the filter. It should be in dotted format to facilitate configuration via the steering file. The trailing underscore allows the user to pass a keyword argument named name.

  • kwargs – Keyword parameters corresponding to fields and filter values.

Changed in version v1.2.0: The parameter name has been renamed to name_.

Changed in version v1.3.0: Implementation of explicit operations.

Changed in version v2.0.0: Introduction of conditional filters, logical expressions and hierarchical structure. Introduction of autobinding for MAFwBaseModels.

classmethod from_conf(name: str, conf: dict[str, Any]) Self[source]

Builds a Filter object from a steering file dictionary.

If the name is in dotted notation, it should correspond to the table in the configuration file. If a default configuration is provided, this will be used as a starting point for the filter, and it will be updated by the actual configuration in conf.

In normal use, you would provide the specific configuration via the conf parameter.

See details in the class documentation

Parameters:
  • name (str) – The name of the filter in dotted notation.

  • conf (dict) – The configuration dictionary.

Returns:

A Filter object

Return type:

ModelFilter

static _create_condition_node_from_value(value: Any, field_name: str, node_name: str | None = None) ConditionNode[source]

Create a ConditionNode based on the value type (backward compatibility).

Parameters:
  • value – The filter value

  • field_name – The field name

  • node_name – Optional name for the created node

Returns:

A ConditionNode

_build_logical_node_from_ast(ast: ExprNode, name_to_nodes: Dict[str, FilterNode], model_name_placeholder: str | None = None) FilterNode[source]

Recursively build LogicalNode from AST using a mapping name->FilterNode.

_evaluate_logic_ast(ast: ExprNode) Expression | bool[source]

Evaluate an abstract syntax tree (AST) representing a logical expression.

This method recursively evaluates the AST nodes to produce a Peewee expression or boolean value representing the logical combination of filter conditions.

Parameters:

ast (ExprNode) – The abstract syntax tree node to evaluate

Returns:

A Peewee expression for logical operations or boolean True/False

Return type:

peewee.Expression | bool

Raises:
  • KeyError – If a referenced condition name is not found in the filter

  • ValueError – If an unsupported AST node type is encountered
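The recursive evaluation can be illustrated on plain booleans instead of Peewee expressions. The sketch below is an assumption-laden stand-in, not the method itself: evaluate is a hypothetical function and values maps each condition name to an already computed truth value. It raises the same exception types as documented above.

```python
from typing import Any, Mapping, Tuple


def evaluate(ast: Tuple[Any, ...], values: Mapping[str, bool]) -> bool:
    """Recursively evaluate an AST tuple against named truth values."""
    kind = ast[0]
    if kind == 'NAME':
        return values[ast[1]]  # KeyError if the name is unknown
    if kind == 'NOT':
        return not evaluate(ast[1], values)
    if kind == 'AND':
        return evaluate(ast[1], values) and evaluate(ast[2], values)
    if kind == 'OR':
        return evaluate(ast[1], values) or evaluate(ast[2], values)
    raise ValueError(f'Unsupported AST node type: {kind!r}')


# AST for the expression 'sample_name AND (flags OR meas_id)'.
ast = ('AND', ('NAME', 'sample_name'), ('OR', ('NAME', 'flags'), ('NAME', 'meas_id')))
result = evaluate(ast, {'sample_name': True, 'flags': False, 'meas_id': True})
```

Replacing the boolean operators with &, | and ~ on expression objects gives the same traversal for database queries.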

add_conditional(conditional: ConditionalFilterCondition) None[source]

Add a conditional filter.

Added in version v2.0.0.

Parameters:

conditional (ConditionalFilterCondition) – The conditional filter condition

add_conditional_from_dict(config: dict[str, Any]) None[source]

Add a conditional filter from a configuration dictionary.

Added in version v2.0.0.

Parameters:

config (dict[str, Any]) – Dictionary with conditional filter configuration

bind(model: type[Model]) None[source]

Connects a filter to a Model class.

Parameters:

model (type[Model]) – Model to be bound.

filter(join_with: Literal['AND', 'OR'] = 'AND') Expression | bool[source]

Generates a filtering expression joining all filtering fields.

See details in the class documentation

Changed in version v1.3.0: Added the possibility to specify how conditions are joined via join_with.

Changed in version v2.0.0: Added support for conditional filters and logical expressions.

Parameters:

join_with (Literal['AND', 'OR'], default 'AND') – How to join conditions ('AND' or 'OR').

Returns:

The filtering expression.

Return type:

peewee.Expression | bool

Raises:
  • TypeError – When the field value type is not supported.

  • ValueError – When join_with is neither 'AND' nor 'OR'.

conditional_name = '__conditional__'

The conditional keyword identifier.

This value cannot be used as field name in the filter bound model.

property is_bound: bool

Returns True if the ModelFilter has been bound to a Model.

logic_name = '__logic__'

The logic keyword identifier.

This value cannot be used as field name in the filter bound model.

class mafw.db.db_filter.ProcessorFilter(data: dict[str, ModelFilter] | None = None, /, **kwargs: Any)[source]

Bases: UserDict[str, ModelFilter]

A special dictionary to store all Filters in a processor.

It contains a publicly accessible dictionary with the configuration of each ModelFilter, using the Model name as key.

It also contains a private dictionary with the global filter configuration. The global filter is not directly accessible; only some of its members are exposed via properties. In particular, the new_only flag, which is relevant only at the Processor level, can be accessed through the new_only property. If not specified in the configuration file, new_only defaults to True.

It is possible to assign a logic operation string to the registry; it is used to join all the filters together when calling filter_all(). If no logic operation string is provided, the registry joins the filters using either AND (default) or OR.

Constructor parameters:

Parameters:
  • data (dict) – Initial data

  • kwargs – Keyword arguments

bind_all(models: list[type[Model]] | dict[str, type[Model]]) None[source]

Binds all filters to their models.

The models list or dictionary should contain a valid model for every ModelFilter in the registry. In the case of a dictionary, the key should be the model name.

Parameters:

models (list[type[Model]] | dict[str, type[Model]]) – List or dictionary of the available Models to which the ModelFilters can be bound.

filter_all(join_with: Literal['AND', 'OR'] = 'AND') Expression | bool[source]

Generates a where clause joining all filters.

If a logic expression is present, it will be used to combine named filters. Otherwise, fall back to the legacy behaviour using join_with.

Raises:

ValueError – If the parsing of the logical expression fails

Parameters:

join_with (Literal['AND', 'OR'], default: 'AND') – Logical function to join the filters if no logic expression is provided.

Returns:

The filtering expression combining all ModelFilters.

Return type:

peewee.Expression | bool

property new_only: bool

The new only flag.

Returns:

True if only new items, i.e. those not already in the output database table, must be processed.

Return type:

bool

mafw.db.db_filter.tokenize(text: str) list[tuple[str, str]][source]

Tokenize a logical expression string into a list of tokens.

This function breaks down a logical expression string into individual tokens based on the defined token specifications. It skips whitespace and raises a ParseError for unexpected characters.

Parameters:

text (str) – The logical expression string to tokenize

Returns:

A list of tokens represented as (token_type, token_value) tuples

Return type:

list[Token]

Raises:

ParseError – If an unexpected character is encountered in the text
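A tokenizer of this kind can be built by joining the token specifications into one alternation of named groups and reading match.lastgroup for each match. The sketch below reuses the TOKEN_SPECIFICATION values listed on this page. tokenize_sketch and SketchParseError are hypothetical names standing in for tokenize() and ParseError, and the actual implementation may differ in detail.

```python
import re
from typing import List, Tuple

TOKEN_SPECIFICATION = [
    ('LPAREN', r'\('),
    ('RPAREN', r'\)'),
    ('AND', r'\bAND\b'),
    ('OR', r'\bOR\b'),
    ('NOT', r'\bNOT\b'),
    ('NAME', r'[A-Za-z_][A-Za-z0-9_\.]*(?:\:[A-Za-z_][A-Za-z0-9_]*)?'),
    ('SKIP', r'[ \t\n\r]+'),
    ('MISMATCH', r'.'),
]

# One alternation of named groups; earlier entries win over later ones.
MASTER_RE = re.compile(
    '|'.join(f'(?P<{name}>{pattern})' for name, pattern in TOKEN_SPECIFICATION)
)


class SketchParseError(ValueError):
    """Stand-in for ParseError in this sketch."""


def tokenize_sketch(text: str) -> List[Tuple[str, str]]:
    """Split ``text`` into (token_type, token_value) tuples."""
    tokens: List[Tuple[str, str]] = []
    for match in MASTER_RE.finditer(text):
        kind = match.lastgroup  # name of the group that matched
        value = match.group()
        if kind == 'SKIP':
            continue  # whitespace carries no meaning
        if kind == 'MISMATCH':
            raise SketchParseError(
                f'Unexpected character {value!r} at position {match.start()}'
            )
        tokens.append((kind, value))
    return tokens


tokens = tokenize_sketch('a AND (b OR NOT c)')
```

Because the keyword patterns use word boundaries and precede NAME in the alternation, AND is a keyword while a longer identifier such as ANDROID is still a NAME. A NAME may also contain dots and one optional colon-separated suffix.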

mafw.db.db_filter.BinaryNode

AND/OR nodes are tuples of the operator and two recursive nodes

alias of tuple[Literal['AND', 'OR'], ExprNode, ExprNode]

mafw.db.db_filter.ExprNode

The main recursive type combining all options

This type represents the abstract syntax tree (AST) nodes used in logical expressions. It can be one of:

  • NameNode: A named element (field name or filter name)

  • NotNode: A negation operation

  • BinaryNode: An AND/OR operation between two nodes

alias of tuple[Literal['NAME'], str] | tuple[Literal['NOT'], ExprNode] | tuple[Literal['AND', 'OR'], ExprNode, ExprNode]

mafw.db.db_filter.MASTER_RE = re.compile('(?P<LPAREN>\\()|(?P<RPAREN>\\))|(?P<AND>\\bAND\\b)|(?P<OR>\\bOR\\b)|(?P<NOT>\\bNOT\\b)|(?P<NAME>[A-Za-z_][A-Za-z0-9_\\.]*(?:\\:[A-Za-z_][A-Za-z0-9_]*)?)|(?P<SKIP>[ \\t\\n\\r]+)|(?P<MISMATCH>.)')

Compiled regular expression to interpret the logical expression grammar

mafw.db.db_filter.NameNode

An atom is a tuple of the literal string 'NAME' and the value

alias of tuple[Literal['NAME'], str]

mafw.db.db_filter.NotNode

A NOT node is a tuple of 'NOT' and a recursive node

alias of tuple[Literal['NOT'], ExprNode]

mafw.db.db_filter.TOKEN_SPECIFICATION = [('LPAREN', '\\('), ('RPAREN', '\\)'), ('AND', '\\bAND\\b'), ('OR', '\\bOR\\b'), ('NOT', '\\bNOT\\b'), ('NAME', '[A-Za-z_][A-Za-z0-9_\\.]*(?:\\:[A-Za-z_][A-Za-z0-9_]*)?'), ('SKIP', '[ \\t\\n\\r]+'), ('MISMATCH', '.')]

Token specifications

mafw.db.db_filter.Token

Type definition for a logical expression token

alias of tuple[str, str]