First steps

Installation

MAFw can be installed using pip in a separate virtual environment.

Windows

c:\> python -m venv mafw-env
c:\> cd mafw-env
c:\mafw-env> Scripts\activate
(mafw-env)c:\mafw-env> pip install mafw

Linux & macOS

$ python -m venv mafw-env
$ cd mafw-env
$ source bin/activate
(mafw-env) $ pip install mafw

Requirements

MAFw has been developed using Python 3.11 and tested with newer versions up to the current stable release (3.14). Apart from some typing issues, we do not expect problems when running it with older releases. We intend to follow the future development of Python and possibly use the no-GIL (free-threading) option, starting from version 3.14, to improve overall performance.

All packages required by MAFw are specified in the pyproject file and will be installed automatically by pip. Nevertheless, if you are curious to know what comes with MAFw, here is a list of the direct dependencies and the role each one plays inside the library.

pluggy (>=1.5): to implement the plugin mechanism and let the users develop their own processors;

click (>=8.1): to implement the command line interface for the mafw execution engine;

tomlkit (>0.15): to implement the reading and writing of steering files;

peewee (>4.0): to implement the ORM database interface;

Deprecated (>1.2): to inform the user about deprecated usages;

typing-extensions (>4.13, only for Python <=3.11): to have access to typing annotations.

If MAFw is installed with the additional features provided by seaborn, then the following packages will also be installed.

seaborn (>=0.13): to implement the generation of high level graphical outputs;

matplotlib (>=3.1): the low level graphics interface;

pandas[hdf5] (>=2.2): to allow the use of dataframes for data manipulations.

By default, MAFw comes with an abstract plotting interface. If you want to use seaborn, install the optional dependency:

pip install mafw[seaborn]

If you also want to install the GUI for the generation of steering files, then PySide6 will be installed as well. In this case, make sure that your operating system has all the required libraries (on Linux, mainly libgl1).

All MAFw dependencies will be automatically installed by pip.
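If you want to verify what pip actually installed in your environment, the standard library can report the installed version of each distribution. The following is a minimal sketch, not part of MAFw: the helper name installed_versions is hypothetical, and the distribution names are simply copied from the lists above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names copied from the dependency lists above.
# typing-extensions is only required on Python <= 3.11, so it may be absent.
DIRECT = ["pluggy", "click", "tomlkit", "peewee", "Deprecated", "typing-extensions"]
OPTIONAL = ["seaborn", "matplotlib", "pandas", "PySide6"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(DIRECT + OPTIONAL).items():
        print(f"{name:20s} {ver or 'not installed'}")
```

Optional packages reported as not installed simply mean that the corresponding extra was not requested.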

Parallel processing

MAFw is a scientific library dealing with data analysis, and as such it is expected to run as fast and as efficiently as possible. Python is not known as the fastest among programming and scripting languages, but its simplicity and its large number of available (optimized!) libraries make it a top player despite its reduced speed.

For several years now, CPython has been going through a difficult migration process to get rid of one big obstacle in the speed race: the GIL. In 2024, with Python 3.13, the so-called free-threading features were still experimental and two builds were released, one with and one without the GIL. With Python 3.14 the free-threading option is no longer experimental, and in the period 2026-2027 it is expected that there will be one single Python binary with the possibility to switch the GIL on and off at runtime (enabled by default). In 2028-2030, if everything goes as expected, new releases of Python will come with the GIL disabled by default [PEP-703].
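To find out which kind of interpreter you are running, you can query the build configuration and the runtime GIL state. This is a small sketch using only the standard library; the function name free_threading_status is hypothetical, and sys._is_gil_enabled() is only available from Python 3.13, so the code falls back to "GIL enabled" on older versions.

```python
import sys
import sysconfig


def free_threading_status():
    """Return (build_supports_free_threading, gil_currently_enabled)."""
    # Py_GIL_DISABLED is 1 on free-threaded builds (e.g. 3.14t), otherwise 0 or None.
    build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() exists from Python 3.13; older interpreters always have the GIL.
    probe = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = bool(probe()) if probe is not None else True
    return build, gil_enabled


if __name__ == "__main__":
    build, gil_enabled = free_threading_status()
    print(f"free-threaded build: {build}, GIL enabled at runtime: {gil_enabled}")
```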

Almost from its beginning, MAFw has embraced multi-threading as a way to achieve higher performance, in particular because Processors very often run internal loops and thus benefit greatly from the possibility of assigning each loop item to a different thread. You will learn more about parallel for loops in a dedicated section.
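The general idea of handing each loop item to a different thread can be illustrated with the standard library alone. The sketch below does not use the MAFw API: process_item and run_loop_in_threads are hypothetical names, and the per-item work is a placeholder computation. On a build with the GIL enabled, CPU-bound items like this one will not run truly in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def process_item(item: int) -> int:
    """Placeholder for the per-item work done inside a loop (hypothetical)."""
    return sum(i * i for i in range(item))


def run_loop_in_threads(items, max_workers=4):
    """Submit every loop item to a thread pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_item, items))


if __name__ == "__main__":
    print(run_loop_in_threads([10, 100, 1000]))
```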

Here is a list of known limitations you should be aware of when using MAFw in free-threading mode:

Python >= 3.14. MAFw is expected to give reliable results, so the authors waited for the first official release of the free-threading version before embarking on this challenging evolution. As a consequence, MAFw parallel processing was never tested with Python 3.13; if you want to give it a try, keep this in mind.
Small UI-related bug in Python <= 3.14.3. There is a small bug in Python <= 3.14.3t. It is nothing serious: it only affects the way warning messages are displayed in the rich UI. You can certainly live with this limitation, and in any case the bug has already been fixed and the fix will be included in 3.14.4 during 2026.
Missing libraries. Some of the libraries used by MAFw are distributed as binaries. This means that you do not have to compile them yourself: pip/hatch/uv will install the pre-built packages, making your life easier. At the time of writing (March 2026), PySide6 does not have a pre-built installation package that is compatible with 3.14t, so you will not be able to use the steering-gui tool to generate your steering files in that environment. You can in any case install MAFw in two environments, one with 3.14 that you use to operate steering-gui and one with 3.14t where you execute mafw at full speed. Much more relevant is the missing binary installation package for the PostgreSQL driver (both version 2 and 3). If you want to use PostgreSQL with free-threading and you do not want to build the psycopg package from source, you will have to go for the pure Python installation, which is much slower and therefore not a very sensible choice.
No in-memory databases. MySQL and SQLite are good to go in multi-threaded loops (in theory PostgreSQL as well, if you compile the library yourself). There is only one limitation: you cannot use a :memory: SQLite database when working with multiple threads. Each thread will create its own copy of the in-memory database, resulting in a fresh new database every time a thread starts its job. In our opinion, this should not affect your analysis strategy much, because it is quite unusual to have real production databases operating in memory.
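The in-memory limitation can be reproduced with the standard sqlite3 module alone. The sketch below does not use MAFw or its ORM layer; it assumes, as described above, that each thread opens its own connection. The function names are hypothetical.

```python
import sqlite3
import threading

LIST_TABLES = "SELECT name FROM sqlite_master WHERE type='table'"


def tables_seen_by_new_connection():
    """Open a fresh ':memory:' connection and list the tables it can see."""
    conn = sqlite3.connect(":memory:")
    try:
        return [row[0] for row in conn.execute(LIST_TABLES).fetchall()]
    finally:
        conn.close()


def demo():
    """Create a table in the main thread, then look for it from a worker thread."""
    main_conn = sqlite3.connect(":memory:")
    main_conn.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, value REAL)")
    main_conn.commit()

    seen = {}

    def worker():
        # The worker opens its own connection, hence its own empty database.
        seen["worker"] = tables_seen_by_new_connection()

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()

    main_tables = [row[0] for row in main_conn.execute(LIST_TABLES).fetchall()]
    main_conn.close()
    return main_tables, seen["worker"]


if __name__ == "__main__":
    print(demo())  # the worker thread sees no tables at all
```

A file-based SQLite database does not have this problem, because every connection opens the same file on disk.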

Testing

MAFw comes with an extensive unit test suite of more than 1000 test cases for an overall code coverage of 99%.

Tests have been written following pytest best practices and aim to prove the functionality of each unit of code taken individually. Given the high level of interoperability of MAFw with other libraries (toml, peewee and seaborn, just to name a few), unit tests rely heavily on patched objects to ensure reproducibility.

Nevertheless, full integration tests are also included in the test suite. These tests cover all relevant aspects of MAFw, including:

Installation of MAFw and of a plugin project in an isolated environment.
Use of the MAFw executable to create some data files and analyse them to create a graphical output.
Use of a database to store the collected data.
Check of the database trigger functionalities that avoid repeating useless analysis steps, for example when a new file is added, removed or changed.

If you plan to collaborate in the development of MAFw, you must include unit tests for your contributions.

As already mentioned, MAFw uses hatch for project management. In the pyproject.toml file, hatch is configured with a matrix of test environments in order to run the whole test suite with the supported versions of Python (3.11, 3.12 and 3.13).

Running the suite is very easy. Navigate to the folder where you have your local copy of MAFw and type hatch test. Hatch will take care of installing the proper environment and running the tests. Should one or more tests fail, the slow integration tests will be skipped to save time.

Have a look at the hatch test options, in particular -a, to test over all the environments in the matrix, and -c, to generate coverage data for the production of a coverage report.

Citing MAFw

If you used MAFw in your research and you would like to acknowledge the project in your academic publication, we suggest citing the following paper:

Bulgheroni et al., (2025). MAFw: A Modular Analysis Framework for Streamlining and Optimizing Data Analysis Workflows. Journal of Open Source Software, 10(114), 8449, https://doi.org/10.21105/joss.08449

or, in BibTeX format:

@article{Bulgheroni2025,
    doi = {10.21105/joss.08449},
    url = {https://doi.org/10.21105/joss.08449},
    year = {2025},
    publisher = {The Open Journal},
    volume = {10},
    number = {114},
    pages = {8449},
    author = {Bulgheroni, Antonio and Krachler, Michael},
    title = {MAFw: A Modular Analysis Framework for Streamlining and Optimizing Data Analysis Workflows},
    journal = {Journal of Open Source Software}
}