Some notes on setting up pypomp and a related workflow¶
These notes are based on discussions in the Fall 2024 pypomp group meetings.
Why ipynb?¶
These documents are developed in ipynb format, written using Jupyter Lab. Software-focused projects might use Sphinx and Read the Docs. At this point, pypomp is primarily being developed as a research tool, with software engineering principles being applied to strengthen the research agenda. The statistics community is familiar with ipynb, and it is suitable for data analysis and methodology research projects. Thus, we also use it for tutorials.
Python distribution¶
Using an up-to-date Anaconda distribution is a standard data science approach, and that is what we do here. For working on the University of Michigan greatlakes cluster, it turned out to be better to use a Python virtual environment.
import sys
sys.version
'3.12.4 | packaged by Anaconda, Inc. | (main, Jun 18 2024, 10:07:17) [Clang 14.0.6 ]'
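Since we switch between Anaconda and a Python virtual environment depending on the machine, it can help to record which interpreter a notebook is actually running. The following is a minimal sketch using only the standard library. The function name `describe_environment` is hypothetical, and the Anaconda check is a heuristic based on the version string shown above, not an official test.

```python
import sys


def describe_environment():
    """Return a short description of the running Python interpreter."""
    return {
        # e.g. "3.12.4": the first token of the full version string
        "version": sys.version.split()[0],
        # full path of the interpreter running this notebook or script
        "executable": sys.executable,
        # inside a venv, sys.prefix differs from sys.base_prefix;
        # a conda environment is not a venv, so this is False there
        "in_venv": sys.prefix != sys.base_prefix,
        # heuristic only: Anaconda builds mention the packager in sys.version
        "packaged_by_anaconda": "Anaconda" in sys.version,
    }


env = describe_environment()
for key, value in env.items():
    print(f"{key}: {value}")
```

Saving this output alongside test results makes it easier to tell later which environment produced them.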
Some packages we use (pandas, numpy, matplotlib, seaborn, scipy, pytest) come with Anaconda. Others we must install ourselves. The tests below were run with pypomp 0.0.2, which has a tensorflow dependency that is removed in 0.0.3.
%%capture
%pip install tensorflow tensorflow_probability jax
This is appropriate for testing in a CPU environment.
To use NVIDIA GPUs on a Linux machine, we would need
pip install -U "jax[cuda12]"
and something similar for tensorflow and tensorflow_probability.
A subsequent document on GPU setup should be linked here.
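Whichever install is used, a quick way to see what JAX ended up with is to ask it for its default backend and device list. This is a minimal sketch, assuming only that `jax` may or may not be importable; the helper name `jax_backend_summary` is hypothetical.

```python
def jax_backend_summary():
    """Describe the backend JAX selected, or report that jax is missing."""
    try:
        import jax
    except ImportError:
        return "jax not installed"
    # default_backend() is a string such as "cpu" or "gpu";
    # devices() lists the devices available on that backend
    devices = jax.devices()
    return f"{jax.default_backend()}: {len(devices)} device(s)"


print(jax_backend_summary())
```

With the CPU-only install above we would expect the `cpu` backend to be reported; if a GPU install still reports `cpu`, the CUDA setup is worth checking.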
And last, but not least,
%%capture
%pip install pypomp
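After the installs, it can be useful to list which versions are actually present in the current environment. The sketch below uses `importlib.metadata` from the standard library. The package list is copied from the text above, and the helper name `installed_versions` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# packages mentioned in this document
PACKAGES = [
    "pandas", "numpy", "matplotlib", "seaborn", "scipy", "pytest",
    "tensorflow", "tensorflow_probability", "jax", "pypomp",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # not installed in the environment running this code
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver if ver is not None else 'not installed'}")
```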
An initial test of pypomp¶
Within the pypomp project directory, we can run
pytest test
to run all the tests in the test directory.
!pytest ~/git/pypomp/test
============================= test session starts ==============================
platform darwin -- Python 3.12.4, pytest-7.4.4, pluggy-1.0.0
rootdir: /Users/ionides/git
plugins: anyio-4.2.0, xdist-3.6.1
collected 185 items

../pypomp/test/test_fit.py ......... [ 4%]
../pypomp/test/test_fit_internal.py .................................... [ 24%]
....... [ 28%]
../pypomp/test/test_mop.py ... [ 29%]
../pypomp/test/test_mop_internal.py .......................... [ 43%]
../pypomp/test/test_perfilter.py ... [ 45%]
../pypomp/test/test_perfilter_internal.py ............................ [ 60%]
../pypomp/test/test_pfilter.py ... [ 62%]
../pypomp/test/test_pfilter_internal.py .......................... [ 76%]
../pypomp/test/test_pfilter_pf.py ... [ 77%]
../pypomp/test/test_pfilter_pf_internal.py ......................... [ 91%]
../pypomp/test/test_pompclass.py ................ [100%]

======================== 185 passed in 77.61s (0:01:17) ========================
This pypomp version runs with no warnings.
The warnings on the previous version all related to installed dependencies, not pypomp directly.
It seems that no action is required apart from routinely updating all packages.
Speeding up testing with parallelization¶
%%capture
%pip install pytest-xdist
!pytest -n auto ~/git/pypomp/test
============================= test session starts ==============================
platform darwin -- Python 3.12.4, pytest-7.4.4, pluggy-1.0.0
rootdir: /Users/ionides/git
plugins: anyio-4.2.0, xdist-3.6.1
10 workers [185 items]
........................................................................ [ 38%]
........................................................................ [ 77%]
......................................... [100%]
============================= 185 passed in 15.47s =============================
On a 10-processor machine, the xdist plugin for pytest gives about a 5-fold speedup here (77.61s down to 15.47s).
We lose some detail in the test output: the per-file progress lines are no longer shown.
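The speedup can be computed directly from the two wall times reported by pytest. The short calculation below also reports parallel efficiency (speedup divided by the number of workers), which is our own addition for context and not something pytest prints.

```python
serial_seconds = 77.61    # wall time reported by the plain `pytest` run
parallel_seconds = 15.47  # wall time reported by the `pytest -n auto` run
workers = 10              # worker count reported by xdist

speedup = serial_seconds / parallel_seconds
efficiency = speedup / workers

print(f"speedup: {speedup:.2f}x, parallel efficiency: {efficiency:.0%}")
# prints: speedup: 5.02x, parallel efficiency: 50%
```

An efficiency well below 100% is not surprising for a test suite, since startup costs are paid by every worker and test durations are uneven.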
A self-contained test of pypomp can follow the workflow used in
pypomp/.github/workflows/test-package.yml
as follows:
conda create --name test
conda activate test
conda install python
python -m pip install --upgrade pip
python -m pip install pytest
cd ~/git/pypomp
pip install -r requirements.txt
pytest
# tidy up afterward
conda deactivate
conda remove -n test --all
A similar workflow arises if you make a fork of pypomp and push to activate the GitHub action tests.
Testing a rebuild of the package¶
A bit of care is needed to check whether you are using the local version of the package from the source files, or from the local build. Also, there is a danger that you might accidentally find a version from PyPI when you are trying to use a local version.
If you work outside the package directory, Python cannot accidentally import the package from the source files.
Here is a workflow:
# stay in the home directory, assuming the repo is cloned in ~/git
cd
conda create --name test python
conda activate test
pip install build
python -m build ~/git/pypomp
pip install ~/git/pypomp/dist/pypomp-0.0.3-py3-none-any.whl
pip install jax
pip install tqdm
pip install pytest
pytest ~/git/pypomp
# tidy up afterward
conda deactivate
conda remove -n test --all
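To confirm which copy of a package Python would import, we can ask the import system where it finds the package, without importing it. This is a minimal sketch using the standard library. The helper name `locate_package` is hypothetical, and the `site-packages` check is a heuristic for "installed copy" as opposed to "source tree on the path".

```python
import importlib.util
from importlib.metadata import PackageNotFoundError, version
from pathlib import Path


def locate_package(name):
    """Report where `name` would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    try:
        dist_version = version(name)
    except PackageNotFoundError:
        # importable, but no installed distribution metadata was found
        dist_version = None
    origin = spec.origin
    return {
        "origin": origin,
        "version": dist_version,
        # heuristic: installed wheels normally live under site-packages
        "in_site_packages": origin is not None
        and "site-packages" in Path(origin).parts,
    }


print(locate_package("pypomp"))
```

Run inside the `test` environment before tidying up, an `origin` under `site-packages` together with version `0.0.3` would suggest the locally built wheel is in use; an `origin` under `~/git/pypomp` would suggest the source files are being picked up instead.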