Python User Site Packages Disabled | Generated by AI
The package is already installed in your user directory (~/.local/lib/python3.13/site-packages), but when you run the script with python3 ./scripts/train/extract_wiki.py, Python is not looking in the user site-packages directory for some reason.
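This symptom is fairly common on Linux, and it usually has one of a few causes: python3 resolves to a different interpreter than the one the package was installed for (the user directory is per version, here python3.13, so a system python3 that is 3.12 will never look there); a virtual environment is active (venvs ignore the user site by default); the environment variable PYTHONNOUSERSITE is set; Python is started with -s or -I (isolated mode), for example in the script's shebang line; or the script runs as a different user (for example under sudo), whose home directory has its own ~/.local.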
Quick ways to verify this is the problem
Run these two commands:
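python3 -c "import site, sys; p = site.getusersitepackages(); print('user site:', p, '| on sys.path:', p in sys.path)"
python3 -c "import site; print(site.ENABLE_USER_SITE)"
If the first command reports a directory other than ~/.local/lib/python3.13/site-packages, python3 is a different version from the one you installed into. If it reports "on sys.path: False", or the second command prints False (switched off by -s, PYTHONNOUSERSITE, or a venv) or None (switched off for security reasons, such as a user ID mismatch), the user site is disabled.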
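The same checks can be gathered in one place. The sketch below is a hypothetical helper (the name user_site_report is made up for illustration, it is not part of any library) that uses only the standard os, site, and sys modules to print the settings that decide whether the user site is used.

```python
import os
import site
import sys


def user_site_report():
    """Collect the settings that decide whether user site-packages is used."""
    user_site = site.getusersitepackages()
    return {
        "executable": sys.executable,
        "version": "%d.%d" % sys.version_info[:2],
        "user_site": user_site,
        "user_site_exists": os.path.isdir(user_site),
        "user_site_on_path": user_site in sys.path,
        # True = enabled, False = disabled by flag/env/venv, None = disabled for security
        "enable_user_site": site.ENABLE_USER_SITE,
        "no_user_site_flag": bool(sys.flags.no_user_site),
        "isolated_flag": bool(sys.flags.isolated),
        "PYTHONNOUSERSITE": os.environ.get("PYTHONNOUSERSITE"),
        # Inside a venv, sys.prefix differs from sys.base_prefix
        "in_venv": sys.prefix != sys.base_prefix,
    }


if __name__ == "__main__":
    for key, value in user_site_report().items():
        print(f"{key:>20}: {value}")
```

Save it next to your script and run it with the exact command you use for the script (same python3, same user), so the report reflects the environment that actually fails.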
Solutions (pick one)
Option 1 – Use python3 -m pip and run with the same Python (recommended)
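# Always use the module form so you know exactly which Python you are installing into
python3 -m pip install --user --upgrade mediawiki-dump
# If pip refuses with "externally-managed-environment" (PEP 668, e.g. Debian 12 or Ubuntu 24.04), use Option 2 instead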
# Then run your script with the same Python
python3 ./scripts/train/extract_wiki.py
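If that still fails, make sure nothing is switching the user site off. Note that -s (like -I) disables the user site rather than enabling it, so do not pass either flag. To remove the environment variable for a single run:
env -u PYTHONNOUSERSITE python3 ./scripts/train/extract_wiki.py
# or put the user site on the path explicitly
PYTHONPATH="$(python3 -m site --user-site)" python3 ./scripts/train/extract_wiki.py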
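To see which switches actually turn the user site off, a small experiment helps. The sketch below (no_user_site_flag is a hypothetical helper name) starts a child interpreter and reads its sys.flags.no_user_site. It assumes sys.executable is the same Python you use to run the script.

```python
import os
import subprocess
import sys


def no_user_site_flag(env_value=None, extra_args=()):
    """Start a child interpreter and return its sys.flags.no_user_site as a bool.

    env_value=None removes PYTHONNOUSERSITE from the child's environment;
    any other string is passed through unchanged.
    """
    env = dict(os.environ)
    env.pop("PYTHONNOUSERSITE", None)
    if env_value is not None:
        env["PYTHONNOUSERSITE"] = env_value
    cmd = [sys.executable, *extra_args, "-c",
           "import sys; print(sys.flags.no_user_site)"]
    result = subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)
    return result.stdout.strip() == "1"


if __name__ == "__main__":
    print("variable removed:  ", no_user_site_flag())
    print("PYTHONNOUSERSITE=1:", no_user_site_flag("1"))
    print("python -s:         ", no_user_site_flag(extra_args=["-s"]))
```

On CPython the last two cases should report True, meaning the user site is off. That is why -s cannot be used to turn the user site on, and why removing the variable is safer than assigning it some other value.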
Option 2 – Use a virtual environment (best practice anyway)
cd /home/lzw/projects/blog-source
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install mediawiki-dump # will go into the venv, no permission problems
python ./scripts/train/extract_wiki.py # now it will definitely find it
Option 3 – Force-enable user site permanently (not recommended for system Python)
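If one of your shell startup files exports PYTHONNOUSERSITE, remove that line, or add this to the end of your ~/.bashrc or ~/.zshrc:
unset PYTHONNOUSERSITE
Setting os.environ["PYTHONNOUSERSITE"] from inside Python (for example in a startup file) does not help: the site module has already built sys.path by the time your code runs, and a PYTHONSTARTUP file is only read in interactive sessions. If you need a persistent override, export the path from the shell instead:
export PYTHONPATH="$(python3 -m site --user-site)${PYTHONPATH:+:$PYTHONPATH}"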
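If the shell environment cannot be changed, a script can add the user site to its own path before importing third-party packages. This is a minimal sketch, assuming the standard site module's getusersitepackages() and addsitedir() functions; the name ensure_user_site_on_path is made up for illustration. Treat it as a stopgap for a single script, not as a replacement for a virtual environment.

```python
import os
import site
import sys


def ensure_user_site_on_path():
    """Append the user site-packages directory to sys.path if it is missing.

    Returns the absolute path of the user site directory.
    """
    user_site = os.path.abspath(site.getusersitepackages())
    if user_site not in sys.path:
        # addsitedir also processes any .pth files found in that directory
        site.addsitedir(user_site)
    return user_site


if __name__ == "__main__":
    print("user site:", ensure_user_site_on_path())
```

Call ensure_user_site_on_path() at the very top of the script, before the first import of a package that lives in the user directory.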
Fast one-liner test right now
Just run this in your current shell:
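env -u PYTHONNOUSERSITE python3 -c "import mediawiki_dump; print('Works!')"
If that prints “Works!” while the same command without env -u PYTHONNOUSERSITE fails, the environment variable was the culprit: remove it from your shell configuration or create a venv. If it still fails, python3 is most likely a different interpreter from the one the package was installed for.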
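When an import succeeds or fails unexpectedly, it helps to know where Python would load the module from. The following sketch uses importlib.util.find_spec from the standard library; locate_module is a hypothetical helper name, and mediawiki_dump is the import name used in the test above.

```python
import importlib.util


def locate_module(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    print("mediawiki_dump ->", locate_module("mediawiki_dump"))
```

A result of None means this interpreter cannot see the package at all. A path under ~/.local/lib/python3.13/site-packages means the user site is being picked up, and a path inside .venv means the virtual environment copy is in use.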
Do the venv route (Option 2) — it will save you endless headaches with every other package too. Let me know if you hit any new error after that!