Commit
Add portable deployment setup and cross-platform build tools
- Update .gitignore: exclude test data, workspace files, and Windows path artifacts
- Add portable_setup/ directory with deployment scripts and guides for USB drive distribution
- Add Windows build scripts: setup_windows_build.ps1, WINDOWS_BUILD_QUICKSTART.txt
- Add cross-platform build documentation: BUILD_WINDOWS_EXECUTABLE.md, PORTABLE_TESTING_APPROACH.md
- Add src/utils/ module with import_helper and secure_fs for platform compatibility
- Add test volume creation scripts: create_hathitrust_volumes.py, create_test_volumes.py, test_sequence.py
- Update core modules with cross-platform improvements and bug fixes
- Add CODE_REVIEW_ACTION_ITEMS.md for tracking technical debt
Showing 31 changed files with 3,946 additions and 69 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,111 @@ | ||
================================================================================ | ||
QUICK START: BUILD WINDOWS EXECUTABLE | ||
================================================================================ | ||
|
||
YOU ARE HERE: WSL built Linux executable, need Windows .exe for testing | ||
SOLUTION: Build on Windows using Python on Windows | ||
|
||
================================================================================ | ||
FASTEST PATH (about 25 minutes total) | ||
================================================================================ | ||
|
||
STEP 1: Install Python on Windows (10 min) | ||
------------------------------------------- | ||
1. Download: https://www.python.org/downloads/windows/ | ||
Get: python-3.12.7-amd64.exe | ||
2. Run installer | ||
3. ✓ CHECK: "Add python.exe to PATH" | ||
4. Click "Install Now" | ||
|
||
STEP 2: Copy Project to Windows (2 min) | ||
---------------------------------------- | ||
Open Windows File Explorer: | ||
From: \\wsl$\Ubuntu\home\schipp0\Digitization\HathiTrust | ||
To: C:\HathiTrust | ||
|
||
Or in PowerShell: | ||
xcopy \\wsl$\Ubuntu\home\schipp0\Digitization\HathiTrust C:\HathiTrust /E /I /H | ||
|
||
STEP 3: Run Setup Script (10 min) | ||
---------------------------------- | ||
1. Open PowerShell in C:\HathiTrust | ||
2. If execution policy error: | ||
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser | ||
3. Run: | ||
.\setup_windows_build.ps1 | ||
4. When prompted, answer 'y' to build now | ||
|
||
STEP 4: Copy to Flash Drive (2 min) | ||
------------------------------------ | ||
(Assumes the flash drive is mounted as D:; substitute your drive letter.) | ||
xcopy C:\HathiTrust\dist\HathiTrust-Automation D:\HathiTrust-Automation /E /I /H | ||
|
||
STEP 5: Test (1 min) | ||
-------------------- | ||
D:\RUN_ME.bat | ||
|
||
DONE! ✓ | ||
|
||
================================================================================ | ||
WHAT THE SETUP SCRIPT DOES | ||
================================================================================ | ||
|
||
1. Checks Python installed | ||
2. Creates virtual environment (venv) | ||
3. Activates venv | ||
4. Installs: PyQt6, Pillow, PyYAML, pytesseract, pyinstaller | ||
5. Offers to build executable immediately | ||
|
||
ALL AUTOMATIC! | ||
|
||
================================================================================ | ||
MANUAL METHOD (if script fails) | ||
================================================================================ | ||
|
||
In PowerShell at C:\HathiTrust: | ||
|
||
python -m venv venv | ||
.\venv\Scripts\Activate.ps1 | ||
python -m pip install --upgrade pip | ||
pip install PyQt6 Pillow PyYAML pytesseract pyinstaller | ||
python build_scripts/build_windows.py | ||
|
||
================================================================================ | ||
FILES CREATED | ||
================================================================================ | ||
|
||
After build: | ||
C:\HathiTrust\dist\HathiTrust-Automation\HathiTrust-Automation.exe ← Windows! | ||
|
||
After copy: | ||
D:\HathiTrust-Automation\HathiTrust-Automation.exe ← Ready to test! | ||
|
||
================================================================================ | ||
TROUBLESHOOTING | ||
================================================================================ | ||
|
||
"python is not recognized" | ||
→ Reinstall Python and check "Add python.exe to PATH" | ||
|
||
"cannot be loaded because running scripts is disabled" | ||
→ Run: Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser | ||
|
||
Build fails | ||
→ Check all dependencies installed: pip list | ||
→ Try: pip install pyinstaller --force-reinstall | ||
|
||
.exe crashes when run | ||
→ Install the Tesseract OCR engine on Windows and add it to PATH (pytesseract only wraps the tesseract binary) | ||
→ Run from command line to see errors | ||
|
||
================================================================================ | ||
WHY THIS IS NECESSARY | ||
================================================================================ | ||
|
||
PyInstaller creates executables for the OS it runs on: | ||
- WSL (Linux) → Linux executable (no .exe) | ||
- Windows → Windows executable (.exe) | ||
- macOS → macOS app bundle | ||
|
||
PyInstaller is not a cross-compiler: build on the OS you are targeting. | ||
|
||
================================================================================ |
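
Most of the troubleshooting entries above come down to three causes: the wrong OS, a missing Python package, or a missing Tesseract binary. The following is a minimal preflight sketch, not part of the commit. The function names `missing_modules` and `preflight` are hypothetical, and the import names (`PyQt6`, `PIL`, `yaml`, `pytesseract`, `PyInstaller`) are assumed to correspond to the pip packages listed in the quick start.

```python
import importlib.util
import platform
import shutil

# Import names assumed for the pip packages listed in the quick start.
REQUIRED_MODULES = ("PyQt6", "PIL", "yaml", "pytesseract", "PyInstaller")


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def preflight(system=None):
    """Collect human-readable problems that would break the Windows build."""
    system = system or platform.system()
    problems = []
    if system != "Windows":
        # PyInstaller builds for the OS it runs on, so WSL yields a Linux binary.
        problems.append(
            f"Running on {system}; a Windows .exe must be built on Windows."
        )
    for name in missing_modules():
        problems.append(f"Missing Python module: {name}")
    if shutil.which("tesseract") is None:
        problems.append("tesseract not found on PATH; OCR will fail at runtime.")
    return problems


if __name__ == "__main__":
    for line in preflight() or ["Preflight OK"]:
        print(line)
```

Running it inside the activated venv before `python build_scripts/build_windows.py` would surface the same conditions the troubleshooting list describes, without waiting for the build to fail.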
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,94 @@ | ||
""" | ||
Generate HathiTrust-compliant test volumes with barcode identifiers | ||
""" | ||
|
||
import os | ||
from PIL import Image, ImageDraw | ||
|
||
def create_hathitrust_test_volumes(base_dir='test_volumes_barcode'): | ||
    """Create test volumes with proper HathiTrust barcode identifiers""" | ||
|
||
    os.makedirs(base_dir, exist_ok=True) | ||
    print(f"Creating HathiTrust test volumes in: {os.path.abspath(base_dir)}") | ||
    print() | ||
|
||
    # Use realistic barcode identifiers (14-digit barcodes) | ||
    volumes = [ | ||
        ('39015012345678', 5, 'Small volume - barcode format'), | ||
        ('39015087654321', 10, 'Medium volume - barcode format'), | ||
        ('39015011111111', 15, 'Large volume - barcode format'), | ||
    ] | ||
|
||
    for barcode, num_pages, description in volumes: | ||
        vol_dir = os.path.join(base_dir, barcode) | ||
        os.makedirs(vol_dir, exist_ok=True) | ||
|
||
        print(f"Creating {barcode}: {description}") | ||
|
||
        for page_num in range(1, num_pages + 1): | ||
            # Create test image | ||
            img = Image.new('RGB', (800, 1000), color='white') | ||
            draw = ImageDraw.Draw(img) | ||
|
||
            # Draw border | ||
            draw.rectangle([50, 50, 750, 950], outline='black', width=3) | ||
|
||
            # Add text content | ||
            text_lines = [ | ||
                f'BARCODE: {barcode}', | ||
                f'Page {page_num} of {num_pages}', | ||
                '', | ||
                'HathiTrust Package Test Volume', | ||
                '=' * 40, | ||
                '', | ||
                'Sample Text for OCR Testing', | ||
                '', | ||
                'Lorem ipsum dolor sit amet, consectetur', | ||
                'adipiscing elit. Sed do eiusmod tempor', | ||
                'incididunt ut labore et dolore magna.', | ||
                '', | ||
                'This volume uses proper HathiTrust', | ||
                'naming conventions with a 14-digit', | ||
                'barcode identifier.', | ||
                '', | ||
                f'Sequence: {page_num:08d}', | ||
            ] | ||
|
||
            y_pos = 120 | ||
            for line in text_lines: | ||
                draw.text((100, y_pos), line, fill='black') | ||
                y_pos += 45 | ||
|
||
            # Add page number at bottom | ||
            draw.text((350, 920), f'- {page_num} -', fill='black') | ||
|
||
            # Save with HathiTrust naming: 00000001.tif (just the sequence number) | ||
            filename = f'{page_num:08d}.tif' | ||
            filepath = os.path.join(vol_dir, filename) | ||
            img.save(filepath, 'TIFF', compression='none') | ||
|
||
        print(f" ✓ Created {num_pages} pages: {barcode}/00000001.tif to {barcode}/{num_pages:08d}.tif") | ||
|
||
    print() | ||
    print("=" * 60) | ||
    print("HathiTrust test volumes created successfully!") | ||
    print("=" * 60) | ||
    print(f"Location: {os.path.abspath(base_dir)}") | ||
    print(f"Total volumes: {len(volumes)}") | ||
    print(f"Total pages: {sum(v[1] for v in volumes)}") | ||
    print() | ||
    print("Volume structure (HathiTrust compliant):") | ||
    for barcode, num_pages, _ in volumes: | ||
        print(f" {barcode}/ (barcode identifier)") | ||
        print(" 00000001.tif") | ||
        print(" 00000002.tif") | ||
        print(" ...") | ||
        print(f" {num_pages:08d}.tif") | ||
    print() | ||
    print("ZIP package names will be:") | ||
    for barcode, _, _ in volumes: | ||
        print(f" {barcode}.zip") | ||
    print() | ||
|
||
if __name__ == '__main__': | ||
    create_hathitrust_test_volumes() |
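
The layout this script prints (a 14-digit barcode directory holding `00000001.tif` onward, later packaged as `<barcode>.zip`) can be checked and zipped with the standard library alone. What follows is a sketch, not code from the commit. The helper names `check_volume` and `zip_volume` are hypothetical, and placing the page files at the archive root is an assumption about the package layout.

```python
import re
import zipfile
from pathlib import Path

BARCODE_RE = re.compile(r"^\d{14}$")      # 14-digit barcode, as in the script
PAGE_RE = re.compile(r"^(\d{8})\.tif$")   # 00000001.tif style names


def check_volume(vol_dir):
    """Return a list of problems found in one volume directory."""
    vol_dir = Path(vol_dir)
    problems = []
    if not BARCODE_RE.match(vol_dir.name):
        problems.append(f"{vol_dir.name}: directory name is not a 14-digit barcode")
    numbers = []
    for path in sorted(vol_dir.iterdir()):
        match = PAGE_RE.match(path.name)
        if match:
            numbers.append(int(match.group(1)))
        else:
            problems.append(f"{path.name}: unexpected file name")
    if not numbers:
        problems.append("no page images found")
    elif numbers != list(range(1, len(numbers) + 1)):
        problems.append("page sequence has a gap or does not start at 00000001")
    return problems


def zip_volume(vol_dir, out_dir):
    """Write <barcode>.zip with the files at the archive root (assumed layout)."""
    vol_dir = Path(vol_dir)
    out_path = Path(out_dir) / f"{vol_dir.name}.zip"
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(vol_dir.iterdir()):
            zf.write(path, arcname=path.name)
    return out_path
```

Calling `check_volume('test_volumes_barcode/39015012345678')` after running the generator should return an empty list if the naming came out as intended.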
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,91 @@ | ||
""" | ||
Generate test TIFF volumes for HathiTrust testing | ||
Creates multiple volumes with properly named TIFF files | ||
""" | ||
|
||
import os | ||
from PIL import Image, ImageDraw | ||
|
||
def create_test_volumes(base_dir='test_volumes'): | ||
    """Create test volumes with TIFF files""" | ||
|
||
    # Create base directory | ||
    os.makedirs(base_dir, exist_ok=True) | ||
    print(f"Creating test volumes in: {os.path.abspath(base_dir)}") | ||
    print() | ||
|
||
    # Volume configurations: (volume_name, num_pages, description) | ||
    volumes = [ | ||
        ('volume_1_small', 5, 'Small volume - 5 pages'), | ||
        ('volume_2_medium', 10, 'Medium volume - 10 pages'), | ||
        ('volume_3_large', 15, 'Large volume - 15 pages'), | ||
    ] | ||
|
||
    for vol_name, num_pages, description in volumes: | ||
        vol_dir = os.path.join(base_dir, vol_name) | ||
        os.makedirs(vol_dir, exist_ok=True) | ||
|
||
        print(f"Creating {vol_name}: {description}") | ||
|
||
        for page_num in range(1, num_pages + 1): | ||
            # Create test image (800x1000 pixels - typical book page) | ||
            img = Image.new('RGB', (800, 1000), color='white') | ||
            draw = ImageDraw.Draw(img) | ||
|
||
            # Draw border | ||
            draw.rectangle([50, 50, 750, 950], outline='black', width=3) | ||
|
||
            # Add text content | ||
            text_lines = [ | ||
                vol_name.upper(), | ||
                f'Page {page_num} of {num_pages}', | ||
                '', | ||
                'Sample Text for OCR Testing', | ||
                '=' * 40, | ||
                '', | ||
                'Lorem ipsum dolor sit amet, consectetur', | ||
                'adipiscing elit. Sed do eiusmod tempor', | ||
                'incididunt ut labore et dolore magna.', | ||
                '', | ||
                'This is a test page generated for', | ||
                'HathiTrust Package Automation testing.', | ||
                '', | ||
                f'Page number: {page_num:08d}', | ||
            ] | ||
|
||
            y_pos = 120 | ||
            for line in text_lines: | ||
                draw.text((100, y_pos), line, fill='black') | ||
                y_pos += 50 | ||
|
||
            # Add page number at bottom | ||
            draw.text((350, 920), f'- {page_num} -', fill='black') | ||
|
||
            # Save with proper naming: 00000001.tif, 00000002.tif, etc. | ||
            filename = f'{page_num:08d}.tif' | ||
            filepath = os.path.join(vol_dir, filename) | ||
|
||
            # Save as uncompressed TIFF | ||
            img.save(filepath, 'TIFF', compression='none') | ||
|
||
        print(f" ✓ Created {num_pages} pages in {vol_name}/") | ||
|
||
    print() | ||
    print("=" * 60) | ||
    print("Test volumes created successfully!") | ||
    print("=" * 60) | ||
    print(f"Location: {os.path.abspath(base_dir)}") | ||
    print(f"Total volumes: {len(volumes)}") | ||
    print(f"Total pages: {sum(v[1] for v in volumes)}") | ||
    print() | ||
    print("Volume structure:") | ||
    for vol_name, num_pages, _ in volumes: | ||
        print(f" {vol_name}/") | ||
        print(" 00000001.tif") | ||
        print(" 00000002.tif") | ||
        print(" ...") | ||
        print(f" {num_pages:08d}.tif") | ||
    print() | ||
|
||
if __name__ == '__main__': | ||
    create_test_volumes() |
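
Both generator scripts carry the same `(name, num_pages, description)` tuples and the same 8-digit file naming rule. As a sketch of how that shared piece might be written once, the block below uses a small dataclass. `VolumeSpec`, `TEST_VOLUMES`, and `total_pages` are hypothetical names introduced for illustration and do not appear in the commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeSpec:
    """One test volume: directory name, page count, and a short description."""
    name: str
    num_pages: int
    description: str

    def page_filenames(self):
        """Sequence-numbered names, 00000001.tif through the last page."""
        return [f"{n:08d}.tif" for n in range(1, self.num_pages + 1)]


# Same values as the `volumes` list in create_test_volumes().
TEST_VOLUMES = [
    VolumeSpec("volume_1_small", 5, "Small volume - 5 pages"),
    VolumeSpec("volume_2_medium", 10, "Medium volume - 10 pages"),
    VolumeSpec("volume_3_large", 15, "Large volume - 15 pages"),
]


def total_pages(specs):
    """Total page count across all volume specs."""
    return sum(spec.num_pages for spec in specs)
```

A generator loop could then iterate over `spec.page_filenames()` instead of formatting the name inline, which keeps the naming rule in one place for both scripts.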