Skip to content

Commit

Permalink
Add portable deployment setup and cross-platform build tools
Browse files Browse the repository at this point in the history
- Update .gitignore: exclude test data, workspace files, and Windows path artifacts
- Add portable_setup/ directory with deployment scripts and guides for USB drive distribution
- Add Windows build scripts: setup_windows_build.ps1, WINDOWS_BUILD_QUICKSTART.txt
- Add cross-platform build documentation: BUILD_WINDOWS_EXECUTABLE.md, PORTABLE_TESTING_APPROACH.md
- Add src/utils/ module with import_helper and secure_fs for platform compatibility
- Add test volume creation scripts: create_hathitrust_volumes.py, create_test_volumes.py, test_sequence.py
- Update core modules with cross-platform improvements and bug fixes
- Add CODE_REVIEW_ACTION_ITEMS.md for tracking technical debt
  • Loading branch information
schipp0 committed Oct 8, 2025
1 parent 0eb986c commit 72deb4f
Show file tree
Hide file tree
Showing 31 changed files with 3,946 additions and 69 deletions.
8 changes: 8 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -82,6 +82,13 @@ logs/
!temp/.gitkeep
!logs/.gitkeep

# Test Data
test_volumes_barcode/

# Windows Path Artifacts (from cross-platform testing)
C:\Users\*
C:/Users/*

# =====================================
# Generated Files
# =====================================
Expand All @@ -107,6 +114,7 @@ processing_report_*.json
*.tmPreferences.cache
*.stTheme.cache
*.code-workspace
HathiTrust.code-workspace

# =====================================
# OS-Specific Files
Expand Down
111 changes: 111 additions & 0 deletions WINDOWS_BUILD_QUICKSTART.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,111 @@
================================================================================
QUICK START: BUILD WINDOWS EXECUTABLE
================================================================================

YOU ARE HERE: WSL built Linux executable, need Windows .exe for testing
SOLUTION: Build on Windows using Python on Windows

================================================================================
FASTEST PATH (30 minutes total)
================================================================================

STEP 1: Install Python on Windows (10 min)
-------------------------------------------
1. Download: https://www.python.org/downloads/windows/
Get: python-3.12.7-amd64.exe
2. Run installer
3. ✓ CHECK: "Add Python 3.12 to PATH"
4. Click "Install Now"

STEP 2: Copy Project to Windows (2 min)
----------------------------------------
Open Windows File Explorer:
From: \\wsl$\Ubuntu\home\schipp0\Digitization\HathiTrust
To: C:\HathiTrust

Or in PowerShell:
xcopy \\wsl$\Ubuntu\home\schipp0\Digitization\HathiTrust C:\HathiTrust /E /I /H

STEP 3: Run Setup Script (10 min)
----------------------------------
1. Open PowerShell in C:\HathiTrust
2. If execution policy error:
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
3. Run:
.\setup_windows_build.ps1
4. When prompted, answer 'y' to build now

STEP 4: Copy to Flash Drive (2 min)
------------------------------------
xcopy C:\HathiTrust\dist\HathiTrust-Automation D:\HathiTrust-Automation /E /I /H

STEP 5: Test (1 min)
--------------------
D:\RUN_ME.bat

DONE! ✓

================================================================================
WHAT THE SETUP SCRIPT DOES
================================================================================

1. Checks Python installed
2. Creates virtual environment (venv)
3. Activates venv
4. Installs: PyQt6, Pillow, PyYAML, pytesseract, pyinstaller
5. Offers to build executable immediately

ALL AUTOMATIC!

================================================================================
MANUAL METHOD (if script fails)
================================================================================

In PowerShell at C:\HathiTrust:

python -m venv venv
.\venv\Scripts\Activate.ps1
python -m pip install --upgrade pip
pip install PyQt6 Pillow PyYAML pytesseract pyinstaller
python build_scripts/build_windows.py

================================================================================
FILES CREATED
================================================================================

After build:
C:\HathiTrust\dist\HathiTrust-Automation\HathiTrust-Automation.exe ← Windows!

After copy:
D:\HathiTrust-Automation\HathiTrust-Automation.exe ← Ready to test!

================================================================================
TROUBLESHOOTING
================================================================================

"python is not recognized"
→ Reinstall Python, check "Add to PATH"

"cannot be loaded because running scripts is disabled"
→ Run: Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

Build fails
→ Check all dependencies installed: pip list
→ Try: pip install pyinstaller --force-reinstall

.exe crashes when run
→ Install Tesseract on Windows
→ Run from command line to see errors

================================================================================
WHY THIS IS NECESSARY
================================================================================

PyInstaller creates executables for the OS it runs on:
- WSL (Linux) → Linux executable (no .exe)
- Windows → Windows executable (.exe)
- macOS → macOS app bundle

You can't cross-compile. Must build on target OS.

================================================================================
94 changes: 94 additions & 0 deletions create_hathitrust_volumes.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,94 @@
"""
Generate HathiTrust-compliant test volumes with barcode identifiers
"""

import os
from PIL import Image, ImageDraw

def create_hathitrust_test_volumes(base_dir='test_volumes_barcode'):
"""Create test volumes with proper HathiTrust barcode identifiers"""

os.makedirs(base_dir, exist_ok=True)
print(f"Creating HathiTrust test volumes in: {os.path.abspath(base_dir)}")
print()

# Use realistic barcode identifiers (14-digit barcodes)
volumes = [
('39015012345678', 5, 'Small volume - barcode format'),
('39015087654321', 10, 'Medium volume - barcode format'),
('39015011111111', 15, 'Large volume - barcode format'),
]

for barcode, num_pages, description in volumes:
vol_dir = os.path.join(base_dir, barcode)
os.makedirs(vol_dir, exist_ok=True)

print(f"Creating {barcode}: {description}")

for page_num in range(1, num_pages + 1):
# Create test image
img = Image.new('RGB', (800, 1000), color='white')
draw = ImageDraw.Draw(img)

# Draw border
draw.rectangle([50, 50, 750, 950], outline='black', width=3)

# Add text content
text_lines = [
f'BARCODE: {barcode}',
f'Page {page_num} of {num_pages}',
'',
'HathiTrust Package Test Volume',
'=' * 40,
'',
'Sample Text for OCR Testing',
'',
'Lorem ipsum dolor sit amet, consectetur',
'adipiscing elit. Sed do eiusmod tempor',
'incididunt ut labore et dolore magna.',
'',
'This volume uses proper HathiTrust',
'naming conventions with a 14-digit',
'barcode identifier.',
'',
f'Sequence: {page_num:08d}',
]

y_pos = 120
for line in text_lines:
draw.text((100, y_pos), line, fill='black')
y_pos += 45

# Add page number at bottom
draw.text((350, 920), f'- {page_num} -', fill='black')

# Save with HathiTrust naming: 00000001.tif (just the sequence number)
filename = f'{page_num:08d}.tif'
filepath = os.path.join(vol_dir, filename)
img.save(filepath, 'TIFF', compression='none')

print(f" ✓ Created {num_pages} pages: {barcode}/00000001.tif to {barcode}/{num_pages:08d}.tif")

print()
print("=" * 60)
print("HathiTrust test volumes created successfully!")
print("=" * 60)
print(f"Location: {os.path.abspath(base_dir)}")
print(f"Total volumes: {len(volumes)}")
print(f"Total pages: {sum(v[1] for v in volumes)}")
print()
print("Volume structure (HathiTrust compliant):")
for barcode, num_pages, _ in volumes:
print(f" {barcode}/ (barcode identifier)")
print(f" 00000001.tif")
print(f" 00000002.tif")
print(f" ...")
print(f" {num_pages:08d}.tif")
print()
print("ZIP package names will be:")
for barcode, _, _ in volumes:
print(f" {barcode}.zip")
print()

if __name__ == '__main__':
create_hathitrust_test_volumes()
91 changes: 91 additions & 0 deletions create_test_volumes.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,91 @@
"""
Generate test TIFF volumes for HathiTrust testing
Creates multiple volumes with properly named TIFF files
"""

import os
from PIL import Image, ImageDraw

def create_test_volumes(base_dir='test_volumes'):
"""Create test volumes with TIFF files"""

# Create base directory
os.makedirs(base_dir, exist_ok=True)
print(f"Creating test volumes in: {os.path.abspath(base_dir)}")
print()

# Volume configurations: (volume_name, num_pages, description)
volumes = [
('volume_1_small', 5, 'Small volume - 5 pages'),
('volume_2_medium', 10, 'Medium volume - 10 pages'),
('volume_3_large', 15, 'Large volume - 15 pages'),
]

for vol_name, num_pages, description in volumes:
vol_dir = os.path.join(base_dir, vol_name)
os.makedirs(vol_dir, exist_ok=True)

print(f"Creating {vol_name}: {description}")

for page_num in range(1, num_pages + 1):
# Create test image (800x1000 pixels - typical book page)
img = Image.new('RGB', (800, 1000), color='white')
draw = ImageDraw.Draw(img)

# Draw border
draw.rectangle([50, 50, 750, 950], outline='black', width=3)

# Add text content
text_lines = [
f'{vol_name.upper()}',
f'Page {page_num} of {num_pages}',
'',
'Sample Text for OCR Testing',
'=' * 40,
'',
'Lorem ipsum dolor sit amet, consectetur',
'adipiscing elit. Sed do eiusmod tempor',
'incididunt ut labore et dolore magna.',
'',
'This is a test page generated for',
'HathiTrust Package Automation testing.',
'',
'Page number: ' + str(page_num).zfill(8),
]

y_pos = 120
for line in text_lines:
draw.text((100, y_pos), line, fill='black')
y_pos += 50

# Add page number at bottom
draw.text((350, 920), f'- {page_num} -', fill='black')

# Save with proper naming: 00000001.tif, 00000002.tif, etc.
filename = f'{page_num:08d}.tif'
filepath = os.path.join(vol_dir, filename)

# Save as uncompressed TIFF
img.save(filepath, 'TIFF', compression='none')

print(f" ✓ Created {num_pages} pages in {vol_name}/")

print()
print("=" * 60)
print("Test volumes created successfully!")
print("=" * 60)
print(f"Location: {os.path.abspath(base_dir)}")
print(f"Total volumes: {len(volumes)}")
print(f"Total pages: {sum(v[1] for v in volumes)}")
print()
print("Volume structure:")
for vol_name, num_pages, _ in volumes:
print(f" {vol_name}/")
print(f" 00000001.tif")
print(f" 00000002.tif")
print(f" ...")
print(f" {num_pages:08d}.tif")
print()

if __name__ == '__main__':
create_test_volumes()
Loading

0 comments on commit 72deb4f

Please sign in to comment.