diff --git a/brodie_code/README.md b/brodie_code/README.md
index 237ee09c..de2cf1f5 100644
--- a/brodie_code/README.md
+++ b/brodie_code/README.md
@@ -1,2 +1,96 @@
-# ePubs_AccProj
-docs.lib.purdue.edu remediation project
+# PDF Processor for Screen Readable Documents
+
+Tool for processing PDFs for accessibility workflows and generating detailed reports.
+
+## Requirements
+
+- Python 3.11+
+- Tesseract OCR
+- MuPDF
+
+### System Dependencies
+
+#### macOS
+```bash
+brew install tesseract
+brew install mupdf
+```
+
+#### Ubuntu/Debian
+```bash
+sudo apt-get install tesseract-ocr
+sudo apt-get install mupdf
+```
+
+#### Windows
+1. Install Tesseract OCR:
+   - Download installer from [UB Mannheim](https://github.com/UB-Mannheim/tesseract/wiki)
+   - Add to PATH: `C:\Program Files\Tesseract-OCR`
+
+2. Install MuPDF:
+   - Download from [MuPDF website](https://mupdf.com/releases/index.html)
+   - Add installation directory to PATH
+
+## Installation
+
+1. Clone repository:
+```bash
+git clone [repository-url]
+cd [repository-name]
+```
+
+2. Create virtual environment:
+```bash
+# macOS/Linux
+python -m venv venv
+source venv/bin/activate
+
+# Windows
+python -m venv venv
+venv\Scripts\activate
+```
+
+3. Install Python dependencies:
+```bash
+pip install -r requirements.txt
+```
+
+## Project Structure
+```
+.
+├── input/              # Place PDFs here for processing
+├── output/             # Processed files and reports
+├── src/
+│   └── accessibility_checker/
+├── config.yaml         # Configuration settings
+└── requirements.txt
+```
+
+## Usage
+
+1. Place PDFs in the `input` directory
+2. Run the processor:
+```bash
+# macOS/Linux
+python src/main.py
+
+# Windows
+python src\main.py
+```
+
+## Output
+
+The tool generates:
+- Processed PDFs with enhanced accessibility
+- Accessibility violation reports
+- OCR results
+- Processing statistics
+
+Results are organized in the `output` directory structure.
+
+## Troubleshooting
+
+### Windows-Specific Issues
+- If Tesseract isn't found: Verify PATH includes `C:\Program Files\Tesseract-OCR`
+- If MuPDF isn't found: Add MuPDF installation directory to PATH
+- Command prompt might require admin privileges for first run
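
The PATH checks listed under Troubleshooting could be automated before the first run. The following is a minimal sketch and is not part of this change: the helper name `find_missing_tools` is hypothetical, and it assumes the Tesseract and MuPDF command-line executables are installed under their usual names, `tesseract` and `mutool`.

```python
import shutil


def find_missing_tools(tools=("tesseract", "mutool"), path=None):
    """Return the names in `tools` that cannot be found on PATH.

    `path` is passed through to shutil.which; None means the current
    process's PATH environment variable is searched.
    """
    return [name for name in tools if shutil.which(name, path=path) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        # Hypothetical message wording; adapt to the project's logging.
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All external tools found.")
```

Running this inside the activated virtual environment reports which executables still need to be added to PATH. It only checks that the executables are discoverable, not that they are the right version.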