Start with gentle denoising
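Run a median filter with a 1px radius (a 3×3 window) to knock out salt-and-pepper noise without softening real edges. Follow with adaptive thresholding, which compares each pixel to its local neighborhood, to keep ink dark and backgrounds bright even when the page is unevenly lit.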
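To make the two steps concrete, here is a minimal pure-Python sketch that treats a grayscale scan as a list of rows with values from 0 to 255. The function names, the `radius` and `offset` defaults, and the mean-based threshold rule are illustrative assumptions, not the settings of any particular tool. In practice an image editor or imaging library does this far faster.

```python
from statistics import median


def median_filter_3x3(img):
    """Replace each pixel with the median of its 3x3 neighborhood.

    Border pixels use only the neighbors that exist.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            ]
            out[y][x] = int(median(window))
    return out


def adaptive_threshold(img, radius=7, offset=10):
    """Binarize against the local mean (assumed rule, for illustration).

    A pixel becomes ink (0) when it is darker than the mean of its
    neighborhood minus `offset`; otherwise it becomes background (255).
    """
    h, w = len(img), len(img[0])
    out = [[255] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[ny][nx]
                for ny in range(max(0, y - radius), min(h, y + radius + 1))
                for nx in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            local_mean = sum(window) / len(window)
            if img[y][x] < local_mean - offset:
                out[y][x] = 0
    return out
```

Call `adaptive_threshold(median_filter_3x3(page))` to run the steps in the order described. A larger `offset` drops more faint marks, so keep it low for light handwriting.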
For pencil marks, duplicate the layer and blend the copy over the original in "Multiply" mode before thresholding. Multiply darkens mid-gray strokes while leaving white paper white, so faint graphite is not lost in the binarization step.
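The arithmetic behind a Multiply blend is simple enough to sketch. The version below assumes 8-bit grayscale values stored as lists of rows, and the name `multiply_blend` is hypothetical. Each output pixel is the product of the two layers scaled back to the 0 to 255 range.

```python
def multiply_blend(base, top):
    """Multiply two same-sized grayscale layers: out = base * top / 255."""
    return [
        [(b * t) // 255 for b, t in zip(base_row, top_row)]
        for base_row, top_row in zip(base, top)
    ]


# Blending a layer with its own duplicate, as described above.
layer = [[255, 180, 0]]
darkened = multiply_blend(layer, layer)
```

White paper (255) stays at 255 and a light graphite value of 180 drops to 127, which widens the gap between stroke and background before thresholding.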
Straighten and crop before OCR
Use the rotation handles in Image to Text or your favorite photo editor to level the baseline of the text. Cropping out margins keeps the OCR engine on the text region and cuts processing time.
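If you would rather crop programmatically than by hand, one possible approach is to trim the page to the bounding box of its dark pixels. The sketch below is a pure-Python illustration on a grayscale list of rows. The function name, the `ink_max` cutoff, and the `pad` margin are all assumptions to tune for your scans.

```python
def crop_to_ink(img, ink_max=128, pad=1):
    """Crop to the bounding box of pixels at or below `ink_max`.

    `pad` keeps a few pixels of margin around the text. If the page
    contains no ink at all, it is returned unchanged.
    """
    height, width = len(img), len(img[0])
    rows = [y for y, row in enumerate(img) if any(p <= ink_max for p in row)]
    cols = [x for x in range(width) if any(row[x] <= ink_max for row in img)]
    if not rows:
        return img
    top = max(0, rows[0] - pad)
    bottom = min(height, rows[-1] + 1 + pad)
    left = max(0, cols[0] - pad)
    right = min(width, cols[-1] + 1 + pad)
    return [row[left:right] for row in img[top:bottom]]
```

Run this after denoising. Otherwise a single dark speck in the margin stretches the bounding box back out to the page edge.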
Batch pages with similar skew together. Applying the same rotation and crop settings to the whole batch reduces manual adjustments when you process dozens of pages.
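One way to form those batches is to bucket pages by their skew angle. The sketch below assumes you already have a skew estimate in degrees for each page, measured by eye or by whatever tool you use. The name `batch_by_skew` and the 0.5 degree bucket size are hypothetical.

```python
from collections import defaultdict


def batch_by_skew(page_skews, step=0.5):
    """Group page names by skew angle rounded to the nearest `step` degrees.

    `page_skews` maps a page name to its estimated skew in degrees.
    Returns a dict mapping each rounded angle to the pages in that batch.
    """
    batches = defaultdict(list)
    for name, angle in page_skews.items():
        key = round(angle / step) * step
        batches[key].append(name)
    return dict(batches)


skews = {"p1": 1.9, "p2": 2.1, "p3": -0.4, "p4": 2.0}
batches = batch_by_skew(skews)
```

Here pages p1, p2, and p4 land in one batch that can share a single 2 degree correction, while p3 gets its own.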
Add an AI proofing pass
After OCR, run the transcript through the "Clean and clarify" prompt in LocalKit. The AI suggests replacements for low-confidence words and flags sections that need review.
Save custom prompts for brand-specific terminology. Teams processing regulated content can auto-flag words that require legal approval before release.
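The flagging idea can also be approximated outside the AI pass with a plain watchlist search. The sketch below is not LocalKit's API. It is a standalone illustration in which `flag_terms` and the sample watchlist are hypothetical, and matching is whole-word and case-insensitive.

```python
import re


def flag_terms(transcript, watchlist):
    """Return (term, line_number, line_text) for every watchlist hit."""
    if not watchlist:
        return []
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in watchlist) + r")\b",
        re.IGNORECASE,
    )
    hits = []
    for line_no, line in enumerate(transcript.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits.append((match.group(1).lower(), line_no, line.strip()))
    return hits


text = "Our product is Guaranteed to work.\nNo claims here.\nClinically proven results."
hits = flag_terms(text, ["guaranteed", "clinically proven"])
```

Each hit carries its line number, so a reviewer can jump straight to the passage that needs approval.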