Pre-processing recipes that clean up noisy scans

Use browser-friendly filters and AI prompts to prep difficult assets before you run Image to Text.

ocr preprocessing guide, clean up scanned documents, remove noise before ocr

Start with gentle denoising

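Run a small median filter (1px radius, which is a 3x3 window) to knock out salt-and-pepper noise without softening real edges the way a blur would. Follow with adaptive thresholding, which picks a threshold per neighborhood instead of one for the whole page, to keep ink dark and backgrounds bright.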
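
The two steps above can be sketched in plain Python. This is a minimal illustration, not LocalKit's implementation: it assumes an 8-bit grayscale image stored as a list of rows, reads "1px" as a 3x3 window, and uses a local-mean threshold. The function names and the `radius` and `offset` defaults are hypothetical.

```python
from statistics import median

def median_filter_3x3(img):
    """Replace each pixel with the median of its 3x3 neighborhood (borders clamped)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            out[y][x] = int(median(window))
    return out

def adaptive_threshold(img, radius=7, offset=10):
    """Binarize against the local mean: ink (0) if darker than mean - offset, else paper (255)."""
    h, w = len(img), len(img[0])
    out = [[255] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(y - radius, 0), min(y + radius + 1, h))
            xs = range(max(x - radius, 0), min(x + radius + 1, w))
            local_mean = sum(img[j][i] for j in ys for i in xs) / (len(ys) * len(xs))
            out[y][x] = 0 if img[y][x] < local_mean - offset else 255
    return out

# A single dark speck on light paper disappears after the median pass.
noisy = [[200] * 5 for _ in range(5)]
noisy[2][2] = 0
clean = median_filter_3x3(noisy)

# A dark vertical stroke survives binarization; the paper turns white.
page = [[200, 200, 50, 200, 200] for _ in range(5)]
binary = adaptive_threshold(page, radius=2, offset=10)
```

The nested loops are slow on full-size scans. They are written this way only to make the arithmetic visible.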

For pencil marks, duplicate the layer and blend using "Multiply" before thresholding. This preserves faint graphite strokes so they are not lost in the binarization step.
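
To see why Multiply helps, here is a small sketch of the blend on 8-bit grayscale values. It assumes the usual Multiply formula (`base * top / 255`); the helper name is hypothetical. White paper stays white while mid-gray pencil gets noticeably darker.

```python
def multiply_blend(base, top):
    """Per-pixel Multiply blend for 8-bit grayscale: result = base * top / 255."""
    return [
        [(b * t) // 255 for b, t in zip(base_row, top_row)]
        for base_row, top_row in zip(base, top)
    ]

page = [[255, 180, 60]]                # paper, faint pencil, dark ink
darkened = multiply_blend(page, page)  # duplicate layer blended onto itself
print(darkened)                        # [[255, 127, 14]]
```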

Straighten and crop before OCR

Use the rotation handles in Image to Text or your favorite photo editor to level the baseline of the text. Cropping out margins keeps the recognizer focused on the text itself and cuts processing time.
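
Margin cropping can also be scripted once a page is binarized. The sketch below is an illustration under assumptions: the image is a list of rows with ink stored as 0, and both function names are hypothetical. It finds the smallest box that contains every ink pixel and crops to it with optional padding.

```python
def ink_bounding_box(binary, ink=0):
    """Return (top, bottom, left, right) of all ink pixels, or None for a blank page."""
    rows = [y for y, row in enumerate(binary) if ink in row]
    if not rows:
        return None
    cols = [x for x in range(len(binary[0])) if any(row[x] == ink for row in binary)]
    return rows[0], rows[-1], cols[0], cols[-1]

def crop(img, box, pad=0):
    """Crop img to box (inclusive bounds), widened by pad pixels and clamped to the image."""
    top, bottom, left, right = box
    h, w = len(img), len(img[0])
    top, left = max(top - pad, 0), max(left - pad, 0)
    bottom, right = min(bottom + pad, h - 1), min(right + pad, w - 1)
    return [row[left:right + 1] for row in img[top:bottom + 1]]

scan = [[255] * 6 for _ in range(5)]
scan[1][2] = 0
scan[3][4] = 0
box = ink_bounding_box(scan)   # (1, 3, 2, 4)
cropped = crop(scan, box)      # 3 rows by 3 columns
```

A few pixels of padding usually helps, since text that touches the image border can be harder to recognize.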

Batch assets with similar skew together. Applying the same rotation and crop settings reduces manual adjustments when you process dozens of pages.
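
One way to form those batches is to bucket pages by their measured skew. This is a hedged sketch: it assumes you already have a skew angle in degrees for each page, and the function name and the 0.5 degree step are hypothetical.

```python
from collections import defaultdict

def batch_by_skew(skews, step=0.5):
    """Group page names by skew angle rounded to the nearest `step` degrees."""
    batches = defaultdict(list)
    for name, angle in skews.items():
        bucket = round(angle / step) * step
        batches[bucket].append(name)
    return dict(batches)

measured = {"p1.png": 1.1, "p2.png": 0.9, "p3.png": -2.1, "p4.png": 1.2}
batches = batch_by_skew(measured)
# {1.0: ['p1.png', 'p2.png', 'p4.png'], -2.0: ['p3.png']}
```

Each bucket can then share a single rotation setting equal to the negative of its key.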

Add an AI proofing pass

After OCR, run the transcript through the "Clean and clarify" prompt in LocalKit. The AI suggests replacements for low-confidence words and flags sections that need review.
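
If your OCR output includes per-word confidence scores, you can mark the doubtful words before the proofing pass so a reviewer or prompt knows where to look. The sketch below is an assumption about data shape, not LocalKit's format: words arrive as `(text, confidence)` pairs, and the `[[...]]` marker and 0.80 threshold are hypothetical.

```python
def low_confidence_words(words, threshold=0.80):
    """Return (position, text) for every word scored below the threshold."""
    return [(i, text) for i, (text, conf) in enumerate(words) if conf < threshold]

def mark_for_review(words, threshold=0.80):
    """Rebuild the transcript, wrapping low-confidence words in [[...]] markers."""
    return " ".join(
        f"[[{text}]]" if conf < threshold else text for text, conf in words
    )

ocr_words = [("Invoice", 0.98), ("t0tal", 0.41), ("due", 0.93)]
flagged = low_confidence_words(ocr_words)  # [(1, 't0tal')]
marked = mark_for_review(ocr_words)        # 'Invoice [[t0tal]] due'
```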

Save custom prompts for brand-specific terminology. Teams processing regulated content can auto-flag words that require legal approval before release.
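
A simple term list can approximate the auto-flagging step outside of a prompt. This is a minimal sketch with a hypothetical function name and example terms: it matches whole words case-insensitively and reports where each restricted term appears.

```python
import re

def flag_terms(text, restricted_terms):
    """Return sorted (offset, matched_text, term) for each whole-word, case-insensitive hit."""
    hits = []
    for term in restricted_terms:
        pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            hits.append((match.start(), match.group(0), term))
    return sorted(hits)

transcript = "Our Guaranteed returns are risk-free."
hits = flag_terms(transcript, ["guaranteed", "risk-free"])
for offset, found, term in hits:
    print(f"{offset}: '{found}' needs approval (rule: {term})")
```

The `\b` boundaries stop a term from matching inside a longer word. They only work as expected for terms that start and end with a letter, digit, or underscore.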

Frequently asked questions

Should I sharpen before or after OCR?
Sharpen before OCR, after denoising. Pre-OCR sharpening enhances the edges the model relies on, whereas sharpening afterwards cannot correct text that was already misread.
Can I automate these recipes?
Yes. LocalKit Pro includes batch recipes that apply denoise, rotate, and prompt clean-up steps in one pass. Use them on drop folders or Cloudflare Worker queues.
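
Conceptually, a batch recipe is an ordered list of steps applied to every page. The sketch below shows that idea only. It is not the LocalKit Pro API, and the function names and toy steps are hypothetical stand-ins for real denoise, rotate, and clean-up steps.

```python
def run_recipe(page, steps):
    """Apply each step to the page in order and return the final result."""
    for step in steps:
        page = step(page)
    return page

def run_batch(pages, steps):
    """Run the same recipe over every named page."""
    return {name: run_recipe(page, steps) for name, page in pages.items()}

# Toy steps standing in for real filters.
def invert(img):
    return [[255 - p for p in row] for row in img]

def trim_top_row(img):
    return img[1:]

pages = {"a.png": [[0, 255], [255, 0]]}
results = run_batch(pages, [invert, trim_top_row])  # {'a.png': [[0, 255]]}
```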