Converting image-based text (scans, photos, fax PDFs) into searchable, editable content is foundational for archives, legal records, back-office automation, and analytics. This is Optical Character Recognition (OCR)—and in .NET, IronOCR gives you a high-quality engine with a friendly API.
This post walks through a production-ready C# console app that runs OCR reliably. We’ll cover clean setup, secure licensing, resource management, accuracy tips, and professional error handling. A complete code sample is included.
TL;DR
- Set your IronOCR license (don’t just validate it).
- Validate the input file before work begins.
- Dispose OCR resources (IronTesseract, OcrInput) with using declarations to prevent leaks.
- (Optionally) preprocess scans (DeNoise, Deskew) for accuracy.
- Handle exceptions specifically (permissions, missing files) and generally (unexpected issues).
Prerequisites
- .NET 6+ (or .NET Framework)
- NuGet package: IronOcr (install with: dotnet add package IronOcr)
- A sample image/PDF containing clearly legible text
- A valid IronOCR license or trial key
🔐 Security tip: Never hardcode secrets in source. Use environment variables or a secure secrets store in real apps.
Section 1: Setup, Licensing, and Preparation
A reliable OCR tool begins with predictable setup and early exits when something’s wrong.
1.1 Applying and Validating the License Key 🔑
// 1) SET the license (don’t only validate)
const string licenseKey = "YOUR-LICENSE-OR-TRIAL-KEY";
IronOcr.License.LicenseKey = licenseKey;

// Optional: also check validity for a friendly message
if (!IronOcr.License.IsValidLicense(licenseKey))
{
    Console.WriteLine("License is invalid or expired. Please verify your IronOCR key.");
    return;
}
- IronOcr.License.LicenseKey = licenseKey; activates the engine.
- Validation lets you fail early with a clear message.
- Best practice: store keys in an environment variable (e.g., IRONOCR_LICENSE_KEY) instead of source.
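One way to follow the environment-variable advice is sketched below. LicenseConfig and TryReadKey are hypothetical names introduced for illustration and are not part of IronOCR; the only library call used is the standard Environment.GetEnvironmentVariable.

```csharp
using System;

public static class LicenseConfig
{
    // Hypothetical helper (not an IronOCR API): reads the key from an
    // environment variable and reports whether a non-blank value was found.
    public static bool TryReadKey(out string key, string variableName = "IRONOCR_LICENSE_KEY")
    {
        // GetEnvironmentVariable returns null when the variable is not set.
        var value = Environment.GetEnvironmentVariable(variableName);
        key = value?.Trim() ?? string.Empty;
        return key.Length > 0;
    }
}
```

In Main, this would take the place of the hardcoded constant: call LicenseConfig.TryReadKey(out var key), print a message and return when it reports false, and otherwise assign the value to IronOcr.License.LicenseKey.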
1.2 Input Validation and Time Logging ⏱️
// 2) Validate input file
var path = @"D:\D Folder\Sample OCR Text\Demo 1 OCR Image.png";
if (!File.Exists(path))
{
    Console.WriteLine($"File not found: {path}");
    return;
}

Console.WriteLine("OCR started: " + DateTime.Now);
- File.Exists avoids a noisy crash if the path is wrong.
- Timestamps help you measure performance across runs.
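If you want the early exit to cover more than a missing file, a slightly stricter check might look like the sketch below. InputValidator is a hypothetical helper, and the extension list is an assumption about what this tool chooses to accept, not a statement of what IronOCR supports.

```csharp
using System;
using System.IO;
using System.Linq;

public static class InputValidator
{
    // Assumption: extensions this tool accepts; adjust to your own inputs.
    private static readonly string[] AllowedExtensions =
        { ".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp", ".pdf" };

    // Returns true when the path points at a non-empty file with an accepted
    // extension; otherwise sets a human-readable error message.
    public static bool TryValidate(string path, out string error)
    {
        if (string.IsNullOrWhiteSpace(path))
        {
            error = "No input path was provided.";
            return false;
        }
        if (!File.Exists(path))
        {
            error = $"File not found: {path}";
            return false;
        }
        var extension = Path.GetExtension(path);
        if (!AllowedExtensions.Contains(extension, StringComparer.OrdinalIgnoreCase))
        {
            error = $"Unsupported file type: '{extension}'";
            return false;
        }
        if (new FileInfo(path).Length == 0)
        {
            error = $"File is empty: {path}";
            return false;
        }
        error = string.Empty;
        return true;
    }
}
```

A caller would print the error and return when TryValidate reports false, exactly as the File.Exists branch does above.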
Section 2: The Core OCR Execution
This is where IronOCR does the heavy lifting—your job is to be a good steward of resources.
2.1 Essential Resource Management (the using statement)
// 3) Dispose resources properly
using var ocr = new IronTesseract();
using var input = new OcrInput(path);
// Optional: input.DeNoise(); input.Deskew();
Both IronTesseract and OcrInput are IDisposable. The using declarations ensure that buffers, file handles, and native allocations are released at the end of the enclosing scope, even if an exception is thrown.
2.2 Advanced Image Pre-processing for Accuracy
- input.DeNoise() removes speckles/compression artifacts.
- input.Deskew() straightens rotated scans.
These two often give the biggest ROI on noisy office scans.
2.3 Reading the Text and Logging Completion
var result = ocr.Read(input);
Console.WriteLine("OCR completed: " + DateTime.Now);

var text = result.Text;
if (!string.IsNullOrWhiteSpace(text))
    Console.WriteLine(text);
else
    Console.WriteLine("OCR returned no text.");
- ocr.Read(input) runs the full recognition pipeline.
- Log end time to compute duration.
- A whitespace check avoids printing empty output.
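DateTime.Now timestamps are fine for log lines, but subtracting them can be skewed if the system clock is adjusted mid-run. For the duration itself, a Stopwatch is the usual choice. The sketch below wraps any operation in one; Timing and Measure are hypothetical names, not part of IronOCR.

```csharp
using System;
using System.Diagnostics;

public static class Timing
{
    // Hypothetical helper: runs the operation once and returns its result
    // together with the elapsed time measured by a Stopwatch.
    public static (T Result, TimeSpan Elapsed) Measure<T>(Func<T> operation)
    {
        var stopwatch = Stopwatch.StartNew();
        var result = operation();
        stopwatch.Stop();
        return (result, stopwatch.Elapsed);
    }
}
```

With the variables from this section, the read could become var (result, elapsed) = Timing.Measure(() => ocr.Read(input)); followed by a log line such as Console.WriteLine($"OCR took {elapsed.TotalSeconds:F2} s");.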
Section 3: Professional Error Handling
Handle the common, noisy failures first; catch-all last.
catch (UnauthorizedAccessException uae)
{
    Console.WriteLine("Permission error reading the file or folder.");
    Console.WriteLine(uae.Message);
}
catch (FileNotFoundException fnf)
{
    Console.WriteLine("The specified file was not found.");
    Console.WriteLine(fnf.Message);
}
catch (Exception ex)
{
    Console.WriteLine("Unexpected error during OCR:");
    Console.WriteLine(ex.Message);
}
- Specific errors → better guidance (“check permissions”, “path not found”).
- General catch → no unhandled crashes in production.
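When the tool runs from a script or scheduler, the same specific-then-general ordering can also drive the process exit code. The following is a minimal sketch under that assumption; ErrorMapper and the exit code values are hypothetical choices for illustration, not an IronOCR or .NET convention.

```csharp
using System;
using System.IO;

public static class ErrorMapper
{
    // Hypothetical mapping from exception type to an exit code and a hint.
    // Specific types come first; the discard arm is the catch-all.
    public static (int ExitCode, string Hint) Describe(Exception ex) => ex switch
    {
        UnauthorizedAccessException _ => (2, "Check read permissions on the file and its folder."),
        FileNotFoundException _ => (3, "Check that the input path exists."),
        _ => (1, "Unexpected error during OCR: " + ex.Message)
    };
}
```

A single catch (Exception ex) block could then call ErrorMapper.Describe(ex), print the hint, and assign Environment.ExitCode so callers can react to the failure type.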
Complete C# Code Snippet
using System;
using System.IO;
using IronOcr;

class Program
{
    static void Main()
    {
        try
        {
            // 1) SET the license (don’t only validate)
            // Prefer an environment variable in real apps
            const string licenseKey = "YOUR-LICENSE-OR-TRIAL-KEY";
            IronOcr.License.LicenseKey = licenseKey;

            // Optional: also check validity for a friendly message
            if (!IronOcr.License.IsValidLicense(licenseKey))
            {
                Console.WriteLine("License is invalid or expired. Please verify your IronOCR key.");
                return;
            }

            // 2) Validate input file
            var path = @"D:\D Folder\Sample OCR Text\Demo 1 OCR Image.png";
            if (!File.Exists(path))
            {
                Console.WriteLine($"File not found: {path}");
                return;
            }

            Console.WriteLine("OCR started: " + DateTime.Now);

            // 3) Dispose resources properly
            using var ocr = new IronTesseract();
            using var input = new OcrInput(path);

            // Optional preprocessing to improve accuracy on scans
            // input.DeNoise();
            // input.Deskew();

            var result = ocr.Read(input);
            Console.WriteLine("OCR completed: " + DateTime.Now);

            var text = result.Text;
            if (!string.IsNullOrWhiteSpace(text))
                Console.WriteLine(text);
            else
                Console.WriteLine("OCR returned no text.");
        }
        catch (UnauthorizedAccessException uae)
        {
            Console.WriteLine("Permission error reading the file or folder.");
            Console.WriteLine(uae.Message);
        }
        catch (FileNotFoundException fnf)
        {
            Console.WriteLine("The specified file was not found.");
            Console.WriteLine(fnf.Message);
        }
        catch (Exception ex)
        {
            Console.WriteLine("Unexpected error during OCR:");
            Console.WriteLine(ex.Message);
        }
    }
}
Accuracy Boosters (Quick Wins)
- Resolution: Aim for ~300 DPI for documents.
- Lighting: Avoid glare/shadows on camera shots.
- Language: Set the right language(s) for better models:
// Example:
// ocr.Language = OcrLanguage.FromLanguageCode("eng");
// ocr.Language = OcrLanguage.EnglishBest + OcrLanguage.SpanishBest; // mixed docs
- Preprocess: DeNoise, Deskew; consider ToGrayScale and Contrast.
Troubleshooting
- Empty output → try preprocessing, check DPI/clarity, set language correctly.
- “File in use” → ensure you dispose previous readers/writers; run as admin if needed.
- “License invalid” → verify key, expiry, and that you set the key (not just validate).
- Slow on huge PDFs → batch by pages or run on a worker with more memory/CPU.
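The "batch by pages" idea can be sketched independently of any OCR call: split the page indices into fixed-size groups and process one group at a time. PageBatcher is a hypothetical helper, and how each group is loaded (for example, a page-selective PDF loader) depends on your IronOCR version and is deliberately left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PageBatcher
{
    // Hypothetical helper: yields zero-based page indices in groups of at
    // most batchSize. Arguments are checked when enumeration starts.
    public static IEnumerable<int[]> Batches(int totalPages, int batchSize)
    {
        if (totalPages < 0) throw new ArgumentOutOfRangeException(nameof(totalPages));
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        for (int start = 0; start < totalPages; start += batchSize)
        {
            // The last batch may be shorter than batchSize.
            int count = Math.Min(batchSize, totalPages - start);
            yield return Enumerable.Range(start, count).ToArray();
        }
    }
}
```

For a 7-page document and a batch size of 3, this yields the groups {0, 1, 2}, {3, 4, 5}, and {6}. Creating and disposing a fresh OcrInput per group keeps peak memory bounded by the batch size rather than the whole file.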
Next Steps
- Add CLI args (--input, --out, --lang, --preprocess) for flexibility.
- Save searchable PDFs (enable PDF rendering in configuration).
- Parse structured fields (invoice numbers, dates) with regex after OCR.
- Batch a folder, or integrate with queues/cloud storage for automation.
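As an illustration of the regex step, the sketch below pulls an invoice number and a date out of OCR text. FieldParser is a hypothetical helper, and both formats (an INV- prefix followed by at least four digits, and ISO yyyy-MM-dd dates) are assumptions; real documents will need patterns tuned to their layout and to typical OCR misreads.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class FieldParser
{
    // Assumed formats, for illustration only.
    private static readonly Regex InvoiceNumber =
        new Regex(@"\bINV-\d{4,}\b", RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
    private static readonly Regex IsoDate = new Regex(@"\b\d{4}-\d{2}-\d{2}\b");

    // Finds the first invoice number and normalizes it to upper case.
    public static bool TryGetInvoiceNumber(string ocrText, out string invoiceNumber)
    {
        var match = InvoiceNumber.Match(ocrText ?? string.Empty);
        invoiceNumber = match.Success ? match.Value.ToUpperInvariant() : string.Empty;
        return match.Success;
    }

    // Finds the first date-shaped token and checks it is a real calendar date.
    public static bool TryGetDate(string ocrText, out DateTime date)
    {
        date = default;
        var match = IsoDate.Match(ocrText ?? string.Empty);
        return match.Success &&
               DateTime.TryParseExact(match.Value, "yyyy-MM-dd",
                   CultureInfo.InvariantCulture, DateTimeStyles.None, out date);
    }
}
```

Passing result.Text to these methods keeps parsing separate from recognition, so the patterns can be unit tested against saved OCR output without running the engine.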
Conclusion
Reliable OCR isn’t about a single API call—it’s about disciplined setup, resource hygiene, and clear error handling. With IronOCR and the patterns above, you can ship a small tool that performs like an enterprise-grade service: predictable, maintainable, and ready for real documents.
