Objective-C OCR detection using macOS’ Vision Framework
by admin on Feb.01, 2024, under News
To utilize the macOS OCR capabilities provided by the Vision framework, we can write an Objective-C program. The Vision framework offers powerful image analysis capabilities, including text recognition.
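First, we need to ensure that the required Vision APIs are available in our environment. The Vision framework itself has shipped since macOS 10.13, but the text recognition request used here (VNRecognizeTextRequest) requires macOS 10.15 (Catalina) or later.
Below is an example of how to use the Vision framework in Objective-C to perform OCR on an image (ocr.m):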
#import <Foundation/Foundation.h>
#import <Vision/Vision.h>
#import <AppKit/AppKit.h>

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        if (argc != 2) {
            NSLog(@"Usage: ./ocr <image-path>");
            return 1;
        }

        // Load the input file given on the command line.
        NSString *imagePath = [NSString stringWithUTF8String:argv[1]];
        NSImage *image = [[NSImage alloc] initWithContentsOfFile:imagePath];
        if (!image) {
            NSLog(@"Failed to load image");
            return 1;
        }

        // Hand the image to Vision as TIFF data.
        VNImageRequestHandler *handler = [[VNImageRequestHandler alloc] initWithData:[image TIFFRepresentation] options:@{}];

        // The completion handler logs the best candidate for each detected text region.
        VNRecognizeTextRequest *textRequest = [[VNRecognizeTextRequest alloc] initWithCompletionHandler:^(VNRequest * _Nonnull request, NSError * _Nullable error) {
            if (error) {
                NSLog(@"Error recognizing text: %@", error.localizedDescription);
                return;
            }
            for (VNRecognizedTextObservation *observation in request.results) {
                NSArray<VNRecognizedText *> *candidates = [observation topCandidates:1];
                if (candidates.count > 0) {
                    VNRecognizedText *topCandidate = candidates[0];
                    NSLog(@"Recognized text: %@", topCandidate.string);
                }
            }
        }];

        // performRequests:error: runs synchronously and returns NO on failure,
        // so check the return value rather than the error pointer.
        NSError *error = nil;
        if (![handler performRequests:@[textRequest] error:&error]) {
            NSLog(@"Failed to perform text recognition: %@", error.localizedDescription);
            return 1;
        }
    }
    return 0;
}
We may compile this using clang with the following command:
clang -fobjc-arc -framework Foundation -framework Vision -framework AppKit ocr.m -o ocr
We may now run this program to perform OCR on an image:
./ocr test-bill.jpg
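The request in ocr.m is created with Vision's default settings. If recognition quality or language matters, the request can be configured before it is performed. The following is a minimal sketch: the helper name MakeConfiguredTextRequest is hypothetical (it is not part of the original program), and the choice of "en-US" is an assumption about the input documents.

```objective-c
#import <Foundation/Foundation.h>
#import <Vision/Vision.h>

// Hypothetical helper (not part of the original ocr.m): builds a text
// recognition request with explicit accuracy and language settings
// instead of relying on the framework defaults.
static VNRecognizeTextRequest *MakeConfiguredTextRequest(VNRequestCompletionHandler completion) {
    VNRecognizeTextRequest *request =
        [[VNRecognizeTextRequest alloc] initWithCompletionHandler:completion];

    // Favor recognition quality over speed; use
    // VNRequestTextRecognitionLevelFast when latency matters more.
    request.recognitionLevel = VNRequestTextRecognitionLevelAccurate;

    // Ask Vision to apply language correction to the recognized text.
    request.usesLanguageCorrection = YES;

    // Assumption: the documents are in English. Adjust as needed.
    request.recognitionLanguages = @[@"en-US"];

    return request;
}
```

In ocr.m, the call to [[VNRecognizeTextRequest alloc] initWithCompletionHandler:...] could be swapped for MakeConfiguredTextRequest(...) with the same completion block; the compile command stays the same.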
It is worth noting that the program accepts not only images (such as PNG, JPG, …) but also PDF documents, since NSImage can load them as well.
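If the NSImage-based loading only yields a single page for a multi-page PDF (an assumption worth checking on your own files), one option is to render each page separately with PDFKit and run OCR on every resulting image. The sketch below is not from the original program: the helper name RenderPDFPages and its scale parameter are hypothetical, chosen for illustration.

```objective-c
#import <Foundation/Foundation.h>
#import <AppKit/AppKit.h>
#import <Quartz/Quartz.h>  // Umbrella framework that includes PDFKit on macOS

// Hypothetical helper: renders every page of the PDF at `path` into an
// NSImage. `scale` multiplies the page's media box size (in points) to
// get a larger bitmap, which may help with small print.
// Returns an empty array if the file cannot be opened as a PDF.
static NSArray<NSImage *> *RenderPDFPages(NSString *path, CGFloat scale) {
    PDFDocument *document =
        [[PDFDocument alloc] initWithURL:[NSURL fileURLWithPath:path]];
    if (!document) {
        return @[];
    }

    NSMutableArray<NSImage *> *pages = [NSMutableArray array];
    for (NSUInteger i = 0; i < document.pageCount; i++) {
        PDFPage *page = [document pageAtIndex:i];
        if (!page) {
            continue;
        }
        NSRect bounds = [page boundsForBox:kPDFDisplayBoxMediaBox];
        NSSize size = NSMakeSize(bounds.size.width * scale,
                                 bounds.size.height * scale);
        // thumbnailOfSize:forBox: returns the rendered page as an NSImage.
        NSImage *rendered = [page thumbnailOfSize:size
                                           forBox:kPDFDisplayBoxMediaBox];
        if (rendered) {
            [pages addObject:rendered];
        }
    }
    return pages;
}
```

Each returned image can be passed through the same VNImageRequestHandler code as in ocr.m. When using this helper, add -framework Quartz to the clang command.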