beta.blog

Objective-C OCR detection using macOS’ Vision Framework

by on Feb.01, 2024, under News

The Vision framework on macOS offers powerful image analysis capabilities, including text recognition (OCR). To make use of them, we can write a small Objective-C command-line program.

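First, we need to make sure the required API is available in our environment. The Vision framework itself has shipped since macOS 10.13 (High Sierra), but the text recognition request used below, VNRecognizeTextRequest, requires macOS 10.15 (Catalina) or later.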
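If the program might run on older systems, availability can be checked at runtime before touching the text recognition API. The following is a minimal sketch using clang's @available check; the log messages are only illustrative, and the check only matters when the binary is built with a deployment target below macOS 10.15.

```objective-c
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        // VNRecognizeTextRequest was introduced with macOS 10.15,
        // so bail out early on anything older.
        if (@available(macOS 10.15, *)) {
            NSLog(@"Text recognition is available on this system");
        } else {
            NSLog(@"Text recognition requires macOS 10.15 or later");
            return 1;
        }
    }
    return 0;
}
```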

Below is an example of how to use the Vision framework in Objective-C to perform OCR on an image (ocr.m):

#import <Foundation/Foundation.h>
#import <Vision/Vision.h>
#import <AppKit/AppKit.h>

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        if (argc != 2) {
            NSLog(@"Usage: ./ocr <image-path>");
            return 1;
        }

        NSString *imagePath = [NSString stringWithUTF8String:argv[1]];
        NSImage *image = [[NSImage alloc] initWithContentsOfFile:imagePath];
        if (!image) {
            NSLog(@"Failed to load image");
            return 1;
        }

        // Vision does not accept an NSImage directly, so hand it the image as TIFF data.
        VNImageRequestHandler *handler = [[VNImageRequestHandler alloc] initWithData:[image TIFFRepresentation] options:@{}];

        VNRecognizeTextRequest *textRequest = [[VNRecognizeTextRequest alloc] initWithCompletionHandler:^(VNRequest * _Nonnull request, NSError * _Nullable error) {
            if (error) {
                NSLog(@"Error recognizing text: %@", error.localizedDescription);
                return;
            }

            // Each observation is one detected region of text; print its best candidate.
            for (VNRecognizedTextObservation *observation in request.results) {
                NSArray<VNRecognizedText *> *candidates = [observation topCandidates:1];
                if (candidates.count > 0) {
                    VNRecognizedText *topCandidate = candidates[0];
                    NSLog(@"Recognized text: %@", topCandidate.string);
                }
            }
        }];

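        // performRequests: runs synchronously, so the completion handler above
        // has already been called by the time it returns. Check the BOOL return
        // value rather than the error pointer to detect failure.
        NSError *error = nil;
        if (![handler performRequests:@[textRequest] error:&error]) {
            NSLog(@"Failed to perform text recognition: %@", error.localizedDescription);
            return 1;
        }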
    }
    return 0;
}

We can compile this with clang using the following command:

clang -fobjc-arc -framework Foundation -framework Vision -framework AppKit ocr.m -o ocr

We can now run the program to perform OCR on an image:

./ocr test-bill.jpg

It is worth noting that the program does not only accept images (such as PNG or JPEG), but also PDF documents, since NSImage can load those as well.
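The program above runs the request with its default settings and prints only the recognized strings. As a rough sketch of what else the request offers, the variant below sets the recognition level, language correction and languages explicitly, and also prints each candidate's confidence and the bounding box of the observation. The file name ocr-verbose.m and the language code en-US are assumptions chosen for illustration. The bounding box is reported in normalized coordinates (0 to 1) with the origin in the lower-left corner of the image.

```objective-c
#import <Foundation/Foundation.h>
#import <Vision/Vision.h>
#import <AppKit/AppKit.h>

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        if (argc != 2) {
            NSLog(@"Usage: ./ocr-verbose <image-path>");
            return 1;
        }

        NSString *imagePath = [NSString stringWithUTF8String:argv[1]];
        NSImage *image = [[NSImage alloc] initWithContentsOfFile:imagePath];
        NSData *imageData = [image TIFFRepresentation];
        if (!imageData) {
            NSLog(@"Failed to load image");
            return 1;
        }

        // No completion handler here: performRequests: is synchronous,
        // so the results can be read from the request afterwards.
        VNRecognizeTextRequest *textRequest = [[VNRecognizeTextRequest alloc] init];
        // Accurate is slower than Fast but usually gives better results.
        textRequest.recognitionLevel = VNRequestTextRecognitionLevelAccurate;
        textRequest.usesLanguageCorrection = YES;
        // Assumed language for this example; adjust to the documents at hand.
        textRequest.recognitionLanguages = @[@"en-US"];

        VNImageRequestHandler *handler =
            [[VNImageRequestHandler alloc] initWithData:imageData options:@{}];

        NSError *error = nil;
        if (![handler performRequests:@[textRequest] error:&error]) {
            NSLog(@"Failed to perform text recognition: %@", error.localizedDescription);
            return 1;
        }

        for (VNRecognizedTextObservation *observation in textRequest.results) {
            VNRecognizedText *top = [observation topCandidates:1].firstObject;
            if (!top) {
                continue;
            }
            CGRect box = observation.boundingBox;
            NSLog(@"confidence=%.2f box=(x=%.3f y=%.3f w=%.3f h=%.3f) text=%@",
                  top.confidence,
                  box.origin.x, box.origin.y, box.size.width, box.size.height,
                  top.string);
        }
    }
    return 0;
}
```

This variant should compile with the same clang command as before, with ocr-verbose.m and ocr-verbose substituted for the file names.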

