title:Swift OCR example
keywords:swift,ocr,macos

# Swift OCR example

## Intro

After getting a new M1 I wanted to try some of the fancy machine learning features that are available.
I got inspired after seeing a new post on reddit about the release of the OCRImage tool. As
it was written in Objective-C (https://www.turbozen.com/sourceCode/ocrImage/), I
decided that reading the code and reimplementing it all in Swift would be a good learning task.

## OCR

The whole working prototype fits in just a few lines of code. All you need to create
is a text recognition request and a handler that collects the results. There are a few examples on
the Apple documentation page.

```swift
import Foundation
import Vision

// Run text recognition on the image at `url`.
func recognizeImageUrl(_ url: URL, _ error: Error?) {
    // The completion handler is called with the recognition results.
    let textRequest = VNRecognizeTextRequest(completionHandler: recognizeTextHandler(request:error:))

    let handler = VNImageRequestHandler(url: url)
    do {
        try handler.perform([textRequest])
    } catch {
        print("Cannot perform request error: \(error)")
    }
}

// Collect the recognized strings and print them.
func recognizeTextHandler(request: VNRequest, error: Error?) {
    print("Start recognize handler")
    if let error = error {
        print("Recognition failed: \(error)")
        return
    }
    guard let observations =
            request.results as? [VNRecognizedTextObservation] else {
        return
    }

    let recognizedStrings = observations.compactMap { observation in
        // Return the string of the top VNRecognizedText instance.
        return observation.topCandidates(1).first?.string
    }
    print("\(recognizedStrings)")
}
```

## How to use

Point the tool at an image file that contains text:

```bash
OCRImage --input-name sample.png
```

The recognized text is printed to stdout.
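
For illustration, here is a minimal sketch of how the `--input-name` flag could be read from the command line using only Foundation. The `inputName(from:)` helper is hypothetical and is not taken from the OCRImage sources, which may parse arguments differently.

```swift
import Foundation

// Hypothetical helper (not from the OCRImage repository): returns the value
// that follows `--input-name`, or nil when the flag is missing or has no
// value after it.
func inputName(from arguments: [String]) -> String? {
    guard let flagIndex = arguments.firstIndex(of: "--input-name"),
          flagIndex + 1 < arguments.count else {
        return nil
    }
    return arguments[flagIndex + 1]
}

// CommandLine.arguments[0] is the path of the executable itself.
if let name = inputName(from: CommandLine.arguments) {
    let url = URL(fileURLWithPath: name)
    print("Would run OCR on \(url.path)")
} else {
    print("usage: OCRImage --input-name <image file>")
}
```

The resulting file URL is what a function such as `recognizeImageUrl` would take as input.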

## Source code

An example Xcode project is located at http://git.main.lv/cgit.cgi/OCRImage.git/  
Clone it and open it in Xcode:
```bash
git clone http://git.main.lv/cgit.cgi/OCRImage.git/  
```

## Future developments

It would be nice to add sorting of the recognized text, and an option to export the location of each piece of text to a JSON file with coordinates.
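
One possible shape for this, as a rough sketch: the `TextPiece` struct, its fields, and the `lineHeight` bucket size below are all assumptions made for illustration and are not the types used in the repository. Coordinates are assumed to be normalized to 0...1 with the origin in the lower-left corner, which is how Vision reports bounding boxes.

```swift
import Foundation

// Hypothetical record for one recognized string and its bounding box.
// Coordinates are assumed normalized (0...1) with a lower-left origin.
struct TextPiece: Codable {
    let text: String
    let x: Double
    let y: Double
    let width: Double
    let height: Double
}

// Sort into reading order: top of the page first, then left to right.
// Pieces are grouped into lines by rounding y to buckets of `lineHeight`
// (an assumed value that would need tuning for real images).
func sortedForReading(_ pieces: [TextPiece], lineHeight: Double = 0.02) -> [TextPiece] {
    // Bucket index of the line a piece falls on; a larger index is higher up.
    func line(_ piece: TextPiece) -> Int {
        return Int((piece.y / lineHeight).rounded())
    }
    return pieces.sorted { a, b in
        if line(a) != line(b) { return line(a) > line(b) }
        return a.x < b.x
    }
}

// Encode the pieces as pretty-printed JSON text.
func jsonString(for pieces: [TextPiece]) throws -> String {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.prettyPrinted, .sortedKeys]
    let data = try encoder.encode(pieces)
    return String(decoding: data, as: UTF8.self)
}

// Made-up sample data, only to show the calls.
let sample = [
    TextPiece(text: "world", x: 0.5, y: 0.80, width: 0.2, height: 0.05),
    TextPiece(text: "footer", x: 0.1, y: 0.10, width: 0.3, height: 0.05),
    TextPiece(text: "Hello", x: 0.1, y: 0.80, width: 0.2, height: 0.05),
]
let ordered = sortedForReading(sample)
if let json = try? jsonString(for: ordered) {
    print(json)
}
```

Bucketing by line keeps the comparison a consistent ordering, at the cost of occasionally splitting one visual line across two buckets when its pieces straddle a bucket boundary.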

Drawing boxes around the detected text on the original image is the next thing to add.
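
Before a box can be drawn, the normalized bounding box has to be mapped to image coordinates. The sketch below shows that mapping with a hypothetical `pixelRect` helper; the `topLeftOrigin` parameter is an assumption to cover drawing code whose y axis points down. For the plain scaling case, Vision also provides `VNImageRectForNormalizedRect`.

```swift
import Foundation

// Hypothetical helper: convert a normalized rectangle (0...1, lower-left
// origin, as Vision reports bounding boxes) into pixel coordinates.
// When `topLeftOrigin` is true the y axis is flipped so the result can be
// used with coordinate systems whose origin is the top-left corner.
func pixelRect(forNormalized rect: CGRect,
               imageWidth: Int,
               imageHeight: Int,
               topLeftOrigin: Bool = false) -> CGRect {
    let w = CGFloat(imageWidth)
    let h = CGFloat(imageHeight)
    let y = topLeftOrigin
        ? (1 - rect.origin.y - rect.size.height) * h   // flip the y axis
        : rect.origin.y * h                            // keep lower-left origin
    return CGRect(x: rect.origin.x * w,
                  y: y,
                  width: rect.size.width * w,
                  height: rect.size.height * h)
}

// Made-up example: a box covering the middle half of the width,
// on a 400x200 pixel image.
let box = CGRect(x: 0.25, y: 0.5, width: 0.5, height: 0.25)
let flipped = pixelRect(forNormalized: box, imageWidth: 400, imageHeight: 200, topLeftOrigin: true)
print(flipped)
```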

There are also a lot of other things that could be added to turn it into full-featured OCR software.


## Links

http://git.main.lv/cgit.cgi/OCRImage.git/  
https://www.turbozen.com/sourceCode/ocrImage/  
https://turbozen.com/sourceCode/ocrImage/ocrImage_src/  
https://developer.apple.com/documentation/vision/recognizing_text_in_images  
http://git.main.lv/cgit.cgi/OCRImage.git/tree/OCRImage/main.swift?h=main