Milan Todorovic
Let your iOS app read texts
#1 about 4 minutes
Introduction to the Vision framework for text recognition
The Vision framework simplifies incorporating optical character recognition (OCR) into iOS and macOS applications using Swift.
#2 about 4 minutes
Understanding the core Vision request workflow
The fundamental process involves creating an image request handler, defining a request, and then having the handler perform that request to produce results.
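The three steps above can be sketched in a few lines of Swift. This is a minimal illustration rather than the code shown in the talk: the function name `performRequest(_:on:)` is hypothetical, the input is assumed to be a `CGImage`, and `results` is assumed to be typed as `[VNObservation]?`, as in recent SDKs.

```swift
import Vision
import CoreGraphics

/// Hypothetical helper: runs one Vision request against an image
/// and returns whatever observations the request produced.
func performRequest(_ request: VNRequest, on image: CGImage) throws -> [VNObservation] {
    // 1. Create an image request handler bound to the image.
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    // 2. The request itself is defined by the caller and passed in.
    // 3. Perform the request; this call is synchronous and may throw.
    try handler.perform([request])
    // After perform returns, the request carries its results.
    return request.results ?? []
}
```

A caller could pass any `VNRequest` subclass here, for example `VNDetectTextRectanglesRequest()`. Because `perform` blocks until the work is done, it is usually called off the main thread.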
#3 about 2 minutes
Simplifying text recognition with VNRecognizeTextRequest
The modern API streamlines text recognition by using the VNRecognizeTextRequest class, which returns candidate strings directly.
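As a rough sketch of what "returns candidate strings directly" means in practice: Vision's `VNRecognizeTextRequest` yields `VNRecognizedTextObservation` objects, and each observation can be asked for its top candidates. The helper names `bestStrings(from:)` and `recognizeText(in:)` are hypothetical, and the code assumes an SDK in which `results` on the request is already typed as `[VNRecognizedTextObservation]?`.

```swift
import Vision
import CoreGraphics

/// Hypothetical helper: keeps the single best candidate per observation.
func bestStrings(from observations: [VNRecognizedTextObservation]) -> [String] {
    // topCandidates(1) returns at most one VNRecognizedText, ordered by confidence.
    observations.compactMap { $0.topCandidates(1).first?.string }
}

/// Hypothetical helper: recognizes text in an image with default settings.
func recognizeText(in image: CGImage) throws -> [String] {
    let request = VNRecognizeTextRequest()
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])
    return bestStrings(from: request.results ?? [])
}
```

Asking for more than one candidate, for example `topCandidates(3)`, gives alternative readings that an app could use for its own validation.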
#4 about 3 minutes
Choosing between fast and accurate recognition modes
A comparison of the 'fast' mode, which uses character detection, and the 'accurate' mode, which uses a neural network for whole-word recognition.
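Switching between the two modes is a single property on the request. The sketch below assumes a hypothetical factory function, `makeTextRequest(preferSpeed:)`; only `recognitionLevel` and its `.fast` and `.accurate` cases come from the Vision API.

```swift
import Vision

/// Hypothetical factory: picks the recognition mode from a simple flag.
func makeTextRequest(preferSpeed: Bool) -> VNRecognizeTextRequest {
    let request = VNRecognizeTextRequest()
    // .fast trades accuracy for latency; .accurate is slower but more robust.
    request.recognitionLevel = preferSpeed ? .fast : .accurate
    return request
}
```

A live camera preview might reasonably pass `preferSpeed: true`, while a one-off scan of a document page would more likely use the accurate mode.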
#5 about 4 minutes
Implementing the full workflow with advanced options
A complete code walkthrough shows how to set up the request, handle completion, and improve results with language correction and custom lexicons.
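A request with a completion handler, language correction, and a custom lexicon might look roughly like this. It is a sketch under stated assumptions, not the code from the talk: `makeConfiguredRequest(customWords:onStrings:)` and its callback are hypothetical names, while `usesLanguageCorrection`, `customWords`, and `recognitionLevel` are properties of `VNRecognizeTextRequest`.

```swift
import Vision

/// Hypothetical factory: builds a text request that reports recognized
/// strings through a callback once the handler has performed it.
func makeConfiguredRequest(customWords: [String],
                           onStrings: @escaping ([String]) -> Void) -> VNRecognizeTextRequest {
    let request = VNRecognizeTextRequest { finished, error in
        // Inside the handler the request is a plain VNRequest, so cast the results.
        guard error == nil,
              let observations = finished.results as? [VNRecognizedTextObservation] else {
            onStrings([])
            return
        }
        onStrings(observations.compactMap { $0.topCandidates(1).first?.string })
    }
    request.recognitionLevel = .accurate
    // Language correction must be on for the custom words to take effect.
    request.usesLanguageCorrection = true
    request.customWords = customWords
    return request
}
```

The returned request is then passed to `VNImageRequestHandler.perform(_:)` as usual. Custom words are useful for brand names or domain terms that language correction would otherwise "fix" into ordinary words.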
#6 about 6 minutes
Live demo of scanning printed text from a book
A practical demonstration using a sample app to scan a page from a printed book, showing the high accuracy of the Vision framework.
#7 about 3 minutes
Demonstrating business card and receipt scanning
The demo continues by scanning a business card and a multi-language receipt, highlighting both successes and potential challenges with complex layouts.
#8 about 3 minutes
Recognizing handwritten text and a brief code overview
The final demo shows the framework's capability to recognize handwritten text, followed by a quick look at the relevant Swift code in the sample project.
#9 about 5 minutes
Resources and other capabilities of the Vision framework
Learn where to find documentation and tutorials, and discover other Vision features like hand and body pose detection or image classification.
#10 about 3 minutes
On-device processing and cross-platform considerations
The benefits of on-device processing for speed, security, and privacy are discussed, along with potential alternatives for Android and Flutter developers.