Checkmate the OCR Challenge: Image to Text Extraction

By jayant kodwani on May 29, 2021

Extract text from images in 3 simple steps

Unleash the power of Microsoft Azure Computer Vision using Python

Checkmate the OCR challenge: Image to Text Extraction using Python & Azure (Photo by GR Stocks on Unsplash)

OCR (optical character recognition), or in layman's terms extracting text from images, is one of the most remarkable capabilities available today. Gone are the days when organizations employed thousands of people just to type out the text from images! Today, a few people can process thousands of images in just a few minutes. Extracting text from images has helped organizations massively improve customer service, free up storage space and secure sensitive data in a compliant manner.

What will we Discuss?

We will learn how to extract text from images using the Microsoft Azure Computer Vision cognitive service.

Use a sample image stored in an Azure Blob Storage container (https://jayantml1356189034.blob.core.windows.net/jayantcontainer/0001.jpg).
Process the given image to extract 100+ words.
Store the extracted result in a local MS Excel file.

Resources Required

Python environment (e.g. Spyder IDE)
Microsoft Azure subscription (to run the Computer Vision cognitive service and the storage service that holds the images in Blob Storage)

Step 1: Create Azure Computer Vision

1.1 Log in to the Azure Portal (https://portal.azure.com/#home) and search for “Computer Vision”.

1.2 Create the Computer Vision service by selecting a subscription, creating a resource group (just a container that binds the resources together), and choosing a location and pricing tier. The free pricing tier (F0) allows 5,000 transactions per month. After you click “Review + Create”, Azure may take a couple of minutes to create the resource.

1.3 Once the Computer Vision service has been created, navigate to “Keys and Endpoint” and copy the key and endpoint details, for example into a text editor.

Keys and Endpoint for the Computer Vision service

⚠️ Please note that the keys and endpoint should not be disclosed to unauthorized people, as misuse may increase your Azure consumption cost. Regenerate the keys if you have accidentally disclosed them.

You are now done with the Azure Portal portion and can move on to Python (Spyder IDE).
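
One way to lower the risk of leaking a key is to keep it out of the script altogether. The following is a minimal sketch that reads both values from environment variables. The names VISION_KEY and VISION_ENDPOINT are assumptions chosen for illustration, not names that Azure requires.

```python
import os


def load_vision_credentials(env=None):
    """Return (key, endpoint) read from environment variables.

    VISION_KEY and VISION_ENDPOINT are hypothetical names chosen for this
    sketch; set them in your shell before starting Python.
    """
    env = os.environ if env is None else env
    key = env.get("VISION_KEY", "").strip()
    endpoint = env.get("VISION_ENDPOINT", "").strip()
    if not key or not endpoint:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(
            "Set VISION_KEY and VISION_ENDPOINT before running the script."
        )
    return key, endpoint
```

With this in place the script can be shared or committed without the key appearing in it.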

Step 2: Install Azure packages

Installation: Open your Python environment (e.g. Spyder 🐍) and run the commands below to install the required Azure packages.

pip install --upgrade azure-cognitiveservices-vision-computervision
pip install pillow
pip install azure-storage-blob
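
To confirm that the installation worked before moving on, a quick check like the one below may help. It is only a sketch: it asks Python whether it can locate the import names that these three packages provide. The helper name is_installed is hypothetical.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate module_name."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # A missing parent package (e.g. "azure") raises ModuleNotFoundError.
        return False


# Import names provided by the three packages installed above.
REQUIRED_MODULES = [
    "azure.cognitiveservices.vision.computervision",
    "PIL",
    "azure.storage.blob",
]

for name in REQUIRED_MODULES:
    print(f"{name}: {'OK' if is_installed(name) else 'MISSING'}")
```

Any line that prints MISSING points to a pip command that needs to be run again.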

Step 3: Python code magic

Use the script below, replace (a) the subscription key and (b) the endpoint, and execute! You can try the script with other images by changing the value of ‘remote_image_handw_text_url’. The script includes self-explanatory comments. Feel free to ask any further questions in the comments section.

This is what your output looks like. You can download the Excel here.
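
A minimal sketch of what such a script might look like, following the Read pattern from the Microsoft quickstart listed in the References. The helper names operation_id_from_location, collect_lines and extract_text_from_url are hypothetical, and the Azure imports sit inside the function so that the two small helpers can be tried without the SDK.

```python
import time


def operation_id_from_location(operation_location):
    """Return the operation ID: the last path segment of the
    Operation-Location header sent back by the Read call."""
    return operation_location.rstrip("/").split("/")[-1]


def collect_lines(read_results):
    """Flatten the per-page Read results into a list of text lines."""
    return [line.text for page in read_results for line in page.lines]


def extract_text_from_url(endpoint, subscription_key, image_url, poll_seconds=1.0):
    """Run the Read (OCR) operation on a remote image and return its lines."""
    # Imported here so the helpers above work without the Azure SDK.
    from azure.cognitiveservices.vision.computervision import ComputerVisionClient
    from azure.cognitiveservices.vision.computervision.models import OperationStatusCodes
    from msrest.authentication import CognitiveServicesCredentials

    client = ComputerVisionClient(
        endpoint, CognitiveServicesCredentials(subscription_key)
    )

    # Read is asynchronous: submit the image URL, then poll for the result.
    response = client.read(image_url, raw=True)
    operation_id = operation_id_from_location(response.headers["Operation-Location"])

    while True:
        result = client.get_read_result(operation_id)
        if result.status not in ("notStarted", "running"):
            break
        time.sleep(poll_seconds)

    if result.status != OperationStatusCodes.succeeded:
        raise RuntimeError(f"Read operation ended with status {result.status}")
    return collect_lines(result.analyze_result.read_results)
```

To try it, set remote_image_handw_text_url to the sample image URL and call extract_text_from_url(endpoint, subscription_key, remote_image_handw_text_url). The call returns the recognized text one line per list element.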
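
Saving to Excel could be sketched as follows, assuming the extracted text is already available as a list of strings, one per recognized line. The column names and the file name ocr_output.xlsx are illustrative choices, and pandas with openpyxl is an assumption: neither package is in the install list from Step 2.

```python
def build_rows(lines):
    """Turn OCR lines into table rows: line number, text, word count."""
    return [
        {"line_no": i, "text": text, "word_count": len(text.split())}
        for i, text in enumerate(lines, start=1)
    ]


def save_lines_to_excel(lines, path="ocr_output.xlsx"):
    """Write one row per OCR line to an Excel file and return the word total."""
    # Assumption: pandas and openpyxl are installed
    # (pip install pandas openpyxl); they are not part of Step 2.
    import pandas as pd

    rows = build_rows(lines)
    frame = pd.DataFrame(rows, columns=["line_no", "text", "word_count"])
    frame.to_excel(path, index=False)
    return sum(row["word_count"] for row in rows)
```

The returned word total gives a quick way to check the result against the 100+ words expected from the sample image.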

Conclusion

We learned 📘 how to extract text from an image and save the output in MS Excel for further analysis. You could use other images and customize the code to see what suits your use case best! 👍

Have questions? Please drop them in the comments!

References

[1] https://docs.microsoft.com/en-in/azure/cognitive-services/computer-vision/quickstarts-sdk/client-library?tabs=visual-studio&pivots=programming-language-python

