AnkiOCR

81.70MB. Updated 2021-09-19.
The author has shared 1 other item(s).

This item is large, and may take some time to download.

Description

AnkiOCR

An Anki 2.1 add-on that generates OCR text from images inside Anki notes/cards. Note that it is designed only for computer-generated text, not handwriting. The aim of this add-on is to generate searchable text for image-heavy notes; it is not intended to produce high-quality, perfectly ordered text!

Features

This add-on is currently in beta. Please submit a bug report on GitHub if you find a bug or want to raise a feature request.

Installation

AnkiOCR depends on the Tesseract OCR library.
If you're on Windows or Mac, Tesseract is bundled with the add-on.
If you're on Linux, carefully follow the instructions here.

Source code is available at my GitHub.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY.

Usage

1. Open the card browser and select the note(s) you want to process. Use the search bar at the top, select tags, decks, etc.
2. On the toolbar at the top, select 'Cards', then 'AnkiOCR', and select 'Run AnkiOCR on selected notes'.
3. After processing, each of the images in the note will have the OCR data embedded in its `title` HTML attribute, viewable as a tooltip.
4. If you want to remove the OCR data from any notes, select them and then use the "Remove OCR data from selected notes" option in the same menu.

If you wish to have the OCR data output to a separate 'OCR' field on the note, which will modify the note types in your deck, you can set the `text_output_location` config option to `new_field`.

If you want to add new languages, you need to download the appropriate language data from here.

Known issues

Changelog
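
Because the default output is stored in the `title` attribute of each image tag, the recognised text can be read back out of a field's HTML with the Python standard library. The following is only an illustrative sketch, not part of AnkiOCR itself: the helper name `extract_ocr_text` and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser


class ImgTitleCollector(HTMLParser):
    """Collect the `title` attribute of every <img> tag in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # Only image tags are of interest; everything else is ignored.
        if tag != "img":
            return
        title = dict(attrs).get("title")
        if title:
            self.titles.append(title)


def extract_ocr_text(field_html):
    """Return the tooltip text of all images in a field (hypothetical helper)."""
    collector = ImgTitleCollector()
    collector.feed(field_html)
    collector.close()
    return collector.titles


# Made-up field content: one image with tooltip text, one without.
sample_field = '<img src="slide1.png" title="Krebs cycle overview"><img src="plain.png">'
print(extract_ocr_text(sample_field))
```

Text collected this way could then be passed to other tools, for example a text-to-speech program.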

Download

As add-ons are programs downloaded from the internet, they are potentially malicious. You should only download add-ons you trust.

Supported Anki versions:

To download this add-on, please copy and paste the following code into Anki 2.1:

450181164

If you were linked to this page from the internet, please open Anki on your computer, go to the Tools menu and then Add-ons>Browse & Install to paste in the code.

Reviews

on 1651096253
Great!
on 1645020519
It only works when searching in the browser; it does not produce real text.
I want text that can be read by TTS.
on 1642519024
Works really well!
However, I encountered a problem when I tried to download the tesseract-ocr file.
How do I import the language file into Anki after downloading it from the GitHub page?
Thanks for your contribution on such a great addon!
on 1642190896
Works perfectly
on 1640951378
It's useful!
on 1634790450
Worked perfectly when text_output_location was "tooltip". However, when I tried to change it to new_field, I got the following error.

A fatal error occurred, and Anki must close. Please report this message on the forums.
Anki 2.1.48 (fb07bad3) Python 3.8.6 Qt 5.14.2 PyQt 5.14.2
Platform: Mac 10.15.2
Flags: frz=True ao=True sv=?
Add-ons, last update check: 2021-10-20 10:26:51

Caught exception:
Traceback (most recent call last):
File "aqt/progress.py", line 54, in handler
File "aqt/utils.py", line 965, in <lambda>
File "aqt/browser/browser.py", line 429, in onRowChanged
File "aqt/browser/table/table.py", line 83, in get_current_card
File "aqt/browser/table/model.py", line 179, in get_card
File "aqt/browser/table/state.py", line 142, in get_card
File "anki/collection.py", line 317, in get_card
File "anki/cards.py", line 59, in __init__
File "anki/cards.py", line 67, in load
File "anki/_backend/generated.py", line 930, in get_card
File "anki/_backend/__init__.py", line 126, in _run_command
pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: PoisonError { .. }
on 1634736980
Hi, this addon does not load for me. Please help. I'm on 2.1.25. The message I get after I download it is below:

An add-on you installed failed to load. If problems persist, please go to the Tools>Add-ons menu, and disable or delete the add-on.

When loading 'AnkiOCR':
Traceback (most recent call last):
File "aqt/addons.py", line 211, in loadAddons
File "/Users/Library/Application Support/Anki2/addons21/450181164/__init__.py", line 6, in <module>
from . import gui
File "/Users/adrian/Library/Application Support/Anki2/addons21/450181164/gui.py", line 13, in <module>
from .ocr import OCR
File "/Users/adrian/Library/Application Support/Anki2/addons21/450181164/ocr.py", line 21, in <module>
from .api import OCRNote, NotesQuery, OCRImage
File "/Users/adrian/Library/Application Support/Anki2/addons21/450181164/api.py", line 7, in <module>
from anki.collection import Collection
ImportError: cannot import name 'Collection' from 'anki.collection' (/Applications/Anki.app/Contents/MacOS/anki/collection.pyc
on 1633185354
I want the output of the OCR in the same field, so that I can remove the image later.
How can I do this?
on 1633096196
Excellent add-on! Super useful for finding info in IO cards and anatomy in general.
on 1631818271
Been waiting for something like this! I use the image occlusion plugin for 99% of my notes. This should make it very easy to search for and find individual notes for reference. Thank you for building and updating this! Such a great and valuable service.
on 1631504066
one of my favorite anki addons
on 1630103383
This is great! thank you so much for sharing
on 1629303713
Love this addon, but it is not yet compatible with 2.1.45+.
Comment from author
It is now compatible, try reinstalling it
on 1628907240
Not working in Anki 2.1.46, but it works well in my Anki 2.1.35.

An add-on you installed failed to load. If problems persist, please go to the Tools>Add-ons menu, and disable or delete the add-on.

When loading 'AnkiOCR':
Traceback (most recent call last):
File "aqt\addons.py", line 217, in loadAddons
File "C:\Users\\AppData\Roaming\Anki2\addons21\450181164\__init__.py", line 6, in <module>
from . import gui
File "C:\Users\\AppData\Roaming\Anki2\addons21\450181164\gui.py", line 7, in <module>
from aqt.browser import Browser, QMenu
ImportError: cannot import name 'QMenu' from 'aqt.browser' (C:\Program Files\Anki\aqt\browser\__init__.pyc)
Comment from author
This has now been fixed :)
on 1627577659
Good
on 1627097192
Hi, thanks for sharing! This addon looks very useful, but I've got an error message like this:

System:
mac OS

Anki:
Version 2.1.41

Error encountered during processing, attempting to stop AnkiOCR gracefully. Error below:
Traceback (most recent call last):
File "/Users/owlowiscious/Library/Application Support/Anki2/addons21/450181164/gui.py", line 56, in on_run_ocr
ocr.run_ocr_on_notes(note_ids=selected_nids)
File "/Users/owlowiscious/Library/Application Support/Anki2/addons21/450181164/ocr.py", line 308, in run_ocr_on_notes
notes_query = self.run_ocr_on_query(note_ids=note_ids)
File "/Users/owlowiscious/Library/Application Support/Anki2/addons21/450181164/ocr.py", line 276, in run_ocr_on_query
batched_txts, batched_txts_dir, batch_mapping = self._gen_batched_txts(notes_to_process=notes_query.notes,
File "/Users/owlowiscious/Library/Application Support/Anki2/addons21/450181164/ocr.py", line 227, in _gen_batched_txts
batch_txt_pth.write_text("\n".join([str(i.img_pth) for i in batched_imgs]))
File "pathlib.py", line 1255, in write_text
UnicodeEncodeError: 'ascii' codec can't encode character '\u2019' in position 91: ordinal not in range(128)

Hope this helps. Thanks for your time!
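
The traceback above fails inside `Path.write_text`, which uses the locale's default encoding when none is given; here that default was evidently ASCII, so a path containing the character `\u2019` could not be written. A minimal sketch of one way to avoid this is to pass an explicit encoding. The helper name `write_batch_file` and the sample path are hypothetical, and whether Tesseract then reads the list file as UTF-8 is an assumption.

```python
import tempfile
from pathlib import Path


def write_batch_file(batch_txt_pth, img_paths):
    """Write one image path per line, always encoded as UTF-8 (hypothetical helper)."""
    # An explicit encoding avoids falling back to the locale default,
    # which was ASCII in the failing environment.
    batch_txt_pth.write_text("\n".join(str(p) for p in img_paths), encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    batch_file = Path(tmp) / "batch.txt"
    # Made-up path containing a right single quotation mark (U+2019).
    write_batch_file(batch_file, [Path("Anki\u2019s media") / "img.png"])
    print(batch_file.read_text(encoding="utf-8"))
```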
on 1623803670
Surprisingly accurate and very helpful for searching cards.
on 1621897397
I have more than 10k notes made from images in PDF and now with your addon I can do research on them. Thank you so much!
on 1620411242
Caused a crash. I wish I could rate this more negatively than this.
Comment from author
Some information would help me solve the cause of the crash, please reply here or raise an issue on the Github, thanks!
on 1619928339
Looks good,
but mine is not working:
every time I run this on a random note, it says:

Error encountered during processing, attempting to stop AnkiOCR gracefully. Error below:
Traceback (most recent call last):
File "C:\Users\auser\AppData\Roaming\Anki2\addons21\450181164\gui.py", line 54, in on_run_ocr
ocr.run_ocr_on_notes(note_ids=selected_nids)
File "C:\Users\auser\AppData\Roaming\Anki2\addons21\450181164\ocr.py", line 303, in run_ocr_on_notes
notes_query = self.run_ocr_on_query(query=query_str)
File "C:\Users\auser\AppData\Roaming\Anki2\addons21\450181164\ocr.py", line 266, in run_ocr_on_query
notes_query = NotesQuery(col=self.col, query=query)
File "<string>", line 7, in __init__
File "C:\Users\auser\AppData\Roaming\Anki2\addons21\450181164\api.py", line 216, in __post_init__
self.notes = [OCRNote(note_id=nid, col=self.col) for nid in self.col.findNotes(query=self.query)]
TypeError: find_ got an unexpected keyword argument 'query'
Comment from author
Hi there, thanks for this - this has been fixed in the most recent update.
on 1619051560
Amazing and simple app
on 1617755546
This addon is amazing! I am using it on my Mac. I wanted to ask if there is any way I can highlight the text in the HTML tag so that I can have the text read to me using OCR/TTS software?
on 1617190970
nice add on !
must be default
on 1616583696
Milestone work
on 1616514770
This plugin would be very very useful to me on my system, but I can't seem to get it to work:

System:
Arch Linux

Anki version:
2.1.35

Config:
{
"batch_size": 5,
"languages": [
"eng"
],
"num_threads": 2,
"override_tesseract_exec": true,
"overwrite_existing": true,
"tesseract_exec_path": "/usr/bin/tesseract",
"tesseract_install_valid": true,
"text_output_location": "tooltip",
"use_batching": true,
"use_multithreading": true
}

Error message when trying to use:
Cancelled OCR processing with message :
['Traceback (most recent call last):\n', ' File "/home/<username>/.local/share/Anki2/addons21/450181164/gui.py", line 54, in on_run_ocr\n ocr.run_ocr_on_notes(note_ids=selected_nids)\n', ' File "/home/<username>/.local/share/Anki2/addons21/450181164/ocr.py", line 303, in run_ocr_on_notes\n notes_query = self.run_ocr_on_query(query=query_str)\n', '
File "/home/<username>/.local/share/Anki2/addons21/450181164/ocr.py", line 272, in run_ocr_on_query\n raw_results = self._ocr_batch_process(batched_txts=batched_txts)\n', '
File "/home/<username>/.local/share/Anki2/addons21/450181164/ocr.py", line 94, in _ocr_batch_process\n raw_results[batched_img_txt] = future.result()\n', '
File "concurrent/futures/_base.py", line 432, in result\n', '
File "concurrent/futures/_base.py", line 388, in __get_result\n', '
File "concurrent/futures/thread.py", line 57, in run\n', '
File "/home/<username>/.local/share/Anki2/addons21/450181164/ocr.py", line 259, in _ocr_img\n return pytesseract.image_to_string(str(img_pth), lang="+".join(languages or ["eng"]))\n', '
File "/home/<username>/.local/share/Anki2/addons21/450181164/_vendor/pytesseract/pytesseract.py", line 368, in image_to_string\n return {\n', '
File "/home/<username>/.local/share/Anki2/addons21/450181164/_vendor/pytesseract/pytesseract.py", line 371, in <lambda>\n Output.STRING: lambda: run_and_get_output(*args),\n', '
File "/home/<username>/.local/share/Anki2/addons21/450181164/_vendor/pytesseract/pytesseract.py", line 280, in run_and_get_output\n run_tesseract(**kwargs)\n', '
File "/home/<username>/.local/share/Anki2/addons21/450181164/_vendor/pytesseract/pytesseract.py", line 257, in run_tesseract\n raise TesseractError(proc.returncode, get_errors(error_string))\n', '450181164._vendor.pytesseract.pytesseract.TesseractError: (1, "/usr/bin/tesseract: /usr/local/share/anki/bin/liblzma.so.5: version `XZ_5.2\' not found (required by /usr/lib/libarchive.so.13)")\n']

I didn't want to give a bad review since I haven't actually been able to use it yet, and I am sure it will be great if I can get it to work. Please let me know any other information that might be useful. Thanks!
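
The final error line above shows the system `/usr/bin/tesseract` resolving `liblzma.so.5` from Anki's own `bin` directory rather than from the system libraries. One possible cause, which is an assumption and not confirmed by the report, is that Anki's launcher sets `LD_LIBRARY_PATH` and the Tesseract subprocess inherits it. A sketch of running an external program with that variable removed follows; the helper names are hypothetical, and the default path is taken from the config in the report.

```python
import os
import subprocess


def env_without_bundled_libs(env=None):
    """Return a copy of the environment with LD_LIBRARY_PATH removed."""
    cleaned = dict(os.environ if env is None else env)
    # Dropping the variable lets the system binary find its own shared libraries.
    cleaned.pop("LD_LIBRARY_PATH", None)
    return cleaned


def tesseract_version(exec_path="/usr/bin/tesseract"):
    """Run `tesseract --version` in the cleaned environment (hypothetical helper)."""
    result = subprocess.run(
        [exec_path, "--version"],
        env=env_without_bundled_libs(),
        capture_output=True,
        text=True,
        check=True,
    )
    # Some Tesseract builds print the version banner to stderr.
    return result.stdout or result.stderr
```

If `tesseract_version()` succeeds from inside Anki's Python environment while the unmodified call fails, that would support the library-path explanation.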
on 1611452467
It was working great until the update on 1/19/20. I get an error when trying to run OCR (pops up immediately):

Caught exception:
Traceback (most recent call last):
File "/Users/----------/Library/Application Support/Anki2/addons21/450181164/gui.py", line 108, in <lambda>
act_run_ocr.triggered.connect(lambda b=browser: on_run_ocr(browser))
File "/Users/----------/Library/Application Support/Anki2/addons21/450181164/gui.py", line 24, in on_run_ocr
num_batches = ceil(num_notes / config["batch_size"])
TypeError: 'NoneType' object is not subscriptable

EDIT: Works again with the updated version!
Comment from author before post was edited
Just pushing a fix now!
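
The `TypeError` in this report arises because `config` was `None` when `config["batch_size"]` was evaluated. How the author actually fixed it is not stated here; the sketch below only illustrates one defensive pattern, merging whatever config is present over a set of defaults. `DEFAULT_CONFIG` and `count_batches` are hypothetical names, and the default of 5 is borrowed from the config shown in another review.

```python
from math import ceil

# Hypothetical defaults used when no config could be loaded.
DEFAULT_CONFIG = {"batch_size": 5}


def count_batches(num_notes, config):
    """Compute the number of batches, tolerating a missing (None) config."""
    # `config or {}` turns None into an empty dict before merging over defaults.
    merged = {**DEFAULT_CONFIG, **(config or {})}
    return ceil(num_notes / merged["batch_size"])


print(count_batches(12, None))               # falls back to the default batch size
print(count_batches(12, {"batch_size": 4}))  # uses the configured batch size
```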
on 1610645298
This is very useful for searching!

edit: Freezes my Anki, even when I process just one note. But it does work.
on 1607120488
This is perfect for making IO cards from textbook screenshots!
on 1605204118
Very good!
on 1603884924
wonderful. could you add additional options like the option to have the text inserted into a specified field instead of the pop-up text.
on 1602387453
The OCR is really good! You have saved me loads of time. Thanks a lot!
on 1601995086
Does it work with Image Occlusion?
Comment from author
Yes it should work fine with image occlusion :)
on 1601899404
Brilliant!
Thank you