tesseract -v
$ tesseract -v
tesseract 5.3.0
leptonica-1.82.0
libgif 5.2.1 : libjpeg 8d (libjpeg-turbo 2.1.2) : libpng 1.6.40 : libtiff 4.5.1 : zlib 1.2.13 : libwebp 1.2.4 : libopenjp2 2.5.0
Found AVX2
Found AVX
Found FMA
Found SSE4.1
Found OpenMP 201511
Found libarchive 3.6.2 zlib/1.2.13 liblzma/5.4.0 bz2lib/1.0.8 liblz4/1.9.4 libzstd/1.5.2
Found libcurl/8.2.1 OpenSSL/3.0.10 zlib/1.2.13 brotli/1.0.9 zstd/1.5.5 libidn2/2.3.4 libpsl/0.21.2 (+libidn2/2.3.3) libssh/0.10.5/openssl/zlib nghttp2/1.55.1 librtmp/2.3 OpenLDAP/2.6.6
This has also been confirmed with the latest git main.
First of all: please use the latest versions (leptonica 1.84.1, tesseract 5.3.3) when reporting an issue.
I agree Tesseract should not crash, but your input image is not suitable input.
You should be able to demonstrate a problem with official data only, and it looks like you use custom-generated languages (mrz, ocrb_int). For example, when I run tesseract i4148.jpg - -l eng+ara+fra+spa+chi_sim+chi_sim_vert+chi_tra+chi_tra_vert+rus or tesseract i4148.jpg - -l ara, it does not crash, i.e. I cannot reproduce the issue.
@marcreichman-pfi, it looks like at least ocrb_int.traineddata is needed to reproduce the issue. Can you provide more information about that model and add it to the issue?
Current Behavior
When running this command line:
The following occurs:
Estimating resolution as 303
!w_it.cycled_list():Error:Assert failed:in file src/ccstruct/pageres.cpp, line 1502
Aborted
This was first detected with API usage from an internal app, but it is reproducible with the tesseract command line.
Backtrace:
Expected Behavior
The expectation is that the frame would be analyzed for OCR data without aborting. Other language combinations which have run without crash are:
eng+ara+fra+spa+mrz+chi_sim+chi_sim_vert+chi_tra+chi_tra_vert+rus
ocrb_int+eng+fra+spa+mrz+chi_sim+chi_sim_vert+chi_tra+chi_tra_vert+rus
ocrb_int+eng
The combination of ara and ocrb_int (whether on their own or included in a larger list such as above) triggers the abort each time.
Suggested Fix
No known suggested fixes at this time.
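While no fix is known, the language lists mentioned in this report could be re-checked systematically with a small script. The following is only a sketch, not part of the original report: the helper names (build_command, aborted, check) are hypothetical, the image name i4148.jpg is taken from the maintainer's comment, and the order ara+ocrb_int for the failing pair is an assumption. It also assumes a POSIX system, where a process killed by abort() is reported by Python's subprocess module with a negative SIGABRT return code.

```python
import signal
import subprocess

# Language lists quoted from the report; the order of the failing pair is assumed.
REPORTED_OK = [
    "eng+ara+fra+spa+mrz+chi_sim+chi_sim_vert+chi_tra+chi_tra_vert+rus",
    "ocrb_int+eng+fra+spa+mrz+chi_sim+chi_sim_vert+chi_tra+chi_tra_vert+rus",
    "ocrb_int+eng",
]
REPORTED_ABORT = ["ara+ocrb_int"]


def build_command(image, langs):
    """Return the argv for `tesseract <image> - -l <langs>` (text goes to stdout)."""
    return ["tesseract", image, "-", "-l", langs]


def aborted(returncode):
    """True if the child process was terminated by SIGABRT (POSIX only)."""
    return returncode == -signal.SIGABRT


def check(image, langs):
    """Run tesseract once; return (was_aborted, stderr_text)."""
    proc = subprocess.run(
        build_command(image, langs), capture_output=True, text=True
    )
    return aborted(proc.returncode), proc.stderr


def check_all(image, lang_lists):
    """Map each language list to whether tesseract aborted on it."""
    return {langs: check(image, langs)[0] for langs in lang_lists}
```

Calling check_all("i4148.jpg", REPORTED_OK + REPORTED_ABORT) requires the custom mrz and ocrb_int traineddata files to be installed, which are not part of the official data.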
tesseract -v
This has also been confirmed with the latest git main.
Operating System
No response
Other Operating System
Ubuntu 23.10-based docker image, running under CentOS 7 host (all amd64). This has been reproduced in development setups however, including Ubuntu 20.04, 22.04, and WSL2.
uname -a
Linux e04873eb47b5 3.10.0-1160.95.1.el7.x86_64 #1 SMP Mon Jul 24 13:59:37 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
(container view - host view is the same aside from the hostname)
Compiler
CPU
Virtualization / Containers
Docker 24.0.6
Other Information
No response