Impossible to perform text recognition

Bug #1532447 reported by L'Africain
This bug affects 2 people
Affects: yagf (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

Hi,
Description: Ubuntu 14.04.3 LTS
Release: 14.04
Package:
yagf:
  Installed: 0.9.2.1-1
  Candidate: 0.9.2.1-1
  Version table:
 *** 0.9.2.1-1 0

When I try to recognize text from an image, I get this message with tesseract:
Starting tessaract failed :
The system said: Tesseract Open Source OCR Engine v3.03 with Leptonica
Error in pixReadStreamBmp: bmp(1) read fail
Error in pixReadStream: bmp: no pix returned
Error in pixRead: pix not read
Error in pixGetInputFormat: pix not defined
Reading input.bmp as a list of filenames...
Error in fopenReadStream: file not found
Error in pixRead: image file not found
Image file BM²þG cannot be read!
Error during processing.

And this message with cuneiform:
Starting cuneiform failed
The system said: Cuneiform for Linux 1.1.0
Magick: Length and filesize do not match (/home/cyrille/.config/yagf/input.bmp) reported by coders/bmp.c:807 (ReadBMPImage)

I also tried to recognize scanned files saved as JPG, and recognition failed there as well.
The problem appeared suddenly, for no apparent reason. I have since reinstalled the system for other reasons, but the problem still occurs.

I discovered that the problem appears when I use the option "Correct the page skew".
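
The cuneiform message above ("Length and filesize do not match") suggests that the size recorded in the header of input.bmp disagrees with the size of the file on disk. A minimal Python sketch for checking this, assuming the standard BMP layout (a 2-byte "BM" signature followed by a 4-byte little-endian file size). The function name check_bmp_size is hypothetical and not part of yagf:

```python
import os
import struct


def check_bmp_size(path):
    """Compare the file size declared in a BMP header with the size on disk.

    Returns (declared_size, actual_size, sizes_match).
    Hypothetical helper for diagnosis; not part of yagf, tesseract or cuneiform.
    """
    actual = os.path.getsize(path)
    with open(path, "rb") as f:
        header = f.read(6)
    # A BMP file starts with the signature b"BM", followed by the total
    # file size as an unsigned 32-bit little-endian integer.
    if len(header) < 6 or header[:2] != b"BM":
        raise ValueError("not a BMP file (missing 'BM' signature)")
    (declared,) = struct.unpack("<I", header[2:6])
    return declared, actual, declared == actual
```

Running it on the intermediate file (for example check_bmp_size(os.path.expanduser("~/.config/yagf/input.bmp"))) after a failed run with "Correct the page skew" enabled would show whether the file was written incompletely or with a wrong header.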

L'Africain (lafricain79)
description: updated

Tehnick (tehnick) wrote :

Try version 0.9.5+repack1-1 please.

L'Africain (lafricain79) wrote :

It works only with cuneiform. If I try to use tesseract, I get this alert:
You have selected recognising Française language using tesseract OCR. Currently the data for this language is not installed in your system. Please install the tesseract data files for "fra" from your system repository.

But everything is installed on my system (Ubuntu 20.04).

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in yagf (Ubuntu):
status: New → Confirmed

András Korn (kornandras) wrote :

I have the same problem when trying to recognize a Hungarian document. The tesseract-ocr-hun package is installed, but yagf complains that it is not. Unfortunately, the error message doesn't contain any useful troubleshooting information (such as how yagf arrived at the incorrect conclusion that the Hungarian data files for Tesseract were not installed).

András Korn (kornandras) wrote :

It seems that the problem is that "Edit > Settings > Path to Tesseract Data files" is, by default, "/". Pointing it to "/usr/share/tesseract-ocr/5" instead gets rid of the error message, although attempting text recognition still fails to produce any result.
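
To see whether a given "Path to Tesseract Data files" value can work at all, one can look for the language file under it. A minimal Python sketch, assuming Tesseract language data is stored as <lang>.traineddata either directly in the directory or in a tessdata subdirectory. How yagf itself performs this check is not known from this report, and find_traineddata is a hypothetical helper name:

```python
from pathlib import Path


def find_traineddata(data_path, lang):
    """Return the existing <lang>.traineddata files under data_path.

    Checks both data_path itself and data_path/tessdata, since it is an
    assumption (not confirmed here) which of the two layouts yagf expects.
    """
    base = Path(data_path)
    candidates = [
        base / f"{lang}.traineddata",
        base / "tessdata" / f"{lang}.traineddata",
    ]
    return [p for p in candidates if p.is_file()]
```

For example, find_traineddata("/", "hun") is expected to return an empty list on a typical system, which would be consistent with the misleading "not installed" alert, while find_traineddata("/usr/share/tesseract-ocr/5", "hun") should list the file if the tesseract-ocr-hun package placed it there.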
