pdftotext -htmlmeta outputs incomplete metadata; pdfinfo outputs all of it

Bug #993292 reported by Torsten Krah
This bug affects 1 person
Affects            Status   Importance   Assigned to   Milestone
Poppler            New      Unknown
poppler (Ubuntu)   New      Undecided    Unassigned

Bug Description

pdftotext -htmlmeta omits metadata from the PDF catalog, whereas pdfinfo outputs all known values.

For example, the pdfinfo output:

Title: Titel
Author: Word
Creator: WordToPDF 2.4 build 127
Producer: AFPL Ghostscript 8.54
CreationDate: Fri Jul 2 09:14:02 2007
ModDate: Fri Jul 2 09:14:02 2007
Tagged: no
Pages: 6
Encrypted: no
Page size: 595 x 842 pts (A4)
File size: 104664 bytes
Optimized: no
PDF version: 1.3

In contrast, the meta section of the pdftotext -htmlmeta output:

<head>
<title>Titel</title>
<meta name="Author" content="Word"/>
<meta name="Creator" content="WordToPDF 2.4 build 127"/>
<meta name="Producer" content="AFPL Ghostscript 8.54"/>
<meta name="CreationDate" content=""/>
</head>

The two outputs do not match: in the -htmlmeta output, CreationDate is empty and ModDate is missing entirely.
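
The gap can also be checked mechanically. What follows is a minimal Python sketch, not part of poppler, that parses the two outputs quoted in this report and lists which document-information fields are absent or empty in the -htmlmeta head. The names parse_pdfinfo, HeadMeta, compare and INFO_KEYS are illustrative assumptions, and the sample strings are copied from the outputs above.

```python
from html.parser import HTMLParser

# Standard PDF document-information keys. pdfinfo prints further fields
# (Pages, Page size, ...) that are not metadata entries, so they are skipped.
INFO_KEYS = ("Title", "Subject", "Keywords", "Author",
             "Creator", "Producer", "CreationDate", "ModDate")

# Sample data copied from this report (metadata lines only).
PDFINFO_OUT = """\
Title: Titel
Author: Word
Creator: WordToPDF 2.4 build 127
Producer: AFPL Ghostscript 8.54
CreationDate: Fri Jul 2 09:14:02 2007
ModDate: Fri Jul 2 09:14:02 2007
Pages: 6
"""

HEAD_HTML = """\
<head>
<title>Titel</title>
<meta name="Author" content="Word"/>
<meta name="Creator" content="WordToPDF 2.4 build 127"/>
<meta name="Producer" content="AFPL Ghostscript 8.54"/>
<meta name="CreationDate" content=""/>
</head>
"""


def parse_pdfinfo(text):
    """Parse "Key: value" lines into a dict, splitting at the first colon."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


class HeadMeta(HTMLParser):
    """Collect the <title> text and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            if "name" in attr:
                self.fields[attr["name"]] = attr.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.fields["Title"] = data.strip()


def compare(pdfinfo_text, head_html):
    """Return (missing, empty): keys pdfinfo reports that the head lacks or leaves blank."""
    info = parse_pdfinfo(pdfinfo_text)
    parser = HeadMeta()
    parser.feed(head_html)
    parser.close()
    missing, empty = [], []
    for key in INFO_KEYS:
        if not info.get(key):
            continue  # pdfinfo has no value either, so nothing to compare
        if key not in parser.fields:
            missing.append(key)
        elif not parser.fields[key]:
            empty.append(key)
    return missing, empty


missing, empty = compare(PDFINFO_OUT, HEAD_HTML)
print("missing:", missing)  # missing: ['ModDate']
print("empty:", empty)      # empty: ['CreationDate']
```

For the sample in this report, the sketch flags ModDate as missing and CreationDate as empty, which matches the discrepancy described above.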

ProblemType: Bug
DistroRelease: Ubuntu 11.10
Package: poppler-utils 0.16.7-2ubuntu2
Uname: Linux 3.3.3-030303-generic x86_64
NonfreeKernelModules: vboxpci vboxnetadp vboxnetflt vboxdrv
ApportVersion: 1.23-0ubuntu4
Architecture: amd64
Date: Wed May 2 15:44:06 2012
InstallationMedia: Ubuntu 11.04 "Natty Narwhal" - Release amd64+mac (20110427.1)
ProcEnviron:
 LANGUAGE=en_US.UTF-8
 PATH=(custom, user)
 LANG=de_DE.UTF-8
 SHELL=/bin/bash
SourcePackage: poppler
UpgradeStatus: Upgraded to oneiric on 2012-02-16 (76 days ago)

Torsten Krah (tkrah) wrote :
madbiologist (me-again) wrote :

This bug was originally reported at https://bugs.launchpad.net/ubuntu/+source/poppler/+bug/993292

The pdftotext -htmlmeta output is missing metadata from the PDF catalog, whereas pdfinfo outputs all known values.

For example, the pdfinfo output:

Title: Titel
Author: Word
Creator: WordToPDF 2.4 build 127
Producer: AFPL Ghostscript 8.54
CreationDate: Fri Jul 2 09:14:02 2007
ModDate: Fri Jul 2 09:14:02 2007
Tagged: no
Pages: 6
Encrypted: no
Page size: 595 x 842 pts (A4)
File size: 104664 bytes
Optimized: no
PDF version: 1.3

In contrast, the meta section of the pdftotext -htmlmeta output:

<head>
<title>Titel</title>
<meta name="Author" content="Word"/>
<meta name="Creator" content="WordToPDF 2.4 build 127"/>
<meta name="Producer" content="AFPL Ghostscript 8.54"/>
<meta name="CreationDate" content=""/>
</head>

madbiologist (me-again) wrote :

I have reported this bug upstream at https://bugs.freedesktop.org/show_bug.cgi?id=50646

Does this still occur on Ubuntu 12.04 "Precise Pangolin" with poppler 0.18.4-1ubuntu2?
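
One way to re-check on another release is to capture both outputs from the same file and compare them side by side. The following Python sketch assumes poppler-utils is on the PATH; collect_outputs and head_section are hypothetical helper names, and "sample.pdf" in the usage comment is a placeholder. It relies on pdftotext treating "-" as the output file to mean standard output.

```python
import subprocess


def collect_outputs(pdf_path, run=subprocess.run):
    """Run pdfinfo and pdftotext -htmlmeta on pdf_path and return both stdouts.

    `run` is injectable so the function can be exercised without poppler.
    """
    commands = {
        "pdfinfo": ["pdfinfo", pdf_path],
        # "-" as the output file asks pdftotext to write to stdout.
        "pdftotext": ["pdftotext", "-htmlmeta", pdf_path, "-"],
    }
    outputs = {}
    for name, argv in commands.items():
        result = run(argv, capture_output=True, text=True, check=True)
        outputs[name] = result.stdout
    return outputs


def head_section(html):
    """Return the <head>...</head> part of the HTML, or "" if it is absent."""
    start = html.find("<head>")
    end = html.find("</head>")
    if start == -1 or end == -1:
        return ""
    return html[start:end + len("</head>")]


# Usage (requires poppler-utils; "sample.pdf" is a placeholder path):
#   outputs = collect_outputs("sample.pdf")
#   print(outputs["pdfinfo"])
#   print(head_section(outputs["pdftotext"]))
```

Printing the two results together makes it easy to see whether ModDate and a non-empty CreationDate appear in the head on the release being tested.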

Changed in poppler (Ubuntu):
status: New → Incomplete
Changed in poppler:
importance: Unknown → Medium
status: Unknown → Confirmed
Torsten Krah (tkrah) wrote :

Yes, this still occurs on 12.04.

madbiologist (me-again)
tags: added: precise
Gitlab-migration (gitlab-migration) wrote :

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/poppler/poppler/issues/136.

Changed in poppler:
status: Confirmed → Unknown
madbiologist (me-again)
Changed in poppler (Ubuntu):
status: Incomplete → New
madbiologist (me-again)
no longer affects: poppler
Changed in poppler:
status: Unknown → New