Important CPU and memory usage when zooming

Bug #2012678 reported by Sébastien Lamy
This bug affects 1 person
Affects: qpdfview
Status: Opinion
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

On a low-end machine (Atom processor at 1.8 GHz, 2 GB RAM), when zooming in heavily on a page containing an image built from several layers (something like a map with different layers of information), qpdfview consumes a lot of RAM and CPU, and the display goes blank before it can show anything. I tried various PDF viewers, and this is quite a common issue. I found another open-source program that does not have this issue: xpdfreader. But its UI is not as good as qpdfview's.

https://www.xpdfreader.com/index.html

Maybe you should consider using the same PDF renderer? (I suppose this is XpdfWidget/Qt.)

http://www.glyphandcog.com/XpdfWidgetQt.html

Sébastien Lamy (lamyseba-b) wrote :

On top of this problem, there is a rather aggressive "redraw" behavior that seems to occur only when a large zoom factor (500%) causes drawing latency.

For example, when moving another window on the desktop, the qpdfview window redraws all its content, going blank while redrawing and then showing the same thing as before. CPU usage jumps just to redraw an image that was already drawn.

The same thing occurs when minimizing the window and then maximizing it again.

I attach a sample PDF, to be viewed at 500% zoom.

Adam Reichold (adamreichold) wrote :

Hello Sebastian,

we basically already use the same renderer as xpdfreader, namely Poppler which originally was a fork of xpdfreader. However, I don't think what you describe are issues with the speed of rendering.

For one thing, redraw can be slow if qpdfview's cache is not large enough to hold a single page at large zoom levels. There are two things you can try: either increase the cache size on the "Graphics" tab of the settings dialog or, alternatively, enable "Use tiling" on the same tab so that qpdfview does not need to cache the whole page but only the visible parts.
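To get a feel for why a single page can outgrow the cache at high zoom, a rough estimate may help. The sketch below assumes an A4 page (595 x 842 points), one pixel per point at 100% zoom (72 dpi), and an uncompressed pixmap of 4 bytes per pixel. All three are illustrative assumptions, not values taken from qpdfview.

```python
def page_pixmap_bytes(width_pt, height_pt, zoom, bytes_per_pixel=4):
    """Estimate the memory needed for one fully rendered page.

    Assumes 1 point = 1 pixel at 100% zoom (72 dpi) and an uncompressed
    32-bit pixmap; both are assumptions made for illustration.
    """
    width_px = round(width_pt * zoom)
    height_px = round(height_pt * zoom)
    return width_px * height_px * bytes_per_pixel


MIB = 1024 * 1024

# Hypothetical A4 page (595 x 842 pt) at 100% and 500% zoom.
for zoom in (1.0, 5.0):
    size = page_pixmap_bytes(595, 842, zoom)
    print(f"zoom {zoom:.0%}: {size / MIB:.1f} MiB")
```

Under these assumptions memory grows with the square of the zoom factor, so 500% needs 25 times the memory of 100%. A map on a larger page format scales up accordingly.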

Note, however, that while these changes might help in this particular situation, they are not generally helpful. For example, a cache that is too large will just increase memory pressure and might even lead to swapping, making the application slower instead of faster. Similarly, tiling adds measurable overhead, as certain parts of rendering a page need to be repeated for each tile, so that when the whole page fits into the cache, rendering it once is often significantly faster than rendering it as multiple tiles (even when this happens in parallel).

Hence, I am currently not convinced there is anything about the current defaults that should be changed.

Regards,
Adam

Changed in qpdfview:
status: New → Opinion
Sébastien Lamy (lamyseba-b) wrote :

Hello Adam, thank you for your answer.

Poppler is indeed the library used by qpdfview, but it is a fork of the one used by xpdf 3.0. xpdf is now at 4.x. Are changes in xpdf reliably carried over to Poppler?
Glyph & Cog writes about Poppler and xpdf: https://www.glyphandcog.com/opensource.html
Poppler says it is based on xpdf 3.0: https://poppler.freedesktop.org/

I have tried the options you pointed out, and the results are not convincing. On this kind of file, at a high zoom level (500%), qpdfview stays very slow, freezing the system or the display for at least 5-10 seconds before showing the requested part of the page.
* Tiling mode makes things a lot slower on the first display (when the zoom changes): I have to wait more than 60 s to get my part fully drawn, which is a real pain. After that, scrolling is more acceptable once the zoom level is set and the image is drawn. I think this is because quite a big part of the full image is drawn and in memory. If I scroll to a part that is still "blank" (not drawn), drawing that part again takes a lot more time than in non-tiling mode. In tiling mode, hell begins if I want to zoom out and then in again, as you can imagine.
* Increasing the cache size did not give a noticeable improvement.

On the same system, reading the same PDF file at the same zoom level (or greater), xpdf is a lot faster. There is indeed a delay to draw the lines over the map, but it is always less than 2 seconds, whatever the zoom level. When you scroll the image at a high zoom level (500-800%), you may see blank zones in the parts that were not shown before scrolling, but these blank zones are then quickly drawn (in less than 1 s). In qpdfview, blank zones that appear while scrolling take a lot more time to draw (at least 4-5x more when not in tiling mode, and worse in tiling mode).

I noticed that when not in tiling mode, qpdfview had a peak of memory use while drawing (say x2 memory usage). When drawing is done, the memory usage goes back to normal. When in tiling mode, the memory usage is greater: it is the same as the peak level in non-tiling. Xpdf has no such behaviour and memory usage stays the same when exploring the document (zoom, scroll).

And last but not least, at a high zoom level in non-tiling mode, qpdfview redraws its display every time it loses or gains focus (the document display goes fully blank, memory and CPU usage grow, and then the document is shown again). When you have to look closely at a PDF document while writing about it in another application, this makes work impossible. I think the "redraw on focus change" is a problem that concerns only qpdfview and has nothing to do with the PDF library used. There may be a parameter to change that tells whether or not the display needs to be redrawn when window focus changes.

Sébastien Lamy (lamyseba-b) wrote :

I'm sending a more complicated PDF file that makes qpdfview freeze when zooming to 500% on my machine. Memory usage goes very high. I think that at a zoom level of 500% you may experience the resource usage problem even on a faster machine with more RAM.

On a PDF with 2 pages of the same kind, with preloading of one page enabled in the "show" options, qpdfview crashes when entering 500% zoom (it uses 100% of available RAM and swap). With only one page, qpdfview freezes and slows down the whole system but does not crash. I'd like to post the 2-page version here, but it seems to be too big for Launchpad.

Xpdf memory usage with this same file on the same machine stays ok (7% of available memory), even at higher zoom level (800%).

Adam Reichold (adamreichold) wrote :

Poppler mostly tries to include changes from xpdfreader, but I am not sure if that is still possible as the code bases continue to diverge. The main difference in the behaviour you report is probably that xpdfreader renders directly to the screen instead of into off-screen buffers which can be displayed from a cache.

This is also why I am convinced that the redraw-on-focus-change is a cache size issue: The window certainly needs to be redrawn after gaining the focus and the image has to come from somewhere. It usually comes from the cache, which avoids any rendering. If this operation is slow because the images need to be rendered anew, this hints at a cache that is too small to keep the image data around. (Other options which effectively reduce the available cache size are "Keep obsolete pixmaps" and "Prefetch". Disabling those should also avoid "wasting" any cache capacity.)
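The effect of a cache that is too small can be illustrated with a toy model. The sketch below is not qpdfview's actual cache; the class name `PixmapCache`, the 100 MiB page size and the least-recently-used eviction policy are all assumptions made for illustration. It only shows that an entry larger than the capacity is never stored, so every redraw triggers a new render.

```python
from collections import OrderedDict


class PixmapCache:
    """Toy LRU cache bounded by total bytes (not qpdfview's actual code)."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.entries = OrderedDict()  # key -> size in bytes
        self.renders = 0

    def get(self, key, size_bytes):
        """Return 'hit' if cached, else count a render and try to store it."""
        if key in self.entries:
            self.entries.move_to_end(key)
            return "hit"
        self.renders += 1  # cache miss: the page must be rendered again
        if size_bytes <= self.capacity_bytes:
            # Evict least recently used entries until the new one fits.
            while self.used_bytes + size_bytes > self.capacity_bytes:
                _, evicted = self.entries.popitem(last=False)
                self.used_bytes -= evicted
            self.entries[key] = size_bytes
            self.used_bytes += size_bytes
        return "miss"


MIB = 1024 * 1024
page_size = 100 * MIB  # hypothetical size of one page at high zoom

small = PixmapCache(64 * MIB)
large = PixmapCache(256 * MIB)
for _ in range(3):  # three redraws, e.g. three focus changes
    small.get("page-1@500%", page_size)
    large.get("page-1@500%", page_size)

print(small.renders, large.renders)  # prints "3 1"
```

With the small cache each of the three redraws renders the page again, while the large cache renders once and then serves the stored image.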

If you are convinced that this is a rendering issue, please try to reproduce this using poppler-utils, e.g. pdftoppm, so that this can be reported upstream. Please make sure to use an up-to-date version of Poppler though, as the above map renders here, if somewhat slowly, taking 1.5 GB of memory at 500% zoom without tiling and 650 MB with tiling.

As a closing remark, I am willing to help you diagnose this and if anything actionable is found to work on and include changes to qpdfview's code base. But if this stays at the level of "xpdfreader is so much better at this", then my advice is to use xpdfreader and call it a day.

Sébastien Lamy (lamyseba-b) wrote :

Thank you for your Answer.

I insisted on the difference with xpdfreader because it seemed to me that the same library could not lead to such different results in terms of performance when dealing with "big" files.

Qpdfview is a great tool with a nice UI, and it fits my everyday use perfectly on "normal" documents (mainly text, or images of moderate resolution that I do not need to zoom into). On this kind of everyday document it is indeed a fast and lightweight reader, convenient for low-resource computers like mine.

Xpdfreader's UI is much less friendly and lacks features (like page thumbnails). Scrolling with the mouse wheel is a pain (very small steps, so it is very slow to get down even one page), and you cannot grab the document to scroll.

It seemed to me that being lightweight was indeed a goal for qpdfview; maybe I was mistaken. Anyway, this is why I thought the problem was worth reporting here. I mentioned xpdfreader as encouragement: "others already do it, and they are open source, so maybe you can take advantage of their codebase and libraries". I understand this is more difficult than it may seem at first glance from the outside.

Qpdfview is often described as a lightweight reader in external reviews, but on the official page and the GitHub page this is indeed not stated as an explicit goal, so maybe I was mistaken.

As for xpdfreader, I know I have no chance with a report like "your UI is just a disaster, could you change it to be like qpdfview, please?". But true, I can switch to xpdf when dealing with big documents, even if browsing them will be quite a pain with this "no grab" interface.

I took the time to launch qpdfview alone on my desktop, disabled the "prefetch" option (I did not find the "keep obsolete pixmaps" option), increased the cache size to 256 MB, opened sample_map2 and waited until rendering was done at 500% in non-tiling mode. It took 15 minutes to finally display the document (!!!), with peak RAM usage at 1.1 GB and 750 MB of RAM used once rendering was done. After the 15 minutes of waiting for qpdfview to load, it seemed that the full document page was available for scrolling; there were no blank parts left that had to be recomputed. During loading, CPU usage was not at its peak, so it is clearly only a memory usage/management problem.

The problem of redrawing on focus gain/loss was indeed directly linked to the cache size. 256 MB was enough to get rid of the redraw problem; 64 MB was too small.

To my surprise, tiling mode at 500% was more efficient (at a scale of 100% or less it was a lot slower than non-tiling). More efficient means taking about 5-6 s to display the part of the document inside the window. Scrolling to a part that is still blank takes 5 to 15 seconds to show, which is already a problematic delay.

So here are my suggestions. I'm not sure I am writing anything relevant, and I understand that optimizing resource usage may be a lot of work that nobody may want to do if usability on small configurations is not a goal.

* Why does the cache need to hold the full document page (or even more pages if prefetch is enabled)? At least for redrawing on focus gain/loss, it would be fine to have just the part of the document that is displayed on scre...


Adam Reichold (adamreichold) wrote :

> I took the time to launch qpdfview alone on my desktop, disabled the "prefetch" option (I did not find the "keep obsolete pixmaps" option), increased the cache size to 256 MB, opened sample_map2 and waited until rendering was done at 500% in non-tiling mode. It took 15 minutes to finally display the document (!!!), with peak RAM usage at 1.1 GB and 750 MB of RAM used once rendering was done. After the 15 minutes of waiting for qpdfview to load, it seemed that the full document page was available for scrolling; there were no blank parts left that had to be recomputed. During loading, CPU usage was not at its peak, so it is clearly only a memory usage/management problem.

As mentioned above, please try running pdftoppm from poppler-utils at something like 5 * 72 dpi to determine which part of that is plain rendering and which part comes from memory management. Also as mentioned above, make sure to include the version numbers and ideally try the current version of poppler-utils, i.e. 23.03.0.
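One possible way to collect these numbers is a small wrapper script. The sketch below is only a suggestion: the helper names `pdftoppm_command` and `measure` and the output prefix `out` are made up for illustration, the file name `sample_map2.pdf` is assumed from the attachment name, and the `resource` module used for peak memory is only available on Unix-like systems.

```python
import resource
import subprocess
import time


def pdftoppm_command(pdf_path, output_prefix, zoom=5.0, page=1):
    """Build a pdftoppm call that renders one page at zoom * 72 dpi."""
    dpi = int(round(zoom * 72))
    return [
        "pdftoppm",
        "-r", str(dpi),   # resolution in dpi
        "-f", str(page),  # first page to render
        "-l", str(page),  # last page to render
        pdf_path,
        output_prefix,
    ]


def measure(pdf_path, output_prefix="out"):
    """Run pdftoppm and return (wall time in s, peak RSS of child processes)."""
    subprocess.run(["pdftoppm", "-v"])  # prints the Poppler version
    start = time.perf_counter()
    subprocess.run(pdftoppm_command(pdf_path, output_prefix), check=True)
    elapsed = time.perf_counter() - start
    # ru_maxrss is reported in kilobytes on Linux (bytes on macOS).
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return elapsed, peak


# Show the command that would be run for the (assumed) sample file name.
print(" ".join(pdftoppm_command("sample_map2.pdf", "out")))
```

Calling `measure("sample_map2.pdf")` on a machine with poppler-utils installed would then give the render time and peak memory of the plain renderer, independent of qpdfview's cache.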

> * Why does the cache need to hold the full document page (or even more pages if prefetch is enabled)? At least for redrawing on focus gain/loss, it would be fine to have just the part of the document that is displayed on screen. It is not necessary to have the full document page for this kind of redraw.

Because for most documents at common scale factors, this is the most efficient way of handling continuously scrolling and jumping between multiple pages, e.g. think of a scientific paper which is seldom read only from front to back. Rendering whole pages once has the lowest overhead and keeping a few of them around helps when changing location.

> * You say xpdf is drawing directly. Is it possible to do the same thing, putting the image in the cache afterward rather than before displaying?

This is not really possible with Poppler, as its Qt integration is not at feature parity with its Splash backend yet. (You can try it by changing the PDF backend from Splash to Arthur via the settings.)

But even then, qpdfview does support multiple renderer libraries, which sort of forces a least common denominator approach, as not all libraries provide any Qt integration at all, which means falling back to exchanging image data. xpdfreader is in a somewhat special position as it is a rather monolithic code base which includes both the renderer and a rather specific kind of Qt integration. A generic library with a stable API and diverse consumers like Poppler by design uses much narrower interfaces.

> * Is there no way to have constant RAM usage, making a kind of "hybrid" tiling mode ? I mean

Tiles are currently fixed size (in pixels), assuming that processing time scales with the number of pixels. This relation does not always hold. The additional memory consumption when many tiles are on screen is most likely due to rendering them in parallel, which means temporary data structures and buffers that would only be allocated once for the page are allocated once for each tile at the same time. This is basically why tiling is not enabled by default.
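As a rough illustration of fixed-size tiling, the sketch below computes which tiles intersect the visible area. The function name `visible_tiles`, the tile size of 1024 pixels, the window size and the A4 page dimensions are all assumptions for illustration and are not taken from qpdfview's code.

```python
import math


def visible_tiles(page_w, page_h, view_x, view_y, view_w, view_h, tile=1024):
    """List (column, row) indices of fixed-size tiles intersecting the viewport.

    All values are in pixels at the current zoom; the tile size of 1024 is an
    assumed value, not qpdfview's actual setting.
    """
    cols = math.ceil(page_w / tile)
    rows = math.ceil(page_h / tile)
    first_col = max(0, view_x // tile)
    first_row = max(0, view_y // tile)
    last_col = min(cols - 1, (view_x + view_w - 1) // tile)
    last_row = min(rows - 1, (view_y + view_h - 1) // tile)
    return [(c, r)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]


# Hypothetical A4 page at 500% (2975 x 4210 px) in a 1280 x 800 window.
tiles = visible_tiles(2975, 4210, view_x=500, view_y=900, view_w=1280, view_h=800)
total = math.ceil(2975 / 1024) * math.ceil(4210 / 1024)
print(f"{len(tiles)} of {total} tiles visible")  # prints "4 of 15 tiles visible"
```

In this example only 4 of 15 tiles need to be rendered for the current view. If each of those tiles is rendered in parallel with its own temporary buffers, the per-render overhead is paid four times at once, which matches the higher memory use described above.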

A more adaptive tiling strategy would certainly be possible, but it isn't clear whether it would be worth it, as tiling is mainly geared towards reading huge documents and high zoom factors and usuall...

