Comment 18 for bug 1899193

Revision history for this message
Julian Andres Klode (juliank) wrote :

Attached patch addresses the reference loop in apt_inst.DebFile, ensuring that no longer referencing the object will release it immediately, as opposed to when the cycle GC runs.

Subject: [PATCH] apt_inst.DebFile: Avoid reference cycle with control,data
 members

apt_inst.DebFile provides two members `data` and `control` for
easy access to those tarballs. Each of those members stores a
reference to the DebFile as its owner:

           v-----------------\
        control ----\ |
                     -> deb -|
        data ----/ |
           ^-----------------/

This means that whenever a DebFile is successfully constructed,
and no longer needed, it won't be collected until the GC runs,
which is bad, as the DebFile holds an open FileFd.

Introduce a __FileFd wrapper that holds the FileFd and becomes
the owner of both control and data, and replaces the direct use
of the FileFd in ArArchive/DebFile:

          v-----------------------------\
        control ----\ \
                     -> __FileFd <- deb -|
        data ----/ /
          ^-----------------------------/

This avoids the reference cycle, ensuring the memory and file
descriptor are released by the reference counter as soon as
the reference count drops to 0.

A future version should move `apt_inst.__FileFd` to `apt_pkg.FileFd`
and expose all the methods, such that people can make use of FileFd's
extensive compression support.

We have a similar cycle in TagFile that we have yet to address,
the problem there is arguably more frustrating, as the buffer
I believe is stored inside the TagFile, and that's really shared
between the TagSection objects.

This is related to LP: #1899193 and CVE-2020-27351, but an additional
hardening measure - the fix for those bugs was for more direct leaks.