nginx keeps writing to rotated logs

Bug #1642794 reported by Dmitry Sutyagin
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Fuel for OpenStack
Fix Released
Critical
Alexey Stupnikov
7.0.x
Fix Released
Critical
Alexey Stupnikov
Mitaka
Invalid
High
Alexey Stupnikov

Bug Description

Fuel 8.0 MU3

Some services inside containers: nginx, postgres, astute - keep open descriptors to files which have been removed. The problem is with nginx, which keeps open rotated log files and keeps writing to them. This causes log rotation to technically fail, and space to be polluted but ever expanding but non-visible files.

How to reproduce:
1. deploy Fuel 8.0
2. wait 2 days
3. run `lsof 2>/dev/null| grep "(deleted)" | grep var/log`

Expected result - no output
Actual result - many open descriptors to logs which have been deleted
Additionelly, as expected in this situation new logs are empty.

Example:

[root@fuel ~]# lsof 2>/dev/null | grep "(deleted)" | grep var/log
nginx 8558 root 2w REG 253,2 832 392504 /var/log/nginx/error.log-20161116 (deleted)
nginx 8558 root 4w REG 253,2 832 392504 /var/log/nginx/error.log-20161116 (deleted)
nginx 8558 root 6w REG 253,2 3479159 392513 /var/log/nginx/access_nailgun.log-20161116 (deleted)
nginx 8558 root 7w REG 253,2 271475922 392514 /var/log/nginx/error_nailgun.log-20161116 (deleted)
nginx 8558 root 8w REG 253,2 332456 392515 /var/log/nginx/access_repo.log-20161116 (deleted)
nginx 8558 root 9w REG 253,2 178745 392516 /var/log/nginx/error_repo.log-20161117 (deleted)
nginx 8559 dockerroot 2w REG 253,2 832 392504 /var/log/nginx/error.log-20161116 (deleted)
nginx 8559 dockerroot 4w REG 253,2 832 392504 /var/log/nginx/error.log-20161116 (deleted)
nginx 8559 dockerroot 6w REG 253,2 3479159 392513 /var/log/nginx/access_nailgun.log-20161116 (deleted)
nginx 8559 dockerroot 7w REG 253,2 271475922 392514 /var/log/nginx/error_nailgun.log-20161116 (deleted)
nginx 8559 dockerroot 8w REG 253,2 332456 392515 /var/log/nginx/access_repo.log-20161116 (deleted)
nginx 8559 dockerroot 9w REG 253,2 178745 392516 /var/log/nginx/error_repo.log-20161117 (deleted)
[root@fuel ~]# date
Fri Nov 18 02:13:41 UTC 2016
[root@fuel ~]# uptime
 02:13:52 up 1 day, 5:58, 1 user, load average: 0.00, 0.01, 0.05
[root@fuel ~]#

Changed in fuel:
milestone: none → 8.0-mu-4
milestone: 8.0-mu-4 → none
Changed in fuel:
milestone: none → 8.0-updates
tags: added: area-library
Changed in fuel:
importance: High → Critical
Revision history for this message
Dmitry Sutyagin (dsutyagin) wrote :

proposed fix for 8.0 and 7.0:

In /etc/logrotate.d/nginx (provided by nginx-1.6.3-6.el7~mos1.x86_64), change:

/var/log/nginx/*log {
    create 0644 nginx nginx
    daily
    rotate 10
    missingok
    notifempty
    compress
    sharedscripts
    postrotate
        /bin/kill -USR1 `cat /run/nginx.pid 2>/dev/null` 2>/dev/null || true
    endscript
}

to:

/var/log/docker-logs/nginx/*log {
    create 0644 nginx nginx
    daily
    rotate 10
    missingok
    notifempty
    compress
    sharedscripts
    postrotate
        /bin/kill -USR1 `pgrep -f /usr/sbin/nginx 2>/dev/null` 2>/dev/null || true
    endscript
}

Additionally, remove /var/log/docker-logs/nginx/*log from /etc/logrotate.d/fuel.nodaily, otherwise the above change will not work.

For Fuel 9.0, which is also affected due to double logrotate policy - remove nginx from /etc/logrotate.d/fuel.nodaily, the logrotate config installed with nginx is working fine by itself.

no longer affects: fuel/8.0.x
Changed in fuel:
assignee: Fuel Sustaining (fuel-sustaining-team) → MOS Maintenance (mos-maintenance)
Revision history for this message
Vitaly Sedelnik (vsedelnik) wrote :

Need more time to get this issue fixed - retargeted to 9.3

tags: added: move-to-9.3
Revision history for this message
Alexey Stupnikov (astupnikov) wrote :

Moving to Invalid for 9.3 Milestone, as I was unable to reproduce this bug in our lab. I have spoken with Oleksiy Molchanov and he confirmed that this issue was nominated to 9.3 just to check.

Note, similar issue was reported on stackexchange.com [1], so feel free to re-open if you will catch it in production.

[1] http://unix.stackexchange.com/questions/259091/nginx-log-rotation-doesnt-seem-to-be-working-correctly

Changed in fuel:
assignee: MOS Maintenance (mos-maintenance) → Alexey Stupnikov (astupnikov)
tags: removed: move-to-9.3
Revision history for this message
Alexey Stupnikov (astupnikov) wrote :

There is an intersection between general logrotate configuration and fuel.nodaily runs in MOS9 (but not MOS10). I will open a separate bug on this an I will try to analyze possible impact on the product there.

Revision history for this message
Alexey Stupnikov (astupnikov) wrote :

Opened a bug #1656305 about inconsistent logrotate configuration in mitaka.

Revision history for this message
Alexey Stupnikov (astupnikov) wrote :

It looks like deleted files with open descriptors are created during installation (I was able to reproduce 3 of 3 times). The problem didn't occur again after executing
`/bin/kill -USR1 `cat /run/nginx.pid 2>/dev/null` 2>/dev/null || true`
command inside nginx container.

Revision history for this message
Alexey Stupnikov (astupnikov) wrote :

This issue is caused by logrotate cron jobs that are started inside nailgun and keystone containers. Obviously, they can't reset nginx/postgres running in another containers. As a result, we got this problem.

The obvious way to fix this issue is to delete logrotate cron jobs from containers. It also possible to delete logrotate configuration from installed software, but it will take additional effort.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/8.0)

Fix proposed to branch: stable/8.0
Review: https://review.openstack.org/428122

Revision history for this message
Alexey Stupnikov (astupnikov) wrote :

BTW, I would like to confirm the Critical severity of this bug, fix has to be included in next MUs for both MOS7 and MOS8.

Changed in fuel:
status: Confirmed → In Progress
milestone: 8.0-updates → 8.0-mu-4
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/7.0)

Fix proposed to branch: stable/7.0
Review: https://review.openstack.org/435359

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: stable/7.0
Review: https://review.openstack.org/435360

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-library (stable/7.0)

Change abandoned by Alexey Stupnikov (<email address hidden>) on branch: stable/7.0
Review: https://review.openstack.org/435359
Reason: Incorrect change ID in commit MSG.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/8.0)

Reviewed: https://review.openstack.org/428122
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=b0d0afe5ce284f8da6816f653cfbc6aff5771b66
Submitter: Jenkins
Branch: stable/8.0

commit b0d0afe5ce284f8da6816f653cfbc6aff5771b66
Author: Alexey Stupnikov <email address hidden>
Date: Thu Feb 2 15:43:07 2017 +0300

    Disable logrotate cron job in nailgun and keystone containers

    Some nailgun and keystone jobs are started periodically with
    dedicated cron jobs. By default, logrotate cron job is
    configured to run daily and to use configuration files from
    logrotate.d directory. Some packages like cobbler, nginx and
    apache are shipped with specific logrotate configuration file
    that will be added to logrotate.d directory after install.

    As a result, we do not rotate some log files correctly, since
    it is impossible to restart process from another container.
    This patch will disable logrotate cron job for nailgun and
    keystone containers.

    Change-Id: I8696bf02fc5aded3b919699b45c576de626bc9f3
    Closes-bug: #1642794

Changed in fuel:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/7.0)

Reviewed: https://review.openstack.org/435360
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=b7535fdaef901f42acb407238a1bf3405edac59b
Submitter: Jenkins
Branch: stable/7.0

commit b7535fdaef901f42acb407238a1bf3405edac59b
Author: Alexey Stupnikov <email address hidden>
Date: Thu Feb 2 15:43:07 2017 +0300

    Disable logrotate cron job in nailgun and keystone containers

    Some nailgun and keystone jobs are started periodically with
    dedicated cron jobs. By default, logrotate cron job is
    configured to run daily and to use configuration files from
    logrotate.d directory. Some packages like cobbler, nginx and
    apache are shipped with specific logrotate configuration file
    that will be added to logrotate.d directory after install.

    As a result, we do not rotate some log files correctly, since
    it is impossible to restart process from another container.
    This patch will disable logrotate cron job for nailgun and
    keystone containers.

    (cherry-picked from commit b0d0afe5ce284f8da6816f653cfbc6aff5771b66)

    Change-Id: I8696bf02fc5aded3b919699b45c576de626bc9f3
    Closes-bug: #1642794

tags: added: on-verification
Revision history for this message
TatyanaGladysheva (tgladysheva) wrote :

Verified on MOS 7.0 + MU7 updates.

Before updates:
[root@nailgun ~]# dockerctl shell nailgun
[root@nailgun ~]# ll /etc/cron.daily/logrotate
-rwxr-xr-x 1 root root 180 Aug 1 2012 /etc/cron.daily/logrotate

[root@nailgun ~]# dockerctl shell keystone
[root@nailgun ~]# ll /etc/cron.daily/logrotate
-rwxr-xr-x 1 root root 180 Aug 1 2012 /etc/cron.daily/logrotate

After updates:
[root@nailgun ~]# dockerctl shell nailgun
[root@nailgun ~]# ll /etc/cron.daily/logrotate
ls: cannot access /etc/cron.daily/logrotate: No such file or directory

[root@nailgun ~]# dockerctl shell keystone
[root@nailgun ~]# ll /etc/cron.daily/logrotate
ls: cannot access /etc/cron.daily/logrotate: No such file or directory

tags: removed: on-verification
tags: added: on-verification
Revision history for this message
TatyanaGladysheva (tgladysheva) wrote :

Verified on MOS 8.0 + MU4 updates.

Before updates:
[root@nailgun ~]# dockerctl shell nailgun
[root@nailgun ~]# ll /etc/cron.daily/logrotate
-rwx------ 1 root root 180 Jul 31 2013 /etc/cron.daily/logrotate

[root@nailgun ~]# dockerctl shell keystone
[root@nailgun ~]# ll /etc/cron.daily/logrotate
-rwx------ 1 root root 180 Jul 31 2013 /etc/cron.daily/logrotate

After updates:
[root@nailgun ~]# dockerctl shell nailgun
[root@nailgun ~]# ll /etc/cron.daily/logrotate
ls: cannot access /etc/cron.daily/logrotate: No such file or directory

[root@nailgun ~]# dockerctl shell keystone
[root@nailgun ~]# ll /etc/cron.daily/logrotate
ls: cannot access /etc/cron.daily/logrotate: No such file or directory

tags: removed: on-verification
Changed in fuel:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.