[SRU] gnome-shell crashed with SIGTRAP in meta_wayland_compositor_new() from meta_context_start() from main() ["Failed to start X Wayland: Directory \"/tmp/.X11-unix\" is not writable"]

Bug #2069564 reported by errors.ubuntu.com bug bridge
This bug affects 3 people
Affects          Status        Importance  Assigned to        Milestone
mutter (Ubuntu)  Fix Released  High        Nathan Teodosio
Noble            Confirmed     High        Alessandro Astone
Oracular         Won't Fix     High        Nathan Teodosio
Plucky           Fix Released  High        Nathan Teodosio

Bug Description

If you are hitting this bug, please write your case in https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3984.

[Impact]

This is one of the top Gnome-Shell crashes recorded by the Ubuntu Error Tracker.

It happens when /tmp/.X11-unix is accessible by the user but fails the stricter permission check that Mutter performs (both the group and other write bits, mode 0022, must be set).

In the typical Ubuntu setup, both /tmp and /tmp/.X11-unix are root:root rwxrwxrwt, so it passes that stricter permission check.
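
To make the check concrete, here is a minimal C sketch of the mode test, assuming it boils down to the `(st_mode & 0022) != 0022` comparison in ensure_x11_unix_perms() quoted later in this report. The helper names `passes_mutter_mode_check` and `others_can_write` are hypothetical and are not part of Mutter.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper (not Mutter API): mirrors the comparison
 * (st_mode & 0022) != 0022 from ensure_x11_unix_perms(), i.e. the
 * directory passes only if BOTH the group and other write bits are set. */
bool
passes_mutter_mode_check (mode_t mode)
{
  return (mode & 0022) == 0022;
}

/* Hypothetical helper: weaker question, is the "other" write bit set? */
bool
others_can_write (mode_t mode)
{
  return (mode & 0002) != 0;
}
```

Under these assumptions, mode 01777 (rwxrwxrwt) passes, while mode 01007, as used in the test case below, fails the check even though its "other" write bit is set.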

[Test case]

1. Stop GDM:

  systemctl stop gdm

This will bring you to the virtual console (i.e. without a graphical environment, only command-line).

2. Install mutter, gir1.2-mutter-15, libmutter-15-0, mutter-common, mutter-common-bin from proposed (the exact steps to enable proposed are laid out in the SRU bot's comment on this bug).

3. Execute:

  chmod 1007 /tmp/.X11-unix

4. Execute:

  MUTTER_DEBUG=all G_DEBUG=1 dbus-run-session -- gnome-shell --display-server --wayland &>log

Gnome-Shell and Mutter must start (you can look for them in the 'ps aux' output if you are not familiar with what those pieces of software look like).

5. Now open xterm and kill the graphical environment by executing:

  pkill wayland

Verify that there are no mentions of the issue directory in the log, i.e. 'grep X11-unix log' returns nothing.

[Where problems could occur]

In Mutter or Gnome-Shell, particularly during the launch phase. Also make sure Xorg applications behave normally, without failing to open and without delays (that's why xterm is included in the test case).

---------------

The Ubuntu Error Tracker has been receiving reports about a problem regarding gnome-shell. This problem was most recently seen with package version 46.0-0ubuntu5.1. The problem page at https://errors.ubuntu.com/problem/a7dbd55723a8ea326d1ccf32e31fe307151786c2 contains more details, including versions of packages affected, stacktrace or traceback, and individual crash reports.
If you do not have access to the Ubuntu Error Tracker and are a software developer, you can request it at http://forms.canonical.com/reports/.

summary: - /usr/bin/gnome-
- shell:5:meta_wayland_compositor_new:meta_context_start:main
+ gnome-shell crashed with SIGTRAP in meta_wayland_compositor_new() from
+ meta_context_start() from main()
summary: gnome-shell crashed with SIGTRAP in meta_wayland_compositor_new() from
- meta_context_start() from main()
+ meta_context_start() from main() ["Failed to start X Wayland: Directory
+ \"/tmp/.X11-unix\" is not writable"]
Launchpad Janitor (janitor) wrote : Re: gnome-shell crashed with SIGTRAP in meta_wayland_compositor_new() from meta_context_start() from main() ["Failed to start X Wayland: Directory \"/tmp/.X11-unix\" is not writable"]

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in gnome-shell (Ubuntu):
status: New → Confirmed
Changed in mutter (Ubuntu):
status: New → Confirmed
Nathan Teodosio (nteodosio) wrote :

This comes from ensure_x11_unix_perms, which determines that /tmp/.X11-unix does not have at least the permission bits 022:

--->
static gboolean
ensure_x11_unix_perms (GError **error)
{
  /* Try to detect systems on which /tmp/.X11-unix is owned by neither root nor
   * ourselves because in that case the owner can take over the socket we create
   * (symlink races are fixed in linux 800179c9b8a1). This should not be
   * possible in the first place and systems should come with some way to ensure
   * that's the case (systemd-tmpfiles, polyinstantiation …).
   *
   * That check however only works if we see the root user namespace which might
   * not be the case when running in e.g. toolbx (root and other user are all
   * mapped to overflowuid). */
  struct stat x11_tmp, tmp;

  if (lstat (X11_TMP_UNIX_DIR, &x11_tmp) != 0)
    {
      g_set_error (error, G_IO_ERROR, g_io_error_from_errno (errno),
                   "Failed to check permissions on directory \"%s\": %s",
                   X11_TMP_UNIX_DIR, g_strerror (errno));
      return FALSE;
    }

  if (lstat (TMP_UNIX_DIR, &tmp) != 0)
    {
      g_set_error (error, G_IO_ERROR, g_io_error_from_errno (errno),
                   "Failed to check permissions on directory \"%s\": %s",
                   TMP_UNIX_DIR, g_strerror (errno));
      return FALSE;
    }

  /* If the directory already exists, it should belong to the same
   * user as /tmp or belong to ourselves ...
   * (if /tmp is not owned by root or ourselves we're in deep trouble) */
  if (x11_tmp.st_uid != tmp.st_uid && x11_tmp.st_uid != getuid ())
    {
      g_set_error (error, G_IO_ERROR, G_IO_ERROR_PERMISSION_DENIED,
                   "Wrong ownership for directory \"%s\"",
                   X11_TMP_UNIX_DIR);
      return FALSE;
    }

  /* ... be writable ... */
  if ((x11_tmp.st_mode & 0022) != 0022)
    {
      g_set_error (error, G_IO_ERROR, G_IO_ERROR_PERMISSION_DENIED,
                   "Directory \"%s\" is not writable",
                   X11_TMP_UNIX_DIR);
      return FALSE;
    }
<---

So the bug is not in Mutter.

Nathan Teodosio (nteodosio) wrote :

In the case that /tmp is owned by ourselves and it matches 0200 but not 0022, it is still writable but it would fail with the present error.

I do not know if this is actually an intended blockage and thus the description is just not clear enough or if this is an edge case to consider.

Nor do we know whether all the reports here would be solved by allowing this scenario, namely {we are not root, /tmp is ours, /tmp/.X11-unix is at least 0200}.
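
The scenario in question can be sketched in C as follows. This is only an illustration of the relaxation being discussed, not a patch that was uploaded; `strict_mode_check`, `relaxed_scenario_applies` and their parameters are hypothetical names.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* The existing check: both group and other write bits must be set. */
bool
strict_mode_check (mode_t x11_mode)
{
  return (x11_mode & 0022) == 0022;
}

/* Hypothetical relaxation (illustration only, not Mutter code):
 * we are not root, /tmp is ours, and /tmp/.X11-unix has at least
 * the owner write bit (0200). */
bool
relaxed_scenario_applies (uid_t self, uid_t tmp_owner, mode_t x11_mode)
{
  return self != 0 && tmp_owner == self && (x11_mode & 0200) != 0;
}
```

For example, a directory with mode 0700 fails `strict_mode_check` but would be accepted by `relaxed_scenario_applies` when the caller is a non-root user who owns /tmp.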

Pedro Monteiro (pedro.monteiro) wrote :

Just to let you know, my problem was a slightly different message: "Failed to start X Wayland: Wrong ownership for directory "/tmp/.X11-unix""
And the "/tmp/.X11-unix" directory had the right permissions.
Turns out the message is quite misleading in my case because the problem was actually in "/tmp" itself being owned by my user and not root.
I think at least the message should be clearer.

(https://gitlab.gnome.org/GNOME/gnome-shell/-/issues/7857)
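
A minimal C sketch of the ownership comparison from ensure_x11_unix_perms() quoted above shows how the owner of /tmp alone can trigger the "Wrong ownership" error. The helper name is hypothetical, and it is an assumption that /tmp/.X11-unix was owned by root in this case, which the comment does not state.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical helper mirroring the ownership test in
 * ensure_x11_unix_perms(): /tmp/.X11-unix must be owned either by
 * the owner of /tmp or by the current user. */
bool
x11_dir_ownership_ok (uid_t x11_dir_owner, uid_t tmp_owner, uid_t self)
{
  return x11_dir_owner == tmp_owner || x11_dir_owner == self;
}
```

With /tmp/.X11-unix owned by uid 0, /tmp owned by a regular user (say uid 1000, an illustrative value) and the session running as that same user, the helper returns false, even though the directory named in the error message is itself in order.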

Nathan Teodosio (nteodosio) wrote :

Thanks, Pedro, but without the hard facts it is difficult to ascertain whether this is a bug. Namely,

  whoami; stat /tmp /tmp/.X11-unix

Nathan Teodosio (nteodosio) wrote :
description: updated
Daniel van Vugt (vanvugt) wrote :

Today this crash is in a tie for most common gnome-shell crash reported from oracular (on aarch64, looks like Raspberry Pi each time):

https://errors.ubuntu.com/problem/71fb5661de02aa699fcaa6c2067212e2aa004c42

tags: added: oracular
removed: mantic
Changed in mutter (Ubuntu):
assignee: nobody → Nathan Teodosio (nteodosio)
status: Confirmed → In Progress
importance: Undecided → High
Changed in mutter (Ubuntu):
milestone: none → ubuntu-25.04
no longer affects: gnome-shell (Ubuntu)
Changed in mutter (Ubuntu Oracular):
milestone: none → oracular-updates
importance: Undecided → High
status: New → Triaged
Nathan Teodosio (nteodosio) wrote :
Nathan Teodosio (nteodosio) wrote :
Changed in mutter (Ubuntu Plucky):
status: In Progress → Fix Committed
Daniel van Vugt (vanvugt) wrote :

Nathan, armhf rebuilds are failing repeatedly on one test case (x11-sync). Can you investigate?

https://launchpad.net/ubuntu/+source/mutter/47.0-1ubuntu6

Nathan Teodosio (nteodosio) wrote :

Yes, thank you for calling my attention to that.

Nathan Teodosio (nteodosio) wrote : Re: [Bug 2069564] Re: gnome-shell crashed with SIGTRAP in meta_wayland_compositor_new() from meta_context_start() from main() ["Failed to start X Wayland: Directory \"/tmp/.X11-unix\" is not writable"]

I sent off builds[1] of the previous release, 47.0-1ubuntu5 (just renamed it to
*ubuntu7 so Launchpad wouldn't complain), and it fails at x11-sync for ARM and
x64 too, so not a problem introduced by this patch.

The error is raised when trying to replace an existing Mutter:

--->
XIO: fatal IO error 9 (Bad file descriptor) on X server ":100"
      after 1617 requests (1617 known processed) with 0 events remaining.
<---

[1]https://launchpad.net/~nteodosio/+archive/ubuntu/fixes/+sourcepub/16628604/+listing-archive-extra

Nathan Teodosio (nteodosio) wrote :

For what it's worth, I killed my X server and started mutter 47.0-1ubuntu5 like so:

  startx /bin/mutter --x11 &> logmutter

I spawned an xterm in it and ran

  mutter --x11 --replace &> logmutter2

I didn't get a crash that looked like that one, but the replace did indeed fail, as I
got thrown back to the virtual console.

Nathan Teodosio (nteodosio) wrote : Re: gnome-shell crashed with SIGTRAP in meta_wayland_compositor_new() from meta_context_start() from main() ["Failed to start X Wayland: Directory \"/tmp/.X11-unix\" is not writable"]
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mutter - 47.0-1ubuntu7

---------------
mutter (47.0-1ubuntu7) plucky; urgency=medium

  [ Nathan Pratta Teodosio ]
  * Warn if the X11 temp directory does not have the expected permission set.

  [ Simon McVittie ]
  * d/rules: Treat x11-test.sh as flaky

 -- Marco Trevisan (Treviño) <email address hidden> Wed, 20 Nov 2024 15:33:34 +0100

Changed in mutter (Ubuntu Plucky):
status: Fix Committed → Fix Released
Changed in mutter (Ubuntu Oracular):
assignee: nobody → Nathan Teodosio (nteodosio)
status: Triaged → In Progress
Daniel van Vugt (vanvugt) wrote :
description: updated
summary: - gnome-shell crashed with SIGTRAP in meta_wayland_compositor_new() from
- meta_context_start() from main() ["Failed to start X Wayland: Directory
- \"/tmp/.X11-unix\" is not writable"]
+ [SRU] gnome-shell crashed with SIGTRAP in meta_wayland_compositor_new()
+ from meta_context_start() from main() ["Failed to start X Wayland:
+ Directory \"/tmp/.X11-unix\" is not writable"]
Changed in mutter (Ubuntu Oracular):
status: In Progress → Triaged
description: updated
Changed in mutter (Ubuntu Oracular):
status: Triaged → In Progress
tags: added: udeng-4319
Timo Aaltonen (tjaalton) wrote : Please test proposed package

Hello errors.ubuntu.com, or anyone else affected,

Accepted mutter into oracular-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/mutter/47.0-1ubuntu4.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-oracular to verification-done-oracular. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-oracular. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in mutter (Ubuntu Oracular):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-oracular
Gre0 (gre0) wrote :

Ubuntu 24.10

* Enabled -proposed and did an update with Synaptic
  - Forced Version of mutter from '47.0-1ubuntu4.1' to '47.0-1ubuntu4.2'

* Then after a reboot I could not login:

"Failed to start X Wayland: Wrong ownership for directory "/tmp/.X11-unix": 120.#012Ownership of "/tmp": 0.#012Current user: 1000."

* I could only open TTY2 by ctrl+alt+F2 and after a 'sudo chown root: /' and a reboot I could login again.

The error message: "Failed to start X Wayland: Directory "/tmp/.X11-unix" is not writable" is also gone.

Nathan Teodosio (nteodosio) wrote :

Thank you for your feedback, Gre0.

Can you please give context on your previous situation, i.e., were you ever hit by the bug as currently described, or were you just testing the .2 update, realized you could not log in anymore (I assume you could log in normally in .1, right?) and traced it back to here?

What bugs me anyway is that the patch[1] in that Mutter update does not change the logic pertaining to the ownership check; it just makes the error message more verbose. Also, the ownership of / is not checked, just that of /tmp and /tmp/.X11-unix. Can you confirm you wrote

  sudo chown root: /

correctly?

[1]http://launchpadlibrarian.net/764836919/mutter_47.0-1ubuntu4.1_47.0-1ubuntu4.2.diff.gz

Yao Wei (medicalwei) wrote :

Hi, judging just from the SRU test case above, I can confirm that the issue is present on 47.0-1ubuntu4.1 but not on 47.0-1ubuntu4.2. Logs from both versions are attached.

However, I have no idea what the root cause of the permission change on the /tmp/.X11-unix directory is. I am only here to unblock the upcoming SRU upload.

tags: added: verification-done verification-done-oracular
removed: verification-needed verification-needed-oracular
Yao Wei (medicalwei) wrote :

@nteodosio regarding the issue from Gre0, I found it reproducible if /tmp/.X11-unix is removed before starting GDM, as the ownership of that directory then became "gdm" (user 120); however, after a reboot the issue is gone. I wonder whether this is related to how that directory is created.

Andreas Hasenack (ahasenack) wrote :

@medicalwei thanks for the verification and your findings in the above comment #22. @nteodosio, is this something we need to address in this sru?

Nathan Teodosio (nteodosio) wrote :

I'm conflicted, because what Yao and Gre0 describe would be a regression, but I cannot reproduce it in a clean install of 24.10 with mutter *u4.2 installed. Rather, after doing as Yao described (namely systemctl stop gdm; rm -r /tmp/.X11-unix; systemctl start gdm; then log in) I could log in fine, Mutter was fine.

However, in that case GDM/Gnome-Shell forgets about Wayland apparently, as the classical X cursor is shown while it loads, and there is no cog wheel to select the session type, and after logging in we are indeed in X11 Mutter (as confirmed by xeyes). Anyway, this behavior is exactly the same on my side for the versions in -updates and -proposed, so I can detect no problem with the upload.

Alessandro Astone (aleasto) wrote :

> However, in that case GDM/Gnome-Shell forgets about Wayland apparently, as the classical X cursor is shown while it loads, and there is no cog wheel to select the session type, and after logging in we are indeed in X11 Mutter

When GDM hides Wayland, it is because it has tried to start a Wayland session but failed; after failing, it falls back to Xorg. So presumably, that would indicate that you managed to reproduce an issue where Xwayland fails to start because of /tmp/.X11-unix and brings down the Wayland session altogether.

Daniel van Vugt (vanvugt) wrote :

See also bug 2089701.

Yao Wei (medicalwei) wrote :

@nteodosio: who owns /tmp/.X11-unix after you login? in the case I can reproduce it is owned by gdm, and I can confirm it is a regression compared to *u4.1 as in *u4.1 it does not happen if /tmp/.X11-unix is owned by gdm.

In *u4.2, if /tmp/.X11-unix is owned by neither gdm, root, nor the user, I am able to log in to the desktop, but only in X11 (no cogwheel present).

nvidia-driver-560 is used, in case it is relevant.

If /tmp/.X11-unix is owned by gdm, the gdm login screen seems to be shown in a Wayland session and I can see a cogwheel, but I cannot log in to either a Wayland or an X11 session.

Should we move this to verification-failed?

Nathan Teodosio (nteodosio) wrote :

> who owns /tmp/.X11-unix after you login? in the case I can reproduce it is owned by gdm, and I can confirm it is a regression compared to *u4.1 as in *u4.1 it does not happen if /tmp/.X11-unix is owned by gdm.

Something must differ between our setups then, because the exact same procedure gives different results... I'm testing in a virtual machine.

After 'systemctl stop gdm; rm -r /tmp/.X11-unix; systemctl start gdm':

  File: /tmp/.X11-unix/
  Size: 80 Blocks: 0 IO Block: 4096 directory
Device: 0,32 Inode: 276 Links: 2
Access: (1777/drwxrwxrwt) Uid: ( 120/ gdm) Gid: ( 122/ gdm)
Access: 2025-02-27 08:35:02.481540724 +0100
Modify: 2025-02-27 08:35:02.861535505 +0100
Change: 2025-02-27 08:35:02.861535505 +0100
 Birth: 2025-02-27 08:35:02.481540724 +0100

After logging in, same thing.

By the way, I don't see

> "Failed to start X Wayland: Wrong ownership for directory "/tmp/.X11-unix": 120.#012Ownership of "/tmp": 0.#012Current user: 1000."

in your 4.2 log, but I see

> (gnome-shell:9421): libmutter-ERROR **: 14:57:37.313: Failed to start X Wayland: Directory "/tmp/.X11-unix" is not writable

in your 4.1 log, so looks like it's not the same failure Gre0 saw.

> Should we move this to verification-failed?

I think that is the safest thing to do.

Yao Wei (medicalwei)
tags: added: verification-failed verification-failed-oracular
removed: verification-done verification-done-oracular
Daniel van Vugt (vanvugt) wrote :

Nathan, please let me know if you would like this fix removed in 4.3 or if you will retry it.

Nathan Teodosio (nteodosio) wrote :

Daniel, given the verification-failed, please remove the fix.

Daniel van Vugt (vanvugt) wrote :

No worries, done:
https://salsa.debian.org/gnome-team/mutter/-/commit/ec587865e9a947f3a860ba145e3b56629d59822c

While this is still one of the top crashers in 24.10, it is nowhere to be seen in 25.04 yet. So as 24.10 reaches EOL in July, if 25.04 remains fixed then we won't have to fix anything else here.

Changed in mutter (Ubuntu Oracular):
status: Fix Committed → Won't Fix
milestone: oracular-updates → none
Alessandro Astone (aleasto) wrote :

> it is nowhere to be seen in 25.04 yet

Well, the patch is included in 25.04 (except it was accidentally dropped on the GNOME 48 update, and now restored).

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mutter - 47.0-1ubuntu4.3

---------------
mutter (47.0-1ubuntu4.3) oracular; urgency=medium

  [ Alessandro Astone ]
  * debian: Update metadata for ubuntu/oracular branch
  * Backport patch to fix touch drag&drop (LP: #1966635)
  * Backport patch to fix touch grab (LP: #2089843)
  * Backport patch to fix opening popups with touch input (LP: #2091159)

  [ Yao Wei (魏銘廷) ]
  * debian/patches: Synchronize touchscreen enabled state when adding it to
    device mapper, which fixes suspend/resume related issue (LP: #2087831)

  [ Daniel van Vugt ]
  * Drop the previous fix attempted in 47.0-1ubuntu4.2 because verification
    failed and rather than retry it right now we just want to unblock the
    above fixes.

 -- Daniel van Vugt <email address hidden> Thu, 06 Mar 2025 14:06:50 +0800

Changed in mutter (Ubuntu Oracular):
status: Won't Fix → Fix Released
Alessandro Astone (aleasto) wrote :

Looks like this bug was incorrectly marked Fix Released for oracular despite the fix being dropped from a later upload. I've reset the status to "Won't fix" as was decided in comment #30.

Changed in mutter (Ubuntu Oracular):
status: Fix Released → Won't Fix
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in mutter (Ubuntu Noble):
status: New → Confirmed
Changed in mutter (Ubuntu Noble):
importance: Undecided → High
Changed in mutter (Ubuntu Noble):
assignee: nobody → Alessandro Astone (aleasto)
milestone: none → noble-updates
tags: added: udeng-8454