GNU bug report logs - #77150
pdumper: reduce pdmp size for 64-bit systems

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: emacs; Reported by: Pip Cet <pipcet@HIDDEN>; dated Fri, 21 Mar 2025 11:43:03 UTC; Maintainer for emacs is bug-gnu-emacs@HIDDEN.

Message received at submit <at> debbugs.gnu.org:


Date: Fri, 21 Mar 2025 11:41:43 +0000
To: bug-gnu-emacs@HIDDEN, Daniel Colascione <dancol@HIDDEN>
From: Pip Cet <pipcet@HIDDEN>
Subject: pdumper: reduce pdmp size for 64-bit systems
Message-ID: <87h63mzk4o.fsf@HIDDEN>

This is a wishlist item (with a PoC patch) for improving pdumper on
64-bit systems.

On such systems, we can save about 2 MB of the pdmp file (reducing it
from 14 MB to 12 MB without nativecomp, or 17 MB to 15 MB with
nativecomp) by storing dump-to-dump relocations for Lisp_Objects and
pointers in the heap image part of the dump, and omitting the
corresponding explicit 32-bit relocations in the relocation part of the
dump.

What makes this possible is that a 64-bit Lisp_Object or pointer is
usually large enough to contain two dump offsets: one to indicate which
object should be referred to after relocation, and another one linking
to the next in-place relocation, allowing us to find all of them by
walking the linked list.
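
A minimal sketch of the encoding described above.  The 28/4/28 bit
split follows the constants used in the proof-of-concept patch (shifts
of 36 and 32, limit 0x10000000), where the link is stored as the
distance back to the previous in-place relocation.  The struct and
function names are hypothetical and do not exist in pdumper.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; only the bit positions come from the patch.  */
#define FIELD_LIMIT ((uint64_t) 1 << 28)   /* 0x10000000 */

struct inplace_reloc
{
  uint32_t target;  /* dump offset of the object referred to */
  unsigned type;    /* Lisp type tag, or 8 for a raw pointer */
  uint32_t delta;   /* distance back to the previous in-place reloc */
};

/* True if both offsets fit into their 28-bit fields.  */
static bool
inplace_reloc_fits (uint32_t target, uint32_t delta)
{
  return target < FIELD_LIMIT && delta < FIELD_LIMIT;
}

/* Pack target (bits 36..63), type (bits 32..35) and delta (bits 0..27)
   into one 64-bit word.  */
static uint64_t
inplace_reloc_pack (struct inplace_reloc r)
{
  assert (inplace_reloc_fits (r.target, r.delta));
  assert (r.type < 16);
  return ((uint64_t) r.target << 36) | ((uint64_t) r.type << 32) | r.delta;
}

/* Inverse of inplace_reloc_pack.  */
static struct inplace_reloc
inplace_reloc_unpack (uint64_t word)
{
  struct inplace_reloc r;
  r.target = (word >> 36) & (FIELD_LIMIT - 1);
  r.type = (word >> 32) & 15;
  r.delta = word & (FIELD_LIMIT - 1);
  return r;
}
```

Packing { target = 0x123456, type = 5, delta = 0x40 } and unpacking the
result yields the same three fields; any offset of 0x10000000 or more
does not fit and has to be handled by an explicit relocation instead.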

There is no measurable performance difference on my system, though I
suspect that older systems will see performance gains because of the
reduction in pdmp size.  (In theory, it's easier to prefetch relocation
data with the old format, but in practice, using offsets rather than
pointers makes this impossible for GCC to achieve on current CPUs.)

Here's a proof of concept.  Is this worth pursuing?

My TODO list:

1. put the initial pointer into a relocation rather than the header
2. investigate possibility of a 32-bit version
3. compress dump-to-emacs relocations
4. use bitfields rather than direct bit manipulation
5. investigate reversing the chain so we don't go through the heap backwards
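
For item 4, one possible shape of a bit-field version is sketched
below.  This is an assumption about what such a change could look like,
not code from the patch: the struct and helper names are hypothetical,
bit-fields of type uint64_t are an implementation-defined extension
(accepted by GCC and Clang), and the order of bit-fields within a word
is implementation-defined as well, so the resulting word is only
meaningful to the same compiler and ABI that wrote it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bit-field view of a compressed relocation.  */
struct inplace_reloc_bits
{
  uint64_t delta : 28;   /* distance back to the previous in-place reloc */
  uint64_t unused : 4;
  uint64_t type : 4;     /* Lisp type tag, or 8 for a raw pointer */
  uint64_t target : 28;  /* dump offset of the object referred to */
};

_Static_assert (sizeof (struct inplace_reloc_bits) == sizeof (uint64_t),
                "compressed relocation must occupy exactly one word");

static struct inplace_reloc_bits
make_bits (uint32_t target, unsigned type, uint32_t delta)
{
  assert (target < 0x10000000 && delta < 0x10000000 && type < 16);
  struct inplace_reloc_bits b
    = { .delta = delta, .unused = 0, .type = type, .target = target };
  return b;
}

/* Copy through memcpy to avoid aliasing problems.  */
static uint64_t
bits_to_word (struct inplace_reloc_bits b)
{
  uint64_t w;
  memcpy (&w, &b, sizeof w);
  return w;
}

static struct inplace_reloc_bits
word_to_bits (uint64_t w)
{
  struct inplace_reloc_bits b;
  memcpy (&b, &w, sizeof b);
  return b;
}
```

The fields survive a round trip through bits_to_word and word_to_bits,
but unlike the explicit shifts, nothing here guarantees which bits of
the word each field occupies.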

Note that the code is written to fall back to the old functions if a
relocation doesn't fit, so unusual dumping scenarios should continue to
work; they just won't benefit as much.  Keeping the old functions also
means there will be fewer conflicts with other changes to the pdumper
code.
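
The fallback can be illustrated with a toy model, assuming a heap of
64-bit slots and a separate table of explicit relocations.  Everything
here (struct toy_dump, toy_fixup, toy_load) is hypothetical and much
simpler than pdumper.c: it handles raw pointers only and omits the type
field.  As in the patch, offset 0 terminates the chain, so no fixup may
live at offset 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIELD_LIMIT 0x10000000u   /* 28-bit limit used by the patch */
#define MAX_RELOCS 16

/* Toy dump: heap slots plus a table of explicit relocations.  */
struct toy_dump
{
  uint64_t *heap;
  uint32_t chain_head;          /* offset of last in-place reloc, 0 if none */
  uint32_t relocs[MAX_RELOCS];  /* slot offsets needing an explicit reloc */
  size_t n_relocs;
};

/* Record that the slot at byte offset OFF must point at byte offset
   TARGET.  OFF must be nonzero, a multiple of 8, and ascending.  */
static void
toy_fixup (struct toy_dump *d, uint32_t off, uint64_t target)
{
  uint32_t delta = off - d->chain_head;
  if (target < FIELD_LIMIT && delta < FIELD_LIMIT)
    {
      /* Fits: store target and back-link in the slot itself.  */
      d->heap[off / 8] = (target << 36) | delta;
      d->chain_head = off;
    }
  else
    {
      /* Doesn't fit: store the plain offset and emit an explicit reloc.  */
      assert (d->n_relocs < MAX_RELOCS);
      d->heap[off / 8] = target;
      d->relocs[d->n_relocs++] = off;
    }
}

/* Resolve all slots for a dump loaded at address BASE.  */
static void
toy_load (struct toy_dump *d, uint64_t base)
{
  /* Walk the in-place chain backwards until offset 0.  */
  uint32_t off = d->chain_head;
  while (off)
    {
      uint64_t word = d->heap[off / 8];
      d->heap[off / 8] = base + (word >> 36);
      off -= word & (FIELD_LIMIT - 1);
    }
  /* Then apply the explicit relocations.  */
  for (size_t i = 0; i < d->n_relocs; i++)
    d->heap[d->relocs[i] / 8] += base;
}
```

With fixups at offsets 8, 24 and 40, where only the one at 24 has a
target of 0x10000000 or more, the relocation table ends up with a single
entry and the other two slots are resolved purely by walking the chain.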

In particular, if there is still interest in working on a version of
pdumper which loads the dump at a fixed address and pre-fills the heap
image with the right pointers/Lisp_Objects (unexec-style), there'd be no
conflict: we just couldn't do both at the same time.

Similarly, the disabled DANGEROUS optimization in
dump_field_lv_or_rawptr would reduce the amount of memory saved.  It's
probably time to remove that code.

From 50712c9a349e97a3f9fdd1daf08e5765799645b1 Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@HIDDEN>
Subject: [PATCH] Reduce pdump size by using in-place relocations on 64-bit
 systems

A 64-bit Lisp_Object is large enough to store both the dump offset of
the Lisp_Object that is meant to appear in a given location and the
dump offset of the next compressed relocation in the list.

* src/pdumper.c (dump_do_fixup): Compress a relocation plus a link to
the previous one into a single Lisp_Object, if it fits.
(dump_do_fixups):
(Fdump_emacs_portable): Keep track of the list of in-place fixup
relocations.
(dump_do_fixup_chain): Replace compressed in-place relocations by
their fixups.
(pdumper_load): Call 'dump_do_fixup_chain' before applying explicit
relocs.
---
 src/pdumper.c | 67 +++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 59 insertions(+), 8 deletions(-)

diff --git a/src/pdumper.c b/src/pdumper.c
index de213130756..5ccac63637a 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -403,6 +403,9 @@ dump_fingerprint (FILE *output, char const *label,
 
   /* Offset of a vector of the dumped hash tables.  */
   dump_off hash_list;
+
+  /* Offset of the last relocation compressed into a Lisp_Object.  */
+  dump_off fixup_chain;
 };
 
 /* Double-ended singly linked list.  */
@@ -3991,10 +3994,11 @@ drain_reloc_list (struct dump_context *ctx,
   ctx->flags = old_flags;
 }
 
-static void
+static dump_off
 dump_do_fixup (struct dump_context *ctx,
                Lisp_Object fixup,
-               Lisp_Object prev_fixup)
+               Lisp_Object prev_fixup,
+	       dump_off prev_fixup_chain)
 {
   enum dump_fixup_type type
     = (enum dump_fixup_type) XFIXNUM (dump_pop (&fixup));
@@ -4013,6 +4017,7 @@ dump_do_fixup (struct dump_context *ctx,
   dump_seek (ctx, dump_fixup_offset);
   intptr_t dump_value;
   bool do_write = true;
+  bool do_link = false;
   switch (type)
     {
     case DUMP_FIXUP_LISP_OBJECT:
@@ -4051,7 +4056,18 @@ dump_do_fixup (struct dump_context *ctx,
           dump_value = dump_recall_object (ctx, arg);
           if (dump_value <= 0)
             error ("fixup object not dumped");
-          if (type == DUMP_FIXUP_LISP_OBJECT)
+	replace:
+	  if (sizeof (Lisp_Object) == 8
+	      && ctx->offset - prev_fixup_chain < 0x10000000
+	      && dump_value < 0x10000000)
+	    {
+	      uint64_t new_dump_value =
+		((dump_value << 36LL) + ((long long)((type == DUMP_FIXUP_LISP_OBJECT) ? (int) XTYPE (arg) : (int) 8) << 32LL) +
+		 (ctx->offset - prev_fixup_chain));
+	      do_link = true;
+	      dump_value = new_dump_value;
+	    }
+          else if (type == DUMP_FIXUP_LISP_OBJECT)
             dump_reloc_dump_to_dump_lv (ctx, ctx->offset, XTYPE (arg));
           else
             dump_reloc_dump_to_dump_ptr_raw (ctx, ctx->offset);
@@ -4062,8 +4078,7 @@ dump_do_fixup (struct dump_context *ctx,
          object.  It knows the exact location it wants, so just
          believe it.  */
       dump_value = dump_off_from_lisp (arg);
-      dump_reloc_dump_to_dump_ptr_raw (ctx, ctx->offset);
-      break;
+      goto replace;
     case DUMP_FIXUP_BIGNUM_DATA:
       {
         eassert (BIGNUMP (arg));
@@ -4081,11 +4096,13 @@ dump_do_fixup (struct dump_context *ctx,
     default:
       emacs_abort ();
     }
+  dump_off ret = do_link ? ctx->offset : prev_fixup_chain;
   if (do_write)
     dump_write (ctx, &dump_value, sizeof (dump_value));
+  return ret;
 }
 
-static void
+static dump_off
 dump_do_fixups (struct dump_context *ctx)
 {
   dump_off saved_offset = ctx->offset;
@@ -4094,13 +4111,15 @@ dump_do_fixups (struct dump_context *ctx)
 			      Qdump_emacs_portable__sort_predicate);
   Lisp_Object prev_fixup = Qnil;
   ctx->fixups = Qnil;
+  dump_off prev_fixup_chain = 0;
   while (!NILP (fixups))
     {
       Lisp_Object fixup = dump_pop (&fixups);
-      dump_do_fixup (ctx, fixup, prev_fixup);
+      prev_fixup_chain = dump_do_fixup (ctx, fixup, prev_fixup, prev_fixup_chain);
       prev_fixup = fixup;
     }
   dump_seek (ctx, saved_offset);
+  return prev_fixup_chain;
 }
 
 static void
@@ -4371,7 +4390,7 @@ DEFUN ("dump-emacs-portable",
   ctx->end_heap = ctx->offset;
 
   /* Make remembered modifications to the dump file itself.  */
-  dump_do_fixups (ctx);
+  dump_off fixup_chain = dump_do_fixups (ctx);
 
   drain_reloc_merger emacs_reloc_merger =
 #ifdef ENABLE_CHECKING
@@ -4410,6 +4429,7 @@ DEFUN ("dump-emacs-portable",
 
   /* Dump is complete.  Go back to the header and write the magic
      indicating that the dump is complete and can be loaded.  */
+  ctx->header.fixup_chain = fixup_chain;
   ctx->header.magic[0] = dump_magic[0];
   dump_seek (ctx, 0);
   dump_write (ctx, &ctx->header, sizeof (ctx->header));
@@ -5636,6 +5656,36 @@ dump_do_emacs_relocation (const uintptr_t dump_base,
     }
 }
 
+static void
+dump_do_fixup_chain (const struct dump_header *const header,
+		     const uintptr_t dump_base)
+{
+  const dump_off fixup_chain = header->fixup_chain;
+  dump_off curr_off = fixup_chain;
+  while (curr_off)
+    {
+      uint64_t dump_value;
+      memcpy (&dump_value,
+              dump_ptr (dump_base, curr_off),
+              sizeof (dump_value));
+      int type = (dump_value >> 32LL) & 15;
+      uintptr_t value = (dump_value >> 36LL) & 0xfffffff;
+      uintptr_t delta = dump_value & 0xfffffff;
+      void *obj_ptr = dump_ptr (dump_base, value);
+      Lisp_Object lv = Qnil;
+      if (type == 8)
+	lv = XIL ((uintptr_t)obj_ptr);
+      else if (type == Lisp_Symbol)
+	lv = make_lisp_symbol (obj_ptr);
+      else
+	lv = make_lisp_ptr (obj_ptr, type);
+      memcpy (dump_ptr (dump_base, curr_off),
+	      &lv,
+	      sizeof lv);
+      curr_off -= delta;
+    }
+}
+
 static void
 dump_do_all_emacs_relocations (const struct dump_header *const header,
 			       const uintptr_t dump_base)
@@ -5831,6 +5881,7 @@ pdumper_load (const char *dump_filename, char *argv0)
   dump_public.start = dump_base;
   dump_public.end = dump_public.start + dump_size;
 
+  dump_do_fixup_chain (header, dump_base);
   dump_do_all_dump_reloc_for_phase (header, dump_base, EARLY_RELOCS);
   dump_do_all_emacs_relocations (header, dump_base);
 
-- 
2.48.1





Acknowledgement sent to Pip Cet <pipcet@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs@HIDDEN. Full text available.
Report forwarded to bug-gnu-emacs@HIDDEN:
bug#77150; Package emacs. Full text available.
Last modified: Fri, 21 Mar 2025 11:45:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.