GNU bug report logs - #77373
[shepherd] Deadlock between service-controller and service-builtin-logger

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: guix; Reported by: nathan <nathan_mail@HIDDEN>; Done: Ludovic Courtès <ludo@HIDDEN>; Maintainer for guix is bug-guix@HIDDEN.
bug closed, send any further explanations to 77373 <at> debbugs.gnu.org and nathan <nathan_mail@HIDDEN> Request was from Ludovic Courtès <ludo@HIDDEN> to control <at> debbugs.gnu.org. Full text available.

Message received at 77373 <at> debbugs.gnu.org:


Received: (at 77373) by debbugs.gnu.org; 30 Mar 2025 16:45:01 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sun Mar 30 12:45:01 2025
Received: from localhost ([127.0.0.1]:37919 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1tyvmO-0001Y5-Ob
	for submit <at> debbugs.gnu.org; Sun, 30 Mar 2025 12:45:01 -0400
Received: from hera.aquilenet.fr ([185.233.100.1]:53876)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.84_2) (envelope-from <ludo@HIDDEN>) id 1tyvmM-0001Xo-Iz
 for 77373 <at> debbugs.gnu.org; Sun, 30 Mar 2025 12:44:59 -0400
Received: from localhost (localhost [127.0.0.1])
 by hera.aquilenet.fr (Postfix) with ESMTP id C64B6201;
 Sun, 30 Mar 2025 18:44:51 +0200 (CEST)
Authentication-Results: hera.aquilenet.fr;
	none
X-Virus-Scanned: Debian amavis at hera.aquilenet.fr
Received: from hera.aquilenet.fr ([127.0.0.1])
 by localhost (hera.aquilenet.fr [127.0.0.1]) (amavis, port 10024) with ESMTP
 id E2tF6O_KYFaZ; Sun, 30 Mar 2025 18:44:51 +0200 (CEST)
Received: from ribbon (91-160-117-201.subs.proxad.net [91.160.117.201])
 by hera.aquilenet.fr (Postfix) with ESMTPSA id C8E481AD;
 Sun, 30 Mar 2025 18:44:50 +0200 (CEST)
From: =?utf-8?Q?Ludovic_Court=C3=A8s?= <ludo@HIDDEN>
To: nathan <nathan_mail@HIDDEN>
Subject: Re: bug#77373: [shepherd] Deadlock between service-controller and
 service-builtin-logger
In-Reply-To: <7fbb2aa8-ac2e-4c6c-93ca-12b834ecfce3@HIDDEN> (nathan's
 message of "Sun, 30 Mar 2025 16:16:15 +0000")
References: <f588d1e5-38bc-4cf6-b4fe-e679380ffaf3@HIDDEN>
 <87iknqfuif.fsf@HIDDEN>
 <7fbb2aa8-ac2e-4c6c-93ca-12b834ecfce3@HIDDEN>
Date: Sun, 30 Mar 2025 18:44:50 +0200
Message-ID: <87bjtifoxp.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Rspamd-Queue-Id: C64B6201
X-Spamd-Result: default: False [4.90 / 15.00]; SPAM_FLAG(5.00)[];
 BAYES_HAM(-3.00)[99.99%]; NEURAL_SPAM(3.00)[1.000];
 MIME_GOOD(-0.10)[text/plain]; RCVD_COUNT_TWO(0.00)[2];
 FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+];
 RCPT_COUNT_TWO(0.00)[2]; ARC_NA(0.00)[];
 TO_MATCH_ENVRCPT_ALL(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[];
 RCVD_TLS_ALL(0.00)[]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[];
 MID_RHS_MATCH_FROM(0.00)[]
X-Spam-Level: ****
X-Rspamd-Action: no action
X-Spamd-Bar: ++++
X-Rspamd-Server: hera
X-Spam-Score: 1.0 (+)
X-Debbugs-Envelope-To: 77373
Cc: 77373 <at> debbugs.gnu.org
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -0.0 (/)

Hey,

nathan <nathan_mail@HIDDEN> skribis:

> happy sunday
>
>> Yes, but only if =E2=80=98pid=E2=80=99 hasn=E2=80=99t been cached before=
, which I think would
>> mean that not a single line was logged before stopping the service.
>
> doesn't get cached if no output is read while service is 'running.
> fork+exec-command: if the pid-file doesnt show up immediately, there is a=
n entire 1 second sleep. The logger can easily read the output while the se=
rvice is still in 'starting
> also: if the service doesn't flush stdout, we dont get its output until i=
t dies. ('stopping)
>
>> Could you explain exactly how that happens (sequence of actions leading
>> to the deadlock) and share the relevant /var/log/messages excerpt?
>
> ./shepherd --socket /tmp/s2/mysocket --config <path>

Yup, I see.

I believe a5d9a41aa0bfadea9617df4411b86c6ea0c525bd fixes it.  I added a
test case that=E2=80=99s similar to what you have: the service only starts
logging when being stopped, which makes it very likely that the logger
will receive lines of log after the service has terminated.  It would
trigger the deadlock 100% of the time.

Let me know what you think.

Thanks a lot!

Ludo=E2=80=99.




Information forwarded to bug-guix@HIDDEN:
bug#77373; Package guix. Full text available.

Message received at 77373 <at> debbugs.gnu.org:


Received: (at 77373) by debbugs.gnu.org; 30 Mar 2025 16:16:26 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sun Mar 30 12:16:26 2025
Received: from localhost ([127.0.0.1]:37877 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1tyvKj-0000AZ-Pe
	for submit <at> debbugs.gnu.org; Sun, 30 Mar 2025 12:16:26 -0400
Received: from mail.nborghese.com ([2001:19f0:5:48e0:5400:4ff:fe08:e8a8]:53502)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.84_2) (envelope-from <nathan_mail@HIDDEN>)
 id 1tyvKb-0000A5-81
 for 77373 <at> debbugs.gnu.org; Sun, 30 Mar 2025 12:16:17 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; s=062122; bh=CepH04ITOBSfn
 /W2RPTX2OCaaz8N2NPw7J6s+vMbBMA=;
 h=in-reply-to:from:references:cc:to:
 subject:date; d=nborghese.com; b=dnT9e3dqptIhHKMVLnQnmvaK6BCWi5bEvHe6Q
 l8PaptOguLjeyGURdaWBqzkFTHuwDomFCiApMkn4ElccQSPj153JulLnIUrgmXT+nT1/Ds
 aAdkpT90M6hTsq/5cuejxPVgrVDupzQM2u/Rtp1aqPHxMDvKtpokdojatARm0Z+c=
Received: by nborghese.com (OpenSMTPD) with ESMTPSA id c9fd09f6
 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); 
 Sun, 30 Mar 2025 16:16:05 +0000 (UTC)
Content-Type: multipart/mixed; boundary="------------1v0hkmEIoyoy2i0TS0acCOVP"
Message-ID: <7fbb2aa8-ac2e-4c6c-93ca-12b834ecfce3@HIDDEN>
Date: Sun, 30 Mar 2025 16:16:15 +0000
MIME-Version: 1.0
User-Agent: Icedove Daily
Subject: Re: bug#77373: [shepherd] Deadlock between service-controller and
 service-builtin-logger
To: =?UTF-8?Q?Ludovic_Court=C3=A8s?= <ludo@HIDDEN>
References: <f588d1e5-38bc-4cf6-b4fe-e679380ffaf3@HIDDEN>
 <87iknqfuif.fsf@HIDDEN>
Content-Language: en-US
From: nathan <nathan_mail@HIDDEN>
In-Reply-To: <87iknqfuif.fsf@HIDDEN>
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 77373
Cc: 77373 <at> debbugs.gnu.org
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.0 (-)

This is a multi-part message in MIME format.
--------------1v0hkmEIoyoy2i0TS0acCOVP
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

happy sunday

> Yes, but only if ‘pid’ hasn’t been cached before, which I think would
> mean that not a single line was logged before stopping the service.

doesn't get cached if no output is read while service is 'running.
fork+exec-command: if the pid-file doesnt show up immediately, there is an entire 1 second sleep. The logger can easily read the output while the service is still in 'starting
also: if the service doesn't flush stdout, we dont get its output until it dies. ('stopping)

> Could you explain exactly how that happens (sequence of actions leading
> to the deadlock) and share the relevant /var/log/messages excerpt?

./shepherd --socket /tmp/s2/mysocket --config <path>

GNU Shepherd 1.0.3 (Guile 3.0.9, x86_64-unknown-linux-gnu)
Starting service root...
Service root started.
Service root running with value #<<process> id: 18114 command: #f>.
Service root has been started.
starting services...
Configuration successfully loaded from '<path>'.
Starting service myservice...
Service myservice has been started.
Service myservice started.
Successfully started 1 service in the background.
Service myservice running with value #<<process> id: 18132 command: ("/tmp/a.out")>.

in other terminal:
./herd -s /tmp/s2/mysocket status myservice
<status is 'running and doesnt show any "Recent messages">

./herd -s /tmp/s2/mysocket stop myservice
works fine

more shepherd output:
Stopping service myservice...
Service myservice stopped.
Service myservice is now stopped.

in other terminal, all of these hang:
./herd -s /tmp/s2/mysocket status myservice
./herd -s /tmp/s2/mysocket stop myservice
./herd -s /tmp/s2/mysocket start myservice
./herd -s /tmp/s2/mysocket status
./herd -s /tmp/s2/mysocket stop root

does not hang:
./herd -s /tmp/s2/mysocket status aaaaa
herd: error: service 'aaaaa' could not be found

I have to kill -9 shepherd.

c source code attached for the test program.
I mentioned two possibilities above, and this is scenario #2. stdout not flushed. I also had what is probably scenario #1 with a different program.


On 3/30/25 2:44 PM, Ludovic Courtès wrote:
> Hi nathan,
> 
> nathan <nathan_mail@HIDDEN> skribis:
> 
>> I definitely have a deadlock problem with Shepherd and I do believe I've found it.
>> shepherd 1.0.3
> 
> Could you explain exactly how that happens (sequence of actions leading
> to the deadlock) and share the relevant /var/log/messages excerpt?
> 
>> This is in service-controller when the service has been stopped:
>> (when logger
>>    (put-message logger 'terminate))
>> But in service-builtin-logger, this is called every time a line is read:
>> (or pid
>>      (and service
>>           (eq? 'running (service-status service))
>>           (match (service-running-value service)
>>             ((? process? process)
>>              (process-id process))
>>             (value
>>              value))))
>>
>> service-status -> service-control-message -> put-message to the service
>> The fibers documentation says put-message is blocking. Surely this is a deadlock.
> 
> Yes, but only if ‘pid’ hasn’t been cached before, which I think would
> mean that not a single line was logged before stopping the service.
> 
> I’ll take a closer look.
> 
> Thanks for reporting it and for investigating!
> 
> Ludo’.
--------------1v0hkmEIoyoy2i0TS0acCOVP
Content-Type: text/x-csrc; charset=UTF-8; name="b.c"
Content-Disposition: attachment; filename="b.c"
Content-Transfer-Encoding: base64

I2luY2x1ZGUgPHN0ZGlvLmg+CiNpbmNsdWRlIDxzaWduYWwuaD4KI2luY2x1ZGUgPHVuaXN0
ZC5oPgoKdm9sYXRpbGUgaW50IGRvbmU9MDsKdm9pZCBoYW5kbGUoaW50IGEpewoJZG9uZT0x
Owp9CmludCBtYWluKGludCBhcmdjLCBjaGFyICoqYXJndil7CglwcmludGYoIiVkXG4iLChp
bnQpZ2V0cGlkKCkpOwoJc3RydWN0IHNpZ2FjdGlvbiBhPXswfTsKCWEuc2FfaGFuZGxlcj0m
aGFuZGxlOwoJc2lnYWN0aW9uKFNJR1RFUk0sJmEsTlVMTCk7CgkvLyBjaGFuZ2UgdGhpcyB0
byBpPDEwMDAgdG8gZml4Cglmb3IgKGludCBpPTA7aTwxMDA7aSsrKXsKCQlwdXRzKCJ0ZXN0
MCIpOwoJfQoJLy8gb3IgYWRkIHRoaXMgbGluZSB0byBmaXgKCS8vIGZmbHVzaChzdGRvdXQp
OwoJd2hpbGUoIWRvbmUpewoJCXVzbGVlcCgxMCk7Cgl9Cglmb3IgKGludCBpPTA7aTwxMDA7
aSsrKXsKCQlwdXRzKCJ0ZXN0MSIpOwoJfQp9Cg==

--------------1v0hkmEIoyoy2i0TS0acCOVP--




Information forwarded to bug-guix@HIDDEN:
bug#77373; Package guix. Full text available.

Message received at 77373 <at> debbugs.gnu.org:


Received: (at 77373) by debbugs.gnu.org; 30 Mar 2025 14:44:37 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sun Mar 30 10:44:37 2025
Received: from localhost ([127.0.0.1]:37699 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1tytts-0004Ah-Vn
	for submit <at> debbugs.gnu.org; Sun, 30 Mar 2025 10:44:37 -0400
Received: from hera.aquilenet.fr ([185.233.100.1]:33622)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.84_2) (envelope-from <ludo@HIDDEN>) id 1tyttq-0004AM-16
 for 77373 <at> debbugs.gnu.org; Sun, 30 Mar 2025 10:44:34 -0400
Received: from localhost (localhost [127.0.0.1])
 by hera.aquilenet.fr (Postfix) with ESMTP id 6096D2E6;
 Sun, 30 Mar 2025 16:44:27 +0200 (CEST)
Authentication-Results: hera.aquilenet.fr;
	none
X-Virus-Scanned: Debian amavis at hera.aquilenet.fr
Received: from hera.aquilenet.fr ([127.0.0.1])
 by localhost (hera.aquilenet.fr [127.0.0.1]) (amavis, port 10024) with ESMTP
 id 11xKVmX-tEtU; Sun, 30 Mar 2025 16:44:26 +0200 (CEST)
Received: from ribbon (91-160-117-201.subs.proxad.net [91.160.117.201])
 by hera.aquilenet.fr (Postfix) with ESMTPSA id 77F9A1AD;
 Sun, 30 Mar 2025 16:44:26 +0200 (CEST)
From: =?utf-8?Q?Ludovic_Court=C3=A8s?= <ludo@HIDDEN>
To: nathan <nathan_mail@HIDDEN>
Subject: Re: bug#77373: [shepherd] Deadlock between service-controller and
 service-builtin-logger
In-Reply-To: <f588d1e5-38bc-4cf6-b4fe-e679380ffaf3@HIDDEN> (nathan's
 message of "Sat, 29 Mar 2025 23:41:24 +0000")
References: <f588d1e5-38bc-4cf6-b4fe-e679380ffaf3@HIDDEN>
Date: Sun, 30 Mar 2025 16:44:24 +0200
Message-ID: <87iknqfuif.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Rspamd-Queue-Id: 6096D2E6
X-Spamd-Result: default: False [4.90 / 15.00]; SPAM_FLAG(5.00)[];
 BAYES_HAM(-3.00)[99.99%]; NEURAL_SPAM(3.00)[1.000];
 MIME_GOOD(-0.10)[text/plain]; FROM_EQ_ENVFROM(0.00)[];
 RCVD_COUNT_TWO(0.00)[2]; RCPT_COUNT_TWO(0.00)[2];
 MIME_TRACE(0.00)[0:+]; ARC_NA(0.00)[];
 TO_MATCH_ENVRCPT_ALL(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[];
 RCVD_TLS_ALL(0.00)[];
 URIBL_BLOCKED(0.00)[localhost:rdns,localhost:helo];
 TO_DN_SOME(0.00)[]; FROM_HAS_DN(0.00)[];
 MID_RHS_MATCH_FROM(0.00)[]
X-Spam-Level: ****
X-Rspamd-Action: no action
X-Spamd-Bar: ++++
X-Rspamd-Server: hera
X-Spam-Score: 1.0 (+)
X-Debbugs-Envelope-To: 77373
Cc: 77373 <at> debbugs.gnu.org
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -0.0 (/)

Hi nathan,

nathan <nathan_mail@HIDDEN> skribis:

> I definitely have a deadlock problem with Shepherd and I do believe I've =
found it.
> shepherd 1.0.3

Could you explain exactly how that happens (sequence of actions leading
to the deadlock) and share the relevant /var/log/messages excerpt?

> This is in service-controller when the service has been stopped:
> (when logger
>   (put-message logger 'terminate))
> But in service-builtin-logger, this is called every time a line is read:
> (or pid
>     (and service
>          (eq? 'running (service-status service))
>          (match (service-running-value service)
>            ((? process? process)
>             (process-id process))
>            (value
>             value))))
>
> service-status -> service-control-message -> put-message to the service
> The fibers documentation says put-message is blocking. Surely this is a d=
eadlock.

Yes, but only if =E2=80=98pid=E2=80=99 hasn=E2=80=99t been cached before, w=
hich I think would
mean that not a single line was logged before stopping the service.

I=E2=80=99ll take a closer look.

Thanks for reporting it and for investigating!

Ludo=E2=80=99.




Information forwarded to bug-guix@HIDDEN:
bug#77373; Package guix. Full text available.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 29 Mar 2025 23:41:39 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sat Mar 29 19:41:39 2025
Received: from localhost ([127.0.0.1]:33456 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1tyfo3-0008RK-4a
	for submit <at> debbugs.gnu.org; Sat, 29 Mar 2025 19:41:39 -0400
Received: from lists.gnu.org ([2001:470:142::17]:39688)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.84_2) (envelope-from <nathan_mail@HIDDEN>)
 id 1tyfnz-0008R0-N8
 for submit <at> debbugs.gnu.org; Sat, 29 Mar 2025 19:41:36 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <nathan_mail@HIDDEN>)
 id 1tyfnu-0000RW-7H
 for bug-guix@HIDDEN; Sat, 29 Mar 2025 19:41:30 -0400
Received: from mail.nborghese.com ([2001:19f0:5:48e0:5400:4ff:fe08:e8a8])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <nathan_mail@HIDDEN>)
 id 1tyfns-0008N4-Ik
 for bug-guix@HIDDEN; Sat, 29 Mar 2025 19:41:29 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; s=062122; bh=AsldORf0hahvb
 HzDYAeWxs1+GDYE2WZx4A5zDnqNff4=; h=subject:from:to:date;
 d=nborghese.com; b=f1RVHgF1W2vlkDOnSTJyqCe+GzxOXjCNxUP/lDGoROFv3pLSBs+
 o6kwLHN8fNDX0QeyyRFYxx8x930OEI3nRnAiTwtuyYr2ONk07hzYsjkhnufevM1Vssqgrd
 7obNwLvgbAcfJO3P7LCrqV3FThOs1y6LztmsV74hYVZR0qniVM=
Received: by nborghese.com (OpenSMTPD) with ESMTPSA id 9a261ea0
 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO) for <bug-guix@HIDDEN>;
 Sat, 29 Mar 2025 23:41:15 +0000 (UTC)
Message-ID: <f588d1e5-38bc-4cf6-b4fe-e679380ffaf3@HIDDEN>
Date: Sat, 29 Mar 2025 23:41:24 +0000
MIME-Version: 1.0
User-Agent: Icedove Daily
Content-Language: en-US
To: bug-guix@HIDDEN
From: nathan <nathan_mail@HIDDEN>
Subject: [shepherd] Deadlock between service-controller and
 service-builtin-logger
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Received-SPF: pass client-ip=2001:19f0:5:48e0:5400:4ff:fe08:e8a8;
 envelope-from=nathan_mail@HIDDEN; helo=mail.nborghese.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,
 DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001,
 SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-Spam-Score: 0.9 (/)
X-Debbugs-Envelope-To: submit
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -0.1 (/)

I definitely have a deadlock problem with Shepherd and I do believe I've found it.
shepherd 1.0.3

This is in service-controller when the service has been stopped:
(when logger
   (put-message logger 'terminate))
But in service-builtin-logger, this is called every time a line is read:
(or pid
     (and service
          (eq? 'running (service-status service))
          (match (service-running-value service)
            ((? process? process)
             (process-id process))
            (value
             value))))

service-status -> service-control-message -> put-message to the service
The fibers documentation says put-message is blocking. Surely this is a deadlock.

The value of `pid' can be cached and we might avoid the deadlock.
But this happens ONLY if a line of output is read while the service is 'running.
If output only comes while the service is 'starting/'stopping, we will keep calling service-status.

Potential fix:
At the time spawn-service-builtin-logger is called in fork+exec-command, we already know the pid.
Maybe just prepend the pid to the `command' argument if you want it to appear in the log output.





Acknowledgement sent to nathan <nathan_mail@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-guix@HIDDEN. Full text available.
Report forwarded to bug-guix@HIDDEN:
bug#77373; Package guix. Full text available.
Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.
Last modified: Sun, 30 Mar 2025 17:00:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.