GNU bug report logs - #73194
ls command converts utf-8 character into escape sequences

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: coreutils; Reported by: Simon Wolfe <sekaihenodoa@HIDDEN>; Keywords: notabug; Done: Paul Eggert <eggert@HIDDEN>; Maintainer for coreutils is bug-coreutils@HIDDEN.
Bug closed; send any further explanations to 73194 <at> debbugs.gnu.org and Simon Wolfe <sekaihenodoa@HIDDEN>. Request was from Paul Eggert <eggert@HIDDEN> to control <at> debbugs.gnu.org.
Added tag(s) notabug. Request was from Paul Eggert <eggert@HIDDEN> to control <at> debbugs.gnu.org.

Message received at 73194 <at> debbugs.gnu.org:


Received: (at 73194) by debbugs.gnu.org; 13 Sep 2024 00:45:50 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Sep 12 20:45:50 2024
Received: from localhost ([127.0.0.1]:42118 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1souRa-0005Na-1j
	for submit <at> debbugs.gnu.org; Thu, 12 Sep 2024 20:45:50 -0400
Received: from relais.domaineinternet.ca ([158.85.89.116]:48266)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <sekaihenodoa@HIDDEN>) id 1souRX-0005NQ-NJ
 for 73194 <at> debbugs.gnu.org; Thu, 12 Sep 2024 20:45:48 -0400
Received: from box24.domaineinternet.ca (box24.domaineinternet.ca
 [72.10.160.82])
 (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by relais.domaineinternet.ca (Postfix) with ESMTPS id 7AECE52537;
 Thu, 12 Sep 2024 20:45:37 -0400 (EDT)
Received: from softbank126145144040.bbtec.net ([126.145.144.40]
 helo=[192.168.3.7])
 by box24.domaineinternet.ca with esmtpsa (TLS1.3) tls TLS_AES_128_GCM_SHA256
 (Exim 4.97.1) (envelope-from <sekaihenodoa@HIDDEN>)
 id 1souRO-0000000H8kW-0i8t; Thu, 12 Sep 2024 20:45:38 -0400
Message-ID: <7e7930f7-9ce5-4983-a525-ed48638de483@HIDDEN>
Date: Fri, 13 Sep 2024 09:45:35 +0900
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
To: P@HIDDEN, 73194 <at> debbugs.gnu.org
Subject: bug#73194: ls command converts utf-8 character into escape sequences
Content-Language: en-US
From: Simon Wolfe <sekaihenodoa@HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Authenticated-Id: sekaihenodoa@HIDDEN
X-Spam-Score: -0.0 (/)
X-Debbugs-Envelope-To: 73194
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.0 (-)

How does ls version 9.4 handle code points that are not yet assigned?

I'm asking because it seems to take about 2 years for changes to reach distros, so it might be a good idea to code for this ahead of time.

For example, if you take U+40500 (񀔀) and type

touch ''$'\361\200\224\200'
ls ''$'\361\200\224\200'

will it show 񀔀 or ''$'\361\200\224\200'?
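
For reference, the octal escapes in the commands above are simply the UTF-8 bytes of the code point written in base 8. The following is a minimal sketch in POSIX shell that derives them; utf8_octal is a hypothetical helper name (not something coreutils provides), and it performs no validation of surrogates or out-of-range values.

```shell
#!/bin/sh
# utf8_octal CODEPOINT
# Print the UTF-8 encoding of a code point as backslash-octal escapes.
# Hypothetical helper for illustration; accepts decimal or 0x-prefixed hex.
utf8_octal() {
    cp=$(( $1 ))
    if [ "$cp" -lt 128 ]; then
        # 1 byte: 0xxxxxxx
        printf '\\%03o\n' "$cp"
    elif [ "$cp" -lt 2048 ]; then
        # 2 bytes: 110xxxxx 10xxxxxx
        printf '\\%03o\\%03o\n' \
            $(( 192 | (cp >> 6) )) \
            $(( 128 | (cp & 63) ))
    elif [ "$cp" -lt 65536 ]; then
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        printf '\\%03o\\%03o\\%03o\n' \
            $(( 224 | (cp >> 12) )) \
            $(( 128 | ((cp >> 6) & 63) )) \
            $(( 128 | (cp & 63) ))
    else
        # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        printf '\\%03o\\%03o\\%03o\\%03o\n' \
            $(( 240 | (cp >> 18) )) \
            $(( 128 | ((cp >> 12) & 63) )) \
            $(( 128 | ((cp >> 6) & 63) )) \
            $(( 128 | (cp & 63) ))
    fi
}

utf8_octal 0x40500   # prints \361\200\224\200
utf8_octal 0x318DF   # prints \360\261\243\237
```

Both results match the byte sequences quoted in this thread, which suggests the escaped names are a faithful byte-for-byte rendering rather than a re-encoding.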




Information forwarded to bug-coreutils@HIDDEN:
bug#73194; Package coreutils. Full text available.

Message received at 73194 <at> debbugs.gnu.org:


Received: (at 73194) by debbugs.gnu.org; 12 Sep 2024 13:14:48 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Sep 12 09:14:48 2024
Received: from localhost ([127.0.0.1]:40397 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1sojep-0000l0-PY
	for submit <at> debbugs.gnu.org; Thu, 12 Sep 2024 09:14:48 -0400
Received: from relais.domaineinternet.ca ([158.85.89.116]:40649)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <sekaihenodoa@HIDDEN>) id 1sohZU-0002Hv-13
 for 73194 <at> debbugs.gnu.org; Thu, 12 Sep 2024 07:01:08 -0400
Received: from box24.domaineinternet.ca (box24.domaineinternet.ca
 [72.10.160.82])
 (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by relais.domaineinternet.ca (Postfix) with ESMTPS id 0F5FC521FE;
 Thu, 12 Sep 2024 07:00:57 -0400 (EDT)
Received: from softbank126145144040.bbtec.net ([126.145.144.40]
 helo=[192.168.3.7])
 by box24.domaineinternet.ca with esmtpsa (TLS1.3) tls TLS_AES_128_GCM_SHA256
 (Exim 4.97.1) (envelope-from <sekaihenodoa@HIDDEN>)
 id 1sohZK-0000000BnUo-2QRu; Thu, 12 Sep 2024 07:00:58 -0400
Message-ID: <f8c040d9-9025-4cdd-b1bf-3fca9a6f9fa5@HIDDEN>
Date: Thu, 12 Sep 2024 20:00:51 +0900
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: bug#73194: ls command converts utf-8 character into escape
 sequences
Content-Language: en-US
To: =?UTF-8?Q?P=C3=A1draig_Brady?= <P@HIDDEN>, 73194 <at> debbugs.gnu.org
References: <0da42847-ba02-417b-979c-2ab79863f31f@HIDDEN>
 <523dd8c5-6faf-4ce2-b918-6641733e8fd4@HIDDEN>
From: Simon Wolfe <sekaihenodoa@HIDDEN>
In-Reply-To: <523dd8c5-6faf-4ce2-b918-6641733e8fd4@HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Authenticated-Id: sekaihenodoa@HIDDEN
X-Spam-Score: -0.0 (/)
X-Debbugs-Envelope-To: 73194
X-Mailman-Approved-At: Thu, 12 Sep 2024 09:14:46 -0400
X-Spam-Score: -1.0 (-)

On 2024/09/12 19:42, Pádraig Brady wrote:
> On 12/09/2024 11:16, Simon Wolfe wrote:
>> I have one file name that uses Unicode character U+318DF, which is in the Tertiary Ideographic Plane, more precisely CJK Unified Ideographs Extension H.
>>
>> touch 𱣟
>> ls
>>
>> returns:
>>
>> ''$'\360\261\243\237'
>>
>> Extension H was introduced in Unicode 15.0 in 2022.
>>
>> I also notice that this bug occurs with any character in Extension I (introduced in 2023).
>>
>> Extension G seems to work okay.
> 
> ls 9.4 works as expected for me with glibc-2.39 in a UTF-8 locale.
> I.e. that file is displayed directly.
> If I switch to a non-UTF-8 locale, ls displays the escaped form above
> (which works in all locales, BTW).
> 
>    $ touch ''$'\360\261\243\237'
>    $ ls ''$'\360\261\243\237'
>    𱣟
>    $ LC_ALL=C ls ''$'\360\261\243\237'
>    ''$'\360\261\243\237'
> 
> So I suspect your system libs are not updated to recognize this character,
> hence the fallback format is used.
> 
> cheers,
> Pádraig.
> 
I am on a UTF-8 locale (ja_JP.utf8), though with glibc-2.35. I am not sure I can upgrade without breaking dependencies.

Thanks for checking, anyway.







Information forwarded to bug-coreutils@HIDDEN:
bug#73194; Package coreutils. Full text available.

Message received at 73194 <at> debbugs.gnu.org:


Received: (at 73194) by debbugs.gnu.org; 12 Sep 2024 10:43:25 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Sep 12 06:43:25 2024
Received: from localhost ([127.0.0.1]:40250 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1sohIK-0001Gn-Qd
	for submit <at> debbugs.gnu.org; Thu, 12 Sep 2024 06:43:25 -0400
Received: from mail-ej1-f45.google.com ([209.85.218.45]:60560)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <pixelbeat@HIDDEN>) id 1sohII-0001GZ-3o
 for 73194 <at> debbugs.gnu.org; Thu, 12 Sep 2024 06:43:22 -0400
Received: by mail-ej1-f45.google.com with SMTP id
 a640c23a62f3a-a8d60e23b33so106796566b.0
 for <73194 <at> debbugs.gnu.org>; Thu, 12 Sep 2024 03:43:14 -0700 (PDT)
Received: from [192.168.1.76]
 (86-44-211-146-dynamic.agg2.lod.rsl-rtd.eircom.net. [86.44.211.146])
 by smtp.googlemail.com with ESMTPSA id
 a640c23a62f3a-a8d25c61185sm729117366b.100.2024.09.12.03.42.07
 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
 Thu, 12 Sep 2024 03:42:07 -0700 (PDT)
Message-ID: <523dd8c5-6faf-4ce2-b918-6641733e8fd4@HIDDEN>
Date: Thu, 12 Sep 2024 11:42:06 +0100
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird Beta
Subject: Re: bug#73194: ls command converts utf-8 character into escape
 sequences
To: Simon Wolfe <sekaihenodoa@HIDDEN>, 73194 <at> debbugs.gnu.org
References: <0da42847-ba02-417b-979c-2ab79863f31f@HIDDEN>
Content-Language: en-US
From: =?UTF-8?Q?P=C3=A1draig_Brady?= <P@HIDDEN>
In-Reply-To: <0da42847-ba02-417b-979c-2ab79863f31f@HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Spam-Score: 0.2 (/)
X-Debbugs-Envelope-To: 73194
X-Spam-Score: -0.8 (/)

On 12/09/2024 11:16, Simon Wolfe wrote:
> I have one file name that uses Unicode character U+318DF, which is in the Tertiary Ideographic Plane, more precisely CJK Unified Ideographs Extension H.
> 
> touch 𱣟
> ls
> 
> returns:
> 
> ''$'\360\261\243\237'
> 
> Extension H was introduced in Unicode 15.0 in 2022.
> 
> I also notice that this bug occurs with any character in Extension I (introduced in 2023).
> 
> Extension G seems to work okay.

ls 9.4 works as expected for me with glibc-2.39 in a UTF-8 locale.
I.e. that file is displayed directly.
If I switch to a non-UTF-8 locale, ls displays the escaped form above
(which works in all locales, BTW).

   $ touch ''$'\360\261\243\237'
   $ ls ''$'\360\261\243\237'
   𱣟
   $ LC_ALL=C ls ''$'\360\261\243\237'
   ''$'\360\261\243\237'

So I suspect your system libs are not updated to recognize this character,
hence the fallback format is used.
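
To make the comparison above repeatable without depending on a terminal, something like the following sketch may help. It assumes a reasonably recent GNU ls that accepts --quoting-style=shell-escape (the style used for the escaped form shown above); render_name is a hypothetical helper name, and the last two commands only print context that is useful in a report.

```shell
#!/bin/sh
# render_name LOCALE DIR
# Hypothetical helper: list DIR with shell-escape quoting under LOCALE,
# so the result does not depend on whether stdout is a terminal.
# Assumes GNU ls; other ls implementations may reject the long option.
render_name() {
    LC_ALL=$1 ls --quoting-style=shell-escape "$2"
}

dir=$(mktemp -d)
# Create a file whose name is the UTF-8 encoding of U+318DF.
touch "$dir/$(printf '\360\261\243\237')"

# In the C locale the bytes are not printable, so the escaped form appears.
render_name C "$dir"

# In the user's own locale the result depends on the libc's character tables.
render_name "${LANG:-C}" "$dir"

# Context worth attaching to a report: libc version and active charmap.
getconf GNU_LIBC_VERSION 2>/dev/null || echo "libc does not report a glibc version"
locale charmap

rm -rf "$dir"
```

If the second listing shows the escaped form even though the charmap is UTF-8, that would be consistent with the libc not yet classifying the character as printable.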

cheers,
Pádraig.





Information forwarded to bug-coreutils@HIDDEN:
bug#73194; Package coreutils. Full text available.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 12 Sep 2024 10:36:34 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Sep 12 06:36:34 2024
Received: from localhost ([127.0.0.1]:40246 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1sohBh-0000y4-W6
	for submit <at> debbugs.gnu.org; Thu, 12 Sep 2024 06:36:34 -0400
Received: from lists.gnu.org ([209.51.188.17]:37444)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <towo@HIDDEN>) id 1sohBf-0000xv-Ll
 for submit <at> debbugs.gnu.org; Thu, 12 Sep 2024 06:36:32 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <towo@HIDDEN>) id 1sohBX-0006vx-Nq
 for bug-coreutils@HIDDEN; Thu, 12 Sep 2024 06:36:23 -0400
Received: from mout.kundenserver.de ([212.227.126.131])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <towo@HIDDEN>) id 1sohBV-0002v6-Qw
 for bug-coreutils@HIDDEN; Thu, 12 Sep 2024 06:36:23 -0400
Received: from [192.168.178.51] ([178.25.168.171]) by mrelayeu.kundenserver.de
 (mreue009 [212.227.15.167]) with ESMTPSA (Nemesis) id
 1MEmAV-1smHfe1yUF-000ylO for <bug-coreutils@HIDDEN>; Thu, 12 Sep 2024
 12:36:17 +0200
Message-ID: <2cf1e91f-5794-4775-8a02-06e0c4b586b2@HIDDEN>
Date: Thu, 12 Sep 2024 12:36:05 +0200
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: bug#73194: ls command converts utf-8 character into escape
 sequences
To: bug-coreutils@HIDDEN
References: <0da42847-ba02-417b-979c-2ab79863f31f@HIDDEN>
From: Thomas Wolff <towo@HIDDEN>
In-Reply-To: <0da42847-ba02-417b-979c-2ab79863f31f@HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: quoted-printable
Received-SPF: pass client-ip=212.227.126.131; envelope-from=towo@HIDDEN;
 helo=mout.kundenserver.de
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,
 DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1,
 RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001,
 RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001,
 SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-Spam-Score: -1.3 (-)
X-Debbugs-Envelope-To: submit
X-Spam-Score: -2.3 (--)


On 12.09.2024 at 12:16, Simon Wolfe wrote:
> I have one file name that uses Unicode character U+318DF, which is in
> the Tertiary Ideographic Plane, more precisely CJK Unified Ideographs
> Extension H.
>
> touch 𱣟
> ls
>
> returns:
>
> ''$'\360\261\243\237'
I use a wrapper with my favourite options and a pipe, which stops ls from
adapting its output to the terminal:
ls | cat

>
> Extension H was introduced in Unicode 15.0 in 2022.
>
> I also notice that this bug occurs with any character in Extension I
> (introduced in 2023).
>
> Extension G seems to work okay.
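
The effect of the `ls | cat` pipe mentioned above can probably also be had with explicit options. The sketch below assumes GNU ls, where --quoting-style=literal and --show-control-chars are documented options; ls_raw is a hypothetical wrapper name.

```shell
#!/bin/sh
# ls_raw [ARGS...]
# Hypothetical wrapper: ask GNU ls to print file names as raw bytes,
# without quoting or replacing characters it considers non-printable.
# Both long options are GNU extensions; other ls implementations may reject them.
ls_raw() {
    ls --quoting-style=literal --show-control-chars "$@"
}

ls_raw
```

Unlike the pipe, this keeps the multi-column layout on a terminal. The trade-off is the same in both cases: ls no longer protects the terminal from unusual bytes in file names, so it is up to the terminal and its fonts to render them.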





Information forwarded to bug-coreutils@HIDDEN:
bug#73194; Package coreutils. Full text available.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 12 Sep 2024 10:17:21 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Sep 12 06:17:21 2024
Received: from localhost ([127.0.0.1]:40232 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1sogt7-0008NV-3j
	for submit <at> debbugs.gnu.org; Thu, 12 Sep 2024 06:17:21 -0400
Received: from lists.gnu.org ([209.51.188.17]:54780)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <sekaihenodoa@HIDDEN>) id 1sogsZ-0008MH-SL
 for submit <at> debbugs.gnu.org; Thu, 12 Sep 2024 06:16:48 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <sekaihenodoa@HIDDEN>)
 id 1sogsQ-0003mF-Rt
 for bug-coreutils@HIDDEN; Thu, 12 Sep 2024 06:16:40 -0400
Received: from relais.domaineinternet.ca ([158.85.89.116])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <sekaihenodoa@HIDDEN>)
 id 1sogsP-0000j4-AH
 for bug-coreutils@HIDDEN; Thu, 12 Sep 2024 06:16:38 -0400
Received: from box24.domaineinternet.ca (box24.domaineinternet.ca
 [72.10.160.82])
 (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by relais.domaineinternet.ca (Postfix) with ESMTPS id 62369520A0
 for <bug-coreutils@HIDDEN>; Thu, 12 Sep 2024 06:16:30 -0400 (EDT)
Received: from softbank126145144040.bbtec.net ([126.145.144.40]
 helo=[192.168.3.7])
 by box24.domaineinternet.ca with esmtpsa (TLS1.3) tls TLS_AES_128_GCM_SHA256
 (Exim 4.97.1) (envelope-from <sekaihenodoa@HIDDEN>)
 id 1sogsG-0000000BYs7-0CoY for bug-coreutils@HIDDEN;
 Thu, 12 Sep 2024 06:16:32 -0400
Message-ID: <0da42847-ba02-417b-979c-2ab79863f31f@HIDDEN>
Date: Thu, 12 Sep 2024 19:16:21 +0900
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Content-Language: en-US
To: bug-coreutils@HIDDEN
From: Simon Wolfe <sekaihenodoa@HIDDEN>
Subject: ls command converts utf-8 character into escape sequences
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Authenticated-Id: sekaihenodoa@HIDDEN
Received-SPF: pass client-ip=158.85.89.116;
 envelope-from=sekaihenodoa@HIDDEN; helo=relais.domaineinternet.ca
X-Spam_score_int: -18
X-Spam_score: -1.9
X-Spam_bar: -
X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9,
 RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001,
 SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-Spam-Score: -1.3 (-)
X-Debbugs-Envelope-To: submit
X-Mailman-Approved-At: Thu, 12 Sep 2024 06:17:19 -0400
X-Spam-Score: -2.3 (--)

I have one file name that uses Unicode character U+318DF, which is in the Tertiary Ideographic Plane, more precisely CJK Unified Ideographs Extension H.

touch 𱣟
ls

returns:

''$'\360\261\243\237'

Extension H was introduced in Unicode 15.0 in 2022.

I also notice that this bug occurs with any character in Extension I (introduced in 2023).

Extension G seems to work okay.
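
When only the escaped form is at hand, it can help to recover the code point it stands for. The following is a minimal sketch in POSIX shell; octal_to_codepoint is a hypothetical helper name, and it assumes its argument is a single UTF-8 sequence written purely as backslash-octal escapes (the part between the inner quotes of the ''$'...' form). Because the argument is used as a printf format, a literal % in it would be misread.

```shell
#!/bin/sh
# octal_to_codepoint ESCAPES
# Hypothetical helper: decode one UTF-8 sequence given as \ooo escapes
# and print it as U+XXXX. Returns non-zero on malformed input.
octal_to_codepoint() {
    # printf turns the escapes into raw bytes; od lists them in decimal.
    set -- $(printf "$1" | od -An -tu1)
    [ "$#" -gt 0 ] || return 1

    # The lead byte determines the sequence length and its payload bits.
    if [ "$1" -lt 128 ]; then
        cp=$1; n=1
    elif [ "$1" -ge 240 ]; then
        cp=$(( $1 & 7 )); n=4
    elif [ "$1" -ge 224 ]; then
        cp=$(( $1 & 15 )); n=3
    elif [ "$1" -ge 192 ]; then
        cp=$(( $1 & 31 )); n=2
    else
        echo "not a UTF-8 lead byte: $1" >&2
        return 1
    fi
    [ "$#" -eq "$n" ] || { echo "expected $n bytes, got $#" >&2; return 1; }

    # Each continuation byte (10xxxxxx) contributes six more bits.
    shift
    for b in "$@"; do
        if [ "$b" -lt 128 ] || [ "$b" -gt 191 ]; then
            echo "not a continuation byte: $b" >&2
            return 1
        fi
        cp=$(( (cp << 6) | (b & 63) ))
    done
    printf 'U+%04X\n' "$cp"
}

octal_to_codepoint '\360\261\243\237'   # prints U+318DF
```

The result agrees with the code point named in the report, so the escaped name and the original character refer to the same file.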





Acknowledgement sent to Simon Wolfe <sekaihenodoa@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-coreutils@HIDDEN. Full text available.
Report forwarded to bug-coreutils@HIDDEN:
bug#73194; Package coreutils. Full text available.
Last modified: Sun, 16 Feb 2025 07:00:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.