tighter %p default precision #340

charlesnicholson · 2025-05-26T00:37:30Z

This PR is a port / migration of #329 by @stefano-zanotti-88 with changes to the paland + unit tests etc.

charlesnicholson · 2025-05-26T00:44:57Z

@stefano-zanotti-88 I'm on the fence about the current implementation- one of the things that keeps npf (relatively) simpler to reason about is that parse_format_spec returns processed + compatible flags. Check out the unit tests I had to comment out to get the build farther along: https://github.com/charlesnicholson/nanoprintf/pull/340/files#diff-194066f3e4aa5818e4ed73fcaed61d0a3a2b1961032b1bad5a5da2bfb2bef230

Is it possible to rework this such that parse_format_spec only returns legal combinations of flags?

stefano-zanotti-88 · 2025-05-26T17:32:03Z

About npf_parse_format_spec().
I think it should just parse the format syntactically, and not do any semantic analysis, ie not rejecting any string that is malformed due to incompatible options, and not suppressing any incompatible flag.
This is easier, and also much clearer to the user of the function; they can do their semantic analysis later.
Think about all the intricate cases, which are best left to the user of npf_parse_format_spec(): '0' with dynamic precision (cannot be determined by the function); '0' with literal precision (it can be); '#' with %i, c, s, ...; '+' and ' ' with unsigned specifiers; '0' with float specifiers; anything with %%; anything with %p; anything other than length modifiers with %n; ...

Also: is this function meant to be user-accessible? If so, then maybe you might consider integrating into it the parsing of spans of literal characters. I mean, instead of returning 0 when a non-% or an error is found, it could return the length of the span, with npf_format_spec_t.conv_spec set to NONE.
This is similar to what I did in #287 -- but I did this in the loop calling npf_parse_format_spec() instead of directly in the function. This would also make it easier for the caller to decide what to do with wrong specifiers, ie print all the chars, all except the leading %, or nothing. It would also be faster, because we can print multiple chars without multiple function calls (both because we can use puts instead of putc, and because we call npf_parse_format_spec just once instead of once per char).

A simple parsing without applying any semantic constraint is also useful for #298; otherwise, those safety checks don't have all the information needed for proper error detection (note that I'll have to rework #298 after the recent and future changes to NPF; some checks were indeed missing because of this reason).

Also see #341

charlesnicholson · 2025-05-28T19:55:54Z

Might as well just delete the empty unit tests at this point :) Leaving the test cases around with all of the code commented out can be confusing to readers; if the claim is that the format spec can not be validated until star-args are read, and deleting the validation code is a core part of this PR, then would you mind also just fully deleting the tests that are half-commented-out please?

stefano-zanotti-88 · 2025-05-29T05:37:47Z

This PR is authored by you, I don't think I can edit it.
Or, I could resurrect #329.

I would not delete the validation code. It just needs tweaks to check for all the options found in the format string, without any semantically-driven adjustment.
So, for instance:

REQUIRE(npf_parse_format_spec("%.4n", &spec) == 4);
REQUIRE(spec.prec_opt == NPF_FMT_SPEC_OPT_NONE);

would become:

REQUIRE(npf_parse_format_spec("%.4n", &spec) == 4);
REQUIRE(spec.conv_spec == NPF_FMT_SPEC_CONV_WRITEBACK);
REQUIRE(spec.prec_opt == NPF_FMT_SPEC_OPT_LITERAL);
REQUIRE(spec.prec == 4);

And this:

REQUIRE(npf_parse_format_spec("%0-u", &spec) == 4);
REQUIRE(spec.left_justified);
REQUIRE(!spec.leading_zero_pad);

would become:

REQUIRE(npf_parse_format_spec("%0-u", &spec) == 4);
REQUIRE(spec.left_justified);
REQUIRE(spec.leading_zero_pad);

Of course, the subcase names would need some tweaking too.

charlesnicholson · 2025-05-29T13:35:15Z

Ugh, I'm sorry, I forgot that we were discussing my PR 😬

Thanks for the feedback, these are good points and assumptions. I'll update the tests and then maybe you could take a final look at it and give me any thoughts. I do like this change!

stefano-zanotti-88 · 2025-05-29T16:59:53Z

Sounds great.

charlesnicholson · 2025-06-12T15:28:40Z

@stefano-zanotti-88 Looks like this PR regresses MSVC 64-bit. The failed run is here, could I ask you to take a look at it please? Everything else looks good at this point though!

https://github.com/charlesnicholson/nanoprintf/actions/runs/15614590541/job/43983976675?pr=340

===============================================================================
D:\a\nanoprintf\nanoprintf\tests\mpaland-conformance\paland.cc(763):
TEST CASE:  pointer

D:\a\nanoprintf\nanoprintf\tests\mpaland-conformance\paland.cc(99): MESSAGE: ABCDEF0123456789

D:\a\nanoprintf\nanoprintf\tests\mpaland-conformance\paland.cc(100): FATAL ERROR: REQUIRE( npf_result == std::string{expected} ) is NOT correct!
  values: REQUIRE( 23456789 == abcdef0123456789 )

stefano-zanotti-88 · 2025-06-12T18:58:26Z

It might be related to #342, especially if the failed test is with a compilation done without NANOPRINTF_USE_LARGE_FORMAT_SPECIFIERS.
You can try and add:

static_assert(sizeof(void*) <= sizeof(npf_uint_t));
static_assert(sizeof(uintptr_t*) <= sizeof(npf_uint_t));

=====
Otherwise:

I'm unable to replicate it (on MinGW).

Could you please add these tests in paland.cc:764?
I mean, in the "pointer" test case, outside of the if.

#if NANOPRINTF_USE_PRECISION_FORMAT_SPECIFIERS == 1
    require_conform("00", "%.2p", nullptr);
    require_conform("1234", "%.2p", (void *)0x1234u);
    require_conform("1234-5678", "%p-%p", (void *)0x1234u, (void *)0x5678u);
#endif

This is meant to check (on other platforms too) whether the problem is with NPF itself, mishandling the precision, or whether it is a problem with the argument itself, getting truncated.
Is it possible that MSVC is doing weird stuff with the size of void* and uintptr_t?
They should both be 64 bits.
These tests are good to add even if the problem is not here.

You could also try and add this somewhere:

static_assert(sizeof(void*) <= sizeof(uintptr_t));

Compilation should fail, if the problem is indeed with MSVC's definition of builtin types.

charlesnicholson · 2025-06-16T01:05:54Z

I added your new tests and they're failing on all platforms- check out the run here:

https://github.com/charlesnicholson/nanoprintf/actions/runs/15669412831?pr=340

===============================================================================
/__w/nanoprintf/nanoprintf/tests/mpaland-conformance/paland.cc:771:
TEST CASE:  pointer

/__w/nanoprintf/nanoprintf/tests/mpaland-conformance/paland.cc:775: FATAL ERROR: REQUIRE( std::string{buf} == std::string{"1234-5678"} ) is NOT correct!
  values: REQUIRE( 0000000000001234-0000000000005678 == 1234-5678 )

stefano-zanotti-88 · 2025-06-16T05:11:58Z

Sorry, the third test I gave you is wrong, it was meant to specify a precision too:

require_equal("1234-5678", "%.1p-%.3p", (void *)0x1234u, (void *)0x5678u);

Since the first two new tests succeed (on other platforms at least), I'm quite sure the problem is that of #342.
Please try and replace the following (in the PR):

nanoprintf/nanoprintf.h

Lines 295 to 301 in bd3f1f7

    
           #if NANOPRINTF_USE_LARGE_FORMAT_SPECIFIERS == 0 
        
             typedef long npf_int_t; 
        
             typedef unsigned long npf_uint_t; 
        
           #else 
        
             typedef intmax_t npf_int_t; 
        
             typedef uintmax_t npf_uint_t; 
        
           #endif

with:

//#if NANOPRINTF_USE_LARGE_FORMAT_SPECIFIERS == 0
//  typedef long npf_int_t;
//  typedef unsigned long npf_uint_t;
//#else
  typedef intmax_t npf_int_t;
  typedef uintmax_t npf_uint_t;
//#endif

I think this will make the tests succeed; nothing else should break, since I don't think we have any test specifically targeting the condition long == npf_int.
Then, we can think about how to address #342 in a less drastic way.

charlesnicholson · 2025-07-27T01:38:30Z

I wasn't crazy about fully committing to uintmax and intmax because they can have huge performance penalties on smaller architectures, so I sat on this for a bit. I fixed the windows tests and AVR2/5 compilation tests with an alternate approach, please take a look- i'd love to hear your thoughts. Everything's green though, I'd like to land this!

stefano-zanotti-88 · 2025-07-27T06:10:36Z

I wouldn't want to always use (u)intmax_t either. That was just a temporary change to pinpoint the problem, and indeed it seems that npf_int was the root cause. Adapting to the smallest size necessary is what I'd like too.

Your latest commit causes no test failures, but I'm not sure it's correct.
Please see my comment on #342:

It might seem like it could be done more easily with intptr_t/uintptr_t instead of long, but on 8/16-bit architectures that would be wrong too: intptr_t could be 16 bits, and long 32 bits, so npf_int_t would be large enough for %p but not for %li

That is, if long specifiers are disabled in NPF, we have no long long, size_t, etc., so the widest type we need to deal with is either intptr or long. Both must be supported, since %p and %li are never disabled in NPF.
Usually, intptr is larger or equal, but on 8/16-bit architectures, intptr is usually 16 bits, and long is 32 bits (this is required by the standard: long width >= 4*CHAR_BIT -- and I'm assuming CHAR_BIT == 8)
I'm not sure why no test about long is failing in your AVR tests. I think some should be failing, but maybe we have a gap in the tests.
This should detect it:

#if ULONG_MAX >= 18446744073709551616ul
    require_conform("18446744073709551616", "%lu", 18446744073709551616ul);
#else // the standard ensures that width of long >= 4*CHAR_BIT >= 4*8
    require_conform("4294967296", "%lu", 4294967296ul);
#endif

Could you add it to the tests? I think this will fail on AVR, and the definition of npf_(u)int should be revised as a consequence. Again, see the proposed solution in #342
If you don't like my ifdef's, you can certainly come up with different ones, but they necessarily involve looking at the size of both intptr and long.

charlesnicholson · 2025-07-27T14:46:08Z

I don't actually have an AVR platform to execute on, and haven't set up anything like qemu, so the extent of the AVR test is just "do things compile". Re: the sizes, that makes sense, we want max(sizeof(void*), sizeof(unsigned long))

(edit: deleted the sizeof stuff; you're right)

charlesnicholson · 2025-07-27T15:34:08Z

When it's at all possible, I prefer to probe the types using limits.h comparisons- I think this achieves what you're aiming at with the _WIN32 and __SIZEOF_POINTER__ etc approach; if you care to take another look I'd appreciate feedback. If not I'm happy merging this one as-is.

stefano-zanotti-88 · 2025-07-27T15:49:23Z

sizeof cannot be used in the preprocessor, unless by some compiler-specific extension, which is better avoided in a library like NPF.
sizeof is compile-time constant, but unknown to the preprocessor, which has no knowkedge of types (nor their sizes).

The macros in #342 came from some code that clang uses for a similar check, which I can't find anymore.
They do not use UINTPTR_WIDTH and similar, which require C23.
Also, they do not rely on explicit value checks like UINTPTR_MAX == 0xFFFFFFFFu, which rely on CHAR_BIT == 8 (not always true, though probably true for all the architectures NPF cares about) -- this explicit reliance is also there in your latest commit.

Actually, we don't need the explicit values of the size of long and intptr, so the following should suffice:

#if NANOPRINTF_USE_LARGE_FORMAT_SPECIFIERS == 0
  #if ULONG_MAX >= UINTPTR_MAX
    typedef long npf_int_t;
    typedef unsigned long npf_uint_t;
  #else
    typedef intptr_t npf_int_t;
    typedef uintptr_t npf_uint_t;
  #endif
#else
  typedef intmax_t npf_int_t;
  typedef uintmax_t npf_uint_t;
#endif

charlesnicholson · 2025-07-27T16:01:21Z

That tidies up to this:

#if NANOPRINTF_USE_LARGE_FORMAT_SPECIFIERS == 1
  typedef intmax_t npf_int_t;
  typedef uintmax_t npf_uint_t;
#elif ULONG_MAX > UINTPTR_MAX
  typedef long npf_int_t;
  typedef unsigned long npf_uint_t;
#else
  typedef intptr_t npf_int_t;
  typedef uintptr_t npf_uint_t;
#endif

which looks pretty nice to me. Thanks for collaborating on this one, I'm (finally!) going to land it.

stefano-zanotti-88 · 2025-07-27T16:16:24Z

Yes, this looks good.
You're welcome.

port pointer precision changes over, fix up / comment out unit tests

7fd80f0

charlesnicholson changed the title ~~port pointer precision changes over, fix up / comment out unit tests~~ tighter %p default precision May 26, 2025

stefano-zanotti-88 mentioned this pull request May 26, 2025

Reworked spec parsing #341

Open

This was referenced May 26, 2025

Fix handling of negative dynamic precision #328

Closed

Fuzzy and exhaustive testing #343

Open

bump paland

d0e6e39

This was referenced May 28, 2025

Fixed handling of "0-" #327

Closed

nicer default for %p precision #329

Closed

charlesnicholson added 13 commits June 12, 2025 10:17

PR feedback: update tests

a2fb8c4

rename test, break up lines

26c9ca0

delete obsolete tests

637e45f

tidy up

e7683f8

statement expressions for min/max on gcc

aa7624c

suppress pedantic on gcc

68d7b47

fix paland

a5089d2

disable pragma-based warnings

7b8ba9c

alphabetize because github needs some action

29a6e3e

delete obsolete tests

3bd62db

reorder

1bccd2e

bump paland

faa8f66

Merge branch 'main' into pointer-precision-default

c66661e

tidy

8b80ebd

charlesnicholson added 2 commits June 15, 2025 20:38

bump paland

b6b5dbd

bump paland

c49e9a5

charlesnicholson added 3 commits July 26, 2025 16:46

bump paland

a2a4a05

try better types

61f0e75

try better types

ef71831

try better types

bced086

try better types

1e651af

charlesnicholson merged commit 9c25f79 into main Jul 27, 2025
28 checks passed

charlesnicholson deleted the pointer-precision-default branch July 27, 2025 16:08

charlesnicholson mentioned this pull request Jul 27, 2025

Possibly wrong npf_uint_t type #342

Closed

Uh oh!

tighter %p default precision #340

tighter %p default precision #340

Uh oh!

Conversation

charlesnicholson commented May 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

charlesnicholson commented May 26, 2025

Uh oh!

stefano-zanotti-88 commented May 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

charlesnicholson commented May 28, 2025

Uh oh!

stefano-zanotti-88 commented May 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

charlesnicholson commented May 29, 2025

Uh oh!

stefano-zanotti-88 commented May 29, 2025

Uh oh!

charlesnicholson commented Jun 12, 2025

Uh oh!

stefano-zanotti-88 commented Jun 12, 2025

Uh oh!

charlesnicholson commented Jun 16, 2025

Uh oh!

stefano-zanotti-88 commented Jun 16, 2025

Uh oh!

charlesnicholson commented Jul 27, 2025

Uh oh!

stefano-zanotti-88 commented Jul 27, 2025

Uh oh!

charlesnicholson commented Jul 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

charlesnicholson commented Jul 27, 2025

Uh oh!

stefano-zanotti-88 commented Jul 27, 2025

Uh oh!

charlesnicholson commented Jul 27, 2025

Uh oh!

Uh oh!

stefano-zanotti-88 commented Jul 27, 2025

Uh oh!

Uh oh!

charlesnicholson commented May 26, 2025 •

edited

Loading

stefano-zanotti-88 commented May 26, 2025 •

edited

Loading

stefano-zanotti-88 commented May 29, 2025 •

edited

Loading

charlesnicholson commented Jul 27, 2025 •

edited

Loading