all 45 comments

[–]SonOfMrSpock 30 points31 points  (26 children)

Don't we call these Pascal strings?

[–]TwoIsAClue 29 points30 points  (23 children)

Pascal strings literally are what made C strings look like something not completely insane in the 70s; 255 is not an acceptable maximum length.

What we really needed was to bite the bullet and accept the complexity of a dedicated type with a variable-size header. Is the first byte 0? Empty string, as short as you like. Is the first bit 0? Your string starts at byte 1 and it's up to 127 bytes long. Otherwise the first 4 bytes after zeroing the first bit are the length and the rest of your string follows.

[–]PeaSlight6601 5 points6 points  (4 children)

Why not use the same kind of encoding as we do for UTF-8 to encode the length?

[–]MokoshHydro 11 points12 points  (3 children)

Performance.

[–]PeaSlight6601 4 points5 points  (2 children)

Seems a bit silly.

For one, you need a performant UTF-8 decoder anyway to actually handle the strings, and secondly, chasing small performance gains is how we got into this situation in the first place.

It seems better in my mind to use the UTF-8 approach everywhere and hope that CPUs implement special instructions to decode it.

[–]MokoshHydro 8 points9 points  (1 child)

UTF-8 is great for exchange. When we need to perform individual character operations, it becomes a nightmare. That's not a "small penalty". The CPU won't help much, because we'll get alignment issues, etc. Lemire made some low-level optimizations for UTF-8 using SIMD.

Basically, string implementation is always a tradeoff between fast concatenation and fast symbol iteration.

[–]PeaSlight6601 3 points4 points  (0 children)

But knowing a string's length is often about exchange: how much do I need to copy from one buffer to another?

I think part of the problem here is that C strings mean two different things: exchange between libraries and programs running in the same address space, and operations within a single program in a single address space.

If you are in a single address space and want to use a wide format, then fine: just use a simple struct of length and buffer. But in practice everything tends to leak out of your address space in time, so you want those strings to be UTF-8 in a lot of instances.

If the string itself is UTF-8, I don't see why it's so bad for the length to be encoded that way as well.

[–]SonOfMrSpock 2 points3 points  (0 children)

I was just surprised there was no mention of Pascal strings in the article.

Sure, 255 was not enough, but Delphi 2 had "long Pascal strings" with a 32-bit size header in 1996. AFAIK, there is no support for any type of Pascal string, or your variable-size-header strings, in any C standard to this day. IDK why.

[–]alphaglosined 2 points3 points  (4 children)

No need for a variable-size header; it makes things harder, especially for C as an ABI.

A pair of length and pointer, a slice, is all that is needed. It uses struct ABI conventions.

Sadly, none of the many proposals, even ones from the early '90s, ever got accepted.

[–]TwoIsAClue 4 points5 points  (3 children)

The thing is that in the '70s, when C was created, memory was so terribly limited that using 4 bytes (assuming a length field of 2, just to put a number on the board; it might've been even more) for every string was problematic. Null-terminated strings weren't even conceived in C; they came from the assembly world of the time.

Of course, a couple of Moore's law iterations later the length + pointer to data was the obvious solution in all but the most pathological of cases, but by then C and its strings were everywhere already.

This is just an example of why we should strive to leave this technology behind if at all possible; a lot of the design choices in C made sense in the '70s but are terrible ideas now.

[–]alphaglosined 2 points3 points  (2 children)

Yeah historically it made sense.

But we don't need to drop C to get over this limitation, we just have to add slices to C.

And the reason it hasn't been done is entirely political at this point.

[–]TwoIsAClue 0 points1 point  (1 child)

In my opinion we absolutely need to drop C in every serious context ASAP outside of legacy projects/architectures and FFIs.

The elephant in the room is memory safety, but nowadays C is missing a whole load of basic and obviously useful features, is weighted down by the need to keep backwards compatibility, and a lot of the choices made in its design haven't endured the test of time.

FWIW, I really don't think C strings have stuck around because of politics. It very likely is the result of people following the path of least resistance and using what's already available rather than making their own incomplete replacement that cannot be used anywhere else.

[–]alphaglosined 4 points5 points  (0 children)

Slices have had multiple proposals made for C over the years.

I am serious when I say it is entirely political that C hasn't got them.

Note the dates.

https://www.bell-labs.com/usr/dmr/www/vararray.pdf

https://digitalmars.com/articles/C-biggest-mistake.html

[–]evaned 3 points4 points  (11 children)

What we really needed was to bite the bullet and accept the complexity of a dedicated type with a variable-size header. Is the first byte 0? Empty string, as short as you like. Is the first bit 0? Your string starts at byte 1 and it's up to 127 bytes long. Otherwise the first 4 bytes after zeroing the first bit are the length and the rest of your string follows.

My gut reaction to this is that it's probably beneficial to have the size of the string alongside the pointer, in a larger string "object", rather than in the header of the string data itself.

Modern string implementations in performance-centered languages do this, alongside the small-string optimization, which makes that almost a no-brainer anyway.

[–]Ameisen 3 points4 points  (8 children)

So, there is a downside to this, which is related to the fact that these two definitions actually define subtly different things:

static const char* const foo = "bar";

static const char foo[] = "bar";

If you have a 4, 8, 12, or 16 byte object defining a size and pointer (effectively C++'s std::string_view):

  • It is a performance malus with ABIs like Win64 that won't pass that structure in registers.
  • It requires an additional dereference to get to the character data.
  • It is mainly useful for APIs where the string may come from anywhere - within a library or application, storing the size with your character data is almost always better, but that complicates things if you need to support both a view and an inline string - worst case, the inline strings get converted to views anyway.
  • Extending the previous, inline strings have less size overhead, and potentially less performance overhead. You read the size, then the character data is sequentially after it: incredibly cache-friendly.
  • Can end up larger than expected due to alignment issues.
  • It effectively doubles the overhead associated with potentially having allocated a char array.

In my low-alloc low-latency APIs, I support both string structures and inline strings and use C++ templates and enable_if/concept to simplify implementation, with special handling for rvalue refs.

[–]XtremeGoose 2 points3 points  (3 children)

We're talking about designing C, right? This is the '60s; Win64 doesn't exist yet, so I'd imagine calling conventions would have been designed around fat pointers in this alternative history.

I don't think cache locality matters if you're, say, iterating over the string because you'd almost certainly just hold that in a register. And lots of operations on strings only care about the length, not the actual data so you'll be avoiding indirection in lots of cases.

There's a reason modern string types are designed this way.

[–]evaned 0 points1 point  (1 child)

We're talking about designing C, right? This is the '60s; Win64 doesn't exist yet

I'd kind of say... "no", actually.

TFA is talking about the approach that he's taking with the project that he's writing now, without (at least direct) use of the cstdlib string functions. His choices are not bound by what C chose to do in the '60s, and Win64 does exist now.

[–]Ameisen -1 points0 points  (0 children)

Looking at the PDP-7 architecture, I feel like using a NUL char to indicate the end of a string was absolutely the right decision. They could have possibly gotten away with an 18-bit length, but the ISA really doesn't make doing that 'nice'. It doesn't have GPRs, and this would all largely be handled directly via memory access (cycle time was 1.75 μs). You really don't want to have been fetching a different address/offset for every loop iteration in this case, as that would halve the operation's speed, if not worse (as reading character data and reading length data would have been competing for the memory buffer register).

[–]Ameisen 0 points1 point  (0 children)

We're talking about designing C, right? This is the '60s; Win64 doesn't exist yet, so I'd imagine calling conventions would have been designed around fat pointers in this alternative history.

I mean, we can knock ourselves out:

I could ask some people I know about the specifics of how the PDP-7 was used for procedure calls.

It doesn't have many instructions, and it only has a handful of registers - and they're not general-purpose registers (I believe that the PDP-11 introduced those to the PDP line).

If you look at the PDP-7's architecture, it makes sense that they used a NUL to indicate the end of a string, since that's largely how the CPU worked.

I don't think cache locality matters if you're, say, iterating over the string because you'd almost certainly just hold that in a register.

How do you think that the data ends up in the register (and it almost certainly is not in a register, at least on x86)?

The length value may be in-register depending on ABI, as may the data pointer. You will have to read from memory to actually access the data pointer's data, though. On x86, that includes some fun L1/L2 cache interactions.

There's a reason modern string types are designed this way.

Because it's the ideal layout for general-purpose mutable strings. There are cases where other layouts have benefits.

And lots of operations on strings only care about the length, not the actual data so you'll be avoiding indirection in lots of cases.

Depending on use-case, you either have more, the same number, or fewer indirections.

If you're only using length, "fat" strings are better, though even better is to then adopt an SoA architecture (say, for interned string storage).

[–]evaned 1 point2 points  (3 children)

It is a performance malus with ABIs like Win64 that won't pass that structure in registers.

I will admit this is a downside... but my response to that would be to use a better ABI. If you're defining your own string type anyway, then you've got control over the calling convention of the functions that use it.

It requires an additional dereference to get to the character data.

I wonder if we're talking past each other here, because it's not true for the picture I have in my head.

In fact, it saves a dereference when all you need is the size, though admittedly I'm not convinced that has enough value on its own to make a difference, as long as you can tell empty from non-empty without a dereference.

(I do strongly suspect that empty/non-empty without a dereference is important enough that it would justify "my" design, but if you use a null pointer to represent an empty string then you don't need "my" design to achieve that goal.)

I will point out though that something like a += b to destructively append a string will, barring a reallocation, do the above: use the size of a without reading the contents at the start of the string a. That operation would benefit from "my" design.

You read the size, then the character data is sequentially after it: incredibly cache-friendly.

I claim that cache friendliness actually goes to "my" design, to the extent "you can access the size without reading the initial part of the string data" has practical value. (And to the extent it doesn't, they're basically equivalent.) You need the pointer in either case, and with the size next to the pointer it will be readily available in "my" design as well.

I will point out that every current C++ standard library implementation uses a design similar to what I describe: the string object itself holds not just the pointer-to-data but also the size and current buffer capacity, and that object doubles as an SSO buffer. libstdc++ even moved to that design, away from one that has the string object itself as just a pointer pointing to a length-prefixed buffer (length and capacity). (They dropped COW at the same time, but I think that design choice is largely orthogonal.) From what I can tell, Rust's String object is basically the same as that, though with no SSO (which surprises me). I'm very inclined to believe that with the engineering effort that goes into these performance-oriented languages, they have reasonably good evidence that a fat object is a better design for your default, go-to string type vs. a pointer to a struct.

[–]Ameisen 0 points1 point  (2 children)

I will admit this is a downside... but my response to that would be to use a better ABI. If you're defining your own string type anyway, then you've got control over the calling convention of the functions that use it.

MSVC has no ABI you can specify for this case. Both Win64 and vectorcall will push this on the stack. There's no reliable way to avoid the stack on Win64. Maybe if you lie to it and make the compiler think that it's a vector value, it will pass it in an XMM register...? Only with vectorcall though.

This is a known problem: https://quuxplusone.github.io/blog/2021/11/19/string-view-by-value-ps/

LTCG/LTO may avoid this problem if the function ends up inlined, or if the function is defined in a header.

In fact, it saves a dereference when all you need is the size

Only if the structure is already in a register.

My design

Your design is largely identical to most string views, including most implementations of std::string_view and Unreal's FStringView.

I will point out though that something like a += b to destructively append a string will, barring a reallocation, do the above: use the size of a without reading the contents at the start of the string a. That operation would benefit from "my" design.

It will then dereference the pointer regardless in order to write b to it, or at least &a[len], which may have already been in a cache line.

I will point out that every current C++ standard library implementation uses a design similar to what I describe

Yes, as I said, though your design matches string_view more than string, though it is a mutable one.

They dropped COW at the same time, but I think that design choice is largely orthogonal.

The standard doesn't specify how std::string must be implemented.

Copy-on-write was indirectly forbidden due to changes in required iterator semantics, namely changing when they are allowed to be invalidated. Copy-on-write implementations generally violate that, as non-const operator[] would invalidate the string. This was largely changed to better-support concurrency (and as neither non-const operator[], nor the other functions they specified, have a way to know if the data is being mutated). There are also concurrency benefits to mutable, non-CoW "fat" strings.

Their design change in libstdc++ was absolutely due to the iterator invalidation changes. This also resulted in the Committee now being afraid to break the (non-specified) ABI.

As far as I can tell, the standard doesn't prohibit the string being inline, though I'd need to investigate further.

they have reasonably good evidence that a fat object is a better design for your default, go-to string type vs. a pointer to a struct.

For a mutable string, inline strings aren't really advantageous - their benefit comes as owned, immutable strings.

Especially if you're passing it by reference anyways - then it's effectively free to access relative to containing a pointer.

In the end, it all depends on access patterns, though. An array of inline strings (yes, an array of these is complicated, as traversing requires the size - it's not quite an array) where all you do is check the size is very cache-unfriendly, though you could always have a separate, matching array of sizes for just that purpose (effectively mimicking SoA).

On the flipside, if I am using the character data, "fat" strings jump me all over memory since those pointers are highly-unlikely to be sequential.

Inline strings, and fancy collections of them, are not rare in systems where you are storing immutable strings (like interning).

[–]evaned 1 point2 points  (1 child)

MSVC has no ABI you can specify for this case. Both Win64 and vectorcall will push this on the stack. There's no reliable way to avoid the stack on Win64. Maybe if you lie to it and make the compiler think that it's a vector value, it will pass it in an XMM register...?

I did make that work, and you could do something like pack a pointer+size into an __m128i (or add capacity, and/or SSO, in a second __m128i)... but you then need a few other SSE instructions to extract the relevant values and such. I suspect this winds up not worth it.

That is unfortunate, and I didn't realize MSVC wouldn't have a better way to handle it. I'm pretty surprised by that, actually. (Not surprised it's not default, because Microsoft, but I would have thought this was an important enough thing to get.)

(I will say in my defense that TFA is talking about writing an embedded project; I suspect they're not using MSVC.)

In fact, it saves a dereference when all you need is the size

Only if the structure is already in a register.

OK, fair enough, but it would be in L1 cache, which (at least on x86 derivatives) has pretty comparable speed to registers, by my understanding.

(And if you're using GCC or Clang, it may well be in a register.)

If you're in a situation where the object itself (handle, pointer, whatever) isn't in L1, then no design is going to help you.

My [sic] design

Your design is largely identical to most string views, including most implementations of std::string_view and Unreal's FStringView.

... and the one in TFA.

Yes, I realize that. That's why I put scare quotes around "my" every time I said it, scare quotes you dishonestly omitted from that "quote" for some reason.

I will point out though that something like a += b to destructively append a string will, barring a reallocation, do the above: use the size of a without reading the contents at the start of the string a. That operation would benefit from "my" design.

It will then dereference the pointer regardless in order to write b to it, or at least &a[len], which may have already been in a cache line.

I'm not sure what you're trying to get at here. With "my" design, it wouldn't need to read or write right at the pointer's address, only at the end. Because the new size isn't stored at the pointer.

(And we're back in the design we had before -- if your string object itself is out of cache/registers, then you've lost anyway.)

The standard doesn't specify how std::string must be implemented.

I would argue that fact makes the observation that three different teams (GNU, LLVM, Microsoft/Dinkumware) all converged on a similar fat-object design (over a pointer to a structure) even more compelling that that's the "right" option for a generic string data structure.

I will point out that every current C++ standard library implementation uses a design similar to what I describe

Yes, as I said, though your design matches string_view more than string, though is a mutable one.

Even in my first comment I was talking about SSO, which doesn't match string_view, and in the followup I mentioned Rust's String which is the buffer-owning string object as well as std::string.

I will admit though that I've been playing fast and loose with whether capacity is present in the fat object... but that's because I don't view it as super important or interesting. I'd say my arguments apply in both cases.

In the end, it all depends on access patterns, though.

Sure, I don't claim that "my" design is going to be better in all circumstances. But at the same time, if you're writing a generic string data structure you have to make some choice that hopefully represents the best tradeoffs across a wide range of access patterns.

My claim is that I think the evidence points toward that being a fat object, not a pointer to a structure.

[–]Ameisen -1 points0 points  (0 children)

I did make that work, and you could do something like pack a pointer+size into an _m128i ...

You can force the compiler to generate the instructions itself like so, though the GPR/MEM->SIMD->GPR conversions are likely worse than using the stack - particularly for view creation.

unsigned __int64 get_length(stringy_view) PROC       ; get_length, COMDAT
        movq    rax, xmm0
        ret     0
unsigned __int64 get_length(stringy_view) ENDP       ; get_length

$T1 = 0
__$ReturnUdt$ = 32
data$ = 40
length$ = 48
stringy_view get_view(char *,unsigned __int64) PROC       ; get_view, COMDAT
$LN6:
        sub     rsp, 24
        mov     QWORD PTR $T1[rsp], r8
        mov     rax, rcx
        mov     QWORD PTR $T1[rsp+8], rdx
        movups  xmm0, XMMWORD PTR $T1[rsp]
        movups  XMMWORD PTR [rcx], xmm0
        add     rsp, 24
        ret     0
stringy_view get_view(char *,unsigned __int64) ENDP       ; get_view

The ideal would be to get it passed in two GPRs, but I cannot think of a way to do that automatically.

I will say in my defense that TFA is talking about writing an embedded project; I suspect they're not using MSVC.

I wrote and compiled an entire bootloader (multiboot and EFI) and a small kernel using MSVC :/. Was a PITA, though.

have pretty comparable speed to registers by my understanding.

L1 is at least 2-4× slower than a register, and can be worse than that depending on circumstances.

Note that the L1 cache is also susceptible to things like false sharing, where registers are not.

I take advantage of the L1 cache in certain systems (VeMIPS assumes that the 32 32-bit register file fits neatly into a single L1 cache line).

Since this is on the stack, it is indeed very likely to be resident in-cache, though.

Yes, I realize that. That's why I put scare quotes around "my" every time I said it, scare quotes you dishonestly omitted from that "quote" for some reason.

Because I dishonestly wrote my entire comment on my phone with an arm suffering from dishonestly severe tendonitis, and the Reddit mobile app on Android is dishonestly terrible.

I'm not sure what you're trying to get at here. With "my" design, it wouldn't need to read or write right at the pointer's address, only at the end. Because the new size isn't stored at the pointer.

That only matters if the string is longer than 64 bytes (the normal cache line size on x86). The cache operates at that granularity and alignment.

So, if it's <= 60 bytes, then size+data fits into a single cache line. Otherwise, it does not. In yours, it has to make sure that size is in-cache (if it's on the stack, then it is likely within a cache line or two of the other stack variables, depending on offset). It will then need to make sure that the cache lines straddling where you're writing are in-cache, as x86 at least (x86 has a... unique cache architecture) writes to the L1 and L2 cache and then propagates that write to memory later (unless you specify that you want non-temporal stores using, say, movnti - there's no guarantee that this is a benefit, though, depending again on usage/access patterns). This should still be pretty fast, but it's still faster if all the data is already present.

This is largely a very, very-reduced version of arrays-of-structs vs structs-of-arrays.

And we're back in the design we had before -- if your string object itself is out of cache/registers, then you've lost anyway.

Yup. I'm just in the habit of designing specifically for that case or for the specific case that it's not - but the vast majority of strings that I encounter fit neatly into a cache line with 4 bytes of size.

I would argue that fact makes the observation that three different teams (GNU, LLVM, Microsoft/Dinkumware) all converged on a similar fat-object design (over a pointer to a structure) even more compelling that that's the "right" option for a generic string data structure.

Well, yes. It's ideal for general-case mutable strings. There are very specific circumstances - that happen to be more common in certain fields - where inline strings are more optimal.

For immutable strings that own their data, it's significantly more tricky. The general case is probably still 'good enough', but it really depends on what exactly you're doing.

C++ doesn't have an immutable string type with ownership semantics.

Even in my first comment I was talking about SSO

I apologize, I must have missed that, or I got mixed up with the general discussion in this post where people were largely posting C structs that effectively were string views.

My claim is that I think the evidence points toward that being a fat object, not a pointer to a structure.

I don't disagree, I just want to point out the cases where it's not optimal. They're more common than you'd expect.

For a full mutable string, the ABI issue doesn't matter as people pass those by const reference anyways. It's mainly an issue for views - which is why I was approaching the issue from the view standpoint, and also from the immutable string standpoint.


ED: I just realized that I forgot to mark get_view as __vectorcall. That improves the resultant assembly somewhat:

$T1 = 0
data$ = 32
length$ = 40
stringy_view get_view(char *,unsigned __int64) PROC       ; get_view, COMDAT
$LN6:
        sub     rsp, 24
        mov     QWORD PTR $T1[rsp], rdx
        mov     QWORD PTR $T1[rsp+8], rcx
        movups  xmm0, XMMWORD PTR $T1[rsp]
        add     rsp, 24
        ret     0
stringy_view get_view(char *,unsigned __int64) ENDP       ; get_view

[–]PeaSlight6601 0 points1 point  (1 child)

The one benefit of doing it this way is that only a single object needs to be transferred between libraries and functions.

If you are packing many strings into a single table, it may make more sense to separate the metadata and the data like that.

[–]evaned 1 point2 points  (0 children)

The one benefit of doing it this way is that only a single object needs to be transferred between libraries and functions.

I mean, that's not the only benefit... it wasn't even the benefit I was considering, which was cache locality.

I'm admittedly not sold on this when it comes to size, as long as you can distinguish empty from non-empty without a dereference. But SSO is generally helpful, and unless you want SSO to apply only to extremely short strings (no more than 8 bytes), then you can overlap SSO storage with the size field at no memory cost.

[–]nerd4code 0 points1 point  (1 child)

No, Pascal strings are more sensible, because the length is inline with the string data; unfortunately in the standard case this limits you to 255 chars per string.

In C, you’d preferably

typedef size_t Str_Len;
typedef struct {
    Str_Len len;
    char c[];
} Str;

using a flexible array member for Pascal format.

[–]SonOfMrSpock 2 points3 points  (0 children)

Well, 255-char Pascal strings are history; they still exist, but only for backward compatibility. Delphi/Free Pascal have had long strings for at least two decades.

[–]zhivago 3 points4 points  (0 children)

The actual safety improvement here is to make strings immutable and well formed.

Although it's a bit embarrassing that they couldn't figure out const properly and have to rely on documentation for this. :)

The rest is effectively mandatory memoization.

[–]matthewt 0 points1 point  (0 children)

My first experience with such an implementation was djb's substdio as used in qmail.

I am terribad at writing C, and patching qmail (a couple decades ago now) is probably the only time I've worked on C code and spent more time dealing with logic bugs than segfault bugs (all self-inflicted in both cases, mind; see "terribad" ;).

[–]jacobb11 0 points1 point  (8 children)

You don't mention a string deallocation function. I suspect that's because one cannot be correctly implemented given the definition of STR, but either way it's a pretty glaring omission.

[–]pseudomonica 2 points3 points  (6 children)

I mean, only str_buf is owning, and as long as you're keeping a pointer to the arena around, that should be trivial to deallocate

[–]jacobb11 0 points1 point  (5 children)

only str_buf is owning

I don't think so. That's possible because the memory management is unspecified, but... When the library copies a str_buf into a str it probably copies the characters so that further changes to the str_buf do not affect the str.

[–]pseudomonica 2 points3 points  (4 children)

If this was inspired by rust, then str in this library represents a &str — just one that’s mutable, for whatever reason

[–]jacobb11 -1 points0 points  (3 children)

Possible, unspecified, and error prone. Rust without a borrow checker is a memory-management nightmare.

[–]pseudomonica 3 points4 points  (2 children)

I mean, this is C. You were going to need to manually track what was and wasn't owning anyway, and using a convention like "str is non-owning, str_buf is owning" is sensible

[–]jacobb11 -1 points0 points  (1 child)

Agree to disagree. I'm not using a C string library that separates ownership from strings and expects me to track it manually. But you do you.

[–]pseudomonica 2 points3 points  (0 children)

The C standard library does that. All of the functions that accept a char const* don’t care whether or not it’s owning, and functions like strerror are even worse! They return a char* and the spec is all “yeah don’t worry about deallocating this, just make sure no one else calls strerror while you’re using that char* and you should be fine”

(That function is not thread safe, ofc)

[–]seamsay 0 points1 point  (0 children)

Looks to me like STR is only meant to be used for string literals, but other than that I don't see any reason deallocation would be an issue. In fact, if I'm reading the post correctly, it seems that (de)allocation is explicitly meant to be handled outside of the str and str_buf interfaces.

[–]todo_code -5 points-4 points  (2 children)

I'm sorry, but if you create wrappers and code to work with null-terminated strings the same way you did for your non-null-terminated C strings, you will end up with exactly the same level of safety you quoted.

[–]InfinitePoints 11 points12 points  (1 child)

The point is that you only access the struct fields using helper functions, and the helper functions do bounds checks/similar for you.

As long as you never access the inner fields directly, the only memory safety issue left is use after free from borrowing stringbufs, but solving that would require a borrow checker.

[–]todo_code -5 points-4 points  (0 children)

You are missing the point. If you make a helper wrapper set for null-terminated vs non-terminated, you get the same thing...

[–]TimeSuck5000 -4 points-3 points  (3 children)

Just use C++ and std::string. We don’t need to be worrying about a few bytes here or there unless it’s an embedded system or something.

[–]stianhoiland 9 points10 points  (2 children)

Did you ahem read the article? At least checks notes three sentences of it?