Curl removes all calls to strcpy
104 points by groctel
I know it’s hard to make changes in large, old codebases, but it would have been so much better to change the string type rather than the parameters for copying. So instead of making every copy pass const char *src, size_t slen, change the string type from const char* to struct {const char *src; size_t slen;}. (This is what C++ calls string_view, Go calls slice, Rust calls str, etc.)
After all, if you have to know a string's length in advance when you copy it, what’s the point of the null byte at the end? You were probably given the length along with the pointer (otherwise you'd have to make a slow strlen call to get it, before you could call safe_strcpy) so why not just package that length up with the pointer?
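To make the suggestion concrete, here is a minimal sketch of what such a pointer-plus-length type could look like in C. The names strview, strview_from_cstr and strview_copy are made up for illustration and are not from curl; the struct layout is the one given in the comment above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pointer-plus-length string type (names are illustrative). */
struct strview {
  const char *src;
  size_t slen;
};

/* Pay for strlen once, at the boundary where a C string enters the code. */
static struct strview strview_from_cstr(const char *s)
{
  struct strview v = { s, strlen(s) };
  return v;
}

/* Copy v into dst (capacity dsize bytes) and null-terminate it.
   Returns 0 on success, -1 if the string plus terminator does not fit. */
static int strview_copy(char *dst, size_t dsize, struct strview v)
{
  assert(dst != NULL);
  if(v.slen >= dsize)
    return -1;
  memcpy(dst, v.src, v.slen);
  dst[v.slen] = '\0';
  return 0;
}
```

With this shape the length travels with the pointer, so callers cannot pass a mismatched pair, and the null byte is only needed when handing the result back to APIs that expect a C string.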
strcpy however, has its valid uses and it has a less bad and confusing API.
Contrast this with strcpy: a niche function you don't need :)
FWIW I agree with Daniel on this one.
In the end, both authors end up using memcpy instead
I disagree with the premise that strcpy is ever reasonable - you either know the size of the string being copied so can use memcpy, or you don’t in which case strcpy is unsafe.
There really is no time where it is reasonable to use strcpy.
It has been proven numerous times already that strcpy in source code is like a honey pot for generating hallucinated vulnerability claims.
That's an interesting (but very minor) side effect of LLM usage. I know I do similar things to avoid humans getting fixated on particular lines of code.
Yeah. I know humans who complained about strcpy even before LLMs existed. Cargo culting isn't new.
LLMs get their info from what humans have expressed before, so them aping "folk wisdom" is to be expected.
We indeed now live in AI-supported dunning-kruger "best-practice" hell.
I don't engage at all with vibecoding stories but I read them. The parent and grandparent comments I'm replying to are among the most insightful comments about vibecoding I've read.
(Coincidentally, I was reminded to post here after reading this posted in the much maligned other site.)
I feel I have the obligation to stay informed about my trade, and this includes understanding whether LLMs can help me develop code. And every time I play with that, LLM products generate code that triggers my smell detector. Lots of unnecessary code that I would have written 20 years ago and I would have been extremely proud of. To be fair, in the last 20 years I have written mostly "internal" code that is not meant to make customers happy, so likely I'm quite biased towards minimalist code that just works, which might not be the best in all scenarios.
I feel this also ties in with my sensation when I read learning materials about programming to recommend to newbies. Most inevitably contain crud that makes it very difficult for me to recommend them. (And the materials that do not contain such crud tend to not be very accessible for most people, IMHO.)
And I think the underlying issue is that computing is too new a discipline and that we don't know how to write software yet, so we cannot teach it and we cannot automate it. Yet.
And I think the underlying issue is that computing is too new a discipline and that we don't know how to write software yet, so we cannot teach it and we cannot automate it. Yet.
This comment strongly reminds me of We Really Don't Know How to Compute by Gerald Sussman. Previously, on Lobsters.
Hah, that's the best Lisp advocacy video I've ever seen, I think. I loved the delivery.
(However, like all the Lisp advocacy, you really need to work hard to get it.)
the much maligned other site. … I feel I have the obligation to stay informed about my trade
Coincidental juxtaposition but I’ve recently been thinking that, due to the latter, I should check in on the other site (I currently don’t)
I'm a bit of a masochist. You really don't have to do it :D
Although, very much like Twitter (which I quit entirely because conscience, but which still provided me a lot of interesting insights I can't really get anywhere else anymore), I think I still get some value out of HN. Even beyond "yep, there's still a lot of very weird people out there".
Since LLMs scour the internet for materials, they’re likely to regurgitate what humans regard as best practices. You’ll find some good ones to follow and they can help with comparison.
But you can convince the human (hopefully), and in the future they won't do the same thing. Not to mention, I'd wager most people open far fewer such tickets than a "helpful" AI would.
I’m tickled to see the prototype of curl’s new function is ordered (destination stuff) (source stuff). I find that ordering really unintuitive and I’d have been tempted to swap them, but it makes sense to follow the order of memcpy, so I probably would also resist the urge.
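For anyone who wants to see the ordering side by side with memcpy, here is a rough sketch of a destination-first bounded copy. The name copy_checked and its failure behavior (empty string plus a false return) are assumptions for illustration, not curl's actual function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: (destination stuff) first, (source stuff) second,
   mirroring memcpy(dst, src, n). Not curl's real implementation. */
static bool copy_checked(char *dst, size_t dsize,
                         const char *src, size_t slen)
{
  assert(dst != NULL && dsize > 0);
  if(slen >= dsize) {
    dst[0] = '\0';  /* assumed policy: leave an empty string on failure */
    return false;
  }
  memcpy(dst, src, slen);
  dst[slen] = '\0';
  return true;
}
```

A call then reads copy_checked(buf, sizeof(buf), name, namelen), with both destination arguments grouped before both source arguments.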
I also prefer move (what) from->to (where), but even hardware doesn't agree. :)
Motorola 68k: MOVE src, dst
vs
Intel x86: MOV dst, src
It is basically either a mental version of:
COPY src,dest or LET dest = src
and my guess is that people tend to like the one they encountered first.
To me this makes sense if you look at it from an object-oriented point of view. In memcpy(dst, src, size), the first argument is the self argument you'd pass when calling method-like functions in C:
MyStruct_myMethod(MyStruct *self);
If instead of memcpy(dst... you read it as something like ByteBuffer_copy(self..., the argument order is pretty much sensible and consistent.
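A small sketch of that reading, assuming a made-up ByteBuffer type with a fixed 16-byte capacity. ByteBuffer_copy is hypothetical and simply wraps memcpy with the destination as self.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-capacity buffer, used only to illustrate the
   "destination is self" reading of memcpy's argument order. */
typedef struct {
  unsigned char data[16];
  size_t len;
} ByteBuffer;

/* Copy up to sizeof(self->data) bytes from src into self.
   Returns the number of bytes actually copied (truncating if needed). */
static size_t ByteBuffer_copy(ByteBuffer *self, const void *src, size_t size)
{
  assert(self != NULL);
  if(size > sizeof(self->data))
    size = sizeof(self->data);
  if(size > 0)
    memcpy(self->data, src, size);  /* same order: dst, src, size */
  self->len = size;
  return size;
}
```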
I wonder, when they ban a function, do they put something in their build chain that warns when strcpy is found?
I’d be tempted to #define it to a compile time error in a common curl header
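Something along these lines is what I have in mind. It is a sketch, not what curl does: the macro body is a deliberately undeclared identifier (the name is made up), so any later strcpy(...) call fails to compile with that name in the error message. set_name is just a hypothetical caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>  /* pull in the real declarations before banning */

/* Hypothetical ban: expands every later strcpy(...) call to an
   undeclared identifier, which the compiler rejects. */
#undef strcpy
#define strcpy(dst, src) strcpy_is_banned_copy_with_an_explicit_length

static void set_name(char *dst, size_t dsize, const char *name, size_t nlen)
{
  assert(dst != NULL && nlen < dsize);
  memcpy(dst, name, nlen);
  dst[nlen] = '\0';
  /* strcpy(dst, name);  <- would not compile: undeclared identifier */
}
```

The define has to come after string.h, otherwise the header's own declaration of strcpy would be macro-expanded and break. A function-like macro also does not catch uses such as taking the function's address. GCC and Clang additionally offer #pragma GCC poison strcpy, which rejects the bare identifier as well.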
Curl’s checksrc.pl is fairly sophisticated: it supports special pragma comments to allow banned functions in special cases, and it can enforce different rules in different parts of the tree. I guess they don’t want different mechanisms for banning functions depending on how strict the ban is.
https://github.com/search?q=repo%3Acurl%2Fcurl+%22%21checksrc%21%22&type=code
GCC has a way of marking a function as deprecated (__attribute__((deprecated))) and will print a diagnostic message when compiling code that uses it. I think Clang supports this as well.
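A minimal sketch of how that looks on a project's own function. legacy_copy and bounded_copy are hypothetical names; the attribute with a message string is accepted by both GCC and Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical legacy helper: any call site gets a deprecation warning
   carrying the message below. Defining it does not warn. */
char *legacy_copy(char *dst, const char *src)
  __attribute__((deprecated("use bounded_copy and pass the destination size")));

char *legacy_copy(char *dst, const char *src)
{
  return memcpy(dst, src, strlen(src) + 1);
}

/* Hypothetical replacement that takes the destination capacity.
   Returns 0 on success, -1 if src plus terminator does not fit. */
static int bounded_copy(char *dst, size_t dsize, const char *src)
{
  size_t slen = strlen(src);
  assert(dst != NULL);
  if(slen >= dsize)
    return -1;
  memcpy(dst, src, slen + 1);  /* includes the null terminator */
  return 0;
}
```

By default this is only a warning, so it nudges rather than bans; building with -Werror=deprecated-declarations turns each use into a hard error.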