32

When writing a project, I ran into a strange issue.

This is the minimal code I managed to write to recreate the issue. I am intentionally storing an actual string in the place of something else, with enough space allocated.

// #include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <stddef.h> // For offsetof()

typedef struct _pack{
    // The type of `c` doesn't matter as long as it's inside of a struct.
    int64_t c;
} pack;

int main(){
    pack *p;
    char str[9] = "aaaaaaaa"; // Input
    size_t len = offsetof(pack, c) + (strlen(str) + 1);
    p = malloc(len);
    // Version 1: crash
        strcpy((char*)&(p->c), str);
    // Version 2: crash
        strncpy((char*)&(p->c), str, strlen(str)+1);
    // Version 3: works!
        memcpy((char*)&(p->c), str, strlen(str)+1);
    // puts((char*)&(p->c));
    free(p);
  return 0;
}

The above code is confusing me:

  • With gcc/clang -O0, both strcpy() and memcpy() works on Linux/WSL, and the puts() below gives whatever I entered.
  • With clang -O0 on OSX, the code crashes with strcpy().
  • With gcc/clang -O2 or -O3 on Ubuntu/Fedora/WSL, the code crashes (!!) at strcpy(), while memcpy() works well.
  • With gcc.exe on Windows, the code works well whatever the optimization level is.

Also I found some other traits of the code:

  • (It looks like) the minimum input to reproduce the crash is 9 bytes (including zero terminator), or 1+sizeof(p->c). With that length (or longer) a crash is guaranteed (Dear me ...).

  • Even if I allocate extra space (up to 1MB) in malloc(), it doesn't help. The above behaviors don't change at all.

  • strncpy() behaves exactly the same, even with the correct length supplied to its 3rd argument.

  • The pointer does not seem to matter. If structure member char *c is changed into long long c (or int64_t), the behavior remains the same. (Update: changed already).

  • The crash message doesn't look regular. A lot of extra info is given along.

    crash

I tried all these compilers and they made no difference:

  • GCC 5.4.0 (Ubuntu/Fedora/OS X/WSL, all are 64-bit)
  • GCC 6.3.0 (Ubuntu only)
  • GCC 7.2.0 (Android, norepro???) (This is the GCC from C4droid)
  • Clang 5.0.0 (Ubuntu/OS X)
  • MinGW GCC 6.3.0 (Windows 7/10, both x64)

Additionally, this custom string copy function, which looks exactly like the standard one, works well with any compiler configuration mentioned above:

char* my_strcpy(char *d, const char* s){
    char *r = d;
    while (*s){
        *(d++) = *(s++);
    }
    *d = '\0';
    return r;
}

Questions:

  • Why does strcpy() fail? How can it?
  • Why does it fail only if optimization is on?
  • Why doesn't memcpy() fail regardless of -O level??

*If you want to discuss about struct member access violation, pleast head over here.


Part of objdump -d's output of a crashing executable (on WSL):

objdump


P.S. Initially I want to write a structure, the last item of which is a pointer to a dynamically allocated space (for a string). When I write the struct to file, I can't write the pointer. I must write the actual string. So I came up with this solution: force store a string in the place of a pointer.

Also please don't complain about gets(). I don't use it in my project, but the example code above only.

10
  • 1
    Comments are not for extended discussion; this conversation has been moved to chat.
    – Andy
    Commented Nov 10, 2017 at 13:21
  • 4
    Closing this question being "too broad" is unjustified in my eyes, voting to reopen. Still an answer is missing, which discusses the issue if and why the behaviour of gcc is standard conform or not in detail.
    – Ctx
    Commented Nov 10, 2017 at 13:24
  • 2
    @Ctx i am agree. It is very interesting. Should be reopened.
    – Michi
    Commented Nov 10, 2017 at 14:30
  • Can I ask if C99 flexible array members are an option for you?
    – LThode
    Commented Nov 10, 2017 at 18:20
  • 2
    You failed to include what the "crash" looked like. This is always helpful. Was it an abort() from some checker code, or was it an access violation (e.g. SEH 0xC000.0005 on Windows), etc.: "Crash" is not a technical term on this level :-)
    – Martin Ba
    Commented Nov 10, 2017 at 21:22

6 Answers 6

30

What you are doing is undefined behavior.

The compiler is allowed to assume that you will never use more than sizeof int64_t for the variable member int64_t c. So if you try to write more than sizeof int64_t(aka sizeof c) on c, you will have an out-of-bounds problem in your code. This is the case because sizeof "aaaaaaaa" > sizeof int64_t.

The point is, even if you allocate the correct memory size using malloc(), the compiler is allowed to assume you will never use more than sizeof int64_t in your strcpy() or memcpy() call. Because you send the address of c (aka int64_t c).

TL;DR: You are trying to copy 9 bytes to a type consisting of 8 bytes (we suppose that a byte is an octet). (From @Kcvin)

If you want something similar use flexible array members from C99:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
  size_t size;
  char str[];
} string;

int main(void) {
  char str[] = "aaaaaaaa";
  size_t len_str = strlen(str);
  string *p = malloc(sizeof *p + len_str + 1);
  if (!p) {
    return 1;
  }
  p->size = len_str;
  strcpy(p->str, str);
  puts(p->str);
  strncpy(p->str, str, len_str + 1);
  puts(p->str);
  memcpy(p->str, str, len_str + 1);
  puts(p->str);
  free(p);
}

Note: For standard quote please refer to this answer.

6
  • 2
    I would find this answer more useful if you dedicated a few more sentences on why this is UB.
    – Ctx
    Commented Nov 10, 2017 at 13:00
  • 3
    The Standard specifies that a pointer to a structure's initial member, suitably converted, points to the structure. If a pointer to an allocation which is larger than a structure is cast to that structure type and then to a character type, the resulting pointer may be used to access the allocation without being limited to the structure size (among other things, this allows allocations to be treated as holding arrays of structures). The only way I can see that the behavior would be justifiable would be if (char*)(&p->c) isn't the same as (char*)(struct pack_*)(&p->c).
    – supercat
    Commented Nov 10, 2017 at 20:14
  • 7
    The question allocates enough space in the malloc buffer, and casts the pointer to char* - I don't see how this is UB in any obvious way. Since both the question and the other answer highlight a function _memcpy_chk, I assume that the "crash" is actually caused by the checker code, that gcc only introduces in case of strcpy, not in the case of memcpy. Personally, my take would be that the gcc check is practically very useful, but not entirely conforming.
    – Martin Ba
    Commented Nov 10, 2017 at 21:31
  • 1
    Indeed. Compiler is not conforming here. The code is roughly equivalent to the old-school technique of allocating the control structure and its data elements past the end in the same allocation.
    – Joshua
    Commented Nov 11, 2017 at 3:04
  • 3
    @Joshua which never was standards-conforming until flexible array members Commented Nov 11, 2017 at 4:28
16

I reproduced this issue on my Ubuntu 16.10 and I found something interesting.

When compiled with gcc -O3 -o ./test ./test.c, the program will crash if the input is longer than 8 bytes.

After some reversing I found that GCC replaced strcpy with memcpy_chk, see this.

// decompile from IDA
int __cdecl main(int argc, const char **argv, const char **envp)
{
  int *v3; // rbx
  int v4; // edx
  unsigned int v5; // eax
  signed __int64 v6; // rbx
  char *v7; // rax
  void *v8; // r12
  const char *v9; // rax
  __int64 _0; // [rsp+0h] [rbp+0h]
  unsigned __int64 vars408; // [rsp+408h] [rbp+408h]

  vars408 = __readfsqword(0x28u);
  v3 = (int *)&_0;
  gets(&_0, argv, envp);
  do
  {
    v4 = *v3;
    ++v3;
    v5 = ~v4 & (v4 - 16843009) & 0x80808080;
  }
  while ( !v5 );
  if ( !((unsigned __int16)~(_WORD)v4 & (unsigned __int16)(v4 - 257) & 0x8080) )
    v5 >>= 16;
  if ( !((unsigned __int16)~(_WORD)v4 & (unsigned __int16)(v4 - 257) & 0x8080) )
    v3 = (int *)((char *)v3 + 2);
  v6 = (char *)v3 - __CFADD__((_BYTE)v5, (_BYTE)v5) - 3 - (char *)&_0; // strlen
  v7 = (char *)malloc(v6 + 9);
  v8 = v7;
  v9 = (const char *)_memcpy_chk(v7 + 8, &_0, v6 + 1, 8LL); // Forth argument is 8!!
  puts(v9);
  free(v8);
  return 0;
}

Your struct pack makes GCC believe that the element c is exactly 8 bytes long.

And memcpy_chk will fail if the copying length is larger than the forth argument!

So there are 2 solutions:

  • Modify your structure

  • Using compile options -D_FORTIFY_SOURCE=0(likes gcc test.c -O3 -D_FORTIFY_SOURCE=0 -o ./test) to turn off fortify functions.

    Caution: This will fully disable buffer overflow checking in the whole program!!

1
  • Comments are not for extended discussion; this conversation has been moved to chat.
    – Andy
    Commented Nov 14, 2017 at 13:41
10

No answer has yet talked in detail about why this code may or may not be undefined behaviour.

The standard is underspecified in this area, and there is a proposal active to fix it. Under that proposal, this code would NOT be undefined behaviour, and the compilers generating code that crashes would fail to comply with the updated standard. (I revisit this in my concluding paragraph below).

But note that based on the discussion of -D_FORTIFY_SOURCE=2 in other answers, it seems this behaviour is intentional on the part of the developers involved.


I'll talk based on the following snippet:

char *x = malloc(9);
pack *y = (pack *)x;
char *z = (char *)&y->c;
char *w = (char *)y;

Now, all three of x z w refer to the same memory location, and would have the same value and the same representation. But the compiler treats z differently to x. (The compiler also treats w differently to one of those two, although we don't know which as OP didn't explore that case).

This topic is called pointer provenance. It means the restriction on which object a pointer value may range over. The compiler is taking z as having a provenance only over y->c, whereas x has provenance over the entire 9-byte allocation.


The current C Standard does not specify provenance very well. The rules such as pointer subtraction may only occur between two pointers to the same array object is an example of a provenance rule. Another provenance rule is the one that applies to the code we are discussing, C 6.5.6/8:

When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.

The justification for bounds-checking of strcpy, memcpy always comes back to this rule - those functions are defined to behave as if they were a series of character assignments from a base pointer that's incremented to get to the next character, and the increment of a pointer is covered by (P)+1 as discussed in this rule.

Note that the term "the array object" may apply to an object that wasn't declared as an array. This is spelled out in 6.5.6/7:

For the purposes of these operators, a pointer to an object that is not an element of an array behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.


The big question here is: what is "the array object"? In this code, is it y->c, *y, or the actual 9-byte object returned by malloc?

Crucially, the standard sheds no light whatsoever on this matter. Whenever we have objects with subobjects, the standard does not say whether 6.5.6/8 is referring to the object or the subobject.

A further complicating factor is that the standard does not provide a definition for "array", nor for "array object". But to cut a long story short, the object allocated by malloc is described as "an array" in various places in the standard, so it does seem that the 9-byte object here is a valid candidate for "the array object". (In fact this is the only such candidate for the case of using x to iterate over the 9-byte allocation, which I think everyone would agree is legal).


Note: this section is very speculative and I attempt to provide an argument as to why the solution chosen by the compilers here is not self-consistent

An argument could be made that &y->c means the provenance is the int64_t subobject. But this does immediately lead to difficulty. For example, does y have the provenance of *y? If so, (char *)y should have the the provenance *y still, but then this contradicts the rule of 6.3.2.3/7 that casting a pointer to another type and back should return the original pointer (as long as alignment is not violated).

Another thing it doesn't cover is overlapping provenance. Can a pointer compare unequal to a pointer of the same value but a smaller provenance (which is a subset of the larger provenance) ?

Further, if we apply that same principle to the case where the subobject is an array:

char arr[2][2];
char *r = (char *)arr;    
++r; ++r; ++r;     // undefined behavior - exceeds bounds of arr[0]

arr is defined as meaning &arr[0] in this context, so if the provenance of &X is X, then r is actually bounded to just the first row of the array -- perhaps a surprising result.

It would be possible to say that char *r = (char *)arr; leads to UB here, but char *r = (char *)&arr; does not. In fact I used to promote this view in my posts many years ago. But I no longer do: in my experience of trying to defend this position, it just can't be made self-consistent, there are too many problem scenarios. And even if it could be made self-consistent, the fact remains that the standard doesn't specify it. At best, this view should have the status of a proposal.


To finish up, I would recommend reading N2090: Clarifying Pointer Provenance (Draft Defect Report or Proposal for C2x).

Their proposal is that provenance always applies to an allocation. This renders moot all the intricacies of objects and subobjects. There are no sub-allocations. In this proposal, all of x z w are identical and may be used to range over the whole 9-byte allocation. IMHO the simplicity of this is appealing, compared to what was discussed in my previous section.

6
  • Very good discussion on this topic, I can follow your reasoning and agree, that the c standard is in its current state not clear enough regarding this issue.
    – Ctx
    Commented Nov 11, 2017 at 11:44
  • I think there's already another question talking about access violation. But anyway, good job on this.
    – iBug
    Commented Nov 11, 2017 at 12:24
  • I'd also like to point out that, doing pointer arithmetics is not a UB, as long as you don't dereference the ill-formed pointer afterwards.
    – iBug
    Commented Nov 11, 2017 at 12:25
  • 1
    Note that the committee, already in 1992 clarified that in C89 the pointers within array clause applies to the subobject (defect report 017, question 16), and therefore affirmed that even for int a[10][10], though a[0] and a would point to the same object, they would have distinct allowed ranges. Commented Nov 11, 2017 at 19:38
  • 1
    Actually now that I read the N2090, then I have a very different view on what it describes. It actually doesn't take back any of the sub-object clauses that already exist, therefore the range checking of subobjects is still conforming. Commented Nov 11, 2017 at 20:05
3

This is all because of -D_FORTIFY_SOURCE=2 intentionally crashing on what it decides is unsafe.

Some distros build gcc with -D_FORTIFY_SOURCE=2 enabled by default. Some don't. This explains all the differences between different compilers. Probably the ones that don't crash normally will if you build your code with -O3 -D_FORTIFY_SOURCE=2.

Why does it fail only if optimization is on?

_FORTIFY_SOURCE requires compiling with optimization (-O) to keep track of object sizes through pointer casts / assignments. See the slides from this talk for more about _FORTIFY_SOURCE.

Why does strcpy() fail? How can it?

gcc calls __memcpy_chk for strcpy only with -D_FORTIFY_SOURCE=2. It passes 8 as the size of the target object, because that's what it thinks you mean / what it can figure out from the source code you gave it. Same deal for strncpy calling __strncpy_chk.

__memcpy_chk aborts on purpose. _FORTIFY_SOURCE may be going beyond things that are UB in C and disallowing things that look potentially dangerous. This gives it license to decide that your code is unsafe. (As others have pointed out, a flexible array member as the last member of your struct, and/or a union with a flexible-array member, is how you should express what you're doing in C.)


gcc even warns that the check will always fail:

In function 'strcpy',
    inlined from 'main' at <source>:18:9:
/usr/include/x86_64-linux-gnu/bits/string3.h:110:10: warning: call to __builtin___memcpy_chk will always overflow destination buffer
   return __builtin___strcpy_chk (__dest, __src, __bos (__dest));
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

(from gcc7.2 -O3 -Wall on the Godbolt compiler explorer).


Why doesn't memcpy() fail regardless of -O level?

IDK.

gcc fully inlines it just an 8B load/store + a 1B load/store. (Seems like a missed optimization; it should know that malloc didn't modify it on the stack, so it could just store it from immediates again instead of reloading. (Or better keep the 8B value in a register.)

2
  • Can I define this stuff in code? I.e. #define _FORTIFY_SOURCE 0 to force disable it regardless of compiler cmdline arguments?
    – iBug
    Commented Nov 11, 2017 at 7:25
  • @iBug: I just tried it on Godbolt (godbolt.org/g/6aegks). #undef _FORTIFY_SOURCE before any #include <> lines changes the asm output to not call any _chk functions, so yes, you can force-disable it in the source even if it's explicitly on on the command line. Commented Nov 11, 2017 at 16:24
2

why making things complicated? Overcomplexifying like you're doing gives just more space for undefined behaviour, in that part:

memcpy((char*)&p->c, str, strlen(str)+1);
puts((char*)&p->c);

warning: passing argument 1 of 'puts' from incompatible pointer ty pe [-Wincompatible-pointer-types] puts(&p->c);

you're clearly ending up in an unallocated memory area or somewhere writable if you're lucky...

Optimizing or not may change the values of the addresses, and it may work (since the addresses match), or not. You just cannot do what you want to do (basically lying to the compiler)

I would:

  • allocate just what's needed for the struct, don't take the length of the string inside into account, it's useless
  • don't use gets as it's unsafe and obsolescent
  • use strdup instead of the bug-prone memcpy code you're using since you're handling strings. strdup won't forget to allocate the nul-terminator, and will set it in the target for you.
  • don't forget to free the duplicated string
  • read the warnings, put(&p->c) is undefined behaviour

test.c:19:10: warning: passing argument 1 of 'puts' from incompatible pointer ty pe [-Wincompatible-pointer-types] puts(&p->c);

My proposal

int main(){
    pack *p = malloc(sizeof(pack));
    char str[1024];
    fgets(str,sizeof(str),stdin);
    p->c = strdup(str);
    puts(p->c);
    free(p->c);
    free(p);
  return 0;
}
20
  • 6
    Because it doesn't answer the question at all. The author is in my eyes clearly aware of all this and specifically stated that in the question. He wants to know, why his code segfaults in some cases.
    – Ctx
    Commented Nov 10, 2017 at 10:24
  • 5
    @Ctx I think this statements "read the warnings, put(&p->c) is undefined behavior" answers the questing pretty well.
    – Alex Lop.
    Commented Nov 10, 2017 at 10:26
  • 1
    @iBug: In that case, you should change the declaration of c to char c[];. Your approach of casting the pointer is really bad. Commented Nov 10, 2017 at 10:28
  • 4
    @AlexLop. What exactly is the problem? Is it an aliasing violation? Can you explain it clearly? I don't think he wants to fix the code. I think he wants to understand as deeply as possible precisely why his code is allowed to crash. Commented Nov 10, 2017 at 10:53
  • 3
    I do not see a standard violation either... and the pointer &p->c should point into allocated space with at least room for the string to copy left.
    – Ctx
    Commented Nov 10, 2017 at 10:56
-4

Your pointer p->c is the cause of crash.
First initialize struct with size of "unsigned long long" plus size of "*p".
Second initialize pointer p->c with the required area size. Make operation copy: strcpy(p->c, str);
Finally free first free(p->c) and free(p).
I think it was this.
[EDIT]
I'll insist. The cause of the error is that its structure only reserves space for the pointer but does not allocate the pointer to contain the data that will be copied.
Take a look

int main() 
{
    pack *p;
    char str[1024];
    gets(str);
    size_t len_struc = sizeof(*p) + sizeof(unsigned long long);
    p = malloc(len_struc);
    p->c = malloc(strlen(str));
    strcpy(p->c, str); // This do not crashes!
    puts(&p->c);
    free(p->c);
    free(p);
    return 0;
}

[EDIT2]
This is not a traditional way to store data but this works:

    pack2 *p;
    char str[9] = "aaaaaaaa"; // Input
    size_t len = sizeof(pack) + (strlen(str) + 1);
    p = malloc(len);
    // Version 1: crash
    strcpy((char*)p + sizeof(pack), str);
    free(p);

6
  • 1
    What are you insisting on? Isn't there enough space malloc'd?
    – iBug
    Commented Nov 10, 2017 at 14:35
  • No, malloc allocates area for the pointer. Then you must allocate the pointer and copy its value to this area. Remember a pointer points to something that has an address, which is saved on the pointer.
    – lsalamon
    Commented Nov 10, 2017 at 15:13
  • 1
    You're still confused. I'm not using the pointer to save the address of another block of memory, I'm ysing the pointer to save a string.
    – iBug
    Commented Nov 10, 2017 at 15:15
  • This is the point. Pointers do not store data, but the address for something.
    – lsalamon
    Commented Nov 10, 2017 at 16:55
  • @Isalamon Basically, everything stored raw in the storage (primary or secondary) of a computer is called "data". The address of other data is also data, so a pointer stores data.
    – iBug
    Commented Nov 10, 2017 at 17:00

Not the answer you're looking for? Browse other questions tagged or ask your own question.