BSD Signals and Cocoa

What is a signal?

A signal is a notification delivered to a process, usually by the kernel. The signals discussed here are sent when a process tries to do something it shouldn't be doing, and they often result in termination of the process. There are many kinds of signals, such as

  • SIGABRT
  • SIGSEGV
  • SIGBUS
  • SIGILL
  • SIGSTOP
  • SIGKILL

A more complete list can be seen here. Among them, the ones of particular import to a Cocoa developer are SIGABRT, SIGSEGV and SIGBUS.

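SIGABRT happens when a process calls abort(), which is what ultimately happens when an NSException goes uncaught. SIGSEGV happens when a process tries to access memory that's not mapped into its address space or, on many Unix systems, when it tries to write into the TEXT segment, which is a read-only segment (macOS typically reports a write to mapped but read-only memory as SIGBUS instead). SIGBUS can also happen, on processors that enforce alignment, when a process tries to access unaligned memory, like reading a 4-byte value from a memory address that's not divisible by 4.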

Now that we know what signals mean, let’s see if we can generate the important ones on purpose.

Raising SIGABRT

NSString *str = @"";
[(id)str objectAtIndex:0];

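The above code should generate a SIGABRT because we are sending objectAtIndex:, a method of the NSArray class, to an NSString object. NSString does not respond to that selector, so an exception is raised, goes uncaught, and the process aborts. When we run the above code, we get a crash. Looking at the top section of the crash report, we see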

Exception Type:  EXC_CRASH (SIGABRT)

Raising SIGSEGV

int i = *(int *)0x1;

The above code, when run on an x86_64 Mac, produces a SIGSEGV because our process is trying to load the value at address 0x1, which is not mapped into its address space (the lowest addresses are deliberately left unmapped so that null pointer dereferences fault). Looking at the top section of the crash report, we see

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000001
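
The two crashes so far can also be observed from plain C without reading a crash report. The following is a minimal sketch, not part of the original experiments: it assumes a POSIX system, and signal_that_killed, call_abort and read_address_one are hypothetical helper names. It runs a function in a forked child process and asks waitpid which signal ended it. abort() stands in for the uncaught NSException here, since that is the call that ends the process in that case.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs fn in a child process. Returns the number of the signal that
   terminated the child, 0 if it exited normally, or -1 on error. */
int signal_that_killed(void (*fn)(void))
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        fn();
        _exit(0); /* only reached if fn returns */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

/* Terminates the calling process with SIGABRT. */
void call_abort(void)
{
    abort();
}

/* Loads from address 0x1, mirroring the one-line example above.
   volatile keeps the compiler from dropping the load. */
void read_address_one(void)
{
    volatile int *p = (volatile int *)0x1;
    int value = *p;
    (void)value;
}
```

Calling signal_that_killed(call_abort) should return SIGABRT. signal_that_killed(read_address_one) is expected to return SIGSEGV on an x86_64 Mac, though which signal a bad access produces depends on the platform.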

Another example of code that raises SIGSEGV on an x86_64 Mac is invoking a nil block.

void (^block)(void) = nil;
block();

Looking at the top section of the crash report we see

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000010

The address 0x0000000000000010 seen raises an interesting question. Why would a nil block invocation try to access the value at 0x0000000000000010?

I found the explanation for that here on Stack Overflow, courtesy of Matt J Galloway. The gist is that the compiler converts a block into a struct, and the block variable becomes a pointer to that compiler-generated struct. The generated struct looks like this

struct Block_layout {
    void *isa;
    int flags;
    int reserved;
    void (*invoke)(void *, ...);
    struct Block_descriptor *descriptor;
    /* Imported variables. */
};

The 4th member is a function pointer pointing to the body of the block. To invoke the block, the processor has to load that pointer from address 0 (nil) + 8 bytes (void *isa) + 4 bytes (int flags) + 4 bytes (int reserved) = 16, i.e. 0x0000000000000010 in hexadecimal. The load fails because the address 0x0000000000000010 is not mapped into our process's address space. Note that there is nothing wrong with 0x0000000000000010 as an address in itself; it is just not one that our process is allowed to access.
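
The offset arithmetic can be checked with offsetof. This is a small sketch that reuses the struct shown above; the forward declaration of struct Block_descriptor and the helper name invoke_offset are my own additions for illustration.

```c
#include <stddef.h>

struct Block_descriptor; /* opaque here; only a pointer to it is stored */

struct Block_layout {
    void *isa;
    int flags;
    int reserved;
    void (*invoke)(void *, ...);
    struct Block_descriptor *descriptor;
};

/* Byte offset of the invoke pointer from the start of the struct. */
size_t invoke_offset(void)
{
    return offsetof(struct Block_layout, invoke);
}
```

On a typical 64-bit build, where pointers are 8 bytes and int is 4 bytes, invoke_offset() gives 16, which matches the faulting address 0x10 when the struct pointer is nil.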

Raising SIGBUS

I had read that accessing misaligned memory would lead to SIGBUS. So I tried doing this


#include <stdio.h>

typedef struct
{
    unsigned short a1;
    unsigned short b1;
} XYZ;

int main(int argc, char *argv[])
{
    XYZ p = {255, 2};

    char *t = (char *)&p; // view the struct as raw bytes
    printf("%p\n", (void *)t);

    t += 1; // deliberate misaligning by pointer increment
    printf("%p\n", (void *)t);

    // t is no longer aligned for unsigned short, so this is undefined behavior in C
    unsigned short a1 = *(unsigned short *)t; // deliberate invalid dereferencing
    printf("%d\n", a1);

    return 0;
}

I thought this line

unsigned short a1 = *(unsigned short *)t; //invalid dereferencing

would produce a SIGBUS on my x86_64 Mac because I was trying to read a 2-byte unsigned short from an odd address that I arrived at by a deliberate cast and pointer increment. But I didn't get a crash, and later learned that Intel processors by default handle misaligned memory accesses instead of faulting on them.

Instead of crashing, I got the following result

0xbfffcbe8
0xbfffcbe9
512
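
As an aside, whether an address is suitably aligned can be tested before dereferencing it. The helpers below are a hedged sketch with hypothetical names (is_aligned, is_aligned_for_ushort); they assume a C11 compiler for alignof and rely on converting the pointer to uintptr_t, which works on common platforms.

```c
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when the address in p is a multiple of alignment. */
bool is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}

/* True when p may safely be read as an unsigned short. */
bool is_aligned_for_ushort(const void *p)
{
    return is_aligned(p, alignof(unsigned short));
}
```

For the program above, the check would pass for the original pointer and fail after t += 1, because the second printed address is odd.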

The output 512 calls for an explanation. The data type unsigned short is 2 bytes wide and can store values from 0 to 2^16 - 1. The initialization of p sets p.a1 = 255 and p.b1 = 2. Shown in bits, they look like this on a little-endian architecture (least significant bytes are stored at lower memory addresses)

p.a1 = 255
{11111111 00000000}

p.b1 = 2
{00000010 00000000}

The address of a struct is the same as the address of its first member, and here the members are laid out adjacent to each other in memory. So &p is equivalent to &p.a1, and p.b1 immediately follows p.a1. Because t has type char *, incrementing it by 1 advances it by exactly 1 byte, which lands me on the 2nd byte of p.a1. When I read an unsigned short from there, I actually end up with the 2nd byte {00000000} of p.a1 followed by the 1st byte {00000010} of p.b1. My unsigned short value therefore looks like {00000000 00000010} in memory, which, converted to decimal taking little-endianness into account, means

0*2^0 + ... + 0*2^7 + 0*2^8 + 1*2^9 + 0*2^10 + ... + 0*2^15 = 512
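
The same result can be reproduced without the misaligned dereference. The sketch below is my own illustration, not the program from above: it uses uint16_t in place of unsigned short, copies the two bytes out with memcpy, and introduces the hypothetical helpers from_little_endian, is_little_endian and read_u16_at_byte_offset_1.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint16_t a1;
    uint16_t b1;
} XYZ;

/* Combines two bytes the way a little-endian load does:
   the byte at the lower address is the least significant. */
uint16_t from_little_endian(uint8_t low_address_byte, uint8_t high_address_byte)
{
    return (uint16_t)(low_address_byte | (high_address_byte << 8));
}

/* True when this machine stores the least significant byte first. */
bool is_little_endian(void)
{
    uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Reads 2 bytes starting 1 byte into p, using memcpy instead of
   dereferencing a misaligned pointer. */
uint16_t read_u16_at_byte_offset_1(void)
{
    XYZ p = {255, 2};
    uint16_t value;
    memcpy(&value, (const unsigned char *)&p + 1, sizeof value);
    return value;
}
```

from_little_endian(0x00, 0x02) evaluates to 512, and on a little-endian machine read_u16_at_byte_offset_1() returns the same value.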