On self-modifying executables in Go

Alex Muscar

January 30, 2022

Recently, a friend wrote about self-modifying executables in Rust. I was curious whether I could do the same thing in Go. The short answer is “yes”. Read on if you’re curious how I did it.

General approach

Let’s start with a skeleton program, and we’ll add the patching code later:

package main

import (
    "fmt"
)

var CNT = 0xCAFEBABE

func main() {
    fmt.Println(CNT)
    // Patch the binary to increment CNT
}

We’ll poke around the ELF file from the command line, so I initialised CNT to an easy-to-find value.

The Rust program uses the link_section attribute to give the counter variable a section all of its own. That’s quite neat, but Go doesn’t have anything like that as far as I can tell. While having the variable live in a dedicated section makes it easier to find, it’s not essential to the solution. The program just has to find the variable somehow, and it can do that by going through the symbol table.

Let’s use the strings program to find CNT’s full name in the binary:

$ strings self-modify | grep CNT
    stack=[cgocheckdeadlockmain.CNTno anodepollDescrunnablerwmutexRrwmutexWscavengetraceBufunknown( (forced) -> node= B exp.)  B work ( blocked= in use)
main.CNT
runtime.x86HasPOPCNT

OK, we’re looking for the symbol named main.CNT. readelf is a handy tool for analysing ELF binaries. We can list the entries in our executable’s symbol table, and have a look at main.CNT:

$ readelf -s self-modify

   Num:    Value          Size Type    Bind   Vis      Ndx Name
  
  ...
  
  1706: 00000000005472a8     8 OBJECT  GLOBAL DEFAULT    9 main.CNT

  ...

The entry holds all the details we need to find CNT in the binary: the section index (Ndx), the variable size, and the location for its storage (Value). One thing to keep in mind is that the location is not an offset in the executable file, but a virtual address.

Using readelf we can inspect section 9:

$ readelf --sections self-modify
There are 23 section headers, starting at offset 0x1c8:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align

  ...

  [ 9] .noptrdata        PROGBITS         00000000005471c0  001471c0
       0000000000010a60  0000000000000000  WA       0     0     32

The object is placed in the .noptrdata section, which is a Go-specific section. But we’re not really bothered by that. What we care about is the Address field. That’s the virtual address of the section. Using the symbol table entry’s virtual address and the section’s virtual address we can work out our object’s offset in the section:

$ python3 -c 'print(0x00000000005472a8 - 0x00000000005471c0)'
232

The final step is to find the object’s physical location. This is where we use the section offset. We add the object offset to the section offset to find out where the storage for our counter is in the executable. Let’s check our logic:

$ python3 -c 'print(0x001471c0 + 232)'
1340072
$ hexdump -s 1340072 -n 8 self-modify
01472a8 babe cafe 0000 0000 

Cool, it works. Now we can write some code to do the patching.
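As a warm-up, here is a minimal sketch of the arithmetic from the shell session above. The helper names fileOffset and decodeLE are made up for illustration, and the byte slice is typed in by hand rather than read from the binary. It also shows why hexdump printed "babe cafe": by default hexdump groups the bytes into 16-bit words in host byte order, so on a little-endian machine the bytes be ba fe ca come out as those two words.

```go
package main

import (
    "encoding/binary"
    "fmt"
)

// fileOffset converts a symbol's virtual address into an offset in the
// executable file, given the virtual address and file offset of the
// section that holds the symbol. (Hypothetical helper for illustration.)
func fileOffset(symAddr, sectAddr, sectOff uint64) uint64 {
    return sectOff + (symAddr - sectAddr)
}

// decodeLE interprets the first 8 bytes of b as a little-endian uint64.
func decodeLE(b []byte) uint64 {
    return binary.LittleEndian.Uint64(b)
}

func main() {
    // Values copied from the readelf output above.
    off := fileOffset(0x5472a8, 0x5471c0, 0x1471c0)
    fmt.Println(off) // 1340072

    // The 8 bytes stored at that offset, in file order.
    raw := []byte{0xbe, 0xba, 0xfe, 0xca, 0x00, 0x00, 0x00, 0x00}
    fmt.Printf("%#x\n", decodeLE(raw)) // 0xcafebabe
}
```

The full program does not hard-code little-endian; it uses the byte order that debug/elf reports for the file.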

The code

The full program is just the steps we ran above in the shell translated to Go (plus error handling, more on that later). Go’s standard library comes with a debug/elf package which makes reading ELF files almost as convenient as using readelf.

package main

import (
    "bytes"
    "debug/elf"
    "fmt"
    "io"
    "log"
    "os"
)

var CNT = 1

type entry struct {
    value, off uint64
}

func getEntry(f *elf.File, name string) (*entry, error) {
    syms, err := f.Symbols()
    if err != nil {
        return nil, err
    }
    for _, s := range syms {
        if s.Name == name {
            sect := f.Sections[s.Section]
            bs, err := sect.Data()
            if err != nil {
                return nil, err
            }
            // Offset of the variable within its section.
            varOff := s.Value - sect.Addr
            return &entry{f.ByteOrder.Uint64(bs[varOff : varOff+s.Size]), sect.Offset + varOff}, nil
        }
    }
    return nil, fmt.Errorf("can't find symbol '%s'", name)
}

func main() {
    fmt.Println(CNT)

    // Patch the binary to increment CNT
    exeName := os.Args[0]
    tmpName := exeName + ".tmp"

    f, err := os.Open(exeName)
    if err != nil {
        log.Fatalf("can't open file '%s': %v", exeName, err)
    }
    defer f.Close()

    data, err := io.ReadAll(f)
    if err != nil {
        log.Fatalf("can't read file '%s': %v", exeName, err)
    }
    elfFile, err := elf.NewFile(bytes.NewReader(data))
    if err != nil {
        log.Fatalf("can't read ELF file: %v", err)
    }

    entry, err := getEntry(elfFile, "main.CNT")
    if err != nil {
        log.Fatalf("can't find counter object in ELF file: %v", err)
    }
    elfFile.ByteOrder.PutUint64(data[entry.off:], entry.value+1)

    fi, err := f.Stat()
    if err != nil {
        log.Fatalf("can't get file mode for '%s': %v", exeName, err)
    }
    // Write the patched image to a temporary file, then rename it over the original.
    if err := os.WriteFile(tmpName, data, fi.Mode()); err != nil {
        log.Fatalf("can't write file '%s': %v", tmpName, err)
    }

    if err := os.Rename(tmpName, exeName); err != nil {
        log.Fatalln("can't rename temporary file:", err)
    }
}
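Stripped of the ELF lookup and the file handling, the patch itself is a read-increment-write on a byte slice. Here is a minimal sketch of just that step. The helper name bumpCounter is hypothetical, and the sketch assumes a little-endian 8-byte counter (as on amd64), whereas the program above takes the byte order from the ELF file. The bounds check is an addition that the program above does not have.

```go
package main

import (
    "encoding/binary"
    "fmt"
)

// bumpCounter increments the little-endian uint64 stored at
// data[off:off+8] in place and returns the new value.
// (Hypothetical helper; assumes an 8-byte little-endian counter.)
func bumpCounter(data []byte, off uint64) (uint64, error) {
    if uint64(len(data)) < 8 || off > uint64(len(data))-8 {
        return 0, fmt.Errorf("offset %d out of range for %d bytes", off, len(data))
    }
    v := binary.LittleEndian.Uint64(data[off:]) + 1
    binary.LittleEndian.PutUint64(data[off:], v)
    return v, nil
}

func main() {
    // A 16-byte stand-in for the file image, with the counter at offset 8.
    img := make([]byte, 16)
    img[8] = 1

    v, err := bumpCounter(img, 8)
    if err != nil {
        fmt.Println("error:", err)
        return
    }
    fmt.Println(v, img[8:]) // 2 [2 0 0 0 0 0 0 0]
}
```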

Caveats

The most obvious caveat is that this only works for ELF files on Linux (it might work on other operating systems using ELF, but I haven’t tested it).
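If you wanted the program to fail with a clearer message on other executable formats, one option is to check the ELF magic number before parsing. The sketch below is an illustration, not part of the program above; isELF is a made-up helper, while elf.ELFMAG is the magic string exported by debug/elf. In practice elf.NewFile already rejects files that don't start with the magic, so this only buys a friendlier error.

```go
package main

import (
    "bytes"
    "debug/elf"
    "fmt"
)

// isELF reports whether data starts with the 4-byte ELF magic number.
// (Hypothetical helper for illustration.)
func isELF(data []byte) bool {
    return bytes.HasPrefix(data, []byte(elf.ELFMAG))
}

func main() {
    // Hand-written headers: the start of an ELF file and of a PE file.
    fmt.Println(isELF([]byte("\x7fELF\x02\x01\x01"))) // true
    fmt.Println(isELF([]byte("MZ\x90\x00")))          // false
}
```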

The code also assumes that the compiler will always reserve storage for the variable in the object file. At the time of writing, under Go 1.17, that’s the case. And since this is an exported variable the compiler won’t inline it. But this is the only package, and the variable is not used anywhere else. If a later version of the compiler starts doing some clever cross-package analysis, it may well decide to inline the value in the one place it’s used. Fun exercise: try changing the var to a const and see what happens.

Why did we initialise CNT to 1 and not 0? If we had initialised it to 0, the compiler would put it in the .bss section, which is treated specially by the program loader. The executable only stores the length of the section, not the data. The loader allocates the data when it loads the program. That means there’s nothing for us to patch.
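A program could guard against this case by looking at the section type: sections of type SHT_NOBITS, such as .bss, occupy no space in the file. The sketch below is an illustration under assumptions. hasFileContents is a made-up helper, and the two section headers are built by hand rather than read from a real binary; in the program above you would pass sect.SectionHeader instead.

```go
package main

import (
    "debug/elf"
    "fmt"
)

// Hand-built section headers standing in for the real ones.
var (
    dataHdr = elf.SectionHeader{Name: ".noptrdata", Type: elf.SHT_PROGBITS}
    bssHdr  = elf.SectionHeader{Name: ".bss", Type: elf.SHT_NOBITS}
)

// hasFileContents reports whether a section's bytes are stored in the
// file, i.e. whether there is anything to patch.
// (Hypothetical helper for illustration.)
func hasFileContents(h elf.SectionHeader) bool {
    return h.Type != elf.SHT_NOBITS
}

func main() {
    fmt.Println(dataHdr.Name, hasFileContents(dataHdr)) // .noptrdata true
    fmt.Println(bssHdr.Name, hasFileContents(bssHdr))   // .bss false
}
```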

Bonus: generic error handling

I do like explicit error handling, but even so, I find the above code verbose to the point of making the main logic hard to follow. You could argue that the steps should be encapsulated in functions with the appropriate error handling. And that’s true. But even so you’d end up with a handful of mustX functions, which seems to be a common pattern in Go.

Instead, I decided to give the Go 1.18 beta1 with generics a go. The program below defines two helpers, mustDo and must. The former is meant to wrap functions that are only called for their side effects and only return an error. It can actually be written in Go 1.17 since it doesn’t use any type parameters. must, on the other hand, is generic: it wraps functions that return a value and an error. These make the code considerably easier to read. If you’re not familiar with Go’s idioms, log.Fatalln is equivalent to calling log.Println followed by a call to os.Exit(1).

package main

import (
    "bytes"
    "debug/elf"
    "fmt"
    "io"
    "log"
    "os"
)

var CNT = 1

type entry struct {
    value, off uint64
}

func mustDo(err error) {
    if err != nil {
        log.Fatalln(err)
    }
}

func must[T any](t T, err error) T {
    if err != nil {
        log.Fatalln(err)
    }
    return t
}

func getEntry(f *elf.File, name string) (*entry, error) {
    syms := must(f.Symbols())
    for _, s := range syms {
        if s.Name == name {
            sect := f.Sections[s.Section]
            bs := must(sect.Data())
            varOff := s.Value - sect.Addr
            return &entry{f.ByteOrder.Uint64(bs[varOff : varOff+s.Size]), sect.Offset + varOff}, nil
        }
    }
    return nil, fmt.Errorf("can't find symbol '%s'", name)
}

func main() {
    fmt.Println(CNT)

    exeName := os.Args[0]
    tmpName := exeName + ".tmp"

    f := must(os.Open(exeName))
    defer f.Close()

    data := must(io.ReadAll(f))
    elfFile := must(elf.NewFile(bytes.NewReader(data)))

    entry := must(getEntry(elfFile, "main.CNT"))
    elfFile.ByteOrder.PutUint64(data[entry.off:], entry.value+1)

    fi := must(f.Stat())
    mustDo(os.WriteFile(tmpName, data, fi.Mode()))
    mustDo(os.Rename(tmpName, exeName))
}