Following ELF imports

Everyone working with dynamically linked ELF files must deal with this, probably semi-manually as I do.

You’re reversing some ELF library or executable, following the flow of some code path, and it ends with a function that is imported from some library. IDA will try to determine the prototype with what it has available, but this is not always useful, especially on certain architectures where IDA thinks the imported function takes no arguments or an incorrect number of arguments.

Currently, I manually identify which library the symbol is in, using a combination of GNU find and readelf to recursively search a sysroot where my target ELF files are. Then I manually open the library in a new IDA instance, let it perform analysis, find the function, and (at a minimum) copy the prototype to the original IDB. That way I at least have the correct prototype in the “main” IDB. In many cases, I have to reverse the function in the library and add a function comment in the “main” IDB.

In my use-case, where I lay out an entire filesystem structure exactly as it is in the target filesystem, it would be relatively easy to automatically identify the library that a given function is imported from (by parsing the ELF sections of all imported libraries), using a hard-coded relative search path.

For example, I may have /ida/targets/target1-rootfs/usr/sbin/app open in IDA

Functions of the exe are typically imported from …/lib and …/usr/lib
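To make the lookup concrete, here is a minimal sketch of the search described above, assuming GNU readelf is on the PATH and the sysroot root is known. The function names, the `dump` parameter and the default directory list are made up for illustration, and the parser only handles the usual `readelf --dyn-syms -W` column layout.

```python
import subprocess
from pathlib import Path

# Assumption: library directories relative to the sysroot root.
DEFAULT_LIB_DIRS = ("lib", "usr/lib")


def run_readelf(path):
    """Return the dynamic symbol table of `path` as printed by GNU readelf."""
    result = subprocess.run(
        ["readelf", "--dyn-syms", "-W", str(path)],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


def defined_functions(dynsym_text):
    """Names of functions that are defined (not UND) in readelf's output."""
    names = set()
    for line in dynsym_text.splitlines():
        fields = line.split()
        # Expected columns: Num: Value Size Type Bind Vis Ndx Name
        if len(fields) < 8 or not fields[0].rstrip(":").isdigit():
            continue
        sym_type, ndx, name = fields[3], fields[6], fields[7]
        if sym_type in ("FUNC", "IFUNC") and ndx != "UND":
            names.add(name.split("@")[0])  # drop any symbol version suffix
    return names


def libraries_defining(symbol, sysroot, lib_dirs=DEFAULT_LIB_DIRS, dump=None):
    """List shared objects under sysroot/lib_dirs that define `symbol`."""
    dump = dump or run_readelf
    hits = []
    for rel in lib_dirs:
        base = Path(sysroot, rel)
        if not base.is_dir():
            continue
        for lib in sorted(base.glob("*.so*")):
            if lib.is_file() and symbol in defined_functions(dump(lib)):
                hits.append(lib)
    return hits
```

Something like `libraries_defining("some_import", "/ida/targets/target1-rootfs")` would then return the candidate libraries. Symlinked names (libfoo.so pointing at libfoo.so.1) will show up more than once in this sketch.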

What an implementation would look like:

When reaching the call and prototype of the imported function, a context menu might have a dropdown showing which libraries relative to the current ELF contain the implementation. Choosing one could offer an “open in new IDA session” action.

Another option would be to look for the IDB in the same search path and offer an “import prototypes” action, plus an “import function-level comments” action if the IDB for the library has a function-level comment for the function.

This would be a nice improvement.

A far more ambitious solution would be seamlessly referencing functions across multiple ELF files within one IDA process. I understand that would require massive changes unless done in a very clever way. I don’t expect this to happen any time soon; it’s more work than it’s worth.

The first ideas could be done with a plugin and/or scripting with headless/batch mode IDA, but I’m not aware of any doing it at this time.


Have you ever experimented with endrazine/wcc (The Witchcraft Compiler Collection) or whitequark/superlinker (a tool for reinterpreting ELF executables and shared libraries), both on GitHub? If so, did either of these help, or did they lead to different issues?

Alternatively, what about force loading the shared libraries, breaking on first instruction or input syscall, then core dumping from gdb, then opening the core dump with IDA?

The functionality that you are describing is already there. The only reason why this works on Windows is because there are function prototypes hidden in mssdk*.til (which are, I believe, automatically loaded using startup signatures and autoload.cfg).

Here is an example:

>tilib -ls mssdk64_win10.til | grep "MessageBoxW("
FFFFFFFF 00000000          int __stdcall MessageBoxW(HWND hWnd, LPCWSTR lpText, LPCWSTR lpCaption, UINT uType);

All you need to do is to generate the til files for your libraries.


Thank you very much for reading and taking the time to provide guidance

I’m sorry for this long post. I’m very thankful for any input if you have time

You say “all you need to do is generate the TIL files for your libraries” … and it’s not incorrect, if I had decent TILs for each and loaded them all, that would completely solve the “accurate prototypes for imported functions” aspect (but not the “jump into the function” part - see the end of this post for how I’ll try to do that…)

This is not a technical challenge, it’s just a bit cumbersome to automate. Automating it is critical though, as it’s not uncommon for me to work on more than a dozen collections of executables+libraries, and in many cases an ELF imports from 10-20 proprietary libraries.

Creating TILs for the libraries is not too far from what I currently do mostly manually. Get list of imported library names from the ELF header, then find each library in the directory tree, load, export to C header, idaclang (in the past, tilib) to produce the TILs

It works, but the glue is not already there, so I need to script it, which will take some time. If it’s the best way to go, I’ll invest the time in automating this approach

If you have any advice or criticism on the process as described below, I’m thankful to hear it!

Generating TIL from ELF

To generate a TIL from a stripped dynamically linked library, there are a few approaches I’m aware of. I’ve used each at one time or another but haven’t actually made any of them part of an automated and repeatable solution…

Step 1 is always load the lib to IDA, auto analyze, optionally do additional improvements (sig, recursive decompile, manual analysis of any remarkable “problems”, …)
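As a rough sketch of automating that first step, the helper below only builds the command lines for a headless run per library; it does not launch anything. It assumes the text-mode binary is called `idat` and relies on the documented switches -A (autonomous), -B (batch), -o (output database) and -S (run script). The helper name, the `.i64` suffix and the output layout are my own assumptions.

```python
from pathlib import Path


def batch_commands(libraries, out_dir, ida="idat", script=None):
    """Build one headless IDA command line per library.

    If `script` is given it is passed with -S and is expected to wait for
    auto-analysis and exit by itself; otherwise plain batch mode (-B) is used.
    """
    commands = []
    for lib in libraries:
        lib = Path(lib)
        database = Path(out_dir) / (lib.name + ".i64")  # assumed naming scheme
        cmd = [ida, "-A", f"-o{database}"]
        cmd.append(f"-S{script}" if script else "-B")
        cmd.append(str(lib))
        commands.append(cmd)
    return commands
```

Each returned list can be handed to `subprocess.run` as-is, which avoids shell quoting problems with paths that contain spaces.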

From there, a few approaches I know of:

Method 1: produce C header file, build TIL from header with idaclang (or tilib)

Method 2: Export a TIL directly from IDA. I don’t think this is supported in the GUI, but I have used plugins to do it before. I’m still getting used to the new type interfaces in the 9.x API so it can be slow for me to write this even though it’s probably not much code

Method 3: Save the IDB in an unpacked form, use the TIL on the filesystem. (Is this safe/recommended? I avoid doing it for some reason, it seems easiest though…)

Which is the best approach? Are there better alternatives to generate TILs from a proprietary ELF shared library?

Second part: Quickly analyzing the code of the imported function

The other part of what I need - to quickly jump to the actual code of an imported function - is also just a bunch of glue. Not a technical challenge, just finding the right API interfaces to use:

  • right click on function name, present a “jump to implementation”
  • extract the NEEDED entries from the ELF, effectively a list of shared library dependencies for the exe
  • find the filesystem location for each library (checking ../lib ../usr/lib, …, or even embedded elf runpath entries)
  • parse each library to find the implementation of the imported function
  • present to user the path to the library and an option to open an additional ida instance loading that library
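The middle steps of that list (NEEDED entries, then locating each library) could look roughly like the sketch below. It parses the text printed by `readelf -d` rather than the ELF itself, and the function names and default directories are hypothetical. `$ORIGIN`-style entries are skipped rather than expanded.

```python
import re
from pathlib import Path

# Matches e.g. " 0x...01 (NEEDED)   Shared library: [libc.so.6]"
_DYN_ENTRY = re.compile(r"\((NEEDED|RPATH|RUNPATH)\)\s+.*?\[(.*)\]")


def parse_dynamic(readelf_d_text):
    """Split `readelf -d` output into NEEDED names and RPATH/RUNPATH dirs."""
    needed, run_dirs = [], []
    for tag, value in _DYN_ENTRY.findall(readelf_d_text):
        if tag == "NEEDED":
            needed.append(value)
        else:
            run_dirs.extend(d for d in value.split(":") if d)
    return needed, run_dirs


def resolve_in_sysroot(needed, run_dirs, sysroot,
                       default_dirs=("/lib", "/usr/lib")):
    """Map each NEEDED name to its path inside the sysroot, or None."""
    resolved = {}
    for name in needed:
        resolved[name] = None
        for directory in list(run_dirs) + list(default_dirs):
            if "$" in directory:  # $ORIGIN and friends: not handled here
                continue
            candidate = Path(sysroot) / directory.lstrip("/") / name
            # Absolute symlinks inside a sysroot often dangle on the host,
            # so a symlink counts as a hit even if its target is missing.
            if candidate.exists() or candidate.is_symlink():
                resolved[name] = candidate
                break
    return resolved
```

The resolved paths can then be searched for the imported symbol, and the matching one offered to the user for opening in another IDA instance.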

Any better ideas for this?

Thank you again!

This is the type of approach my mind often jumps to, we think alike :laughing:

I previously had significant difficulty finding a way to load ELF core files into IDA, I spent some time on it, then gave up. I recall determining at the time that it was not supported - it required manually parsing the core structure to create the mappings and things like this. Perhaps this has changed with IDA8 or IDA9?

It’s not my preferred solution, as I expect many native and open-source/custom analysis scripts and plugins will be confused (or completely broken) due to assumptions about the state of the loaded file. That’s speculation; I will have to look into it.

Thank you for the idea, I feel less crazy now that someone else has mentioned it!

Not yet, but now it’s in my TODO list!

Thanks!

If you can use IDA’s debugger, then you can do it directly from IDA: just use “Analyze module” from the Modules list context menu to copy the code into the IDB. Note that you may get conflicts on subsequent debugger runs due to ASLR, so it is better to do it once per project.