Ask HN:

"How would you explain an ABI?"

This would be my (possibly wrong) explanation:

So in Linux we have syscalls such as write, read, close, socket, ... These syscalls are defined according to the POSIX standard. POSIX is basically a standard for operating systems. Linux/Unix follow these standards. So syscalls are an interface so that we can communicate with the OS. (Is it the application binary interface already?)

Now, we have the C standard library, so we don't need to deal with syscalls (or specific OS stuff). The C standard library is basically an application programming interface (API). So printf, fopen etc. map to write, open, etc. respectively:

printf ~ write

fopen ~ open

fclose ~ close

...
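To make the mapping above concrete, here is a minimal sketch, assuming a POSIX system. The helper names (`via_syscall`, `via_stdio`, `capture`) are made up for illustration. It sends the same text once through the thin `write()` wrapper and once through stdio, and captures both through a pipe so they can be compared:

```cpp
#include <cassert>
#include <cstddef>
#include <stdio.h>   // fdopen is POSIX, so use the C header
#include <string>
#include <unistd.h>  // POSIX: pipe, read, write, dup, close

// Low level: hand the bytes straight to the write() wrapper around the syscall.
void via_syscall(int fd) {
    const char msg[] = "hello 42\n";
    ssize_t written = write(fd, msg, sizeof msg - 1);
    (void)written;  // a real program would check for errors and short writes
}

// High level: stdio formats into a buffer and calls write() for us on flush.
void via_stdio(int fd) {
    FILE* stream = fdopen(dup(fd), "w");  // dup so fclose leaves fd itself open
    if (stream == nullptr) return;
    fprintf(stream, "hello %d\n", 42);
    fclose(stream);  // flushes the buffer
}

// Hypothetical helper: run a producer against a pipe and return what it emitted.
std::string capture(void (*producer)(int)) {
    assert(producer != nullptr);
    int fds[2];
    if (pipe(fds) != 0) return "";
    producer(fds[1]);
    close(fds[1]);  // signal end-of-file to the reader
    std::string out;
    char buf[64];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0) {
        out.append(buf, static_cast<std::size_t>(n));
    }
    close(fds[0]);
    return out;
}
```

Both paths should put identical bytes on the pipe. The difference is only in who does the formatting and buffering.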

Then, apparently, we require a runtime environment (RTE). This runtime environment is (perhaps) the standard library, which is a .so file (a shared object). Shared objects are (perhaps) loaded by the OS to a specific memory location so that all C programs can access that .so file (which is now loaded in memory for all C executables to use). So that .so file is the "RTE" and the syscalls are the "ABI"?

If so, I find "ABI" and "RTE" confusing. Java has an RTE. You need the Java Virtual Machine (an abstract CPU essentially) to run your .class files within that virtual machine (JVM).

Again, I am unsure about this, so please go easy on me. Thank you!

PS: Why do people like Jason Turner say: "break the ABI to save C++"? So break the .so files for C++'s standard library? What for? (Shouldn't it be: "break C++'s RTE"?)

👤dankoncs🕑4y🔼6🗨️10

(Replying to PARENT post)

ABI is not quite the same as API. While an ABI is an API, not all APIs are ABIs.

Binary interface vs. programming interface.

An ABI is a precise, low-level, bit-level definition of how compiled code invokes library functions: which registers hold which values, how arguments are laid out, and so on. For example: call the function at 0x32918333 with 0xabcdef12. C++ has gotten terribly crufty in that regard. In practice there's no guarantee that linking together objects built by different compilers, or with different flags, will work, because of ABI incompatibilities and flaws. If you recompile it all from scratch you're fine, though, since it's all the same API.

Stuff like "break the ABI to save C++" means recognizing the above truth, sweeping the current stuff away, and reaching a consensus on an actual stable ABI we can live with.
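A small sketch of "same API, different ABI", using a made-up struct that a hypothetical library changes between versions. Source code that only touches `p.y` compiles against either version, but compiled code has the byte offset of `y` baked in. The offsets mentioned in the comments assume a mainstream platform where `std::int32_t` is 4 bytes with no padding between the fields:

```cpp
#include <cassert>
#include <cstddef>   // offsetof
#include <cstdint>
#include <cstring>   // memcpy

// Version 1 of a hypothetical library's public struct.
struct PointV1 {
    std::int32_t x;
    std::int32_t y;
};

// Version 2 inserts a field in the middle. Code that says `p.y` still compiles.
struct PointV2 {
    std::int32_t x;
    std::int32_t z;
    std::int32_t y;
};

// The offset of `y` that ends up hard-coded in machine code built against each version.
constexpr std::size_t y_offset_v1() { return offsetof(PointV1, y); }  // typically 4
constexpr std::size_t y_offset_v2() { return offsetof(PointV2, y); }  // typically 8

// What a caller compiled against V1 effectively does when handed V2 data:
// it reads 4 bytes at the old offset, which now holds `z` rather than `y`.
std::int32_t read_y_with_v1_layout(const PointV2& p) {
    std::int32_t value;
    assert(y_offset_v1() + sizeof value <= sizeof p);
    std::memcpy(&value,
                reinterpret_cast<const unsigned char*>(&p) + y_offset_v1(),
                sizeof value);
    return value;
}
```

Recompiling the caller against the V2 header fixes it, which is the "recompile from scratch and you're fine" case.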

👤retrac🕑4y🔼0🗨️0

(Replying to PARENT post)

The basic difference is that the API is a specification at a level higher than the hardware, e.g. the source language. An application binary interface is a specification at the machine code level. Often compiler, linker, and operating system conventions are all tied together, and the libraries and calling programs follow the same rules.

Note that there can be more than one ABI. The C calling convention was the commonly used one: the caller pushes data onto the stack, calls a function, and upon return the caller pops the data it pushed. The advantage is that with variable-length argument lists there's less chance of a mismatch. A 'pascal' convention was popular on Windows and OS/2, where the called function would use a single instruction to return and pop N bytes. It also made the code slightly smaller, as the callers didn't need the pop instructions. That ABI was specific to the Windows and OS/2 GUI interfaces, but compilers would also let you choose it for your own libs. I don't know/remember what the kernels used.
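The variable-length argument point can be sketched with a C-style variadic function (the name `sum_ints` is made up here). The callee only learns how many arguments exist from what the caller tells it, which is why having the caller clean up its own arguments is the safer fit for such functions. This shows the source-level side only; how arguments are actually passed (stack vs. registers) depends on the platform ABI:

```cpp
#include <cassert>
#include <cstdarg>

// The callee cannot know how many arguments were passed except through
// `count`. Only the call site knows for sure what it put there.
int sum_ints(int count, ...) {
    assert(count >= 0);
    va_list args;
    va_start(args, count);
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);  // trusts the caller about type and count
    }
    va_end(args);
    return total;
}
```

If `count` disagreed with the real number of arguments, a callee that popped "its" arguments would unbalance the stack, whereas a caller that pops what it pushed stays consistent.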

👤karmakaze🕑4y🔼0🗨️0

(Replying to PARENT post)

The term "API" is as fuzzy as the term "constant". Whether a variable or "storage bucket" is constant (vs. dynamic) depends heavily on the context: in many contexts there are, for instance, compile-time constants vs. runtime constants.

Similarly, you can refer to the operating system kernel interface as an "API". Some folks even call network protocols an API. In my humble opinion, in the end, it is just words.

👤ktpsns🕑4y🔼0🗨️0

(Replying to PARENT post)

I also have a hard time understanding abstract things without clear examples. I think that's the difficulty you're having?

This SO question has many answers that are a bit more concrete with the examples: https://stackoverflow.com/questions/3784389/difference-betwe...

👤gtirloni🕑4y🔼0🗨️0

(Replying to PARENT post)

Syscalls are a type of ABI. But mostly people are talking about higher-level shared libraries.

ABI is a general term. The keyword is "binary". One chunk of compiled binary code wants to call a function in another chunk of compiled binary code. How does it do it? The OS has loaded both into the same memory space (perhaps a compiled C++ program and a .so file it depends on). How does the C++ code call functions inside the .so file?

It all comes down to placeholders. When you compile a C++ program that relies on the C++ standard library (say, std::cout), gcc will create an ELF file. The key thing: the ELF file will have a header that says "I need the C++ standard library. Also, I make calls to these C++ standard library functions. There are placeholders there for now. Please fill in these gaps when loading me."

Your OS (strictly speaking, its dynamic loader), when it runs your C++ program, will see that header and load the C++ standard library .so into the same memory space. Then it will find the parts of your code where you call the standard library, and replace each placeholder with the address of the standard library function in the .so file.
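A toy model of the placeholder idea, in plain C++ with no real loader involved. All names here are hypothetical. A function-pointer "slot" starts out pointing at a resolver stub; the first call patches the slot with the real address, and later calls go straight through. This is only loosely modelled on how lazy binding works; a real loader patches tables of addresses inside the loaded ELF image:

```cpp
#include <cassert>

using SquareFn = int (*)(int);

// Stands in for a function that lives in a shared library.
int real_square(int x) { return x * x; }

int resolve_count = 0;     // how many times "the loader" had to do work
int resolver_stub(int x);  // forward declaration

// The placeholder: calls go through this slot rather than straight to the target.
SquareFn square_slot = resolver_stub;

// The first call lands here: find the real address, patch the slot, forward the call.
int resolver_stub(int x) {
    ++resolve_count;
    square_slot = real_square;
    return square_slot(x);
}

// What the "application" code does: always call through the slot.
int call_square(int x) {
    assert(square_slot != nullptr);
    return square_slot(x);
}
```

After the first `call_square`, the slot holds the address of `real_square` and the stub is never entered again.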

That compiler/OS magic that happens behind the scenes is sometimes considered the mechanism of ABI. C++ has even more magic (since we have virtual functions and stuff). When you run a C++ program, before your main() even runs, a chunk of predefined C++ code runs, which loads vtables and stuff (IDK, I'm not a big C++ programmer).
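On the virtual function part, one thing that can be shown safely is that they change the binary layout of an object. The sketch below uses made-up types. It assumes the usual implementation strategy, where each object with virtual functions carries a hidden pointer to a table of function addresses; the C++ standard does not require this, but mainstream compilers do it:

```cpp
#include <cassert>

// No virtual functions: the object is just its data.
struct Plain {
    int value;
    int get() const { return value; }
};

// Same data, but virtual: mainstream compilers add a hidden vtable pointer.
struct WithVirtual {
    int value;
    virtual int get() const { return value; }
    virtual ~WithVirtual() = default;
};

struct Shape {
    virtual int sides() const = 0;
    virtual ~Shape() = default;
};
struct Triangle : Shape { int sides() const override { return 3; } };
struct Square : Shape { int sides() const override { return 4; } };

// The call below is resolved at run time through the object's table.
int count_sides(const Shape& shape) { return shape.sides(); }

// Small usage example for the definitions above.
void self_check() {
    assert(count_sides(Triangle{}) == 3);
    assert(count_sides(Square{}) == 4);
    assert(sizeof(WithVirtual) > sizeof(Plain));  // room for the hidden pointer
}
```

Where that hidden pointer sits, and the order of entries in the table, is exactly the kind of detail a C++ ABI has to pin down.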

"Break the ABI to save C++":

This video points out that this magic isn't really magic enough. C++ has a simplistic module loading mechanism, and if you compiled your C++ program to use Module v1, using Module v2 is likely impossible without recompiling your C++ code. This means that your OS needs to stock all versions of the library (ever run into Linux "error while loading shared libraries: libavformat.so.53: cannot open shared object file: No such file or directory" errors? That's what's going on. v53 of that library is missing. Even if you have v54, you can't use it with this program.)
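The version rule in that error message can be sketched as a string check. The helpers below (`parse_soname`, `can_satisfy`) are hypothetical and only illustrate the idea; a real loader looks for a file with exactly the requested name rather than parsing numbers, and names with more than one version component (like `libfoo.so.1.2`) are treated as unparsed here:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct Soname {
    std::string library;  // e.g. "libavformat"
    int abi_version;      // e.g. 53, or -1 if no single number could be parsed
};

// Split "libavformat.so.53" into its library name and ABI version number.
Soname parse_soname(const std::string& name) {
    assert(!name.empty());
    const std::string marker = ".so.";
    const std::size_t pos = name.find(marker);
    if (pos == std::string::npos) return {name, -1};
    const std::string digits = name.substr(pos + marker.size());
    const bool numeric = !digits.empty() &&
        digits.find_first_not_of("0123456789") == std::string::npos;
    return {name.substr(0, pos), numeric ? std::stoi(digits) : -1};
}

// A program that needs version N is only satisfied by exactly version N.
bool can_satisfy(const std::string& needed, const std::string& installed) {
    const Soname want = parse_soname(needed);
    const Soname have = parse_soname(installed);
    return want.abi_version >= 0 &&
           want.library == have.library &&
           want.abi_version == have.abi_version;
}
```

With this rule, having `libavformat.so.54` installed does not help a program that asks for `libavformat.so.53`, which matches the error quoted above.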

👤phendrenad2🕑4y🔼0🗨️0

(Replying to PARENT post)

An ABI is the low-level implementation of any programming interface, no matter whether it uses interrupts (0x80), sysenter, a long jump to code that the OS put there, etc.

It is the nuts and bolts that do the low level real work of implementing a call / interface.
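As a rough illustration, assuming Linux with glibc, the generic `syscall()` function lets you name the kernel entry point by number instead of going through the usual wrapper. Which instruction actually gets used (int 0x80, sysenter, syscall, ...) is hidden inside libc and depends on the architecture. The two function names below are made up:

```cpp
#include <cassert>
#include <sys/syscall.h>  // SYS_getpid (Linux)
#include <unistd.h>       // syscall(), getpid()

// Ask the kernel for our process ID by syscall number.
long pid_via_raw_syscall() {
    const long pid = syscall(SYS_getpid);
    assert(pid > 0);  // getpid cannot fail for a running process
    return pid;
}

// Ask for the same thing through the ordinary libc wrapper.
long pid_via_libc() {
    return static_cast<long>(getpid());
}
```

Both routes end up at the same kernel service. The wrapper just hides the numbering and register conventions that make up the syscall ABI.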

👤tifkap🕑4y🔼0🗨️0

(Replying to PARENT post)

ABI is what you use when a Papa process and a Mama kernel love each other very much.
👤tifkap🕑4y🔼0🗨️0