## *<p align='center'><a href="/contents/instruction-sets/instruction-sets.md"><-</a>  .languages  <a href="/contents/file-formats/file-formats.md">-></a></p>*

__Historical Context__
* Back in the 50s, people were still programming in assembly. It was also at this time that programming languages were being developed left and right in the hope of reducing costs (programming in assembly was tedious, time-consuming, and a maintenance nightmare). But the transition to a higher-level language was not as easy as the transition to assembly had been a decade earlier. Moving to assembly was an easy sell: it made the software development lifecycle faster and, most importantly, came with zero performance penalty, since an assembler translates one assembly instruction to exactly one machine code instruction. Programming in a higher-level language was a completely different story. A single higher-level statement can be translated to multiple machine code instructions, and different compiler implementations for the same language can produce different machine code for the same source. Early skepticism of programming languages was inevitable since compilers were not able to generate code as efficient as hand-written assembly. One of the first compiler writers, Grace Hopper, was no stranger to such skepticism, as she said: "I had a running compiler and nobody would touch it...they carefully told me, computers could only do arithmetic; they could not do programs..." Fortunately, over time people slowly came to accept programming languages since their upsides (reduced programming and maintenance costs) outweigh their downside (performance). And through decades of compiler research since the 50s, [with some mishaps along the way](http://research.cs.wisc.edu/wpis/papers/wysinwyx05.pdf), modern compilers' outputs have become relatively efficient.

__Catalyst For Modern Computing__
* Programming in a higher-level language frees programmers from having to worry about how to interact with the underlying hardware. For example, in assembly there is no notion of variables, only registers and memory locations, and to make programming even harder you only have a limited number of general-purpose registers that you can modify. Programming languages abstract away those hardware interactions, allowing programmers to focus on the computational task they want to solve. This, in my opinion, is the catalyst that led to the proliferation of various computing sub-fields (e.g. web development, machine learning). A lot of what's possible today would not be available if we were still programming in assembly.

__Different Languages' Implementations From A Reversing Perspective (Interpreted)__
* Interpreted or compiled is a property of a language's implementation, not of the language. But since most languages have implementations that lean toward one or the other, the label ends up attached to the language itself. To muddy the association even further, languages like [Python](https://github.com/yellowbyte/reverse-engineering-reference-manual/blob/master/contents/languages/Python_Reversing.md) and Ruby that are typically thought of as interpreted are not purely interpreted. Rather, they are compiled down to an intermediate form called bytecode, which is then interpreted at runtime by an interpreter. Usually when we think of compiled languages we think of languages like C or [C++](https://github.com/yellowbyte/reverse-engineering-reference-manual/blob/master/contents/languages/C++_Reversing.md) that use ahead-of-time compilation to translate high-level source files to native machine code instructions. But in fact, a language like [Python](https://github.com/yellowbyte/reverse-engineering-reference-manual/blob/master/contents/languages/Python_Reversing.md) is also compiled, just not all the way down to native machine code. So what does this mean to a reverse engineer? __(1)__ When you are reversing executable files, not all of them will contain native machine code for the [x86](https://github.com/yellowbyte/reverse-engineering-reference-manual/blob/master/contents/instruction-sets/x86.md), [x64](https://github.com/yellowbyte/reverse-engineering-reference-manual/blob/master/contents/instruction-sets/x86-64.md), or [ARM](https://github.com/yellowbyte/reverse-engineering-reference-manual/blob/master/contents/instruction-sets/ARM.md) [instruction set](https://github.com/yellowbyte/reverse-engineering-reference-manual/blob/master/contents/instruction-sets/instruction-sets.md). 
__(2)__ A bad reversing habit is to drop the executable file you want to analyze into a disassembler like [IDA](https://github.com/yellowbyte/reverse-engineering-reference-manual/blob/master/contents/tools/IDA_Tips.md) without a second thought. [IDA](https://github.com/yellowbyte/reverse-engineering-reference-manual/blob/master/contents/tools/IDA_Tips.md) may still be the best universal disassembler out there, but it is not that great at disassembling bytecode. Instead, use a disassembler that targets the specific bytecode format. For example, for compiled [Python](https://github.com/yellowbyte/reverse-engineering-reference-manual/blob/master/contents/languages/Python_Reversing.md) (.pyc) use the built-in dis module. __(3)__ Since these languages are only compiled down to an intermediate form, a lot less source-level information is lost than when compiling down to machine code. This is why a decompiler targeting bytecode such as [Python](https://github.com/yellowbyte/reverse-engineering-reference-manual/blob/master/contents/languages/Python_Reversing.md) or .NET works better than a decompiler targeting an ahead-of-time compiled language like C.
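
To make points (2) and (3) concrete, here is a minimal sketch of disassembling a .pyc file with the standard library. It assumes CPython 3.7 or newer, where the .pyc header is 16 bytes, and it assumes the .pyc was produced by the same Python version that is reading it, since `marshal` data is not portable across versions. The file name `sample.py` and the helper `load_code_from_pyc` are made up for illustration.

```python
import dis
import marshal
import os
import py_compile
import tempfile
import types

PYC_HEADER_SIZE = 16  # assumption: CPython 3.7+ header layout


def load_code_from_pyc(pyc_path):
    """Return the top-level code object stored in a .pyc file (hypothetical helper)."""
    with open(pyc_path, "rb") as f:
        f.read(PYC_HEADER_SIZE)  # skip the header that precedes the marshalled code
        return marshal.load(f)   # only works for a .pyc from this same Python version


# Build a throwaway .pyc so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "sample.py")
    pyc = os.path.join(tmp, "sample.pyc")
    with open(src, "w") as f:
        f.write("def add(a, b):\n    return a + b\n")
    py_compile.compile(src, cfile=pyc, doraise=True)
    module_code = load_code_from_pyc(pyc)

# Point (2): dis understands the bytecode, including nested function bodies.
dis.dis(module_code)

# Point (3): source-level names survive compilation to bytecode.
add_code = next(
    c for c in module_code.co_consts
    if isinstance(c, types.CodeType) and c.co_name == "add"
)
print(add_code.co_name, add_code.co_varnames)  # add ('a', 'b')
```

The exact opcodes printed by `dis.dis` vary between Python versions, but the function name and its parameter names are recoverable in all of them, which is the kind of information a native-code decompiler usually has to guess.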

__Different Languages' Implementations From A Reversing Perspective (Compiled)__
* Although ahead-of-time compiled languages targeting the same CPU all compile down to the same instruction set, their outputs differ greatly because they remain idiosyncratic to the original source language. For example, because [C++](https://github.com/yellowbyte/reverse-engineering-reference-manual/blob/master/contents/languages/C++_Reversing.md) supports function overloading, its [function names will be mangled](https://github.com/yellowbyte/reverse-engineering-reference-manual/blob/master/contents/languages/C++_Reversing.md#-name-mangling-). Since C doesn't support function overloading and leaves its function names unmangled, it is usually easy to tell whether a binary was compiled from [C++](https://github.com/yellowbyte/reverse-engineering-reference-manual/blob/master/contents/languages/C++_Reversing.md) just by looking at [IDA's function window](https://github.com/yellowbyte/reverse-engineering-reference-manual/blob/master/contents/tools/IDA_Tips.md#-functions-window-). Whenever you can, try to identify the original source language. Knowing the source language makes it easier to relate a block of assembly code back to a higher-level construct, allowing you to piece together what the assembly code is doing more quickly.
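
The name-mangling check described above can be roughly automated once you have a list of symbol names exported from your disassembler. The sketch below is only a heuristic: it relies on Itanium C++ ABI mangled names (GCC, Clang) starting with `_Z` (`__Z` on platforms that prepend an underscore, such as macOS) and MSVC mangled names starting with `?`. The function names `guess_mangling_scheme` and `looks_like_cpp` are made up for illustration, and other languages that reuse these schemes can cause false positives.

```python
def guess_mangling_scheme(symbol):
    """Guess which C++ mangling scheme a symbol uses, or None if it looks unmangled."""
    if symbol.startswith(("_Z", "__Z")):
        return "itanium"  # GCC/Clang style, e.g. _Z3addii for add(int, int)
    if symbol.startswith("?"):
        return "msvc"     # MSVC style, e.g. ?add@@YAHHH@Z
    return None


def looks_like_cpp(symbols):
    """Heuristic: treat the binary as C++ if any symbol looks mangled."""
    return any(guess_mangling_scheme(s) is not None for s in symbols)


# Hypothetical symbol lists, as they might appear in a functions window.
c_symbols = ["main", "add", "printf"]
cpp_symbols = ["main", "_Z3addii", "_Z3adddd"]

print(looks_like_cpp(c_symbols))    # False
print(looks_like_cpp(cpp_symbols))  # True
```

Note how the two overloads of `add` get distinct mangled names because the parameter types are encoded in the symbol; this is exactly what a C compiler has no need to do. The heuristic is of no help on stripped binaries, where the symbol names are gone.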

---
### *<p align='center'> section overview </p>*
---
* [C++ Reversing](C++_Reversing.md)
  * [Thiscall](C++_Reversing.md#-thiscall-)
  * [How An Object Is Represented](C++_Reversing.md#-how-an-object-is-represented-)
  * [Name Mangling](C++_Reversing.md#-name-mangling-)
* [Python Reversing](Python_Reversing.md)
  * [PVM (Python Virtual Machine)](Python_Reversing.md#-pvm-python-virtual-machine-)
  * [The 3 Tuples Associated With Function Object](Python_Reversing.md#-the-3-tuples-associated-with-function-object-)
  * [Python Bytecode Instructions](Python_Reversing.md#-python-bytecode-instructions-)

---
### *<p align='center'> further readings </p>*
---
* [Compiler Explorer](https://godbolt.org/): a web application that shows the assembly listing generated by your choice of compiler for languages such as C, C++, Go, and many more. What makes it even more awesome is that you can specify the exact compiler version and compiler options.
* [Early High-Level Programming Languages](https://gregorias.github.io/2014/11/22/early-high-level-programming-languages.html)

#
<strong><p align='center'><a href="/contents/instruction-sets/instruction-sets.md">.instruction-sets</a> <- <a href="/README.md#-reverse-engineering-reference-manual-beta-">RERM</a> -> <a href="/contents/file-formats/file-formats.md">.file-formats</a></p></strong>
