Code injection is the exploitation of a computer bug that is caused by the processing of invalid data. An attacker uses it to introduce code into a vulnerable computer program and cause that code to be executed. The result of a successful code injection can be devastating, for example by enabling the spread of computer viruses or worms.
Certain types of code injection are errors of interpretation: user input is given special meaning because the program does not distinguish between user input and system commands. Code injection vulnerabilities occur when an application sends untrusted data to an interpreter. Most commonly, they occur in SQL, LDAP, XPath, and NoSQL queries, operating system commands, XML parsers, SMTP headers, and generally in the parameters of program calls. Injection vulnerabilities are usually easier to detect by examining the source code than by testing. Scanners can help find injection vulnerabilities. Injection can lead to data loss or corruption, denial of access, and sometimes even a complete host takeover.
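The failure to separate user input from commands can be illustrated with a minimal sketch. It assumes Python's built-in `sqlite3` module and an in-memory database; the table `users` and the function `find_user_unsafe` are hypothetical names chosen for this illustration, not part of any real application.

```python
import sqlite3

def find_user_unsafe(conn, name):
    # VULNERABLE (hypothetical helper): the input is pasted into the SQL text,
    # so the database cannot tell data apart from query syntax.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])

# Ordinary input behaves as intended and matches one row.
normal = find_user_unsafe(conn, "alice")

# Crafted input closes the string literal and appends a condition that is
# always true, so the WHERE clause matches every row in the table.
injected = find_user_unsafe(conn, "nobody' OR '1'='1")

print(normal)
print(injected)
```

Here the second call returns all stored names although no user called "nobody" exists, because the quote character in the input was interpreted as SQL syntax rather than as data.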
Code injection techniques are popular in system hacking or cracking to gain information, escalate privileges, or obtain unauthorized access to a system. Code injection can be used maliciously for many purposes, including:
- Arbitrary modification of values in a database by SQL injection. The effects can range from defacement of websites to serious compromise of sensitive data.
- Installing malware or running malicious code on a server by injecting server scripting code (such as PHP or ASP).
- Privilege escalation to root by exploiting shell injection vulnerabilities in a setuid root binary on UNIX, or to Local System by exploiting a service on Microsoft Windows.
- Attacks on web users with HTML script injection (cross-site scripting).
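Shell injection, one of the attack types listed above, can be sketched as follows. The example assumes a POSIX system with an `echo` program and uses Python's standard `subprocess` module; the helper names `echo_unsafe` and `echo_safer` and the harmless payload are hypothetical and chosen only for illustration.

```python
import subprocess

def echo_unsafe(text):
    # VULNERABLE: the whole string is handed to the shell, which treats
    # characters such as ';' as command separators.
    result = subprocess.run("echo " + text, shell=True,
                            capture_output=True, text=True)
    return result.stdout

def echo_safer(text):
    # An argument list bypasses the shell: the text reaches the program as a
    # single argument and is never parsed as shell syntax.
    result = subprocess.run(["echo", text], capture_output=True, text=True)
    return result.stdout

payload = "hello; echo INJECTED"  # stands in for a malicious command

unsafe_out = echo_unsafe(payload)  # the shell runs two commands
safer_out = echo_safer(payload)    # echo prints the payload literally

print(unsafe_out)
print(safer_out)
```

In the unsafe variant the part after the semicolon runs as a second command. If the calling program were a setuid root binary or a privileged service, that second command would run with the elevated privileges.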
Good and Unwanted Use
Theoretically, code injection can be used with good intentions, for example to display a useful new column or to further filter, sort, or group content. Code injection is also useful in software testing, especially in penetration tests (white hat). Even software developers sometimes use methods that deserve this name, such as temporarily overriding a library function with their own function of the same name.
Of course, a user could also inject code unintentionally. For example, a user may regard as valid input something that the developer has given a special meaning, perhaps just an apostrophe in a company name. Content of this kind could also be in a file that the user uploads.
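The apostrophe case can be reproduced with a short sketch. It assumes Python's built-in `sqlite3` module; the table `companies` and the company name are made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE companies (name TEXT)")

company = "O'Brien & Sons"  # perfectly legitimate input, no attack intended

# Naive string building: the apostrophe ends the SQL string literal early,
# and the rest of the name is parsed as (invalid) SQL.
try:
    conn.execute("INSERT INTO companies VALUES ('" + company + "')")
    naive_failed = False
except sqlite3.OperationalError:
    naive_failed = True

# Placeholder: the value is passed separately from the SQL text,
# so the apostrophe is stored as ordinary data.
conn.execute("INSERT INTO companies VALUES (?)", (company,))
stored = conn.execute("SELECT name FROM companies").fetchone()[0]

print(naive_failed)
print(stored)
```

The user did nothing wrong here. The statement fails only because the program gave the apostrophe a special meaning that the user could not have known about.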
To prevent code injection problems, secure handling of inputs and outputs is necessary:
- The application programming interface (API) used should be secure against all inputs, for example precompiled (prepared) SQL statements with placeholders for the user data, or the Criteria API.
- Enforce language separation through a static type system.
- Input validation by whitelisting (preferably server-side), i.e. accepting only values from a list of permitted values.
- Input encoding, i.e. escaping characters that have a special meaning.
- Appropriate output encoding to prevent HTML injection attacks.
- HttpOnly is a flag for HTTP cookies that prevents client-side script interaction with cookies and thus certain XSS attacks.
- Modular shell decoupling from the kernel.
- Runtime image hash validation – Capture a hash of part or all of the image of the executable loaded into memory and compare it to the stored and expected hash.
- Non-executable memory – All user data is stored in a special memory area that is marked as non-executable. The processor then knows that no code can be located there and refuses to execute anything in that area.
- Canaries – A random value is placed on the stack. When the function returns, the value is checked, and the program is stopped if the value has changed. This protects against stack buffer overflow attacks.
- Code Pointer Masking (CPM) – A pointer to code that is about to be executed is first checked for plausibility. For C programs, this can be done at the processor level by applying a bitmask to the pointer.
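Three of the application-level measures listed above (placeholders, whitelisting, and output encoding) can be combined in one minimal sketch. It assumes Python's built-in `sqlite3` and `html` modules; the table `customers`, the set `ALLOWED_SORT_COLUMNS`, and the functions `list_customers` and `render_item` are hypothetical names used only for this illustration.

```python
import html
import sqlite3

ALLOWED_SORT_COLUMNS = {"name", "city"}  # hypothetical list of accepted values

def list_customers(conn, city, sort_by):
    # Whitelist validation: column names cannot be bound as parameters,
    # so only known identifiers are accepted.
    if sort_by not in ALLOWED_SORT_COLUMNS:
        raise ValueError("unsupported sort column")
    # Placeholder: the city value travels separately from the SQL text.
    query = f"SELECT name FROM customers WHERE city = ? ORDER BY {sort_by}"
    return [row[0] for row in conn.execute(query, (city,))]

def render_item(name):
    # Output encoding: HTML metacharacters are escaped before display.
    return "<li>" + html.escape(name) + "</li>"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [("<script>alert(1)</script>", "Oslo"), ("Ann", "Oslo")])

names = list_customers(conn, "Oslo", "name")
items = [render_item(n) for n in names]
print(items)
```

The stored script tag is rendered as inert text (`&lt;script&gt;...`), and a `sort_by` value such as `name; DROP TABLE customers` is rejected before any SQL is built. The memory-level measures in the list (non-executable memory, canaries, CPM) are provided by the compiler, operating system, or hardware and are not shown here.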