I would love a detailed explanation of how user-controlled input goes from readObject to RCE. Java-specific.
This is my attempt to add specificity to the OP question as requested in the answer here.
I have been slowly but surely breaking into web app security (from a network/infra pentesting and binary exploitation background), and am currently trying to wrap my brain around deserialization attacks, particularly in Java. I have taken a few intro Java classes and am familiar with the basic concepts of OOP, but have never done serious development work. Most of my coding experience is sysadmin-related or exploit writing (bash & python scripting), as well as reading code, particularly in vuln writeups and, more recently, SAST/DAST WAPTs and code reviews (which I am new to).
At this point, I am well aware that an application deserializing untrusted user input is very dangerous, especially in Java. However, most resources I’ve encountered thus far gloss over how the untrusted input actually results in code execution. This is what I am very interested in at a detailed level. I feel many others are in a similar position to me and would benefit from this answer.
Research I’ve done to try to understand it myself
I’ve watched Robert Seacord’s video and read portions of his whitepaper. These resources appeared really good, but I think they assume more OOP prerequisite knowledge than I have. Ironically, someone asks a similar question to mine in Seacord’s video (I got excited at that point), but he seems to avoid discussing it in depth, as he feels it would require responsible disclosure (my excitement…died).
I’ve also done some hands-on labs such as nickstaDB’s DeserLab with the associated blog post. I was able to get code execution, but don’t quite understand how I got there. The blog helped me understand a lot about the structure of the byte stream, but not how code actually gets run when readObject gets called on the stream. It references Property-Oriented Programming, and compares it to ROP which I am very familiar with. But there is still a gap in my understanding.
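To make the gap concrete, here is a minimal, harmless sketch of the part I think I do understand. It assumes that the stream, not the calling code, chooses which class gets instantiated, and that a class's private readObject method runs as a side effect of deserialization. The names `ReadObjectDemo`, `Gadget`, and `SIDE_EFFECTS` are hypothetical and made up for illustration; the "gadget" only records a string instead of executing anything, and it is not taken from any real library or gadget chain.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class ReadObjectDemo {
    // Stand-in for a dangerous action: we only record what would have happened.
    public static final List<String> SIDE_EFFECTS = new ArrayList<>();

    // Hypothetical class that happens to be on the application's classpath.
    public static class Gadget implements Serializable {
        private static final long serialVersionUID = 1L;
        private String payload; // value comes straight from the byte stream

        public Gadget(String payload) {
            this.payload = payload;
        }

        // ObjectInputStream invokes this method reflectively during deserialization.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject(); // restores 'payload' from the stream
            SIDE_EFFECTS.add("would act on: " + payload);
        }
    }

    public static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(o);
        }
        return buf.toByteArray();
    }

    public static Object deserialize(byte[] bytes)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // The "attacker" sends a Gadget where the application expects a String.
        byte[] attackerBytes = serialize(new Gadget("some attacker-chosen value"));
        try {
            String expected = (String) deserialize(attackerBytes);
            System.out.println(expected);
        } catch (ClassCastException e) {
            // The cast fails only after Gadget.readObject has already run.
            System.out.println("cast failed, side effects so far: " + SIDE_EFFECTS);
        }
    }
}
```

What I cannot yet connect is the step from this toy case, where the class itself does something obviously dangerous in readObject, to real gadget chains, where the methods reached from readObject belong to ordinary library classes that were never meant to be dangerous.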
I’m also interested in why Robert Seacord felt that going in depth on a gadget chain would mean he would have to responsibly disclose that chain in some way. I have not heard of that being necessary for other platforms, such as .NET deserialization gadgets. I understand ethics and responsible disclosure well; what I am wondering is which characteristics of this technique could require disclosure when ROP techniques do not, given that POP gadgets are compared to ROP. Usually, an overarching technique (e.g. ROP) doesn’t need to be responsibly disclosed, but an actual vulnerability does (e.g. an overflow that led to exploitation using ROP).