I'm disassembling a packed 16-bit DOS MZ EXE.
To deobfuscate it, I set a breakpoint in DOSBox at the end of the unpacking routine, let it run, and made a memory dump. This way I essentially got the deobfuscated EXE image.
Problems started when I loaded the image into IDA. You see, I don't understand IDA's concept of segments. They are similar to x86 segments, but there are numerous differences that I can't grasp. When IDA asked me to create at least one segment, I just made a single huge segment 1 MB long, because code and data are mixed in the program's address space and it doesn't make sense to introduce separate segments such as CODE, DATA, etc.
After showing IDA the entry point, everything worked fine: IDA successfully determined functions, local variables, arguments etc. The only problem is that some calls are marked as NONAME, even though they point at correct subroutines. The strangest thing is that those subroutines have correct XREFs to the 'illegal' calls. Here's an example:
seg000:188FF 004 call 1AD9h:1 ; Call Procedure
This line is red and has an associated NONAME problem in the Problems List. Why?
The 1AD9h:1 seg:offset address corresponds to linear address 0x1ad91, which has this:
seg000:1AD91 ; =============== S U B R O U T I N E =======================================
seg000:1AD91
seg000:1AD91 ; Attributes: bp-based frame
seg000:1AD91
seg000:1AD91 sub_1AD91 proc far ; CODE XREF: sub_188F2+DP
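As a sanity check on that seg:offset arithmetic, here is a minimal Python sketch of the real-mode address translation I'm assuming (segment shifted left by 4 bits, plus the offset). The function name is my own, and the 20-bit wrap is an assumption about plain 8086-style addressing:

```python
def seg_off_to_linear(segment: int, offset: int) -> int:
    """Real-mode translation: linear = segment * 16 + offset.

    The result is masked to 20 bits, assuming 8086-style wraparound.
    """
    return ((segment << 4) + offset) & 0xFFFFF


# The far call target from the disassembly above: 1AD9h:1
print(hex(seg_off_to_linear(0x1AD9, 0x0001)))  # 0x1ad91
```

So the target of the far call and the start of sub_1AD91 are the same linear address.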
Note the XREF. So IDA actually processes the call correctly! Why is the call considered invalid? The IDA help file says this:
Problem: Can not find name
Description
Two reasons can cause this problem:
- Reference to an illegal address is made in the program being disassembled;
- IDA couldn't find a name for the address but it must exist.
What to do
If this problem is caused by a reference to an illegal address
- Try to enter the operand manually
- Or make the illegal address legal by creating a new segment.
Otherwise, the database is corrupt.
So I guess the problem is that I have one gargantuan segment instead of several small ones. But how do I properly divide the address space into appropriate segments?
I know the register values (including DS, CS, SS, IP, etc.) at the entry point. Let's assume I create a CODE segment starting at the address corresponding to the CS register value at the entry point. But what length should this segment have?
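The only hard limit I'm aware of is that a 16-bit offset reaches at most 64 KB past the segment base, which gives an upper bound but not the real length. A rough sketch of that bound, with a hypothetical helper name and with 1AD9h from the example above standing in for a segment register value (wraparound near the top of the 1 MB space is ignored):

```python
def segment_reach(segment: int) -> tuple:
    """Return (first, last) linear addresses reachable through one
    real-mode segment value, assuming 16-bit offsets 0000h..FFFFh."""
    base = segment << 4
    return base, base + 0xFFFF


first, last = segment_reach(0x1AD9)
print(hex(first), hex(last))  # 0x1ad90 0x2ad8f
```

Ranges computed this way overlap heavily for nearby segment values, which is part of why I can't see where one IDA segment should end and the next begin.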
What's the point of segments in IDA at all, if DATA segments can contain instructions and CODE segments can be read and written as data?
Please excuse such a newbie question, but the official IDA manual is notoriously sparse, and the Hex-Rays forums are closed to me because I use the freeware version.