Re: Converting hex to Chinese.
I understand what you were saying now. But even that wouldn't be a problem, for the most part, with the setup.exe program at least. For example, let's say they have a pop-up window that displays garbled text. They call the Windows MessageBoxA() API to display messages. I thought you were talking about the physical code, not how it's displayed on the screen. The APIs for text boxes change their height and width automatically, I do believe. But worst case, I just call another API that allows me to set the height / width. This is where a code-cave would come in handy.
My example was meant to show one of the problems with modifying the actual executable. By adding bytes, we change everything: every instruction after the insertion point moves, so the jump statements that cross it all need to be changed, etc. This is why I think a code-cave would have been the best approach.
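To make the jump problem concrete, here's a rough Python sketch of how a 32-bit near jump (opcode E9) encodes its target. The addresses are made up for illustration and the helper names are mine, not anything from the setup.exe. It only shows why adding bytes forces displacements to be recalculated, and why a jump into a cave is a fixed 5-byte patch.

```python
import struct

JMP_REL32_LEN = 5  # opcode E9 plus a 4-byte signed displacement


def jmp_rel32(src, dst):
    """Encode an x86 near JMP placed at address src that lands on dst."""
    # The displacement is counted from the END of the jump instruction,
    # not from its start.
    rel = dst - (src + JMP_REL32_LEN)
    return b"\xE9" + struct.pack("<i", rel)


def displacement_after_insert(src, dst, inserted):
    """Displacement a forward jump needs once `inserted` bytes are added
    somewhere between the jump and its target."""
    return (dst + inserted) - (src + JMP_REL32_LEN)


if __name__ == "__main__":
    # Hypothetical addresses: a jump at 0x401000 into a cave at 0x450000.
    print(jmp_rel32(0x401000, 0x450000).hex())
    # The same forward jump before and after 3 bytes are inserted in between.
    print(displacement_after_insert(0x401000, 0x401020, 0))
    print(displacement_after_insert(0x401000, 0x401020, 3))
```

With a cave, the original bytes stay where they are: you overwrite 5 bytes with a jump out, run the new code in unused space, and jump back.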
The setup.exe appears to be the only problem. The idea was that once I understood the code, I'd just recreate it, and instead of using Wise, I'd use Inno Setup, which handles all the dirty work.
I don't know how you could know from external observation what's what. I mean, you have to actually look at the code and see what it does. That's why I use a disassembler / debugger. It converts the machine code back into assembly. And although it's not as pretty as it would be if the program had been written directly in assembly, there are tools and plugins that help us along. I was using x64dbg. I used to use OllyDbg, but x64dbg has come a long way: it can debug both 64-bit and 32-bit code, and it has an extensive plugin library that anyone can contribute to.
Some programs (like Polderbits) use a lot of "tricks" to make modifying the program very hard. There are ways to detect a debugger, and Polderbits has code like that. There are plugins to hide the debugger, but the Polderbits developer was smart and used old-school tricks that you don't see much anymore, and also used those functions in his actual code. So Polderbits is much, much harder to modify than this setup.exe program.
That's the whole point of using the disassembler / debugger, and why this has taken me so long: I had to "step" through the code and understand it. They stripped the function symbols, so I don't see calls to something like setup_crc32table. Instead, I see calls to sub_435404. I step into sub_435404 (which is a function created by the programmers, or in this case, by the Wise Setup Studio program). Then I watch the assembly code and figure out what it does. I watch the registers, I watch what functions are being called, I write it down in gedit (Notepad for Linux), and I keep track. When I finally understand the function and what it does, I label it, so instead of saying sub_435404, it now says setup_crc32table, just as an example.
If there are any easter eggs, I'd see them before overwriting code. I'm not just gonna randomly pick a piece of code and overwrite it. I would make sure it's actually unneeded. There are multiple ways to do this with x64dbg.
When I was trying to analyze the first bytes the program read from itself (I suspected some sort of header), I manipulated the data. It would read 4 bytes, for example, and then examine the last byte to see what it was. If it wasn't a 0, it'd jump to a function. That function would then read some stuff from the program and call some functions that create a unique serial-type number. This number is then compared to a code stored in the program, and if it's not the same, it displays a message on the screen saying something along the lines of "The demo version of this program can only be used to create setup files that are run on the computer they were created on."
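Boiled down, the check I was poking at behaves roughly like this Python sketch. The 4-byte size and "last byte" position are just what I observed in that one read, and the function name is my own label, so treat it as an illustration and not the real header layout.

```python
def takes_demo_branch(header):
    """Return True if the 4-byte read would send execution into the
    serial-number check (the path the demo build takes)."""
    if len(header) < 4:
        raise ValueError("need at least 4 bytes")
    # Assumption: only the last of the 4 bytes is tested, and any
    # non-zero value triggers the jump.
    return header[3] != 0


if __name__ == "__main__":
    print(takes_demo_branch(bytes([0x12, 0x34, 0x56, 0x00])))
    print(takes_demo_branch(bytes([0x12, 0x34, 0x56, 0x01])))
```

Flipping that one byte in the debugger is how I confirmed which path led to the demo message.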
That code has no easter eggs. For the code that generates the serial number, I can search through the entire program and all the DLLs it has loaded, including the Windows ones, and look for references to that sub-function. I only see the one, so we know it's only used if the copy of Wise Setup Studio that created the exe was a demo version, not a registered version.
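x64dbg does the reference search for me, but the idea behind it can be sketched in a few lines of Python. This is a naive byte scan for direct CALL rel32 instructions (opcode E8), not a real disassembler, so it can report false hits and it misses indirect calls. The addresses and the function name are hypothetical.

```python
import struct

CALL_REL32_LEN = 5  # opcode E8 plus a 4-byte signed displacement


def find_call_refs(code, base, target):
    """Return addresses of E8 calls in `code` (loaded at `base`) whose
    computed destination equals `target`."""
    refs = []
    for i in range(len(code) - 4):
        if code[i] != 0xE8:
            continue
        rel = struct.unpack_from("<i", code, i + 1)[0]
        # Destination = address of the next instruction + displacement.
        if base + i + CALL_REL32_LEN + rel == target:
            refs.append(base + i)
    return refs


if __name__ == "__main__":
    base, target = 0x1000, 0x2000
    # Three NOPs, then one call to `target`, then a NOP.
    rel = target - (base + 3 + CALL_REL32_LEN)
    code = b"\x90\x90\x90" + b"\xE8" + struct.pack("<i", rel) + b"\x90"
    print([hex(a) for a in find_call_refs(code, base, target)])
```

If a search like this comes back with a single caller, that's a decent sign the function is only reached from that one place.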
We can actually just search for and find all references to that string nowadays. A lot has changed since the days of WinICE or whatever it was called. Things have become a lot easier. I use a plugin which attempts to tell me what compiler was used. In this case, it was the Microsoft Visual C++ compiler, which uses the cdecl calling convention by default for the program's own functions. I didn't need to identify the compiler to see how calls were being made, though. I could see it in the code: before a function was called, its arguments were being pushed onto the stack. (The Windows API functions themselves are stdcall, which pushes arguments the same way but has the called function clean up the stack.) According to Wikipedia (quoted in the first Code box at the end of this post):
I have some nice plugins that know about most of the Windows API functions and can show me, in the comments column, what parameters are being passed to the API. I can also right-click on the API call and have it search Google, MSDN, etc. for that API function, and then I can read how it works.
For example, when I see a call to the user32.MessageBoxA Windows API, one of my plugins will look at the push statements, show which parameters are being passed to the function, and tell me what they are, so I don't have to remember that the MessageBox API takes four parameters: hWnd, lpText, lpCaption, and uType.
Sometimes, though, it's nice to know exactly what lpText, lpCaption, or uType is. So I have another plugin which searches whatever I have it configured to search (MSDN right now) for those functions. It pulls up the MSDN reference page for the function in question, and I can read and study it and see how it works.
No, because I'm using a debugger, I'm replacing bytes that (with the language stuff) represent characters. The code page determines how those bytes are handled. If I could switch the program from the US setting to a Chinese code page (936 for Simplified or 950 for Traditional), then the two-byte character representations would have been displayed properly, as Chinese characters, instead of garbage. That was the original goal. Once I accomplished that, I was going to work on converting or writing a new setup.exe using Inno Setup, but I just don't have the energy anymore, so I abandoned the project. If the program's code page was set to Chinese, then it would have interpreted the text as a double-byte character set (DBCS), which it didn't. That's why I get garbage. The actual text in the program is a double-byte character set, but the Windows APIs don't know that, because the program is set to 1033 (US; strictly speaking 1033 is the US English locale ID, and its ANSI code page is 1252), so they treat each byte as a character. With Inno Setup, I'd have used Unicode for one, which I think is now the standard.
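Here's a small Python sketch of the effect. I'm assuming the text is GBK (code page 936, Simplified Chinese) purely for the example; I never confirmed which Chinese encoding the setup.exe actually uses, and the sample bytes are ones I picked, not bytes from the program.

```python
# The two characters "中文" encoded as GBK: two bytes per character.
RAW = bytes([0xD6, 0xD0, 0xCE, 0xC4])


def show(raw, codec):
    """Decode the same bytes under a given code page."""
    return raw.decode(codec)


if __name__ == "__main__":
    # Read as a double-byte character set: 2 Chinese characters.
    print(show(RAW, "gbk"))
    # Read as a Western single-byte code page: 4 unrelated characters,
    # which is the "garbage" seen in the message boxes.
    print(show(RAW, "cp1252"))
```

Same four bytes, nothing changed in the file; only the code page used to interpret them differs.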
It sucks, but it is what it is, and I try to keep a positive mind. I always know it could be much worse. Still alive, still got my limbs, and although some days I cannot remember things, other days I can remember things really well. I couldn't remember code-caves; if I had come across the term, I would have been "wtf is that!" I even asked earlier on in one of the posts in this thread if I could append characters in the program or if that'd throw it all off. Today and yesterday, to me, it was like, of course not! We need to write a code-cave if we're trying to convert the exe's Chinese to English, or better yet, just watch what the EXE does, translate the Chinese to English, and recreate the EXE using Inno Setup for personal use.
Code:
cdecl

The cdecl (which stands for C declaration) is a calling convention that originates from the C programming language and is used by many C compilers for the x86 architecture. In cdecl, subroutine arguments are passed on the stack. Integer values and memory addresses are returned in the EAX register, floating point values in the ST0 x87 register. Registers EAX, ECX, and EDX are caller-saved, and the rest are callee-saved. The x87 floating point registers ST0 to ST7 must be empty (popped or freed) when calling a new function, and ST1 to ST7 must be empty on exiting a function. ST0 must also be empty when not used for returning a value.

In the context of the C programming language, function arguments are pushed on the stack in the reverse order. In Linux, GCC sets the de facto standard for calling conventions. Since GCC version 4.5, the stack must be aligned to a 16-byte boundary when calling a function (previous versions only required a 4-byte alignment.)
Code:
int WINAPI MessageBox(
    _In_opt_ HWND    hWnd,
    _In_opt_ LPCTSTR lpText,
    _In_opt_ LPCTSTR lpCaption,
    _In_     UINT    uType
);
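What the parameter-labelling plugin does with the push statements can be sketched like this in Python. It's my own simplified illustration for 32-bit code, not the plugin's actual logic, and the pushed values are invented. Arguments are pushed right to left, so the last push before the call is the first parameter in the prototype.

```python
# Parameter names taken from the MessageBox prototype, in declaration order.
MESSAGEBOX_PARAMS = ["hWnd", "lpText", "lpCaption", "uType"]


def label_pushes(pushed, params):
    """Map values in the order they were pushed onto parameter names.

    `pushed` lists the operands of the push instructions as they appear
    in the disassembly, top to bottom, ending just before the call.
    """
    if len(pushed) != len(params):
        raise ValueError("push count does not match parameter count")
    # Reverse the push order to get back to declaration order.
    return dict(zip(params, reversed(pushed)))


if __name__ == "__main__":
    # Hypothetical sequence: push 0x10 / push 0x403000 / push 0x403010 / push 0
    labelled = label_pushes([0x10, 0x403000, 0x403010, 0], MESSAGEBOX_PARAMS)
    for name, value in labelled.items():
        print(name, hex(value))
```

So in that made-up sequence, the first thing pushed (0x10) is uType and the last thing pushed (0) is hWnd.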