idaclang incorrectly translates wchar_t to __int16


Parse 111.h with the following content:

struct TestXXX
{
	char f1;
	wchar_t f2;
	wchar_t* f3;
};

The result is shown in this picture:

This problem also exists in idaclang.exe, where it causes many function declarations that depend on wchar_t to be parsed incorrectly. The following figure is a dump of the TIL generated by idaclang.exe:

Thanks for the report. Please note that when you use the native IDA parser with Windows binaries, wchar_t is actually a typedef for unsigned short and not a native type. We’ll check why this is not happening with the clang parser.

I know I can work around this for now by passing -Xclang -fno-wchar to idaclang.exe and adding typedef unsigned short wchar_t; to the C++ header/source file. I still think this is a bug in idaclang.exe, though, because I printed the entire idaclang log and saw the following two lines in the output:

IDACLANG: predefined   (null):75 kind=macro definition(501) name=_WCHAR_T_DEFINED type=(0) sizeof=-1 body=1
IDACLANG: predefined   (null):76 kind=macro definition(501) name=_NATIVE_WCHAR_T_DEFINED type=(0) sizeof=-1 body=1

Therefore, it can be assumed that clang’s AST is fine, and the problem lies in idaclang’s parsing of this AST.
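
For anyone who needs the workaround in the meantime, here is a minimal sketch of what 111.h could look like with the typedef added. It assumes that _WCHAR_T_DEFINED is no longer predefined once -Xclang -fno-wchar is in effect (I have not verified this in the idaclang log), so the guard only emits the typedef when wchar_t is not a built-in type. The guard itself is my addition and is not required by idaclang.

```c
/* 111.h with the workaround applied (a sketch, not a fix for idaclang). */

/* Assumption: _WCHAR_T_DEFINED is absent when wchar_t is not a native type,
   e.g. after passing -Xclang -fno-wchar. In that case, declare wchar_t the
   way the native IDA parser treats it for Windows binaries. */
#ifndef _WCHAR_T_DEFINED
typedef unsigned short wchar_t;	/* 16-bit unsigned, not __int16 */
#define _WCHAR_T_DEFINED
#endif

struct TestXXX
{
	char f1;
	wchar_t f2;	/* expected to be a 2-byte unsigned field */
	wchar_t* f3;	/* expected to be a pointer to a 2-byte unsigned type */
};
```

With this header, f2 and f3 should come out as unsigned short and unsigned short * respectively, which matches what the native parser produces.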