Binary static analysis problem: null pointer check

I was working on the IR of a rosemary and ran into a very interesting problem. After several rounds of troubleshooting, I found out that this kind of problem has a name: the null pointer check.

1. The Beginning of the Problem

I was trying to identify jump tables and vtables. For a call through a vtable, the corresponding IR should look like this:

call:blr(load([(load([X1]) + 0x48)]))

Here, X1 should be the this pointer or some other object pointer. During debugging, however, I found many calls of the following form instead:

call:blr(load([(load([0x0]) + 0x48)]))

The pointer here is the constant 0x0. A constant address in .data or .bss would be easy to explain, but a literal 0 makes no sense as an object pointer, which was very frustrating.
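To find these suspicious calls in bulk, a small matcher over the textual IR is enough. The following is only a sketch: it assumes the IR is printed exactly in the form shown above, and the names VCALL_RE and classify_vcall are made up for illustration, not part of any real tool.

```python
import re

# Matches the textual form: call:blr(load([(load([BASE]) + OFFSET)]))
# BASE is a register name or a constant, OFFSET is a hex vtable slot offset.
VCALL_RE = re.compile(
    r"call:blr\(load\(\[\(load\(\[(?P<base>\w+)\]\) \+ (?P<off>0x[0-9a-fA-F]+)\)\]\)\)"
)

def classify_vcall(ir: str):
    """Return (kind, base, offset) for a vtable-style call, or None if no match."""
    m = VCALL_RE.fullmatch(ir.strip())
    if m is None:
        return None
    base = m.group("base")
    off = int(m.group("off"), 16)
    try:
        const = int(base, 0)          # succeeds only if the base is a literal
    except ValueError:
        return ("register", base, off)
    # A literal 0 as the object pointer is the suspicious case from this post.
    kind = "null-constant" if const == 0 else "constant"
    return (kind, const, off)

if __name__ == "__main__":
    print(classify_vcall("call:blr(load([(load([X1]) + 0x48)]))"))
    print(classify_vcall("call:blr(load([(load([0x0]) + 0x48)]))"))
```

Every result tagged null-constant is a candidate for the kind of false definition investigated below.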

2. Debugging

The function code:

.text:00000000004C1FD0 ; __unwind { // sub_4DAAD8
.text:00000000004C1FD0                 STP             X28, X27, [SP,#-0x10+var_50]!
.text:00000000004C1FD4                 STP             X26, X25, [SP,#0x50+var_40]
.text:00000000004C1FD8                 STP             X24, X23, [SP,#0x50+var_30]
.text:00000000004C1FDC                 STP             X22, X21, [SP,#0x50+var_20]
.text:00000000004C1FE0                 STP             X20, X19, [SP,#0x50+var_10]
.text:00000000004C1FE4                 STP             X29, X30, [SP,#0x50+var_s0]
.text:00000000004C1FE8                 ADD             X29, SP, #0x50
.text:00000000004C1FEC                 SUB             SP, SP, #0x180
.text:00000000004C1FF0                 ADRP            X25, #__stack_chk_guard_ptr@PAGE
.text:00000000004C1FF4                 LDR             X25, [X25,#__stack_chk_guard_ptr@PAGEOFF]
.text:00000000004C1FF8                 MOV             X22, X5
.text:00000000004C1FFC                 MOV             X19, X4
.text:00000000004C2000                 MOV             X20, X2
.text:00000000004C2004                 LDR             X25, [X25]
.text:00000000004C2008                 MOV             X21, X1
.text:00000000004C200C                 ADD             X8, SP, #0x1D0+var_1B8
.text:00000000004C2010                 STR             X25, [X8]
.text:00000000004C2014                 STP             XZR, XZR, [SP,#0x1D0+var_170]
.text:00000000004C2018                 STR             XZR, [SP,#0x1D0+var_178]
.text:00000000004C201C                 ADD             X8, SP, #0x1D0+var_180
.text:00000000004C2020                 MOV             X0, X3
.text:00000000004C2024                 BL              sub_4BB978
.text:00000000004C2028                 ADRP            X1, #off_6C84B0@PAGE
.text:00000000004C202C                 LDR             X1, [X1,#off_6C84B0@PAGEOFF]
.text:00000000004C2030                 ADD             X0, SP, #0x1D0+var_180
.text:00000000004C2034                 BL              sub_4D12A4
.text:00000000004C2038                 LDR             X8, [X0]
.text:00000000004C203C                 LDR             X8, [X8,#0x60]
.text:00000000004C2040                 ADRP            X1, #off_6C84A8@PAGE ; "0123456789abcdefABCDEFxX+-pPiInN"
.text:00000000004C2044                 LDR             X1, [X1,#off_6C84A8@PAGEOFF] ; "0123456789abcdefABCDEFxX+-pPiInN"
.text:00000000004C2048                 SUB             X3, X29, #-var_C0
.text:00000000004C204C                 ADD             X2, X1, #0x1A
.text:00000000004C2050                 BLR             X8
.text:00000000004C2054                 LDR             X0, [SP,#0x1D0+var_180]
.text:00000000004C2058                 BL              sub_4D87DC
.text:00000000004C205C                 STP             XZR, XZR, [SP,#0x1D0+var_190]
.text:00000000004C2060                 STR             XZR, [SP,#0x1D0+var_198]
.text:00000000004C2064                 ADD             X0, SP, #0x1D0+var_198
.text:00000000004C2068                 MOV             W1, #0x16
.text:00000000004C206C                 MOV             W2, WZR
.text:00000000004C2070                 ADD             X23, SP, #0x1D0+var_198
.text:00000000004C2074                 BL              sub_7C2E4
.text:00000000004C2078                 LDRB            W8, [SP,#0x1D0+var_198]
.text:00000000004C207C                 LDR             X9, [SP,#0x1D0+var_188]
.text:00000000004C2080                 ORR             X26, X23, #1
.text:00000000004C2084                 STR             WZR, [SP,#0x1D0+var_1AC]
.text:00000000004C2088                 TST             W8, #1
.text:00000000004C208C                 ADD             X10, SP, #0x1D0+var_160
.text:00000000004C2090                 SUB             X27, X29, #-var_C0
.text:00000000004C2094                 CSEL            X23, X26, X9, EQ
.text:00000000004C2098                 ADD             X28, SP, #0x1D0+var_1A8
.text:00000000004C209C                 STP             X10, X23, [SP,#0x1D0+var_1A8]
.text:00000000004C20A0                 B               loc_4C20AC
.text:00000000004C20A4 ; ---------------------------------------------------------------------------
.text:00000000004C20A4
.text:00000000004C20A4 loc_4C20A4                              ; CODE XREF: sub_4C1FD0+220↓j
.text:00000000004C20A4                 ADD             X8, X8, #4
.text:00000000004C20A8                 STR             X8, [X21,#0x18]
.text:00000000004C20AC
.text:00000000004C20AC loc_4C20AC                              ; CODE XREF: sub_4C1FD0+D0↑j
.text:00000000004C20AC                                         ; sub_4C1FD0+234↓j
.text:00000000004C20AC                 CBZ             X21, loc_4C20C4
.text:00000000004C20B0                 LDP             X8, X9, [X21,#0x18]
.text:00000000004C20B4                 CMP             X8, X9
.text:00000000004C20B8                 B.EQ            loc_4C20D0
.text:00000000004C20BC                 LDR             W0, [X8]
.text:00000000004C20C0                 B               loc_4C20E0
.text:00000000004C20C4 ; ---------------------------------------------------------------------------
.text:00000000004C20C4
.text:00000000004C20C4 loc_4C20C4                              ; CODE XREF: sub_4C1FD0:loc_4C20AC↑j
.text:00000000004C20C4                 MOV             X21, XZR
.text:00000000004C20C8                 MOV             W24, #1
.text:00000000004C20CC                 B               loc_4C20EC
.text:00000000004C20D0 ; ---------------------------------------------------------------------------
.text:00000000004C20D0
.text:00000000004C20D0 loc_4C20D0                              ; CODE XREF: sub_4C1FD0+E8↑j
.text:00000000004C20D0                 LDR             X8, [X21]
.text:00000000004C20D4                 LDR             X8, [X8,#0x48]
.text:00000000004C20D8                 MOV             X0, X21
.text:00000000004C20DC                 BLR             X8
.text:00000000004C20E0
.text:00000000004C20E0 loc_4C20E0                              ; CODE XREF: sub_4C1FD0+F0↑j
.text:00000000004C20E0                 CMN             W0, #1
.text:00000000004C20E4                 CSET            W24, EQ
.text:00000000004C20E8                 CSEL            X21, XZR, X21, EQ
.text:00000000004C20EC
.text:00000000004C20EC loc_4C20EC                              ; CODE XREF: sub_4C1FD0+FC↑j
.text:00000000004C20EC                 CBZ             X20, loc_4C2124
.text:00000000004C20F0                 LDP             X8, X9, [X20,#0x18]
.text:00000000004C20F4                 CMP             X8, X9
.text:00000000004C20F8                 B.EQ            loc_4C2104
.text:00000000004C20FC                 LDR             W0, [X8]
.text:00000000004C2100                 B               loc_4C2114
.text:00000000004C2104 ; ---------------------------------------------------------------------------
.text:00000000004C2104
.text:00000000004C2104 loc_4C2104                              ; CODE XREF: sub_4C1FD0+128↑j
.text:00000000004C2104                 LDR             X8, [X20]
.text:00000000004C2108                 LDR             X8, [X8,#0x48]
.text:00000000004C210C                 MOV             X0, X20
.text:00000000004C2110                 BLR             X8
.text:00000000004C2114
.text:00000000004C2114 loc_4C2114                              ; CODE XREF: sub_4C1FD0+130↑j
.text:00000000004C2114                 CMN             W0, #1
.text:00000000004C2118                 B.EQ            loc_4C2124
.text:00000000004C211C                 CBNZ            W24, loc_4C212C
.text:00000000004C2120                 B               loc_4C2224
.text:00000000004C2124 ; ---------------------------------------------------------------------------
.text:00000000004C2124
.text:00000000004C2124 loc_4C2124                              ; CODE XREF: sub_4C1FD0:loc_4C20EC↑j
.text:00000000004C2124                                         ; sub_4C1FD0+148↑j
.text:00000000004C2124                 MOV             X20, XZR
.text:00000000004C2128                 TBNZ            W24, #0, loc_4C2224
.text:00000000004C212C
.text:00000000004C212C loc_4C212C                              ; CODE XREF: sub_4C1FD0+14C↑j
.text:00000000004C212C                 LDRB            W8, [SP,#0x1D0+var_198]
.text:00000000004C2130                 LDR             X9, [SP,#0x1D0+var_190]
.text:00000000004C2134                 LDR             X10, [SP,#0x1D0+var_1A0]
.text:00000000004C2138                 LSR             X11, X8, #1
.text:00000000004C213C                 TST             W8, #1
.text:00000000004C2140                 CSEL            X24, X11, X9, EQ
.text:00000000004C2144                 ADD             X8, X23, X24
.text:00000000004C2148                 CMP             X10, X8
.text:00000000004C214C                 B.NE            loc_4C219C
.text:00000000004C2150                 LSL             X1, X24, #1
.text:00000000004C2154                 MOV             W2, WZR
.text:00000000004C2158                 ADD             X0, SP, #0x1D0+var_198
.text:00000000004C215C                 BL              sub_7C2E4
.text:00000000004C2160                 LDRB            W8, [SP,#0x1D0+var_198]
.text:00000000004C2164                 MOV             W1, #0x16
.text:00000000004C2168                 TBZ             W8, #0, loc_4C2178
.text:00000000004C216C                 LDR             X8, [SP,#0x1D0+var_198]
.text:00000000004C2170                 AND             X8, X8, #0xFFFFFFFFFFFFFFFE
.text:00000000004C2174                 SUB             X1, X8, #1
.text:00000000004C2178
.text:00000000004C2178 loc_4C2178                              ; CODE XREF: sub_4C1FD0+198↑j
.text:00000000004C2178                 MOV             W2, WZR
.text:00000000004C217C                 ADD             X0, SP, #0x1D0+var_198
.text:00000000004C2180                 BL              sub_7C2E4
.text:00000000004C2184                 LDRB            W8, [SP,#0x1D0+var_198]
.text:00000000004C2188                 LDR             X9, [SP,#0x1D0+var_188]
.text:00000000004C218C                 TST             W8, #1
.text:00000000004C2190                 CSEL            X23, X26, X9, EQ
.text:00000000004C2194                 ADD             X8, X23, X24
.text:00000000004C2198                 STR             X8, [SP,#0x1D0+var_1A0]
.text:00000000004C219C
.text:00000000004C219C loc_4C219C                              ; CODE XREF: sub_4C1FD0+17C↑j
.text:00000000004C219C                 LDP             X8, X9, [X21,#0x18]
.text:00000000004C21A0                 CMP             X8, X9
.text:00000000004C21A4                 B.EQ            loc_4C21B0
.text:00000000004C21A8                 LDR             W0, [X8]
.text:00000000004C21AC                 B               loc_4C21C0
.text:00000000004C21B0 ; ---------------------------------------------------------------------------
.text:00000000004C21B0
.text:00000000004C21B0 loc_4C21B0                              ; CODE XREF: sub_4C1FD0+1D4↑j
.text:00000000004C21B0                 LDR             X8, [X21]
.text:00000000004C21B4                 LDR             X8, [X8,#0x48]
.text:00000000004C21B8                 MOV             X0, X21
.text:00000000004C21BC                 BLR             X8
.text:00000000004C21C0
.text:00000000004C21C0 loc_4C21C0                              ; CODE XREF: sub_4C1FD0+1DC↑j
.text:00000000004C21C0                 MOV             W1, #0x10
.text:00000000004C21C4                 ADD             X3, SP, #0x1D0+var_1A0
.text:00000000004C21C8                 ADD             X4, SP, #0x1D0+var_1AC
.text:00000000004C21CC                 ADD             X6, SP, #0x1D0+var_178
.text:00000000004C21D0                 ADD             X7, SP, #0x1D0+var_160
.text:00000000004C21D4                 MOV             X2, X23
.text:00000000004C21D8                 MOV             W5, WZR
.text:00000000004C21DC                 STP             X28, X27, [SP,#0x1D0+var_1D0]
.text:00000000004C21E0                 BL              sub_4C23C8
.text:00000000004C21E4                 CBNZ            W0, loc_4C2224
.text:00000000004C21E8                 LDP             X8, X9, [X21,#0x18]
.text:00000000004C21EC                 CMP             X8, X9
.text:00000000004C21F0                 B.NE            loc_4C20A4
.text:00000000004C21F4                 LDR             X8, [X21]
.text:00000000004C21F8                 LDR             X8, [X8,#0x50]
.text:00000000004C21FC                 MOV             X0, X21
.text:00000000004C2200                 BLR             X8
.text:00000000004C2204                 B               loc_4C20AC
.text:00000000004C2208 ; ---------------------------------------------------------------------------
.text:00000000004C2208
.text:00000000004C2208 loc_4C2208                              ; CODE XREF: sub_4C1FD0+3C0↓j
.text:00000000004C2208                 MOV             X19, X0
.text:00000000004C220C
.text:00000000004C220C loc_4C220C                              ; CODE XREF: sub_4C23B4+10↓j
.text:00000000004C220C                 ADD             X0, SP, #0x1D0+var_198
.text:00000000004C2210                 BL              sub_75B48
.text:00000000004C2214
.text:00000000004C2214 loc_4C2214                              ; CODE XREF: sub_4C1FD0+3D4↓j
.text:00000000004C2214                                         ; sub_4C1FD0+3DC↓j
.text:00000000004C2214                 ADD             X0, SP, #0x1D0+var_178
.text:00000000004C2218                 BL              sub_75B48
.text:00000000004C221C                 MOV             X0, X19
.text:00000000004C2220                 BL              sub_505080
.text:00000000004C2224
.text:00000000004C2224 loc_4C2224                              ; CODE XREF: sub_4C1FD0+150↑j
.text:00000000004C2224                                         ; sub_4C1FD0+158↑j ...
.text:00000000004C2224                 LDR             X8, [SP,#0x1D0+var_1A0]
.text:00000000004C2228                 SUB             X1, X8, X23
.text:00000000004C222C                 MOV             W2, WZR
.text:00000000004C2230                 ADD             X0, SP, #0x1D0+var_198
.text:00000000004C2234                 BL              sub_7C2E4
.text:00000000004C2238                 ADD             X24, SP, #0x1D0+var_1B8
.text:00000000004C223C                 LDRB            W8, [SP,#0x1D0+var_198]
.text:00000000004C2240                 ADRP            X10, #byte_6D2CB8@PAGE
.text:00000000004C2244                 LDR             X9, [SP,#0x1D0+var_188]
.text:00000000004C2248                 ADD             X10, X10, #byte_6D2CB8@PAGEOFF
.text:00000000004C224C                 LDARB           W10, [X10]
.text:00000000004C2250                 TST             W8, #1
.text:00000000004C2254                 CSEL            X23, X26, X9, EQ
.text:00000000004C2258                 AND             W8, W10, #1
.text:00000000004C225C                 TBNZ            W8, #0, loc_4C2298
.text:00000000004C2260                 ADRL            X0, byte_6D2CB8
.text:00000000004C2268                 BL              sub_4DA76C
.text:00000000004C226C                 CBZ             W0, loc_4C2298
.text:00000000004C2270                 ADRP            X1, #aC@PAGE ; "C"
.text:00000000004C2274                 MOV             X2, XZR
.text:00000000004C2278                 ADD             X1, X1, #aC@PAGEOFF ; "C"
.text:00000000004C227C                 MOV             W0, #0x1FBF
.text:00000000004C2280                 BL              .newlocale
.text:00000000004C2284                 ADRP            X8, #qword_6D2CB0@PAGE
.text:00000000004C2288                 STR             X0, [X8,#qword_6D2CB0@PAGEOFF]
.text:00000000004C228C                 ADRL            X0, byte_6D2CB8
.text:00000000004C2294                 BL              sub_4DA82C
.text:00000000004C2298
.text:00000000004C2298 loc_4C2298                              ; CODE XREF: sub_4C1FD0+28C↑j
.text:00000000004C2298                                         ; sub_4C1FD0+29C↑j
.text:00000000004C2298                 ADRP            X8, #qword_6D2CB0@PAGE
.text:00000000004C229C                 LDR             X1, [X8,#qword_6D2CB0@PAGEOFF]
.text:00000000004C22A0                 ADRL            X2, aP_1 ; "%p"
.text:00000000004C22A8                 MOV             X0, X23
.text:00000000004C22AC                 MOV             X3, X22
.text:00000000004C22B0                 BL              sub_4BFE04
.text:00000000004C22B4                 CMP             W0, #1
.text:00000000004C22B8                 B.EQ            loc_4C22C4
.text:00000000004C22BC                 MOV             W8, #4
.text:00000000004C22C0                 STR             W8, [X19]
.text:00000000004C22C4
.text:00000000004C22C4 loc_4C22C4                              ; CODE XREF: sub_4C1FD0+2E8↑j
.text:00000000004C22C4                 CBZ             X21, loc_4C22DC
.text:00000000004C22C8                 LDP             X8, X9, [X21,#0x18]
.text:00000000004C22CC                 CMP             X8, X9
.text:00000000004C22D0                 B.EQ            loc_4C22E8
.text:00000000004C22D4                 LDR             W0, [X8]
.text:00000000004C22D8                 B               loc_4C22F8
.text:00000000004C22DC ; ---------------------------------------------------------------------------
.text:00000000004C22DC
.text:00000000004C22DC loc_4C22DC                              ; CODE XREF: sub_4C1FD0:loc_4C22C4↑j
.text:00000000004C22DC                 MOV             X21, XZR
.text:00000000004C22E0                 MOV             W22, #1
.text:00000000004C22E4                 B               loc_4C2304
.text:00000000004C22E8 ; ---------------------------------------------------------------------------
.text:00000000004C22E8
.text:00000000004C22E8 loc_4C22E8                              ; CODE XREF: sub_4C1FD0+300↑j
.text:00000000004C22E8                 LDR             X8, [X21]
.text:00000000004C22EC                 LDR             X8, [X8,#0x48]
.text:00000000004C22F0                 MOV             X0, X21
.text:00000000004C22F4                 BLR             X8

The instruction at 0x4C22F4 is an indirect call (BLR X8). To resolve the call target, I need to find the expression that defines X8.

X8 is loaded through X21 (LDR X8, [X21]), so the next step is to find the definition of X21. Backtracking eventually found one at 0x4C20C4: MOV X21, XZR. So a predecessor of the basic block containing 0x4C22F4 really does set X21 to NULL, which is where the 0x0 in the IR comes from.

Then I continued tracing the assembly and found the real problem. X21 is set to 0 at 4C22DC, which is the taken target of the CBZ X21 at 4C22C4. That block then branches directly to loc_4C2304 without ever reaching 4C22E8/4C22F4. The call can only be reached through the fall-through side of the CBZ, where X21 is known to be non-zero. Therefore, although the definition X21 = XZR exists, it cannot reach 4C22F4 on any feasible path.
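The reachability argument above can be reproduced on a toy model. This is a minimal sketch, not the real analysis: the CFG is a hand-simplified version of the listing (intermediate blocks and the CSEL at 4C20E8 are left out), X21 is tracked with just two abstract values ("zero" and "arg", the incoming argument), and DEFS, EDGES, refine and reaching_values are names I made up for the example.

```python
from collections import deque

# Blocks that overwrite X21, with the abstract value they assign.
DEFS = {
    "entry": {"arg"},     # MOV X21, X1
    "4C20C4": {"zero"},   # MOV X21, XZR
    "4C22DC": {"zero"},   # MOV X21, XZR
}

# (source, destination, what the edge implies about X21)
EDGES = [
    ("entry", "4C20AC", None),
    ("4C20AC", "4C20C4", "zero"),      # CBZ X21 taken
    ("4C20AC", "4C20B0", "nonzero"),   # CBZ X21 fall-through
    ("4C20C4", "4C22C4", None),        # simplified: skips intermediate blocks
    ("4C20B0", "4C22C4", None),        # simplified: skips intermediate blocks
    ("4C22C4", "4C22DC", "zero"),      # CBZ X21 taken
    ("4C22C4", "4C22C8", "nonzero"),   # CBZ X21 fall-through
    ("4C22C8", "4C22E8", None),        # B.EQ taken, leads to BLR X8
    ("4C22DC", "4C2304", None),
]

def refine(values, constraint):
    """Filter the possible values of X21 by the condition on an edge."""
    if constraint == "nonzero":
        return values - {"zero"}                # a zero cannot take this edge
    if constraint == "zero":
        return {"zero"} if values else set()    # whatever arrives is zero here
    return set(values)

def reaching_values(prune):
    """Forward worklist propagation; returns the X21 values entering each block."""
    blocks = {b for src, dst, _ in EDGES for b in (src, dst)}
    incoming = {b: set() for b in blocks}
    work = deque(["entry"])
    while work:
        block = work.popleft()
        out = DEFS.get(block, incoming[block])
        for src, dst, constraint in EDGES:
            if src != block:
                continue
            flowing = refine(out, constraint) if prune else set(out)
            if not flowing <= incoming[dst]:
                incoming[dst] |= flowing
                work.append(dst)
    return incoming

if __name__ == "__main__":
    print("without pruning:", reaching_values(False)["4C22E8"])
    print("with pruning:   ", reaching_values(True)["4C22E8"])
```

Without the edge conditions, the zero definition leaks into 4C22E8, which matches the bogus load([0x0]) I saw. With them, only the original argument reaches the call.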

3. Solution

In the lifter, the basic semantics of conditional branches such as cbz/cbnz need to be expanded, so that each outgoing edge carries the condition under which it is taken. This allows many infeasible branches to be pruned directly during control flow backtracking.

This pattern is called a null pointer check.
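As a rough idea of what expanding cbz/cbnz in the lifter could look like, here is a small sketch. The Edge type and the lift_cbz_family function are hypothetical and only cover these two mnemonics; a real lifter would also have to handle tbz/tbnz and flag-based branches.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edge:
    dst: int        # address of the successor block
    reg: str        # register tested by the branch
    is_zero: bool   # fact about reg that holds whenever this edge is taken

def lift_cbz_family(mnemonic: str, reg: str, target: int, fallthrough: int):
    """Expand CBZ/CBNZ into (taken_edge, fallthrough_edge) with conditions."""
    m = mnemonic.lower()
    if m not in ("cbz", "cbnz"):
        raise ValueError(f"not a compare-and-branch instruction: {mnemonic}")
    taken_is_zero = (m == "cbz")   # CBZ jumps when reg == 0, CBNZ when reg != 0
    return (
        Edge(target, reg, taken_is_zero),
        Edge(fallthrough, reg, not taken_is_zero),
    )

if __name__ == "__main__":
    # CBZ X21, loc_4C22DC at 0x4C22C4; the fall-through is 0x4C22C8.
    taken, fall = lift_cbz_family("CBZ", "X21", 0x4C22DC, 0x4C22C8)
    print(hex(taken.dst), taken.reg, "== 0" if taken.is_zero else "!= 0")
    print(hex(fall.dst), fall.reg, "== 0" if fall.is_zero else "!= 0")
```

During backtracking, a definition such as X21 = XZR can then be discarded as soon as the path crosses an edge whose condition says X21 != 0.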