Tuesday, September 23, 2014

atoi (in 32-bit x86 Assembly)

During my Amazon days some years back, an inside joke was established between some of us regarding the C function atoi. One of our teammates, Kevin, was fixated on the most efficient implementation of atoi in C. Since our charter didn't even include writing or maintaining C code, Kevin was being wistful about his college days more than anything else.

After the discussion died down, another team member cracked the following day, "Well, Kevin, now it's time to tackle itoa." We all chuckled, but writing atoi became an inside joke with various wisecracks such as "Oh... Kevin's next big project is probably writing atoi with one line of code."

A couple of years later when Kevin and I were no longer coworkers, he told me he was interviewing candidates at Amazon and I wisecracked, "You should have the candidates write atoi... in x86 assembly!" Again, we chuckled.

Well, it turns out I have a degree in computer engineering, and apart from the electrical engineering classes during my undergrad, I had an engineering class that used a microcontroller board and required writing assembly code for the labs. So that evening, I took on my own challenge: writing atoi in x86 assembly. Nowadays I work in the network engineering space, so this isn't something I do on any regular basis. Even so, before too long, I had written the code that follows below (no "googling" involved here).

The experience back then gave me an invaluable understanding of computing that few achieve today. That's why, when security advisories come out calling out arbitrary code execution, the advisory doesn't simply go in one ear and out the other. That is a segue to an anecdote: a vendor at a large IT security fair was once demonstrating his product and asked a room of about 150 IT professionals, "Anyone here know what a NOP sled is?" I was the only person who raised their hand...

      ; Assumes NASM syntax and the cdecl convention (string
      ; pointer is the first stack argument). Input is assumed
      ; to be ASCII digits only: no sign, no whitespace.
atoi: push ebp
      mov ebp, esp         ; Establish stack frame for args

      push ecx             ; Counter for strlen
      push edx             ; Current power of 10
      push esi             ; Pointer into the string
      push edi             ; Running total

      mov esi, [ebp+8]     ; First arg, above saved EBP
                           ; and the return address
      mov ecx, 0

len_loop:
      cmp byte [esi], 0    ; Reached the NUL terminator?
      je len_end
      inc ecx              ; will hold strlen at loop's end
      inc esi
      jmp len_loop
len_end:

      cmp ecx, 0           ; Anything to convert?
      mov eax, 0           ; If not, return 0, like
                           ; C stdlib (MOV leaves flags alone)
      je atoi_end          ; if ecx == 0 we're done

      dec esi              ; ESI stopped on the NUL, so back
                           ; up to the last char of the string
      mov edx, 1           ; Start with lowest
                           ; order power of 10
      mov edi, 0

core_loop:
      movzx eax, byte [esi] ; *esi has a single digit
      sub eax, 0x30        ; ASCII char; subtracting
                           ; 0x30 leaves us with
                           ; digit for that power
                           ; of 10

      imul eax, edx        ; Two-operand IMUL; one-operand
                           ; MUL would clobber EDX
      add edi, eax

      imul edx, edx, 10    ; Have next highest
                           ; order power in edx

      dec ecx
      cmp ecx, 0
      je core_loop_end
      dec esi
      jmp core_loop
core_loop_end:

      ; When everything is said and done EDI will
      ; hold value to return
      ; Canonically, return values are passed back
      ; via EAX

      mov eax, edi

atoi_end:
      pop edi
      pop esi
      pop edx
      pop ecx
      pop ebp
      ret
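
For anyone who wants to cross-check the assembly, here is a rough C sketch of the same idea: measure the string, then walk it from the last character back to the first, multiplying each digit by a growing power of 10. The function names atoi_rtl and atoi_ltr are made up for this post and are not part of any library. Like the assembly, the sketch assumes the input is nothing but ASCII digits (no sign, no whitespace, no overflow checking). The second function shows the more common left-to-right approach for contrast.

```c
#include <stddef.h>

/* Hypothetical helper: right-to-left conversion, mirroring the
   assembly. Unsigned arithmetic is used so that an overly long
   input wraps around instead of invoking undefined behavior. */
int atoi_rtl(const char *s)
{
    size_t len = 0;
    while (s[len] != '\0')          /* strlen loop (ECX) */
        len++;

    unsigned total = 0;             /* running total (EDI) */
    unsigned power = 1;             /* current power of 10 (EDX) */
    while (len > 0) {
        len--;                      /* step back one character */
        total += (unsigned)(s[len] - '0') * power;
        power *= 10;
    }
    return (int)total;
}

/* Hypothetical helper: left-to-right conversion. No length pass
   and no separate power of 10 are needed. */
int atoi_ltr(const char *s)
{
    unsigned total = 0;
    while (*s != '\0') {
        total = total * 10 + (unsigned)(*s - '0');
        s++;
    }
    return (int)total;
}
```

Both functions return 0 for an empty string, the same as the assembly's early exit. The left-to-right version is probably closer to what Kevin had in mind when he talked about efficiency, since it touches each character only once.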