Wednesday, July 05, 2006

Self replicating software - Part 1- The Recursion Theorem

This is the first in a multi part post about computing theory and self replicating software. This post assumes you have knowledge and understanding of a Turing Machine (abbreviated TM). If you aren't familiar with Turing Machines (TMs) then you may want to take a look at the Wikipedia entry on the topic at http://en.wikipedia.org/wiki/Turing_machine before continuing. Much of the material for this post comes from Michael Sipser's "An Introduction to the Theory of Computation".

Let's assume there is a computable function t which takes two inputs and yields one output. This function can be stated as:


t: E* x E* -> E*


Call the TM that implements this function TM_T. Now if one of the inputs that TM_T takes is the description of a TM (perhaps a description of TM_T or the description of another TM) we can call TM_T a "Meta TM", meaning it operates on TMs (more accurately, descriptions of TMs). TM_T could do a number of things with the description, such as emulate it (making TM_T a universal TM), or print the description to the TM's tape (making TM_T a "quine"). For the purpose of the recursion theorem, it's irrelevant what TM_T does with the description. So TM_T can be written as taking two inputs:


TM_T(<TM>, w)


Where <TM> is a description of a TM. The other argument (w) could be input to the TM described by the <TM> argument.

Now, assume there is another computable function r which takes one parameter, w, the same w passed as the second argument to function t. Function r can be stated as:


r: E* -> E*


Call the TM that implements this function TM_R. So TM_R can be written as taking one input:


TM_R(w)


The TM_R is described by <TM_R>.

The recursion theorem states (in terms of functions r and t):


r(w) = t(<TM_R>, w)


alternatively


TM_R(w) = TM_T(<TM_R>, w)


This means that the ouput from TM_T given a description of TM_R and input w yields the same output as TM_R with input w. Think about the implications of this for a moment. If TM_T prints the description of the TM passed to it (which in this case is the description of TM_R), then TM_R can do the same thing without having the description passed in explicitly, instead TM_R can calculate <TM_R>. In essence TM_R could print out a description of itself, without having the description given to it.

I'll post the proof and examples in Python in later entries. The recursion theorem can allow a virus to obtain a description of itself and "print" the description into a file. Although, there are other methods that virii can use to obtain descriptions of themselves, which will also be covered in later posts.





<< Home

This page is powered by Blogger. Isn't yours?

My blog has moved! Redirecting…

You should be automatically redirected. If not, visit http://www.forensicblog.org and update your bookmarks.