Credit: EvalBlog
One of the things I do at DreamHost is help with hacked sites. This means when WP is hacked, I look at it, figure out how, and explain to the person how to fix it, or how to tell their tech folks what needs doing. There are occasions where I’ll delete things for them, but usually that happens when there’s a folder or file with weird permissions.

We have a lot of tricks for what we look for, like base64, but recently I started finding files that slipped past my scan, though not past my “Hey, wait, wp-mai1.php isn’t a WordPress file…” check. Files like this:

$a51a0e6bb0e53a=str_rot13('tmhapbzcerff');$a51a0e6bb0e5e4=str_rot13(strrev('rqbp rq_46rfno'));

Now obviously I can just add str_rot13 to my checklist (nothing in WordPress core uses it), but how do I look for those eval strings?
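As an aside, the two strings in that sample can be unscrambled without running any of the hacker's PHP. This is a minimal sketch, assuming a shell with the standard tr and rev utilities; the rot13 helper name is my own and not part of any tool.

```shell
# rot13: apply the same letter substitution that PHP's str_rot13() performs.
# (Hypothetical helper name, defined here only for illustration.)
rot13() {
  tr 'A-Za-z' 'N-ZA-Mn-za-m'
}

# First variable in the sample: str_rot13('tmhapbzcerff')
printf '%s\n' 'tmhapbzcerff' | rot13

# Second variable: str_rot13(strrev('rqbp rq_46rfno'))
# rev reverses the characters on each line, standing in for strrev().
printf '%s\n' 'rqbp rq_46rfno' | rev | rot13
```

The first line decodes to gzuncompress and the second to base64_de code (space included, exactly as it appears in the sample), which looks like base64_decode. Neither name ever appears in plain text in the file, which is why a scan for base64 alone misses it.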

Eval is a funny thing. In JavaScript: The Good Parts, Douglas Crockford states “eval is Evil: The eval function is the most misused feature of JavaScript. Avoid it,” but he’s talking JS and I’m looking at PHP files. So with the (current) assumption that I can ignore JS, I can try this (I also use ack half the time, depending on my mood, and you can leave out the ‘exclude SVN’ stuff if you want to, since most users don’t have it):

grep -R --exclude-dir="\.svn" --exclude="*.js" "eval" .

That gets me a lot of files, though, and I don’t want to parse what I don’t need to. By the way, there’s one and only one file in all of WP that uses eval() in a ‘nefarious’ way, and that’s ./wp-admin/js/revisions-js.php, which is the WordPress easter egg. That’s also the only place you’ll see p,a,c,k,e,r code. But clearly I want to look for eval( or even eval($ because that’s more exact, and that should give me a better result.

This is a double-edged sword, of course. If I’m too precise, I will miss some of their shenanigans. If I’m not close enough to what I’m looking for, I get too much. And worst of all, I don’t always know what I’m looking for. Quite a lot of finding new hacks is a world where “I’ll know it when I see it.” So let’s narrow it down and say I want to find no JS, nothing in .svn, and anything with eval and a paren:

grep -R --exclude-dir="\.svn" --exclude="*.js" -e 'eval(' .

That’s a lot better, and in fact, this is a good start! But it’s hard to read because of how long the lines are:

./wp-admin/includes/class-pclzip.php://      eval('$v_result = '.$p_options[PCLZIP_CB_PRE_EXTRACT].'(PCLZIP_CB_PRE_EXTRACT, $v_local_header);');
./wp-admin/js/revisions-js.php:eval(function(p,a,c,k,e,r){e=function(c){return(c<a?'':e(parseInt(c/a)))+((c=c%a)>35?String.fromCharCode(c+29):c.toString(36))};if(!''.replace(/^/,String)){while(c--)r[e(c)]=k||e(c);k=[function(e){return r[e]}];e=function(){return'\\\\w+'};c=1};while(c--)if(k)p=p.replace(new RegExp('\\\\b'+e(c)+'\\\\b','g'),k);return p}('6(4(){2 e=6(\\'#Q\\').v();2 i=\\'\\\\\\',.R/=\\\\\\\\S-;T"<>U?+|V:W[]X{}\\'.u(\\'\\');2 o=\\'Y[]\\\\\\\\Z;\\\\\\'10,./11{}|12:"13<>?-=14+\\'.u(\\'\\');2 5=4(s){r=\\'\\';6.15(s.u(\\'\\'),4(){2 t=16.D();2 c=6.17(t,i);r+=\\'\$\\'==t?n:(-1==c?t:o)});j r};2 a=[\\'O.E[18 e.y.19.1a\\',\\'1b 1c. 1d .1e.,1f 1g\\',\\'O.E e.1h 1i 8\\',\\'9\\',\\'0\\'];2 b=[\\'<1j. 1k \$1l\\',\\'1m. 1n 1o 1p\\',\\'1q, 1r. ,1s. 1t\\'&#93;;2 w=&#91;&#93;;2 h=6(5(\\'#1u\\'));6(5(\\'1v\\')).1w(4(e){7(1x!==e.1y){j}7(x&amp;&amp;x.F){x.F();j G}1z.1A=6(5(\\'#1B\\')).1C(\\'1D\\');j G});2 k=4(){2 l=a.H();7(\\'I\\'==J l){7(m){2 c={};c&#91;5(\\'1E\\')&#93;=5(\\'1F\\');c&#91;5(\\'1G\\')&#93;=5(\\'1H..b\\');6(5(\\'1I 1J\\')).1K(c);p();h.v().1L({1M:1},z,\\'1N\\',4(){h.K()});d(m,L)}j}w=5(l).u(\\'\\');A()};2 A=4(){B=w.H();7(\\'I\\'==J B){7(m){h.M(5(\\'1O 1P\\'));d(k,C)}N{7(a.P){d(p,C);d(k,z)}N{d(4(){p();h.v()},C);d(4(){e.K()},L)}}j}h.M(B.D());d(A,1Q)};2 m=4(){a=b;m=1R;k()};p=4(){2 f=6(\\'p\\').1S(0);2 g=6.1T(f.q).1U();1V(2 
g=f.q.P;g>0;g--){7(3==f.q[g-1].1W||\\'1X\\'==f.q[g-1].1Y.1Z()){f.20(f.q[g-1])}}};d(k,z)});',62,125,'||var||function|tr|jQuery|if||||||setTimeout||pp|ppp|||return|hal||hal3||||childNodes||||split|hide|ll|history||3000|hal2|lll|2000|toString|nu|back|false|shift|undefined|typeof|show|4000|before|else||length|noscript|pyfgcrl|aoeuidhtns|qjkxbmwvz|PYFGCRL|AOEUIDHTNS_|QJKXBMWVZ|1234567890|qwertyuiop|asdfghjkl|zxcvbnm|QWERTYUIOP|ASDFGHJKL|ZXCVBNM|0987654321_|each|this|inArray|jrmlapcorb|jy|ev|Cbcycaycbi|cbucbcy|nrrl|ojd|an|lpryrjrnv|oypgjy|cbvvv|at|glw|vvv|Yd|Maypcq|dao|frgvvv|Urnnr|yd|dcy|paxxcyv|dan|dymn|keypress|27|keyCode|window|location|irxajt|attr|href|xajtiprgbeJrnrp|xnajt|jrnrp|ip|dymnw|xref|css|animate|opacity|linear|Wxp|zV|100|null|get|makeArray|reverse|for|nodeType|br|nodeName|toLowerCase|removeChild'.split('|'),0,{}))
./wp-admin/press-this.php:		var my_src = eval(
./wp-admin/press-this.php:			var my_src = eval(
./wp-admin/press-this.php:							eval(data);
./wp-includes/class-json.php: * Javascript, and can be directly eval()'ed with no further parsing
./wp-includes/functions.php:		if ( doubleval($bytes) >= $mag )

Okay, let’s get smarter!

grep -R --exclude-dir="\.svn" --exclude="*.js" -e 'eval(' .|cut -c -80

Now I’m telling it to cut each line off after 80 characters, because it’s easier to pick out the bad with just that much. Look:

./wp-admin/includes/class-pclzip.php://      eval('$v_result = '.$p_options[PCLZ
./wp-admin/press-this.php:		var my_src = eval(
./wp-admin/press-this.php:			var my_src = eval(
./wp-admin/press-this.php:							eval(data);
./wp-includes/class-json.php: * Javascript, and can be directly eval()'ed with n
./wp-includes/functions.php:		if ( doubleval($bytes) >= $mag )
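The doubleval( hit in that list is a false positive: the pattern 'eval(' matches any function whose name merely ends in "eval". One way to tighten the search, sketched here on the assumption that your grep supports -E along with the same --exclude flags used above, is to require that eval not be preceded by a letter, digit, or underscore. The function name find_eval_calls is made up for this example.

```shell
# find_eval_calls [DIR]: list lines that call eval( as a whole word,
# skipping .js files and .svn directories, trimmed to 80 columns.
# (Hypothetical helper; the flags mirror the grep commands in the post.)
find_eval_calls() {
  grep -R -E --exclude-dir='.svn' --exclude='*.js' \
    '(^|[^[:alnum:]_])eval[[:space:]]*\(' "${1:-.}" \
    | cut -c -80
}
```

Running find_eval_calls . from the WordPress root should drop doubleval( and similar names while still catching eval (, with a space before the paren. It will still report comments that mention eval(), so the eyeball pass remains necessary.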

Part of the reason this works is I know what I’m looking for. WordPress, in general, doesn’t encrypt content. Passwords and security stuff, yes, but when it does that, it uses variables, so you would get eval('$v_result = '.$p_options[PCLZIP_CB_PRE_EXTRACT].'(PCLZIP_CB_PRE_EXTRACT, $v_local_header);'); which remains totally human readable. By that I mean I can see clear words that are easy to search for in a doc, or via grep or awk, without being forced to copy/paste. I can remember “PCLZIP underscore CB…”

Random characters like $a51a0e6bb0e53a are not human readable at all. That’s how I know they’re bad. Of course, if someone got clever-er, they would start naming those variables things that ‘make sense’ in the world of WP, and I have a constant fear that by pointing out how I can tell this is a hack, I give them ideas on how to do evil-er things to us.

It’s for reasons like this that I, when faced with a hack or asked to clean one up, always perform Scorched Earth Security. I delete everything and reinstall it. I look for PHP and JS files in wp-content/uploads, or .htaccess files anywhere they shouldn’t be (in clean WP, you have two at most: at the root of your site and in akismet). I make sure I download my themes and plugins from known clean locations. I’m careful. And I always change my passwords. Heck, I don’t even know what mine are right now!
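The file checks in that paragraph are easy to script. What follows is only a sketch, assuming a standard layout with wp-content/uploads under the site root; list_suspect_files is a name I made up, and it only lists candidates, it deletes nothing.

```shell
# list_suspect_files [WP_ROOT]: print files that deserve a second look.
# (Hypothetical helper; assumes the default wp-content/uploads location.)
list_suspect_files() {
  # PHP and JS files have no business living in the uploads directory.
  if [ -d "${1:-.}/wp-content/uploads" ]; then
    find "${1:-.}/wp-content/uploads" -type f \( -name '*.php' -o -name '*.js' \)
  fi
  # Every .htaccess file, so each can be compared against where one belongs.
  find "${1:-.}" -type f -name '.htaccess'
}
```

Call it as list_suspect_files /path/to/site and read the list by hand. Any .htaccess beyond the expected locations, or any script under uploads, is worth opening before the reinstall.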

But none of this is static enough for me to say “This is the fix forever and ever” or “this is how you will always find the evil…” By the time we’ve codified and discussed best methods, the hackers have moved on. The logic of what to look for now may not last long, but the basic concept of looking for wrong and how to search for it should remain a good starting point for a while yet.

Do you have special tricks you use to find the evil? Like what Topher did to clean up a hack?

Reader Interactions


  1. Look for the gibberish instead.

    grep -R "[0-9A-Za-z]\{30,\}" * | grep "[a-zA-Z][0-9]" | grep "[0-9][a-zA-Z]"

    This finds alphanumeric sequences at least 30 characters long, on lines that also contain both a letter-to-number and a number-to-letter transition. Requiring both transitions in the same line helps eliminate things like $variable20 and so on.

    Not perfect, but useful for spotting oddities.

    • I finally tested that. Comes up with a lot of images! 😆

      grep -R --exclude-dir="\.svn" --exclude="*.jpg" "[0-9A-Za-z]\{30,\}" * | grep "[a-zA-Z][0-9]" | grep "[0-9][a-zA-Z]"

      You may even want to take it further and tell it to exclude anything named ‘backup’ as some plugins are idiots.

    • Yeah, it’s not perfect. Might limit it to PHP files only.

      But it’s easier to find gibberish in an eyeball scan than to find maliciousness. Then you can walk backwards.
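The PHP-only variant suggested in that last reply might look like the sketch below. It assumes a grep that supports --include, and the function name find_gibberish_php is invented for the example; the three patterns are the ones from the comment above, unchanged.

```shell
# find_gibberish_php [DIR]: run the gibberish search from the comments,
# restricted to .php files so images and backups stay out of the results.
# (Hypothetical helper name; patterns copied from the thread.)
find_gibberish_php() {
  grep -R --include='*.php' --exclude-dir='.svn' '[0-9A-Za-z]\{30,\}' "${1:-.}" \
    | grep '[a-zA-Z][0-9]' \
    | grep '[0-9][a-zA-Z]'
}
```

Because grep prefixes each hit with the file name, the two transition checks also see the path, so a file with digits in its name can slip through on the strength of its name alone. Treat the output as a list of things to eyeball, not a verdict.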
