We've been quite happy at Jam for a few contracts to improve KnowledgeTree. These not only improve KnowledgeTree for everyone in the specific area of the contract, but in the process we're given the opportunity to look at surrounding code, and think about improvements. Such is the latest of our contracts, which involves some wide-spread changes that I'm having to prepare for by improving various things about KnowledgeTree - mostly getting rid of legacy decisions and methods. Here's the start of that work.

So many of the questions about KnowledgeTree on the SF.net forums and lists have to do with people not making certain configuration setting changes. So, now I've taken it out of their hands.

They used to have to set the web site address, the path to the root of the KnowledgeTree application, the error_reporting level. Now I just set this all up inside the application:

// Default settings differ, we need some of these, so force the matter.
// Can be overridden here if actually necessary.
error_reporting(E_ALL & ~E_NOTICE);
ini_set('display_errors', '1');
ini_set('display_startup_errors', '1');
ini_set('magic_quotes_runtime', '0');

// If not defined, set KT_DIR based on my usual location in the tree
if (!defined('KT_DIR')) {
    define('KT_DIR', realpath(dirname(__FILE__) . '/..'));
}

$default->fileSystemRoot = KT_DIR;
$default->serverName =$_SERVER['HTTP_HOST'];

Two legacy decisions were made - enabling magic_quotes_gpc and register_globals. We even had a workaround if register_globals wasn't set to import them.

Why disable magic_quotes_gpc? Well, we want to have pristine request parameters, as then we can be sure we won't double-quote things going into the database, or have quoted stuff going to files. Magic Quotes don't really buy stuff once you have the right framework for dealing with data. Sure, we aren't there yet in KnowledgeTree, but this change makes sure we get there quickly. You can't disable magic_quotes_gpc at runtime, but you can dequote the request arrays, which is what we do.

We also don't want the request parameters in the global scope. Firstly, our handlers that need access to request parameters may (or maybe even should) be in functions, without easy access to the global scope anyway. We also should keep track accurately of what parameters we want from the request. I don't think KnowledgeTree has this problem (but it sure won't after this change), but it is very easy for a amateur programmer to mistakenly allow someone access to something if you're checking a variable that's set via a request parameter. We overcome register_globals by removing each request parameter in the global scope that is equivalent in value to what's in the request parameter - a trick TikiWiki uses.

/// {{{ KTInit
class KTInit {
    // {{{ cleanGlobals()
    function cleanGlobals () {
        /*
         * Borrowed from TikiWiki
         *
         * Copyright (c) 2002-2004, Luis Argerich, Garland Foster,
         * Eduardo Polidor, et. al.
         */
        if (ini_get('register_globals')) {
            foreach (array($_ENV, $_GET, $_POST, $_COOKIE, $_SERVER) as $superglob) {
                foreach ($superglob as $key => $val) {
                    if (isset($GLOBALS[$key]) && $GLOBALS[$key] == $val) {
                        unset($GLOBALS[$key]);
                    }
                }
            }
        }
    }
    // }}}

    // {{{ cleanMagicQuotesItem()
    function cleanMagicQuotesItem (&$var) {
        if (is_array($var)) {
            foreach ($varas $key => $val) {
                KTInit::cleanMagicQuotesItem($var[$key]);
            }
        } else {
            $var = stripslashes($var);
        }
    }
    // }}}

    // {{{ cleanMagicQuotes()
    function cleanMagicQuotes () {
        if (get_magic_quotes_gpc()) {
            KTInit::cleanMagicQuotesItem($_GET);
            KTInit::cleanMagicQuotesItem($_POST);
            KTInit::cleanMagicQuotesItem($_REQUEST);
            KTInit::cleanMagicQuotesItem($_COOKIE);
        }
    }
    // }}}
}
// }}}

Now how do I go about fixing all the abuse of the fact request parameters were in the global scope? Luckily, we were using Hungarian Notation, and all form/request-derived parameters were prefixed with 'f'.

I used the following combination of Bourne Shell and Perl to find such variables in a file:

#!/bin/sh
# listFormVariables.sh

FILE=$1

grep '$f' $FILE | perl ~/fs/bin/getFormVariables.pl | sort

grep extractGPC $FILE
#!/usr/bin/perl
# getFormVariables.pl

my(%vars);

while (<>) {
        while (/\$f/) {
                s/^.*?(\$f[a-zA-Z0-9]+)[^a-zA-Z0-9]//;
                # print $1 . "\n";
                $vars{$1} = 1;
        }
}

foreach $k (keys %vars) {
        print $k . "\n";
}

Once I'd identified the form/request variable usage, I either converted them to using $_REQUEST, or I used the following to grab only the requested variables into the global scope:

<?php

// {{{ KTUtil
class KTUtil {
    function extractGPC () {
        foreach (func_get_args() as $var) {
            if (array_key_exists($var, $_REQUEST)) {
                $GLOBALS["$var"] = $_REQUEST["$var"];
            }
        }
    }

}
// }}}

?>

So, that didn't take me very long to run through all the files, while adding a lot of TODO comments when I noticed strange things.

(And maybe I can push KT back to 5th position in the SF.net activity rankings.)