Archive for the ‘Development’ Category

Understanding and Dealing with Click Fraud

November 13th, 2010

This is an excerpt from a whitepaper that I was commissioned to write for the FBI. It was part of an internal research project where they were to determine the effects and solutions regarding the rampant Click Fraud concern with online advertising and the like. – Brian Grayless

What is click fraud?

Let’s start out by defining some key terms that are important in understanding click fraud.

  • Advertiser – The entity that pays money to get traffic to their site in the way of bidding on keywords or topical categories (bid auctions).
  • Publisher – Any entity which displays advertiser ads on their web site or in some other publicly viewable medium.
  • Visitor – A legitimate user who clicked on something to get to the appropriate target web site.
  • Click – A visitor to the advertiser’s site that came by route of one or more publishers.
  • PPC (Pay Per Click) – An internet advertising model where the amount the advertiser pays is dictated on a per click basis for the terms (keyword or categories) being bid on.
  • CPC (Cost Per Click) – The amount (bid price) paid by the advertiser to receive one visitor for a particular term. The amount is paid only if the visit occurs.
  • CPA (Cost Per Action) – An advertising cost associated with a particular desired visitor action, e.g. purchasing a product or service, filling out a survey, or signing up for a newsletter.
  • Conversion – A completion of the advertiser’s desired action under a CPA advertising model.
  • Click Stream – The route the click traffic takes from the time the click is made through the time the web user arrives on the advertiser’s targeted URL. There can often be URL redirects and several publishers (usually tracked by cookies or IDs in the URL) that receive information for each click, all of it invisible to the user.
  • Rev Share – A single publisher’s fraction of the revenue generated by specific click and conversion sources. For example, a smaller publisher might arrange to send click traffic into a larger publisher’s click stream, providing the larger publisher with more traffic and retaining a 5% rev share of the total per click amount for the smaller publisher. Rev Share can be seen as a multi-tier sales commission.
  • Ad Feed – Ad listings/data provided on request by an n-tier publisher, to be displayed to users on another publisher’s web site or application.
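
To make the rev share arithmetic concrete, here is a minimal JavaScript sketch. The function name splitClickRevenue, the one dollar CPC, the publisher names and the choice to work in whole cents are all assumptions made for illustration; only the 5% share comes from the Rev Share definition above.

```javascript
// Hypothetical helper: split the CPC of one click between publishers.
// cpcCents: what the advertiser pays for the click, in whole cents.
// shares:   publisher name -> fraction of the CPC that publisher retains.
// Whatever is not handed out stays with the publisher that owns the ad feed.
function splitClickRevenue(cpcCents, shares) {
  var payouts = {};
  var paidOut = 0;
  for (var name in shares) {
    if (Object.prototype.hasOwnProperty.call(shares, name)) {
      var amount = Math.round(cpcCents * shares[name]);
      payouts[name] = amount;
      paidOut += amount;
    }
  }
  payouts.feedOwner = cpcCents - paidOut;
  return payouts;
}

// A $1.00 click where the smaller publisher retains a 5% rev share.
var payouts = splitClickRevenue(100, { smallPublisher: 0.05 });
console.log(payouts.smallPublisher); // 5 cents
console.log(payouts.feedOwner);      // 95 cents
```

The split is applied per click, so every tier in the click stream earns its cut on each billed click.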

Click fraud, generally speaking, occurs when something (a person, a web bot, etc.) posing as a legitimate internet user follows (or clicks) a paid advertisement URL to the advertiser’s web site, generating money for some entity other than the advertiser.

An advertiser pays good money for advertising, expecting that a portion of the traffic received in return will generate revenue in some fashion. Non-legitimate visitors produce bad clicks which in effect spend advertiser dollars with no hope of a return for the advertiser. This expense is instead divvied up between the layers (rev share) of publishers that are likely to be present in the click stream. Publishers, especially the ones on the end of the chain, often have the most to gain from this practice and will devise all sorts of innovative ways to game the system. Larger publishers in the click stream will often ignore or downplay this activity, knowing that it lines their pockets in the process.

In short, advertisers are being robbed of their advertising dollars in inflated term-bidding marketplaces by traffic that poses as real, live, interested web site visitors. It is theft akin to diverting fractions of a penny from financial transactions to a private account.

See more information on Click Fraud and get the complete and detailed whitepaper.

Development

Rails Sessions Across Multiple Subdomains

May 26th, 2010

Okay, so I’m working on a new Rails project. Things are coming along great. Then we hit a snag where our SSL is not working as expected. We want it to work on Staging and Production only, and only for the actions that need it. So, the SslRequirement gem did the trick.

However, we have many sub-subdomains (with many more to come), which caused another dilemma. We have a wildcard SSL certificate, and although we could get one that also handles sub-subdomains, it’s not necessarily supported by the user’s browser. So, our other option was to put all the public stuff on the subdomains and have all the private stuff on a “private”.domain.com address, which would be adequately handled by SSL at the application and certificate levels. After some finagling, I managed to dynamically change the subdomain based on whether or not the requested action should be SSL’d.

Everything seemed to be humming along, but this new code snippet was relying on something that we hadn’t previously tested thoroughly… sessions. Sessions are just supposed to work, right? However, evidently they don’t work across subdomains by default. So, after some hunting around, this little snippet put into my “/config/[environment].rb” file did the trick.

config.action_controller.session = { :domain => ".[domain].com" }

Evidently, the leading dot tells Rails to set the session cookie for the whole domain, so the session is shared across anything within the main domain. You can also restrict it further by using “.[subdomain].[domain].com”.
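
Why this works: the session lives in a cookie, and the browser decides which hosts receive that cookie by comparing the request host against the cookie’s domain. The JavaScript sketch below only illustrates that matching rule (roughly the domain-match rule from RFC 6265). It is not Rails code, and the function name and the example.com hosts are made up for the illustration.

```javascript
// Hypothetical illustration of how a browser decides whether a cookie set
// for cookieDomain is sent along with a request to host.
function cookieIsSentTo(host, cookieDomain) {
  // Strip the leading dot before comparing.
  var domain = cookieDomain.replace(/^\./, '').toLowerCase();
  host = host.toLowerCase();

  // Exact match on the domain itself.
  if (host === domain) {
    return true;
  }

  // Otherwise the host must end with "." + domain, which is why
  // subdomains and sub-subdomains all qualify.
  return host.slice(-(domain.length + 1)) === '.' + domain;
}

console.log(cookieIsSentTo('private.example.com', '.example.com'));     // true
console.log(cookieIsSentTo('a.b.example.com', '.example.com'));         // true
console.log(cookieIsSentTo('www.example.com', '.private.example.com')); // false
```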

Works like a charm.

Ruby

PHP That Just Works

September 18th, 2009

I’m not one of those developers that likes to waste time setting up my dev environment. If I have a project to complete, I should be coding, not messing with config files, compiling Apache or messing with PHP to load that extension I just found out I needed. I like messing with my machine but not with an impending deadline.

With all this in mind, I’ve tried to simplify my entire dev environment over the years… not from a perspective of using simple tools and sticking to basics, but instead from a perspective of optimizing my workflow and keeping my development moving. In the middle of a project, I feel that systems admin focus should be on tweaking the production machines, rather than screwing around with my local dev box.

Always looking for ways to make development easier, I decided to give Zend Server CE (Community Edition) a try. The idea is that it installs PHP, Apache, MySQL (with phpMyAdmin) and a great management console that allows you to install extensions with just a click. You can still customize your Apache conf and other things, but it works well out of the box (I’m on a Mac). While you can run it alongside another Apache installation, I tweaked it to run on port 80 and handle multiple virtual hosts. While this may not be ideal for all teams, it can allow everyone to have the same environment without having to mess with poorly updated all-in-one dev environments.

One of the reasons this excites me, being on a Mac, is that every time in the past that I’ve updated my Mac OS, the install kills something on my system causing my development environment to go all wonky. Then I have to spend precious work time to fix it. The Zend Server CE install keeps everything nice and tidy and, to my knowledge, doesn’t rely on other stuff outside of the install to function (unless you are setting host entries in your hosts file, /etc/hosts on a Mac).

This oh-so-sweet environment gets a little better. While I still have a variety of development tools at my disposal, my main IDE has become Zend Studio (Eclipse). I know, I know… there are a lot of purists out there that say it’s too heavy, or sluggish, or isn’t simple enough. There are occasional bugs or things that annoy me, but at the end of the day it is integrated enough that it lets me get my work done. That’s the whole point of an IDE. It also integrates with the Flex Builder plugin, which is a plus for me.

Development, PHP, Reviews

Agility Futility

April 30th, 2009

Agile development methods have really taken the IT world by storm. In the last few years Agile has become THE way to manage and develop software, especially among young, emerging companies. It brings to the table a flexible model for communication and progress as well as a sense of anti-corporatism which is heavily embraced in many IT workplace cultures.

While this almost hippie-ish movement of peace, love and agileness has really relaxed a lot of work cultures and has been a boon for productivity and customer interaction, there are some often ignored pitfalls which eventually leave a work culture devastated and disillusioned.

Do it for the right reasons
It’s not enough to adopt Agile just because it works well for some or because you read about it on a trade blog. Agile, or any other hastily adopted process or methodology, cannot solve all your problems. It will simply make you more of what you already are. Your weaknesses, if not already apparent, will eventually surface and you must be ready and willing to acknowledge and address them.

Successfully adopting any methodology like this requires that you have an adequate paradigm about people, business and clients which instills respect and integrity and is in sync with the methodology. If your efforts are only surface-level rhetoric, and no paradigm shift occurs, the process will fail and you’ll be looking for the next “great thing” to fix your woes.

Use best practices
Because Agile lends itself to a more rapid pace of development, it can be easy to leave crucial parts of the SDLC out of the equation in the interest of time. Adequate quality assurance and testing are often the first to go. Test-driven development, which makes testing part of your development process, is a great way to minimize QA overhead while maintaining work quality. Building test code as, or before, you develop may add a little to your initial timeline but will result in fewer deployment panics and provide built-in specifications for your code to adhere to.

Don’t sacrifice quality
Cutting corners is a big no-no. Decide what are features and what are bugs. Determine which of them are in your critical path and develop them properly. If you can’t do them right, choose not to do them or arrange for more time to complete the project. NO ONE benefits from poorly thought out, shoddy work. Management only seems happy until they realize the problem they rushed you to fix ends up worse than before. It is the developer’s job to speak up and communicate risks and issues which then translate into proper timeline and feature negotiation.

Don’t ignore problems. Moderately plan for the future and proactively address problems and improvements through iterations. Ignored problems build up over time and eventually result in a complete rewrite. Iterative development can be your friend. Keep track of issues and slip some into each iteration so you can keep up with the change.

Be realistic
There is an old project management adage that every project wants three things: speed, low cost, and great quality. You can pick two. Having all three is a fantasy propagated by poor sales teams. This is because any improvement in any one or two of the factors will negatively affect the rest. For example, if speed is crucial, it will likely affect quality and cost. If very low cost is required, completion times will often be longer and quality will suffer. The only way to realistically improve one of the factors is to improve your effectiveness in all three of them. Attempting to use Agile development concepts to short-sightedly manipulate any of these factors is counter-productive.

Avoid burn-out
Finally, keep in mind that overworking your developers is counterproductive in an Agile model. With a more top-down, waterfall approach, you may get away with piling on extra hours, shoving more into a deadline and driving with a whip. Burn-out doesn’t make for solid code, good morale, communication and low turnover, all of which are factors behind a well-functioning Agile machine. Utilize iterations to drive realistic deadlines and continually reassess based on top priorities to keep everyone focused on the same goal.

Development, Technology

GE Brings Minority Report to Life?

March 9th, 2009

Okay, well not quite, but I thought this was pretty amazing. GE “brings good things to life”, almost literally.

Tom Cruise doing virtual computing

For those of you that have seen Minority Report, you know that people have been trying to recreate that type of computing model in the real world since the movie came out. Well, it doesn’t quite exist yet, but GE may be headed in the right direction.

As a way to draw interest to their Smart Grid energy technology, GE has created an interactive 3D experience that is pretty startling at first. It almost seems unreal… until you realize that it is actually interacting with you. Check out a video of the guys at doppelagent.de experiencing this first hand, although you will want to try it out for yourself.

Okay, so it’s not quite Minority Report level computing. However, with the live human 3D interaction inside of a virtual, yet real, space, all done in the comfort of your browser using Adobe Flash… this is quite amazing, nonetheless. I would love to see this kind of technology take off and be available in a browser… maybe we’re not far off.

Flex, Technology

Variable Conflicts in JavaScript

March 8th, 2009

It is quite common to find yourself with a heinous JavaScript error on a page that until recently seemed to work flawlessly. Perhaps you changed your JavaScript. Maybe you included a 3rd party script or a script from another domain onto your page. Now, everything that was once peachy has turned to sour grapes!

More than likely the problem is that with all the varying scripts on the page, variables from other functions will conflict with variables in the existing code, causing failures and errors, and even worse, overwrite variable values without any notification. It can take hours to track down variables that conflict between scripts before it finally works. Some developers figure that these kinds of issues are probably just inherent in client-side web development and use that as another “reason” as to why JavaScript is inferior.

I don’t think client-side development should be looked at as inherently quirky. Sure there are some browser nuances and environment issues that you can’t control, but you can develop very robust code that works well and adequately serves its purpose.

There are a few key things that you can do to make sure your code is clean and runs in its own scope.

First, anytime you create a variable in a function that should not be available outside of the function, use the “var” keyword to declare the variable and restrict it to the local scope.

var item_count = 20;
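
Here is a small sketch of the difference that keyword makes. The function names and the outer value of 5 are made up for the illustration; the outer item_count stands in for a variable that some other script on the page depends on.

```javascript
// Stands in for a variable that another script on the page relies on.
var item_count = 5;

// Declared with var: this item_count exists only inside the function.
function countWithVar() {
  var item_count = 20;
  return item_count;
}

// No var: the assignment reaches out and overwrites the outer variable.
function countWithoutVar() {
  item_count = 20;
  return item_count;
}

var local_result = countWithVar();
var outer_after_var = item_count;    // still 5, nothing leaked out

countWithoutVar();
var outer_after_no_var = item_count; // now 20, the other script's value is gone
```

Nothing warns you when the second function runs, which is exactly what makes this kind of conflict so hard to track down.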

Second, I would recommend putting much of your code into JavaScript prototype objects. The prototype method of creating objects is JavaScript’s way of creating a class-like object (although prototypes are quite different from actual classes; read up on JavaScript prototypes for more info). In short, they allow you to create a group of related functions that can share assets between prototype functions (method equivalents).

// start by creating your initial prototype function, like a constructor
function Calendar(month, year)
{
    this.month = month;
    this.year = year;
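    var date = new Date(); // "var" keeps date out of the global scope
    this.current_year = date.getFullYear(); // getYear() is deprecated and returns the year minus 1900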
}

// create functions that inherit characteristics of the prototype Calendar
Calendar.prototype.display_Month = function()
{
    var max_days_in_month = 31;
   
    // ... continue body of function
}

// reference variables from the prototype using "this"
Calendar.prototype.display_Week = function()
{
    if(this.year < this.current_year)
    {
        // return error of some kind
    }
    // ... continue body of function
}

Later in your code you can instantiate one or more objects from the prototype, each having its own scope and assets. This will keep them from conflicting with any other code.

var cal_1 = new Calendar(5, 2005);

var cal_2 = new Calendar(3, 2006);
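
To see that the two instances really are independent, here is a short sketch. It uses a cut-down stand-in for the Calendar constructor so that it runs on its own, and the label() method is a hypothetical addition for the demonstration.

```javascript
// Cut-down stand-in for the Calendar prototype (assumed, for brevity).
function Calendar(month, year) {
  this.month = month;
  this.year = year;
}

// Hypothetical method, defined once and shared by every instance.
Calendar.prototype.label = function () {
  return this.month + '/' + this.year;
};

var cal_1 = new Calendar(5, 2005);
var cal_2 = new Calendar(3, 2006);

cal_1.month = 6;            // changing one instance...
console.log(cal_1.label()); // 6/2005
console.log(cal_2.label()); // 3/2006, ...leaves the other untouched
```

Each object carries its own month and year, while the method itself lives on the prototype, so nothing here depends on loose variables that another script could overwrite.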

If you abstract your code well enough, using the power of JavaScript in this manner allows you to create very reusable code that can be used in any application with any combination of JavaScript without problems. There are other things you can do to abstract your JavaScript and make it more functional, but these examples serve the purpose of resolving scope issues and get you on the road to cleaner, reusable code.

Development, JavaScript

PHP Optimization

February 12th, 2009

Many developers don’t really put much thought into code optimization. Frankly, their applications don’t see enough traffic for optimization to be much of an issue. However, regardless of your application’s actual needs and whether or not you are having speed issues, there are some good habits that you can develop that will either help you in a bind or just ensure that all of your applications are finely tuned machines.

Optimization can be frustrating at times. I am quite familiar with server systems, but I do not consider myself a full-fledged systems admin. The more I learn, the more I realize I don’t know. I do know that finding the right balance between server and code optimization takes skill. Too much customization in either direction can make the code or server difficult to manage. Sometimes, throwing more hardware at the problem can do the trick, but this is usually just temporary as the problem usually multiplies itself by the number of servers your application is running on. Conversely, good optimization improves performance across all the servers the application is running on.

Typically, I figure that if a particular change makes something overly difficult to manage, then it probably is not worth doing, because there are usually other people involved and there is too much room for mistakes. I will sometimes break this rule for my own personal stuff since I am the only one involved. You need to decide at what point it becomes too difficult. Good documentation goes a long way.

Here are some of the things (using PHP as the example) that I will often do to optimize things at the software level, without going too far.

  • It’s good practice to use single-quoted (literal) strings wherever possible. Double quotes tell the parser to look for variables and escape sequences to interpolate in the string, slowing down processing just a bit. This can add up with HTML-intensive applications.
  • Keep code files from getting too large. If two chunks of code are rarely used together split them into separate files so that PHP doesn’t have to load more code than necessary.
  • Keep file inclusions to a minimum. Inclusions require additional disk reads and add more time to processing. Don’t go overboard and sacrifice code organization in the process… includes can be your friend. If the include makes sense, do it. MVC development often ignores this because of the nature of the methodology, but it is still a good practice to keep in mind whether using MVC or not.
  • Try to convert any uses of division to multiplication. Division eats up processing, especially if iterated several times. Example: instead of $var / 10 do $var * 0.1
  • Utilize break or continue to control code flow in loops. If you review enough of your code, there are probably some areas where you are using a for() or foreach() that runs beyond its necessary iterations. In other words, you run the loop to accomplish a certain task or find a value, but the loop continues to run through all possible iterations, even after the task is complete or the value is found. Either find ways to use while() or use break or continue where possible.
  • Sometimes a caching application can help ease unnecessary processing. Turck MMCache or APC can dramatically improve PHP processing speeds by keeping realtime code compiling to a minimum.
  • Other types of caching can be done using Memcached or other similar code. This type of caching can cache files, database results, large data objects, etc… If there is any data that requires processing that doesn’t change much between pages or between users, this type of caching can drastically speed up response times. This is not only helpful for bypassing unnecessary processing, but it can also limit your application’s need to hit the database. Memcached can be used on many levels of the application to reduce processing and web service request overhead. For example, a list of users may not change very often, so there isn’t a need to retrieve a fresh list upon every request. You could cache the list for 2 hours, for instance, and your application would only have to query the database for the list once every 2 hours, instead of every page load.
  • Only use SSL where necessary. SSL encryption slows the response of your application.
  • A browser will hold on to the connection with the webserver as long as it is waiting to receive data, hence requiring the webserver, in this case Apache, to stand by until everything is processed and ready to send. In some cases, your data may not require the user to review anything afterward, so it may be a good thing to consider forking your code into multiple threads. This can be tricky, but if done properly, Apache will respond to the user faster, leaving connections open for other users and allowing the code to finish in its own time. There are also functions in PHP, such as connection_aborted(), that allow you to check if the user’s browser is still responding to the connection and close the connection if needed.
  • When a database is involved, use good SQL and data handling processes. I won’t go into a lot of SQL specifics here, but the following are things to consider:
    • Only request the data you need from the database. The more data that is requested, the more that has to be sent across the wire and get parsed by the application.
    • When your application is done processing a large data set, it should release the result set to free up memory for other operations. If only parts of the data are needed for processing later in the request, those parts of the data can be copied to another data object while the original object is cleared. Most applications don’t work with large enough data sets to have this concern, but when you do, you’ll find that automatic garbage collection won’t be enough.
    • Most modern databases will allow you to combine multiple operations into a single SQL request. Taking advantage of this can GREATLY minimize your application’s overhead of going back and forth to the database. For example, you can combine multiple SELECT queries using UNIONs. You can do conditional INSERTs when you might normally do a SELECT, check the logic in your code, then do an INSERT. Take advantage of subqueries. Also, in some cases it can be beneficial to do a few large queries early in your application and avoid the several smaller ones that might be required later.
    • Take advantage of query caching and preparing if your database and code support this functionality.
    • When possible, have the database be on the same machine as the application. This introduces some scalability challenges (to be discussed at another time), but can be worth the effort, even with large, distributed applications. This will keep your code-to-database processing times VERY fast.
  • Keep what you store in sessions to a minimum. You can also use Memcached as a custom session handler between servers to replace database sessions across multiple servers, for improved performance.
  • When possible, “minify” your JavaScript, CSS and HTML. Keep the output to a minimum and write your code so that the browser will take optimal advantage of cached CSS, JavaScript and HTML.
  • PHP is an interpreted scripting language. When it comes down to it, it can only move so fast. For larger applications that do a lot of processing, it can greatly decrease the system load to offload major processing to a program written using a compiled language such as C, C++, or yes, even Java. You can build an RPC interface to the compiled app using JSON, SOAP or XML-RPC. Service communication may take some overhead, but the compiled app will more than make up for it. Using this method allows you to keep your interface code flexible using PHP while gradually putting any labor-intensive operations on to something more suitable.
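
The two-hour user list example from the caching bullet above can be sketched as follows. This is JavaScript with a plain in-memory object standing in for Memcached, so that the sketch stays self-contained; getCached, loadUsers, the sample user names and the clock passed in as a parameter are hypothetical names for the illustration, not Memcached or PHP APIs.

```javascript
var TWO_HOURS_MS = 2 * 60 * 60 * 1000;

// In-memory stand-in for a cache server: key -> { value, expires }.
var cache = {};

// Return the cached value for key, or call loader() and cache its result
// for ttlMs milliseconds. "now" is passed in so the behavior is easy to test.
function getCached(key, ttlMs, loader, now) {
  var entry = cache[key];
  if (entry && entry.expires > now) {
    return entry.value; // still fresh, skip the expensive work
  }
  var value = loader();
  cache[key] = { value: value, expires: now + ttlMs };
  return value;
}

// Stand-in for the database query; counts how often it actually runs.
var db_hits = 0;
function loadUsers() {
  db_hits += 1;
  return ['alice', 'bob'];
}

var start = Date.now();
getCached('users', TWO_HOURS_MS, loadUsers, start);                    // miss, queries
getCached('users', TWO_HOURS_MS, loadUsers, start + 60 * 1000);        // hit, no query
getCached('users', TWO_HOURS_MS, loadUsers, start + TWO_HOURS_MS + 1); // expired, queries
console.log(db_hits); // 2
```

The same shape carries over to PHP with a real cache behind it: check the cache first, fall back to the database only on a miss, and store the result with an expiry.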

Doing any or all of these things will benefit your applications greatly and still keep your code manageable. This doesn’t address all of the innovative things you can do on the UI and client-side of an application to improve performance. Of course there is always more you can do, but this should get you started.

Development, PHP

PHP Error Handling

January 5th, 2009

For many developers, error handling is somewhat of a myth. They’ve heard about it but with pressing deadlines and management ignorance they are often not given ample opportunity to learn about, let alone implement, proper error handling and debugging techniques.

So, like many things, there are many ways to skin this cat. When working with debugging and error handling, I usually keep the following things in mind:

  • How to trap errors and debug messages
  • How to collect messages
  • How to output messages without disrupting the display
  • How to integrate throughout the entire application

One of the most important things to keep in mind with debugging is that OOP is your friend. I won’t get into the pros and cons of OOP here, but just mention that keeping things in classes allows you to abstract your debugging, handle it cleanly and keep it specific to the needs of the class. With this in mind, my following recommendations will be based on this premise.

How to Trap Errors and Debug Messages

Trapping code problems is relatively straightforward. You want to do some kind of test or comparison to determine if there is a problem. Then you determine how to get the message that adequately explains the problem and make it available to the rest of your program. Sometimes a simple if/else block will do the trick to test values that would not otherwise error out, but may need to be set for certain logic to function properly. PHP 5+ provides exception handling using try/catch which is very useful for catching errors that would normally kill your script, allowing you to then handle it accordingly and capture the error message. For example:

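try
{
    // file() returns false (with a warning) on failure instead of throwing,
    // so check the result and raise the exception ourselves
    $lines = @file('my_file.txt');

    if($lines === false)
    {
        throw new Exception('Could not read my_file.txt');
    }
}
catch(Exception $e)
{
    $error = $e->getMessage();

    // do something here to compensate for the error
}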

You can also create your own exception class that allows you to automate specific actions when certain errors occur, like logging to a file, emailing the admin, etc. You can find more information about this at http://www.php.net/manual/en/language.exceptions.php

How to Collect Messages

The first thing you’ll want to do is determine a standard method of collecting messages that you can use in all your applications and classes. You can either write a separate class to do this and extend it in your other classes, or you can create methods within each class that handle things specific to the class’s needs. In my example, I collect the messages local to the class.

You can do the following:

class My_Class
{
    private $last_message = '';
    private $message_array = array();
    private $error_array = array();
    private $debug = false;

    function __construct($debug)
    {
        $this->debug = $debug;
    }

    private function add_Message($message)
    {
        $this->last_message = $message;
        $this->message_array[] = $message;

        return true;
    }

    private function add_Error($error)
    {
        if($this->debug)
        {
            $this->error_array[] = $error;
        }

        return true;
    }

    public function get_Last_Message()
    {
        return $this->last_message;
    }

    public function get_Message_Array()
    {
        return $this->message_array;
    }

    public function get_Error_Array()
    {
        return $this->error_array;
    }

    public function another_Method()
    {
        // some code goes here
       
        if($problem_found)
        {
            $this->add_Message('This did not work.');
            $this->add_Error('Very complex error for only me to see.');
        }

        // continue method
    }

    public function yet_Another_Method()
    {
        // some code goes here

        try
        {
            // some code that could fail
        }
        catch(Exception $e)
        {
            $this->add_Error('Another very complex error message. ' . $e);
        }
    }
}

This allows you to add a set of methods to a class that help to control messaging as needed. The methods add_Message() and add_Error() are always called if there is a problem, however, error messages are only collected if $debug is passed in as ‘true’ into the constructor. This allows your user to see friendly messages and react as needed, but keeps horrid errors from displaying all over the screen.

Implementation of the class would look like this:

$debug = true;

$my_instance = new My_Class($debug);
$my_instance->another_Method();
$my_instance->yet_Another_Method();

// important message for user to see
echo $my_instance->get_Last_Message();

// if you need to show the user the entire list of user friendly messages
echo implode('<br />', $my_instance->get_Message_Array());

// output debug and error info and set to string to be handled later
if($debug)
{
    $msg_output = implode('<br />', $my_instance->get_Error_Array());
}

This example, however, is somewhat limiting as it only collects debug data within the class itself. This may be useful for a really low-level class like a database abstraction layer or some other core application class where you cannot guarantee that assets like a database connection or session will be available to do more advanced debug logging or output. In this scenario, you can abstract your debug collection methods to their own class or have them present in each class and build methods to pass the debug around so it will “bubble to the top” of your application where you can display it and make other logical decisions.

In cases where you are dealing with more advanced classes or you know that a database connection and/or session is available, you have many other options. I use the method above for my core classes, otherwise I use a session and database dependent static class that allows me to make all kinds of debug calls throughout my application. Because they can be stored in the session, and optionally the database, I can access them for display when needed but also track recurring issues across sessions. Combined with custom exception handlers, I am also able to respond to different levels of errors appropriately and send notifications as needed.

How to Output Messages Without Disrupting the Display

Now that you can trap the errors and collect them, you need to properly output them so that they work well with your display elements. Perhaps you only want to show the user the user friendly messages and output actual errors elsewhere if you are in debug mode. I usually do two things to accomplish this properly:

  • Never echo or print ANYTHING unless you are in a method or class whose job is to output something for display.
  • Collect the display content until all necessary logic has been processed and it is safe to determine whether or not it should be displayed.

Doing these two things alone can make your applications operate much more smoothly. Here are two examples: a good way and a bad way to display page content.

// GOOD WAY
$html = '';

$html .= '
    This is some page HTML and content.
';

$html .= '
    This is more page content.
';

// processing is done, echo content
echo $html;



// BAD WAY
echo 'This is some page HTML and content.';

echo 'This is more page content.';

In the BAD WAY, if an error happens between the two echos, it’s too late to properly respond. Using the GOOD WAY, if an error happens, the user display can be changed or even redirected to a new page before anything is displayed to the user. Something else to keep in mind is that if you are coding a display method that is used somewhere deep in your program, you may want to consider returning a string rather than echoing out at the end of the method. This way you can still give control to the top level display methods and echo it out when you are ready.
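
The point about returning a string from a deep display method can be sketched as follows. The function name `render_Product_List` and the sample product names are hypothetical; the only idea taken from the text above is that the method returns its markup and the top-level code decides when to echo it.

```php
<?php
// Hypothetical display helper: builds markup and returns it instead of echoing.
function render_Product_List(array $products)
{
    $html = '<ul>';

    foreach($products as $name)
    {
        // escape values before placing them in HTML
        $html .= '<li>' . htmlspecialchars($name, ENT_QUOTES, 'UTF-8') . '</li>';
    }

    $html .= '</ul>';

    return $html;
}

// top-level display code decides when (and whether) to output
$page = '';
$page .= render_Product_List(array('Widget', 'Gadget & Co'));

// processing is done, echo content
echo $page;
```

If an error had been collected before the final echo, the top-level code could still discard `$page` and show an error page or redirect instead.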

How to Integrate Throughout the Entire Application

Now you are capturing debug and error messages, you have them integrated into your class, and you are outputting the info at the right time. However, your application is probably more than one class, so how do you integrate this into a large application?

If all of your classes are handling errors and messages like the examples above, you can pass the arrays of messages up through the class hierarchy and merge arrays of messages, so that the final display class can handle them properly. It’s ideal to also pass the $debug parameter down into all your classes so that as they extend or instantiate each other, everything has the same debug mode, either on or off. Example:

class New_Class
{
    private $last_message = '';
    private $message_array = array();
    private $error_array = array();
    private $debug = false;

    public function __construct($debug)
    {
        $this->debug = $debug;
    }

    public function some_Method()
    {
        // some code goes here
       
        // instantiate needed class with debug mode from this class    
        $m_class = new My_Class($this->debug);
        $m_class->another_Method();

        // collect messages and errors from $m_class
        $this->message_array = array_merge($this->message_array, $m_class->get_Message_Array());
        $this->error_array = array_merge($this->error_array, $m_class->get_Error_Array());

        // continue method
    }

    private function add_Message($message)
    {
        $this->last_message = $message;
        $this->message_array[] = $message;

        return true;
    }

    private function add_Error($error)
    {
        if($this->debug)
        {
            $this->error_array[] = $error;
        }

        return true;
    }

    public function get_Last_Message()
    {
        return $this->last_message;
    }

    public function get_Message_Array()
    {
        return $this->message_array;
    }

    public function get_Error_Array()
    {
        return $this->error_array;
    }

}

Obviously, there are several ways to handle these kinds of things. As long as you are consistent with all your code and provide a simple way for you to capture errors, you’ll have a great start… and your process will grow and mature with your application requirements. These concepts only scratch the surface, as they can be abstracted into a very dynamic and robust debugging framework with a lot of flexibility… but that is a topic for another time. Regardless of how you do it, if you put some good thought into it, you’ll save yourself a lot of time and headache down the road.

Development, PHP

Handling Foreign HTTP Variables

December 19th, 2008

I have been asked several times about various ways to secure e-commerce applications and other systems from unexpected or badly formatted POST/GET variables. This is a common issue as many developers only develop for the expected and test their application accordingly. Developing for the unexpected can be a bit tricky.

I’ll try to address this issue generically enough that it can be useful to anyone with a similar problem. I’ll show examples using PHP; however, the same process can be implemented in much the same way in any comparable language.

Scenario: You have a small shopping cart that you built for a client and have recently found out that one of the client’s customers had found a way to submit incorrect data to the shopping cart by saving the HTML form page to his computer, then modifying some of the values, and sending the form from his computer. This allowed him to add more options to his product and also lower the price at the same time. Your system didn’t check for anything like that, so you didn’t find it until last week, a whole month later, when you noticed the purchase price in the database was drastically different.

There are a few things you can do to safeguard your application against this kind of problem.

First, you need to make sure that the request comes from a page on the same server as the application receiving the request by comparing the referring domain. You can even go to the level of checking the specific page it came from as well.

Example of domain validation in PHP with Apache:

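// regex match to extract the referrer host, e.g. "www.domain.com"
// (the referrer lives in $_SERVER, and the pattern uses # as its delimiter so "//" is legal)
$referer = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
$matched = preg_match('#^https?://([a-zA-Z0-9\-\.]+)#', $referer, $matches);

// check if referrer domain is the same as local
if($matched === 1 && $matches[1] == $_SERVER['SERVER_NAME'])
{
    // continue processing
}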

The previous example verifies that the referring domain IS actually the application domain. You can’t just check whether the server name appears somewhere in the referrer, as that can be fooled. If your server name is yourdomain.com and the referrer is http://www.mydomain.com/yourdomain.com/hack_you.php, the following check would let the referrer pass and accept invalid data.

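// INSECURE: only checks that the server name appears somewhere in the referrer
if(isset($_SERVER['HTTP_REFERER']) && strpos($_SERVER['HTTP_REFERER'], $_SERVER['SERVER_NAME']) !== false)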
{
    // continue processing
}
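
As an alternative to a hand-written regex, the host comparison could be sketched with PHP's built-in `parse_url()`. The helper name `referer_Host_Matches` and the sample URLs are hypothetical. In a real request you would pass `$_SERVER['HTTP_REFERER']` and `$_SERVER['SERVER_NAME']`. Keep in mind that the referrer is supplied by the client, so this is a first filter and not a guarantee.

```php
<?php
// Hypothetical helper: true only when the referrer's host equals the expected host.
function referer_Host_Matches($referer, $server_name)
{
    // parse_url() returns null when the URL has no host, or false when it cannot be parsed
    $host = parse_url($referer, PHP_URL_HOST);

    if(!is_string($host))
    {
        return false;
    }

    // host names are case-insensitive
    return strcasecmp($host, $server_name) === 0;
}

$spoofed = 'http://www.mydomain.com/yourdomain.com/hack_you.php';
$genuine = 'http://yourdomain.com/cart.php';

$spoofed_ok = referer_Host_Matches($spoofed, 'yourdomain.com');
$genuine_ok = referer_Host_Matches($genuine, 'yourdomain.com');
```

Here the spoofed referrer is rejected because only the host portion is compared; the server name appearing later in the path has no effect.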

Second, it’s bad practice to pass anything as a form field value if you can reference it by a database id instead. In other words, if a price is associated with a product in a database table, don’t pass the price of $10.00; pass the product id (say 342) and use that id in your code to look up the price and any other information you need from the database.
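
A small sketch of the id-based lookup follows. To keep it self-contained, a plain array stands in for the products table; the array contents (other than id 342 and the $10.00 price), the helper name `get_Price_For_Product`, and the simulated form input are all assumptions made for illustration. In a real cart the lookup would be a database query.

```php
<?php
// Stand-in for a products table; in a real cart this would be a database query.
$product_table = array(
    342 => array('name' => 'Blue Widget', 'price' => '10.00'),
    549 => array('name' => 'Red Widget',  'price' => '14.50'),
);

// Hypothetical helper: look up the price server-side from the submitted id.
function get_Price_For_Product(array $table, $product_id)
{
    // accept only whole-number ids
    $id = filter_var($product_id, FILTER_VALIDATE_INT);

    if($id === false || !isset($table[$id]))
    {
        return null; // unknown or malformed id
    }

    return $table[$id]['price'];
}

// simulated form input: the id is used only as a lookup key,
// and any submitted price is ignored
$submitted = array('product_id' => '342', 'price' => '0.01');
$price = get_Price_For_Product($product_table, $submitted['product_id']);
```

Even though the simulated request carries a tampered price, the value used by the application comes from the table.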

It’s good practice in general to do whatever you can to never expose data that could be harmful if modified. Along with the previous price scenario, you want to keep user data, database config data, product information, etc… out of sessions, cookies, form fields, etc… Use id’s and other types of identifiers to reference the data in your code.

Now, it’s important to understand that anything is hackable if the attacker has enough time and resources. There is always the possibility that someone could not only spoof your domain but, with enough attention to detail, learn the ids and other values that you are passing and find ways to send or modify them in transit or while spoofing the domain. If your database table structure is set up correctly and you have guarded against SQL injection (which is a whole separate topic), the damage is limited: if id 342 is being passed and they change it to 549, your program queries 549 and still finds the correct price and associated information for that product. In contrast, if you pass multiple variables in association with one product, changing just one of the values could completely change what your program expects to see.

Third, consider creating some tight restrictions on the incoming variables. The first way to do this is to not use the $_REQUEST array, as this allows either incoming POST or GET values. If your form uses POST or GET, you should use the corresponding $_POST or $_GET arrays to access the data. This will limit the ways a hacker can send values into your application as well as lessen the amount of validation you need to do.

In addition to using the proper incoming array, you can create a variable register that limits the values you expect on that particular page. For example, if you are expecting to see “product_number” and “product_id” in $_POST, but your application also receives a “price”, you can be sure that someone is sending unexpected values to your application. You can either set these values to null, or set flags or logs in your system to notify you that someone may be trying to manipulate your application.

Checking for unwanted variables in PHP:

$allowed_vars = array('product_number', 'product_id');

foreach($_POST as $key => $value)
{
    // if the key is not in the list of allowed keys, delete it
    if(!in_array($key, $allowed_vars))
    {
        // be careful not to delete vars that may be automatically sent by PHP or Apache
        unset($_POST[$key]);
    }
}

The last thing you’ll want to do with your most important incoming values is verify that they fall within acceptable limits. If someone is registering for something and receives free points or bonuses, make sure that any variables carrying these values are checked against minimums and maximums, etc… The same can be applied to various other types of values. Just be smart and validate the things that you know are crucial to the functionality of your application.
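
One way to sketch a minimum/maximum check is with PHP's `filter_var()` and its `FILTER_VALIDATE_INT` range options. The helper name `validate_Int_Range` and the 0 to 100 bonus limits are assumptions for illustration; pick limits that match your own application.

```php
<?php
// Hypothetical helper: returns the value as an int when it is a whole number
// within [$min, $max], or null otherwise.
function validate_Int_Range($value, $min, $max)
{
    $result = filter_var($value, FILTER_VALIDATE_INT, array(
        'options' => array('min_range' => $min, 'max_range' => $max),
    ));

    return ($result === false) ? null : $result;
}

// simulated incoming values for a "bonus points" field
$bonus_ok  = validate_Int_Range('50', 0, 100);
$bonus_bad = validate_Int_Range('5000', 0, 100);
$bonus_nan = validate_Int_Range('fifty', 0, 100);
```

Anything that comes back as null can be rejected, replaced with a safe default, or logged as a possible manipulation attempt.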

Development, PHP, Ruby

Smarty For Dummies

September 5th, 2008

Are Smarty Templates really that useful? Well, I have my opinions, but decide for yourself. I hope you’ll do a double take after reading this.

For a long time now, I’ve avoided using Smarty Templates as much as possible. Perhaps it’s the cheesy assuming name, or perhaps it’s that I like to keep my applications simple to deploy and free of unneeded dependencies. More than anything, I think it’s because templating has been a solved problem for me for some time now and I didn’t have the desire to fix what was not broken. 

A while back, I had the opportunity (tongue in cheek) to use Smarty Templates for an existing project I was working on, and I was able to evaluate its functionality, its proposed NEW paradigm and how it integrates into the development process. By no means am I a Smarty expert, and I don’t know everything about its benefits, but for better or for worse, I have formed an opinion that seems to adequately describe the “reality” of Smarty Templates. They pretty much work about how I expected. No surprises there, but here are some things to think about.

  • Smarty Templates are simply one way of implementing a template based system into a website. It adds another layer of processing to your development. It has the benefit of caching parts of the design so that processing is lessened.
  • The main idea behind any templating system or CMS is to provide separation between the various layers of development: design, site structure, basic logic, core libraries, etc… Some systems provide some basic separation and others provide many layers of separation. Smarty Templates aim to resolve a lot of the problems that come about with having multiple roles working on a site: designer, interactive developer, programmer, etc… Separation of these roles can often be difficult. I think much of this comes down to a core problem with the PHP community as a whole. PHP has become a very popular language but has also been adopted by a lot of people who are not programmers by trade. Thus, the level of standards and development experience is not the same as what you might find in other web or application language communities. Smarty Templates, although seen by many as a godsend, seem to be a temporary solution for a community that for the most part lacks the structure and standardization to solve the problem in a more appropriate way.
  • Smarty Templates allow you to build backend code, allow designers to build templates, and then allow the developers to hodge-podge them all together with a series of inclusions. This, like much of the PHP code I see on the web, lends itself to disastrous application structures and does not enforce a paradigm. I’m all for flexibility, but all Smarty Templates have done is add another arbitrary level of confusion on top of what is usually already messy code.
  • I understand that Smarty Templates is supposed to shelter core logic from the designer, allowing them to use a “templating language” to create the display. However, the Smarty language itself uses programming methodologies to display data. So, not only do you have programming on top of programming, but the designer still has a way to destroy the interface from lack of knowledge, understanding and perspective. If the interface looks wrong or breaks in some fashion, where do you go to fix the problem?
  • Smarty Templates were obviously created from a developer’s perspective. I am a developer with a design background, and I have to say that, with a good understanding of both perspectives, the lines between design and development are blurred with Smarty Templates. A developer can simply develop less logic and assume that the designer will take care of the rest with the variables he has access to, or a developer could limit the designer by only making available the core necessary values. The point is that this line is blurry and just adds yet another level of abstraction in which internal standards must be enforced. Not so ironically, although developers have told me that they use Smarty Templates to allow the site to scale and to allow the designers to maintain the templates, I have never seen a situation where the roles were separated. The developer creates the PHP, then the same developer has to go in and modify the templates, bypassing the whole proposed benefit of using Smarty Templates in the first place.
  • With some more forethought, developers can create very structured and modular code, that with the proper API’s and CSS integration, can give the designer all the control necessary while still protecting data. Ideally, you would have a robust CMS, that allows your designers to have control over certain components, while still being able to deploy your core logic and modules. Then, for areas of the site that need more flexibility, you could use Smarty Templates within the CMS for those designers with a bit more technical skill and allow them to control sections of the page with Smarty Templates, not the entire site structure.
  • Although Smarty Templates can be useful if standards and structure are enforced, they seem to be just another bandwagon thing that gets adopted out of pure social acceptance before the needs and resources of the project are adequately assessed.

I know that some of you may LOVE Smarty Templates and I understand that it may play a crucial role in your development, but please keep in mind that I am always ready and willing to use technologies that TRULY benefit my development (I’ve used Smarty on blogs for instance). This isn’t the place for flaming, but perhaps you can share something that I have missed. 

If I think really hard about what my needs are as a developer, Smarty kinda makes sense. The problem is that Smarty Templates seem to be the result of a desperate developer who’s fed up with designers messing up his apps, and/or who simply hates HTML (or all of the presentation layer for that matter) and doesn’t want to do anything but pump out logic. I don’t see how Smarty Templates benefit the designer or the project management process. I know, I know… there are all the arguments about separation of presentation, from logic, from data, etc… blah, blah, blah. Smarty is no more a clean separation of these layers than adding more frivolous layers of management to an organization is a way to make it more effective. Smarty attempts to make the separation, but it’s not clean. The designer’s job should be completely separate from any kind of backend logic, allowing them to focus on HTML, graphics, CSS, etc… Smarty Templates don’t enforce this separation, as HTML can be done by the designer, or a Smarty variable can contain HTML. So where is this separation they speak of? The second a developer includes HTML in any Smarty variable, the notion that the designer has full control over the UI goes right out the window and all you’re left with is another blurry layer of logic… logic that cannot adequately communicate errors or debug info back to the backend logic.

Ideally, if you want a cleaner separation of code and design, consider using or creating an MVC style framework which more strictly enforces this separation and allows for constrained usage of Smarty Templates or similar templating code. Zend Framework, CodeIgniter and CakePHP are a few worth looking at.

I’m not trying to deter people from using Smarty Templates. I want to make it clear that I am simply urging you to use them as a tool to aid and not replace proper development. I see the value that Smarty Templates has the potential to bring to the table and will probably continue to use it for some things in my development. However, in the meantime, I am jumping off the wagon until it becomes something that I cannot live without.

PHP


