Many developers don’t really put much thought into code optimization. Frankly, their applications don’t see enough traffic for optimization to be much of an issue. However, regardless of your application’s actual needs and whether or not you are having speed issues, there are some good habits that you can develop that will either help you in a bind or just ensure that all of your applications are finely tuned machines.
Optimization can be frustrating at times. I am quite familiar with server systems, but I do not consider myself a full-fledged systems admin. The more I learn, the more I realize I don’t know. I do know that finding the right balance between server and code optimization takes skill. Too much customization in either direction can make the code or the server difficult to manage. Sometimes throwing more hardware at the problem does the trick, but the relief is usually temporary, because an inefficiency in the code is repeated on every server the application runs on. Conversely, good optimization improves performance across all of those servers.
Typically, I figure that if a particular change makes something overly difficult to manage, then it probably is not worth doing, because there are usually other people involved and there is too much room for mistakes. I will sometimes break this rule for my own personal projects since I am the only one involved. You need to decide at what point it becomes too difficult. Good documentation goes a long way.
Here are some of the things (using PHP as the example) that I will often do to optimize things at the software level, without going too far.
- It’s good practice to use single-quoted (literal) strings wherever possible. Double quotes tell the parser to look for variables and escape sequences inside the string, which slows processing just a bit. The difference per string is small, but it can add up in HTML-intensive applications.
- Keep code files from getting too large. If two chunks of code are rarely used together, split them into separate files so that PHP doesn’t have to load more code than necessary.
- Keep file inclusions to a minimum. Inclusions require additional disk reads and add processing time. Don’t go overboard and sacrifice code organization in the process… includes can be your friend. If the include makes sense, do it. MVC development often ignores this because of the nature of the methodology, but it is still a good practice to keep in mind whether using MVC or not.
- Try to convert division to multiplication where you can. Division costs more than multiplication, which matters most inside loops that run many times. Example: instead of $var / 10 do $var * 0.1. Keep in mind that the two forms can differ slightly in their floating-point results.
- Use break or continue to control the flow of your loops. If you review enough of your code, you will probably find a for() or foreach() that runs beyond its necessary iterations. In other words, the loop exists to accomplish a certain task or find a certain value, but it keeps running through every remaining iteration even after that is done. Either restructure the loop as a while() or use break or continue where possible.
- Sometimes an opcode cache can eliminate unnecessary processing. Turck MMCache or APC can dramatically improve PHP processing speeds by keeping compiled code in memory, so that scripts are not recompiled on every request.
- Other types of caching can be done using Memcached or similar tools. This type of caching can store files, database results, large data objects, etc… If there is any data that requires processing but doesn’t change much between pages or between users, this type of caching can drastically speed up response times. This is not only helpful for bypassing unnecessary processing, but it can also limit your application’s need to hit the database. Memcached can be used on many levels of the application to reduce processing and web service request overhead. For example, a list of users may not change very often, so there isn’t a need to retrieve a fresh list upon every request. You could cache the list for 2 hours, for instance, and your application would only have to query the database for the list once every 2 hours, instead of on every page load.
- Only use SSL where necessary. The handshake and encryption add overhead that slows your application’s responses.
- A browser holds on to its connection with the web server for as long as it is waiting to receive data, which requires the web server (in this case Apache) to stand by until everything is processed and ready to send. In some cases the user doesn’t need to see the result of the remaining work, so it may be worth forking that work into a separate process. This can be tricky, but if done properly, Apache will respond to the user faster, freeing connections for other users and allowing the code to finish in its own time. PHP also has functions, such as connection_aborted() and connection_status(), that let you check whether the user’s browser is still connected so that you can stop work early if needed.
- When a database is involved, use good SQL and data handling processes. I won’t go into a lot of SQL specifics here, but the following are things to consider:
  - Only request the data you need from the database. The more data that is requested, the more that has to be sent across the wire and parsed by the application.
  - When your application is done processing a large data set, it should release the result set to free up memory for other operations. If only parts of the data are needed later in the request, those parts can be copied to another data object while the original object is cleared. Most applications don’t work with large enough data sets to have this concern, but when you do, you’ll find that automatic garbage collection won’t be enough.
  - Most modern databases will allow you to combine multiple operations into a single SQL request. Taking advantage of this can GREATLY reduce the overhead of going back and forth to the database. For example, you can combine multiple SELECT queries using UNIONs. You can do a conditional INSERT where you might normally do a SELECT, check the result in your code, then do an INSERT. Take advantage of subqueries. Also, in some cases it can be beneficial to do a few large queries early in your application and avoid the several smaller ones that might be required later.
  - Take advantage of query caching and prepared statements if your database and code support them.
  - When possible, have the database be on the same machine as the application. This introduces some scalability challenges (to be discussed at another time), but can be worth the effort, even with large, distributed applications. This will keep your code-to-database processing times VERY fast.
- Keep what you store in sessions to a minimum. You can also use Memcached as a custom session handler to replace database sessions across multiple servers, for improved performance.
- PHP is an interpreted scripting language. When it comes down to it, it can only move so fast. For larger applications that do a lot of processing, you can greatly decrease the system load by offloading major processing to a program written in a compiled language such as C, C++, or yes, even Java. You can build an RPC interface to the compiled app using JSON, SOAP or XML-RPC. Service communication adds some overhead, but the compiled app will more than make up for it. Using this method allows you to keep your interface code flexible using PHP while gradually moving any labor-intensive operations onto something more suitable.
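To make the loop advice above concrete, here is a minimal sketch of break and continue. The function names and the array shapes (a list of users with an email key, a list of accounts with active and balance keys) are made up for illustration.

```php
<?php
// Hypothetical helper: returns the position of the first user with the
// given email in a plain list, or null if nobody matches.
function findUserIndex(array $users, string $email): ?int
{
    $found = null;
    foreach ($users as $index => $user) {
        if ($user['email'] === $email) {
            $found = $index;
            break; // the task is complete, so stop iterating
        }
    }
    return $found;
}

// Hypothetical helper: totals the balances of active accounts only.
function sumActiveBalances(array $accounts): float
{
    $total = 0.0;
    foreach ($accounts as $account) {
        if (!$account['active']) {
            continue; // skip the rest of this iteration, keep looping
        }
        $total += $account['balance'];
    }
    return $total;
}
```

Without the break, the first function would still return a correct-looking answer on small lists, which is exactly why this kind of waste tends to go unnoticed.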
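The two-hour user list example from the caching item above could look roughly like this. To keep the sketch runnable without a cache server, it uses a small array-backed stand-in class (ArrayCache, my own invention); with the Memcached extension you would call Memcached::get() and Memcached::set() in its place. The cache key and the loader callback are also assumptions.

```php
<?php
// Stand-in for a real cache client so the sketch runs without a server.
// It mimics the get/set-with-expiry shape of a Memcached client.
final class ArrayCache
{
    private array $items = [];

    // Returns the cached value, or false on a miss or an expired entry.
    public function get(string $key)
    {
        if (!isset($this->items[$key])) {
            return false;
        }
        [$value, $expiresAt] = $this->items[$key];
        if ($expiresAt <= time()) {
            unset($this->items[$key]);
            return false;
        }
        return $value;
    }

    public function set(string $key, $value, int $ttlSeconds): void
    {
        $this->items[$key] = [$value, time() + $ttlSeconds];
    }
}

// Returns the user list from the cache, calling $loadUsers (for example a
// database query) only when the cached copy is missing or expired.
function cachedUserList(ArrayCache $cache, callable $loadUsers): array
{
    $users = $cache->get('user_list');
    if ($users === false) {
        $users = $loadUsers();
        $cache->set('user_list', $users, 2 * 60 * 60); // keep for 2 hours
    }
    return $users;
}
```

The same check-then-load pattern works for any data that is expensive to build and slow to change; only the key and the expiry time differ.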
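Several of the database suggestions above can be sketched together with PDO: a prepared statement, a conditional INSERT in place of a SELECT followed by an INSERT, a query that asks for only the column it needs, and an explicit release of the result set. The table layout and function names are assumptions, and the example uses an in-memory SQLite database (which requires the pdo_sqlite extension) purely to stay self-contained. The exact conditional INSERT syntax varies between databases.

```php
<?php
// In-memory SQLite keeps the sketch self-contained; a real application
// would connect to its own database here.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT, bio TEXT)');

// Inserts the user only if the email is not already present, in a single
// round trip. Returns true when a row was actually inserted.
function addUserIfMissing(PDO $db, string $email, string $name): bool
{
    $stmt = $db->prepare(
        'INSERT INTO users (email, name)
         SELECT :email, :name
         WHERE NOT EXISTS (SELECT 1 FROM users WHERE email = :existing)'
    );
    $stmt->execute([':email' => $email, ':name' => $name, ':existing' => $email]);
    return $stmt->rowCount() === 1;
}

// Requests only the name column rather than every column in the table.
function userNames(PDO $db): array
{
    $stmt = $db->query('SELECT name FROM users ORDER BY id');
    $names = $stmt->fetchAll(PDO::FETCH_COLUMN);
    $stmt->closeCursor(); // release the result set once we have what we need
    return $names;
}
```

Calling addUserIfMissing() twice with the same email inserts one row and returns false the second time, so the calling code needs no separate existence check.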
Doing any or all of these things will benefit your applications greatly and still keep your code manageable. This doesn’t address all of the innovative things you can do on the UI and client-side of an application to improve performance. Of course there is always more you can do, but this should get you started.