Thursday, October 28. 2004
Cgiapp 1.5 has been released; you may now download it.
This release fixes a subtle bug I hadn't encountered before; namely, when a
method name or function name is passed as an argument to mode_param(), run()
was receiving the requested run mode... and then attempting to process that
as the mode param. The behaviour is now fixed, and is actually simpler than
the previous (non-working) behaviour.
Also, on reading Chris Shiflett's paper on PHP security, I decided to
reinstate the query() method. I had been using $_REQUEST to check for a run
mode parameter; because this combines the GET, POST, and COOKIE arrays, it's
considered a bit of a security risk.
query() now creates a combined array of GET and POST variables (POST taking
precedence over GET) and stores them in the property $_CGIAPP_REQUEST; it
returns a reference to that property. run() uses that property to determine
the run mode now.
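A minimal sketch of that merge, for illustration only. This is not the actual Cgiapp code: the function name combine_request() is made up, and it takes the two arrays as arguments rather than reading $_GET and $_POST directly.

```php
<?php
// Hypothetical helper, not part of Cgiapp: combine GET and POST data,
// with POST taking precedence, and leave COOKIE data out entirely.
function combine_request(array $get, array $post): array
{
    // The array union operator keeps the left-hand value when a key
    // exists in both arrays, so POST wins over GET.
    return $post + $get;
}

// A run mode sent via POST overrides one sent in the query string.
$request = combine_request(
    ['rm' => 'list', 'id' => '1'],
    ['rm' => 'edit']
);
```

Here $request['rm'] is 'edit' and $request['id'] is still '1'.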
Enjoy!
Friday, October 22. 2004
I've been playing with parameter testing in my various Cgiapp classes, and
one test that seemed pretty slick was the following:
if (!array_key_exists('some_string', $_REQUEST)) {
    // some error
}
Seems pretty straightforward: $_REQUEST is an associative array, and I want
to test for the existence of a key in it. Sure, I could use isset(), but it
seemed... ugly, and verbose, and a waste of keystrokes, particularly when
I'm using the param() method:
if (!isset($_REQUEST[$this->param('some_param')])) {
    // some error
}
However, I ran into a pitfall: when it comes to array_key_exists(),
$_REQUEST isn't exactly an array. I think what's going on is that $_REQUEST
is actually a superset of several other arrays -- $_POST, $_GET, and
$_COOKIE -- and isset() has some logic to descend amongst the various keys,
while array_key_exists() can only work on a single level.
Whatever the explanation, I ended up reverting a bunch of code.
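For reference, here is a small sketch of the one documented difference between the two checks that I'm sure of: isset() reports false for a key whose value is null, while array_key_exists() only looks at the key. It uses a plain array with made-up keys as a stand-in for $_REQUEST, and it may well not be the cause of the problem described above.

```php
<?php
// Stand-in for a request array; the keys are made up for illustration.
$request = ['empty' => '', 'nothing' => null];

// array_key_exists() only checks whether the key is present.
$keyPresent = array_key_exists('nothing', $request);

// isset() also requires the value to be non-null.
$valueSet = isset($request['nothing']);

// An empty string still counts as set.
$emptySet = isset($request['empty']);
```

With this array, $keyPresent is true, $valueSet is false, and $emptySet is true.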
Wednesday, October 20. 2004
Inspired by a Slashdot book review of High Performance MySQL.
I've often suspected that I'm not a SQL guru... little things like being
self-taught and having virtually no resources for learning it. This has been
confirmed to a large degree at work, where our DBA has taught me many tricks
about databases: indexing, when to use DISTINCT, how and when to do JOINs,
and the magic of TEMPORARY TABLEs. I now feel fairly competent, though far
from being an expert -- I certainly don't know much about how to tune a
server for MySQL, or tuning MySQL for performance.
Last year around this time, we needed to replace our MySQL server, and I
got handed the job of getting the data from the old one onto the new. At the
time, I looked into replication, and from there learned about binary
copies of a data store. I started using this as a way to back up data,
instead of periodic mysqldumps.
One thing I've often wondered since: would replication be a good way to do
backups? It seems like it would, but I haven't investigated.
One post on the aforementioned Slashdot article addressed this, with the
following summary:
- Set up replication
- Do a locked table backup on the slave
Concise and to the point. I only wish I had a spare server on which to
implement it!
Tuesday, October 12. 2004
I've standardized my PHP programming to use the environment variable
SCRIPT_NAME when I want my script to refer to itself in links and
form actions. I've known that PHP_SELF has the same information, but
I was more familiar with the name 'SCRIPT_NAME' from using it in perl, and
liked the feel of it more as it seems to describe the resource better
('PHP_SELF' could stand for the path to the PHP executable if I were to go
by the name only).
However, I just noticed a post on the php.general newsgroup where somebody
asked what the difference was between them. Semantically, there isn't any;
they should contain the same information. Historically and technically
speaking, though, there is. SCRIPT_NAME is defined in the CGI 1.1
specification, and is thus a standard; but not all web servers actually
implement it, so it isn't necessarily portable. PHP_SELF, on the other
hand, is implemented directly by PHP, and as long as you're programming in
PHP, will always be present.
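A rough sketch of a self-referencing form action built this way. The helper name self_action() is made up, and the escaping step is my own addition: PHP_SELF can carry extra path info from the requested URL, so it seems safer to escape it before printing it into HTML.

```php
<?php
// Hypothetical helper: prefer PHP_SELF, fall back to SCRIPT_NAME,
// and escape the result for use in an HTML attribute.
function self_action(array $server): string
{
    $path = $server['PHP_SELF'] ?? $server['SCRIPT_NAME'] ?? '';
    return htmlspecialchars($path, ENT_QUOTES, 'UTF-8');
}

// Usage: a form that posts back to the current script.
echo '<form method="post" action="', self_action($_SERVER), '">', "\n";
```

Passing the server array in as an argument, rather than reading $_SERVER inside the function, just makes the helper easier to try out with sample values.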
Guess I have some grep and sed in my future as I change a bunch of
scripts...
Friday, October 8. 2004
Occasionally, I've needed to process a lot of information from a script, but
I don't want to worry about PHP timing out or the user aborting the script
(by clicking on another link or closing the window). Initially, I
investigated register_shutdown_function() for this; it registers a function
that runs once the script finishes executing. Unfortunately, that function
is still a part of the current connection, so it can be aborted in the same
way as any other script (i.e., by hitting stop, closing the browser, going
to a new link, etc.).
However, there's another setting, controlled via a function, that can
override this behaviour -- i.e., let the script continue running after the
abort. This is ignore_user_abort(). If you call it with true, your script
will continue running after the user aborts the request.
This sort of thing would be especially good for bulk uploads where the
upload needs to be processed -- say, for instance, a group of images or
email addresses.
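A minimal sketch of how that might look for a batch of email addresses. The process_item() function and the sample data are made up, and the set_time_limit(0) call is my own addition to cover the timeout concern mentioned earlier.

```php
<?php
// Keep running even if the user hits stop or closes the browser.
ignore_user_abort(true);

// Assumption: also lift PHP's execution time limit (0 means no limit).
set_time_limit(0);

// Hypothetical stand-in for the real per-item work.
function process_item(string $address): string
{
    return strtolower(trim($address));
}

// Sample data standing in for a bulk upload.
$uploaded  = [' Alice@Example.com ', 'BOB@example.com'];
$processed = array_map('process_item', $uploaded);
```

Both calls go at the top of the script, before the long-running work starts.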
Thursday, October 7. 2004
In the past two days, I've seen two references to Practical PHP Programming, an
online book that serves both as an introduction to programming with PHP5 and
MySQL and as a good advanced reference with many good tips.
This evening, I was browsing through the Performance chapter (chapter 18),
and found a number of cool things, both for PHP and MySQL. Many were common
sense things that I've been doing for a while, but which I've also shaken my
head at in other people's code (calculating loop invariants at every
iteration, not using variables passed to a function, not returning a value
from a function, not using a return value from a function). Others were new
and gave me pause for thought (string concatenation with the '.' operator is
expensive, especially when done more than once in an operation; echo can
take a comma separated list).
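Two of those tips are easy to show in a few lines. This is just an illustrative sketch with a made-up function name, not an example taken from the book: the loop invariant is computed once before the loop, and echo is given a comma separated list instead of a concatenated string.

```php
<?php
// Hypothetical example function: print a numbered list of items.
function print_items(array $items): void
{
    // Loop invariant: count the items once, rather than calling
    // count() in the loop condition on every iteration.
    $total = count($items);

    for ($i = 0; $i < $total; $i++) {
        // echo takes a comma separated list, so no '.' concatenation.
        echo $i + 1, ': ', $items[$i], "\n";
    }
}

print_items(['apples', 'pears', 'plums']);
```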
Some PHP myths were also dispelled, some of which I've been wondering about
for a while. For instance, the amount of comments and whitespace in PHP code
is not a factor in performance (and PHP caching systems will often strip them
out anyways); double quotes are not more expensive than single quotes unless
variable interpolation occurs.
It also has some good advice for SQL optimization, and, more importantly,
MySQL server optimization. For instance, the author suggests
running 'OPTIMIZE TABLE table;' on any table that has seen a large number of
inserts, updates, or deletes since creation; this will defrag the table and
give it better performance. Use CHAR instead of VARCHAR where you can;
VARCHAR saves on space, but MySQL has to calculate how much space was used
each time it queries in order to determine where the next field or record
starts. However, if you have any variable length fields, you may as well use
as many as you need -- or split off variable length fields (such as a TEXT
field) into a different table in order to speed up searching. When
performing JOINs, compare on numeric fields instead of character fields, and
always JOIN on columns that are indexed.
I haven't read the entire book, but glancing through the TOC, there are some
potential shortcomings in its content:
- It doesn't cover PhpDoc
- It doesn't appear to cover unit testing
- Limited coverage of templating solutions (though they are mentioned)
- Limited usage of PEAR. The author does mention PEAR a number of times,
and often indicates that use of certain PEAR modules is preferable to using
the corresponding low-level PHP calls (e.g., Mail and Mail_MIME, DB), but
rarely uses them in the examples.
- PHP-HTML-PHP... The examples I browsed all created self-contained
scripts that did all HTML output. While I can appreciate this to a degree,
I'd still like to see a book that shows OOP development in PHP and which
creates re-usable web components in doing so. For instance, instead of
creating a message board script, create a message board
class that can be called from anywhere with metadata specifying the
database and templates to use.
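To make that last point a bit more concrete, here is a very rough sketch of the kind of class I mean. Everything in it is hypothetical: the class name, the config keys, the use of a callable as a stand-in for the database, and the sprintf() format string as a stand-in for a real template. It's also written in current PHP syntax.

```php
<?php
// Hypothetical sketch of a reusable message board component.
class MessageBoard
{
    /** @var callable Returns rows with 'author' and 'body' keys. */
    private $source;
    private string $template;

    public function __construct(array $config)
    {
        // 'source' stands in for the database layer.
        $this->source = $config['source'];
        // 'template' stands in for a real templating system.
        $this->template = $config['template'] ?? "%s: %s\n";
    }

    public function render(): string
    {
        $out = '';
        foreach (($this->source)() as $row) {
            $out .= sprintf(
                $this->template,
                htmlspecialchars($row['author'], ENT_QUOTES, 'UTF-8'),
                htmlspecialchars($row['body'], ENT_QUOTES, 'UTF-8')
            );
        }
        return $out;
    }
}

// Usage: any page can build a board from its own data and markup.
$board = new MessageBoard([
    'source'   => fn () => [['author' => 'matt', 'body' => 'Hello <world>']],
    'template' => '<p>%s: %s</p>',
]);
echo $board->render(), "\n";
```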
All told, there's plenty of meat in this book -- I wish it were in dead tree
format already so I could browse through it at my leisure, instead of in
front of the computer.