Phly, boy, phly
the weblog and site of Matthew Weier O'Phinney

Sunday, July 10, 2005

Thoughts on form validation

I've been doing a lot of thinking on form validation recently. Among other things, I want to be using a standard set of tools for validating form input at work; I'm also rewriting the family website in PHP, and want to have consistency there as well. Finally, I truly buy into Chris Shiflett's top two security practices: filter input, escape output. Validation should always be done, and should be done rigorously; don't allow anything more than is necessary to get the work done.

I flirted briefly in the past month with HTML_QuickForm (HQF). As an observer on the CGI::Application mailing lists, I see HQF as PHP's answer to Perl's Data::FormValidator. HQF comes up frequently on the php-pear-general list, and a lot of people seem happy with it. I decided to try it out as an example plugin for Cgiapp for the latest release.

My problem is that I want to be able to define form validation in a file outside my script. The reason for this is that as I extend and reuse classes, I often find that I can use the same general run-modes for a method... just so long as the form validation logic is separate. This allows me, for instance, to decide that in one application instance I will require fields A-M, but in another, I only need A-E (or vice versa). But it requires no changes to the actual application logic, as the validations are kept separately, and I have the application instance indicate which validation file to utilize.

My approach with HQF was to create some utility methods for setting up forms from configuration files. This would allow the programmer to define the form in a file, and then pass the location of that file to the class in the instance script. Then, in the application class, the programmer would simply pass the parameter to the utility methods, and voila! form validation is done.

Unfortunately, HQF is, quite simply, next to impossible to code this way. I went to some pretty serious effort to do so, but the best I got was to utilize nested arrays in a file that gets eval()'d -- not a viable solution for a security-conscious programmer. The problems I saw were:

  • Validation and filtering are kept separate from the actual element definitions. To my thinking, the validation and filtering are on an element; the element should be the basic block, and the validations and filters are attributes or properties of it. While I understand the idea behind HQF's decision, I found it non-intuitive in practice, and also felt it created more code.
  • Elements, validations, and filters often accepted parameters that were difficult to define in a static file (things like the form action attribute, or configuration arrays). Special elements, such as select elements with their options, were simply too difficult to create via a definition file.

In the end, the code I created to parse a file that contained a form validation was much larger than any code I could hand write with HQF. And the form validation file itself was of similar size to hand-coding equivalent HQF code.

I started working on a form validation library this past week, and after many hours of effort, realized I was creating something as large in scope as HQF. Granted, I was building it with the idea of using a SimpleXML file to contain the validation logic, and it was going to accommodate that, but in the end, it was a hairy piece of code, and for most of my forms, overkill.

And then it hit me: just about every form I create is slightly different, and, in general, I find that one of the following is true:

  • The amount of input is so small that I can validate it myself in fewer lines than utilizing an established library.
  • There's a lot of data, but much of it is in radios, checkboxes, or dropdowns, and can be validated by checking against arrays.
  • The data is highly specialized, and I have to validate it by hand anyway.
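
The second case, checking radio, checkbox, or drop-down input against an array, could look something like the minimal sketch below. The helper names and the list of states are hypothetical and exist only for illustration.

```php
<?php
// Hypothetical whitelist for a "state" drop-down; the values are examples only.
$allowedStates = array('ME', 'NH', 'VT', 'MA', 'NY', 'CT', 'RI');

/**
 * Return true only if $value is a string that appears in $allowed.
 * (Hypothetical helper, not part of any library.)
 */
function is_allowed_choice($value, array $allowed)
{
    // The third argument to in_array() forces strict comparison,
    // so no loose type juggling can let an unexpected value through.
    return is_string($value) && in_array($value, $allowed, true);
}

/**
 * A checkbox group arrives as an array; every submitted value must be allowed.
 * Note that an empty array passes, so "required" must be checked separately.
 */
function are_allowed_choices($values, array $allowed)
{
    if (!is_array($values)) {
        return false;
    }
    foreach ($values as $value) {
        if (!is_allowed_choice($value, $allowed)) {
            return false;
        }
    }
    return true;
}
```

A single select would be checked with is_allowed_choice(), and a checkbox group with are_allowed_choices(); anything not on the list is rejected rather than repaired.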

In summary: it's rare that I get any development benefit from using a monolithic validation library. By development benefit, I mean savings in time or effort.

Where does that leave me? Well, on further analysis, I realized that the main reason I could see for using a library would be for those kinds of data that I often need to validate, but for which there isn't a built-in way in PHP to do so: email addresses, URIs, phone numbers, dates, etc. Additionally, I may want a few pre-filters -- things like stripping all non-numerics or non-alphas, stripping tags, trim()ing, etc. I still want to automate as much as possible, but only the common types.
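
As a rough illustration of the pre-filters mentioned above, here is a small sketch. All of the function names and the filter labels ('digits', 'alpha') are made up for this example; only trim(), strip_tags(), preg_replace(), and call_user_func() are PHP built-ins.

```php
<?php
// Hypothetical helper names; none of these come from an existing library.

/** Remove everything except ASCII digits, e.g. for phone numbers. */
function strip_non_numeric($value)
{
    return preg_replace('/[^0-9]/', '', (string) $value);
}

/** Remove everything except ASCII letters. */
function strip_non_alpha($value)
{
    return preg_replace('/[^a-zA-Z]/', '', (string) $value);
}

/** Apply a list of named filters to a value, in the order given. */
function apply_filters($value, array $filters)
{
    // Map of filter label => callable; the labels are assumptions.
    $map = array(
        'trim'       => 'trim',
        'strip_tags' => 'strip_tags',
        'digits'     => 'strip_non_numeric',
        'alpha'      => 'strip_non_alpha',
    );
    foreach ($filters as $name) {
        if (!isset($map[$name])) {
            // Refuse unknown filters instead of silently skipping them.
            throw new InvalidArgumentException("Unknown filter: $name");
        }
        $value = call_user_func($map[$name], $value);
    }
    return $value;
}
```

Keeping the filters as plain named functions means a definition file only has to list labels, not code.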

I envision being able to use an INI-style file like the following:

[name]
label="Name:"
error="Please provide your name; use only alphabetical characters, commas, hyphens, periods, and single quotes"
required=true
rule1type=regex
rule1data="/^[a-z .,'-]+$/i"
filter1type=trim
filter2type=htmlentities

[email]
label="Email:"
error="Please provide a valid email address"
required=true
rule1type=email
filter1type=trim

[state]
label="State of residence:"
error="Please select your state from the drop-down list provided"
required=false
rule1type=in_array
rule1data=ME,NH,VT,MA,NY,CT,RI

This style would catch 80% of the cases I have, which would simplify and expedite my development by leaving me to deal with only the other 20%.
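
To make the idea concrete, here is one way such a file might be parsed and applied. This is a sketch under several assumptions, not the API of Solar_Valid or any other library: the function names check_rule() and validate_form() are invented, only the trim filter and the regex, in_array, and email rules are handled, and it relies on parse_ini_string() and nowdoc strings (PHP 5.3+) and filter_var() (PHP 5.2+), all of which are newer than this post.

```php
<?php
// Sketch only: function names and rule handling are assumptions for illustration.

/** Check one value against one rule type from the definition file. */
function check_rule($type, $data, $value)
{
    switch ($type) {
        case 'regex':
            return preg_match($data, $value) === 1;
        case 'in_array':
            // Rule data is a comma-separated list of allowed values.
            return in_array($value, explode(',', $data), true);
        case 'email':
            return filter_var($value, FILTER_VALIDATE_EMAIL) !== false;
        default:
            return false; // unknown rule types fail closed
    }
}

/**
 * Validate $input against field definitions parsed from an INI file.
 * Returns array($clean, $errors); both are keyed by field name.
 */
function validate_form(array $fields, array $input)
{
    $clean  = array();
    $errors = array();
    foreach ($fields as $name => $def) {
        $value = (isset($input[$name]) && is_string($input[$name])) ? $input[$name] : '';
        // Apply filter1type, filter2type, ... in order; only trim is supported here.
        for ($i = 1; isset($def["filter{$i}type"]); $i++) {
            if ($def["filter{$i}type"] !== 'trim') {
                throw new InvalidArgumentException('Unsupported filter in sketch');
            }
            $value = trim($value);
        }
        if ($value === '') {
            // parse_ini_string() turns true into "1" and false into "".
            if (!empty($def['required'])) {
                $errors[$name] = $def['error'];
            }
            continue;
        }
        $ok = true;
        // Apply rule1type, rule2type, ... until one fails.
        for ($i = 1; isset($def["rule{$i}type"]); $i++) {
            $data = isset($def["rule{$i}data"]) ? $def["rule{$i}data"] : null;
            if (!check_rule($def["rule{$i}type"], $data, $value)) {
                $ok = false;
                break;
            }
        }
        if ($ok) {
            $clean[$name] = $value;
        } else {
            $errors[$name] = $def['error'];
        }
    }
    return array($clean, $errors);
}

// A cut-down definition in the same style as the file above.
$ini = <<<'INI'
[name]
error="Please provide your name"
required=true
rule1type=regex
rule1data="/^[a-z .,'-]+$/i"
filter1type=trim

[state]
error="Please select your state"
required=false
rule1type=in_array
rule1data=ME,NH,VT
INI;

$fields = parse_ini_string($ini, true); // true = keep [sections] as sub-arrays
list($clean, $errors) = validate_form(
    $fields,
    array('name' => "  Matthew O'Phinney ", 'state' => 'TX')
);
// $clean holds the trimmed name; $errors holds the message for "state".
```

Because the application only ever sees the parsed array, switching an instance to a different set of required fields means pointing it at a different file, with no change to the validation code.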

I was considering how I was going to code this up -- what structure to use in the class, whether to require class instantiation or use static methods, etc. -- and then realized that Paul M. Jones had given me some pointers on the use of Solar_Valid when I suggested to him that I'd like to include it as a plugin on the next release of Cgiapp. I looked at his class, and it does exactly what I was considering coding for the validations. With a similar class for filtering (yes, Paul, I'll contribute that code, if you'd like!), it should become fairly trivial to write a validation routine that could parse a file like the above and then perform as I desire.

This wasn't meant to be a plug for Paul, however, nor a call for developers. I want to stimulate discussion: how do others validate forms? Do we all come to the same conclusions after having done hundreds of form validations -- that there is no magic bullet? Or have I missed the magic bullet? Is some automation a good thing? Or should every form have its own specific programmatic logic? Is there a nice lean library already that does this stuff well and simply? Or is that unattainable? Did I miss the boat on HQF? Or is it bloat?

Leave your comment!

Posted by Matthew Weier O'Phinney in PHP at 22:17

Comments
Hi Matthew -- great article! I love the narrative of progression from one solution to the next.

As far as working with forms, no magic bullet that I've found. It still requires quite a bit of setup. I'm trying to do it with Solar_Form these days; your .ini file looks a lot like the Solar_Form $element array (which itself appears a lot like the array for patForms elements).

The only exception is that, while I do have validation available, I don't have filtering built into Solar_Form. I can see where that would be a good thing to have.

No docs on Solar_Form yet, but you can see the code here:

http://solarphp.com/svn/pkg/Solar/Form.php

Again, great write-up, and I'm happy to receive any patches you want to send along. :-)
#1 Paul M. Jones on 2005-07-11 01:12

patForms may be worth a look: http://php-tools.de/site.php?file=patForms/overview.xml

I saw a demo by Stephan Schmidt and was quite impressed.
#2 Arnaud on 2005-07-11 02:39

I have been working on something similar: a forms engine in PHP that takes a data structure defining the form elements and automatically produces the form markup and validation code. You can see a beta prototype here:

http://richardathome.no-ip.com/examples/forms_engine/
#3 Richard@Home on 2005-07-11 05:15

Interesting -- the array looks very similar to the one I was building last week, and to the one I'm thinking of building from my INI file.

I didn't mention in the entry, but one thing I didn't use from HQF is the HTML generation features. I use templates, as I like to keep my presentation layer separate from the application layer. I simply want to do validation -- no more, no less.

Is the logic for your forms engine kept separate from the display portion? Can you throw a phps up so I can look at the code?
#3.1 Matthew Weier O'Phinney on 2005-07-11 06:41

Hi Matthew.

I basically have two functions: form_build() and form_validate(). Both take the same form structure.

form_validate() runs through the elements of the submitted form and adds error messages to the elements in the structure that fail to validate.

form_build() then outputs the HTML needed to display the form (and any error messages that may have been added during form_validate()).

The major benefit to building your HTML from the same validation structure is of course only having to change one thing in one place. Hand coding your forms means that if you need to add another field (and the associated validation), you need to do it in two places.

The example on my website is a very, very pre-beta test I threw together as a proof of concept. I have a production version I wrote for work which is more robust and flexible.

I have a blog article under way that discusses the forms engine in more detail. Rather than show you buggy half-written code, can you wait till I finish the article?

Feel free to email me if you need anything further...
#3.1.1 Richard@Home on 2005-07-11 06:56

© 2004 - present, Matthew Weier O'Phinney
matthew-web <at> weierophinney.net