Client-side validation has always been a potential headache for front-end programmers. Embedded blocks with a mixture of imperative JavaScript and declarative regex can be a mess. HTML5 has the ambition to add abstraction layers that would make this a bit easier. As I’ll explain below, there’s still a long way to go before it’s rock solid.
Two ideas enter the scene now:
- The <input> tag has new type attribute values like url, email, date, tel (telephone number), and color.
- The <input> tag has the new attribute pattern, where you can describe allowed input with a regex.
Note that it’s only validation. It would have been nice to have filtering (e.g. remove spaces in a credit card number) or even replacing (euro is sent to server, whether the user enters euro or €).
In both cases, red-green feedback lets the user know whether the entered text is valid. The tooltip of the input widget can also show a descriptive message of what the system expects from the user: you just set a value for the title attribute. More on that below.
1. New values for the type attribute of the <input> tag
Using the type attribute is simple. Here’s an example with the new value email:
<input type="email" required />
This made me curious. I guess that email is implemented with a regex under the hood. What does it look like? I don’t know, but it’s not correct. As a matter of fact, the spec for the email attribute value is incorrect. It looks like this:
A valid e-mail address is a string that matches the ABNF production 1*( atext / “.” ) “@” ldh-str *( “.” ldh-str ) where atext is defined in RFC 5322 section 3.2.3, and ldh-str is defined in RFC 1034 section 3.5.
So currently, HTML5 browsers accept the email -@- and don’t accept "staffan nöteberg"@rekursiv.se (I tried). It should be the other way around. (Yes, spaces and diaereses make sense to the left of the @ sign, as it’s a local mailbox routing that might involve a not so SMTP:ish system. For the record, I tried…
echo 'hello!' |
/usr/lib/sendmail '"staffan nöteberg"@rekursiv.se'
…and it works!).
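To make the two cases concrete, the ABNF quoted above can be transcribed by hand into a JavaScript regex. This is only a sketch: the transcription and the names specEmail and isValidPerSpec are my own assumptions, and real browsers may implement the check differently.

```javascript
// Hand transcription of the ABNF: 1*( atext / "." ) "@" ldh-str *( "." ldh-str )
// atext   = letters, digits and the specials from RFC 5322 section 3.2.3
// ldh-str = letters, digits and hyphens (RFC 1034 section 3.5)
// Hypothetical names; not taken from any browser's source code.
const specEmail =
  /^[A-Za-z0-9!#$%&'*+\/=?^_`{|}~.-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*$/;

function isValidPerSpec(address) {
  return specEmail.test(address);
}

// A lone hyphen is both a valid atext and a valid ldh-str, so this passes.
console.log(isValidPerSpec('-@-')); // true

// Quotes, spaces and non-ASCII letters are not atext, so this is rejected.
console.log(isValidPerSpec('"staffan nöteberg"@rekursiv.se')); // false
```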
However, even though it’s already implemented in many browsers, W3C makes it clear that it’s only a working draft. For the moment there’s a note in the document that they are aware of this error:
NOTE: This requirement is a willful violation of RFC 5322, which defines a syntax for e-mail addresses that is simultaneously too strict (before the “@” character), too vague (after the “@” character), and too lax (allowing comments, white space characters, and quoted strings in manners unfamiliar to most users) to be of practical use here.
My recommendation is NOT to use the email value of the type attribute until it has a better implementation.
2. New attribute pattern of the <input> tag
The input tag has several new attributes to specify constraints: autocomplete, min, max, multiple, pattern, and step. I’m particularly interested in the pattern attribute. It’s more generic than the new values of the type attribute mentioned above.
The pattern value is a regex. In what regex dialect? Yes, you guessed it: JavaScript according to ECMA-262 Edition 5. This is a major drawback, since the regex support in JavaScript is modest (e.g. there’s not even a meta class to match a letter, whereas many other regex engines support the Unicode \p{L}). The whole user input must be matched by the regex, not only a fraction. You can look at it as if your regex is prefixed with ^(?: and suffixed with )$.
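The implicit anchoring can be imitated in plain JavaScript. The helper name matchesPattern is made up for this sketch, and it ignores flags and other details of real browser implementations.

```javascript
// Mimics how a pattern value is applied: the whole input must match,
// as if the regex were wrapped in ^(?: ... )$.
// matchesPattern is a hypothetical helper, not a browser API.
function matchesPattern(pattern, value) {
  const anchored = new RegExp('^(?:' + pattern + ')$');
  return anchored.test(value);
}

console.log(matchesPattern('\\d{4}', '1234'));  // true
console.log(matchesPattern('\\d{4}', 'x1234')); // false: the whole input must match

// Without the anchors, a partial match would be enough.
console.log(new RegExp('\\d{4}').test('x1234')); // true

// The non-capturing group matters for alternation:
// a naive ^a|b$ would accept "ab", the wrapped form does not.
console.log(new RegExp('^a|b$').test('ab'));    // true
console.log(matchesPattern('a|b', 'ab'));       // false
```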
Here are three pragmatic (but not globally perfect) examples I created:
- Strong password:
<input title="at least eight symbols containing at least one number, one lower, and one upper letter" type="text" pattern="(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{8,}" required />
- Email address:
<input type="text" title="email" required pattern="[^@]+@[^@]+\.[a-zA-Z]{2,6}" />
- Phone number:
<input type="text" required pattern="(\+?\d[- .]*){7,13}" title="international, national or local phone number"/>
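The three patterns can also be tried outside a browser. The sketch below assumes the same ^(?: … )$ wrapping described earlier. The helper fullMatch and the sample inputs are my own and only illustrate a few cases.

```javascript
// Hypothetical helper: applies a pattern the way the pattern attribute is
// described above, i.e. the whole value must match.
function fullMatch(pattern, value) {
  return new RegExp('^(?:' + pattern + ')$').test(value);
}

// String.raw keeps the backslashes exactly as written in the HTML attributes.
const patterns = {
  password: String.raw`(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{8,}`,
  email: String.raw`[^@]+@[^@]+\.[a-zA-Z]{2,6}`,
  phone: String.raw`(\+?\d[- .]*){7,13}`,
};

console.log(fullMatch(patterns.password, 'Passw0rdX'));  // true
console.log(fullMatch(patterns.password, 'password'));   // false: no digit, no upper case

console.log(fullMatch(patterns.email, '"staffan nöteberg"@rekursiv.se')); // true
console.log(fullMatch(patterns.email, '-@-'));            // false: no top-level domain

console.log(fullMatch(patterns.phone, '+46 70 123 45 67')); // true: 11 digits
console.log(fullMatch(patterns.phone, '12345'));            // false: fewer than 7 digits
```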
I leave it as a reader exercise to interpret these regexes. And you can try them too! They are online in this test page:
If you combine type="email" and pattern, then both constraints must be fulfilled.
Summary
HTML5 form validation is a good idea. The pattern attribute is very generic, despite its rather limited regex dialect. Be careful with the new values of the type attribute, as they are only in prototype status currently.
Finally: what about browser support? I’m in deep water here, but as I understand it, there’s support for this kind of validation in IE 10+, Firefox 8+, Chrome 16+, Opera 11.6+, and Opera Mobile 10+. There’s partial or no support in Safari and Android.
