I ran into an issue on a client project this week: we needed to validate a URL in our Ruby on Rails application, and we wanted to check that it actually existed in addition to validating its format with a regular expression. After some minor searching, I ran across Ilya Grigorik's blog (that's been happening a lot lately for some reason). He provided a nice little ActiveRecord validator that uses Net::HTTP to request the URL and check that it returns a successful 2xx response (Net::HTTPSuccess).

As it turns out, his post (and thereby his method) was a bit outdated, so I put together an updated validator that takes advantage of the new "sexy validations" provided in Rails 3. And here it is:

require 'net/http'
# Thanks Ilya! http://www.igvita.com/2006/09/07/validating-url-in-ruby-on-rails/
# Original credits: http://blog.inquirylabs.com/2006/04/13/simple-uri-validation/
# HTTP Codes: http://www.ruby-doc.org/stdlib/libdoc/net/http/rdoc/classes/Net/HTTPResponse.html
class UriValidator < ActiveModel::EachValidator
  def validate_each(object, attribute, value)
    # The :format option is optional, but if given it must be a Regexp
    raise(ArgumentError, "A regular expression must be supplied as the :format option of the options hash") unless options[:format].nil? or options[:format].is_a?(Regexp)
    # Defaults, overridden by whatever was passed to the validates call
    configuration = { :message => "is invalid or not responding", :format => URI::regexp(%w(http https)) }
    configuration.update(options)
    if value =~ configuration[:format]
      begin # Request the URL and check the response class
        case Net::HTTP.get_response(URI.parse(value))
        when Net::HTTPSuccess then true
        else object.errors.add(attribute, configuration[:message]) and false
        end
      rescue # Recover from DNS failures and other connection errors
        object.errors.add(attribute, configuration[:message]) and false
      end
    else
      object.errors.add(attribute, configuration[:message]) and false
    end
  end
end
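
If you want to poke at the core check outside of Rails, here's a rough sketch of the same logic as a plain Ruby method. The names `uri_reachable?`, `HTTP_URL_FORMAT`, and the `fetcher` parameter are ones I made up for illustration (they are not part of the validator above). The fetcher defaults to Net::HTTP.get_response, but it can be swapped for a stub so you can try things without touching the network.

```ruby
require 'net/http'
require 'uri'

# Same default format the validator falls back to.
HTTP_URL_FORMAT = URI::regexp(%w(http https))

# Hypothetical helper (not part of the validator above): returns true only when
# the value matches the format AND the fetcher returns a 2xx response.
def uri_reachable?(value, format = HTTP_URL_FORMAT, fetcher = Net::HTTP.method(:get_response))
  return false unless value =~ format
  fetcher.call(URI.parse(value)).is_a?(Net::HTTPSuccess)
rescue StandardError
  # DNS failures, refused connections and unparsable URIs all land here.
  false
end

# Stubbed fetchers, so this runs without a network connection.
ok      = lambda { |_uri| Net::HTTPOK.new('1.1', '200', 'OK') }
missing = lambda { |_uri| Net::HTTPNotFound.new('1.1', '404', 'Not Found') }

puts uri_reachable?('http://example.com', HTTP_URL_FORMAT, ok)       # true
puts uri_reachable?('http://example.com', HTTP_URL_FORMAT, missing)  # false
puts uri_reachable?('not a url', HTTP_URL_FORMAT, ok)                # false
```

Leave off the last two arguments and it will make a real request, which is what the validator does on every validation run. That is worth keeping in mind if your model gets saved a lot.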

To get up and running, create a new file in your Rails lib directory called "uri_validator.rb", and copy/paste the above code. If you have added the lib directory to your autoload paths, then you're done! Otherwise, you'll want to require the file in your environment.rb or application.rb file (in your config directory) like so:

...
require 'uri_validator'

Finally, to use the new sexy validator, simply add the :uri option to any existing validates call:

validates :url, :presence => true, :uri => { :format => /(^$)|(^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(([0-9]{1,5})?\/.*)?$)/ix }

This allows you to specify a custom format (must be a valid regular expression); otherwise, the format will default to URI::regexp(%w(http https)).
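
One thing worth knowing before picking a format: the validator tests the value with `=~`, so whichever regex you use only has to match somewhere in the string. Here's a small sketch (plain Ruby, no Rails needed) comparing the custom regex above with the default. `CUSTOM_FORMAT`, `DEFAULT_FORMAT`, `ANCHORED_DEFAULT`, and the `accepts?` helper are hypothetical names of mine, and the anchored variant is just one possible tweak, not something the validator does for you.

```ruby
require 'uri'

# The custom regex from the validates call above, copied as-is.
CUSTOM_FORMAT    = /(^$)|(^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(([0-9]{1,5})?\/.*)?$)/ix
# The validator's default format.
DEFAULT_FORMAT   = URI::regexp(%w(http https))
# Assumption: one way to force the default to cover the whole string.
ANCHORED_DEFAULT = /\A#{DEFAULT_FORMAT}\z/

# Hypothetical helper mirroring the validator's `value =~ format` check.
def accepts?(format, value)
  !(value =~ format).nil?
end

puts accepts?(CUSTOM_FORMAT, 'http://www.example.com/path')    # true
puts accepts?(CUSTOM_FORMAT, 'http://example.com:8080/')       # false: this regex leaves no room for ":port"
puts accepts?(DEFAULT_FORMAT, 'see http://example.com now')    # true: the default is unanchored
puts accepts?(ANCHORED_DEFAULT, 'see http://example.com now')  # false
puts accepts?(ANCHORED_DEFAULT, 'http://example.com')          # true
```

If the stricter behavior is what you're after, the anchored regex can be passed in through the :format option like any other.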

There it is! If you come up with your own variations of the validator or regex, don't forget to tell me about them!

By the way, I just wanted to give a personal shout-out to Ilya for his fantastic support of his new non-blocking, EventMachine-based Ruby web server, Goliath. Hey man, you might want to change the name to David: I was going to make some crack here about his sling (not to be, or maybe to be, confused with David's Sling) being asynchronous, but you get the idea :). Anyway, rock on.