Short answer: OpenURI doesn’t support the “feed://” pseudo protocol and if you try it with a hash of header options it gives you the same error as if, like some dumb muppet, you hadn’t required the library in the first place. In other words, it falls through to Kernel#open and leaves you scratching your head.
Long answer: Read on, code fiends. Read on.
Tonight I decided to earn some HusbandPoints™ by helping my wife get a large number of tagged photos off Picasa for a project that she’s working on. Downloading them by hand would’ve been a pain in the ass timewise, and a big opportunity cost as well, since she would’ve had to take time out from the main body of the project (a homemade cookbook for a friend’s wedding) to do a dumb photo-by-photo clickfest through the entire large Picasa album she’d assembled with her friends. Plus it gave me a reason to mess around w/ the Google APIs some, which would almost certainly come in handy later.
Now, the easiest way to go about scripting this w/ Ruby involves using open-uri to pass in the authorization token from Google into every request, per their ClientLogin authentication method. You do that with a piece of code like this:
```ruby
require 'open-uri'
require 'hpricot'

# Assuming that @auth_token is set by a login method
def http_header
  {"Authorization" => "GoogleLogin auth=#{@auth_token}"}
end

# HTTP GET a Google content feed (Atom)
def get(url)
  response = open(url, http_header) { |f| f.read }
  Hpricot.XML(response)
end
```
Here we’re getting the content from Google (which will come as an Atom feed, as all of their various pieces of content do) and then parsing the result with Hpricot. We pass the http_header Hash to OpenURI’s open method to specify a set of HTTP header variables. This is supposed to be easy, but tonight it wasn’t, and my wife was treated to the inelegant sounds of me cursing at my laptop screen for 10 or 15 minutes until I figured out what the problem was.
‘feed://’ don’t go ’round here
The problem turned out to be the “feed://” pseudo protocol. Safari likes it (because it fancies itself a feed reader), and decided to make the RSS link provided by Google for the tag set my wife wanted to download into a “feed://” URL. Of course, there’s no such protocol, and “feed://” itself is pretty lame. People have been bitching about its lameness for a long, long time. It’s almost as lame as me not catching it.
But the lamest thing of all (which was causing the cursing) is how OpenURI handles this:
```
TypeError: can't convert Hash into String

method initialize in open-uri.rb at line 32
method open_uri_original_open in open-uri.rb at line 32
method open in open-uri.rb at line 32
method get in picasa.rb at line 62
```
This is the same thing you get when you try to use open on a URL with a hash of header arguments and you’ve forgotten to require the OpenURI library in the first place.
The problem here seems to be with this part:
```ruby
def open(name, *rest, &block) # :doc:
  if name.respond_to?(:open)
    name.open(*rest, &block)
  elsif name.respond_to?(:to_str) &&
        %r{\A[A-Za-z][A-Za-z0-9+\-\.]*://} =~ name &&
        (uri = URI.parse(name)).respond_to?(:open)
    uri.open(*rest, &block)
  else
    open_uri_original_open(name, *rest, &block)
  end
end
```
It’s not taking the branch you might think: the one where it asks if the name can be converted to a string and if it conforms to a loose URI regex pattern. Instead it’s calling the original version of open, the one that the Kernel module provides so you can easily open files and pipes (but without all the tasty options given you by OpenURI). This error gets thrown by Kernel when you try to use open outside the context of OpenURI (as this guy points out).
Since we can tell that a URL that starts with “feed://” should pass the first of the two tests in the “elsif” clause (the regex pattern), that means that it’s not passing some part of the URI.parse test. Here’s what that URI.parse method looks like:
```ruby
def self.parse(uri)
  scheme, userinfo, host, port,
    registry, path, opaque, query, fragment = self.split(uri)

  if scheme && @@schemes.include?(scheme.upcase)
    @@schemes[scheme.upcase].new(scheme, userinfo, host, port,
                                 registry, path, opaque, query,
                                 fragment)
  else
    Generic.new(scheme, userinfo, host, port,
                registry, path, opaque, query,
                fragment)
  end
end
```
No great clues there. But if you step through the elsif clause of OpenURI’s open override, it turns out that parsing the offending “feed://”-based URI doesn’t give you a “URI::HTTP” object. You get a “URI::Generic” object, which doesn’t respond to open. Obviously, the library doesn’t support this kind of URL, and if it weren’t overriding a Kernel method, it’d probably say so. But it can’t make assumptions about what you’re trying to do with open, so it falls through to the original Kernel#open instead, and you get the same error you’d get if you never used “require ‘open-uri’” in the first place.
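If you want to poke at this yourself, here’s a rough sketch that checks each half of that elsif condition on its own. The example.com URLs are placeholders I made up (not the real Picasa feed address), and scheme_pattern and open_uri_checks are just hypothetical helper names; the regex itself is copied from the open-uri source quoted above.

```ruby
require 'uri'
require 'open-uri'

# The scheme pattern open-uri uses to decide a String looks like a URL
# (copied from the library source quoted above).
def scheme_pattern
  %r{\A[A-Za-z][A-Za-z0-9+\-\.]*://}
end

# Evaluate each half of the elsif condition separately for one URL.
# (Hypothetical helper, for illustration only.)
def open_uri_checks(url)
  uri = URI.parse(url)
  {
    :matches_pattern  => !(scheme_pattern =~ url).nil?,
    :parsed_class     => uri.class,
    :responds_to_open => uri.respond_to?(:open)
  }
end

# Placeholder URLs (assumptions), not the real Picasa feed address.
["feed://example.com/some/feed", "http://example.com/some/feed"].each do |url|
  puts "#{url} => #{open_uri_checks(url).inspect}"
end
```

Both URLs should get past the regex, but only the “http://” one should come back as something that responds to open, which is the whole story in three hash keys.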
Lesson learned, boys and girls — pseudo protocols aren’t supported by much at all other than self-important feed reading software.
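For what it’s worth, one way to dodge the whole mess is to rewrite the URL before it ever reaches open. This is only a sketch under a couple of assumptions: that a “feed://host/path” URL really means plain HTTP, and that the “feed:https://…” form simply wraps a real URL. normalize_feed_url and fetchable_uri are names I’m making up here, not anything OpenURI provides.

```ruby
require 'uri'

# Rewrite Safari-style "feed:" URLs into something open-uri understands.
# Assumes "feed://host/path" means plain HTTP and that
# "feed:https://host/path" is a wrapper around a real URL.
def normalize_feed_url(url)
  case url
  when %r{\Afeed:(https?://.*)\z}i then $1
  when %r{\Afeed://(.*)\z}i        then "http://#{$1}"
  else url
  end
end

# Fail loudly, instead of falling through to Kernel#open, when the URL
# still isn't an HTTP(S) URI after normalizing.
def fetchable_uri(url)
  uri = URI.parse(normalize_feed_url(url))
  unless uri.is_a?(URI::HTTP) # URI::HTTPS is a subclass of URI::HTTP
    raise ArgumentError, "unsupported URL scheme: #{url}"
  end
  uri
end
```

With open-uri required, the get method from earlier could then call `fetchable_uri(url).open(http_header) { |f| f.read }` and complain about the scheme up front rather than about a Hash that won’t turn into a String.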
Thanks to the Gimite Google Spreadsheet library for inspiration on the auth code







Copyright © 2012 Catapult Creative - info(at)catapult(hyphen)creative(dot)com