Commit b4f9d2a5 by Jon Yurek

Merge pull request #416 from jmileham/paperclip

---

Hi there,

Im building a site that needs to securely obfuscate attachments in a public S3 bucket.  The solution I arrived at seems like it might be useful to the community as a general-purpose tool that remains customizable enough to allow developers with different requirements to implement it differently without having to hack into Paperclip core.  It is a set of storage-agnostic extensions to Attachment and Interpolations and supporting tests.

It features:

  * No extra data model required
  * Choice of HMAC digest algorithm on a per-attachment basis via openssl (modeled after ActiveSupport::MessageVerifier, default algo is SHA1)
  * Choice of fields to include in the hash on a per-attachment basis, which permits many developer-selected tradeoffs between security and flexibility -- some examples:
    * `:hash_data => ":class/:attachment/:id"` - Unique per asset, long-lived filename with deducible style URLs (e.g. you want a 3rd party site to be able to generate style URLs from a single URL trunk that you provide once and always get the latest attachment)
    * `:hash_data => ":class/:attachment/:id/:style"` - Adds non-deducible style URLs (e.g. you want to store uncompressed :originals but dont want end users to be able to find them and kill your bandwidth bill)
    * `:hash_data => ":class/:attachment/:id/:style/:updated_at"` - Adds non-deducible versions (e.g. if a users private profile photo URL makes it out into the wild via a malicious friend, they can unfriend, upload a new photo and the leak is plugged) -- this is the default

Since the hashes are extremely unlikely to collide, you can also remove other elements of the `:path` that were previously required for global uniqueness (like `:id`), which can give your users plausible deniability of ownership, or even completely obliterate the directory structure (`:path => ":hash"` -- or less controversially `:path => ":hash.:extension"`) such that an attacker cant even prove what kind of attachment he/she has gained access to.  Paired with a large `:hash_secret` and HTTPS, your users are left with a similar security guarantee to the analog hole -- attackers would need an authorized-user mole, or would need to compromise either your server-side secret or an authorized users machine in order to gain access to protected attachments, and then would be able to freely distribute them (sadly, using your hosting infrastructure as an accomplice).

### The `:timestamp` bug

While building this, I ran across what I think is a bug: The `:timestamp` interpolation is non-deterministic in the presence of per-thread time_zones, e.g. displaying time zones per user location, which would affect any such site that uses `:timestamp` in an attachment :path or :url (or if I had used `:timestamp` in `:hash`).  While using `:timestamp` in paths and URLs is probably a little-used feature given the general wordiness of `Time#to_s`, and the intersection of sites that attempt to use it as well as implement per-user time zones is a vanishingly small slice of Paperclip users, I think its worth taking a look at.

I took a stab at dealing with it by adding a new attachment config parameter `:use_default_time_zone` (defaulted to true) that explicitly churns out :timestamp interpolations in the server-wide default time zone.  This is not perfect, because its possible that a site owner would want to change the default time zone via `config.time_zone=`, which would in turn invalidate all of their existing attachment URLs.  One option available to such an implementor would be to redefine my new method `Attachment#time_zone` explicitly to meet their requirements.  Or I could add a `:time_zone` option on `Attachment`, defaulted to null, and falling through to the present `Attachment#time_zone` implementation.  Such an option could even optionally take a block a la `options[:url]` and `options[:path]`.

While the ideal path for `:timestamp` is unclear to me as Im not deeply familiar with the design philosophy behind Paperclip or the direction it wants to head in (aggressively fixing holes like this or supporting users who may rely on legacy behavior?), it seemed natural to me to expose the integer `Attachment#updated_at` value as an interpolation as an alternative to `:timestamp`.  Epoch seconds dont carry any time zone information and are therefor immune to this issue, plus theyre presumably faster to convert to than text, and fewer characters to hash to boot.

So my present implementation has a provisionally "improved" behavior for :timestamp (which should be backward compatible for everybody who didnt run into the bug personally), and a new `:updated_at` interpolation that merely exposes the existing `Attachment` instance method.  Id be eager for feedback on what if anything should be done with that stuff.

The caveats:

  * Im pretty new to testing, and this was definitely my first time writing tests using shoulda and test/unit.  Feel free to slap me for transgressions in style and abuse of mocks/stubs/fakes/dummies.
  * I tried to follow house style to the extent I could derive it, but Im sure I got it wrong in places.
  * Im not sure what went on with the appraisal gemfile.locks reverting to an earlier version dependency on Rails -- perhaps this is expected?  Again, never used appraisal before either.  Tips would be more than welcome.

Id be very excited if you see some or all of these changes as valuable to the Paperclip mainstream, and am happy to iterate this into something worthy of pulling.

Thanks a lot!

-john
parents 0422cd89 13c86cf1
......@@ -2,7 +2,7 @@
source "http://rubygems.org"
gem "ruby-debug"
gem "rails", ">=3.0.3"
gem "rails", "~>3.0.0"
gem "rake"
gem "sqlite3-ruby", "~>1.3.0"
gem "shoulda"
......
......@@ -90,7 +90,7 @@ DEPENDENCIES
appraisal
aws-s3
mocha
rails (>= 3.0.3)
rails (~> 3.0.0)
rake
ruby-debug
shoulda
......
......@@ -8,16 +8,19 @@ module Paperclip
def self.default_options
@default_options ||= {
:url => "/system/:attachment/:id/:style/:filename",
:path => ":rails_root/public:url",
:styles => {},
:processors => [:thumbnail],
:convert_options => {},
:default_url => "/:attachment/:style/missing.png",
:default_style => :original,
:storage => :filesystem,
:use_timestamp => true,
:whiny => Paperclip.options[:whiny] || Paperclip.options[:whiny_thumbnails]
:url => "/system/:attachment/:id/:style/:filename",
:path => ":rails_root/public:url",
:styles => {},
:processors => [:thumbnail],
:convert_options => {},
:default_url => "/:attachment/:style/missing.png",
:default_style => :original,
:storage => :filesystem,
:use_timestamp => true,
:whiny => Paperclip.options[:whiny] || Paperclip.options[:whiny_thumbnails],
:use_default_time_zone => true,
:hash_digest => "SHA1",
:hash_data => ":class/:attachment/:id/:style/:updated_at"
}
end
......@@ -32,24 +35,28 @@ module Paperclip
options = self.class.default_options.merge(options)
@url = options[:url]
@url = @url.call(self) if @url.is_a?(Proc)
@path = options[:path]
@path = @path.call(self) if @path.is_a?(Proc)
@styles = options[:styles]
@normalized_styles = nil
@default_url = options[:default_url]
@default_style = options[:default_style]
@storage = options[:storage]
@use_timestamp = options[:use_timestamp]
@whiny = options[:whiny_thumbnails] || options[:whiny]
@convert_options = options[:convert_options]
@processors = options[:processors]
@options = options
@queued_for_delete = []
@queued_for_write = {}
@errors = {}
@dirty = false
@url = options[:url]
@url = @url.call(self) if @url.is_a?(Proc)
@path = options[:path]
@path = @path.call(self) if @path.is_a?(Proc)
@styles = options[:styles]
@normalized_styles = nil
@default_url = options[:default_url]
@default_style = options[:default_style]
@storage = options[:storage]
@use_timestamp = options[:use_timestamp]
@whiny = options[:whiny_thumbnails] || options[:whiny]
@use_default_time_zone = options[:use_default_time_zone]
@hash_digest = options[:hash_digest]
@hash_data = options[:hash_data]
@hash_secret = options[:hash_secret]
@convert_options = options[:convert_options]
@processors = options[:processors]
@options = options
@queued_for_delete = []
@queued_for_write = {}
@errors = {}
@dirty = false
initialize_storage
end
......@@ -197,6 +204,21 @@ module Paperclip
time && time.to_f.to_i
end
# The time zone to use for timestamp interpolation. Using the default
# time zone ensures that results are consistent across all threads.
def time_zone
@use_default_time_zone ? Time.zone_default : Time.zone
end
# Returns a unique hash suitable for obfuscating the URL of an otherwise
# publicly viewable attachment.
def hash
raise ArgumentError, "Unable to generate hash without :hash_secret" unless @hash_secret
require 'openssl' unless defined?(OpenSSL)
data = interpolate(@hash_data)
OpenSSL::HMAC.hexdigest(OpenSSL::Digest.const_get(@hash_digest).new, @hash_secret, data)
end
def generate_fingerprint(source)
data = source.read
source.rewind if source.respond_to?(:rewind)
......
......@@ -48,8 +48,18 @@ module Paperclip
end
# Returns the timestamp as defined by the <attachment>_updated_at field
# in the server default time zone unless :use_global_time_zone is set
# to false. Note that a Rails.config.time_zone change will still
# invalidate any path or URL that uses :timestamp. For a
# time_zone-agnostic timestamp, use #updated_at.
def timestamp attachment, style_name
attachment.instance_read(:updated_at).to_s
attachment.instance_read(:updated_at).in_time_zone(attachment.time_zone).to_s
end
# Returns an integer timestamp that is time zone-neutral, so that paths
# remain valid even if a server's time zone changes.
def updated_at attachment, style_name
attachment.updated_at
end
# Returns the Rails.root constant.
......@@ -94,6 +104,12 @@ module Paperclip
attachment.fingerprint
end
# Returns a the attachment hash. See Paperclip::Attachment#hash for
# more details.
def hash attachment, style_name
attachment.hash
end
# Returns the id of the instance in a split path form. e.g. returns
# 000/001/234 for an id of 1234.
def id_partition attachment, style_name
......
......@@ -96,6 +96,78 @@ class AttachmentTest < Test::Unit::TestCase
assert_equal "1024.omg/1024-bbq/1024what/000/001/024.wtf", @dummy.avatar.path
end
end
context "An attachment with :timestamp interpolations" do
setup do
@file = StringIO.new("...")
@zone = 'UTC'
Time.stubs(:zone).returns(@zone)
@zone_default = 'Eastern Time (US & Canada)'
Time.stubs(:zone_default).returns(@zone_default)
end
context "using default time zone" do
setup do
rebuild_model :path => ":timestamp", :use_default_time_zone => true
@dummy = Dummy.new
@dummy.avatar = @file
end
should "return a time in the default zone" do
assert_equal @dummy.avatar_updated_at.in_time_zone(@zone_default).to_s, @dummy.avatar.path
end
end
context "using per-thread time zone" do
setup do
rebuild_model :path => ":timestamp", :use_default_time_zone => false
@dummy = Dummy.new
@dummy.avatar = @file
end
should "return a time in the per-thread zone" do
assert_equal @dummy.avatar_updated_at.in_time_zone(@zone).to_s, @dummy.avatar.path
end
end
end
context "An attachment with :hash interpolations" do
setup do
@file = StringIO.new("...")
end
should "raise if no secret is provided" do
@attachment = attachment :path => ":hash"
@attachment.assign @file
assert_raise ArgumentError do
@attachment.path
end
end
context "when secret is set" do
setup do
@attachment = attachment :path => ":hash", :hash_secret => "w00t"
@attachment.stubs(:instance_read).with(:updated_at).returns(Time.at(1234567890))
@attachment.stubs(:instance_read).with(:file_name).returns("bla.txt")
@attachment.instance.id = 1234
@attachment.assign @file
end
should "interpolate the hash data" do
@attachment.expects(:interpolate).with(@attachment.options[:hash_data]).returns("interpolated_stuff")
@attachment.hash
end
should "result in the correct interpolation" do
assert_equal "fake_models/avatars/1234/original/1234567890", @attachment.send(:interpolate,@attachment.options[:hash_data])
end
should "result in a correct hash" do
assert_equal "d22b617d1bf10016aa7d046d16427ae203f39fce", @attachment.path
end
end
end
context "An attachment with a :rails_env interpolation" do
setup do
......
......@@ -90,7 +90,7 @@ end
class FakeModel
attr_accessor :avatar_file_name,
:avatar_file_size,
:avatar_last_updated,
:avatar_updated_at,
:avatar_content_type,
:avatar_fingerprint,
:id
......
......@@ -112,9 +112,25 @@ class InterpolationsTest < Test::Unit::TestCase
should "return the timestamp" do
now = Time.now
zone = 'UTC'
attachment = mock
attachment.expects(:instance_read).with(:updated_at).returns(now)
assert_equal now.to_s, Paperclip::Interpolations.timestamp(attachment, :style)
attachment.expects(:time_zone).returns(zone)
assert_equal now.in_time_zone(zone).to_s, Paperclip::Interpolations.timestamp(attachment, :style)
end
should "return updated_at" do
attachment = mock
seconds_since_epoch = 1234567890
attachment.expects(:updated_at).returns(seconds_since_epoch)
assert_equal seconds_since_epoch, Paperclip::Interpolations.updated_at(attachment, :style)
end
should "return hash" do
attachment = mock
fake_hash = "a_wicked_secure_hash"
attachment.expects(:hash).returns(fake_hash)
assert_equal fake_hash, Paperclip::Interpolations.hash(attachment, :style)
end
should "call all expected interpolations with the given arguments" do
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment