Comment 6 for bug 767205

Drew Smathers (djfroofy) wrote :

Yes, I plan on resuming work on this this week, since I'd like to get it working for a project at work which is still uploading files through txaws only by dint of some dirty monkey patches.

I began on this branch (https://code.launchpad.net/~djfroofy/txaws/agent-767205) but hit a wall b/c my approach smelled like ripe garbage for some reason.

In sum, it's really hard to work this in while retaining backwards compatibility and still having a meaningful interface.

Problems:

1. Meaningfulness of existing APIs

Here's the current put_object API:

     def put_object(self, bucket, object_name, data, content_type=None,
                   content_length=None, metadata={}, amz_headers={}):
         ...

It's confusing from an API standpoint to overload "data" (which arguably just sounds like "a str" to most devs). One possible solution is a separate API, for example:

    def produce_object(self, bucket, object_name, producer=None, content_type=None,
                       content_md5=None, content_length=None, metadata=None,
                       amz_headers=None):
        ...

Or optionally add more keyword arguments to put_object (seems to make things convoluted that way).

2. Custom Producers

Right now my work is hard coded to one producer but I don't like this. Who knows how the data gets produced? Filesystem? AMQP? 0MQ? pyaudio? Producers should be pluggable through either extension of the Query class or method parameterization.
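To make the "method parameterization" option concrete, here is a minimal sketch. It is not txaws code: FileChunkProducer, CollectingConsumer and this standalone produce_object are hypothetical names, and the producer is only loosely modeled on the shape of Twisted's IBodyProducer (a length attribute plus startProducing(consumer)). It runs synchronously so it needs no Twisted; a real producer would return a Deferred from startProducing.

```python
import io


class FileChunkProducer(object):
    """Hypothetical producer: reads a file-like object in fixed-size chunks."""

    def __init__(self, fileobj, length, chunk_size=4096):
        self.fileobj = fileobj
        self.length = length
        self.chunk_size = chunk_size

    def startProducing(self, consumer):
        # Synchronous stand-in; a real IBodyProducer would return a Deferred.
        while True:
            chunk = self.fileobj.read(self.chunk_size)
            if not chunk:
                break
            consumer.write(chunk)


class CollectingConsumer(object):
    """Stand-in for the HTTP request body; it only records what is written."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(data)


def produce_object(bucket, object_name, producer, consumer):
    # Hypothetical entry point: the caller chooses the producer, so the data
    # source (file, message queue, audio device) is not hard coded here.
    producer.startProducing(consumer)
    return b"".join(consumer.chunks)


payload = b"x" * 10000
producer = FileChunkProducer(io.BytesIO(payload), length=len(payload))
consumer = CollectingConsumer()
body = produce_object("mybucket", "myobject", producer, consumer)
```

Any object with the same two members could be passed instead, which is the pluggability being asked for.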

3. Computing md5

Right now this is internal to the implementation (the user can't control it) and string based. I've done some weird stuff to handle both strings and maybe a producer. I think the user should be allowed to compute the md5 externally (caching it in an md5 file, for example) or at least provide a producer that also implements some known interface (IMD5Generator.generateMD5, or something).
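Both options can be sketched with the standard library alone. The helper content_md5 and the class MD5Producer below are hypothetical names, not part of txaws; generateMD5 just follows the IMD5Generator idea floated above. The only fixed fact assumed is that a Content-MD5 value is the base64 encoding of the raw MD5 digest.

```python
import base64
import hashlib
import io


def content_md5(fileobj, chunk_size=8192):
    """Hash a file-like object in chunks; return the base64-encoded digest."""
    digest = hashlib.md5()
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        digest.update(chunk)
    return base64.b64encode(digest.digest()).decode("ascii")


class MD5Producer(object):
    """Hypothetical producer that can also report its own MD5."""

    def __init__(self, fileobj):
        self.fileobj = fileobj

    def generateMD5(self):
        # Hash from the start, then rewind so the body can still be streamed.
        self.fileobj.seek(0)
        value = content_md5(self.fileobj)
        self.fileobj.seek(0)
        return value


# Option 1: the caller computes (and could cache) the value externally.
external = content_md5(io.BytesIO(b"hello"))

# Option 2: the producer is asked for it.
from_producer = MD5Producer(io.BytesIO(b"hello")).generateMD5()
```

Either value could then be handed to something like the content_md5 keyword in the produce_object signature proposed earlier.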

4. _newclient.py, _oldclient.py makes me feel ill, somehow

The only logical route I could see to basing things on Agent was to essentially rewrite the base Query. The implicit upgrading (if the user has Agent) makes me slightly nervous. Should we default to using the new Query for other calls that don't need streaming? Should we warn the user that eventually a modern version of twisted will be required, and thus deprecate the Query in _oldclient?
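For discussion, the implicit upgrade plus a deprecation warning could look roughly like this. It is only a sketch: choose_query_backend and the "agent" / "getpage" labels are made up for illustration, and the only real API touched is the import of twisted.web.client.Agent, which is guarded so the snippet also runs where Twisted is missing.

```python
import warnings

try:
    # Agent only exists in newer Twisted releases.
    from twisted.web.client import Agent
except ImportError:
    Agent = None


def choose_query_backend(needs_streaming=False):
    """Pick which Query implementation to use (hypothetical labels)."""
    if Agent is not None:
        # Implicit upgrade: use the Agent-based Query whenever it is available.
        return "agent"
    if needs_streaming:
        raise RuntimeError("streaming uploads need a Twisted that has Agent")
    warnings.warn(
        "the getPage-based Query is deprecated; a newer Twisted "
        "will eventually be required",
        PendingDeprecationWarning,
        stacklevel=2,
    )
    return "getpage"
```

Making the choice in one place would at least keep the upgrade behavior easy to see and to override.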

Feedback Please

The above are all essentially my desiderata for the Agent-based s3 interface for uploading objects. I consider multipart uploads etc. outside of the scope, but would probably implement something for that soon after we agree on an interface for using Agent and implement it. I'd like to know others' thoughts on the ideal interface or on safe ways of replacing the current getPage()-like implementation with Agent.