unicode support for doc_ids/content buggy and or inconsistent

Bug #988856 reported by Samuele Pedroni
This bug affects 1 person
Affects  Status       Importance  Assigned to  Milestone
U1DB     In Progress  High        Unassigned

Bug Description

Right now we get the following behavior:

Python 2.7.3 (default, Apr 10 2012, 22:21:37)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import u1db
>>> x = u1db.open('foo.db', create=True)
>>> x.create_doc('{}', doc_id=u"\xab")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xab' in position 9: ordinal not in range(128)
>>> x.create_doc(u'{"\xab": 0}', doc_id=u"ab")
Document(ab, 514785fd3a664695b5fe15fc261da0fb:1, u'{"\xab": 0}')

We need to decide what we really want here: don't take unicode at all, accept (byte)strings that are UTF-8, accept ASCII only for doc_ids?
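
To make the "accept (byte)strings that are UTF-8" option concrete, here is a minimal sketch of what normalizing input at the API boundary could look like. It is written for Python 3 and is not u1db code; the helper name `to_text` is hypothetical and only illustrates the idea of decoding strictly and failing early.

```python
def to_text(value):
    """Return value as text, decoding byte strings as strict UTF-8.

    Hypothetical helper for illustration only; not part of u1db.
    """
    if isinstance(value, bytes):
        # Strict decoding: invalid UTF-8 raises UnicodeDecodeError here,
        # at the boundary, instead of failing later in some internal step.
        return value.decode("utf-8")
    if isinstance(value, str):
        return value
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)


print(to_text(b"\xc2\xab") == "\xab")  # UTF-8 bytes for U+00AB decode to the same text
```

With something like this applied to both doc_ids and content, the two calls in the session above would at least be treated the same way, whichever policy is chosen.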

Changed in u1db:
status: New → Confirmed
importance: Undecided → High
John Lenton (chipaca) wrote :

What's the use case for non-ascii doc ids?
I'm tempted to say doc ids should be non-whitespace printable ascii. Anything else will get weird very quickly; even that has some concerns.
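
A rough sketch of the "non-whitespace printable ascii" rule, assuming it means the code points 0x21 through 0x7E. The function name `is_valid_doc_id` is hypothetical and is not part of u1db; rejecting the empty string is an extra assumption.

```python
def is_valid_doc_id(doc_id):
    """Check that doc_id is non-empty and uses only printable,
    non-whitespace ASCII (0x21 '!' through 0x7E '~').

    Hypothetical helper for illustration only; not part of u1db.
    """
    return bool(doc_id) and all(0x21 <= ord(ch) <= 0x7E for ch in doc_id)


print(is_valid_doc_id("ab"))    # plain ASCII id
print(is_valid_doc_id("\xab"))  # the id from the traceback in the description
```

Under this rule the u"\xab" id from the description would be rejected up front with a clear error rather than surfacing as a UnicodeEncodeError.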

Changed in u1db:
assignee: nobody → Eric Casteleijn (thisfred)
status: Confirmed → In Progress
John A Meinel (jameinel) wrote :

I'm pretty sure the bits that were covered in this bug are addressed. If we find more unicode bugs, we should just file new bugs.

Changed in u1db:
status: In Progress → Fix Released
Changed in u1db:
status: Fix Released → Fix Committed
summary: - unicode support for doc_ids/content buggy and or incosistent
+ unicode support for doc_ids/content buggy and or inconsistent
Changed in u1db:
status: Fix Committed → In Progress
Changed in u1db:
assignee: Eric Casteleijn (thisfred) → nobody