BibleGateway importer crashes on non unicode urls

Bug #1251437 reported by Phill
This bug affects 1 person
Affects   Status                    Importance   Assigned to    Milestone
OpenLP    Status tracked in Trunk
2.0       Fix Released              Medium       Phill
Trunk     Fix Released              Medium       Tomas Groth

Bug Description

The user tried to import "Dette er bibelen på dansk" from BibleGateway and got the following error. Reported on 2.0.3, confirmed on trunk.

See: http://support.openlp.org/issues/2139

*OpenLP Bug Report*
Version: {u'full': u'2.0.3', u'version': u'2.0.3', u'build': None}

--- Details of the Exception. ---
tried to import a Bible

--- Exception Traceback ---
Traceback (most recent call last):
  File D:\OpenLP_Code\2.0\build\pyi.win32\OpenLP\out00-PYZ.pyz\openlp.core.ui.wizard, line 188, in onCurrentIdChanged
  File D:\OpenLP_Code\2.0\build\pyi.win32\OpenLP\out00-PYZ.pyz\openlp.plugins.bibles.forms.bibleimportform, line 714, in performWizard
  File D:\OpenLP_Code\2.0\build\pyi.win32\OpenLP\out00-PYZ.pyz\openlp.plugins.bibles.lib.http, line 537, in do_import
  File D:\OpenLP_Code\2.0\build\pyi.win32\OpenLP\out00-PYZ.pyz\openlp.plugins.bibles.lib.http, line 276, in get_books_from_http
  File D:\OpenLP_Code\2.0\build\pyi.win32\OpenLP\out00-PYZ.pyz\openlp.core.utils, line 467, in get_web_page
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 54: ordinal not in range(128)

--- System information ---
Platform: Windows-7-6.1.7601-SP1

--- Library Versions ---
Python: 2.7.3
Qt4: 4.8.3
Phonon: 4.4.0
PyQt4: 4.9.5
QtWebkit: 534.34
SQLAlchemy: 0.7.7
SQLAlchemy Migrate: 0.7.2
BeautifulSoup: 3.2.1
lxml: 2.3.0
Chardet: 1.0.1
PyEnchant: 1.6.5
PySQLite: 1.0.1
Mako: 0.7.0
pyUNO bridge: -

Please write the contents of the bug report in English, as the developers of OpenLP are of many different nationalities.
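
For context, byte 0xc3 is the first byte of the UTF-8 encoding of "å" (as in "på"), which may be why a Danish Bible name triggers the error. The snippet below is only an illustrative sketch, not code from OpenLP: the variable names are made up, and it calls decode('ascii') explicitly, whereas in the report the decoding presumably happens implicitly inside get_web_page.

```python
# -*- coding: utf-8 -*-
# Illustrative only: reproduce the same class of error with a short string.
name = u'p\u00e5'                 # the Danish word "på"
encoded = name.encode('utf-8')    # UTF-8 byte string: 'p', 0xc3, 0xa5

message = None
try:
    # Decoding UTF-8 bytes with the ASCII codec fails on the first
    # byte that is outside the range 0-127.
    encoded.decode('ascii')
except UnicodeDecodeError as error:
    message = str(error)

print(message)
```

The failing position here differs from the report (54) because the example string is much shorter than the real URL.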


Phill (phill-ridout)
tags: added: bible bible-import support-system
Phill (phill-ridout)
summary: - BibleGateway importer crashes on non ASCII names
+ BibleGateway importer crashes on non unicode urls
Phill (phill-ridout) wrote :

It appears that when a URL is passed to urllib2.urlopen as a unicode string, geturl() returns unicode. However, the URL that we request in the BibleGateway importer is redirected. In this case the redirect uses a UTF-8 URL, and (I'm guessing) urllib2 treats it as a UTF-8 encoded byte string, so geturl() returns a UTF-8 encoded str. For the 2.0 branch I've just detected whether geturl() returns unicode or not. In trunk it might be wise to encode URLs as UTF-8 before we request them?
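
A rough sketch of the kind of check described for the 2.0 branch, assuming the goal is simply to get a unicode string whichever type comes back. The helper name url_to_text and the example.org URLs are hypothetical and are not taken from the actual fix.

```python
# -*- coding: utf-8 -*-
def url_to_text(url, encoding='utf-8'):
    """Return url as a text (unicode) string.

    Byte strings are assumed to be UTF-8 encoded, which is a guess
    about what the redirect produces, not a guarantee.
    """
    if isinstance(url, bytes):    # on Python 2, bytes is an alias for str
        return url.decode(encoding)
    return url


# Hypothetical values standing in for what geturl() might return.
redirected = b'http://example.org/p\xc3\xa5'    # UTF-8 encoded byte string
requested = u'http://example.org/p\u00e5'       # already unicode

# Both forms normalise to the same text, so later comparisons or string
# concatenation no longer trigger an implicit ASCII decode.
same = url_to_text(redirected) == url_to_text(requested)
```

Normalising to one type right after the request keeps the rest of the importer from having to care which form the redirect produced.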
