ResultSet aggregates do not respect distinct option
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Launchpad itself |
Fix Released
|
Low
|
Michael Nelson | ||
Storm |
Fix Released
|
High
|
James Henstridge |
Bug Description
The ResultSet.count() method is described as "Get the number of objects represented by this ResultSet.". However, it ignores the ResultSet._distinct flag.
Using the data from the tests/store/base.py tests, consider the following query:
result = store.find(Foo, Foo.id == Link.foo_id)
This result has 6 rows (3 with foo_id=10, 2 with foo_id=20 and 1 with foo_id=30). Issuing result.count() generates a query like:
SELECT COUNT(*) FROM foo WHERE foo.id = link.foo_id
We then configure it to be distinct:
result.
The result set now has 3 items, but result.count() still returns 6 because it is ignoring the setting and using the same query. Some possible correct queries are:
SELECT COUNT(DISTINCT foo.id) FROM foo WHERE foo.id = link.foo_id;
SELECT COUNT(*) FROM (SELECT DISTINCT foo.id FROM foo WHERE foo.id = link.foo_id);
SELECT COUNT(*) FROM foo WHERE foo.id IN (SELECT DISTINCT foo.id FROM foo WHERE foo.id = link.foo_id);
The first works great for a standard count of a table with a single primary key column. The second should work for multi-column primary keys. The third would probably be needed for counts other than "COUNT(*)".
This problem looks like it would also affect the avg() and sum() aggregates for distinct result sets, hence the title.
Related branches
- Jamu Kakar (community): Approve
- Sidnei da Silva (community): Approve
- Jamu Kakar (community): Approve
- Stuart Bishop (community): Approve
- Diff: 204 lines
Changed in storm: | |
importance: | Undecided → High |
Changed in storm: | |
assignee: | nobody → therve |
status: | New → In Progress |
Changed in storm: | |
milestone: | none → 0.15 |
affects: | launchpad → launchpad-foundations |
Changed in storm: | |
status: | Fix Committed → Fix Released |
tags: | added: tech-debt |
tags: |
added: qa-ok removed: qa-needstesting |
Changed in launchpad-foundations: | |
status: | Fix Committed → Fix Released |
The same problem happens with 'limit' value, it's ignored for all aggregated functions.
{{{ select( limit=1) assertEquals( len(list( result) ), 1) assertEquals( result. count() , 2)
result = self.Person.
# EXECUTE: 'SELECT person.age, person.id, person.name, person.ts FROM person ORDER BY Person.name DESC LIMIT 1', ()
self.
# EXECUTE: 'SELECT COUNT(*) FROM person', ()
self.
}}}