Full table scans very slow in OQGRAPH

Bug #635745 reported by Philip Stoev
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MariaDB
Won't Fix
High
Unassigned
OQGRAPH
Fix Committed
Wishlist
Antony T Curtis

Bug Description

Full table scans, such as those used in the SELECT * FROM table query issued by mysqldump are extremely slow. A 1-million node table takes more than 30 minutes to dump, which means less than 1000 rows per second. This makes it practically impossible to dump tables at server shutdown for the purpose of obtaining some data persistency.

The backtrace is as follows:

#0 open_query::edges_cursor::fetch_row (this=0xa735afd8, row_info=..., result=...) at graphcore.cc:1070
#1 0x0011ff58 in open_query::oqgraph::fetch_row (this=0xa736a618, result=...) at graphcore.cc:935
#2 0x0011e3c2 in ha_oqgraph::rnd_next (this=0xa735f8f0, buf=0xa735fc38 <incomplete sequence \343>) at ha_oqgraph.cc:810
#3 0x081a2fc2 in handler::ha_rnd_next (this=0xa735f8f0, buf=0xa735fc38 <incomplete sequence \343>) at sql_class.h:3281
#4 0x0839c22d in rr_sequential (info=0xa735cf60) at records.cc:390
#5 0x08303d5e in sub_select (join=0xa735b578, join_tab=0xa735cf20, end_of_records=false) at sql_select.cc:11665
#6 0x083038ab in do_select (join=0xa735b578, fields=0xb2cfd98, table=0x0, procedure=0x0) at sql_select.cc:11416
#7 0x082ecb90 in JOIN::exec (this=0xa735b578) at sql_select.cc:2328
#8 0x082ed2ad in mysql_select (thd=0xb2ce3c0, rref_pointer_array=0xb2cfdfc, tables=0xa735b308, wild_num=1, fields=..., conds=0x0, og_num=0, order=0x0,
    group=0x0, having=0x0, proc_param=0x0, select_options=2149894656, result=0xa735b560, unit=0xb2cfa80, select_lex=0xb2cfd04) at sql_select.cc:2528
#9 0x082e5b41 in handle_select (thd=0xb2ce3c0, lex=0xb2cfa24, result=0xa735b560, setup_tables_done_option=0) at sql_select.cc:280
#10 0x08289955 in execute_sqlcom_select (thd=0xb2ce3c0, all_tables=0xa735b308) at sql_parse.cc:5105
#11 0x08280292 in mysql_execute_command (thd=0xb2ce3c0) at sql_parse.cc:2288
#12 0x0828bb47 in mysql_parse (thd=0xb2ce3c0, inBuf=0xa735b168 "SELECT /*!40001 SQL_NO_CACHE */ * FROM `oqgraph_bulgaria`", length=57,
    found_semicolon=0xa746f22c) at sql_parse.cc:6055
#13 0x0827dca7 in dispatch_command (command=COM_QUERY, thd=0xb2ce3c0, packet=0xb31f7d1 "57150\t276446930\001\061\373\373\031", packet_length=57)
    at sql_parse.cc:1204
#14 0x0827d103 in do_command (thd=0xb2ce3c0) at sql_parse.cc:898
#15 0x0827a1ec in handle_one_connection (arg=0xb2ce3c0) at sql_connect.cc:1154
#16 0x00bea919 in start_thread () from /lib/libpthread.so.0
#17 0x00b2ccbe in clone () from /lib/libc.so.6

And looping seems to happen here:

(gdb) list
1065 int edges_cursor::fetch_row(const row &row_info, row &result)
1066 {
1067 edge_iterator it, end;
1068 reference ref;
1069 size_t count= position;
1070 for (tie(it, end)= edges(share->g); count && it != end; ++it, --count)
1071 ;
1072 if (it != end)
1073 ref= reference(position+1, *it);
1074 if (int res= fetch_row(row_info, result, ref))

bzr version-info:

revision-id: <email address hidden>
date: 2010-08-26 16:20:27 +0300
build-date: 2010-09-11 16:30:45 +0300
revno: 2849
branch-nick: maria-5.2

Revision history for this message
Arjen Lentz (arjen-lentz) wrote :

While not disregarding the issue, your last argument makes no sense, unless you qualify it with use of the word 'some'. But I'd suggest you either have persistence or you don't, just like a woman can't be "somewhat" pregnant ;-)
Taking a mysqldump from the Mk.II oqgraph implementation is merely a snapshot - if the server were to fail later, you'd miss data that can only be retrieved through replaying the binlog. In addition, oqgraph data tends to reference other tables with additional data. So typically, another table actually has all the data including the link references - the oqgraph table duplicates the link columns for quick searching. This duplication is accomplished through (for instance) an INSERT ... SELECT ... on startup and then either periodically updated or kept in sync via triggers while the server is running.
So, with an actual deployment, there a) is persistence of the dataset "somewhere else", and b) the oqgraph table can be excluded from any mysqldump.

Changed in oqgraph:
importance: Undecided → Wishlist
summary: - Full table scans very slow
+ Full table scans very slow in OQGRAPH
Changed in oqgraph:
status: New → Triaged
Revision history for this message
Philip Stoev (pstoev-askmonty) wrote :

I understand your two-table scenario. It would be nice if it is documented better. I will use it in my future testing.

My point were situations where building the graph would be costly, so people may prefer to dump and restore it rather than build it on server startup. At this time, once your graph is sufficiently large, it is not possible to take any data out of it, including via ALTER TABLE ENGINE , so keeping a copy in some other engine is imperative.

Changed in maria:
milestone: none → 5.2
Revision history for this message
Philip Stoev (pstoev-askmonty) wrote :

To clarify, the mysqldump of a 1M node table did not complete within 1 hour, becoming progressively slower to a point where it seems to have halted altogether. A backup tool that was not instructed to exclude oqgraph tables may hang and fail to backup the rest of the database.

Revision history for this message
Arjen Lentz (arjen-lentz) wrote :

yea. I agree that regardless of other practicalities, a plain "SELECT * FROM table" (no WHERE) should be fast. Set to confirmed and high priority.

Changed in maria:
status: New → Confirmed
importance: Undecided → High
Revision history for this message
Kristian Nielsen (knielsen) wrote :

It's been >1 year with no activity from upstream. There seems little point in keeping this open for mariadb - there are no plans from mariadb project to take over maintenance of OQGraph.

Changed in maria:
status: Confirmed → Won't Fix
Revision history for this message
Arjen Lentz (arjen-lentz) wrote :

Antony's v3 prototype covers this aspect also, because of its different architecture.

Changed in oqgraph:
assignee: nobody → Antony T Curtis (atcurtis)
status: Triaged → Confirmed
Revision history for this message
Arjen Lentz (arjen-lentz) wrote :

This bug relates to OQGraph v2.
v3 architecture is different and can't have this problem by design.

Changed in oqgraph:
status: Confirmed → Fix Committed
Revision history for this message
Arjen Lentz (arjen-lentz) wrote :

What we do need to do is have mysqldump avoid OQGRAPH tables, just like it needs to avoid VIEWs.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.