transaction log not differentiating NULL values and empty string for char columns

Bug #594876 reported by Patrick Crews
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Drizzle
Fix Released
Undecided
Joe Daly
Dexter
Fix Released
Undecided
Joe Daly

Bug Description

The transaction log is not properly differentiating between NULL values and empty strings ("") for character columns

From the test case below:
We inserted a NULL and "" into the CHAR field, b. These are different values, but both are recorded as "" in the transaction_log output (shown in comments below in the test case

select * from t1;
+---+------+
| a | b |
+---+------+
| 1 | NULL |
| 2 | |
+---+------+
2 rows in set (0 sec)

test case: run with ./test-run --mysqld=--transaction_log_enable

DROP TABLE IF EXISTS t1;

CREATE TABLE t1 (a INT NOT NULL, b CHAR(1000), PRIMARY KEY (a));
INSERT INTO t1 VALUES (1,NULL);

# Check the transaction_log contents:

SELECT PRINT_TRANSACTION_MESSAGE('transaction.log',(select max(entry_offset) from data_dictionary.transaction_log_transactions));

# Comments below are to show output, not part of the test case
# This gives us:
#| transaction_context {
# server_id: 1
# transaction_id: 60356
# start_timestamp: 1276642743948477
# end_timestamp: 1276642743991044
#}
#statement {
# type: INSERT
# start_timestamp: 1276642743948479
# end_timestamp: 1276642743991044
# insert_header {
# table_metadata {
# schema_name: "test"
# table_name: "t4"
# }
# field_metadata {
# type: INTEGER
# name: "a"
# }
# field_metadata {
# type: VARCHAR
# name: "b"
# }
# }
# insert_data {
# segment_id: 1
# end_segment: true
# record {
# insert_value: "1"
# insert_value: ""
# }
# }
#}
# |

INSERT INTO t4 VALUES (2,"");

SELECT PRINT_TRANSACTION_MESSAGE('transaction.log',(select max(entry_offset) from data_dictionary.transaction_log_transactions));
# Gives us:
#| transaction_context {
# server_id: 1
# transaction_id: 60358
# start_timestamp: 1276642757258085
# end_timestamp: 1276642757301771
#}
#statement {
# type: INSERT
# start_timestamp: 1276642757258087
# end_timestamp: 1276642757301770
# insert_header {
# table_metadata {
# schema_name: "test"
# table_name: "t1"
# }
# field_metadata {
# type: INTEGER
# name: "a"
# }
# field_metadata {
# type: VARCHAR
# name: "b"
# }
# }
# insert_data {
# segment_id: 1
# end_segment: true
# record {
# insert_value: "2"
# insert_value: ""
# }
# }
#}
# |

Related branches

Changed in drizzle:
status: New → Confirmed
Revision history for this message
David Shrewsbury (dshrews) wrote :

After looking at this, and confirming with Jay, looks like message::Statement has a design deficiency that doesn't allow us to differentiate between NULL and empty strings (or even 0 for integer fields).

For example, transaction.proto uses InsertRecord to hold the values for INSERT statements:

message InsertRecord
{
  repeated bytes insert_value = 1;
}

An inserted NULL will be represented the same as an empty string (a std::string is used internally).

The fix for this is to add a boolean field (e.g., is_null) to InsertRecord to indicate when a represented value is actually NULL.

This will most likely affect plugins that read the transaction log.

Joe Daly (skinny.moey)
Changed in drizzle:
assignee: nobody → Joe Daly (skinny.moey)
Joe Daly (skinny.moey)
Changed in drizzle:
status: Confirmed → Fix Committed
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.