contrail-docker - contrail-analytics - contrail-ansible: redis configuration not consistent in multi-interface setup

Bug #1766889 reported by Bernhard Koessler
This bug affects 1 person
Affects: Juniper Openstack (status tracked in Trunk)

Series  Status      Importance  Assigned to        Milestone
R4.1    Incomplete  High        Bernhard Koessler
R5.0    Incomplete  High        Bernhard Koessler
Trunk   Incomplete  High        Bernhard Koessler

Bug Description

After deploying Contrail Networking 4.1.0 for Ubuntu 16.04 (Ocata) with contrail-charms in a multi-interface setup, the redis configuration for contrail-analytics is not functional in all cases.

root@cldsv00003(analytics):/# netstat -tulpen | grep 6381
tcp 0 0 127.0.0.1:6381 0.0.0.0:* LISTEN 105 651235932 29479/redis-server
tcp 0 0 139.1.150.83:6381 0.0.0.0:* LISTEN 105 651235931 29479/redis-server
root@cldsv00003(analytics):/# tailf /var/log/contrail/contrail-analytics-api.log
04/24/2018 11:37:52 AM [contrail-analytics-api]: Starting agguve part 28 using PartInfo(ip_address=u'172.30.14.2', instance_id=u'0', acq_time=1524569854545985, port=6381)
04/24/2018 11:37:52 AM [contrail-analytics-api]: before res list is prouter
04/24/2018 11:37:52 AM [contrail-analytics-api]: res list is None
04/24/2018 11:37:52 AM [contrail-analytics-api]: before res list is vrouter
04/24/2018 11:37:52 AM [contrail-analytics-api]: res list is None
04/24/2018 11:37:52 AM [contrail-analytics-api]: usr res are None
04/24/2018 11:37:57 AM [contrail-analytics-api]: redis/collector healthcheck failed Error 111 connecting to 172.30.14.2:6381. Connection refused. for RedisInstKey(ip='172.30.14.2', port=6381)
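
The mismatch between the listening sockets shown by netstat and the address in the healthcheck error can be confirmed with a plain TCP probe. The following is a minimal sketch using only the Python standard library; the helper name probe is made up for illustration and is not part of contrail.

```python
import socket


def probe(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError,
        # timeout) when nothing accepts connections on that address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Called as probe("172.30.14.2", 6381) on this host, it would be expected to return False, matching the "Connection refused" error above, while 127.0.0.1 and 139.1.150.83 should succeed according to the netstat output.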

/etc/contrailctl/analytics.conf has the following addresses configured:

[GLOBAL]
controller_nodes = 172.30.14.2,172.30.14.3,172.30.14.4
analyticsdb_nodes = 172.30.14.2,172.30.14.3,172.30.14.4
analytics_nodes = 172.30.14.2,172.30.14.3,172.30.14.4

The interfaces on the server/docker container are as follows (as seen from ansible):
root@cldsv00003(analytics):/# cat << 'EOF' > ansible-test.yaml
> - hosts: all
>   gather_facts: yes
>   tasks:
>     - debug: var=ansible_all_ipv4_addresses
> EOF
root@cldsv00003(analytics):/# ansible-playbook -i contrail-ansible-internal/playbooks/inventory/all-in-one ansible-test.yaml

PLAY [all] *********************************************************************

TASK [setup] *******************************************************************
ok: [localhost]

TASK [debug] *******************************************************************
ok: [localhost] => {
    "ansible_all_ipv4_addresses": [
        "139.1.150.83",
        "10.0.53.1",
        "172.30.14.2",
        "172.17.0.1"
    ]
}

PLAY RECAP *********************************************************************
localhost : ok=2 changed=0 unreachable=0 failed=0

The redis configuration shows that redis binds only to the first address in that list (139.1.150.83), which is not the address used in the contrail configuration (172.30.14.2).

/# grep -v "^#\|^\s*$" /etc/redis/redis.conf
daemonize yes
pidfile /var/run/redis/redis-server.pid
port 6381
tcp-backlog 511
bind 139.1.150.83 127.0.0.1
timeout 0
tcp-keepalive 0
loglevel notice
logfile /var/log/redis/redis-server.log
databases 16
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dir /var/lib/redis
slave-serve-stale-data yes
slave-read-only yes
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-disable-tcp-nodelay no
slave-priority 100
appendonly no
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes
lua-time-limit 15000
slowlog-log-slower-than 10000
slowlog-max-len 128
latency-monitor-threshold 0
notify-keyspace-events ""
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-entries 512
list-max-ziplist-value 64
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
hll-sparse-max-bytes 3000
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes
root@cldsv00003(analytics):/#

A solution would be to either:
- configure localhost as the redis address for all services that use redis on the analytics container, OR
- make sure the redis bind configuration uses an address from the configured node list
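
The second option could look roughly like the sketch below: intersect the addresses found on the host with the configured analytics_nodes and bind redis to the result plus loopback. This is only an illustration of the idea in Python, not the code used in contrail-ansible-internal; the function name pick_redis_bind is hypothetical, and the input values are the ones from this report.

```python
def pick_redis_bind(host_addrs, configured_nodes):
    """Return the host addresses that also appear in the configured
    node list, keeping the order reported for the host."""
    configured = set(configured_nodes)
    return [addr for addr in host_addrs if addr in configured]


# Values taken from the ansible facts and the [GLOBAL] section above.
host_addrs = ["139.1.150.83", "10.0.53.1", "172.30.14.2", "172.17.0.1"]
analytics_nodes = "172.30.14.2,172.30.14.3,172.30.14.4".split(",")

matches = pick_redis_bind(host_addrs, analytics_nodes)
bind_line = "bind " + " ".join(matches + ["127.0.0.1"])
print(bind_line)  # bind 172.30.14.2 127.0.0.1
```

With these inputs the intersection contains exactly one address, so the bind line would no longer depend on the order in which the interfaces are reported.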

References:
https://github.com/Juniper/contrail-ansible-internal/blob/4411e625365b744e07d95ca6c4180ec1ea2f30a9/playbooks/roles/redis/tasks/patch-redis-conf.yml#L7

https://github.com/Juniper/contrail-ansible-internal/blob/4d2569e858a9ea08296266f9c7235af975bc49f1/playbooks/roles/contrail/common/vars/main.yml#L85

Changed in juniperopenstack:
assignee: nobody → Andrey Pavlov (apavlov-e)
Changed in juniperopenstack:
milestone: none → r4.1.1.0
Bernhard Koessler (bkoessler) wrote :

To clarify for all services on the analytics container:
- analytics-api and alarm-gen have [REDIS] redis_uve_list with the actual cluster address:port list --> if this is set to an address that redis is not listening on it will fail
- collector and query engine have [REDIS] server = 127.0.0.1 --> this works as 127.0.0.1 is always configured as bind interface
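
The failure condition for analytics-api and alarm-gen can be checked mechanically: any redis_uve_list entry that points at a local address missing from the redis bind line will be refused. The sketch below is a hedged illustration in Python; the function names are hypothetical, and it assumes redis_uve_list is a whitespace- or comma-separated list of ip:port entries and that redis.conf contains an explicit bind line.

```python
def parse_bind(redis_conf_text):
    """Collect all addresses listed on 'bind' lines of a redis.conf."""
    addrs = []
    for line in redis_conf_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "bind":
            addrs.extend(parts[1:])
    return addrs


def misdirected_entries(redis_uve_list, bind_addrs, local_addrs):
    """Return the ip:port entries that target this host on an address
    redis is not bound to."""
    if "0.0.0.0" in bind_addrs:
        return []  # redis listens on every IPv4 address
    bad = []
    for entry in redis_uve_list.replace(",", " ").split():
        host, _, _port = entry.rpartition(":")
        if host in local_addrs and host not in bind_addrs:
            bad.append(entry)
    return bad


# Values from this report; the redis_uve_list value is assumed from
# analytics_nodes and port 6381.
conf = "port 6381\nbind 139.1.150.83 127.0.0.1\n"
local = ["139.1.150.83", "10.0.53.1", "172.30.14.2", "172.17.0.1"]
uve_list = "172.30.14.2:6381 172.30.14.3:6381 172.30.14.4:6381"
print(misdirected_entries(uve_list, parse_bind(conf), local))
```

For the setup in this report the check flags 172.30.14.2:6381, the same endpoint named in the healthcheck error.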

information type: Proprietary → Public
Andrey Pavlov (apavlov-e) wrote :

I can't understand how this is possible.

I get the following output:
        "ansible_all_ipv4_addresses": [
            "172.17.0.1",
            "10.0.11.20",
            "10.0.140.1",
            "10.0.10.20"
        ],

and from contrailctl:

controller_nodes = 10.0.10.20
analyticsdb_nodes = 10.0.10.20
analytics_nodes = 10.0.10.20

and I have the correct address in redis.conf:
bind 10.0.10.20 127.0.0.1

This behavior is expected: the intersection of the two address lists can give only one correct result.

Can we have more information from that setup?

Andrey Pavlov (apavlov-e) wrote :

I'm not sure whether changing redis_ips for analytics-api and alarm-gen to 127.0.0.1 is a good fix,
and I don't understand how to verify it.
Such a fix would have to be applied to contrail-ansible-internal and could only be available in the next 4.1.1 build.

Andrey Pavlov (apavlov-e) wrote :

Waiting for more logs/info from the customer.
