Fluentd vs. Logstash for OpenStack Log Management

508

Published on

Slide at OpenStack Summit Tokyo 2015.
Session Info: http://sched.co/49tt
Video: https://www.youtube.com/watch?v=1ye0-sityBw

Published in: Software
0 Comments
6 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
508
On Slideshare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
1
Comments
0
Likes
6
Embeds 0
No embeds

No notes for slide

Fluentd vs. Logstash for OpenStack Log Management

  1. 1. Fluentd vs. Logstash Masaki Matsushita NTT Communications
  2. 2. About Me ● Masaki MATSUSHITA ● Software Engineer at ○ We are providing Internet access here! ● Github: mmasaki Twitter: @_mmasaki ● 16 Commits in Liberty ○ Trove, oslo_log, oslo_config ● CRuby Commiter ○ 100+ commits for performance improvement 2
  3. 3. What are Log Collectors? ● Provide pluggable and unified logging layer Without Log Collectors With Log Collectors Images from http://fluentd.org/ 3
  4. 4. Input, Filter and Output 4 Input Plugins tail syslog Filter Plugins grep hostname Output Plugins InfluxDB Elasticsearch ● They are implemented as plugins ● Can be replaced easily Log FIles Components
  5. 5. Two Popular Log Collectors ● Fluentd ○ Written in CRuby ○ Used in Kubernetes ○ Maintained by Treasure Data Inc. ● Logstash ○ Written in JRuby ○ Maintained by elastic.co ● They have similar features ● Which one is better for you? 5
  6. 6. Agenda ● Comparisons ○ Configuration ○ Supported Plugins ○ Performance ○ Transport Protocol ● Integrate OpenStack with Fluentd/Logstash ○ Considering High Availability 6
  7. 7. Configuration: Fluentd ● Every inputs are tagged ● Logs will be routed by tag nova-api.log (tag: openstack.nova) cinder-api.log (tag: openstack.cinder) <match openstack.nova> <match openstack.cinder> Filter/Route 7
  8. 8. Fluentd Configuration: Input <source> @type tail path /var/log/nova/nova-api.log tag openstack.nova </source> Example of tailing nova-api log ● Every inputs will be tagged 8
  9. 9. Fluentd Configuration: Output <match openstack.nova> # nova related logs @type elasticsearch host example.com </match> <match openstack.*> # all other OpenStack related logs @type influxdb # … </match> Routed by tag (First match is priority) Wildcards can be used 9
  10. 10. Fluentd Configuration: Copy <match openstack.*> @type copy <store> @type influxdb </store> <store> @type elasticsearch </store> </match> Copy plugin enables multiple outputs for a tag Copied Output tag: openstack.* 10
  11. 11. Logstash Configuration ● No tags ● All inputs will be aggregated ● Logs will be scattered to outputs nova-api.log cinder-api.log Filter/Aggregate aggregated logs 11
  12. 12. Logstash Configuration input { file { path => “/var/log/nova/*.log” } file { path => “/var/log/cinder/*.log” } } output { elasticsearch { hosts => [“example.com”] } influxdb { host => “example.com”... } } 12
  13. 13. Case 1: Separated Streams Input1 Input2 Input3 Output2 Output3 Output1 ● Handle multiple streams separately 13
  14. 14. Case 1: Separated Streams Fluentd: Simple matching by tag <match input.input1> @type output1 </match> <match input.input2> @type output2 </match> <match input.input3> @type output3 </match> Logstash: Conditional Outputs output { if [type] == “input1” { output1 {} } else if [type] == “input2” { output2 {} } else if [type] == “input3” { output3 {} } } Need to split aggregated logs 14
  15. 15. Case 2: Aggregated Streams Input1 Input2 Input3 Output2 Output3 Output1 ● Streams will be aggregated and scattered 15
  16. 16. Case 2: Aggregated Streams Fluentd: Copy plugins is needed <match input.*> @type copy <store> @type output1 </store> <store> @type output2 </store> <store> @type output3 </store> </match> Logstash: Quite simple output { output1 {} output2 {} output3 {} } 16
  17. 17. Configuration ● Fluentd ○ Routed by simple tag matching ○ Suited to handle log streams separately ● Logstash ○ Logs are aggregated ○ Suited to handle logs in gather-scatter style 17
  18. 18. Plugins ● Both provide many plugins ○ Fluentd: 300+, Logstash: 200+ ● Popular plugins are bundled with Logstash ○ They are maintained by the Logstash project ● Fluentd contains only minimal plugins ○ Most plugins are maintained by individuals ● Plugins can be installed easily by one command 18
  19. 19. Performance ● Depends on circumstances ● More than enough for OpenStack logs ○ Both can handle 10000+ logs/s ● Applying heavy filters is not a good idea ● CRuby is slow because of GVL? ○ GVL: Global VM (Interpreter) Lock ○ It’s not true for IO bound loads 19
  20. 20. GVL on IO bound loads ● IO operation can be performed in parallel 20 Thread 1 Thread 2 Idle : User Space: Kernel Space: Actual Read/Write Ruby Code Execution GVL Released/ Acquired IO operations in parallel
  21. 21. Transport Protocol ● Both collectors have their own transport protocol. ○ Failure Detection and Fallback ● Logstash: Lumberjack protocol ○ Active-Standby only ● Fluentd: forward protocol ○ Active-Active (Load Balancing), Active-Standby ○ Some additional features 21
  22. 22. Logstash Transport: lumberjack ● Active-Standby lumberjack { #config@source hosts => [ “primary”, “secondary” ] port => 1234 ssl_certificate => … } primary secondary source secondary is used when primary fails Fail Fallback 22
  23. 23. Fluentd Transport: forward ● Active-Active (Load Balancing) <match openstack.*> type forward <server> host dest1 </server> <server> host dest2 </server> </match> source dest1 dest2 Equally balanced outputs 23
  24. 24. Fluentd Transport: forward ● Active-Standby <match openstack.*> type forward <server> host primary </server> <server> host secondary standby </server> </match> primary secondary source Fail Fallback 24
  25. 25. Fluentd Transport: forward ● Weighted Load Balancing <match openstack.*> type forward <server> host dest1 weight 60 </server> <server> host dest2 weight 40 </server> </match> source dest1 dest2 60% 40% 25
  26. 26. Fluentd Transport: forward ● At-least-one Semantics (may affect performance) <match openstack.*> type forward require_ack_response <server> host dest </server> </match> destsource send logs ACK Logs are re-transmitted until ACK is received 26
  27. 27. Transport Protocol ● Both can be configured as Active-Standby mode. ● Fluentd has great features: ○ Active-Active Mode (Load Balancing) ○ At-least-one Semantics ○ Weighted Load Balancing 27
  28. 28. Forwarders ● Fluentd/Logstash have their own “forwarders” ○ Lightweight implementation written in Golang ○ Low memory consumption ○ One binary: Less dependent and easy to install 28 Node Tail log files Forwarder Log AggregatorForward/ Lumberjack Protocol
  29. 29. Forwarders: Config Example fluentd-forwarder: [fluentd-forwarder] to = fluent://fluentd1:24224 to = fluent://fluentd2:24224 logstash-forwarder: "network": { "servers": [ "logstash1:5043", "logstash2:5043" ] }Always send logs to both servers. Pick one active server and send logs only to it. Fallback to another server on failure. 29
  30. 30. Integration with OpenStack ● Tail log files by local Fluentd/Logstash ○ must parse many form of log files ● Rsyslog ○ installed by default in most distribution ○ can receive logs in JSON format ● Direct output from oslo_log ○ oslo_log: logging library used by components ○ Logging without any parsing 30
  31. 31. Log Aggregators OpenStack nodes Tail Log Files 31 Tail log files Forward Protocol dest1 dest2
  32. 32. Tail Log Files • Must handle many log files… syslog kern.log apache2/access.log apache2/error.log keystone/keystone-all.log keystone/keystone-manage.log keystone/keystone.log cinder/cinder-api.log cinder/cinder-scheduler.log neutron/neutron-server.log neutron/neutron-server.log nova/nova-api.log nova/nova-conductor.log nova/nova-consoleauth.log nova/nova-manage.log nova/nova-novncproxy.log nova/nova-scheduler.log mysql/error.log mysql/mysql-slow.log mysql.log mysql.err nova/nova-compute.log nova/nova-manage.log... 32
  33. 33. Tail Log Files • But you can use wildcard Fluentd: <source> type tail path /var/log/nova/*.log tag openstack.nova </source> Logstash: input { file { path => [“/var/log/nova/*.log”] } } 33
  34. 34. Parse Text Log ● Welcome to regular expression hell! <source> type tail # or syslog path /var/log/nova/nova-api.log format /^(?<asctime>.+) (?<process>d+) (?<loglevel>w+) (? <objname>S+)( [(-|(?<request_id>.+?) (?<user_identity>.+))])? ((?<remote>S*) "(?<method>S+) (?<path>[^"]*) S*?" status: (? <code>d*) len: (?<size>d*) time: (?<res_time>S)|(?<message>. *))/ </source> 34
  35. 35. Log Aggregators OpenStack nodes Rsyslog 35 via /dev/log Syslog Protocol (TCP or UDP) rsyslog
  36. 36. Rsyslog: Logging.conf ● Logging Configuration in detail ● Handler: Syslog, Formatter: JSON # /etc/{nova,cinder…}/logging.conf [handler_syslog] class = handlers.SysLogHandler args = ('/dev/log', handlers.SysLogHandler.LOG_LOCAL1) formatter = json [formatter_json] class = oslo_log.formatters.JSONFormatter 36
  37. 37. Example Output: JSONFormatter { "levelname": "INFO", "funcname": "start", "message": "Starting conductor node (version 13.0.0)", "msg": "Starting %(topic)s node (version %(version)s)", "asctime": "2015-09-29 18:29:57,690", "relative_created": 2454.8499584198, "process": 25204, "created": 1443518997.690932, "thread": 140119466896752, "name": "nova.service", "process_name": "MainProcess", "thread_name": "GreenThread-1", ... 37
  38. 38. Syslog Facilities ● Assignment of local0..7 Facilities for components ● Logs are tagged as like “syslog.local0” in Fluentd ● Example: ○ local0: Keystone ○ local1: Nova ○ local2: Cinder ○ local3: Neutron ○ local4: Glance 38
  39. 39. Rsyslog: Config@OpenStack nodes ● Active-Standby Configuration # /etc/rsyslog.d/rsyslog.conf user.* @@primary:5140 $ActionExecOnlyWhenPreviousIsSuspended on &@@secondary:5140 39
  40. 40. Rsyslog: Config@Aggregator Fluentd: <source> type syslog port 5140 protocol_type tcp format json tag syslog </source> Logstash: input { syslog { codec => json port => 5140 } } Listen on both TCP and UDP Specify TCP or UDP 40
  41. 41. Rsyslog: Config@Aggregator Fluentd: <source> type syslog port 5140 protocol_type tcp format json tag syslog </source> Logstash: input { syslog { codec => json port => 5140 } } 41
  42. 42. Log AggregatorsOpenStack nodes 42 via FluentHandler Forward Protocol Direct output from oslo_log Local Fluentd for buffering/load balancing (Logstash also can be used)
  43. 43. Direct output from oslo_log # logging.conf: [handler_fluent] class = fluent.handler.FluentHandler # fluent-logger formatter = fluent args = (’openstack.nova', 'localhost', 24224) [formatter_fluent] class = fluent.handler.FluentFormatter # our Blueprint 43 Format logs as Dictionary
  44. 44. Our BP in oslo_log: FluentFormatter { "hostname":"allinone-vivid", "extra":{"project":"unknown","version":"unknown"}, "process_name":"MainProcess", "module":"wsgi", "message":"(4132) wsgi starting up on http://0.0.0.0:8774/", "filename":"wsgi.py", "name":"nova.osapi_compute.wsgi.server", "level":"INFO", "traceback":null, "funcname":"server", "time":"2015-10-15 10:09:12,255" } Don’t need to parse! 44
  45. 45. Conclusion ● Log Handling ○ Fluentd: Logs are distinguished by tag ○ Logstash: No tags. Logs are aggregated ● Transport Protocol ○ Both supports active-standby mode ○ Fluentd supports some additional features ■ Client-side load balancing (Active-Active) ■ At-least-one semantics ■ Weighted load balancing 45
  46. 46. Conclusion ● Integration with OpenStack ○ Tail log files: regular expression hell ○ Rsyslog: No agents are needed ○ Direct output from oslo_log w/o any parsing ○ Review is welcome for our Blueprint (oslo_log: fluent-formatter) 46
  47. 47. Thank you! Please visit our booth! Robot Racing over WebRTC! →
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×