Fluentd unified logging layer
RubyConf 2014: Building the Unified Logging Layer with Fluentd and Ruby

Transcript

  • 1. Kiyoto Tamura Nov 17, 2014 RubyConf 2014 Fluentd Unified Logging Layer
  • 2. whoami: Kiyoto Tamura. GitHub/Twitter: kiyoto/kiyototamura. Treasure Data, Inc., Director of Developer Relations. Fluentd maintainer.
  • 3. a ruby n00b
  • 4. Fluentd n00b too
  • 5. why me? Busy writing code! Just gave a talk! I’m giving a talk! Busy writing code! Busy as CTO! San Diego’s nice!
  • 6. What’s Fluentd? An extensible & reliable data collection tool, like syslogd. Extensible: simple core + plugins. Reliable: buffering, HA (failover), load balancing, etc.
  • 7. data collection tool
  • 8. Your system. Sources: access logs (Apache), app logs (frontend, backend), system logs (syslogd). Destinations: metrics (Blueflood), analysis (MongoDB, MySQL, Hadoop), archiving (Amazon S3). Glue in between: bash scripts, ruby scripts, python scripts, log file, cron, rsync, custom logger, other custom scripts... ✓ duplicated code for error handling... ✓ messy code for retrying mechanism...
  • 9. (this is painful!!!)
  • 10. Your system, with a single filter / buffer / route layer in between. Sources: access logs (Apache), app logs (frontend, backend), system logs (syslogd). Destinations: metrics (Blueflood), analysis (MongoDB, MySQL, Hadoop), archiving (Amazon S3).
  • 11. extensible
  • 12. Core vs. plugins. Core: Divide & Conquer, Buffering & Retries, Error Handling, Message Routing, Parallelism. Plugins: Read Data, Parse Data, Buffer Data, Write Data, Format Data.
  • 13. Core (common concerns): Divide & Conquer, Buffering & Retries, Error Handling, Message Routing, Parallelism. Plugins (use case specific): Read Data, Parse Data, Buffer Data, Write Data, Format Data.
  • 14. reliable
  • 15. reliable data transfer
  • 16. Divide & Conquer & Retry error retry retry error retry retry
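The "divide & conquer & retry" idea can be sketched in plain Ruby as follows. This is a minimal illustration, not Fluentd's implementation: `flush_chunks`, `max_retries`, `base_wait`, and `sleeper` are hypothetical names, and the exponential backoff policy is an assumption. The point is that data is split into chunks and each chunk is retried on its own, so one failing chunk does not block the rest.

```ruby
# Hypothetical helper (not part of Fluentd): deliver each chunk independently,
# retrying failures with exponential backoff. `sleeper` is injectable so the
# sketch can be exercised without real waiting.
def flush_chunks(chunks, max_retries: 3, base_wait: 0.5, sleeper: ->(s) { sleep(s) })
  failed = []
  chunks.each do |chunk|
    attempts = 0
    begin
      yield chunk # the actual write, supplied by the caller
    rescue StandardError
      attempts += 1
      if attempts <= max_retries
        # wait base_wait, 2 * base_wait, 4 * base_wait, ... before retrying
        sleeper.call(base_wait * (2**(attempts - 1)))
        retry
      end
      failed << chunk # give up on this chunk only; later chunks still run
    end
  end
  failed
end
```

A caller passes the write as a block, for example `flush_chunks(chunks) { |c| socket.write(c) }`, and gets back the chunks that never succeeded.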
  • 17. reliable process
  • 18. This?
  • 19. Or this?
  • 20. M x N → M + N. Sources: access logs (Apache), app logs (frontend, backend), system logs (syslogd), databases. Destinations: alerting (Nagios), analysis (MongoDB, MySQL, Hadoop), archiving (Amazon S3). In between: buffer/filter/route.
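The M x N → M + N claim is simple arithmetic: with point-to-point scripts every source needs its own path to every destination, while a central layer needs one connection per endpoint. A small sketch using the sources and destinations named on the slide (the method names are made up for illustration):

```ruby
# Names taken from the slide; the counting helpers are hypothetical.
SOURCES = %w[apache frontend backend syslogd databases].freeze
DESTINATIONS = %w[nagios mongodb mysql hadoop s3].freeze

# Every source wired directly to every destination: M x N integrations.
def point_to_point_links(sources, destinations)
  sources.size * destinations.size
end

# Every endpoint wired once to a central layer: M + N integrations.
def hub_links(sources, destinations)
  sources.size + destinations.size
end

puts point_to_point_links(SOURCES, DESTINATIONS)
puts hub_links(SOURCES, DESTINATIONS)
```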
  • 21. use cases
  • 22. Simple Forwarding
  • 23.
        # logs from a file
        <source>
          type tail
          path /var/log/httpd.log
          format apache2
          tag backend.apache
        </source>

        # logs from client libraries
        <source>
          type forward
          port 24224
        </source>

        # store logs to MongoDB
        <match backend.*>
          type mongo
          database fluent
          collection test
        </match>
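How a `<match>` pattern such as `backend.*` selects events by tag can be sketched in a few lines of Ruby. This is a simplified model written for illustration, not Fluentd's matcher: it handles only literal tag parts and the single-part `*` wildcard, it ignores the other pattern forms Fluentd supports, and `tag_pattern_to_regexp` and `route` are hypothetical names. Rules are tried in order and the first match wins.

```ruby
# Simplified model: '*' matches exactly one dot-separated tag part.
def tag_pattern_to_regexp(pattern)
  parts = pattern.split('.').map do |part|
    part == '*' ? '[^.]+' : Regexp.escape(part)
  end
  Regexp.new('\A' + parts.join('\.') + '\z')
end

# rules is an ordered list of [pattern, output] pairs; first match wins.
def route(tag, rules)
  rules.each do |pattern, output|
    return output if tag_pattern_to_regexp(pattern).match?(tag)
  end
  nil # no <match> section applies to this tag
end

rules = [['backend.*', :mongo]]
p route('backend.apache', rules)
p route('web.access', rules)
```

Under this model an event tagged `backend.apache` reaches the MongoDB output, while a tag such as `web.access` matches no rule.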
  • 24. Less Simple Forwarding
  • 25. Lambda Architecture
  • 26.
        # logs from a file
        <source>
          type tail
          path /var/log/httpd.log
          format apache2
          tag backend.apache
        </source>

        # logs from client libraries
        <source>
          type forward
          port 24224
        </source>

        # store logs to ES and HDFS
        <match backend.*>
          type copy
          <store>
            type elasticsearch
            logstash_format true
          </store>
          <store>
            type webhdfs
            host namenode
            port 50070
            path /path/on/hdfs/
          </store>
        </match>
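The `copy` output with two `<store>` sections fans each event out to several destinations. A rough sketch of that behaviour in plain Ruby, assuming hypothetical `CopyFanout` and `MemoryStore` classes that stand in for the real plugins:

```ruby
# Hypothetical stand-in for a destination; it just remembers what it was given.
class MemoryStore
  attr_reader :events

  def initialize
    @events = []
  end

  def emit(tag, time, record)
    @events << [tag, time, record]
  end
end

# Hypothetical fan-out: every store gets its own copy of each record, so one
# store mutating a record cannot affect what the others see.
class CopyFanout
  def initialize(*stores)
    @stores = stores
  end

  def emit(tag, time, record)
    @stores.each { |store| store.emit(tag, time, record.dup) }
  end
end

search = MemoryStore.new
archive = MemoryStore.new
CopyFanout.new(search, archive).emit('backend.apache', 0, { 'path' => '/index.html' })
```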
  • 27. CEP for Stream Processing
  • 28. Container Logging
  • 29. Fluentd on Kubernetes
  • 30. architecture
  • 31. Internal Architecture Input Parser Buffer Output Formatter
  • 32. Internal Architecture Input Parser Buffer Output Formatter “input-ish” “output-ish”
  • 33. Input plugins HTTP+JSON (in_http) File tail (in_tail) Syslog (in_syslog) ... ✓ Receive logs ✓ Or pull logs from data sources ✓ non-blocking Input
  • 34. Input plugins

        module Fluent
          class NewTailInput < Input
            Plugin.register_input('tail', self)

            def initialize
              super
              @paths = []
              @tails = {}
            end
          end
          # Little more code
        end
  • 35. Input plugins

        module Fluent
          class NewTailInput < Input
            Plugin.register_input('tail', self)

            def initialize
              super
              @paths = []
              @tails = {}
            end

            config_param :path, :string
            config_param :tag, :string
            config_param :rotate_wait, :time, :default => 5
            config_param :pos_file, :string, :default => nil
            config_param :read_from_head, :bool, :default => false
            config_param :refresh_interval, :time, :default => 60

            attr_reader :paths

            def configure(conf)
              super

              @paths = @path.split(',').map {|path| path.strip }
              if @paths.empty?
                raise ConfigError, "tail: 'path' parameter is required on tail input"
              end

              unless @pos_file
                $log.warn "'pos_file PATH' parameter is not set to a 'tail' source."
                $log.warn "this parameter is highly recommended to save the position to resume tailing."
              end

              configure_parser(conf)
              configure_tag

              @multiline_mode = conf['format'] == 'multiline'
              @receive_handler = if @multiline_mode
                                   method(:parse_multilines)
                                 else
                                   method(:parse_singleline)
                                 end
            end

            def configure_parser(conf)
              @parser = TextParser.new
              @parser.configure(conf)
            end

            def configure_tag
              if @tag.index('*')
                @tag_prefix, @tag_suffix = @tag.split('*')
                @tag_suffix ||= ''
              else
                @tag_prefix = nil
                @tag_suffix = nil
              end
            end

            def start
              if @pos_file
                @pf_file = File.open(@pos_file, File::RDWR|File::CREAT, DEFAULT_FILE_PERMISSION)
                @pf_file.sync = true
                @pf = PositionFile.parse(@pf_file)
              end

              @loop = Coolio::Loop.new
              refresh_watchers

              @refresh_trigger = TailWatcher::TimerWatcher.new(@refresh_interval, true, log, &method(:refresh_watchers))
              @refresh_trigger.attach(@loop)
              @thread = Thread.new(&method(:run))
            end

            def shutdown
              @refresh_trigger.detach if @refresh_trigger && @refresh_trigger.attached?

              stop_watchers(@tails.keys, true)
              @loop.stop rescue nil # when all watchers are detached, `stop` raises RuntimeError. We can ignore this exception.
              @thread.join
              @pf_file.close if @pf_file
            end

            def expand_paths
              date = Time.now
              paths = []
              @paths.each { |path|
                path = date.strftime(path)
                if path.include?('*')
                  paths += Dir.glob(path)
                else
                  # When file is not created yet, Dir.glob returns an empty array. So just add when path is static.
                  paths << path
                end
              }
              paths
            end

            # in_tail with '*' path doesn't check rotation file equality at refresh phase.
            # So you should not use '*' path when your logs will be rotated by another tool.
            # It will cause log duplication after updated watch files.
            # In such case, you should separate log directory and specify two paths in path parameter.
            # e.g. path /path/to/dir/*,/path/to/rotated_logs/target_file
            def refresh_watchers
              target_paths = expand_paths
              existence_paths = @tails.keys

              unwatched = existence_paths - target_paths
              added = target_paths - existence_paths

    700 lines!
  • 36. Input plugins

        module Fluent
          class TcpInput < SocketUtil::BaseInput
            Plugin.register_input('tcp', self)

            config_set_default :port, 5170
            config_param :delimiter, :string, :default => "\n" # syslog family add "\n" to each message and this seems only way to split messages in tcp stream

            def listen(callback)
              log.debug "listening tcp socket on #{@bind}:#{@port}"
              Coolio::TCPServer.new(@bind, @port, SocketUtil::TcpHandler, log, @delimiter, callback)
            end
          end
        end
  • 37. Input plugins

        class BaseInput < Fluent::Input
          # some code
          def on_message(msg, addr)
            @parser.parse(msg) { |time, record|
              unless time && record
                log.warn "pattern not match: #{msg.inspect}"
                return
              end

              record[@source_host_key] = addr[3] if @source_host_key
              Engine.emit(@tag, time, record)
            }
          # some code
          end
  • 38. Input plugins (same code as slide 37)
  • 39. Parser plugins JSON Regexp Apache/Nginx/Syslog CSV/TSV, etc. ✓ Parse into JSON ✓ Common formats out of the box ✓ v0.10.46 and above Parser
  • 40. Parser plugins

        <source>
          type tcp
          tag tcp.data
          format /^(?<field_1>\d+) (?<field_2>\w+)/
        </source>
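What the regexp format does to an incoming line can be tried directly in Ruby, since the named capture groups become the keys of the record. The sketch below assumes the slide's pattern is meant to use the `\d` and `\w` character classes; `PATTERN` and `parse_line` are illustrative names, not Fluentd API.

```ruby
# Pattern from slide 40, assuming \d and \w were intended.
PATTERN = /^(?<field_1>\d+) (?<field_2>\w+)/

# Returns a record hash keyed by capture-group name, or nil when the line
# does not match (Fluentd logs a "pattern not match" warning in that case).
def parse_line(text)
  m = PATTERN.match(text)
  return nil unless m

  m.names.each_with_object({}) { |name, record| record[name] = m[name] }
end

p parse_line('123 hello world')
p parse_line('no leading digits')
```

Captured values are strings; converting them to numbers is a separate step.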
  • 41. Parser plugins

        def call(text)
          m = @regexp.match(text)
          # some code
          time = nil
          record = {}

          m.names.each {|name|
            if value = m[name]
              case name
              when "time"
                time = @mutex.synchronize { @time_parser.parse(value) }
              else
                record[name] = if @type_converters.nil?
                                 value
                               else
                                 convert_type(name, value)
                               end
              end
            end
          }
          # some code
        end
  • 42. Buffer plugins: Memory (buf_memory), File (buf_file) ✓ Improve performance ✓ Provide reliability ✓ Provide thread-safety
  • 43. Buffer plugins ✓ Chunk = adjustable unit of data ✓ Buffer = Queue of chunks chunk chunk chunk output Input
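The "buffer = queue of chunks" model can be sketched as a small class. This is an illustration under stated assumptions, not Fluentd's buffer: `ChunkedBuffer` and its methods are hypothetical, and chunks are cut by event count here purely to keep the example short. A chunk leaves the queue only after the output block returns without raising, which is what makes retrying safe.

```ruby
# Hypothetical in-memory buffer: events accumulate in the current chunk, full
# chunks move to a queue, and the output side drains the queue.
class ChunkedBuffer
  def initialize(chunk_limit:)
    @chunk_limit = chunk_limit
    @current = []
    @queue = []
    @mutex = Mutex.new # inputs and the flusher may run on different threads
  end

  def append(event)
    @mutex.synchronize do
      @current << event
      if @current.size >= @chunk_limit
        @queue << @current
        @current = []
      end
    end
  end

  # Yields queued chunks oldest first. If the block raises, the chunk stays
  # at the head of the queue so a later flush can try it again.
  def flush
    loop do
      chunk = @mutex.synchronize { @queue.first }
      break if chunk.nil?

      yield chunk
      @mutex.synchronize { @queue.shift }
    end
  end

  def queued_chunks
    @mutex.synchronize { @queue.size }
  end
end
```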
  • 44. Output plugins ✓ Write to external systems ✓ Buffered & Non-buffered ✓ 200+ plugins Output File (out_file) Amazon S3 (out_s3) MongoDB (out_mongo) ...
  • 45. Output plugins

        class FileOutput < TimeSlicedOutput
          Plugin.register_output('file', self)
          # some code
          def write(chunk)
            path = generate_path(chunk)
            FileUtils.mkdir_p File.dirname(path)

            case @compress
            when nil
              File.open(path, "a", DEFAULT_FILE_PERMISSION) {|f|
                chunk.write_to(f)
              }
            when :gz
              File.open(path, "a", DEFAULT_FILE_PERMISSION) {|f|
                gz = Zlib::GzipWriter.new(f)
                chunk.write_to(gz)
                gz.close
              }
            end

            return path # for test
          end
          # more code
  • 46. Formatter plugins: JSON, CSV/TSV, “single value” ✓ Format output ✓ Only partially supported for now ✓ v0.10.49 and above
  • 47. Formatter plugins

        class SingleValueFormatter
          include Configurable

          config_param :message_key, :string, :default => 'message'
          config_param :add_newline, :bool, :default => true

          def format(tag, time, record)
            text = record[@message_key].to_s
            text << "\n" if @add_newline
            text
          end
        end
  • 48. Internal Architecture Input Parser Buffer Output Formatter
  • 49. Adding Filter in v0.12! Input Parser Filter Buffer Output Formatter
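The stage order on this slide (Input, Parser, Filter, Buffer, Formatter, Output) can be modelled as a toy pipeline. Everything here is a hypothetical sketch: `MiniPipeline` and its keyword arguments are invented for illustration, each stage is a plain lambda rather than a Fluentd plugin, and the buffer flushes by count instead of by size or time. A filter returns the record to keep it (possibly modified) or `nil` to drop it.

```ruby
require 'json'

# Toy model of the stage order only; not Fluentd's plugin API.
class MiniPipeline
  def initialize(parser:, formatter:, output:, filters: [], chunk_limit: 2)
    @parser = parser
    @filters = filters
    @formatter = formatter
    @output = output
    @chunk_limit = chunk_limit
    @buffer = []
  end

  # Input side: parse the raw line, run filters, then buffer the formatted event.
  def receive(tag, line)
    time, record = @parser.call(line)
    return if record.nil?

    record = @filters.reduce(record) { |rec, f| rec && f.call(tag, time, rec) }
    return if record.nil? # a filter dropped the event

    @buffer << @formatter.call(tag, time, record)
    flush if @buffer.size >= @chunk_limit
  end

  # Output side: hand the buffered events to the output as one chunk.
  def flush
    return if @buffer.empty?

    @output.call(@buffer.join)
    @buffer.clear
  end
end

pipeline = MiniPipeline.new(
  parser: ->(line) { [Time.now.to_i, { 'message' => line }] },
  filters: [->(_tag, _time, rec) { rec['message'].include?('debug') ? nil : rec }],
  formatter: ->(tag, _time, rec) { JSON.generate(rec.merge('tag' => tag)) + "\n" },
  output: ->(chunk) { print chunk }
)
pipeline.receive('app', 'user signed in')
pipeline.receive('app', 'debug: cache miss')
pipeline.flush
```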
  • 50. Roadmap (Nov 2014 to May 2015): v0.12 (filter, label); v0.14 (plugin API, ServerEngine); v1.0!? (we can use help!)
  • 51. goodies
  • 52. fluentd-ui
  • 53. Treasure Agent • Treasure Data distribution of Fluentd • including Ruby, core libraries and QA’ed 3rd party plugins • rpm/deb/dmg • 2.1.2 is released TODAY with fluentd-ui
  • 54. fluentd-forwarder • Forwarding agent written in Go • mainly for Windows support • less mature than Fluentd • Bundle TCP input/output and TD output • No plugin mechanism
  • 55. Thank you! kiyoto@treasuredata.com @kiyototamura