Fluentd Unified Logging Layer At Fossasia

Fluentd talk at Fossasia

Published in: Technology

Transcript

  • 1. Fluentd: Unified logging layer > Masahiro Nakagawa > Fossasia 2015 > Mar 14, 2015
  • 2. Who am I > Masahiro Nakagawa > github: @repeatedly > Treasure Data, Inc. > Senior Software Engineer > Fluentd / td-agent developer > Living at OSS :) > D language - Phobos, a.k.a standard library, committer > Fluentd - Main maintainer > MessagePack / RPC - D and Python (only RPC) > The organizer of several meetups (Presto, DTM, etc…) > etc…
  • 3. Structured logging > Reliable forwarding > Pluggable architecture > http://fluentd.org/ > github: fluent/fluentd
  • 4. What’s Fluentd? > Data collector for unified logging layer > Streaming data transfer based on JSON > Simple core + plugins written in Ruby > Various gem-based plugins > http://www.fluentd.org/plugins > List of users > http://www.fluentd.org/testimonials
  • 5. Before ✓ duplicated code for error handling... ✓ messy code for retrying mechanism...
  • 6. So painful!
  • 7. After
  • 8. Concept / Design
  • 9. Core: Divide & Conquer > Buffering & Retrying > Error handling > Message routing > Parallelism. Plugins: Read / receive data > Parse data > Filter data > Buffer data > Format data > Write / send data

  • 10. Core (Common Concerns): Divide & Conquer > Buffering & Retrying > Error handling > Message routing > Parallelism. Plugins (Use Case Specific): Read / receive data > Parse data > Filter data > Buffer data > Format data > Write / send data
  • 11. Event structure (log message) ✓ Time > second unit by default > from data source ✓ Tag > for message routing > where is it from? ✓ Record > JSON format > MessagePack internally > schema-free
  • 12. Reliable streaming data transfer > Batch: error, retry > Stream: error, retry > Other stream (micro batch): retry
  • 13. M x N → M + N plugins > Sources: Apache, Frontend, Backend, syslogd, Databases (Access logs, App logs, System logs) > Fluentd: buffering / retrying / routing > Destinations: Nagios, PostgreSQL, Hadoop, Amazon S3, Elasticsearch (Alerting, Analysis, Archiving)
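The error and retry behavior on this slide can be sketched as a loop that resends a failed chunk with a growing wait. This is a minimal sketch under assumptions: `send_with_retry`, its parameters, and the doubling wait schedule are illustrative choices, not Fluentd's actual implementation or configuration names.

```python
import time
from typing import Callable, Sequence


def send_with_retry(
    send: Callable[[Sequence[dict]], None],
    chunk: Sequence[dict],
    max_retries: int = 5,
    base_wait: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try to deliver one chunk, doubling the wait after each failure.

    Returns True once `send` succeeds, False when retries are exhausted.
    """
    for attempt in range(max_retries + 1):
        try:
            send(chunk)
            return True
        except Exception:
            if attempt == max_retries:
                return False
            # Wait 1x, 2x, 4x, ... the base interval before the next try.
            sleep(base_wait * (2 ** attempt))
    return False
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. The chunk is only dropped after the final attempt fails, which is the point at which a real system would need a fallback.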
  • 14. Use case
  • 15. Simple forwarding
  • 16.
    # logs from a file
    <source>
      type tail
      path /var/log/httpd.log
      pos_file /tmp/pos_file
      format apache2
      tag backend.apache
    </source>
    # logs from client libraries
    <source>
      type forward
      port 24224
    </source>
    # store logs to MongoDB
    <match backend.*>
      type mongo
      database fluent
      collection test
    </match>
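The `<match backend.*>` line routes events by tag. As a rough sketch of that routing rule, the helper below implements two wildcards from Fluentd's match syntax: `*` for exactly one tag part and `**` for zero or more tag parts. `tag_matches` is a hypothetical name for this example, and other pattern forms (such as `{a,b}` alternatives or several patterns in one `<match>`) are deliberately left out.

```python
from typing import List


def tag_matches(pattern: str, tag: str) -> bool:
    """Return True if a dot-separated tag matches a pattern with * and **."""
    return _match(pattern.split("."), tag.split("."))


def _match(pattern_parts: List[str], tag_parts: List[str]) -> bool:
    if not pattern_parts:
        # Pattern is used up: match only if the tag is used up too.
        return not tag_parts
    head, rest = pattern_parts[0], pattern_parts[1:]
    if head == "**":
        # "**" may absorb zero or more tag parts.
        return any(_match(rest, tag_parts[i:]) for i in range(len(tag_parts) + 1))
    if not tag_parts:
        return False
    if head == "*" or head == tag_parts[0]:
        # "*" absorbs exactly one tag part; a literal must be equal.
        return _match(rest, tag_parts[1:])
    return False


print(tag_matches("backend.*", "backend.apache"))
print(tag_matches("backend.*", "backend.apache.access"))
```

With this rule, the `backend.apache` tag set by the tail source reaches the MongoDB output, while a tag with an extra part would need `backend.**` instead.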
  • 17. Less Simple Forwarding > At-most-once / At-least-once > HA (failover) > Load-balancing
  • 18. Near realtime and batch combo! (diagram labels: Hot data, All data)
  • 19.
    # logs from a file
    <source>
      type tail
      path /var/log/httpd.log
      pos_file /tmp/pos_file
      format apache2
      tag web.access
    </source>
    # logs from client libraries
    <source>
      type forward
      port 24224
    </source>
    # store logs to ES and HDFS
    <match web.*>
      type copy
      <store>
        type elasticsearch
        logstash_format true
      </store>
      <store>
        type webhdfs
        host namenode
        port 50070
        path /path/on/hdfs/
      </store>
    </match>
  • 20. CEP for Stream Processing Norikra is a SQL based CEP engine: http://norikra.github.io/
  • 21. Container Logging
  • 22. Fluentd on Kubernetes / GCE > Kubernetes > Google Compute Engine > https://cloud.google.com/logging/docs/install/compute_install
  • 23. Slideshare http://engineering.slideshare.net/2014/04/skynet-project-monitor-scale-and-auto-heal-a-system-in-the-cloud/
  • 24. Log Analysis System And its designs in LINE Corp. 2014 early
  • 25. Architecture
  • 26. Internal Architecture: Input → Parser → Filter → Buffer → Formatter → Output
  • 27. Internal Architecture: Input → Parser → Filter → Buffer → Formatter → Output, grouped into “input-ish” and “output-ish” stages
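To make the stage order concrete, here is a minimal sketch that chains the six stages as plain functions, in the order the "Plugins" list on slide 9 gives them. `run_pipeline`, `parse_json`, and the count-based `chunk_limit` are assumptions made for this example and do not mirror Fluentd's plugin API.

```python
import json
from typing import Callable, Iterable, List, Optional

Record = dict


def run_pipeline(
    lines: Iterable[str],
    parse: Callable[[str], Optional[Record]],
    filters: List[Callable[[Record], Optional[Record]]],
    fmt: Callable[[Record], str],
    write: Callable[[List[str]], None],
    chunk_limit: int = 2,
) -> None:
    chunk: List[Record] = []
    for line in lines:                      # Input: one raw line at a time
        record = parse(line)                # Parser: text -> record, or None
        for apply_filter in filters:        # Filter: mutate or drop a record
            if record is None:
                break
            record = apply_filter(record)
        if record is None:
            continue
        chunk.append(record)                # Buffer: collect records
        if len(chunk) >= chunk_limit:
            write([fmt(r) for r in chunk])  # Formatter, then Output
            chunk = []
    if chunk:
        write([fmt(r) for r in chunk])      # flush whatever is left


def parse_json(line: str) -> Optional[Record]:
    """Parser stage for the example: invalid JSON is dropped."""
    try:
        return json.loads(line)
    except ValueError:
        return None


written: List[List[str]] = []
run_pipeline(
    lines=['{"code": 200}', "not json", '{"code": 500}', '{"code": 404}'],
    parse=parse_json,
    filters=[lambda r: r if r["code"] >= 400 else None],
    fmt=json.dumps,
    write=written.append,
)
print(written)
```

Each stage is a separate callable, so swapping the parser or the output does not touch the loop. That separation is the idea the slides describe as core plus plugins.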
  • 28. Input plugins ✓ Receive logs ✓ Or pull logs from data sources ✓ Non-blocking > File tail (in_tail) > Syslog (in_syslog) > HTTP (in_http) > HTTP/2 (in_http2 WIP) > ...
  • 29. Parser plugins ✓ Parse into JSON ✓ Common formats out of the box ✓ Some input plugins depend on Parser plugins ✓ v0.10.46 and above > JSON > Regexp > Apache/Nginx/Syslog > CSV/TSV > etc.
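A Regexp-style parser can be sketched as a named-group regular expression that turns one text line into a record. The pattern below is a simplified, hypothetical one for an Apache-style access line. It is not the pattern behind Fluentd's built-in `apache2` format, and `parse_access_line` is a made-up helper name.

```python
import re
from typing import Optional

# Simplified pattern for lines like:
# 127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326
ACCESS_LINE = re.compile(
    r'^(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<code>\d{3}) (?P<size>\d+|-)$'
)


def parse_access_line(line: str) -> Optional[dict]:
    """Return a record for a matching line, or None when it does not match."""
    m = ACCESS_LINE.match(line)
    if m is None:
        return None
    record = m.groupdict()
    record["code"] = int(record["code"])
    # A "-" size means no body was sent; represent it as None.
    record["size"] = None if record["size"] == "-" else int(record["size"])
    return record


sample = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
print(parse_access_line(sample))
```

Returning `None` for non-matching lines leaves the decision about unparsable input to the caller.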
  • 30. Filter plugins ✓ Filter / Mutate record ✓ Record level and Stream level ✓ v0.12 and above > grep > record_transformer > suppress > …
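The two record-level behaviors named here, filtering and mutating, can be sketched as small functions that either return a record or return `None` to drop it. `grep`, `add_fields`, and `apply_filters` below are hypothetical helpers loosely inspired by the grep and record_transformer plugins. They do not reproduce those plugins' options.

```python
import re
from typing import Callable, List, Optional

Filter = Callable[[dict], Optional[dict]]


def grep(key: str, pattern: str) -> Filter:
    """Keep only records whose `key` value matches the regular expression."""
    regex = re.compile(pattern)

    def apply(record: dict) -> Optional[dict]:
        return record if regex.search(str(record.get(key, ""))) else None

    return apply


def add_fields(extra: dict) -> Filter:
    """Return a copy of the record with extra fields added."""

    def apply(record: dict) -> Optional[dict]:
        return {**record, **extra}

    return apply


def apply_filters(record: Optional[dict], filters: List[Filter]) -> Optional[dict]:
    """Run filters in order; stop as soon as one drops the record."""
    for f in filters:
        if record is None:
            return None
        record = f(record)
    return record


filters = [grep("path", r"^/api/"), add_fields({"hostname": "web01"})]
print(apply_filters({"path": "/api/users"}, filters))
print(apply_filters({"path": "/static/a.css"}, filters))
```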
  • 31. Buffer plugins ✓ Improve performance ✓ Provide reliability ✓ Provide thread-safety > Memory (buf_memory) > File (buf_file)
  • 32. Buffer internals ✓ Chunk = adjustable unit of data ✓ Buffer = Queue of chunks (diagram: Input → chunk, chunk, chunk → Output)
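The "queue of chunks" idea can be sketched with a small class: records fill the current chunk, full chunks move to a queue, and the output side drains the queue in order. `ChunkBuffer` and its count-based `chunk_limit` are assumptions made for this example and are not how Fluentd's buffer plugins are implemented or configured.

```python
from collections import deque
from typing import Callable, Deque, List


class ChunkBuffer:
    """Hypothetical in-memory buffer: a queue of fixed-size chunks."""

    def __init__(self, chunk_limit: int = 3) -> None:
        self.chunk_limit = chunk_limit
        self.current: List[dict] = []             # chunk still being filled
        self.queue: Deque[List[dict]] = deque()   # full chunks awaiting output

    def append(self, record: dict) -> None:
        self.current.append(record)
        if len(self.current) >= self.chunk_limit:
            self.queue.append(self.current)
            self.current = []

    def flush(self, write: Callable[[List[dict]], None]) -> int:
        """Write queued chunks oldest first; stop at the first failure.

        A chunk is removed from the queue only after `write` succeeds, so a
        failed chunk stays queued and can be retried on the next flush.
        """
        written = 0
        while self.queue:
            try:
                write(self.queue[0])
            except Exception:
                break
            self.queue.popleft()
            written += 1
        return written
```

Keeping a failed chunk at the head of the queue is what lets buffering and retrying work together: input can keep appending while output is temporarily unavailable.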
  • 33. Formatter plugins ✓ Format output ✓ Some plugins depend on Formatter plugins ✓ v0.10.46 and above > JSON > CSV/TSV > “single value” > msgpack
  • 34. Output plugins ✓ Write to external systems ✓ Buffered & Non-buffered ✓ 200+ plugins > File (out_file) > Amazon S3 (out_s3) > MongoDB (out_mongo) > ...
  • 35. Roadmap > v0.10 (old stable) > v0.12 (current stable) > Filter / Label / At-least-once > v0.14 (spring, 2015) > New plugin APIs, ServerEngine, Time… > v1 (summer, 2015) > Fix new features / APIs https://github.com/fluent/fluentd/wiki/V1-Roadmap
  • 36. Goodies
  • 37. fluent-bit > Made for Embedded Linux > OpenEmbedded & Yocto Project > Intel Edison, RasPi & Beagle Black boards > https://github.com/fluent/fluent-bit > Standalone application or Library mode > Built-in plugins > input: cpu, kmsg, output: fluentd > First release at the end of Mar 2015
  • 38. fluentd-ui > Manage Fluentd instance via Web UI > https://github.com/fluent/fluentd-ui

  • 39. Treasure Agent (td-agent) > Treasure Data distribution of Fluentd > including Ruby and QA’ed plugins > Treasure Agent 2 is current stable > We recommend using v2, not v1 > including fluentd-ui > Next release, 2.2.0, uses fluentd v0.12
  • 40. Embulk > Bulk Loader version of Fluentd > Pluggable architecture > JRuby, JVM languages > High performance parallel processing > Share your script as a plugin > https://github.com/embulk http://www.slideshare.net/frsyuki/embuk-making-data-integration-works-relaxed
  • 41. Embulk: bulk load via plugins ✓ Parallel execution ✓ Data validation ✓ Error recovery ✓ Deterministic behaviour ✓ Idempotent retrying (diagram labels: HDFS, MySQL, Amazon S3, CSV Files, SequenceFile, Salesforce.com, Elasticsearch, Cassandra, Hive, Redis)
  • 42. Check: treasuredata.com Cloud service for the entire data pipeline