http://blog.csdn.net/hijk139/article/details/8308224
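Our business system needed to collect and monitor system logs, which brought Hadoop's Flume to mind. After some experimentation, its feature set turned out to be modest, but it basically meets the requirements. Flume is a distributed, reliable, and highly available service for collecting logs. It can be combined with Hadoop, Hive, and similar tools to collect, store, analyze, and process logs; see the Apache website for a more detailed introduction. A simple installation and configuration procedure follows.
1. Download the flume-ng package and deploy it on both the server that collects the log files and the server that receives them. JDK 1.6 or later must be installed on each server.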
http://flume.apache.org/download.html
tar -zxvf apache-flume-1.3.0-bin.tar.gz
2. On the log-receiving server, create conf/flume-conf.properties. The server-side configuration is as follows.
It receives data from an avro source and writes it to HDFS.
[flume@ conf]$ cat flume-conf.properties
agent.sources = avrosrc
agent.channels = memoryChanne3
agent.sinks = hdfsSink

# For each one of the sources, the type is defined
agent.sources.avrosrc.type = avro
agent.sources.avrosrc.bind = 172.16.251.1
agent.sources.avrosrc.port = 44444

# The channel can be defined as follows.
agent.sources.avrosrc.channels = memoryChanne3

# Each channel's type is defined.
agent.channels.memoryChanne3.type = memory
agent.channels.memoryChanne3.keep-alive = 10
agent.channels.memoryChanne3.capacity = 100000
agent.channels.memoryChanne3.transactionCapacity = 100000

# Each sink's type must be defined
agent.sinks.hdfsSink.type = hdfs
agent.sinks.hdfsSink.channel = memoryChanne3
agent.sinks.hdfsSink.hdfs.path = /logdata/%{hostname}_linux/%Y%m%d_date
agent.sinks.hdfsSink.hdfs.filePrefix = %{datacenter}_
agent.sinks.hdfsSink.hdfs.rollInterval = 0
agent.sinks.hdfsSink.hdfs.rollSize = 4000000
agent.sinks.hdfsSink.hdfs.rollCount = 0
agent.sinks.hdfsSink.hdfs.writeFormat = Text
agent.sinks.hdfsSink.hdfs.fileType = DataStream
agent.sinks.hdfsSink.hdfs.batchSize = 10
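If Flume and Hadoop run under different users, watch out for permission problems: the Flume user must be allowed to write to the target HDFS directory.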
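The escape sequences in hdfs.path and hdfs.filePrefix are filled in from event headers: %{hostname} and %{datacenter} come from headers set by interceptors on the collecting side, and %Y%m%d is derived from the timestamp header. The Python sketch below only imitates that substitution to show which directory an event would land in. It is not Flume's implementation; the function name expand_path and the sample header values are made up for illustration, and it formats the time in UTC for reproducibility, whereas the sink would normally use the agent's local time zone.

```python
import re
import time


def expand_path(template, headers):
    """Imitate how %{header} and strftime-style escapes are expanded (illustration only)."""
    # %{name} -> value of the event header called `name` (empty string if absent).
    expanded = re.sub(r"%\{([^}]+)\}", lambda m: headers.get(m.group(1), ""), template)
    # Time escapes such as %Y%m%d are taken from the `timestamp` header,
    # which holds milliseconds since the epoch. Using UTC is an assumption made here.
    event_time = time.gmtime(int(headers["timestamp"]) / 1000)
    return time.strftime(expanded, event_time)


if __name__ == "__main__":
    sample_headers = {  # hypothetical values
        "hostname": "172.16.251.2",
        "datacenter": "BEIJING",
        "timestamp": "1355270400000",  # 2012-12-12 00:00:00 UTC
    }
    print(expand_path("/logdata/%{hostname}_linux/%Y%m%d_date", sample_headers))
    print(expand_path("%{datacenter}_", sample_headers))
```

If an event arrives without a timestamp header, the time escapes cannot be resolved, which is why the collecting agents attach a timestamp interceptor to each source.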
3. Configuration of conf/flume-conf.properties on the log-collecting server. Here two log files are tailed and forwarded to the receiving server.
agent.sources = tailsource-1 tailsource-2
agent.channels = memoryChannel-1 memoryChannel-2
agent.sinks = remotesink remotesink-2

agent.sources.tailsource-1.type = exec
agent.sources.tailsource-1.command = tail -F /tmp/linux2.log
agent.sources.tailsource-1.channels = memoryChannel-1

agent.sources.tailsource-2.type = exec
agent.sources.tailsource-2.command = tail -F /tmp/linux2_2.log
agent.sources.tailsource-2.channels = memoryChannel-2

agent.sources.tailsource-1.interceptors = host_int timestamp_int inter1
agent.sources.tailsource-1.interceptors.host_int.type = host
agent.sources.tailsource-1.interceptors.host_int.hostHeader = hostname

agent.sources.tailsource-1.interceptors.timestamp_int.type = org.apache.flume.interceptor.TimestampInterceptor$Builder

# agent.sources.tailsource-1.interceptors = inter1
agent.sources.tailsource-1.interceptors.inter1.type = static
agent.sources.tailsource-1.interceptors.inter1.key = datacenter
agent.sources.tailsource-1.interceptors.inter1.value = BEIJING

agent.sources.tailsource-2.interceptors = host_int timestamp_int inter1
agent.sources.tailsource-2.interceptors.host_int.type = host
agent.sources.tailsource-2.interceptors.host_int.hostHeader = hostname

agent.sources.tailsource-2.interceptors.timestamp_int.type = org.apache.flume.interceptor.TimestampInterceptor$Builder

# agent.sources.tailsource-1.interceptors = inter1
agent.sources.tailsource-2.interceptors.inter1.type = static
agent.sources.tailsource-2.interceptors.inter1.key = datacenter
agent.sources.tailsource-2.interceptors.inter1.value = linux2_2

agent.channels.memoryChannel-1.type = memory
agent.channels.memoryChannel-1.keep-alive = 10
agent.channels.memoryChannel-1.capacity = 100000
agent.channels.memoryChannel-1.transactionCapacity = 100000

agent.channels.memoryChannel-2.type = memory
agent.channels.memoryChannel-2.keep-alive = 10
agent.channels.memoryChannel-2.capacity = 100000
agent.channels.memoryChannel-2.transactionCapacity = 100000

agent.sinks.remotesink.type = avro
agent.sinks.remotesink.hostname = 172.16.251.1
agent.sinks.remotesink.port = 44444
agent.sinks.remotesink.channel = memoryChannel-1

agent.sinks.remotesink-2.type = avro
agent.sinks.remotesink-2.hostname = 172.16.251.1
agent.sinks.remotesink-2.port = 44444
agent.sinks.remotesink-2.channel = memoryChannel-2
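With several sources, channels, and sinks in one file, a mistyped channel name is an easy error to make. Note that sources use the plural key channels while sinks use the singular channel. The following Python sketch is a hypothetical helper, not part of Flume: it reads a properties file in the simple key = value form used here and reports any channel that a source or sink refers to but that is not listed in agent.channels. It assumes component names contain no dots.

```python
def parse_properties(text):
    """Parse simple `key = value` lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def undeclared_channels(props, agent="agent"):
    """Return channel names used by sources or sinks but missing from <agent>.channels."""
    declared = set(props.get(agent + ".channels", "").split())
    missing = set()
    prefix = agent + "."
    for key, value in props.items():
        if not key.startswith(prefix):
            continue
        parts = key[len(prefix):].split(".")
        if len(parts) != 3:
            continue
        kind, _name, attr = parts
        if kind == "sources" and attr == "channels":
            # A source may list several channels, separated by spaces.
            missing |= set(value.split()) - declared
        elif kind == "sinks" and attr == "channel":
            # A sink reads from exactly one channel.
            if value not in declared:
                missing.add(value)
    return sorted(missing)


if __name__ == "__main__":
    with open("conf/flume-conf.properties") as f:
        print(undeclared_channels(parse_properties(f.read())))
```

An empty list means every referenced channel is declared; it does not prove that the rest of the configuration is valid.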
4. Run in the background
nohup bin/flume-ng agent -n agent -c conf -f conf/flume-conf.properties >1.log &
View the log: vi flume.log (with the default log4j settings it is written under the logs directory)
Check the port: netstat -an|grep 44444
[flume@dtydb6 flume-1.4]$ netstat -an|grep 44444
tcp        0      0 ::ffff:172.16.251.1:44444   :::*                        LISTEN
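The netstat check only works on the receiving server itself. To confirm from a collecting server that the avro source is reachable across the network, a plain TCP connection attempt is enough. The Python sketch below is one way to do that; the function name is_listening is made up, and a successful connection only shows that something accepts connections on the port, not that it is Flume.

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Address and port of the avro source from the receiving agent's configuration.
    print(is_listening("172.16.251.1", 44444))
```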
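5. Testing
A script like the following can be used to write to a log file repeatedly for testing. Note that it writes to /tmp/linux3.log, so change the path to one of the files tailed above (for example /tmp/linux2.log) for the events to reach Flume.
for i in {1..1000000}; do echo "LINUX2 PRESS ************* Flume log rotation $i" >> /tmp/linux3.log; sleep 0.0001; done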
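Because the receiving agent sets rollInterval and rollCount to 0, only rollSize (4000000 bytes) causes the HDFS sink to start a new file, so the volume of test data gives a rough idea of how many files to expect. The Python sketch below is a back-of-the-envelope estimate, not a description of the sink's exact behavior: it assumes each event is written as its text plus a one-byte newline, and the function name estimate_rolled_files is made up. The real count can differ, for example when an agent is restarted.

```python
import math


def estimate_rolled_files(line_count, roll_size=4000000,
                          template="LINUX2 PRESS ************* Flume log rotation {}"):
    """Estimate total bytes written and the number of files for `line_count` test lines."""
    # Each line is assumed to be stored as its UTF-8 bytes plus one newline byte.
    total_bytes = sum(len(template.format(i).encode("utf-8")) + 1
                      for i in range(1, line_count + 1))
    return total_bytes, math.ceil(total_bytes / roll_size)


if __name__ == "__main__":
    total, files = estimate_rolled_files(1000000)
    print("about", total, "bytes in roughly", files, "files")
```

Comparing this estimate with the files that actually appear under /logdata is a quick sanity check that events are not being dropped along the way.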
References:
http://flume.apache.org/FlumeUserGuide.html