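- Install JRuby (downloads)
- Install Norikra (it takes a few minutes before the installation actually starts)
- Start Norikra (it takes a few minutes to start up successfully)
- Norikra web UI: http://IP:26578
- Using a web service's logs as an example, first define the target's fields (the minimal field set for variations of 'www' events). Defining only the most basic fields is enough; they can be modified later
- Add a query: count how many times the top page (path="/") is accessed
- Add a query with Views (Windows). Views are used to define time windows or to control the timing at which events fire. With views, Norikra controls the range of events for each query.
- Start sending data to Norikra
- Script
- Log
- View the query results
- Fetch the query results. Fetched results are deleted, so if no new events have occurred, viewing the query results again returns no data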
# tar xzvf jruby-bin-1.7.15.tar.gz
# mv jruby-1.7.15 /opt
# echo 'PATH=$PATH:/opt/jruby-1.7.15/bin' >> ~/.bashrc
# source ~/.bashrc
# jruby -v
jruby 1.7.15 (1.9.3p392) 2014-09-03 82b5cc3 on Java HotSpot(TM) 64-Bit Server VM 1.6.0_33-b04 +jit [linux-amd64]
# gem install norikra
# norikra start
# norikra-client target open www path:string status:integer referer:string agent:string userid:integer
# norikra-client target list
TARGET  AUTO_FIELD
www     true
1 targets found.
# norikra-client query add www.toppage 'SELECT count(*) AS cnt FROM www WHERE path="/" AND status=200'
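To make the semantics of the `www.toppage` query concrete, here is a minimal sketch that emulates it outside Norikra. Without a view, every matching event bumps a running total and produces one output row. This only illustrates the idea with `grep` and `awk`; it is not how Norikra evaluates the query. The function name `toppage_running_count` is made up, and the JSON lines are assumed to be formatted exactly like the sample events in this post.

```shell
#!/bin/sh
# Emulates: SELECT count(*) AS cnt FROM www WHERE path="/" AND status=200
# Assumption: one JSON event per line, keys formatted as in the sample events.
toppage_running_count() {
  grep '"path":"/",' | grep '"status":200,' | awk '{ n++; print "cnt=" n }'
}

# Two top-page hits with a /login redirect in between.
printf '%s\n' \
  '{"path":"/", "status":200, "referer":"", "agent":"MSIE", "userid":3}' \
  '{"path":"/login", "status":301, "referer":"/", "agent":"MSIE", "userid":3}' \
  '{"path":"/", "status":200, "referer":"", "agent":"MSIE", "userid":3}' \
  | toppage_running_count
```

The `/login` event is filtered out, so only the two top-page hits advance the count.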
# norikra-client query add www.toppageviews 'SELECT count(*) AS cnt FROM www.win:time_batch(3 sec) WHERE path="/" AND status=200'
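The effect of `win:time_batch(3 sec)` can be approximated offline by grouping events into consecutive 3-second buckets and counting the matches in each one. The sketch below is a simplification under stated assumptions: the input format (`<seconds> <path> <status>`) and the function name `batch_count` are hypothetical, buckets are aligned to the first event, and empty buckets are not printed, whereas the real view is driven by the server's clock.

```shell
#!/bin/sh
# Rough offline model of a time_batch window: count path="/" AND status=200
# per bucket of WIDTH seconds (default 3). Not Norikra's implementation.
batch_count() {
  awk -v width="${1:-3}" '
    NR == 1 { start = $1 }                       # align buckets to the first event
    {
      b = int(($1 - start) / width)              # bucket index for this event
      if (!(b in seen)) { seen[b] = 1; order[++nb] = b; cnt[b] = 0 }
      if ($2 == "/" && $3 == 200) cnt[b]++
    }
    END { for (i = 1; i <= nb; i++) print "batch=" order[i] " cnt=" cnt[order[i]] }
  '
}

# Hypothetical events: arrival time in seconds, path, status.
printf '%s\n' '0 / 200' '1 / 200' '2 /login 301' '4 / 200' '10 /login 301' | batch_count 3
```

Unlike the un-windowed query, the count restarts for every batch, which is why the windowed results never accumulate.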
#!/bin/sh
for i in `seq 1 1 3`; do
  # top-page hit
  data1='{"path":"/", "status":200, "referer":"", "agent":"MSIE", "userid":3}'
  echo `date +"%Y/%m/%d-%H:%M:%S"` "$data1"
  echo "$data1" | norikra-client event send www
  sleep 3
  # redirect from the login page
  data2='{"path":"/login", "status":301, "referer":"/", "agent":"MSIE", "userid":3}'
  echo `date +"%Y/%m/%d-%H:%M:%S"` "$data2"
  echo "$data2" | norikra-client event send www
  sleep 3
done
2014/09/17-23:07:58 {"path":"/", "status":200, "referer":"", "agent":"MSIE", "userid":3}
2014/09/17-23:08:13 {"path":"/login", "status":301, "referer":"/", "agent":"MSIE", "userid":3}
2014/09/17-23:08:29 {"path":"/", "status":200, "referer":"", "agent":"MSIE", "userid":3}
2014/09/17-23:08:45 {"path":"/login", "status":301, "referer":"/", "agent":"MSIE", "userid":3}
2014/09/17-23:09:00 {"path":"/", "status":200, "referer":"", "agent":"MSIE", "userid":3}
2014/09/17-23:09:16 {"path":"/login", "status":301, "referer":"/", "agent":"MSIE", "userid":3}
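Hand-written JSON strings are easy to get wrong, so one possible variation of the sender script builds each event from arguments. This is a sketch, not part of Norikra: `make_event` and `send_event` are hypothetical helper names, no JSON escaping is done, and the only client call used is the `norikra-client event send www` form shown in the script above. When the client is not installed, the events are simply printed as a dry run.

```shell
#!/bin/sh
# Hypothetical helper: build one 'www' event as JSON from positional arguments.
# No escaping is performed, so values must not contain double quotes or backslashes.
make_event() {
  printf '{"path":"%s", "status":%d, "referer":"%s", "agent":"%s", "userid":%d}\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# Pipe stdin to Norikra when the client is available; otherwise print it (dry run).
send_event() {
  if command -v norikra-client >/dev/null 2>&1; then
    norikra-client event send www
  else
    cat
  fi
}

make_event /      200 "" MSIE 3 | send_event
make_event /login 301 /  MSIE 3 | send_event
```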
# norikra-client event see www.toppage
{"time":"2014/09/17 23:08:10","cnt":1}
{"time":"2014/09/17 23:08:41","cnt":2}
{"time":"2014/09/17 23:09:13","cnt":3}
# norikra-client event see www.toppageviews
{"time":"2014/09/17 23:08:13","cnt":1}
{"time":"2014/09/17 23:08:16","cnt":0}
{"time":"2014/09/17 23:08:43","cnt":1}
{"time":"2014/09/17 23:08:46","cnt":0}
{"time":"2014/09/17 23:09:16","cnt":1}
{"time":"2014/09/17 23:09:19","cnt":0}
# norikra-client event fetch www.toppageviews
{"time":"2014/09/17 23:08:13","cnt":1}
{"time":"2014/09/17 23:08:16","cnt":0}
{"time":"2014/09/17 23:08:43","cnt":1}
{"time":"2014/09/17 23:08:46","cnt":0}
{"time":"2014/09/17 23:09:16","cnt":1}
{"time":"2014/09/17 23:09:19","cnt":0}
# norikra-client event fetch www.toppageviews
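The difference between `event see` and `event fetch` can be pictured with a toy result buffer: `see` reads the buffer and leaves it alone, while `fetch` reads it and then empties it. The sketch below models that with a temporary file. It is only an analogy, not Norikra's implementation, and the function names `emit`, `see`, and `fetch` are made up for the illustration.

```shell
#!/bin/sh
# Toy model of one query's result buffer, stored in a temp file.
buf=$(mktemp)
emit()  { printf '%s\n' "$1" >> "$buf"; }   # the query produces an output event
see()   { cat "$buf"; }                     # read results, keep them in the buffer
fetch() { cat "$buf"; : > "$buf"; }         # read results, then clear the buffer

emit '{"cnt":1}'
emit '{"cnt":0}'

echo "--- see (results stay)";     see
echo "--- fetch (results removed)"; fetch
echo "--- fetch again (empty until new events arrive)"; fetch
rm -f "$buf"
```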
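Takeaways:
- Easy to get started: once the data set is defined, a small amount of code (SQL) is enough to begin processing data.
- Scheduling is built in: processing can run forever, and a query can work either on all of the data or on one time window of data at a time. Because all data is kept in memory, watch out for out-of-memory problems when a query covers all of the data.
- Simple to install and runs on a single machine; there is no architecture for distributing data or computation (unlike, e.g., Storm).
- HA is currently not considered, and the author says it does not need to be.
- Where might it fit? Collecting and preprocessing small, complex, varied data, with the results either fetched periodically or exported elsewhere as soon as preprocessing finishes, which frees the host's memory.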
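The "fetch periodically and export" idea could look roughly like the loop below, which appends each round of fetched results to a file. This is a sketch under assumptions: `drain_loop` is a hypothetical helper, the command to run is passed in as arguments, and the demo uses `echo` as a stand-in so it can be tried without a running server. Against a live Norikra the command would be the `norikra-client event fetch www.toppageviews` call shown earlier.

```shell
#!/bin/sh
# usage: drain_loop OUTFILE INTERVAL_SECONDS ROUNDS COMMAND [ARGS...]
# Runs COMMAND once per round and appends its output to OUTFILE.
drain_loop() {
  outfile=$1; interval=$2; rounds=$3
  shift 3
  n=0
  while [ "$n" -lt "$rounds" ]; do
    "$@" >> "$outfile" || echo "fetch failed" >&2
    n=$((n + 1))
    # Sleep between rounds, but not after the last one.
    if [ "$n" -lt "$rounds" ]; then sleep "$interval"; fi
  done
}

# Dry run with a stand-in command instead of the real client.
out=$(mktemp)
drain_loop "$out" 0 2 echo '{"cnt":1}'
cat "$out"
rm -f "$out"
```

Because a fetch removes the results it returns, whatever the loop writes to the file is the only remaining copy, so the export step should be treated as the point of record.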