Wednesday, September 17, 2014

Norikra trial notes

Norikra is open source server software that provides "Stream Processing" with SQL. It is written in JRuby, runs on the JVM, and is licensed under GPLv2.
  1. install JRuby (downloads)
  2. # tar xzvf jruby-bin-1.7.15.tar.gz
    # mv jruby-1.7.15 /opt
    # echo 'export PATH=$PATH:/opt/jruby-1.7.15/bin' >> ~/.bashrc
    # . ~/.bashrc
    # jruby -v
    jruby 1.7.15 (1.9.3p392) 2014-09-03 82b5cc3 on Java HotSpot(TM) 64-Bit Server VM 1.6.0_33-b04 +jit [linux-amd64]
  3. install Norikra (it takes a few minutes before the installation actually starts)
  4. # gem install norikra
  5. start Norikra (it takes a few minutes before startup completes)
  6. # norikra start
  7. Norikra web UI: http://IP:26578
  8. Taking a web service's log as an example, first define the target's fields (minimal fields set for variations of 'www' events). Defining only the most basic fields is enough; they can be modified later.
  9. # norikra-client target open www path:string status:integer referer:string agent:string userid:integer
    # norikra-client target list
    TARGET  AUTO_FIELD
    www     true
    1 targets found.
  10. Add a query: count the number of times the top page (path="/") is accessed
  11. # norikra-client query add www.toppage 'SELECT count(*) AS cnt FROM www WHERE path="/" AND status=200'
  12. Add a query with Views (Windows). Views are used to define time windows or to control the timing at which events fire. With views, Norikra controls the range of events for each query.
  13. # norikra-client query add www.toppageviews 'SELECT count(*) AS cnt FROM www.win:time_batch(3 sec) WHERE path="/" AND status=200'
  14. Start sending data to Norikra
    • script
    • #!/bin/sh
      # Send one top-page hit and one login redirect per round, three rounds in total.
      for i in $(seq 1 3); do
        data1='{"path":"/", "status":200, "referer":"", "agent":"MSIE", "userid":3}'
        echo "$(date +"%Y/%m/%d-%H:%M:%S") $data1"
        echo "$data1" | norikra-client event send www
        sleep 3

        data2='{"path":"/login", "status":301, "referer":"/", "agent":"MSIE", "userid":3}'
        echo "$(date +"%Y/%m/%d-%H:%M:%S") $data2"
        echo "$data2" | norikra-client event send www
        sleep 3
      done
      
    • log
    • 2014/09/17-23:07:58 {"path":"/", "status":200, "referer":"", "agent":"MSIE", "userid":3}
      2014/09/17-23:08:13 {"path":"/login", "status":301, "referer":"/", "agent":"MSIE", "userid":3}
      2014/09/17-23:08:29 {"path":"/", "status":200, "referer":"", "agent":"MSIE", "userid":3}
      2014/09/17-23:08:45 {"path":"/login", "status":301, "referer":"/", "agent":"MSIE", "userid":3}
      2014/09/17-23:09:00 {"path":"/", "status":200, "referer":"", "agent":"MSIE", "userid":3}
      2014/09/17-23:09:16 {"path":"/login", "status":301, "referer":"/", "agent":"MSIE", "userid":3}
  15. View the query results
  16. # norikra-client event see www.toppage
    {"time":"2014/09/17 23:08:10","cnt":1}
    {"time":"2014/09/17 23:08:41","cnt":2}
    {"time":"2014/09/17 23:09:13","cnt":3}
    
    # norikra-client event see www.toppageviews
    {"time":"2014/09/17 23:08:13","cnt":1}
    {"time":"2014/09/17 23:08:16","cnt":0}
    {"time":"2014/09/17 23:08:43","cnt":1}
    {"time":"2014/09/17 23:08:46","cnt":0}
    {"time":"2014/09/17 23:09:16","cnt":1}
    {"time":"2014/09/17 23:09:19","cnt":0}
  17. Fetch the query results. Fetched results are deleted, so if no new events occur, viewing the query results again returns no data.
  18. # norikra-client event fetch www.toppageviews
    {"time":"2014/09/17 23:08:13","cnt":1}
    {"time":"2014/09/17 23:08:16","cnt":0}
    {"time":"2014/09/17 23:08:43","cnt":1}
    {"time":"2014/09/17 23:08:46","cnt":0}
    {"time":"2014/09/17 23:09:16","cnt":1}
    {"time":"2014/09/17 23:09:19","cnt":0}
    
    # norikra-client event fetch www.toppageviews
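
The difference between the two result sets above (cnt climbing 1, 2, 3 for www.toppage, but cnt 1 per batch for www.toppageviews) can be illustrated with a small stand-alone model. This is only a rough sketch and does not run Norikra or Esper: the input format, the function names `running_count` and `batch_count`, and the second offsets (taken from the script's log timestamps) are all assumptions made for illustration. It also does not model the trailing `cnt: 0` lines or the exact window alignment of the real engine.

```shell
#!/bin/sh
# Simplified model of the two queries; NOT Norikra/Esper itself.
# Hypothetical input format, one event per line: "<second> <path> <status>"

# No window: every matching event increments one ever-growing total.
running_count() {
  awk '$2 == "/" && $3 == 200 { cnt++; print $1, cnt }'
}

# 3-second batches anchored at the first matching event:
# each batch is counted on its own, so the count restarts every batch.
batch_count() {
  awk -v size=3 '
    $2 == "/" && $3 == 200 {
      if (!seen) { start = $1; seen = 1 }
      b = int(($1 - start) / size)     # index of the batch this event falls in
      n[b]++
      if (b > max) max = b
    }
    END {
      if (!seen) exit
      for (i = 0; i <= max; i++)
        if (i in n) print start + (i + 1) * size, n[i]   # batch end, count
    }
  '
}

# Offsets in seconds derived from the log section of this post.
events='0 / 200
15 /login 301
31 / 200
47 /login 301
62 / 200
78 /login 301'

echo "without a window (second, cnt):"
echo "$events" | running_count
echo "with 3-second batches (batch end, cnt):"
echo "$events" | batch_count
```

In this model the windowless count keeps growing for as long as the query lives, while each batch only ever holds the events of its own three seconds, which is the practical reason a view bounds how much data a query has to keep.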

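Thoughts:
  • Easy to pick up: once the data set is defined, a small amount of code (SQL) is enough to start processing data.
  • It has scheduling capability: processing can run forever, and can be set either to work on all data or to work on one time window of data at a time. Because all data is held in memory, watch out for out-of-memory problems when processing all data.
  • Simple to install and runs on a single machine; it has no architecture for distributing data or computation (unlike, e.g., Storm).
  • HA is not considered at present, and the author says it does not need to be.
  • Where might it fit? Collecting and preprocessing complex, varied small data, then either fetching the results periodically or exporting them elsewhere as soon as preprocessing finishes, so that the host's memory is cleared.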
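
The "fetch periodically and export elsewhere" idea could look roughly like the loop below. It is a minimal sketch built only on the `norikra-client event fetch` command used earlier in this post; the output path, the variable names, and the `drain_once` / `drain_loop` helpers are hypothetical, and error handling is kept deliberately thin.

```shell
#!/bin/sh
# Periodically drain a query's results into a local file.
# Since fetched results are removed from Norikra, each round frees server memory.
QUERY="${QUERY:-www.toppageviews}"
OUT="${OUT:-/tmp/norikra-results.jsonl}"              # hypothetical output path
FETCH_CMD="${FETCH_CMD:-norikra-client event fetch}"  # overridable for dry runs

# Fetch once and append whatever came back to $OUT.
drain_once() {
  # $FETCH_CMD is left unquoted on purpose so it splits into command + arguments.
  $FETCH_CMD "$QUERY" >> "$OUT"
}

# drain_loop ROUNDS INTERVAL_SECONDS
drain_loop() {
  n=0
  while [ "$n" -lt "$1" ]; do
    drain_once || echo "fetch failed for $QUERY" >&2
    n=$((n + 1))
    if [ "$n" -lt "$1" ]; then
      sleep "$2"
    fi
  done
}

# Example (not run here): drain 10 times, once every 60 seconds.
# drain_loop 10 60
```

Setting `FETCH_CMD="echo stub"` gives a dry run without a Norikra server. Appending to a file is just one choice; the same loop could pipe each fetch to whatever system takes over the results.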
