
A Daily Active User (DAU) Counting Function


Preface:

  When doing operational analytics, the most common metric is the daily active user count (DAU). Conceptually it is just the deduplicated count of all users active on a given day, but in most cases the data we collect mixes logged-in users with anonymous users, and the two groups overlap. The usual approach is to self-join on cookie (web) or IMEI (mobile) to work out how many users appear as both logged-in and anonymous, and then compute: DAU = logged-in users + anonymous users - anonymous users who also logged in.

  In practice that takes fairly complex HQL and runs slowly, which is why a UDAF was developed to handle it. For reference, the conventional query looks roughly like the sketch below.
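The following is only a sketch of that conventional self-join approach, not the exact production query; it assumes a hypothetical table access_log(flag INT, uid STRING, imei STRING), where flag = 1 marks a logged-in record and anything else marks an anonymous one.

    -- Conventional approach (sketch): count logged-in and anonymous users,
    -- then subtract the imeis that were seen with both flag values.
    SELECT
        COUNT(DISTINCT IF(t.flag = 1, t.uid, NULL))          -- logged-in users
      + COUNT(DISTINCT IF(t.flag <> 1, t.imei, NULL))        -- anonymous users
      - COUNT(DISTINCT IF(m.mixed = 1, t.imei, NULL)) AS dau -- minus the overlap
    FROM access_log t
    JOIN (
        -- per-imei: did this device appear both logged-in and anonymous?
        SELECT imei,
               IF(SUM(IF(flag = 1, 1, 0)) > 0
                  AND SUM(IF(flag <> 1, 1, 0)) > 0, 1, 0) AS mixed
        FROM access_log
        GROUP BY imei
    ) m ON t.imei = m.imei;

The UDAF described below folds this whole computation into a single aggregation pass.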

First, a description of how the function works:

      
    /*
     * Compute the count from the flag, uid and imei of each record.
     * - flag == 1 : add the uid to the UID set, which represents logged-in users
     * - flag != 1 : add the imei|wyy to the IMEI set, which represents anonymous users
     * Every imei|wyy is also stored in a Map; if the same imei|wyy has been seen with
     * both flag values (0 and 1), its value in the Map is set to 2, otherwise it stays
     * equal to the flag.
     * Prototype:
     *      bigint dau_count(flag, uid, imei)
     * Parameters:
     *      flag: 1 (logged-in) or any other value (anonymous)
     *      uid:  user id
     *      imei: the user's secondary identifier (imei|wyy|cookie)
     * Return value:
     *      bigint, the DAU value
     * Example:
     *      > SELECT flag, uid, imei FROM test;
     *      1   uid1 imei1
     *      1   uid2 imei1
     *      0   uid3 imei3
     *
     *      > SELECT dau_count(flag, uid, imei) FROM test;
     *      3        -- 2 logged-in uids + 1 anonymous imei - 0 overlapping imeis
     */

  The flag argument can also be replaced by any other UDF expression that decides whether a uid belongs to a logged-in user; a hypothetical example is sketched below.
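Assuming, purely for illustration, the same hypothetical access_log table and the rule that a record counts as logged-in whenever uid is non-empty, the flag can be derived inline with Hive's built-in if():

    -- derive flag on the fly: 1 when uid is present, 0 otherwise
    -- (table and column names are assumptions for illustration)
    SELECT dau_count(IF(uid IS NOT NULL AND uid <> '', 1, 0), uid, imei)
    FROM access_log;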

The full code follows:

      
    package yy.juefan.udaf;

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Map.Entry;
    import java.util.Set;

    import org.apache.hadoop.hive.ql.exec.Description;
    import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException;
    import org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException;
    import org.apache.hadoop.hive.ql.metadata.HiveException;
    import org.apache.hadoop.hive.ql.parse.SemanticException;
    import org.apache.hadoop.hive.ql.udf.generic.AbstractGenericUDAFResolver;
    import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator;
    import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
    import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
    import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector.PrimitiveCategory;
    import org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector;
    import org.apache.hadoop.hive.serde2.objectinspector.StandardMapObjectInspector;
    import org.apache.hadoop.hive.serde2.objectinspector.StructField;
    import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
    import org.apache.hadoop.hive.serde2.objectinspector.primitive.IntObjectInspector;
    import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
    import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils;
    import org.apache.hadoop.hive.serde2.objectinspector.primitive.StringObjectInspector;
    import org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo;
    import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;

    @Description(name = "dau_count", value = "_FUNC_(flag,uid,imei)")
    public class GenericDauCount extends AbstractGenericUDAFResolver {

        private static final boolean DEBUG = false;
        private static final boolean TRACE = false;

        @Override
        public GenericUDAFEvaluator getEvaluator(TypeInfo[] parameters)
                throws SemanticException {

            if (parameters.length != 3) {
                throw new UDFArgumentLengthException(
                        "Exactly 3 arguments are expected.");
            }

            if (((PrimitiveTypeInfo) parameters[0]).getPrimitiveCategory() != PrimitiveCategory.INT) {
                throw new UDFArgumentTypeException(0,
                        "Only int argument is accepted, but "
                                + parameters[0].getTypeName() + " is passed");
            }

            if (((PrimitiveTypeInfo) parameters[1]).getPrimitiveCategory() != PrimitiveCategory.STRING) {
                throw new UDFArgumentTypeException(1,
                        "Only string argument is accepted, but "
                                + parameters[1].getTypeName() + " is passed");
            }

            if (((PrimitiveTypeInfo) parameters[2]).getPrimitiveCategory() != PrimitiveCategory.STRING) {
                throw new UDFArgumentTypeException(2,
                        "Only string argument is accepted, but "
                                + parameters[2].getTypeName() + " is passed");
            }

            return new GenericDauCountEvaluator();
        }

        public static class GenericDauCountEvaluator extends GenericUDAFEvaluator {

            // fields of the partial-result struct
            StructField uidSetField;
            StructField imeiSetField;
            StructField imeiMapField;

            StructObjectInspector map2red;

            // for PARTIAL1 and COMPLETE: the raw (flag, uid, imei) columns
            IntObjectInspector flagIO;
            StringObjectInspector uidIO;
            StringObjectInspector imeiIO;

            // for PARTIAL2 and FINAL: the struct emitted by terminatePartial()
            StandardListObjectInspector uidSetIO;
            StandardListObjectInspector imeiSetIO;
            StandardMapObjectInspector imeiMapIO;

            private static class DivideAB implements AggregationBuffer {
                Set<String> uidSet;            // distinct uids of logged-in users
                Set<String> imeiSet;           // distinct imeis of anonymous users
                Map<String, Integer> imeiMap;  // imei -> flag, or 2 if both flag values were seen
            }

            @Override
            public AggregationBuffer getNewAggregationBuffer() throws HiveException {
                DivideAB dab = new DivideAB();
                reset(dab);
                return dab;
            }

            @Override
            public void reset(AggregationBuffer agg) throws HiveException {
                DivideAB dab = (DivideAB) agg;
                dab.uidSet = new HashSet<String>();
                dab.imeiSet = new HashSet<String>();
                dab.imeiMap = new HashMap<String, Integer>();
            }

            boolean warned = false;

            @Override
            public ObjectInspector init(Mode m, ObjectInspector[] parameters)
                    throws HiveException {
                super.init(m, parameters);
                // input
                if (m == Mode.PARTIAL1 || m == Mode.COMPLETE) {
                    // for iterate(): the original columns
                    assert (parameters.length == 3);
                    flagIO = (IntObjectInspector) parameters[0];
                    uidIO = (StringObjectInspector) parameters[1];
                    imeiIO = (StringObjectInspector) parameters[2];
                } else {
                    // for merge(): the partial-aggregation struct
                    map2red = (StructObjectInspector) parameters[0];
                    uidSetField = map2red.getStructFieldRef("uidSet");
                    imeiSetField = map2red.getStructFieldRef("imeiSet");
                    imeiMapField = map2red.getStructFieldRef("imeiMap");
                    uidSetIO = (StandardListObjectInspector) uidSetField
                            .getFieldObjectInspector();
                    imeiSetIO = (StandardListObjectInspector) imeiSetField
                            .getFieldObjectInspector();
                    imeiMapIO = (StandardMapObjectInspector) imeiMapField
                            .getFieldObjectInspector();
                }
                // output
                if (m == Mode.PARTIAL1 || m == Mode.PARTIAL2) {
                    // partial result: struct<uidSet:array<string>, imeiSet:array<string>, imeiMap:map<string,int>>
                    ArrayList<ObjectInspector> foi = new ArrayList<ObjectInspector>();
                    ArrayList<String> fname = new ArrayList<String>();
                    foi.add(ObjectInspectorFactory
                            .getStandardListObjectInspector(PrimitiveObjectInspectorFactory.javaStringObjectInspector));
                    foi.add(ObjectInspectorFactory
                            .getStandardListObjectInspector(PrimitiveObjectInspectorFactory.javaStringObjectInspector));
                    foi.add(ObjectInspectorFactory.getStandardMapObjectInspector(
                            PrimitiveObjectInspectorFactory.javaStringObjectInspector,
                            PrimitiveObjectInspectorFactory.javaIntObjectInspector));
                    fname.add("uidSet");
                    fname.add("imeiSet");
                    fname.add("imeiMap");
                    return ObjectInspectorFactory.getStandardStructObjectInspector(
                            fname, foi);
                } else {
                    // final result: a single bigint
                    return PrimitiveObjectInspectorFactory.javaLongObjectInspector;
                }
            }

            @Override
            public void iterate(AggregationBuffer agg, Object[] parameters)
                    throws HiveException {
                if (parameters.length != 3) {
                    return;
                }
                DivideAB dab = (DivideAB) agg;
                int check = PrimitiveObjectInspectorUtils.getInt(parameters[0], flagIO);
                String uid = PrimitiveObjectInspectorUtils.getString(parameters[1], uidIO);
                String imei = PrimitiveObjectInspectorUtils.getString(parameters[2], imeiIO);
                if (check == 1) {
                    // logged-in user
                    dab.uidSet.add(uid);
                } else {
                    // anonymous user
                    dab.imeiSet.add(imei);
                }
                // record whether this imei has been seen with both flag values
                if (dab.imeiMap.containsKey(imei)) {
                    int flag = dab.imeiMap.get(imei);
                    if (flag < 2 && flag != check) {
                        dab.imeiMap.put(imei, 2);
                    }
                } else {
                    dab.imeiMap.put(imei, check);
                }
            }

            @Override
            public Object terminatePartial(AggregationBuffer agg) throws HiveException {
                DivideAB myagg = (DivideAB) agg;
                // ship the intermediate state as a struct of (list, list, map)
                Object[] partialResult = new Object[3];
                partialResult[0] = new ArrayList<String>(myagg.uidSet);
                partialResult[1] = new ArrayList<String>(myagg.imeiSet);
                partialResult[2] = new HashMap<String, Integer>(myagg.imeiMap);
                return partialResult;
            }

            @SuppressWarnings("unchecked")
            @Override
            public void merge(AggregationBuffer agg, Object partial) throws HiveException {
                if (partial == null) {
                    return;
                }
                DivideAB dab = (DivideAB) agg;
                Object uidSet = map2red.getStructFieldData(partial, uidSetField);
                Object imeiSet = map2red.getStructFieldData(partial, imeiSetField);
                Object imeiMap = map2red.getStructFieldData(partial, imeiMapField);

                List<Object> uidlist = (List<Object>) uidSetIO.getList(uidSet);
                if (uidlist != null) {
                    for (Object obj : uidlist) {
                        dab.uidSet.add(obj.toString());
                    }
                }
                // note: the imei list must be read with imeiSetIO, not uidSetIO
                List<Object> imeilist = (List<Object>) imeiSetIO.getList(imeiSet);
                if (imeilist != null) {
                    for (Object obj : imeilist) {
                        dab.imeiSet.add(obj.toString());
                    }
                }
                Map<?, ?> imeimap = imeiMapIO.getMap(imeiMap);
                for (Entry<?, ?> ele : imeimap.entrySet()) {
                    String key = ele.getKey().toString();
                    int val = Integer.parseInt(ele.getValue().toString());
                    if (dab.imeiMap.containsKey(key)) {
                        int flag = dab.imeiMap.get(key);
                        if (flag < 2 && flag != val) {
                            dab.imeiMap.put(key, 2);
                        }
                    } else {
                        dab.imeiMap.put(key, val);
                    }
                }
            }

            @Override
            public Object terminate(AggregationBuffer agg) throws HiveException {
                DivideAB dab = (DivideAB) agg;
                // imeis that appeared both as logged-in and as anonymous
                int mix = 0;
                for (int val : dab.imeiMap.values()) {
                    if (val == 2) {
                        mix++;
                    }
                }
                // DAU = logged-in users + anonymous users - overlap
                return (long) (dab.uidSet.size() + dab.imeiSet.size() - mix);
            }
        }
    }
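After packaging the class into a jar, the function can be registered and called from Hive roughly like this; the jar path is an assumption for illustration, and the test table matches the example in the comment above:

    ADD JAR /path/to/dau_udaf.jar;   -- assumed jar path
    CREATE TEMPORARY FUNCTION dau_count AS 'yy.juefan.udaf.GenericDauCount';

    -- flag = 1 for logged-in rows, anything else for anonymous rows
    SELECT dau_count(flag, uid, imei) FROM test;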


More work has come up, so I am just posting the code for now; the analysis will follow in a later post.


