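At the start of the year my boss asked me to build manual intervention for search hot words: take the computed ranking of popular user search terms and let someone manually pin a term to a chosen position in the leaderboard. It's a distasteful task, of course, but I'm just a laborer working for a paycheck, and as long as nothing crosses the line I can put up with it.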
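On to the technical part. The workflow: collect users' search logs, count how many times each keyword is searched per day, use the statistical variance over each keyword's count history to derive a score for its recent popularity, and then sort in descending order to produce the result list. (To be more thorough you could add a semantic-analysis step before the computation, which would do a better job of catching newly popular Internet slang. I didn't go that deep, so I'll leave it aside for now.)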
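As a rough illustration of the scoring step, the sketch below computes the population variance of a keyword's daily search counts and treats it as the popularity score. The post doesn't give the exact formula used in the real job, so the class name HotScore, the method varianceScore, and the choice of plain population variance over a fixed window are all assumptions.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the original job.
class HotScore {

    /** Population variance of the daily counts; assumed here to serve as the score. */
    static float varianceScore(long[] dailyCounts) {
        if (dailyCounts.length == 0) {
            return 0f;
        }
        double mean = Arrays.stream(dailyCounts).average().orElse(0);
        double sumSquares = 0;
        for (long count : dailyCounts) {
            double diff = count - mean;
            sumSquares += diff * diff;
        }
        return (float) (sumSquares / dailyCounts.length);
    }

    public static void main(String[] args) {
        long[] steady = {100, 100, 100, 100};   // searched evenly every day
        long[] spiking = {10, 10, 10, 370};     // sudden burst on the last day
        System.out.println(varianceScore(steady));
        System.out.println(varianceScore(spiking));
    }
}
```

A keyword with a flat history scores 0 while one with a sudden burst scores high. Variance alone can't tell a rise from a fall, so a real job would probably need something extra to separate the two.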
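Now add the manual-intervention part. Ranking is a top-N problem to begin with, and the intervention also targets the top few places. Editors have always liked simple, direct, blunt methods: assign a keyword its position outright. That makes this a combined sort on position (priority) and score. priority can be treated as the ranking precedence, so the combined sort strategy is priority descending, then score descending.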
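Before bringing Hadoop into it, the ordering rule itself can be sketched in plain Java. Everything here is illustrative: RankDemo, Entry, and the convention that priority 0 means "not pinned by an editor" are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory model of the ranking; names are illustrative only.
class RankDemo {

    static final class Entry {
        final String keyword;
        final int priority;  // editor-assigned; 0 is assumed to mean "not pinned"
        final float score;   // computed popularity score

        Entry(String keyword, int priority, float score) {
            this.keyword = keyword;
            this.priority = priority;
            this.score = score;
        }
    }

    // priority descending, then score descending
    static final Comparator<Entry> ORDER = (a, b) -> {
        int cmp = Integer.compare(b.priority, a.priority);
        if (cmp != 0) {
            return cmp;
        }
        return Float.compare(b.score, a.score);
    };

    static List<String> rank(List<Entry> entries) {
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(ORDER);
        List<String> keywords = new ArrayList<>();
        for (Entry e : sorted) {
            keywords.add(e.keyword);
        }
        return keywords;
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
                new Entry("organic-top", 0, 9.5f),
                new Entry("pinned-first", 2, 1.0f),
                new Entry("organic-second", 0, 7.2f),
                new Entry("pinned-second", 1, 3.3f));
        // prints [pinned-first, pinned-second, organic-top, organic-second]
        System.out.println(rank(entries));
    }
}
```

Pinned keywords float to the top regardless of score, and everything else keeps its score order.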
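In the map/reduce framework, sorting takes no real skill: you just call a method to tell the job the type of the key to sort on. Sorting on multiple fields, though, requires a custom Writable type that implements the WritableComparable interface to serve as the sort key, which is also simple. There isn't much Hadoop material in Chinese online. I like to show off but don't have much serious Hadoop programming to show, so forgive me if this one makes you laugh.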
Enough talk, here's the code.
1. KeyWritable.java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.DataInputBuffer;
import org.apache.hadoop.io.FloatWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;

// Composite sort key: priority first, then score.
// Written as a top-level class; add "static" back only if you nest it inside the job class.
public class KeyWritable implements WritableComparable<KeyWritable> {

    private IntWritable priority;
    private FloatWritable score;

    // Hadoop creates keys through the no-arg constructor, then calls readFields().
    public KeyWritable() {
        priority = new IntWritable(0);
        score = new FloatWritable(0);
    }

    public KeyWritable(IntWritable priority, FloatWritable score) {
        set(priority, score);
    }

    // score is a float here (the original took a long and silently converted it).
    public KeyWritable(int priority, float score) {
        set(new IntWritable(priority), new FloatWritable(score));
    }

    public void set(IntWritable priority, FloatWritable score) {
        this.priority = priority;
        this.score = score;
    }

    public IntWritable getPriority() {
        return this.priority;
    }

    public FloatWritable getScore() {
        return this.score;
    }

    // Fields must be read in the same order write() emits them.
    @Override
    public void readFields(DataInput in) throws IOException {
        this.priority.readFields(in);
        this.score.readFields(in);
    }

    @Override
    public void write(DataOutput out) throws IOException {
        this.priority.write(out);
        this.score.write(out);
    }

    // Ascending order: priority first, score breaks ties.
    @Override
    public int compareTo(KeyWritable obj) {
        int cmp = this.priority.compareTo(obj.priority);
        if (cmp != 0) {
            return cmp;
        }
        return this.score.compareTo(obj.score);
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof KeyWritable && this.compareTo((KeyWritable) obj) == 0;
    }

    // Uses both fields, matching equals().
    @Override
    public int hashCode() {
        return 31 * priority.hashCode() + score.hashCode();
    }

    // Print the fields; super.toString() would only give the object identity.
    @Override
    public String toString() {
        return priority + "\t" + score;
    }

    /**
     * Comparator
     * @author zhangmiao
     *
     */
    public static class Comparator extends WritableComparator {
        public Comparator() {
            super(KeyWritable.class);
        }

        // Deserializes both keys from their raw bytes, then compares them as objects.
        @Override
        public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
            KeyWritable key1 = new KeyWritable();
            KeyWritable key2 = new KeyWritable();
            DataInputBuffer buffer = new DataInputBuffer();

            try {
                buffer.reset(b1, s1, l1);
                key1.readFields(buffer);
                buffer.reset(b2, s2, l2);
                key2.readFields(buffer);
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
            return compare(key1, key2);
        }

        @Override
        public int compare(WritableComparable a, WritableComparable b) {
            if (a instanceof KeyWritable && b instanceof KeyWritable) {
                return ((KeyWritable) a).compareTo((KeyWritable) b);
            }
            return super.compare(a, b);
        }
    }

    // Negates the ascending result so the job sorts in descending order.
    public static class DecreasingComparator extends Comparator {

        @Override
        public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
            return -super.compare(b1, s1, l1, b2, s2, l2);
        }
    }
}
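The Comparator above rebuilds two KeyWritable objects for every comparison. Since IntWritable and FloatWritable serialize through writeInt and writeFloat (4 big-endian bytes each), the two fields could also be read straight out of the byte arrays. Here is a minimal plain-Java sketch of that idea with no Hadoop dependency; RawKeyDemo and its method names are made up for the example, and ByteBuffer stands in for the Hadoop buffers.

```java
import java.nio.ByteBuffer;

// Plain-Java sketch; RawKeyDemo and its methods are illustrative, not Hadoop API.
class RawKeyDemo {

    /** 4-byte int then 4-byte float, big-endian: the layout writeInt/writeFloat produce. */
    static byte[] serialize(int priority, float score) {
        return ByteBuffer.allocate(8).putInt(priority).putFloat(score).array();
    }

    /** Ascending comparison read directly from the bytes; s1 and s2 are key start offsets. */
    static int compareRaw(byte[] b1, int s1, byte[] b2, int s2) {
        int p1 = ByteBuffer.wrap(b1).getInt(s1);
        int p2 = ByteBuffer.wrap(b2).getInt(s2);
        int cmp = Integer.compare(p1, p2);
        if (cmp != 0) {
            return cmp;
        }
        float f1 = ByteBuffer.wrap(b1).getFloat(s1 + 4);
        float f2 = ByteBuffer.wrap(b2).getFloat(s2 + 4);
        return Float.compare(f1, f2);
    }

    /** Descending order is just the negated ascending result. */
    static int compareRawDescending(byte[] b1, int s1, byte[] b2, int s2) {
        return -compareRaw(b1, s1, b2, s2);
    }

    public static void main(String[] args) {
        byte[] pinned = serialize(1, 5.0f);
        byte[] organic = serialize(0, 9.0f);
        // Negative result: the pinned key sorts before the organic one.
        System.out.println(compareRawDescending(pinned, 0, organic, 0));
    }
}
```

In a real WritableComparator the same reads could be done with its static readInt and readFloat helpers, which avoids allocating anything per comparison.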
2. Set the KeyWritable comparator when submitting the job (job here is a JobConf from the old mapred API)
job.setOutputKeyComparatorClass(KeyWritable.DecreasingComparator.class);
(To be continued)