

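Storing Array-typed data in Hive: converting an array-formatted string into an array of strings, and using a custom function to turn string data such as '[12,23,23,34]' into array<int> data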

Published: 2024/9/27 · Programming Q&A

1. Create a table with Array columns:

create table t_afan_test (
  info1 array<int>,
  info2 array<string>
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ',';
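To make the two delimiters concrete, here is a minimal Java sketch of how one text line of this table would split into the two array columns. It assumes the table is stored as plain delimited text (the usual default); the class `DelimitedRowSketch` and its methods are hypothetical names used only for illustration and are not part of Hive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hive: mimics how one delimited text line
// of t_afan_test breaks down into the info1 and info2 columns.
final class DelimitedRowSketch {
    static final String FIELD_DELIMITER = "\t"; // FIELDS TERMINATED BY '\t'
    static final String ITEM_DELIMITER = ",";   // COLLECTION ITEMS TERMINATED BY ','

    // First column: info1 array<int>.
    static List<Integer> parseInfo1(String line) {
        String[] fields = line.split(FIELD_DELIMITER, -1);
        List<Integer> result = new ArrayList<>();
        for (String item : fields[0].split(ITEM_DELIMITER)) {
            result.add(Integer.parseInt(item));
        }
        return result;
    }

    // Second column: info2 array<string>.
    static List<String> parseInfo2(String line) {
        String[] fields = line.split(FIELD_DELIMITER, -1);
        return Arrays.asList(fields[1].split(ITEM_DELIMITER));
    }

    public static void main(String[] args) {
        String line = "12,23,23,34\twhat,are,this";
        System.out.println(parseInfo1(line)); // [12, 23, 23, 34]
        System.out.println(parseInfo2(line)); // [what, are, this]
    }
}
```

The tab separates the columns and the comma separates the elements inside each array, which is why a value containing a comma cannot be stored as a single element with these settings.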

2. Insert data:

insert into t_afan_test values (array(12,23,23,34),array("what","are","this"));
insert into t_afan_test values (array(12,23,23,34,56,32),array("what","are","this","aaa"));

3. The query returns the following result:

4. Another example, where the array is stored as a string:

drop table if exists t_afan_test;

create table t_afan_test (
  info1 string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ',';

insert into t_afan_test values ('[12,23,23,34]');
insert into t_afan_test values ('[22,33,43,54]');

select * from t_afan_test;
+--------------------+
| t_afan_test.info1  |
+--------------------+
| [12,23,23,34]      |
| [22,33,43,54]      |
+--------------------+

drop table if exists tmp_xxx;

create table tmp_xxx as
select split(regexp_extract(`info1`,'^\\[(.*)]$',1),',') as key_word_label
from t_afan_test
where `info1` is not null
limit 10;

select * from tmp_xxx;
+-------------------------+
| tmp_xxx.key_word_label  |
+-------------------------+
| ["22","33","43","54"]   |
| ["12","23","23","34"]   |
+-------------------------+

select collect_list(cast(array_element as int)) int_array
from (select explode(key_word_label) array_element from tmp_xxx) s;
+----------------------------+
| int_array                  |
+----------------------------+
| [22,33,43,54,12,23,23,34]  |
+----------------------------+

select explode(key_word_label) array_element from tmp_xxx;
+----------------+
| array_element  |
+----------------+
| 22             |
| 33             |
| 43             |
| 54             |
| 12             |
| 23             |
| 23             |
| 34             |
+----------------+
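The string handling in the queries above can be followed step by step outside Hive. The Java sketch below mirrors the three operations on a single value: extract the text between the brackets with the same regular expression, split it on commas, and cast each element to an integer. The class `BracketStringSketch` and its method names are hypothetical. The sketch assumes every element is a plain integer with no surrounding spaces, and it falls back to an empty string when the pattern does not match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reproduces, for one value, the string handling
// done by regexp_extract, split and cast in the queries above.
final class BracketStringSketch {
    // Same pattern as in the query: capture everything between the outer brackets.
    private static final Pattern BRACKETS = Pattern.compile("^\\[(.*)]$");

    // Mirrors split(regexp_extract(info1, '^\\[(.*)]$', 1), ',').
    static String[] toStringArray(String info1) {
        Matcher matcher = BRACKETS.matcher(info1);
        // Assumption of this sketch: no match is treated as an empty string.
        String inner = matcher.find() ? matcher.group(1) : "";
        return inner.split(",");
    }

    // Mirrors cast(array_element as int) applied to each element.
    static List<Integer> toIntList(String info1) {
        List<Integer> result = new ArrayList<>();
        for (String element : toStringArray(info1)) {
            result.add(Integer.parseInt(element));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(String.join("|", toStringArray("[22,33,43,54]"))); // 22|33|43|54
        System.out.println(toIntList("[12,23,23,34]"));                       // [12, 23, 23, 34]
    }
}
```

Note that the sketch converts one value at a time, whereas the `collect_list` query above has no `group by` and therefore gathers the elements of all rows into a single array.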

5. Converting data in the '[12,23,23,34]' format into struct<> or array:

To convert it into an array, write a UDF. The code is as follows:

package com.xxx.stringtoarray;

import org.apache.hadoop.hive.ql.exec.UDF;

import java.util.ArrayList;
import java.util.List;

/**
 * Converts an array-formatted string into an integer array.
 * @author tzq
 */
public final class StringToArray extends UDF {

    private static final String NULL_STRING = "null";

    /**
     * If the final Hive data type should be struct<>, use Integer[] as the return type instead.
     * @param sourceText the source string
     * @return the parsed integers, or null for blank or "null" input
     */
    public List<Integer> evaluate(String sourceText) {
        if (isBlank(sourceText)) {
            return null;
        }
        if (NULL_STRING.equalsIgnoreCase(sourceText)) {
            return null;
        }
        String[] arr1 = sourceText.replace("[", "").replace("]", "").split(",");
        //Integer[] arr2 = new Integer[arr1.length];
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < arr1.length; i++) {
            // trim() so that input such as "[12, 23]" does not fail in parseInt
            list.add(Integer.parseInt(arr1[i].trim()));
        }
        return list;
    }

    /** Returns true if the string is null, empty, or contains only whitespace. */
    public static boolean isBlank(String str) {
        int strLen;
        if (str != null && (strLen = str.length()) != 0) {
            for (int i = 0; i < strLen; ++i) {
                if (!Character.isWhitespace(str.charAt(i))) {
                    return false;
                }
            }
            return true;
        } else {
            return true;
        }
    }
}
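Since the UDF class depends on the hive-exec library, it can be handy to try the parsing rules locally first. Below is a minimal dependency-free sketch of the same rules; the class `StringToArrayLogic` and the method `parse` are hypothetical names. Two behaviors are assumptions of this sketch rather than something the article specifies: whitespace around elements is ignored, and "[]" yields an empty list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, Hive-free copy of the parsing rules for local experiments.
final class StringToArrayLogic {

    // Returns null for null, blank, or "null" input; otherwise the parsed integers.
    static List<Integer> parse(String sourceText) {
        if (sourceText == null) {
            return null;
        }
        String trimmed = sourceText.trim();
        if (trimmed.isEmpty() || "null".equalsIgnoreCase(trimmed)) {
            return null;
        }
        String inner = trimmed.replace("[", "").replace("]", "").trim();
        List<Integer> list = new ArrayList<>();
        if (inner.isEmpty()) {
            return list; // assumption: "[]" maps to an empty array
        }
        for (String part : inner.split(",")) {
            list.add(Integer.parseInt(part.trim())); // assumption: spaces are tolerated
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(parse("[12,23,23,34]")); // [12, 23, 23, 34]
        System.out.println(parse("[ 1, 2 ]"));      // [1, 2]
        System.out.println(parse("[]"));            // []
        System.out.println(parse("NULL"));          // null
    }
}
```

An element that is not an integer still raises a `NumberFormatException` here; whether such rows should fail or become NULL is a design choice left open.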

Create a temporary function in Hive:

hive> add jar /xxx/xxx/xxx/xx.jar;
hive> create temporary function stringToArray as 'com.xxx.stringtoarray.StringToArray';

It can then be used like this:

drop table if exists t_afan_test;

create table t_afan_test (
  info1 string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ',';

insert into t_afan_test values ('[12,23,23,34]');
insert into t_afan_test values ('[22,33,43,54]');

drop table if exists tmp_xxx;

create table tmp_xxx as
select stringToArray(`info1`) as key_word_label
from t_afan_test;

Check the result:
