Connecting to Hive tables on a big-data platform with Win10 + Python 3.5 + impyla: steps and troubleshooting
Environment: hardware configuration and Hadoop/Hive versions
1. Installation steps
pip install pure-sasl
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/16/83/30eaf3765de89808375a8358f9c15d45a3dd44ed26be991471abc0b4480b/pure_sasl-0.5.1-py2.py3-none-any.whl
pip install thrift_sasl==0.2.1 --no-deps
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/80/36/16dfe92d32df63cc2b7b7be8d0e4a736617b7e52daaa7f83ae386a89d179/thrift_sasl-0.2.1.tar.gz
pip install thrift==0.9.3
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ae/58/35e3f0cd290039ff862c2c9d8ae8a76896665d70343d833bdc2f748b8e55/thrift-0.9.3.tar.gz
pip install impyla (the packages installed above are all its dependencies)
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/80/93/f0d226061ee4679d5b593c88c7b2e9e077a271c799d29facf31bf03666c1/impyla-0.14.1.tar.gz (151kB)
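Running pip install impyla failed with: error: Microsoft Visual C++ 14.0 is required. Get it with "Microsoft Visual C++ Build Tools": Downloads | IDE, Code, & Team Foundation Server | Visual Studio
Fix: I downloaded and installed Microsoft Visual C++ 2019 from that page, but the error persisted. Only after installing a toolkit package that someone else had shared did pip install impyla succeed.
Link to the shared package: I no longer remember it.
Once that package is installed, impyla can be installed.
2. Writing the script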
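Before writing the connection script, it can help to confirm that the four packages from the installation steps are actually importable. The following is a minimal sketch, not part of the original post: missing_modules is a hypothetical helper, and the import names puresasl, thrift_sasl, thrift and impala are assumed to be the ones these packages install under.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names for pure-sasl, thrift_sasl, thrift and impyla.
required = ["puresasl", "thrift_sasl", "thrift", "impala"]
missing = missing_modules(required)
if missing:
    print("Not importable yet:", ", ".join(missing))
else:
    print("All dependencies are importable")
```

If anything is reported as not importable, rerunning the corresponding pip install line is the first thing to try.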
# Now we can write a script that connects to the database
from impala.dbapi import connect
from impala.util import as_pandas
conn = connect(host='***', port=10000, auth_mechanism='PLAIN', user='***', password='***', database='***')
cursor = conn.cursor()
cursor.execute('show databases')
print(as_pandas(cursor))
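as_pandas needs pandas to be installed. Since impyla exposes a DB-API 2.0 (PEP 249) interface, the same result can also be read with plain cursor calls. The sketch below is an illustration under assumptions: fetch_dicts is a hypothetical helper, and an in-memory sqlite3 database stands in for the Hive connection so that the example runs without a cluster. With impyla, the conn returned by connect(...) would be passed in instead.

```python
import sqlite3  # stand-in for the Hive connection, used only for this demo


def fetch_dicts(conn, sql):
    """Run sql on a DB-API 2.0 connection and return the rows as dicts."""
    cursor = conn.cursor()
    try:
        cursor.execute(sql)
        # cursor.description has one entry per column; item 0 is the column name.
        columns = [col[0] for col in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]
    finally:
        cursor.close()


demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE t (name TEXT, n INTEGER)")
demo.execute("INSERT INTO t VALUES ('default', 1)")
rows = fetch_dicts(demo, "SELECT name, n FROM t")
print(rows)
demo.close()
```

Closing the cursor in a finally block keeps server-side resources from lingering when a query fails.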
3. Troubleshooting
Running the database connection produced this error:
ThriftParserError: ThriftPy does not support generating module with path in protocol 'c'
The error was traced to the following code in Lib\site-packages\thriftpy\parser\parser.py:
if url_scheme == '':
    with open(path) as fh:
        data = fh.read()
elif url_scheme in ('http', 'https'):
    data = urlopen(path).read()
else:
    raise ThriftParserError('ThriftPy does not support generating module '
                            'with path in protocol \'{}\''.format(
                                url_scheme))
Change it to:
if url_scheme == '':
    with open(path) as fh:
        data = fh.read()
elif url_scheme in ('c', 'd', 'e', 'f'):
    # On Windows the drive letter of an absolute path is parsed as the URL scheme
    with open(path) as fh:
        data = fh.read()
elif url_scheme in ('http', 'https'):
    data = urlopen(path).read()
else:
    raise ThriftParserError('ThriftPy does not support generating module '
                            'with path in protocol \'{}\''.format(
                                url_scheme))
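The reason a drive letter ends up in url_scheme is that URL parsing treats everything before the first colon as the scheme. The sketch below reproduces that with the standard library and shows one possible way to recognise any drive letter rather than a fixed list. looks_like_local_path is a hypothetical helper and the .thrift path is a made-up example; neither comes from thriftpy.

```python
from urllib.parse import urlparse


def looks_like_local_path(path):
    """True if path has no URL scheme, or a one-letter 'scheme' (a Windows drive letter)."""
    scheme = urlparse(path).scheme
    return scheme == '' or len(scheme) == 1


# Hypothetical Windows path, used only for illustration.
windows_path = r'c:\some\dir\example.thrift'

print(urlparse(windows_path).scheme)          # the drive letter, parsed as a scheme
print(looks_like_local_path(windows_path))
print(looks_like_local_path('https://example.com/example.thrift'))
```

A length check like this also covers drives beyond c through f, which a hard-coded tuple would miss.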
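Running the database connection again produced a second error:
TypeError: can't concat str to bytes
The last frame of the traceback points to line 94 of __init__.py (the thrift_sasl package's, going by the code):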
...
header = struct.pack(">BI", status, len(body))
self._trans.write(header + body)
...
Change it to:
...
if isinstance(body, str):
    # encode first so that len(body) below counts bytes, not characters
    body = body.encode()
header = struct.pack(">BI", status, len(body))
self._trans.write(header + body)
...
Running the connection now succeeds.
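The second error comes down to Python 3 refusing to join bytes and str: struct.pack returns bytes, so the body has to be bytes as well. A minimal standalone sketch of the same framing follows. frame_message is a hypothetical name, not a function in the library, and it assumes UTF-8 is an acceptable encoding for the body.

```python
import struct


def frame_message(status, body):
    """Prefix body with a 1-byte status and a 4-byte big-endian length."""
    if isinstance(body, str):
        body = body.encode()  # UTF-8 by default
    # Measure the length after encoding: non-ASCII text takes more bytes than characters.
    header = struct.pack(">BI", status, len(body))
    return header + body


print(frame_message(1, "ab"))
```

Passing bytes or str gives the same frame, which is why the patch is safe for callers that already send bytes.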