1. 首页
  2. 大数据
  3. Impala教程

【Impala教程】(三)Impala Shell命令

外部Shell命令

  • 帮助
    所有命令都可以通过help来查找并获知它的用途。
[root@cm-master shell]#  impala-shell -h
Options:
  -h, --help            show this help message and exit
  -i IMPALAD, --impalad=IMPALAD
                        <host:port> of impalad to connect to
                        [default: cm-master:21000]
  -q QUERY, --query=QUERY
                        Execute a query without the shell [default: none]
  -f QUERY_FILE, --query_file=QUERY_FILE
                        Execute the queries in the query file, delimited by ;
                        [default: none]
  -o OUTPUT_FILE, --output_file=OUTPUT_FILE
                        If set, query results are written to the given file.
                        Results from multiple semicolon-terminated queries
                        will be appended to the same file [default: none]
  -B, --delimited       Output rows in delimited mode [default: False]
  --print_header        Print column names in delimited mode when pretty-
                        printed. [default: False]
  --output_delimiter=OUTPUT_DELIMITER
                        Field delimiter to use for output in delimited mode
                        [default: \t]
  -p, --show_profiles   Always display query profiles after execution
                        [default: False]
  --quiet               Disable verbose output [default: False]
  -v, --version         Print version information [default: False]
  -c, --ignore_query_failure
                        Continue on query failure [default: False]
  -r, --refresh_after_connect
                        Refresh Impala catalog after connecting
                        [default: False]
  -d DEFAULT_DB, --database=DEFAULT_DB
                        Issues a use database command on startup
                        [default: none]
  -u USER, --user=USER  User to authenticate with. [default: root]
  --ssl                 Connect to Impala via SSL-secured connection
                        [default: False]
  --var=KEYVAL          Define variable(s) to be used within the Impala
                        session. [default: none]
  • 指定连接主机名
    [root@cm-slave1 ~]# impala-shell -i cm-slave2
    如果你当前主机有Impala Daemon,不指定也可以,默认连接你当前主机。如果你当前主机没有Impala Daemon则必须要指定连接主机名才可以进行后续操作,否则报错如下。
Error connecting: TTransportException, Could not connect to cm-master:21000
Not connected to Impala, could not execute queries.
  • 启动详细输出
    [root@cm-slave1 ~]# impala-shell -V
    能看见时间(Fetched XXX in XXXs)等详细信息。
[cm-slave2:21000] > select count(1) from emp_import;
Query: select count(1) from emp_import
+----------+
| count(1) |
+----------+
| 7        |
+----------+
Fetched 1 row(s) in 0.58s
  • 关闭详细输出
    [root@cm-slave1 ~]# impala-shell –quiet
    没有时间等详细信息。
[cm-slave2:21000] > select count(1) from emp_import;
+----------+
| count(1) |
+----------+
| 7        |
+----------+
  • 输出执行计划
    [root@cm-slave1 ~]# impala-shell -p
    image
  • 刷新元数据
    注意:在hive创建新的表impala同步不过来,要自己刷新
    [root@cm-slave1 ~]# impala-shell -r

  • 不进入impala-shell进行查询

[root@cm-master ~]# impala-shell -i cm-slave1 -q 'select * from emp_import'
Connected to cm-slave1:21000
Query: select * from emp_import
+------+--------+
| id   | name   |
+------+--------+
| 1    | neil   |
| 2    | jack   |
| 3    | martin |
| 4    | tony   |
| 5    | eric   |
+------+--------+
  • 去格式化输出
    为了减少存储空间,提高性能
    原始:
[cm-slave1:21000] > select count(1) from emp_import;
Query: select count(1) from emp_import
+----------+
| count(1) |
+----------+
| 7        |
+----------+

[root@cm-master shell]# impala-shell -B去格式化输出后:

[cm-slave1:21000] > select count(1) from emp_import;
Query: select count(1) from emp_import
7

补充:
–output_delimiter=指定分隔符
–print_header打印列名

  • 执行查询文件并输出到文件
    [root@cm-slave1 ~]# impala-shell -f stu.sql -o /out.data
    很重要,公司里经常都是用脚本方式来执行Sql。
[root@cm-master ~]# cat impala.sql 
select count(1) from emp_import;
select * from emp_import;

[root@cm-master ~]# impala-shell -i cm-slave1 -f impala.sql -o /out.txt
Connected to cm-slave1:21000
Query: select count(1) from emp_import
Fetched 1 row(s) in 0.71s
Query: select * from emp_import
Fetched 5 row(s) in 0.44s

[root@cm-master ~]# cat /out.txt 
+----------+
| count(1) |
+----------+
| 7        |
+----------+
+------+--------+
| id   | name   |
+------+--------+
| 1    | neil   |
| 2    | jack   |
| 3    | martin |
| 4    | tony   |
| 5    | eric   |
+------+--------+
  • 查询失败时继续查询
    [root@cm-slave1 ~]# impala-shell -f stu.sql -o /out1.data -c
    如果没有-c,假如我有10条sql,第2条是错误Sql,则后面的全部不执行。加上-c,第2条错了,后面的也会执行。
    作用是:假如某些业务更改了,也可以直接跑sql文件,不需要去更改那些错误的Sql语句。

内部Shell命令

  • 帮助
[cm-slave1:21000] > help;

Documented commands (type help <topic>):
========================================
compute   exit     history  quit    shell  unset   version
connect   explain  insert   select  show   use     with   
describe  help     profile  set     tip    values

Undocumented commands:
======================
alter  create  desc  drop  load  source  src  summary
  • connect 连接到指定主机
[cm-slave1:21000] > connect cm-slave2;
Connected to cm-slave2:21000
Server version: impalad version 2.6.0-cdh5.8.3 RELEASE (build c644f476b774db9db87a619628f7a6ecc5f843e0)
  • 刷新元数据
    refresh <表名> 刷新某个表
    或者全量刷新元数据库invalidate metadata(慎用,尤其是生产环境表那么多,全量刷新太占用资源了)
[cm-slave2:21000] > select count(1) from stu;
Query: select count(1) from stu
+----------+
| count(1) |
+----------+
| 32       |
+----------+
Fetched 1 row(s) in 2.33s
[cm-slave2:21000] > refresh stu;
Query: refresh stu

Fetched 0 row(s) in 0.32s
[cm-slave2:21000] > select count(1) from stu;
Query: select count(1) from stu
+----------+
| count(1) |
+----------+
| 41       |
+----------+
Fetched 1 row(s) in 0.59s
[cm-slave2:21000] > invalidate metadata;
Query: invalidate metadata
  • explain 查询执行计划、步骤信息
    性能优化很重要的一点就是查询执行计划!
[cm-slave2:21000] > explain select count(1) from stu ;
Query: explain select count(1) from stu
+------------------------------------------------------------------------------------+
| Explain String                                                                     |
+------------------------------------------------------------------------------------+
| Estimated Per-Host Requirements: Memory=42.00MB VCores=1                           |
| WARNING: The following tables are missing relevant table and/or column statistics. |
| default.stu                                                                        |
|                                                                                    |
| 03:AGGREGATE [FINALIZE]                                                            |
| |  output: count:merge(1)                                                          |
| |                                                                                  |
| 02:EXCHANGE [UNPARTITIONED]                                                        |
| |                                                                                  |
| 01:AGGREGATE                                                                       |
| |  output: count(1)                                                                |
| |                                                                                  |
| 00:SCAN HDFS [default.stu]                                                         |
|    partitions=1/1 files=2 size=361B                                                |
+------------------------------------------------------------------------------------+
Fetched 14 row(s) in 0.72s
  • profile(查询完成后执行)查询最近一次查询的底层信息
[cm-slave2:21000] > select count(1) from stu;
Query: select count(1) from stu
+----------+
| count(1) |
+----------+
| 41       |
+----------+
Fetched 1 row(s) in 4.02s
[cm-slave2:21000] > profile;
Query Runtime Profile:
Query (id=a941fbd0ff26c0b6:4a099b018bcb4794):
  Summary:
    Session ID: b24731c41076fc60:396d45456098be94
    Session Type: BEESWAX
    Start Time: 2018-04-10 12:31:59.659025000
    End Time: 2018-04-10 12:32:03.680848000
    Query Type: QUERY
    Query State: FINISHED
    Query Status: OK
    Impala Version: impalad version 2.6.0-cdh5.8.3 RELEASE (build c644f476b774db9db87a619628f7a6ecc5f843e0)
    User: root
    Connected User: root
    Delegated User: 
    Network Address: ::ffff:192.168.109.141:33448
    Default Db: default
    Sql Statement: select count(1) from stu
    Coordinator: cm-slave2:22000
    Query Options (non default): 
    Plan: 
----------------
Estimated Per-Host Requirements: Memory=42.00MB VCores=1
WARNING: The following tables are missing relevant table and/or column statistics.
default.stu

03:AGGREGATE [FINALIZE]
|  output: count:merge(1)
|  hosts=2 per-host-mem=unavailable
|  tuple-ids=1 row-size=8B cardinality=1
|
02:EXCHANGE [UNPARTITIONED]
|  hosts=2 per-host-mem=unavailable
|  tuple-ids=1 row-size=8B cardinality=1
|
01:AGGREGATE
|  output: count(1)
|  hosts=2 per-host-mem=10.00MB
|  tuple-ids=1 row-size=8B cardinality=1
|
00:SCAN HDFS [default.stu, RANDOM]
   partitions=1/1 files=2 size=361B
   table stats: unavailable
   column stats: all
   hosts=2 per-host-mem=32.00MB
   tuple-ids=0 row-size=0B cardinality=unavailable
----------------
    Estimated Per-Host Mem: 44040192
    Estimated Per-Host VCores: 1
    Tables Missing Stats: default.stu
    Request Pool: root.root
    Admission result: Admitted immediately
    ExecSummary: 
Operator       #Hosts   Avg Time   Max Time  #Rows  Est. #Rows  Peak Mem  Est. Peak Mem  Detail        
-------------------------------------------------------------------------------------------------------
03:AGGREGATE        1    1s774ms    1s774ms      1           1  20.00 KB        -1.00 B  FINALIZE      
02:EXCHANGE         1    9.203us    9.203us      2           1         0        -1.00 B  UNPARTITIONED 
01:AGGREGATE        2  799.663ms    1s098ms      2           1  12.00 KB       10.00 MB                
00:SCAN HDFS        2  145.205ms  230.960ms     41          -1   9.00 KB       32.00 MB  default.stu   
  • 在impala-shell中操作linux命令
[cm-slave2:21000] > shell ps -ef;
UID         PID   PPID  C STIME TTY          TIME CMD
root          1      0  0 09:06 ?        00:00:02 /usr/lib/systemd/systemd --switched-root --system --deserialize 21
root          2      0  0 09:06 ?        00:00:00 [kthreadd]
root          3      2  0 09:06 ?        00:00:01 [ksoftirqd/0]
root          5      2  0 09:06 ?        00:00:00 [kworker/0:0H]

BDStar原创文章。发布者:Liuyanling,转载请注明出处:http://bigdata-star.com/archives/1347

发表评论

登录后才能评论

联系我们

562373081

在线咨询:点击这里给我发消息

邮件:562373081@qq.com

工作时间:周一至周五,9:30-18:30,节假日休息

QR code