Running the Spark SQL CLI
1: Run ./bin/spark-sql
First copy hive-site.xml into Spark's conf directory.
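The copy step could be sketched as a small shell function. The function name and both directory arguments are assumptions for illustration; the post does not say where Hive is installed.

```shell
# Copy hive-site.xml from a Hive conf directory into Spark's conf directory.
# Both arguments are placeholders; the post does not give the Hive path.
copy_hive_site() {
  local hive_conf_dir="$1" spark_conf_dir="$2"
  if [ ! -f "$hive_conf_dir/hive-site.xml" ]; then
    echo "hive-site.xml not found in $hive_conf_dir" >&2
    return 1
  fi
  cp "$hive_conf_dir/hive-site.xml" "$spark_conf_dir/"
}

# Example call (hypothetical paths):
# copy_hive_site /path/to/hive/conf /path/to/spark-1.2.0-bin-2.4.1/conf
```

Failing early when the file is missing avoids starting spark-sql against an unintended default metastore configuration.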
Spark then reports that it must be built with two additional options (-Phive and -Phive-thriftserver).
Rebuild: ./make-distribution.sh --tgz -Phadoop-2.4 -Pyarn -DskipTests -Dhadoop.version=2.4.1 -Phive -Phive-thriftserver
2: Run again:
[jifeng@feng02 spark-1.2.0-bin-2.4.1]$ ./bin/spark-sql
Spark assembly has been built with Hive, including Datanucleus jars on classpath
log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:346)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:101)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1412)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:62)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
    at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2453)
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2465)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:340)
    ... 9 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1410)
    ... 14 more
Caused by: javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
NestedThrowables:
java.lang.reflect.InvocationTargetException
    at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:587)
    at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:788)
    at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:333)
    at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at javax.jdo.JDOHelper$16.run(JDOHelper.java:1965)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.jdo.JDOHelper.invoke(JDOHelper.java:1960)
    at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
    at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
    at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
    at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:310)
    at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:339)
    at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:248)
    at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:223)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:58)
    at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:67)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:497)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:475)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:523)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:397)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:356)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:54)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:59)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.newHMSHandler(HiveMetaStore.java:4944)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:171)
    ... 19 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
    at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:325)
    at org.datanucleus.store.AbstractStoreManager.registerConnectionFactory(AbstractStoreManager.java:282)
    at org.datanucleus.store.AbstractStoreManager.<init>(AbstractStoreManager.java:240)
    at org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:286)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
    at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301)
    at org.datanucleus.NucleusContext.createStoreManagerForProperties(NucleusContext.java:1187)
    at org.datanucleus.NucleusContext.initialise(NucleusContext.java:356)
    at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:775)
    ... 48 more
Caused by: org.datanucleus.exceptions.NucleusException: Attempt to invoke the "dbcp-builtin" plugin to create a ConnectionPool gave an error : The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:259)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl.initialiseDataSources(ConnectionFactoryImpl.java:131)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl.<init>(ConnectionFactoryImpl.java:85)
    ... 66 more
Caused by: org.datanucleus.store.rdbms.connectionpool.DatastoreDriverNotFoundException: The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.
    at org.datanucleus.store.rdbms.connectionpool.AbstractConnectionPoolFactory.loadDriver(AbstractConnectionPoolFactory.java:58)
    at org.datanucleus.store.rdbms.connectionpool.DBCPBuiltinConnectionPoolFactory.createConnectionPool(DBCPBuiltinConnectionPoolFactory.java:49)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:238)
    ... 68 more
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
The trace shows that the MySQL JDBC driver jar cannot be found.
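In a chained Java trace, the last "Caused by" entry usually names the underlying problem. A minimal sketch for pulling it out of captured output follows; the function name and the log file name are hypothetical, and it assumes the trace has one entry per line.

```shell
# Print the innermost "Caused by" entry of a Java stack trace read from stdin.
# Assumes one trace entry per line, as a terminal would normally show it.
root_cause() {
  grep 'Caused by:' | tail -n 1
}

# Example call (spark-sql.log is a hypothetical file holding the captured output):
# root_cause < spark-sql.log
```

Applied to a trace like the one in this step, this should surface the DatastoreDriverNotFoundException entry without reading every frame.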
3: Run with the JDBC driver loaded
[jifeng@feng02 spark-1.2.0-bin-2.4.1]$ ./bin/spark-sql --driver-class-path /home/jifeng/hadoop/spark-1.2.0-bin-2.4.1/lib/mysql-connector-java-5.1.32-bin.jar
Spark assembly has been built with Hive, including Datanucleus jars on classpath
log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/03/04 14:30:01 INFO SecurityManager: Changing view acls to: jifeng
15/03/04 14:30:01 INFO SecurityManager: Changing modify acls to: jifeng
15/03/04 14:30:01 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(jifeng); users with modify permissions: Set(jifeng)
15/03/04 14:30:02 INFO Slf4jLogger: Slf4jLogger started
15/03/04 14:30:02 INFO Remoting: Starting remoting
15/03/04 14:30:02 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@feng02:53897]
15/03/04 14:30:02 INFO Utils: Successfully started service 'sparkDriver' on port 53897.
15/03/04 14:30:02 INFO SparkEnv: Registering MapOutputTracker
15/03/04 14:30:02 INFO SparkEnv: Registering BlockManagerMaster
15/03/04 14:30:02 INFO DiskBlockManager: Created local directory at /tmp/spark-local-20150304143002-0534
15/03/04 14:30:02 INFO MemoryStore: MemoryStore started with capacity 267.3 MB
15/03/04 14:30:02 INFO HttpFileServer: HTTP File server directory is /tmp/spark-5172cadb-24ba-4eaa-b78b-cefb2e0ffb5b
15/03/04 14:30:02 INFO HttpServer: Starting HTTP Server
15/03/04 14:30:02 INFO Utils: Successfully started service 'HTTP file server' on port 55683.
15/03/04 14:30:03 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/03/04 14:30:03 INFO SparkUI: Started SparkUI at http://feng02:4040
15/03/04 14:30:03 INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@feng02:53897/user/HeartbeatReceiver
15/03/04 14:30:03 INFO NettyBlockTransferService: Server created on 39062
15/03/04 14:30:03 INFO BlockManagerMaster: Trying to register BlockManager
15/03/04 14:30:03 INFO BlockManagerMasterActor: Registering block manager localhost:39062 with 267.3 MB RAM, BlockManagerId(<driver>, localhost, 39062)
15/03/04 14:30:03 INFO BlockManagerMaster: Registered BlockManager
SET spark.sql.hive.version=0.13.1
15/03/04 14:30:04 INFO HiveMetaStore: 0: get_all_databases
15/03/04 14:30:04 INFO audit: ugi=jifeng ip=unknown-ip-addr cmd=get_all_databases
15/03/04 14:30:04 INFO HiveMetaStore: 0: get_functions: db=default pat=*
15/03/04 14:30:04 INFO audit: ugi=jifeng ip=unknown-ip-addr cmd=get_functions: db=default pat=*
15/03/04 14:30:04 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
spark-sql>
4: Operations:
show tables;
select * from student;
> select * from student;
15/03/04 14:42:05 WARN HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead
15/03/04 14:42:05 INFO ParseDriver: Parsing command: select * from student
15/03/04 14:42:05 INFO ParseDriver: Parse Completed
15/03/04 14:42:05 INFO HiveMetaStore: 0: get_table : db=default tbl=student
15/03/04 14:42:05 INFO audit: ugi=jifeng ip=unknown-ip-addr cmd=get_table : db=default tbl=student
15/03/04 14:42:05 INFO deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
15/03/04 14:42:05 INFO MemoryStore: ensureFreeSpace(444571) called with curMem=4121, maxMem=280248975
15/03/04 14:42:05 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 434.2 KB, free 266.8 MB)
15/03/04 14:42:05 INFO MemoryStore: ensureFreeSpace(47070) called with curMem=448692, maxMem=280248975
15/03/04 14:42:05 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 46.0 KB, free 266.8 MB)
15/03/04 14:42:05 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on localhost:39062 (size: 46.0 KB, free: 267.2 MB)
15/03/04 14:42:05 INFO BlockManagerMaster: Updated info of block broadcast_1_piece0
15/03/04 14:42:05 INFO SparkContext: Created broadcast 1 from broadcast at TableReader.scala:68
15/03/04 14:42:06 INFO BlockManager: Removing broadcast 0
15/03/04 14:42:06 INFO BlockManager: Removing block broadcast_0
15/03/04 14:42:06 INFO MemoryStore: Block broadcast_0 of size 2560 dropped from memory (free 279755773)
15/03/04 14:42:06 INFO BlockManager: Removing block broadcast_0_piece0
15/03/04 14:42:06 INFO MemoryStore: Block broadcast_0_piece0 of size 1561 dropped from memory (free 279757334)
15/03/04 14:42:06 INFO BlockManagerInfo: Removed broadcast_0_piece0 on localhost:39062 in memory (size: 1561.0 B, free: 267.2 MB)
15/03/04 14:42:06 INFO BlockManagerMaster: Updated info of block broadcast_0_piece0
15/03/04 14:42:06 INFO ContextCleaner: Cleaned broadcast 0
15/03/04 14:42:06 INFO FileInputFormat: Total input paths to process : 1
15/03/04 14:42:06 INFO SparkContext: Starting job: collect at SparkPlan.scala:84
15/03/04 14:42:06 INFO DAGScheduler: Got job 1 (collect at SparkPlan.scala:84) with 2 output partitions (allowLocal=false)
15/03/04 14:42:06 INFO DAGScheduler: Final stage: Stage 1(collect at SparkPlan.scala:84)
15/03/04 14:42:06 INFO DAGScheduler: Parents of final stage: List()
15/03/04 14:42:06 INFO DAGScheduler: Missing parents: List()
15/03/04 14:42:06 INFO DAGScheduler: Submitting Stage 1 (MappedRDD[7] at map at SparkPlan.scala:84), which has no missing parents
15/03/04 14:42:06 INFO MemoryStore: ensureFreeSpace(8616) called with curMem=491641, maxMem=280248975
15/03/04 14:42:06 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 8.4 KB, free 266.8 MB)
15/03/04 14:42:06 INFO MemoryStore: ensureFreeSpace(5184) called with curMem=500257, maxMem=280248975
15/03/04 14:42:06 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 5.1 KB, free 266.8 MB)
15/03/04 14:42:06 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on localhost:39062 (size: 5.1 KB, free: 267.2 MB)
15/03/04 14:42:06 INFO BlockManagerMaster: Updated info of block broadcast_2_piece0
15/03/04 14:42:06 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:838
15/03/04 14:42:06 INFO DAGScheduler: Submitting 2 missing tasks from Stage 1 (MappedRDD[7] at map at SparkPlan.scala:84)
15/03/04 14:42:06 INFO TaskSchedulerImpl: Adding task set 1.0 with 2 tasks
15/03/04 14:42:06 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, localhost, ANY, 1318 bytes)
15/03/04 14:42:06 INFO Executor: Running task 0.0 in stage 1.0 (TID 1)
15/03/04 14:42:06 INFO HadoopRDD: Input split: hdfs://feng01:9000/user/hive/warehouse/student/stu.txt:0+28
15/03/04 14:42:06 INFO deprecation: mapred.tip.id is deprecated. Instead, use mapreduce.task.id
15/03/04 14:42:06 INFO deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
15/03/04 14:42:06 INFO deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
15/03/04 14:42:06 INFO deprecation: mapred.task.partition is deprecated. Instead, use mapreduce.task.partition
15/03/04 14:42:06 INFO deprecation: mapred.job.id is deprecated. Instead, use mapreduce.job.id
15/03/04 14:42:06 INFO Executor: Finished task 0.0 in stage 1.0 (TID 1). 1843 bytes result sent to driver
15/03/04 14:42:06 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 2, localhost, ANY, 1318 bytes)
15/03/04 14:42:06 INFO Executor: Running task 1.0 in stage 1.0 (TID 2)
15/03/04 14:42:06 INFO HadoopRDD: Input split: hdfs://feng01:9000/user/hive/warehouse/student/stu.txt:28+29
15/03/04 14:42:06 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 267 ms on localhost (1/2)
15/03/04 14:42:06 INFO Executor: Finished task 1.0 in stage 1.0 (TID 2). 1828 bytes result sent to driver
15/03/04 14:42:06 INFO DAGScheduler: Stage 1 (collect at SparkPlan.scala:84) finished in 0.313 s
15/03/04 14:42:06 INFO StatsReportListener: Finished stage: org.apache.spark.scheduler.StageInfo@61f0612d
15/03/04 14:42:06 INFO DAGScheduler: Job 1 finished: collect at SparkPlan.scala:84, took 0.338182 s
15/03/04 14:42:06 INFO StatsReportListener: task runtime:(count: 2, mean: 165.000000, stdev: 102.000000, max: 267.000000, min: 63.000000)
15/03/04 14:42:06 INFO StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
15/03/04 14:42:06 INFO StatsReportListener: 63.0 ms 63.0 ms 63.0 ms 63.0 ms 267.0 ms 267.0 ms 267.0 ms 267.0 ms 267.0 ms
1 nick 24
2 doping 25
3 caizhi 26
4 liaozhi 27
5 wind 30
Time taken: 1.442 seconds
15/03/04 14:42:06 INFO CliDriver: Time taken: 1.442 seconds
spark-sql> 15/03/04 14:42:06 INFO StatsReportListener: task result size:(count: 2, mean: 1835.500000, stdev: 7.500000, max: 1843.000000, min: 1828.000000)
15/03/04 14:42:06 INFO StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
15/03/04 14:42:06 INFO StatsReportListener: 1828.0 B 1828.0 B 1828.0 B 1828.0 B 1843.0 B 1843.0 B 1843.0 B 1843.0 B 1843.0 B
15/03/04 14:42:06 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 2) in 63 ms on localhost (2/2)
15/03/04 14:42:06 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
15/03/04 14:42:06 INFO StatsReportListener: executor (non-fetch) time pct: (count: 2, mean: 73.640093, stdev: 5.386124, max: 79.026217, min: 68.253968)
15/03/04 14:42:06 INFO StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
15/03/04 14:42:06 INFO StatsReportListener: 68 % 68 % 68 % 68 % 79 % 79 % 79 % 79 % 79 %
15/03/04 14:42:06 INFO StatsReportListener: other time pct: (count: 2, mean: 26.359907, stdev: 5.386124, max: 31.746032, min: 20.973783)
15/03/04 14:42:06 INFO StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
15/03/04 14:42:06 INFO StatsReportListener: 21 % 21 % 21 % 21 % 32 % 32 % 32 % 32 % 32 %
JOIN operation
select a.*,b.* from student a join course b where a.id=b.id ;
select a.*,b.* from student a join course b where a.id=b.id ;
15/03/04 14:45:18 WARN HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead
15/03/04 14:45:18 INFO ParseDriver: Parsing command: select a.*,b.* from student a join course b where a.id=b.id
15/03/04 14:45:18 INFO ParseDriver: Parse Completed
15/03/04 14:45:18 INFO HiveMetaStore: 0: get_table : db=default tbl=student
15/03/04 14:45:18 INFO audit: ugi=jifeng ip=unknown-ip-addr cmd=get_table : db=default tbl=student
15/03/04 14:45:18 INFO HiveMetaStore: 0: get_table : db=default tbl=course
15/03/04 14:45:18 INFO audit: ugi=jifeng ip=unknown-ip-addr cmd=get_table : db=default tbl=course
15/03/04 14:45:19 INFO MemoryStore: ensureFreeSpace(444571) called with curMem=1014900, maxMem=280248975
15/03/04 14:45:19 INFO MemoryStore: Block broadcast_8 stored as values in memory (estimated size 434.2 KB, free 265.9 MB)
15/03/04 14:45:19 INFO MemoryStore: ensureFreeSpace(47070) called with curMem=1459471, maxMem=280248975
15/03/04 14:45:19 INFO MemoryStore: Block broadcast_8_piece0 stored as bytes in memory (estimated size 46.0 KB, free 265.8 MB)
15/03/04 14:45:19 INFO BlockManagerInfo: Added broadcast_8_piece0 in memory on localhost:39062 (size: 46.0 KB, free: 267.1 MB)
15/03/04 14:45:19 INFO BlockManagerMaster: Updated info of block broadcast_8_piece0
15/03/04 14:45:19 INFO SparkContext: Created broadcast 8 from broadcast at TableReader.scala:68
15/03/04 14:45:19 INFO MemoryStore: ensureFreeSpace(444627) called with curMem=1506541, maxMem=280248975
15/03/04 14:45:19 INFO MemoryStore: Block broadcast_9 stored as values in memory (estimated size 434.2 KB, free 265.4 MB)
15/03/04 14:45:19 INFO MemoryStore: ensureFreeSpace(47091) called with curMem=1951168, maxMem=280248975
15/03/04 14:45:19 INFO MemoryStore: Block broadcast_9_piece0 stored as bytes in memory (estimated size 46.0 KB, free 265.4 MB)
15/03/04 14:45:19 INFO BlockManagerInfo: Added broadcast_9_piece0 in memory on localhost:39062 (size: 46.0 KB, free: 267.1 MB)
15/03/04 14:45:19 INFO BlockManagerMaster: Updated info of block broadcast_9_piece0
15/03/04 14:45:19 INFO SparkContext: Created broadcast 9 from broadcast at TableReader.scala:68
15/03/04 14:45:19 INFO FileInputFormat: Total input paths to process : 1
15/03/04 14:45:19 INFO SparkContext: Starting job: collect at BroadcastHashJoin.scala:53
15/03/04 14:45:19 INFO DAGScheduler: Got job 4 (collect at BroadcastHashJoin.scala:53) with 2 output partitions (allowLocal=false)
15/03/04 14:45:19 INFO DAGScheduler: Final stage: Stage 4(collect at BroadcastHashJoin.scala:53)
15/03/04 14:45:19 INFO DAGScheduler: Parents of final stage: List()
15/03/04 14:45:19 INFO DAGScheduler: Missing parents: List()
15/03/04 14:45:19 INFO DAGScheduler: Submitting Stage 4 (MappedRDD[23] at map at BroadcastHashJoin.scala:53), which has no missing parents
15/03/04 14:45:19 INFO MemoryStore: ensureFreeSpace(6552) called with curMem=1998259, maxMem=280248975
15/03/04 14:45:19 INFO MemoryStore: Block broadcast_10 stored as values in memory (estimated size 6.4 KB, free 265.4 MB)
15/03/04 14:45:19 INFO MemoryStore: ensureFreeSpace(3857) called with curMem=2004811, maxMem=280248975
15/03/04 14:45:19 INFO MemoryStore: Block broadcast_10_piece0 stored as bytes in memory (estimated size 3.8 KB, free 265.4 MB)
15/03/04 14:45:19 INFO BlockManagerInfo: Added broadcast_10_piece0 in memory on localhost:39062 (size: 3.8 KB, free: 267.1 MB)
15/03/04 14:45:19 INFO BlockManagerMaster: Updated info of block broadcast_10_piece0
15/03/04 14:45:19 INFO SparkContext: Created broadcast 10 from broadcast at DAGScheduler.scala:838
15/03/04 14:45:19 INFO DAGScheduler: Submitting 2 missing tasks from Stage 4 (MappedRDD[23] at map at BroadcastHashJoin.scala:53)
15/03/04 14:45:19 INFO TaskSchedulerImpl: Adding task set 4.0 with 2 tasks
15/03/04 14:45:19 INFO TaskSetManager: Starting task 0.0 in stage 4.0 (TID 7, localhost, ANY, 1320 bytes)
15/03/04 14:45:19 INFO Executor: Running task 0.0 in stage 4.0 (TID 7)
15/03/04 14:45:19 INFO HadoopRDD: Input split: hdfs://feng01:9000/user/hive/warehouse/course/course.txt:0+60
15/03/04 14:45:19 WARN LazyStruct: Extra bytes detected at the end of the row! Ignoring similar problems.
15/03/04 14:45:19 INFO Executor: Finished task 0.0 in stage 4.0 (TID 7). 2091 bytes result sent to driver
15/03/04 14:45:19 INFO TaskSetManager: Starting task 1.0 in stage 4.0 (TID 8, localhost, ANY, 1320 bytes)
15/03/04 14:45:19 INFO Executor: Running task 1.0 in stage 4.0 (TID 8)
15/03/04 14:45:19 INFO HadoopRDD: Input split: hdfs://feng01:9000/user/hive/warehouse/course/course.txt:60+61
15/03/04 14:45:19 INFO TaskSetManager: Finished task 0.0 in stage 4.0 (TID 7) in 26 ms on localhost (1/2)
15/03/04 14:45:19 INFO Executor: Finished task 1.0 in stage 4.0 (TID 8). 2044 bytes result sent to driver
15/03/04 14:45:19 INFO DAGScheduler: Stage 4 (collect at BroadcastHashJoin.scala:53) finished in 0.042 s
15/03/04 14:45:19 INFO StatsReportListener: Finished stage: org.apache.spark.scheduler.StageInfo@40fa4929
15/03/04 14:45:19 INFO StatsReportListener: task runtime:(count: 2, mean: 25.500000, stdev: 0.500000, max: 26.000000, min: 25.000000)
15/03/04 14:45:19 INFO StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
15/03/04 14:45:19 INFO StatsReportListener: 25.0 ms 25.0 ms 25.0 ms 25.0 ms 26.0 ms 26.0 ms 26.0 ms 26.0 ms 26.0 ms
15/03/04 14:45:19 INFO DAGScheduler: Job 4 finished: collect at BroadcastHashJoin.scala:53, took 0.075328 s
15/03/04 14:45:19 INFO MemoryStore: ensureFreeSpace(2264) called with curMem=2008668, maxMem=280248975
15/03/04 14:45:19 INFO MemoryStore: Block broadcast_11 stored as values in memory (estimated size 2.2 KB, free 265.3 MB)
15/03/04 14:45:19 INFO StatsReportListener: task result size:(count: 2, mean: 2067.500000, stdev: 23.500000, max: 2091.000000, min: 2044.000000)
15/03/04 14:45:19 INFO StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
15/03/04 14:45:19 INFO StatsReportListener: 2044.0 B 2044.0 B 2044.0 B 2044.0 B 2.0 KB 2.0 KB 2.0 KB 2.0 KB 2.0 KB
15/03/04 14:45:19 INFO TaskSetManager: Finished task 1.0 in stage 4.0 (TID 8) in 25 ms on localhost (2/2)
15/03/04 14:45:19 INFO TaskSchedulerImpl: Removed TaskSet 4.0, whose tasks have all completed, from pool
15/03/04 14:45:19 INFO StatsReportListener: executor (non-fetch) time pct: (count: 2, mean: 51.153846, stdev: 8.846154, max: 60.000000, min: 42.307692)
15/03/04 14:45:19 INFO StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
15/03/04 14:45:19 INFO StatsReportListener: 42 % 42 % 42 % 42 % 60 % 60 % 60 % 60 % 60 %
15/03/04 14:45:19 INFO StatsReportListener: other time pct: (count: 2, mean: 48.846154, stdev: 8.846154, max: 57.692308, min: 40.000000)
15/03/04 14:45:19 INFO StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
15/03/04 14:45:19 INFO StatsReportListener: 40 % 40 % 40 % 40 % 58 % 58 % 58 % 58 % 58 %
15/03/04 14:45:19 INFO MemoryStore: ensureFreeSpace(350) called with curMem=2010932, maxMem=280248975
15/03/04 14:45:19 INFO MemoryStore: Block broadcast_11_piece0 stored as bytes in memory (estimated size 350.0 B, free 265.3 MB)
15/03/04 14:45:19 INFO BlockManagerInfo: Added broadcast_11_piece0 in memory on localhost:39062 (size: 350.0 B, free: 267.1 MB)
15/03/04 14:45:19 INFO BlockManagerMaster: Updated info of block broadcast_11_piece0
15/03/04 14:45:19 INFO SparkContext: Created broadcast 11 from broadcast at BroadcastHashJoin.scala:55
15/03/04 14:45:19 INFO FileInputFormat: Total input paths to process : 1
15/03/04 14:45:19 INFO SparkContext: Starting job: collect at SparkPlan.scala:84
15/03/04 14:45:19 INFO DAGScheduler: Got job 5 (collect at SparkPlan.scala:84) with 2 output partitions (allowLocal=false)
15/03/04 14:45:19 INFO DAGScheduler: Final stage: Stage 5(collect at SparkPlan.scala:84)
15/03/04 14:45:19 INFO DAGScheduler: Parents of final stage: List()
15/03/04 14:45:19 INFO DAGScheduler: Missing parents: List()
15/03/04 14:45:19 INFO DAGScheduler: Submitting Stage 5 (MappedRDD[29] at map at SparkPlan.scala:84), which has no missing parents
15/03/04 14:45:19 INFO MemoryStore: ensureFreeSpace(11896) called with curMem=2011282, maxMem=280248975
15/03/04 14:45:19 INFO MemoryStore: Block broadcast_12 stored as values in memory (estimated size 11.6 KB, free 265.3 MB)
15/03/04 14:45:19 INFO MemoryStore: ensureFreeSpace(6623) called with curMem=2023178, maxMem=280248975
15/03/04 14:45:19 INFO MemoryStore: Block broadcast_12_piece0 stored as bytes in memory (estimated size 6.5 KB, free 265.3 MB)
15/03/04 14:45:19 INFO BlockManagerInfo: Added broadcast_12_piece0 in memory on localhost:39062 (size: 6.5 KB, free: 267.1 MB)
15/03/04 14:45:19 INFO BlockManagerMaster: Updated info of block broadcast_12_piece0
15/03/04 14:45:19 INFO SparkContext: Created broadcast 12 from broadcast at DAGScheduler.scala:838
15/03/04 14:45:19 INFO DAGScheduler: Submitting 2 missing tasks from Stage 5 (MappedRDD[29] at map at SparkPlan.scala:84)
15/03/04 14:45:19 INFO TaskSchedulerImpl: Adding task set 5.0 with 2 tasks
15/03/04 14:45:19 INFO TaskSetManager: Starting task 0.0 in stage 5.0 (TID 9, localhost, ANY, 1318 bytes)
15/03/04 14:45:19 INFO Executor: Running task 0.0 in stage 5.0 (TID 9)
15/03/04 14:45:19 INFO HadoopRDD: Input split: hdfs://feng01:9000/user/hive/warehouse/student/stu.txt:0+28
15/03/04 14:45:19 INFO Executor: Finished task 0.0 in stage 5.0 (TID 9). 1930 bytes result sent to driver
15/03/04 14:45:19 INFO TaskSetManager: Starting task 1.0 in stage 5.0 (TID 10, localhost, ANY, 1318 bytes)
15/03/04 14:45:19 INFO Executor: Running task 1.0 in stage 5.0 (TID 10)
15/03/04 14:45:19 INFO TaskSetManager: Finished task 0.0 in stage 5.0 (TID 9) in 27 ms on localhost (1/2)
15/03/04 14:45:19 INFO HadoopRDD: Input split: hdfs://feng01:9000/user/hive/warehouse/student/stu.txt:28+29
15/03/04 14:45:19 INFO Executor: Finished task 1.0 in stage 5.0 (TID 10). 1884 bytes result sent to driver
15/03/04 14:45:19 INFO DAGScheduler: Stage 5 (collect at SparkPlan.scala:84) finished in 0.066 s
15/03/04 14:45:19 INFO StatsReportListener: Finished stage: org.apache.spark.scheduler.StageInfo@4942722a
15/03/04 14:45:19 INFO DAGScheduler: Job 5 finished: collect at SparkPlan.scala:84, took 0.094749 s
1 nick 24 1 英語 中文 法文 日文
2 doping 25 2 中文 法文
3 caizhi 26 3 中文 法文 日文
4 liaozhi 27 4 中文 法文 拉丁
5 wind 30 5 中文 法文 德文
Time taken: 0.612 seconds
15/03/04 14:45:19 INFO CliDriver: Time taken: 0.612 seconds
spark-sql> 15/03/04 14:45:19 INFO StatsReportListener: task runtime:(count: 2, mean: 34.000000, stdev: 7.000000, max: 41.000000, min: 27.000000)
15/03/04 14:45:19 INFO StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
15/03/04 14:45:19 INFO StatsReportListener: 27.0 ms 27.0 ms 27.0 ms 27.0 ms 41.0 ms 41.0 ms 41.0 ms 41.0 ms 41.0 ms
15/03/04 14:45:19 INFO StatsReportListener: task result size:(count: 2, mean: 1907.000000, stdev: 23.000000, max: 1930.000000, min: 1884.000000)
15/03/04 14:45:19 INFO StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
15/03/04 14:45:19 INFO StatsReportListener: 1884.0 B 1884.0 B 1884.0 B 1884.0 B 1930.0 B 1930.0 B 1930.0 B 1930.0 B 1930.0 B
15/03/04 14:45:19 INFO StatsReportListener: executor (non-fetch) time pct: (count: 2, mean: 71.725384, stdev: 8.762421, max: 80.487805, min: 62.962963)
15/03/04 14:45:19 INFO TaskSetManager: Finished task 1.0 in stage 5.0 (TID 10) in 41 ms on localhost (2/2)
15/03/04 14:45:19 INFO TaskSchedulerImpl: Removed TaskSet 5.0, whose tasks have all completed, from pool
15/03/04 14:45:19 INFO StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
15/03/04 14:45:19 INFO StatsReportListener: 63 % 63 % 63 % 63 % 80 % 80 % 80 % 80 % 80 %
15/03/04 14:45:19 INFO StatsReportListener: other time pct: (count: 2, mean: 28.274616, stdev: 8.762421, max: 37.037037, min: 19.512195)
15/03/04 14:45:19 INFO StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
15/03/04 14:45:19 INFO StatsReportListener: 20 % 20 % 20 % 20 % 37 % 37 % 37 % 37 % 37 %
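5: Running on a cluster
The command-line options configure the Spark SQL application that the CLI launches. If --master is not set, spark-sql runs in local mode on the machine where it was started, and can only be monitored at http://<hostname>:4040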
./bin/spark-sql --master spark://feng02:7077 --driver-class-path /home/jifeng/hadoop/spark-1.2.0-bin-2.4.1/lib/mysql-connector-java-5.1.32-bin.jar
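The choice between local and cluster mode comes down to whether --master is passed. A minimal sketch of a helper that builds the command line either way is shown here; the function name is hypothetical, and it only prints the command so it can be checked before running.

```shell
# Build a spark-sql command line; add --master only when a master URL is given.
# Prints the command instead of executing it (a dry run for inspection).
spark_sql_cmd() {
  local jdbc_jar="$1" master="${2:-}"
  local cmd="./bin/spark-sql"
  if [ -n "$master" ]; then
    cmd="$cmd --master $master"
  fi
  echo "$cmd --driver-class-path $jdbc_jar"
}

# Example calls (the jar path is a placeholder):
# spark_sql_cmd /path/to/mysql-connector-java-5.1.32-bin.jar
# spark_sql_cmd /path/to/mysql-connector-java-5.1.32-bin.jar spark://feng02:7077
```

With no second argument the printed command matches the local run from step 3; with a master URL it matches the cluster form used in this step.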
The cluster monitoring page shows that the SparkSQL application has started:
With a small amount of data it is much slower. The five-row join below took 16.941 seconds:
> select a.*,b.* from student a join course b where a.id=b.id ;
15/03/04 15:44:53 WARN HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead
15/03/04 15:44:53 INFO ParseDriver: Parsing command: select a.*,b.* from student a join course b where a.id=b.id
15/03/04 15:44:53 INFO ParseDriver: Parse Completed
15/03/04 15:44:55 INFO HiveMetaStore: 0: get_table : db=default tbl=student
15/03/04 15:44:55 INFO audit: ugi=jifeng ip=unknown-ip-addr cmd=get_table : db=default tbl=student
15/03/04 15:44:55 INFO HiveMetaStore: 0: get_table : db=default tbl=course
15/03/04 15:44:55 INFO audit: ugi=jifeng ip=unknown-ip-addr cmd=get_table : db=default tbl=course
15/03/04 15:44:55 INFO deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
15/03/04 15:44:56 INFO MemoryStore: ensureFreeSpace(436469) called with curMem=0, maxMem=280248975
15/03/04 15:44:56 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 426.2 KB, free 266.9 MB)
15/03/04 15:44:56 INFO MemoryStore: ensureFreeSpace(46914) called with curMem=436469, maxMem=280248975
15/03/04 15:44:56 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 45.8 KB, free 266.8 MB)
15/03/04 15:44:56 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on feng02:44473 (size: 45.8 KB, free: 267.2 MB)
15/03/04 15:44:56 INFO BlockManagerMaster: Updated info of block broadcast_0_piece0
15/03/04 15:44:56 INFO SparkContext: Created broadcast 0 from broadcast at TableReader.scala:68
15/03/04 15:44:56 INFO MemoryStore: ensureFreeSpace(436525) called with curMem=483383, maxMem=280248975
15/03/04 15:44:56 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 426.3 KB, free 266.4 MB)
15/03/04 15:44:56 INFO MemoryStore: ensureFreeSpace(46946) called with curMem=919908, maxMem=280248975
15/03/04 15:44:56 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 45.8 KB, free 266.3 MB)
15/03/04 15:44:56 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on feng02:44473 (size: 45.8 KB, free: 267.2 MB)
15/03/04 15:44:56 INFO BlockManagerMaster: Updated info of block broadcast_1_piece0
15/03/04 15:44:56 INFO SparkContext: Created broadcast 1 from broadcast at TableReader.scala:68
15/03/04 15:44:57 WARN HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead
15/03/04 15:44:58 INFO FileInputFormat: Total input paths to process : 1
15/03/04 15:44:58 INFO SparkContext: Starting job: collect at BroadcastHashJoin.scala:53
15/03/04 15:44:58 INFO DAGScheduler: Got job 0 (collect at BroadcastHashJoin.scala:53) with 2 output partitions (allowLocal=false)
15/03/04 15:44:58 INFO DAGScheduler: Final stage: Stage 0(collect at BroadcastHashJoin.scala:53)
15/03/04 15:44:58 INFO DAGScheduler: Parents of final stage: List()
15/03/04 15:44:58 INFO DAGScheduler: Missing parents: List()
15/03/04 15:44:58 INFO DAGScheduler: Submitting Stage 0 (MappedRDD[4] at map at BroadcastHashJoin.scala:53), which has no missing parents
15/03/04 15:44:58 INFO MemoryStore: ensureFreeSpace(6544) called with curMem=966854, maxMem=280248975
15/03/04 15:44:58 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 6.4 KB, free 266.3 MB)
15/03/04 15:44:58 INFO MemoryStore: ensureFreeSpace(3848) called with curMem=973398, maxMem=280248975
15/03/04 15:44:58 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 3.8 KB, free 266.3 MB)
15/03/04 15:44:58 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on feng02:44473 (size: 3.8 KB, free: 267.2 MB)
15/03/04 15:44:58 INFO BlockManagerMaster: Updated info of block broadcast_2_piece0
15/03/04 15:44:58 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:838
15/03/04 15:44:58 INFO DAGScheduler: Submitting 2 missing tasks from Stage 0 (MappedRDD[4] at map at BroadcastHashJoin.scala:53)
15/03/04 15:44:58 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
15/03/04 15:44:58 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, feng02, NODE_LOCAL, 1320 bytes)
15/03/04 15:44:59 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on feng02:60870 (size: 3.8 KB, free: 267.3 MB)
15/03/04 15:45:01 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, feng01, ANY, 1320 bytes)
15/03/04 15:45:02 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on feng01:60774 (size: 3.8 KB, free: 265.4 MB)
15/03/04 15:45:02 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on feng02:60870 (size: 45.8 KB, free: 267.2 MB)
15/03/04 15:45:03 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on feng01:60774 (size: 45.8 KB, free: 265.4 MB)
15/03/04 15:45:06 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 4445 ms on feng01 (1/2)
15/03/04 15:45:07 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 9432 ms on feng02 (2/2)
15/03/04 15:45:07 INFO DAGScheduler: Stage 0 (collect at BroadcastHashJoin.scala:53) finished in 9.436 s
15/03/04 15:45:07 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
15/03/04 15:45:07 INFO StatsReportListener: Finished stage: org.apache.spark.scheduler.StageInfo@4ba4074f
15/03/04 15:45:07 INFO DAGScheduler: Job 0 finished: collect at BroadcastHashJoin.scala:53, took 9.687128 s
15/03/04 15:45:08 INFO MemoryStore: ensureFreeSpace(2264) called with curMem=977246, maxMem=280248975
15/03/04 15:45:08 INFO MemoryStore: Block broadcast_3 stored as values in memory (estimated size 2.2 KB, free 266.3 MB)
15/03/04 15:45:08 INFO MemoryStore: ensureFreeSpace(350) called with curMem=979510, maxMem=280248975
15/03/04 15:45:08 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 350.0 B, free 266.3 MB)
15/03/04 15:45:08 INFO BlockManager: Removing broadcast 2
15/03/04 15:45:08 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on feng02:44473 (size: 350.0 B, free: 267.2 MB)
15/03/04 15:45:08 INFO BlockManager: Removing block broadcast_2
15/03/04 15:45:08 INFO BlockManagerMaster: Updated info of block broadcast_3_piece0
15/03/04 15:45:08 INFO SparkContext: Created broadcast 3 from broadcast at BroadcastHashJoin.scala:55
15/03/04 15:45:08 INFO MemoryStore: Block broadcast_2 of size 6544 dropped from memory (free 279275659)
15/03/04 15:45:08 INFO BlockManager: Removing block broadcast_2_piece0
15/03/04 15:45:08 INFO MemoryStore: Block broadcast_2_piece0 of size 3848 dropped from memory (free 279279507)
15/03/04 15:45:08 INFO StatsReportListener: task runtime:(count: 2, mean: 6938.500000, stdev: 2493.500000, max: 9432.000000, min: 4445.000000)
15/03/04 15:45:08 INFO BlockManagerInfo: Removed broadcast_2_piece0 on feng02:44473 in memory (size: 3.8 KB, free: 267.2 MB)
15/03/04 15:45:08 INFO StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
15/03/04 15:45:08 INFO BlockManagerMaster: Updated info of block broadcast_2_piece0
15/03/04 15:45:08 INFO BlockManagerInfo: Removed broadcast_2_piece0 on feng01:60774 in memory (size: 3.8 KB, free: 265.4 MB)
15/03/04 15:45:08 INFO StatsReportListener: 4.4 s 4.4 s 4.4 s 4.4 s 9.4 s 9.4 s 9.4 s 9.4 s 9.4 s
15/03/04 15:45:08 INFO StatsReportListener: fetch wait time:(count: 1, mean: 0.000000, stdev: 0.000000, max: 0.000000, min: 0.000000)
15/03/04 15:45:08 INFO StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
15/03/04 15:45:08 INFO StatsReportListener: 0.0 ms 0.0 ms 0.0 ms 0.0 ms 0.0 ms 0.0 ms 0.0 ms 0.0 ms 0.0 ms
15/03/04 15:45:08 INFO StatsReportListener: remote bytes read:(count: 1, mean: 0.000000, stdev: 0.000000, max: 0.000000, min: 0.000000)
15/03/04 15:45:08 INFO StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
15/03/04 15:45:08 INFO StatsReportListener: 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B 0.0 B
15/03/04 15:45:08 INFO BlockManagerInfo: Removed broadcast_2_piece0 on feng02:60870 in memory (size: 3.8 KB, free: 267.2 MB)
15/03/04 15:45:08 INFO ContextCleaner: Cleaned broadcast 2
15/03/04 15:45:08 INFO StatsReportListener: task result size:(count: 2, mean: 2146.000000, stdev: 105.000000, max: 2251.000000, min: 2041.000000)
15/03/04 15:45:08 INFO StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
15/03/04 15:45:08 INFO StatsReportListener: 2041.0 B 2041.0 B 2041.0 B 2041.0 B 2.2 KB 2.2 KB 2.2 KB 2.2 KB 2.2 KB
15/03/04 15:45:08 INFO StatsReportListener: executor (non-fetch) time pct: (count: 2, mean: 96.539434, stdev: 1.668793, max: 98.208227, min: 94.870641)
15/03/04 15:45:08 INFO StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
15/03/04 15:45:08 INFO StatsReportListener: 95 % 95 % 95 % 95 % 98 % 98 % 98 % 98 % 98 %
15/03/04 15:45:08 INFO StatsReportListener: fetch wait time pct: (count: 1, mean: 0.000000, stdev: 0.000000, max: 0.000000, min: 0.000000)
15/03/04 15:45:08 INFO StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
15/03/04 15:45:08 INFO StatsReportListener: 0 % 0 % 0 % 0 % 0 % 0 % 0 % 0 % 0 %
15/03/04 15:45:08 INFO StatsReportListener: other time pct: (count: 2, mean: 3.460566, stdev: 1.668793, max: 5.129359, min: 1.791773)
15/03/04 15:45:08 INFO StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
15/03/04 15:45:08 INFO StatsReportListener: 2 % 2 % 2 % 2 % 5 % 5 % 5 % 5 % 5 %
15/03/04 15:45:09 INFO FileInputFormat: Total input paths to process : 1
15/03/04 15:45:09 INFO SparkContext: Starting job: collect at SparkPlan.scala:84
15/03/04 15:45:09 INFO DAGScheduler: Got job 1 (collect at SparkPlan.scala:84) with 2 output partitions (allowLocal=false)
15/03/04 15:45:09 INFO DAGScheduler: Final stage: Stage 1(collect at SparkPlan.scala:84)
15/03/04 15:45:09 INFO DAGScheduler: Parents of final stage: List()
15/03/04 15:45:09 INFO DAGScheduler: Missing parents: List()
15/03/04 15:45:09 INFO DAGScheduler: Submitting Stage 1 (MappedRDD[10] at map at SparkPlan.scala:84), which has no missing parents
15/03/04 15:45:09 INFO MemoryStore: ensureFreeSpace(11896) called with curMem=969468, maxMem=280248975
15/03/04 15:45:09 INFO MemoryStore: Block broadcast_4 stored as values in memory (estimated size 11.6 KB, free 266.3 MB)
15/03/04 15:45:09 INFO MemoryStore: ensureFreeSpace(6628) called with curMem=981364, maxMem=280248975
15/03/04 15:45:09 INFO MemoryStore: Block broadcast_4_piece0 stored as bytes in memory (estimated size 6.5 KB, free 266.3 MB)
15/03/04 15:45:09 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on feng02:44473 (size: 6.5 KB, free: 267.2 MB)
15/03/04 15:45:09 INFO BlockManagerMaster: Updated info of block broadcast_4_piece0
15/03/04 15:45:09 INFO SparkContext: Created broadcast 4 from broadcast at DAGScheduler.scala:838
15/03/04 15:45:09 INFO DAGScheduler: Submitting 2 missing tasks from Stage 1 (MappedRDD[10] at map at SparkPlan.scala:84)
15/03/04 15:45:09 INFO TaskSchedulerImpl: Adding task set 1.0 with 2 tasks
15/03/04 15:45:09 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 2, feng02, NODE_LOCAL, 1318 bytes)
15/03/04 15:45:09 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on feng02:60870 (size: 6.5 KB, free: 267.2 MB)
15/03/04 15:45:09 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on feng02:60870 (size: 45.8 KB, free: 267.2 MB)
15/03/04 15:45:09 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on feng02:60870 (size: 350.0 B, free: 267.2 MB)
15/03/04 15:45:09 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 3, feng02, NODE_LOCAL, 1318 bytes)
15/03/04 15:45:09 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 2) in 788 ms on feng02 (1/2)
15/03/04 15:45:09 INFO DAGScheduler: Stage 1 (collect at SparkPlan.scala:84) finished in 0.847 s
15/03/04 15:45:09 INFO StatsReportListener: Finished stage: org.apache.spark.scheduler.StageInfo@4379057
15/03/04 15:45:09 INFO StatsReportListener: task runtime:(count: 2, mean: 432.000000, stdev: 356.000000, max: 788.000000, min: 76.000000)
15/03/04 15:45:09 INFO StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
15/03/04 15:45:09 INFO StatsReportListener: 76.0 ms 76.0 ms 76.0 ms 76.0 ms 788.0 ms 788.0 ms 788.0 ms 788.0 ms 788.0 ms
15/03/04 15:45:09 INFO DAGScheduler: Job 1 finished: collect at SparkPlan.scala:84, took 0.898729 s
15/03/04 15:45:09 INFO StatsReportListener: task result size:(count: 2, mean: 1904.000000, stdev: 23.000000, max: 1927.000000, min: 1881.000000)
15/03/04 15:45:09 INFO StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
15/03/04 15:45:09 INFO StatsReportListener: 1881.0 B 1881.0 B 1881.0 B 1881.0 B 1927.0 B 1927.0 B 1927.0 B 1927.0 B 1927.0 B
15/03/04 15:45:09 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 3) in 76 ms on feng02 (2/2)
15/03/04 15:45:09 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
15/03/04 15:45:09 INFO StatsReportListener: executor (non-fetch) time pct: (count: 2, mean: 79.889126, stdev: 15.415442, max: 95.304569, min: 64.473684)
15/03/04 15:45:09 INFO StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
15/03/04 15:45:09 INFO StatsReportListener: 64 % 64 % 64 % 64 % 95 % 95 % 95 % 95 % 95 %
15/03/04 15:45:09 INFO StatsReportListener: other time pct: (count: 2, mean: 20.110874, stdev: 15.415442, max: 35.526316, min: 4.695431)
15/03/04 15:45:09 INFO StatsReportListener: 0% 5% 10% 25% 50% 75% 90% 95% 100%
15/03/04 15:45:09 INFO StatsReportListener: 5 % 5 % 5 % 5 % 36 % 36 % 36 % 36 % 36 %
1 nick 24 1 英語 中文 法文 日文
2 doping 25 2 中文 法文
3 caizhi 26 3 中文 法文 日文
4 liaozhi 27 4 中文 法文 拉丁
5 wind 30 5 中文 法文 德文
Time taken: 16.941 seconds
15/03/04 15:45:10 INFO CliDriver: Time taken: 16.941 seconds
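To see where the 16.941 seconds went, it can help to pull the per-job durations out of the log. The following is a minimal sketch, not part of the original post: the `summarize` helper and the two regular expressions are assumptions written against the `DAGScheduler: Job N finished ... took X s` and `Time taken: X seconds` lines shown in the log, and the sample text is copied from that output.

```python
import re

# Matches DAGScheduler lines such as:
#   "... INFO DAGScheduler: Job 0 finished: collect at BroadcastHashJoin.scala:53, took 9.687128 s"
JOB_RE = re.compile(r"Job (\d+) finished: (.+?), took ([\d.]+) s")
# Matches the CLI summary line: "Time taken: 16.941 seconds"
TOTAL_RE = re.compile(r"Time taken: ([\d.]+) seconds")


def summarize(log_text):
    """Return (jobs, total): jobs is a list of (job_id, call_site, seconds)."""
    jobs = [(int(j), site, float(sec)) for j, site, sec in JOB_RE.findall(log_text)]
    match = TOTAL_RE.search(log_text)
    total = float(match.group(1)) if match else None
    return jobs, total


# Three lines copied from the log above (hypothetical input; a real run
# would read the captured console output from a file instead).
sample = """\
15/03/04 15:45:07 INFO DAGScheduler: Job 0 finished: collect at BroadcastHashJoin.scala:53, took 9.687128 s
15/03/04 15:45:09 INFO DAGScheduler: Job 1 finished: collect at SparkPlan.scala:84, took 0.898729 s
Time taken: 16.941 seconds
"""

jobs, total = summarize(sample)
for job_id, site, seconds in jobs:
    print(f"job {job_id}: {seconds:.3f} s ({site})")
if total is not None:
    in_jobs = sum(seconds for _, _, seconds in jobs)
    print(f"total {total:.3f} s, of which {total - in_jobs:.3f} s was outside Spark jobs")
```

On this log, the two jobs account for roughly 10.6 of the 16.9 seconds, most of it in job 0 (the collect in `BroadcastHashJoin.scala`). The remaining time falls outside the jobs; judging by the timestamps it is presumably spent on parsing, metastore lookups and planning, although the log does not say so directly.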