1. Extract Spark
tar -zxvf spark-3.3.0-bin-hadoop3.tgz
2. Modify the configuration files
2.1 Rename the template configuration files
mv spark-env.sh.template spark-env.sh
mv workers.template workers    # in Spark 3.x the file is named workers, not slaves
2.2 Modify spark-env.sh
2.2.1 Edit command
vi spark-env.sh
2.2.2 Content to add
export JAVA_HOME=/home/server/jdk1.8.0_333
export HADOOP_CONF_DIR=/home/server/hadoop/hadoop-3.2.3/etc/hadoop
export SCALA_HOME=/home/server/scala
export SPARK_LOCAL_IP=192.168.0.103   # use this node's reachable (public) IP so other machines can access it
export HADOOP_HOME=/home/server/hadoop/hadoop-3.2.3
export SPARK_MASTER_HOST=192.168.0.103
export SPARK_MASTER_PORT=10001
Note: if SPARK_LOCAL_IP is set to localhost, the node cannot be reached from other machines; if startup reports that the port is already in use, change SPARK_MASTER_PORT.
2.3 Rename any other configuration files as needed
2.4 Configure environment variables
vim /etc/profile
source /etc/profile
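The entries to append to /etc/profile are not shown above; a minimal sketch, assuming Spark was extracted to /home/server/spark (adjust the path to your layout):
export SPARK_HOME=/home/server/spark
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
After sourcing the file, the bin and sbin scripts can be run without typing the full path.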
3. Verify Spark
3.1 Start Spark
cd /home/server/spark/sbin
./start-all.sh
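To double-check that the standalone daemons came up, the JDK's jps tool can be used (assuming it is on the PATH); a Master process, and a Worker if one was started on this node, should appear in its output:
jps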
Note: the Spark master web UI listens on port 8080 by default.
To change it, search for 8080 in sbin/start-master.sh under the Spark installation directory and replace it with your own port.
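For reference, the fragment of sbin/start-master.sh that sets the default looks roughly like this (exact wording may vary between versions); editing the 8080 here, or exporting SPARK_MASTER_WEBUI_PORT before starting the master, moves the UI to another port:
if [ "$SPARK_MASTER_WEBUI_PORT" = "" ]; then
  SPARK_MASTER_WEBUI_PORT=8080
fi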
Note: copy the MySQL JDBC driver jar into $SPARK_HOME/jars (required when the Hive metastore is backed by MySQL, as in the setup verified below).
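That copy step might look like the following, where the jar file name is hypothetical; substitute whichever MySQL connector version you actually downloaded:
cp mysql-connector-java-8.0.28.jar $SPARK_HOME/jars/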
Launch spark-shell
spark-shell --master local[2]
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
22/08/10 22:43:44 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22/08/10 22:43:46 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
Spark context Web UI available at http://192.168.0.103:4041
Spark context available as 'sc' (master = local[2], app id = local-1660196626279).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.3.0
      /_/
Using Scala version 2.12.15 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_333)
Type in expressions to have them evaluated.
Type :help for more information.
scala> spark.sql("show databases").show()
22/08/10 22:32:40 INFO HiveMetaStore: 0: get_databases: *
22/08/10 22:32:40 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_databases: *
+---------+
|namespace|
+---------+
|db_hive |
|default |
+---------+