GitHub - phoeph/AuraSparkTraining: Spark Training Example For Aura.cn

sparktraining

Examples for Spark Training in aura.cn

Running Spark locally

Download the Spark package
Extract the Spark package
Change into the extracted Spark directory and run:

$ bin/spark-shell
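The three steps above can be sketched as a small shell script. The Spark version, the Hadoop build profile, and the Apache archive URL below are assumptions chosen for illustration and are not taken from this repository; substitute the release used in the course.

```shell
#!/usr/bin/env bash
# Sketch only: SPARK_VERSION, HADOOP_PROFILE and the archive URL are assumptions;
# substitute the release you actually want to use.
SPARK_VERSION="${SPARK_VERSION:-3.5.1}"
HADOOP_PROFILE="${HADOOP_PROFILE:-hadoop3}"
SPARK_DIR="spark-${SPARK_VERSION}-bin-${HADOOP_PROFILE}"
SPARK_URL="https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/${SPARK_DIR}.tgz"

install_and_launch() {
  curl -fLO "$SPARK_URL"        # 1. download the Spark package
  tar -xzf "${SPARK_DIR}.tgz"   # 2. extract it
  cd "$SPARK_DIR" || return 1   # 3. enter the extracted directory
  bin/spark-shell               #    and start the interactive shell
}

# Only run the steps when explicitly requested, e.g. RUN_INSTALL=1 bash install.sh
if [ "${RUN_INSTALL:-0}" = "1" ]; then
  install_and_launch
fi
```

Without `RUN_INSTALL=1` the script only defines the variables and the function, which makes it safe to source and inspect before downloading anything.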

At the spark-shell prompt, paste the following code and check the result

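import scala.math.random

val tasks = 10            // number of partitions (parallel tasks)
val n = tasks * 100000    // upper bound of the sample range

// Draw a random point in the square [-1, 1] x [-1, 1] for each element
// and count how many fall inside the unit circle.
val count = sc.parallelize(1 until n, tasks).map { _ =>
  val x = random * 2 - 1
  val y = random * 2 - 1
  if (x*x + y*y <= 1) 1 else 0
}.reduce(_ + _)
// `1 until n` yields n - 1 samples, so divide by n - 1 rather than n.
println("Pi is roughly " + 4.0 * count / (n - 1))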
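
The snippet is a Monte Carlo estimate. As a short derivation (with N denoting the number of sampled points, a symbol introduced here for illustration): each point (x, y) is uniform on the square [-1, 1] x [-1, 1], so the probability of landing inside the unit circle is the ratio of the two areas, and the observed fraction of hits approximates that ratio.

```latex
P\left(x^2 + y^2 \le 1\right)
  = \frac{\text{area of unit disk}}{\text{area of square}}
  = \frac{\pi \cdot 1^2}{2 \cdot 2}
  = \frac{\pi}{4}
\quad\Longrightarrow\quad
\pi \approx 4 \cdot \frac{\text{count}}{N}
```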

Running Spark in distributed mode

Set up a Hadoop cluster

For reference Hadoop YARN/HDFS configuration files, see the conf/hadoop directory

Configure the Spark client and start the Spark history server

For reference Spark client configuration files, see the conf/spark directory
Start the Spark history server: sbin/start-history-server.sh
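
The history server only shows applications whose event logs it can read, so the client configuration typically enables event logging and points both sides at the same directory. A minimal sketch follows; the HDFS path and the output file name are assumptions for illustration, not values taken from conf/spark in this repository.

```shell
#!/usr/bin/env bash
# Sketch only: the log directory and the output location are assumptions,
# not values taken from conf/spark in this repository.
EVENT_LOG_DIR="${EVENT_LOG_DIR:-hdfs:///spark-logs}"
CONF_FILE="${CONF_FILE:-./spark-defaults.conf.example}"

# Applications write event logs to spark.eventLog.dir;
# the history server reads them from spark.history.fs.logDirectory.
cat > "$CONF_FILE" <<EOF
spark.eventLog.enabled           true
spark.eventLog.dir               ${EVENT_LOG_DIR}
spark.history.fs.logDirectory    ${EVENT_LOG_DIR}
EOF

echo "Wrote $CONF_FILE; copy its entries into \$SPARK_HOME/conf/spark-defaults.conf,"
echo "create ${EVENT_LOG_DIR} on HDFS, then run sbin/start-history-server.sh"
```

Once the server is running, its web UI listens on port 18080 by default.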

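Run spark-shell on YARN in client or cluster mode

yarn client mode: bin/spark-shell --master yarn --deploy-mode client
yarn cluster mode: bin/spark-shell --master yarn --deploy-mode cluster (note: Spark rejects cluster deploy mode for interactive shells, so this command fails; cluster mode is available for packaged applications submitted with bin/spark-submit)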
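
Because an interactive shell needs its driver on the local machine, spark-shell works only in client mode; cluster mode applies to packaged applications launched through spark-submit. The sketch below builds such a command for the SparkPi example that ships with Spark binary distributions. `SPARK_HOME` and the jar location are assumptions based on the standard distribution layout, and the helper function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: SPARK_HOME and the examples jar location are assumptions based on
# the layout of a standard Spark binary distribution.
SPARK_HOME="${SPARK_HOME:-.}"

# Build the spark-submit command line for a given deploy mode (client or cluster).
build_submit_cmd() {
  local mode="$1"
  echo "$SPARK_HOME/bin/spark-submit" \
    --class org.apache.spark.examples.SparkPi \
    --master yarn \
    --deploy-mode "$mode" \
    "$SPARK_HOME/examples/jars/spark-examples_*.jar" 10
}

# Print the command instead of running it.
# On a configured cluster: eval "$(build_submit_cmd cluster)"
build_submit_cmd cluster
```

In cluster mode the driver runs inside a YARN container, so the "Pi is roughly" line appears in the application's container logs rather than on the submitting terminal.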

