BigDataDevKit

Big Data Development Kit (Hadoop / Spark / Zeppelin / IntelliJ) in Amazon AWS

Usage:

Launch an instance on Amazon WebServices

Specify in the initialization of the vm

#!/bin/bash
wget -O- http://bit.ly/1PjnNB3 |\
 PASSWORD="changeme"\
 TOKEN="duck-dns-token"\
 HOST="duckdnshost"\
 EMAIL="your@email"\
 bash

NOTE!

You need to specify Amazon Linux 64 bit
You need at least an image small (2g) better 4g for running all the services
You need at least 20 GB space
Create a secondary block on sdb and the /app folder with all your data will be mounted there, thus preserved from termination
change the password (change the string within the quotes with your password) to the one you want
register an hostname in www.duckdns.org and get the token, and replace them in the TOKEN and HOST variables
specify your email (user for let's encrypt service)
if you have a backup of let's encrypt (for example in Dropbox) specify LETGZ="...." to the url of your backup
before you launch the instance add a rule to open the HTTPS ports to the world

Spark / Hadoop / Zeppelin devkit

Docker kit for Hadoop, Spark and Zeppelin

Devenv with IntelliJ, SBT and Ammonite accessible via web

Usage:

First, get a docker machine and configure your docker to access it. Refer to docker documentation to learn how to do it.

The script sh build.sh <password> builds the enviroment.

Start it with docker-compose up -d.

That is all.

What is in the kit

Access the shell with http://youserver:3000 and the desktop with http://yourserver:6080

In the kit there is Intellij free edition, a terminal with sbt and ammonite

Inside the kit you have also Zeppelin, internally accessible as

http://zeppelin.loc:8000, Hadoop accessible as hdfs://hadoop.loc:8020 and Spark on http://spark.loc

(to fix) You can also ssh (without password) on hadoop.loc, spark.loc and zeppelin.loc

Name		Name	Last commit message	Last commit date
Latest commit History 41 Commits
base-hadoop		base-hadoop
base-java		base-java
base-novnc		base-novnc
base-scala		base-scala
base-ssh		base-ssh
devenv		devenv
hadoop		hadoop
sparkjobserver		sparkjobserver
zeppelin		zeppelin
.gitignore		.gitignore
README.md		README.md
TODO.txt		TODO.txt
ami-init.sh		ami-init.sh
build.sh		build.sh
centos6-init.sh		centos6-init.sh
cleanup.sh		cleanup.sh
docker-compose.yml		docker-compose.yml
dropbox.sh		dropbox.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

BigDataDevKit

Spark / Hadoop / Zeppelin devkit

Usage:

What is in the kit

About

Releases

Packages

Languages

sciabarra/BigDataDevKit

Folders and files

Latest commit

History

Repository files navigation

BigDataDevKit

Spark / Hadoop / Zeppelin devkit

Usage:

What is in the kit

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages