Data Analysis using Spark

In this tutorial, you'll learn the basic steps to load and analyze data with Apache Spark in the DatabeanStalk PySpark environment.

Apache Spark Runtime Environment

Click JupyterHub in the left menu, or click QuickStart Spark or PySpark. This opens a new secure window using your existing user credentials.

Server Options

Select the "Run Managed Apache Spark environment" server option and click Start.

The Jupyter notebook starts with multiple Spark runtime kernels, including native Python, PySpark, R, and core Spark.

Initialize the Spark session with "sc" in the notebook; DatabeanStalk then creates a Spark driver and two executors on its Kubernetes platform.
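
For example, once the kernel starts you can verify the pre-created context and run a first analysis. The sketch below assumes a PySpark kernel where sc is already bound; the session variable spark, the file path /data/sales.csv, and the region column are hypothetical placeholders, not platform defaults.

# The PySpark kernel pre-creates the SparkContext and binds it to `sc`.
# Inspect it to confirm the driver and executors are running.
print(sc.version)   # Spark version of the managed runtime
print(sc.master)    # cluster master URL (Kubernetes on the platform)

# Build a SparkSession on top of the existing context for DataFrame work.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical example: load a CSV file and run a simple aggregation.
df = spark.read.csv("/data/sales.csv", header=True, inferSchema=True)
df.printSchema()

# Count rows per region, largest groups first.
df.groupBy("region").count().orderBy("count", ascending=False).show()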