PySpark (on Jupyter notebook)
2022. 8. 24.

In [1]: from IPython.display import display, HTML
        display(HTML(""))

In [1]: !pip list | grep pyspark
pyspark 3.3.0

In [2]: from pyspark.sql import SparkSession

1. Uploading a file to HDFS

$ hadoop fs -ls
$ hadoop fs -mkdir /user/philhoonoh
$ hadoop fs -put /Users/philhoonoh/Desktop/Hadoop/data.csv /user/philhoonoh/

2. Loading data.csv into a DataFrame with a Spark session

In [3]: spark = ..
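The excerpt cuts off at the cell that creates the Spark session, so what follows is only a minimal sketch of how step 2 might look, not the original notebook code. The helper names `hdfs_csv_path` and `load_csv`, the application name, and the `header`/`inferSchema` options are all assumptions made for illustration. It also assumes Spark picks up the Hadoop configuration, so that a bare `/user/...` path resolves against HDFS rather than the local filesystem.

```python
from posixpath import join


def hdfs_csv_path(user: str, filename: str) -> str:
    """Build the path used in step 1: /user/<user>/<filename> (hypothetical helper)."""
    return join("/user", user, filename)


def load_csv(path: str, app_name: str = "pyspark-on-jupyter"):
    """Start (or reuse) a SparkSession and read a CSV file into a DataFrame.

    Hypothetical helper; the original cell is truncated after `spark = ..`.
    """
    # Imported lazily so the path helper above works without Spark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName(app_name)  # assumed application name, not from the post
        .getOrCreate()
    )
    # header/inferSchema are assumptions about the layout of data.csv.
    df = spark.read.csv(path, header=True, inferSchema=True)
    return spark, df
```

In a notebook cell this could be called as `spark, df = load_csv(hdfs_csv_path("philhoonoh", "data.csv"))`, followed by `df.printSchema()` and `df.show(5)` to inspect the result. If the path resolves to the local filesystem instead, the full `hdfs://<namenode>/user/...` URI for your cluster would be needed.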