Writing Scala Code in a Java Project with IDEA and Debugging Spark Programs Locally

    Tech  2022-07-10

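    Install the IDEA plugin

    Install the Scala plugin, then restart IDEA once the download has finished.

    Set up the Scala SDK

    Add the Maven plugin that compiles Scala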

        <build>
          <pluginManagement>
            <plugins>
              <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>3.2.2</version>
              </plugin>
              <plugin>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.5.1</version>
              </plugin>
            </plugins>
          </pluginManagement>
          <plugins>
            <plugin>
              <groupId>net.alchim31.maven</groupId>
              <artifactId>scala-maven-plugin</artifactId>
              <executions>
                <execution>
                  <goals>
                    <goal>compile</goal>
                    <goal>testCompile</goal>
                  </goals>
                </execution>
              </executions>
            </plugin>
            <plugin>
              <artifactId>maven-compiler-plugin</artifactId>
              <executions>
                <execution>
                  <phase>compile</phase>
                  <goals>
                    <goal>compile</goal>
                  </goals>
                </execution>
              </executions>
            </plugin>
          </plugins>
        </build>

    Write the Scala code

    Pay attention to the project directory structure.

    Add the Spark dependency

        <dependency>
          <groupId>org.apache.spark</groupId>
          <artifactId>spark-core_2.12</artifactId>
          <version>3.0.0</version>
        </dependency>

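    Note that the _2.12 suffix in the artifactId is the Scala version the Spark artifact was built for. It must match the Scala SDK version configured for the project; otherwise type errors are reported.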
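
    One way to check the alignment is to print the Scala version that is actually on the classpath and compare it with the artifact suffix. The following is a minimal sketch, not part of the original post: the object name ScalaVersionCheck and the helper binaryVersion are hypothetical, and the expected value "2.12" is taken from the spark-core_2.12 artifactId above.

```scala
object ScalaVersionCheck {
  // Assumption: matches the suffix of spark-core_2.12; change it if the POM uses another suffix.
  val expectedBinaryVersion = "2.12"

  // Reduce a full version string such as "2.12.10" to its major.minor part.
  def binaryVersion(full: String): String =
    full.split('.').take(2).mkString(".")

  def main(args: Array[String]): Unit = {
    // scala.util.Properties reports the version of the Scala library on the classpath.
    val actual = binaryVersion(scala.util.Properties.versionNumberString)
    println(s"Scala binary version on classpath: $actual")
    if (actual != expectedBinaryVersion)
      println(s"Mismatch: the Spark artifact expects Scala $expectedBinaryVersion")
  }
}
```

    Running this main method from IDEA uses the same module classpath as the Spark job, so a mismatch shows up here before it surfaces as a compile or runtime error.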

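        import org.apache.spark.{SparkConf, SparkContext}
        import org.apache.spark.rdd.RDD

        object WordCount {
          def main(args: Array[String]): Unit = {
            // Run locally with 4 worker threads; no cluster is needed.
            val sparkConf = new SparkConf().setAppName("wordCountTest").setMaster("local[4]")
            val sc = new SparkContext(sparkConf)

            // Build an RDD from an in-memory list.
            val lineRdd: RDD[String] = sc.parallelize(List("Wap", "Xu", "Wap"))
            // Pair every element with an initial count of 1.
            val teacherAndOne: RDD[(String, Int)] = lineRdd.map(line => (line, 1))
            // Sum the counts per key.
            val reduced: RDD[(String, Int)] = teacherAndOne.reduceByKey(_ + _)
            // Bring the result back to the driver and print it.
            val resultArray: Array[(String, Int)] = reduced.collect()
            println(resultArray.toList)

            sc.stop()
          }
        }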

    Finally, use Run or Debug in IDEA to execute the Spark program locally.
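
    To know what the job should print, the same counting can be reproduced with plain Scala collections, without Spark. This is only an illustrative sketch: the object LocalWordCount and its count method are hypothetical names, and groupBy stands in for what reduceByKey does on the RDD.

```scala
object LocalWordCount {
  // Count occurrences of each word using only the Scala standard library.
  def count(words: List[String]): Map[String, Int] =
    words.groupBy(identity).map { case (word, hits) => (word, hits.size) }

  def main(args: Array[String]): Unit = {
    // Same input as the Spark example.
    val counts = count(List("Wap", "Xu", "Wap"))
    println(counts)
  }
}
```

    The Spark job should report the same pairs, a count of 2 for "Wap" and 1 for "Xu", although the order of the collected elements is not guaranteed.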
