# bigdata-module-7-spark-sourcecode

**Repository Path**: penglanglang/bigdata-module-7-spark-sourcecode

## Basic Information

- **Project Name**: bigdata-module-7-spark-sourcecode
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-10-16
- **Last Updated**: 2021-11-02

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# bigdata-module-7-spark-sourcecode

#### Question

Short-answer question. Given the following code:

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}

object JoinDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName(this.getClass.getCanonicalName.init).setMaster("local[*]")
    val sc = new SparkContext(conf)
    sc.setLogLevel("WARN")

    val random = scala.util.Random
    val col1 = Range(1, 50).map(idx => (random.nextInt(10), s"user$idx"))
    val col2 = Array((0, "BJ"), (1, "SH"), (2, "GZ"), (3, "SZ"), (4, "TJ"), (5, "CQ"),
      (6, "HZ"), (7, "NJ"), (8, "WH"), (0, "CD"))
    val rdd1: RDD[(Int, String)] = sc.makeRDD(col1)
    val rdd2: RDD[(Int, String)] = sc.makeRDD(col2)

    // join with the default partitioner
    val rdd3: RDD[(Int, (String, String))] = rdd1.join(rdd2)
    println(rdd3.dependencies)

    // join after both inputs are pre-partitioned with the same HashPartitioner
    val rdd4: RDD[(Int, (String, String))] =
      rdd1.partitionBy(new HashPartitioner(3)).join(rdd2.partitionBy(new HashPartitioner(3)))
    println(rdd4.dependencies)

    sc.stop()
  }
}
```

1. What do the two print statements output? Is the corresponding dependency a wide dependency or a narrow dependency, and why?
2. When is a join operation a wide dependency, and when is it a narrow dependency?
3. Answer the questions above with reference to the join-related source code.

#### Answer

The printed results are a little puzzling at first glance: both statements show `OneToOneDependency`, even though `rdd3` is joined using the default partitioning while `rdd4` is joined using an explicitly specified partitioner (`HashPartitioner(3)`).

The reason is that `join` is implemented on top of `cogroup`: it builds a `CoGroupedRDD` and then applies `flatMapValues`, so the RDD returned by `join` is a `MapPartitionsRDD` whose only direct dependency is a `OneToOneDependency` on that `CoGroupedRDD`. The wide/narrow distinction lives one level lower, in `CoGroupedRDD.getDependencies`: for each parent RDD, if the parent's partitioner equals the partitioner used by the join (`rdd.partitioner == Some(part)`), Spark adds a `OneToOneDependency` (narrow); otherwise it adds a `ShuffleDependency` (wide). For `rdd3`, `rdd1` and `rdd2` come straight from `makeRDD` and have no partitioner, so the join is a wide dependency. For `rdd4`, both inputs are already partitioned by the same `HashPartitioner(3)` that the join uses, so the join is a narrow dependency.
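
To check where the wide/narrow distinction actually appears, one can inspect the dependency graph one level below the RDD returned by `join`. The following is a small diagnostic sketch, not part of the original assignment code; it assumes it is pasted into the same `main` method right after `rdd3` and `rdd4` are defined, and it follows the printed `OneToOneDependency` down to the underlying `CoGroupedRDD`.

```scala
// Diagnostic sketch (assumed to run inside the same main method, after rdd3/rdd4).
// The dependency printed by rdd3.dependencies points at the CoGroupedRDD that
// join builds internally; that RDD's own dependencies show whether a shuffle happens.
val cogrouped3 = rdd3.dependencies.head.rdd   // CoGroupedRDD behind the default-partitioner join
println(cogrouped3.dependencies)              // expected: two ShuffleDependency entries, i.e. wide

val cogrouped4 = rdd4.dependencies.head.rdd   // CoGroupedRDD behind the pre-partitioned join
println(cogrouped4.dependencies)              // expected: two OneToOneDependency entries, i.e. narrow
```

Comparing these two outputs with the rule in `CoGroupedRDD.getDependencies` (one-to-one when the parent's partitioner equals the join's partitioner, shuffle otherwise) answers both questions: a join is a narrow dependency only when its inputs are already partitioned by the partitioner the join uses, and a wide dependency otherwise.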