Java dynamic JSON ingestion: dynamically generating a schema from a JSON file
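Some points of interest:
1) You do not need a DataFrame to load your JSON schema. The schema is loaded and processed on the driver, because distributing it would only add unnecessary overhead.
2) I extract the JSON into JColumn objects, turn each one into a StructField, and pass the resulting list to StructType to build the schema dynamically.
3) inferSchema should be false, because we define the schema explicitly.
4) I assume your database table uses "null" to represent empty values.
5) To adjust the type mapping, modify typeMapping.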
import scala.collection.mutable.ListBuffer
import org.apache.spark.sql.types._
import org.json4s._
import org.json4s.native.JsonMethods

// One entry of the JSON schema file
case class JColumn(trim: Boolean, name: String, nullable: Boolean, id: Option[String], position: BigInt, table: String, _type: String, primaryKey: Boolean)

// Read and parse the schema file on the driver, then close the file
val path = """your_path\schema.json"""
val input = scala.io.Source.fromFile(path)
val json = try JsonMethods.parse(input.reader()) finally input.close()

// Maps the _type names used in the schema file to Spark SQL types
val typeMapping = Map(
  "double" -> DoubleType,
  "integer" -> IntegerType,
  "string" -> StringType,
  "date" -> DateType,
  "bool" -> BooleanType)

val rddSchema = ListBuffer[StructField]()
implicit val formats: Formats = DefaultFormats
val schema = json.extract[Array[JColumn]]
//schema.foreach(c => println(s"name:${c.name} type:${c._type} isnullable:${c.nullable}"))

// Build one StructField per column described in the schema file
schema.foreach { c =>
  rddSchema += StructField(c.name, typeMapping(c._type), c.nullable, Metadata.empty)
}

// Read the CSV with the explicit schema (spark is the SparkSession, e.g. in spark-shell)
val in_emp = spark.read
  .format("com.databricks.spark.csv")
  .schema(StructType(rddSchema.toList))
  .option("inferSchema", "false")
  .option("dateFormat", "yyyy.MM.dd")
  .option("header", "false")
  .option("delimiter", ",")
  .option("nullValue", "null")
  .option("treatEmptyValuesAsNulls", "true")
  .csv("""your_path\employee.csv""")

in_emp.printSchema()
in_emp.collect()
in_emp.show()
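The schema-building loop above could also be written without a mutable buffer. The following is a minimal sketch, assuming Spark SQL is on the classpath. The object name SchemaSketch and the helper toStructType are hypothetical and not part of the original answer. Sorting by position and failing with an explicit message for an unmapped _type are also assumptions about the desired behavior, not something the original code does.

```scala
import org.apache.spark.sql.types._

object SchemaSketch {
  // Same shape as the JColumn case class used above.
  case class JColumn(trim: Boolean, name: String, nullable: Boolean, id: Option[String],
                     position: BigInt, table: String, _type: String, primaryKey: Boolean)

  // Same entries as typeMapping above, with an explicit value type.
  val typeMapping: Map[String, DataType] = Map(
    "double" -> DoubleType,
    "integer" -> IntegerType,
    "string" -> StringType,
    "date" -> DateType,
    "bool" -> BooleanType)

  // Hypothetical helper: orders the columns by position (an assumption)
  // and maps each one to a StructField.
  def toStructType(columns: Seq[JColumn]): StructType =
    StructType(columns.sortBy(_.position).map { c =>
      // Fail with a readable message instead of a bare NoSuchElementException.
      val dataType = typeMapping.getOrElse(c._type,
        throw new IllegalArgumentException(
          s"No Spark type mapped for '${c._type}' (column ${c.name})"))
      StructField(c.name, dataType, c.nullable)
    })
}
```

With the code above, SchemaSketch.toStructType(schema.toSeq) could replace StructType(rddSchema.toList), provided schema was extracted into SchemaSketch.JColumn values.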
I tested with the following schema:
[
  {
    "trim": true,
    "name": "id",
    "nullable": true,
    "id": null,
    "position": 0,
    "table": "employee",
    "_type": "integer",
    "primaryKey": true
  },
  {
    "trim": true,
    "name": "salary",
    "nullable": true,
    "id": null,
    "position": 1,
    "table": "employee",
    "_type": "double",
    "primaryKey": false
  },
  {
    "trim": true,
    "name": "dob",
    "nullable": true,
    "id": null,
    "position": 2,
    "table": "employee",
    "_type": "date",
    "primaryKey": false
  },
  {
    "trim": true,
    "name": "department",
    "nullable": true,
    "id": null,
    "position": 3,
    "table": "employee",
    "_type": "string",
    "primaryKey": false
  }
]
And the following data (employee.csv):
1211,3500.0,null,marketing
1212,3000.0,2016.12.08,IT
1213,4000.0,2017.10.20,HR
1214,3000.0,2017.10.20,finance
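
To make the intent of the nullValue and dateFormat settings concrete, here is a small plain-Scala sketch that interprets one line of this sample file the same way: the token "null" becomes a missing value and dates follow the yyyy.MM.dd pattern. It uses java.time rather than Spark, so it only illustrates the intended reading of the data. Spark's own CSV parser may behave differently depending on the version. The names SampleRowsSketch, Employee and parseLine are hypothetical.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

object SampleRowsSketch {
  // Same pattern as the dateFormat option used for the CSV reader.
  private val dateFormat = DateTimeFormatter.ofPattern("yyyy.MM.dd")

  // Hypothetical row type mirroring the four columns of the test schema.
  case class Employee(id: Option[Int], salary: Option[Double],
                      dob: Option[LocalDate], department: Option[String])

  // Treat the literal token "null" as a missing value.
  private def field(raw: String): Option[String] =
    if (raw == "null") None else Some(raw)

  // Split on commas (keeping trailing empty fields) and convert each column.
  def parseLine(line: String): Employee = {
    val cols = line.split(",", -1)
    Employee(
      field(cols(0)).map(_.toInt),
      field(cols(1)).map(_.toDouble),
      field(cols(2)).map(LocalDate.parse(_, dateFormat)),
      field(cols(3)))
  }
}
```

For the first sample row, SampleRowsSketch.parseLine("1211,3500.0,null,marketing").dob is None, which corresponds to the null date of birth the explicit schema allows through nullable: true.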