>>> df = sqlContext.createDataFrame([(1,),(2,),(3,)], ["тест пассед"])
>>> df.write.parquet('df.parquet')
21/12/03 13:49:55 WARN DFSClient: Slow ReadProcessor read fields took 59255ms (threshold=30000ms); ack: seqno: 38 reply: SUCCESS reply: SUCCESS reply: SUCCESS downstreamAckTimeNanos: 3476206 flag: 0 flag: 0 flag: 0, targets: [DatanodeInfoWithStorage[10.144.43.16:50010,DS-0a231050-09a2-416a-bb03-b97fe78a367a,DISK], DatanodeInfoWithStorage[10.144.10.5:50010,DS-82ac218a-32c0-4eaf-b7bc-5e98276d6f24,DISK], DatanodeInfoWithStorage[10.144.10.16:50010,DS-83425a8c-2b84-46c9-a334-29031ffe4978,DISK]]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/site-packages/pyspark/sql/readwriter.py", line 1249, in parquet
    self._jwrite.parquet(path)
  File "/usr/local/share/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in __call__
  File "/usr/local/lib/python3.6/site-packages/pyspark/sql/utils.py", line 117, in deco
    raise converted from None
pyspark.sql.utils.AnalysisException: Attribute name "тест пассед" contains invalid character(s) among " ,;{}()\n\t=". Please use alias to rename it.
>>> df1 = df.withColumnRenamed("тест пассед", "тест")
>>> df
DataFrame[тест пассед: bigint]
>>> df1
DataFrame[тест: bigint]
>>> df1.write.parquet('df.parquet')
>>>
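The same AnalysisException is raised for any column name containing one of the characters listed in the error message (" ,;{}()\n\t="). If several columns are affected, renaming them one by one with withColumnRenamed gets tedious; the sketch below loops over df.columns and replaces each offending character with an underscore before writing. sanitize_columns is a hypothetical helper, not part of the PySpark API.

import re

# Characters Parquet rejects in attribute names, per the AnalysisException above.
INVALID_CHARS = re.compile(r'[ ,;{}()\n\t=]')

def sanitize_columns(df):
    """Return df with Parquet-incompatible characters in column names replaced by underscores."""
    for name in df.columns:
        clean = INVALID_CHARS.sub('_', name)
        if clean != name:
            df = df.withColumnRenamed(name, clean)
    return df

# Usage with the DataFrame from the session above:
# sanitize_columns(df).write.parquet('df.parquet')

Note that this can silently collide two columns (e.g. "a b" and "a=b" both become "a_b"), so a stricter version might check for duplicates after renaming.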