當前位置：首頁 > 编程语言 > python >内容正文

python

python按索引删除字符串,基于特定的索引字符串值选择记录，然后通过python删除后续字段...

發布時間：2024/7/5 python 29 豆豆

生活随笔收集整理的這篇文章主要介紹了 python按索引删除字符串,基于特定的索引字符串值选择记录，然后通过python删除后续字段... 小編覺得挺不錯的,現在分享給大家,幫大家做個參考.

I have a .csv file named file01.csv that contains many records. Some records are required and some are not. I find that the required records has a string variable “Mi”, but it is not exist into the unnecessary records. So, I want to select the required records based on string value “Mi” in the field for every records.

Finally I want to delete the subsequent fields of each record from the field that contains value “Mi”. Any suggestion and advice is appreciated.

Optional:

In addition, I want to delete the first column.

Split column BB into two column named as a_id, and c_id. Separate the value by _ (underscore) and left side will go to a_id, and right side will go to c_id.

My fileO.csv is as follows:

AA BB CC DD EE FF GG

1 1_1.csv (=0 =10" 27" =57 "Mi"

0.97 0.9 0.8 NaN 0.9 od 0.2

2 1_3.csv (=0 =10" 27" "Mi" 0.5

0.97 0.5 0.8 NaN 0.9 od 0.4

3 1_6.csv (=0 =10" "Mi" =53 cnt

0.97 0.9 0.8 NaN 0.9 od 0.6

4 2_6.csv No Bi 000 000 000 000

5 2_8.csv No Bi 000 000 000 000

6 6_9.csv less 000 000 000 000

7 7_9.csv s(=0 =26" =46" "Mi" 121

My Expected results files (outFile.csv):

a_id b_id CC DD EE FF GG

1 1 0 10 27 57

1 3 0 10 27

1 6 0 10

7 9 0 26 46

解決方案

The following approach should work fine using Python csv module:

import csv

import re

import string

output_header = ['a_id', 'b_id', 'CC', 'DD', 'EE', 'FF', 'GG']

sanitise_table = string.maketrans("","")

nodigits_table = sanitise_table.translate(sanitise_table, string.digits)

def find_mi(row):

for index, col in enumerate(row):

if col.find('Mi') != -1:

return index

return -1

def sanitise_cell(cell):

return cell.translate(sanitise_table, nodigits_table) # Keep digits

f_input = open('fileO.csv', 'rb')

f_output = open('outFile.csv', 'wb')

csv_input = csv.reader(f_input)

csv_output = csv.writer(f_output)

input_header = next(f_input)

csv_output.writerow(output_header)

for row in csv_input:

#print '%2d %s' % (len(row), row)

if len(row) >= 2:

bb = re.match(r'(\d+)__(\d+).0\.csv', row[1])

mi = find_mi(row)

if bb and mi != -1:

row[:] = row[:mi] + [''] * (len(row) - mi)

row[:] = [sanitise_cell(col) for col in row]

row[0] = bb.group(1)

row[1] = bb.group(2)

csv_output.writerow(row)

f_input.close()

f_output.close()

outFile.csv will contain the following:

a_id,b_id,CC,DD,EE,FF,GG

1,1,0,10,27,57,

1,3,0,10,27,,

1,6,0,10,,,

7,9,0,26,46,,

Tested using Python 2.6.6

總結

以上是生活随笔為你收集整理的python按索引删除字符串,基于特定的索引字符串值选择记录，然后通过python删除后续字段...的全部內容，希望文章能夠幫你解決所遇到的問題。

如果覺得生活随笔網站內容還不錯，歡迎將生活随笔推薦給好友。

上一篇： Python实现给指定的微信朋友发信息
下一篇： websocket python爬虫_p