
Python word counting for a given paragraph: 数训营 lesson 1 notes, Python basics

Published: 2025/4/16 (python, 豆豆)


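1. The help() and dir() functions

help() and dir() are both built-in helper functions for exploring objects:

help() prints detailed documentation for an object, while dir() simply lists the names of its available attributes and methods.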
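As a minimal sketch of the difference, the snippet below inspects the built-in str type. It assumes Python 3; the variable names are illustrative only. help() itself is left commented out because it opens a pager in an interactive session.

```python
# dir() returns a list of attribute names; check that str has a split method.
names = dir(str)
has_split = 'split' in names

# Keep only the public names (those without a leading underscore).
public_names = [name for name in names if not name.startswith('_')]
print(public_names[:5])

# help() prints documentation to the screen; the same text is stored in __doc__.
doc = str.split.__doc__
print(doc.splitlines()[0])

# help(str.split)  # uncomment in an interactive session to read the full docs
```

dir() is handy for a quick look at what an object offers, and help() is the next step once a promising name has been found.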

2. Basic data types

The basic data types are numbers, booleans, and strings.

2.1 Numeric data comes in two kinds: integers (int) and floating-point numbers (float).

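Arithmetic on numeric values:

Addition: x + y

Subtraction: x - y

Multiplication: x * y

Division: x / y

Exponentiation: x ** 3

Add and assign: x += 2 adds 2 to the current value of x

Subtract and assign: x -= 2 subtracts 2 from the current value of x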
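The operators above can be tried out as follows. This is a sketch that assumes Python 3, where / always performs true division; under Python 2, dividing two integers with / truncates the result instead. The // operator is not in the notes and is shown only for comparison.

```python
x = 7
y = 2

total = x + y        # addition
difference = x - y   # subtraction
product = x * y      # multiplication
quotient = x / y     # true division in Python 3, always a float
floor_q = x // y     # floor division, drops the fractional part
cube = x ** 3        # exponentiation

x += 2               # same as x = x + 2
x -= 2               # same as x = x - 2, so x is back to 7

print(total, difference, product, quotient, floor_q, cube, x)
```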

2.2 Boolean data has only two values: True and False

2.3 Strings

A string is written inside single or double quotes; whatever sits between the quotes is of type str.

s1 = 'hello world'
s2 = "hello world"
len(s1)  # -> 11, the space counts as a character

long_str = ' I am a teacher. '

# Remove whitespace from both ends of the string
long_str.strip()  # -> 'I am a teacher.'

# Remove whitespace from the left end only
long_str.lstrip()  # -> 'I am a teacher. '

# Remove whitespace from the right end only
long_str.rstrip()  # -> ' I am a teacher.'

# Replace 'teacher' with 'student'
long_str.strip().replace('teacher', 'student')  # -> 'I am a student.'

num_str = '123456'
num_str.isdigit()  # -> True; checks whether every character in the string is a digit

String slicing

s = 'hello'
# index           0  1  2  3  4
# negative index -5 -4 -3 -2 -1

s[1:4]  # -> 'ell'; indices start at 0 and the end index is excluded
s[1:]   # -> 'ello'
s[:3]   # -> 'hel'

# Negative indices count from the end
s[-5:]    # -> 'hello'
s[-4:-1]  # -> 'ell'
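Slices also accept an optional third value, the step, which the notes do not cover. The sketch below (Python 3 assumed, variable names illustrative) shows a few common patterns on the same string.

```python
s = 'hello'

first_three = s[:3]      # characters at index 0, 1, 2
last_two = s[-2:]        # the final two characters
every_other = s[::2]     # step of 2: index 0, 2, 4
reversed_s = s[::-1]     # a negative step walks the string backwards

print(first_three, last_two, every_other, reversed_s)
```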

Formatted output

num_s = 100
num_t = 3
test_str = 'hello'
text = 'there are %d students in the classroom and %s' % (num_s, test_str)
text  # -> 'there are 100 students in the classroom and hello'

# %d formats an integer and %s formats a string; the values in the parentheses are matched to the placeholders in order.
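For comparison, here is the same sentence built three ways. The %-style line comes from the notes; str.format() and f-strings are alternatives that are not part of the original lesson, and f-strings assume Python 3.6 or later.

```python
num_s = 100
test_str = 'hello'

# The %-style formatting used in the notes
old_style = 'there are %d students in the classroom and %s' % (num_s, test_str)

# str.format() fills the {} placeholders in order
with_format = 'there are {} students in the classroom and {}'.format(num_s, test_str)

# f-strings embed the expressions directly in the string
f_string = f'there are {num_s} students in the classroom and {test_str}'

print(old_style == with_format == f_string)
```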

s = 'hello world hello bigdata hello China'
s.split(' ')  # -> ['hello', 'world', 'hello', 'bigdata', 'hello', 'China']

# Split the text into a list, using a space as the separator
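One detail worth knowing when splitting on a space: consecutive spaces produce empty strings in the result, whereas split() with no argument treats any run of whitespace as one separator. A small sketch, with a made-up sample string:

```python
text = 'hello  world hello'   # note the double space after the first word

by_space = text.split(' ')    # splits at every single space
by_whitespace = text.split()  # splits on runs of any whitespace

print(by_space)
print(by_whitespace)
```

For word counting, the no-argument form is usually the safer choice because it never yields empty "words".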

3. Conditional statements

3.1 The if statement

score = 59
if score > 90:
    print('very good.')
elif score > 85:
    print('good.')
elif score > 70:
    print('just so so.')
else:
    print('not so good')

# The pattern is if / elif / else
# Remember to end each condition line with a colon ":"
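To see which branch fires for different inputs, the chain can be wrapped in a function and called with several scores. The function name grade is hypothetical and only used for this illustration; Python 3 is assumed.

```python
def grade(score):
    """Return the comment that the if/elif/else chain selects for a score."""
    if score > 90:
        return 'very good.'
    elif score > 85:
        return 'good.'
    elif score > 70:
        return 'just so so.'
    else:
        return 'not so good'


for score in (95, 88, 75, 59):
    print(score, grade(score))
```

Only the first condition that is true runs, so a score of 95 never reaches the later elif branches even though it also satisfies them.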

3.2 Logical operators: and, or, not

A = True
B = False
A and B         # -> False
A or B          # -> True
(not A) or B    # -> False
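The sketch below prints the full truth table and then illustrates short-circuit evaluation, a property of and/or that the notes do not mention. The helper record is hypothetical and exists only to show whether the right-hand operand was evaluated.

```python
A = True
B = False

# Print the truth table for and / or
for left in (True, False):
    for right in (True, False):
        print(left, right, left and right, left or right)

# and / or short-circuit: the right operand is evaluated only when needed
calls = []

def record(value):
    """Hypothetical helper that notes each call and returns its argument."""
    calls.append(value)
    return value

result = B and record(True)   # B is False, so record() is never called
print(result, calls)
```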

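4. Containers

4.1 Lists (list)

string_list = ['this', 'is', 'a', 'string']
len(string_list)   # -> 4; len() returns the number of elements in the list
string_list[1:3]   # -> ['is', 'a']; list slicing works like string slicing

List elements are not restricted to one type; values of different types can be stored in the same list.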

mass = ['this', 'is', 'good', 3.14, 10]

# Print the type of each element with a for loop
for item in mass:
    print(type(item))

# Use the index to select elements
for index in range(len(mass)):
    if index % 2 == 0:  # even index
        print(mass[index])

# A while loop (pay attention to the stopping condition)
index = 0
while index < 2:
    print(mass[index])
    index += 1
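When both the index and the element are needed, the built-in enumerate() is a common alternative to looping over range(). This is a sketch in Python 3; even_items is an illustrative name.

```python
mass = ['this', 'is', 'good', 3.14, 10]

even_items = []
for index, item in enumerate(mass):
    if index % 2 == 0:          # keep the elements at even positions
        even_items.append(item)

print(even_items)
```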

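append vs. extend

append adds any single object to the end of a list, even another list, which then becomes one nested element.

extend takes an iterable such as a list and adds each of its elements individually.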
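A short sketch of the difference, using two made-up lists:

```python
a = [1, 2]
a.append([3, 4])      # the list [3, 4] is added as one nested element

b = [1, 2]
b.extend([3, 4])      # 3 and 4 are added one by one

print(a, len(a))
print(b, len(b))
```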

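sort vs. sorted

The list method sort() reorders the list in place and returns None.

The built-in function sorted() returns a new sorted list and leaves the original unchanged.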
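The following sketch shows both behaviours on one list; the variable names are illustrative.

```python
nums = [3, 1, 2]

new_list = sorted(nums)     # a new list; nums itself is untouched
before_sort = list(nums)    # copy of nums taken before calling sort()

result = nums.sort()        # sorts nums in place and returns None

print(new_list, before_sort, nums, result)
```

A common mistake is writing nums = nums.sort(), which replaces the list with None.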

Advanced sorting

num_list = [1, 2, 3, 4, 5, 6, 7]
print(sorted(num_list, reverse=True))  # descending order
# -> [7, 6, 5, 4, 3, 2, 1]

a = ['aa', 'bbbbb', 'ccc', 'dddd']
print(sorted(a, key=len))  # sort by string length, shortest first
# -> ['aa', 'ccc', 'dddd', 'bbbbb']

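4.2 Dictionaries (dict)

Looking up a key in a dictionary is much faster than searching a list, because a dictionary is a hash table: the key is hashed to locate its value directly. A dictionary maps each key to one value and is written with braces "{}".

Dict = {key1: value1, key2: value2, key3: value3}

pets = {'dogs': 3, 'cats': 2, 'birds': 4}
print(pets['dogs'])  # look up a key's value with square brackets "[]"
# -> 3

if 'cats' in pets:
    # Only strings can be joined with "+", so the value of pets['cats'] must be converted with str().
    print('I have ' + str(pets['cats']) + ' cats.')
# -> I have 2 cats.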

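# Iterate over a dictionary with a for loop (this visits the keys)
for pet in pets:
    print('I have ' + str(pets[pet]) + ' ' + pet)
# -> I have 3 dogs
#    I have 2 cats
#    I have 4 birds

# Get only the keys
list(pets.keys())  # -> ['dogs', 'cats', 'birds'], a list of the keys

# Sum the counts; the numbers are the values, not the keys
sum(pets.values())  # -> 9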

# Take keys and values out of the dictionary in pairs
for (pet, num) in pets.items():  # each element of the dictionary is a key-value pair
    print(pet, '=', num)
# -> dogs = 3
#    cats = 2
#    birds = 4

# Add a new key-value pair
pets['ducks'] = 5

# Delete a key-value pair
del pets['ducks']
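Looking up or deleting a key that is not present raises KeyError. The sketch below shows two standard dict methods, get() and pop(), that accept a default instead; they are not covered in the notes and are included here only as a supplement.

```python
pets = {'dogs': 3, 'cats': 2, 'birds': 4}

# pets['fish'] would raise KeyError; get() returns a default instead
fish = pets.get('fish', 0)

pets['ducks'] = 5                   # add a pair
removed = pets.pop('ducks')         # remove it again and get its value back
missing = pets.pop('ducks', None)   # a default avoids KeyError when the key is gone

total = sum(pets.values())
print(fish, removed, missing, total)
```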

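4.3 Reading and writing files

in_file = 'shanghai.txt'
for line in open(in_file):
    print(line.strip().split(" "))

# A file opened with "for line in open(file):" is not closed explicitly; in CPython the handle is released when the file object is garbage-collected.
# strip() removes the whitespace (including the newline) at both ends of each line
# split(" ") breaks each line into words at the spaces, giving a list of words
# The result is one list per line, with each word of that line as one element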
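A more explicit way to manage the handle is the with statement, which closes the file as soon as the block ends. The sketch below assumes Python 3 and writes its own small sample file (the path and contents are made up) so that it runs without shanghai.txt.

```python
import os
import tempfile

# Create a small sample file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with open(path, 'w') as out:
    out.write(' hello world \nhello China\n')

rows = []
with open(path) as in_file:          # the file is closed when the block ends
    for line in in_file:
        rows.append(line.strip().split(' '))

print(rows)
print(in_file.closed)
```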

4.4 Counting how often each word appears in a file

# Use a dictionary as the container
words_count = {}  # create an empty dictionary

for line in open('shanghai.txt'):
    # Strip whitespace from both ends of the line, then split it at spaces into a list of words
    words = line.strip().split(" ")
    for word in words:
        if word in words_count:
            words_count[word] += 1  # the word is already in the dictionary: add 1 to its count
        else:
            words_count[word] = 1   # the word is new: add a key-value pair for it

# The dictionary now holds each word and its frequency
for word in words_count:
    print(word, words_count[word])  # loop over the dictionary and print each word with its count
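The standard library offers collections.Counter, a dict subclass that does the same bookkeeping. This is an alternative to the hand-written loop, not the approach the notes use; the input lines are made-up stand-ins for the file contents.

```python
from collections import Counter

# Sample lines standing in for the contents of a text file.
lines = ['hello world', 'hello bigdata', 'hello China']

words_count = Counter()
for line in lines:
    words_count.update(line.strip().split())   # add 1 for every word in the line

print(words_count.most_common(2))
```

A Counter returns 0 for a word it has never seen, so no "if word in words_count" check is needed.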

4.5 Defining functions

A function is defined in the following form:

def function_name(parameters):
    function body

For example:

def add_num(x, y):
    return x + y

add_num(3, 4)  # -> 7

def my_func(list_x):
    new_list = []
    for item in list_x:
        new_list.append(item ** 3)
    return new_list

my_test = [1, 2, 3, 4, 5]
my_func(my_test)  # -> [1, 8, 27, 64, 125]

# Define a function that reads a file and writes out each word with its frequency
def count_words(in_file, out_file):
    words_count = {}
    # Strip each line, drop a trailing period, split at spaces, and record the words in a dictionary
    for line in open(in_file):
        for word in line.strip().rstrip('.').split(" "):
            if word in words_count:
                words_count[word] += 1
            else:
                words_count[word] = 1

    # Open the output file and write the results
    out = open(out_file, 'w')  # 'w' stands for writing; a file opened this way must be closed at the end
    for word in words_count:
        out.write(word + "#" + str(words_count[word]) + "\n")  # write each word and count as word#count
    out.close()  # close the handle; the parentheses are required to actually call close

# Call the function
count_words('shanghai.txt', 'words_count.txt')
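To check the function end to end without shanghai.txt, it can be run on a temporary file. The variant below is a sketch rather than the notes' exact code: it uses with blocks so both files are closed automatically, dict.get() in place of the if/else, and split() without an argument so that blank lines do not produce an empty-string word. The sample text and paths are made up.

```python
import os
import tempfile


def count_words(in_file, out_file):
    """Count words in in_file and write 'word#count' lines to out_file."""
    words_count = {}
    with open(in_file) as source:
        for line in source:
            for word in line.strip().rstrip('.').split():
                words_count[word] = words_count.get(word, 0) + 1
    with open(out_file, 'w') as out:       # closed automatically at block end
        for word, count in words_count.items():
            out.write(word + '#' + str(count) + '\n')


# Sample input standing in for shanghai.txt (contents are made up).
folder = tempfile.mkdtemp()
in_path = os.path.join(folder, 'sample.txt')
out_path = os.path.join(folder, 'words_count.txt')
with open(in_path, 'w') as f:
    f.write('hello world.\nhello China.\n')

count_words(in_path, out_path)
with open(out_path) as f:
    result = f.read().splitlines()
print(result)
```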

4.6 List comprehensions

When the same operation has to be applied to every element of a list, a list comprehension can be used.

[operation on item for item in list (optional: a condition on item)]

test_list = [1, 2, 3, 4]

[item ** 3 for item in test_list]  # -> [1, 8, 27, 64]

['num_' + str(item) for item in test_list]  # -> ['num_1', 'num_2', 'num_3', 'num_4']

[item ** 3 for item in test_list if item % 2 == 0]  # cube the even elements and return a new list
# -> [8, 64]

[item ** 4 for item in test_list if item % 2 == 0 and item > 3]  # raise the elements that are even and greater than 3 to the 4th power
# -> [256]
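A comprehension is shorthand for a loop that appends to a new list. The sketch below places the two forms side by side and also shows the brace variant that builds a dictionary, which goes slightly beyond the notes; the variable names are illustrative.

```python
test_list = [1, 2, 3, 4]

# The comprehension form
cubes_of_even = [item ** 3 for item in test_list if item % 2 == 0]

# The equivalent explicit loop
loop_result = []
for item in test_list:
    if item % 2 == 0:
        loop_result.append(item ** 3)

# The same syntax with braces and a key: value pair builds a dictionary
cube_by_item = {item: item ** 3 for item in test_list}

print(cubes_of_even == loop_result, cube_by_item)
```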
