當前位置：首頁 >

【OpenMP实现】任意线程数并行化快排结合Merge排序100w--10线程下只用0.06s

發布時間：2025/4/16 20 豆豆

生活随笔收集整理的這篇文章主要介紹了【OpenMP实现】任意线程数并行化快排结合Merge排序100w--10线程下只用0.06s 小編覺得挺不錯的,現在分享給大家,幫大家做個參考.

簡述

通過這個標題也大概能猜測出算法的思想。

Merge操作是O(n)的（出自于MergeSort歸并排序）
根據線程數將數據劃分為thread_count塊（較為均勻點就好了）
每段上用qsort（快排）
之后再用一個merge將所有的快排結果合并起來

算法思想很簡單，但是效果卻不錯。

之前寫過一篇用雙進程寫快排然后實現100w的排序只用0.2s

這里測試過用兩個的情況，效果類似

PS D:\Code\C++\repo\openMP_test\x64\Debug> ./openMP_test.exe 2 reading finish sort finished Total time Serial: 0.223s

代碼

只有一個線程的時候不進行~

#include <iostream> #include <omp.h> #include <fstream> #include <ctime> using namespace std; #pragma warning(disable : 4996) int *a, n, thread_count, *output;void QSort(int begin, int end) {if (begin >= end) return;// 樞軸元素就需求最前里面的那個int i = begin, j = end, key = a[begin]; // 坑在begin點 while (i < j) {while (i < j && a[j] >= key) --j;if (i < j) a[i] = a[j];while (i < j && a[i] <= key) ++i;if (i < j) a[j] = a[i];}a[i] = key;QSort(begin, i - 1);QSort(i + 1, end); } void sort() {// 分成thread_count段;int *keys = new int[thread_count];bool *finished = new bool[thread_count];int localn = n / thread_count, i;for (i = 0; i < thread_count; ++i) {keys[i] = i * localn; // begin keysfinished[i] = false;} #pragma omp parallel for num_threads (thread_count) default(none) shared(a, n,localn) private(i)for (i = 0; i < thread_count; ++i) {if (i < thread_count - 1) {QSort(i*localn, (i + 1)*localn - 1);}else {QSort(i*localn, n - 1);}}bool allfinished = false, first=true;int min=0, minkey_index, globalindex=0;while (!allfinished) {allfinished = false;first = true;// choose min indexfor (i = 0; i < thread_count; ++i) {if (finished[i]) continue;if (first) { min = a[keys[i]]; first = false; minkey_index = i;} else if (a[keys[i]] < min){min = a[keys[i]]; minkey_index = i;}}if (first) { break; }output[globalindex++] = min;keys[minkey_index]++;if (minkey_index == thread_count - 1 && keys[minkey_index] == n) finished[minkey_index] = true;else if (minkey_index < thread_count - 1 && keys[minkey_index] == (minkey_index + 1) * localn) { finished[minkey_index] = true; }}delete[]finished;delete[]keys; }int main(int argc, char **argv) {if (argc < 2) return 0;thread_count = strtol(argv[1], NULL, 10);int i;ifstream cin("input.txt");cin >> n;a = new int[n];output = new int[n];for (i = 0; i < n; ++i) cin >> a[i];cout << "reading finish\n";clock_t starttime, endtime;starttime = clock();sort();endtime = clock();cout << "sort finished\n";ofstream out("output.txt");for (i = 0; i < n; ++i) out << output[i] << " ";cout << "Total time sorted: " << (double)(endtime - starttime) / CLOCKS_PER_SEC << "s" << endl;delete[] a;delete[]output; }

調用之前寫好的生成數據的腳本來生成數據
然后用生成出來的exe來進行排序，后面的數值是線程數目
運行代碼其實并不消耗多少時間，主要的問題是在寫入文件的時候消耗了點時間，IO操作嘛畢竟

PS D:\Code\C++\repo\openMP_test\x64\Debug> python generate.py 1000000 PS D:\Code\C++\repo\openMP_test\x64\Debug> ./openMP_test.exe 2 reading finish sort finished Total time Serial: 0.223s PS D:\Code\C++\repo\openMP_test\x64\Debug> ./openMP_test.exe 10 reading finish sort finished Total time Serial: 0.068s PS D:\Code\C++\repo\openMP_test\x64\Debug> ./openMP_test.exe 10 reading finish sort finished Total time sorted: 0.06s PS D:\Code\C++\repo\openMP_test\x64\Debug> python check.py right PS D:\Code\C++\repo\openMP_test\x64\Debug>

生成數據的腳本

import sys import numpy as npN = int(sys.argv[1]) a = np.random.randint(- int(np.sqrt(N)), int(np.sqrt(N)), N) c = sorted(a) with open("input.txt", 'w') as f:f.write(str(N) + '\n')f.write(' '.join(list(map(str, a)))+'\n') with open("ans.txt", 'w') as f:for i in range(N):f.write(str(c[i]) + ' ')

檢查結果的代碼

with open('ans.txt', 'r')as f, open('output.txt', 'r') as r:a = f.readline()b = r.readline()print('right' if a == b else 'wrong')

總結

以上是生活随笔為你收集整理的【OpenMP实现】任意线程数并行化快排结合Merge排序100w--10线程下只用0.06s的全部內容，希望文章能夠幫你解決所遇到的問題。

如果覺得生活随笔網站內容還不錯，歡迎將生活随笔推薦給好友。

上一篇： MPI派生数据类型发送接收，降低发送时间
下一篇： R语言--查看数据类型+类型判断