
[Compiler Principles] Let's Build A Simple Interpreter. Part 2. (Python/C/C++ version) (notes)

Published: 2025/3/20 python 25 豆豆


Contents

- Python code
- C code
- Summary

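Let's dig into interpreters and compilers again.

Today I will show you a new version of the calculator from Part 1 that will be able to:

1. Handle whitespace characters anywhere in the input string
2. Consume multi-digit integers from the input
3. Subtract two integers (currently it can only add them)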

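Compared with the version from Part 1, the main code changes are:

1. The get_next_token method was refactored a bit. The logic that increments the pos pointer was factored out into a separate method, advance.
2. Two more methods were added: skip_whitespace to ignore whitespace characters and integer to handle multi-digit integers in the input.
3. The expr method was modified to recognize the INTEGER -> MINUS -> INTEGER phrase in addition to the INTEGER -> PLUS -> INTEGER phrase. The method now also interprets both addition and subtraction after successfully recognizing the corresponding phrase.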
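Before reading the full listing, it may help to look at the cursor logic in isolation. The sketch below is only an illustration: the `Scanner` class is a hypothetical name that does not appear in the article's code, and it keeps just the three helpers described above (advance, skip_whitespace, integer) without any token objects.

```python
class Scanner:
    """Hypothetical helper: only the cursor logic, no Token objects."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        # None marks "no more input", including the empty-string case.
        self.current_char = text[0] if text else None

    def advance(self):
        """Move one character forward and refresh current_char."""
        self.pos += 1
        if self.pos < len(self.text):
            self.current_char = self.text[self.pos]
        else:
            self.current_char = None

    def skip_whitespace(self):
        """Step over any run of whitespace characters."""
        while self.current_char is not None and self.current_char.isspace():
            self.advance()

    def integer(self):
        """Collect consecutive digits and return them as one int."""
        digits = ''
        while self.current_char is not None and self.current_char.isdigit():
            digits += self.current_char
            self.advance()
        return int(digits)


scanner = Scanner('   1234  + 7')
scanner.skip_whitespace()
first = scanner.integer()
scanner.skip_whitespace()
operator = scanner.current_char
print(first, operator)  # prints: 1234 +
```

Keeping the pointer update in one place (advance) means the end-of-input check is written once, and both skip_whitespace and integer can rely on current_char being None at the end.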

Python code

# -*- coding: utf-8 -*-
"""
@File    : calc2.py
@Time    : 2021/7/8 10:32
@Author  : Dontla
@Email   : sxana@qq.com
@Software: PyCharm
"""
# EOF (end-of-file) token is used to indicate that
# there is no more input left for lexical analysis
INTEGER, PLUS, MINUS, EOF = 'INTEGER', 'PLUS', 'MINUS', 'EOF'


class Token(object):
    def __init__(self, type_, value):
        # token type: INTEGER, PLUS, MINUS, or EOF
        self.type = type_
        # token value: non-negative integer value, '+', '-', or None
        self.value = value

    def __str__(self):
        """String representation of the class instance.

        Examples:
            Token(INTEGER, 3)
            Token(PLUS, '+')
        """
        return 'Token({type}, {value})'.format(
            type=self.type,
            value=repr(self.value)
        )

    def __repr__(self):
        return self.__str__()


class Interpreter(object):
    def __init__(self, text):
        # client string input, e.g. "3 + 5", "12 - 5", etc
        self.text = text
        # self.pos is an index into self.text
        self.pos = 0
        # current token instance
        self.current_token = None
        self.current_char = self.text[self.pos]

    def error(self):
        raise Exception('Error parsing input')

    def advance(self):
        """Advance the 'pos' pointer and set the 'current_char' variable."""
        self.pos += 1
        if self.pos > len(self.text) - 1:
            self.current_char = None  # Indicates end of input
        else:
            self.current_char = self.text[self.pos]

    def skip_whitespace(self):
        while self.current_char is not None and self.current_char.isspace():
            self.advance()

    def integer(self):
        """Return a (multidigit) integer consumed from the input."""
        result = ''
        while self.current_char is not None and self.current_char.isdigit():
            result += self.current_char
            self.advance()
        return int(result)

    def get_next_token(self):
        """Lexical analyzer (also known as scanner or tokenizer)

        This method is responsible for breaking a sentence
        apart into tokens.
        """
        while self.current_char is not None:

            if self.current_char.isspace():
                self.skip_whitespace()
                continue

            if self.current_char.isdigit():
                return Token(INTEGER, self.integer())

            if self.current_char == '+':
                self.advance()
                return Token(PLUS, '+')

            if self.current_char == '-':
                self.advance()
                return Token(MINUS, '-')

            self.error()

        return Token(EOF, None)

    def eat(self, token_type):
        # compare the current token type with the passed token
        # type and if they match then "eat" the current token
        # and assign the next token to the self.current_token,
        # otherwise raise an exception.
        if self.current_token.type == token_type:
            self.current_token = self.get_next_token()
        else:
            self.error()

    def expr(self):
        """Parser / Interpreter

        expr -> INTEGER PLUS INTEGER
        expr -> INTEGER MINUS INTEGER
        """
        # set current token to the first token taken from the input
        self.current_token = self.get_next_token()

        # we expect the current token to be an integer
        left = self.current_token
        self.eat(INTEGER)

        # we expect the current token to be either a '+' or '-'
        op = self.current_token
        if op.type == PLUS:
            self.eat(PLUS)
        else:
            self.eat(MINUS)

        # we expect the current token to be an integer
        right = self.current_token
        self.eat(INTEGER)
        # after the above call the self.current_token is set to
        # EOF token

        # at this point either the INTEGER PLUS INTEGER or
        # the INTEGER MINUS INTEGER sequence of tokens
        # has been successfully found and the method can just
        # return the result of adding or subtracting two integers,
        # thus effectively interpreting client input
        if op.type == PLUS:
            result = left.value + right.value
        else:
            result = left.value - right.value
        return result


def main():
    while True:
        try:
            # Under Python 2, use raw_input('calc> ') instead of input
            text = input('calc> ')
        except EOFError:
            break
        if not text:
            continue
        interpreter = Interpreter(text)
        result = interpreter.expr()
        print(result)


if __name__ == '__main__':
    main()

Output:

D:\python_virtualenv\my_flask\Scripts\python.exe C:/Users/Administrator/Desktop/編譯原理/python/calc2.py
calc> 33234 - 324
32910

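C code

Note that, unlike the code from the previous lesson, the eat function here calls get_next_token as its last step, so once eat returns, the interpreter's current token is the next token rather than the one that was just consumed.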

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define flag_digital 0
#define flag_plus 1
#define flag_minus 2
#define flag_EOF 3

struct Token {
    int type;
    int value;
};

struct Interpreter {
    char* text;
    int pos;
    struct Token current_token;
};

void error() {
    printf("Invalid input!\n");
    exit(-1);
}

// Build a token; plain struct initialization works in both C and C++
struct Token make_token(int type, int value) {
    struct Token token = { type, value };
    return token;
}

char current_char(struct Interpreter* pipt) {
    return pipt->text[pipt->pos];
}

void advance(struct Interpreter* pipt) {
    pipt->pos++;
}

// Stops at '\0' as well, because '\0' is not a space
void skip_whitespace(struct Interpreter* pipt) {
    while (current_char(pipt) == ' ') {
        advance(pipt);
    }
}

// Return 1 if c is a decimal digit, 0 otherwise
int is_integer(char c) {
    return c >= '0' && c <= '9';
}

// Consume a (multi-digit) integer and return its numeric value.
// Accumulating digit by digit avoids the temporary buffer and floating-point pow().
int integer(struct Interpreter* pipt) {
    int result = 0;
    while (is_integer(current_char(pipt))) {
        result = result * 10 + (current_char(pipt) - '0');
        advance(pipt);
    }
    return result;
}

void get_next_token(struct Interpreter* pipt) {
    // Skip whitespace first so trailing spaces still lead to EOF
    skip_whitespace(pipt);
    if (current_char(pipt) == '\0') {
        pipt->current_token = make_token(flag_EOF, 0);
        return;
    }
    if (is_integer(current_char(pipt))) {
        pipt->current_token = make_token(flag_digital, integer(pipt));
        return;
    }
    if (current_char(pipt) == '+') {
        pipt->current_token = make_token(flag_plus, 0);
        advance(pipt);
        return;
    }
    if (current_char(pipt) == '-') {
        pipt->current_token = make_token(flag_minus, 0);
        advance(pipt);
        return;
    }
    error(); // any other character: report the error and exit
}

// Check the current token's type, move on to the next token,
// and return the value of the token that was consumed
int eat(struct Interpreter* pipt, int type) {
    int current_token_value = pipt->current_token.value;
    if (pipt->current_token.type != type) {
        error();
    }
    get_next_token(pipt);
    return current_token_value;
}

int expr(char* text) {
    struct Interpreter ipt = { text, 0 };
    get_next_token(&ipt);
    int left = eat(&ipt, flag_digital); // the first token must be an integer
    int op = ipt.current_token.type;    // the second token must be '+' or '-'
    if (op == flag_plus) {
        eat(&ipt, flag_plus);
    }
    else {
        eat(&ipt, flag_minus);
    }
    int right = eat(&ipt, flag_digital); // the third token must be an integer
    return op == flag_plus ? left + right : left - right;
}

int main() {
    char text[50];
    while (1) {
        printf("Enter an expression:\n");
        // scanf("%s") stops at whitespace, so read the whole line with fgets
        if (fgets(text, sizeof(text), stdin) == NULL) {
            break; // end of input
        }
        text[strcspn(text, "\n")] = '\0'; // strip the trailing newline
        if (text[0] == '\0') {
            continue; // ignore empty lines
        }
        int result = expr(text);
        printf("= %d\n\n", result);
    }
    return 0;
}

Output: (Spaces are skipped automatically. Some readers may ask: what about spaces between the digits of a number, can those be skipped too? Generally, a space inside a number is treated as invalid input, because whitespace has a different job: it separates tokens.)

Enter an expression:
3+5
= 8

Enter an expression:
44 + 55
= 99

Enter an expression:
555 + 555
= 1110

Enter an expression:
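The remark about eat in the C version (it hands back the value of the token it consumed and leaves the next token as the current one) can be sketched in Python as follows. This is an illustration under assumptions rather than the article's code: `TokenStream` is a hypothetical wrapper over a ready-made list of (type, value) pairs, so no lexer is involved.

```python
class TokenStream:
    """Hypothetical wrapper over a pre-built token list, for illustration only."""

    EOF_TOKEN = ('EOF', None)

    def __init__(self, tokens):
        self._tokens = iter(tokens)
        # Load the first token up front, like the initial get_next_token call.
        self.current_token = next(self._tokens, self.EOF_TOKEN)

    def eat(self, token_type):
        """Check the current token, step to the next one, return the eaten value."""
        eaten_type, eaten_value = self.current_token
        if eaten_type != token_type:
            raise SyntaxError('expected %s, got %s' % (token_type, eaten_type))
        # Fetching the next token is the last step, so current_token
        # now describes what comes after the token just consumed.
        self.current_token = next(self._tokens, self.EOF_TOKEN)
        return eaten_value


stream = TokenStream([('INTEGER', 44), ('PLUS', '+'), ('INTEGER', 55)])
left = stream.eat('INTEGER')
print(left, stream.current_token)  # prints: 44 ('PLUS', '+')
```

Returning the consumed value lets the caller write `left = eat(...)` in one step instead of saving the token before calling eat, which is the difference from the Python listing above.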

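Summary

In Part 1 you learned two important concepts: the token and the lexical analyzer. Today I want to talk about lexemes, parsing, and parsers.

You already know about tokens. To round out that discussion, though, I need to mention lexemes. What is a lexeme? A lexeme is the sequence of characters that forms a token. The original article includes a picture at this point with some tokens and sample lexemes, intended to make the relationship between them clear.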
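As a rough illustration of the token/lexeme relationship, the sketch below pairs each token type with the exact characters (the lexeme) that produced it. It uses regular expressions instead of the character-by-character scanner from the listing, and the names `TOKEN_PATTERNS`, `MASTER`, and `lexemes` are made up for this example.

```python
import re

# Assumed patterns for this calculator: token type -> regex matching its lexemes.
TOKEN_PATTERNS = [
    ('INTEGER', r'\d+'),
    ('PLUS', r'\+'),
    ('MINUS', r'-'),
]
# One alternation with a named group per token type.
MASTER = re.compile('|'.join('(?P<%s>%s)' % pair for pair in TOKEN_PATTERNS))


def lexemes(text):
    """Return (token type, lexeme) pairs; whitespace between lexemes is ignored."""
    pairs = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError('unexpected character %r' % text[pos])
        # lastgroup is the name of the group that matched, i.e. the token type.
        pairs.append((match.lastgroup, match.group()))
        pos = match.end()
    return pairs


print(lexemes('12 - 5'))
# prints: [('INTEGER', '12'), ('MINUS', '-'), ('INTEGER', '5')]
```

Here '12' and '5' are two different lexemes of the same token type INTEGER, while '-' is the only lexeme the MINUS token can have.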

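Now, remember our friend, the expr method? I said before that this is where the interpretation of an arithmetic expression actually happens. But before you can interpret an expression, you first need to recognize what kind of phrase it is, for example whether it is an addition or a subtraction. That is essentially what the expr method does: it finds the structure in the stream of tokens it gets from the get_next_token method, and then it interprets the phrase it has recognized, producing the result of the arithmetic expression.

The process of finding the structure in a stream of tokens, or in other words, the process of recognizing a phrase in that stream, is called parsing. The part of an interpreter or compiler that performs this job is called a parser.

So now you know that the expr method is the part of the interpreter where both parsing and interpreting happen. The expr method first tries to recognize (parse) the INTEGER -> PLUS -> INTEGER or the INTEGER -> MINUS -> INTEGER phrase in the stream of tokens. After it has successfully recognized (parsed) one of those phrases, the method interprets it and returns the result of adding or subtracting the two integers to the caller.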
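To make the two roles easier to tell apart, here is a small sketch that splits them into separate functions, which the article's expr method does not do. The names `parse` and `interpret` and the plain (type, value) tuples are assumptions made for this example only.

```python
def parse(tokens):
    """Recognize INTEGER (PLUS | MINUS) INTEGER in a list of (type, value) pairs.

    Returns the recognized phrase as (operator type, left value, right value).
    """
    if len(tokens) != 3:
        raise SyntaxError('expected exactly three tokens')
    (left_type, left), (op_type, _), (right_type, right) = tokens
    if left_type != 'INTEGER' or right_type != 'INTEGER':
        raise SyntaxError('expected an integer on both sides')
    if op_type not in ('PLUS', 'MINUS'):
        raise SyntaxError('expected PLUS or MINUS')
    return op_type, left, right


def interpret(phrase):
    """Evaluate a phrase that parse() has already recognized."""
    op_type, left, right = phrase
    return left + right if op_type == 'PLUS' else left - right


tokens = [('INTEGER', 12), ('MINUS', '-'), ('INTEGER', 5)]
phrase = parse(tokens)
print(phrase, interpret(phrase))  # prints: ('MINUS', 12, 5) 7
```

In the article's calculator both steps live inside expr; separating them here only shows that recognition (parsing) has to succeed before any arithmetic (interpretation) takes place.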

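Now it's time for exercises again.

1. Extend the calculator to handle multiplication of two integers
2. Extend the calculator to handle division of two integers
3. Modify the code to interpret expressions containing an arbitrary number of additions and subtractions, for example "9 - 5 + 3 + 11"

Check your understanding.

1. What is a lexeme? [A lexeme is the sequence of characters that forms a token]
2. What is the name of the process that finds the structure in a stream of tokens, or in other words, the process that recognizes a certain phrase in that stream? [Parsing]
3. What is the name of the part of the interpreter (compiler) that does the parsing? [Parser]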
