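LL(1) Parser Program in C

7 Jan 2025 | 7 min read

In this article, we will discuss an **LL(1) parser** program in C. Before turning to the implementation, we need to understand what an LL(1) parser is and the rules it follows.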

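What is an LL(1) parser?

LL(1) is a **top-down parser** that handles the LL(1) class of grammars. The first **"L"** means the input is scanned from **left** to **right**, the second **"L"** means the parser produces a **leftmost** derivation, and the **"1"** is the number of lookahead symbols used to make each parsing decision.

Before building an **LL(1)** parsing table, we must determine whether the grammar is **LL(1)**.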

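Rules for constructing an LL(1) parsing table

The following steps describe how **LL(1) parsing** works and how it can be implemented in C.

Step 1: Evaluate the grammar

First, examine the given context-free grammar, written in **BNF (Backus-Naur Form)**.
Make sure the grammar is **LL(1)-parsable** and **not left-recursive**. **Left recursion** can be eliminated with a suitable transformation.
Put the grammar into a **form** that is convenient for analysis, such as a table of parsing rules or a set of **production rules**.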

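Step 2: FIRST and FOLLOW sets

For each **non-terminal** in the grammar, determine its **FIRST set**. The **FIRST set** of a non-terminal **A** contains every **terminal** that can begin a string derived from **A**, plus **ε** if **A** can derive the empty string.
For each non-terminal in the grammar, determine its **FOLLOW set**. The **FOLLOW set** of a non-terminal **A** contains every terminal that can appear immediately after a string derived from **A**, plus **$** if **A** can appear at the very end of the input.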

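Step 3: How to build the parsing table

For each production **A -> α**, where **A** is a **non-terminal** and α is a string of terminals and/or non-terminals:
For each terminal **t** in **FIRST(α)**, add **A -> α** to table entry **[A, t]**.
If **ε (epsilon)** is in **FIRST(α)**, then for each terminal **t** in **FOLLOW(A)**, add **A -> α** to table entry **[A, t]**.
If **ε** is in **FIRST(α)** and **$ (the end-of-input marker)** is in **FOLLOW(A)**, add **A -> α** to table entry **[A, $]**.
If any cell of the table ends up with more than one entry (a conflict), the grammar is not **LL(1)** and cannot be parsed with **LL(1) parsing**.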

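Step 4: How to use the parsing table

First, create an empty **parse stack** and push the grammar's start symbol onto it.
**Read** the current input token.
Look up the **parsing table** using the symbol on top of the stack as the row index and the current **input token** as the **column index**.
If the entry is empty or marked as an error, the input contains a syntax **error**.
If the entry contains a production **A -> α**, **pop** **A** from the stack and **push** the symbols of **α** in reverse order, so that the leftmost symbol of α ends up on top.
If the symbol on top of the stack is a terminal, it must match the current input token: pop it and advance to the next token, otherwise report a syntax **error**. Repeat until the stack is empty and the whole input has been consumed.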

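Example program: LL(1) parsing table in C

This program reads the grammar from a file named **"text.txt"**. Every grammar symbol is a single character, upper-case letters are non-terminals, and epsilon is written as the symbol **"^"**. The program assumes that the input grammar is **LL(1)**.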

text.txt

E->TA
A->+TA|^
T->FB
B->*FB|^
F->t|(E)

Program

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define TSIZE 128
// table[i][j] holds (1 + index) of the production to apply when the ith
// non-terminal is on top of the stack and the lookahead is the terminal
// with ASCII value j; 0 means the cell is empty.
int table[100][TSIZE];
// Terminals are indexed by ASCII value:
// terminal[i] = 1 means the character with ASCII value i is a terminal.
char terminal[TSIZE];
// Non-terminals are the upper-case letters 'A' through 'Z':
// non_terminal[i] = 1 means the letter 'A' + i is a non-terminal of the grammar.
char non_terminal[26];
// structure to hold each production
// str[] stores the production in the form A->β
// len is the length of str[]
struct product {
    char str[100];
    int len;
} pro[20];
// number of productions of the form A->β
int no_pro;
char first[26][TSIZE];
char follow[26][TSIZE];
// stores FIRST(β) for each production A->β
char first_rhs[100][TSIZE];
// check whether the symbol is a non-terminal
int isNT(char c) {
    return c >= 'A' && c <= 'Z';
}
// read the grammar from the file, splitting alternatives at '|'
// (the grammar is assumed to contain only 7-bit ASCII characters)
void readFromFile() {
    FILE* fptr;
    fptr = fopen("text.txt", "r");
    if (fptr == NULL) {
        perror("text.txt");
        exit(1);
    }
    char buffer[255];
    size_t i, n;
    int j;
    while (fgets(buffer, sizeof(buffer), fptr)) {
        printf("%s", buffer);
        // ignore the line ending, so the last line works with or without '\n'
        n = strcspn(buffer, "\r\n");
        if (n == 0) continue;    // skip blank lines
        j = 0;
        non_terminal[buffer[0] - 'A'] = 1;
        for (i = 0; i < n; ++i) {
            if (buffer[i] == '|') {
                // close the current alternative and start a new production
                // that reuses the same "A->" prefix
                ++no_pro;
                pro[no_pro - 1].str[j] = '\0';
                pro[no_pro - 1].len = j;
                pro[no_pro].str[0] = pro[no_pro - 1].str[0];
                pro[no_pro].str[1] = pro[no_pro - 1].str[1];
                pro[no_pro].str[2] = pro[no_pro - 1].str[2];
                j = 3;
            }
            else {
                pro[no_pro].str[j] = buffer[i];
                ++j;
                if (!isNT(buffer[i]) && buffer[i] != '-' && buffer[i] != '>') {
                    terminal[(unsigned char)buffer[i]] = 1;
                }
            }
        }
        pro[no_pro].len = j;
        ++no_pro;
    }
    fclose(fptr);
}
// adds FIRST(A), without epsilon, to FOLLOW(B)
void add_FIRST_A_to_FOLLOW_B(char A, char B) {
    int i;
    for (i = 0; i < TSIZE; ++i) {
        if (i != '^')
            follow[B - 'A'][i] = follow[B - 'A'][i] || first[A - 'A'][i];
    }
}
// adds FOLLOW(A) to FOLLOW(B)
void add_FOLLOW_A_to_FOLLOW_B(char A, char B) {
    int i;
    for (i = 0; i < TSIZE; ++i) {
        if (i != '^')
            follow[B - 'A'][i] = follow[B - 'A'][i] || follow[A - 'A'][i];
    }
}
// computes FOLLOW for every non-terminal; the pass over all productions is
// repeated no_pro times so that additions can propagate between the sets
void FOLLOW() {
    int t = 0;
    int i, j, k, x;
    while (t++ < no_pro) {
        for (k = 0; k < 26; ++k) {
            if (!non_terminal[k])    continue;
            char nt = k + 'A';
            for (i = 0; i < no_pro; ++i) {
                // look for nt on the right-hand side (which starts at index 3)
                for (j = 3; j < pro[i].len; ++j) {
                    if (nt == pro[i].str[j]) {
                        // scan the symbols that come after nt
                        for (x = j + 1; x < pro[i].len; ++x) {
                            char sc = pro[i].str[x];
                            if (isNT(sc)) {
                                add_FIRST_A_to_FOLLOW_B(sc, nt);
                                // a nullable symbol lets us look further right
                                if (first[sc - 'A']['^'])
                                    continue;
                            }
                            else {
                                follow[nt - 'A'][(unsigned char)sc] = 1;
                            }
                            break;
                        }
                        // everything after nt can vanish: FOLLOW of the
                        // left-hand side also follows nt
                        if (x == pro[i].len)
                            add_FOLLOW_A_to_FOLLOW_B(pro[i].str[0], nt);
                    }
                }
            }
        }
    }
}
// adds FIRST(A), without epsilon, to FIRST(B)
void add_FIRST_A_to_FIRST_B(char A, char B) {
    int i;
    for (i = 0; i < TSIZE; ++i) {
        if (i != '^') {
            first[B - 'A'][i] = first[A - 'A'][i] || first[B - 'A'][i];
        }
    }
}
// computes FIRST for every non-terminal; the pass over all productions is
// repeated no_pro times so that additions can propagate between the sets
void FIRST() {
    int i, j;
    int t = 0;
    while (t < no_pro) {
        for (i = 0; i < no_pro; ++i) {
            for (j = 3; j < pro[i].len; ++j) {
                char sc = pro[i].str[j];
                if (isNT(sc)) {
                    add_FIRST_A_to_FIRST_B(sc, pro[i].str[0]);
                    // a nullable symbol lets the next symbol contribute too
                    if (first[sc - 'A']['^'])
                        continue;
                }
                else {
                    first[pro[i].str[0] - 'A'][(unsigned char)sc] = 1;
                }
                break;
            }
            // every symbol on the right-hand side can vanish
            if (j == pro[i].len)
                first[pro[i].str[0] - 'A']['^'] = 1;
        }
        ++t;
    }
}
// adds FIRST(A), without epsilon, to FIRST of the right-hand side of production B
void add_FIRST_A_to_FIRST_RHS__B(char A, int B) {
    int i;
    for (i = 0; i < TSIZE; ++i) {
        if (i != '^')
            first_rhs[B][i] = first[A - 'A'][i] || first_rhs[B][i];
    }
}
// calculates FIRST(β) for each production A->β
void FIRST_RHS() {
    int i, j;
    int t = 0;
    while (t < no_pro) {
        for (i = 0; i < no_pro; ++i) {
            for (j = 3; j < pro[i].len; ++j) {
                char sc = pro[i].str[j];
                if (isNT(sc)) {
                    add_FIRST_A_to_FIRST_RHS__B(sc, i);
                    if (first[sc - 'A']['^'])
                        continue;
                }
                else {
                    first_rhs[i][(unsigned char)sc] = 1;
                }
                break;
            }
            if (j == pro[i].len)
                first_rhs[i]['^'] = 1;
        }
        ++t;
    }
}
int main() {
    readFromFile();
    // the end marker '$' is always in FOLLOW of the start symbol
    follow[pro[0].str[0] - 'A']['$'] = 1;
    FIRST();
    FOLLOW();
    FIRST_RHS();
    int i, j, k;

    // display FIRST of each non-terminal
    // (productions with the same left-hand side are adjacent in pro[])
    printf("\n");
    for (i = 0; i < no_pro; ++i) {
        if (i == 0 || (pro[i - 1].str[0] != pro[i].str[0])) {
            char c = pro[i].str[0];
            printf("FIRST OF %c: ", c);
            for (j = 0; j < TSIZE; ++j) {
                if (first[c - 'A'][j]) {
                    printf("%c ", j);
                }
            }
            printf("\n");
        }
    }

    // display FOLLOW of each non-terminal
    printf("\n");
    for (i = 0; i < no_pro; ++i) {
        if (i == 0 || (pro[i - 1].str[0] != pro[i].str[0])) {
            char c = pro[i].str[0];
            printf("FOLLOW OF %c: ", c);
            for (j = 0; j < TSIZE; ++j) {
                if (follow[c - 'A'][j]) {
                    printf("%c ", j);
                }
            }
            printf("\n");
        }
    }

    // display FIRST(β) for each production A->β
    printf("\n");
    for (i = 0; i < no_pro; ++i) {
        printf("FIRST OF %s: ", pro[i].str);
        for (j = 0; j < TSIZE; ++j) {
            if (first_rhs[i][j]) {
                printf("%c ", j);
            }
        }
        printf("\n");
    }

    // the parse table has a column for the end marker,
    // so '$' is added to the terminals
    terminal['$'] = 1;

    // '^' (epsilon) is never read as input,
    // so it is removed from the terminals
    terminal['^'] = 0;

    // printing parse table
    printf("\n");
    printf("\n\t**************** LL(1) PARSING TABLE *******************\n");
    printf("\t--------------------------------------------------------\n");
    printf("%-10s", "");
    for (i = 0; i < TSIZE; ++i) {
        if (terminal[i])   printf("%-10c", i);
    }
    printf("\n");

    // fill the table; p is the row of the current left-hand side
    int p = 0;
    for (i = 0; i < no_pro; ++i) {
        if (i != 0 && (pro[i].str[0] != pro[i - 1].str[0]))
            p = p + 1;
        // A->β goes into [A, t] for every terminal t in FIRST(β)
        for (j = 0; j < TSIZE; ++j) {
            if (first_rhs[i][j] && j != '^') {
                table[p][j] = i + 1;
            }
        }
        // if β can derive epsilon, A->β also goes into [A, t]
        // for every t in FOLLOW(A)
        if (first_rhs[i]['^']) {
            for (k = 0; k < TSIZE; ++k) {
                if (follow[pro[i].str[0] - 'A'][k]) {
                    table[p][k] = i + 1;
                }
            }
        }
    }

    // print one row per non-terminal
    k = 0;
    for (i = 0; i < no_pro; ++i) {
        if (i == 0 || (pro[i - 1].str[0] != pro[i].str[0])) {
            printf("%-10c", pro[i].str[0]);
            for (j = 0; j < TSIZE; ++j) {
                if (table[k][j]) {
                    printf("%-10s", pro[table[k][j] - 1].str);
                }
                else if (terminal[j]) {
                    printf("%-10s", "");
                }
            }
            ++k;
            printf("\n");
        }
    }
    return 0;
}

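Output

E->TA
A->+TA|^
T->FB
B->*FB|^
F->t|(E)

FIRST OF E: ( t
FIRST OF A: + ^
FIRST OF T: ( t
FIRST OF B: * ^
FIRST OF F: ( t

FOLLOW OF E: $ )
FOLLOW OF A: $ )
FOLLOW OF T: $ ) +
FOLLOW OF B: $ ) +
FOLLOW OF F: $ ) * +

FIRST OF E->TA: ( t
FIRST OF A->+TA: +
FIRST OF A->^: ^
FIRST OF T->FB: ( t
FIRST OF B->*FB: *
FIRST OF B->^: ^
FIRST OF F->t: t
FIRST OF F->(E): (


	**************** LL(1) PARSING TABLE *******************
	--------------------------------------------------------
          $         (         )         *         +         t
E                   E->TA                                   E->TA
A         A->^                A->^                A->+TA
T                   T->FB                                   T->FB
B         B->^                B->^      B->*FB    B->^
F                   F->(E)                                  F->t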
